# 📅 `Automate the Weekly Excel Work to Prepare Coversion and TAT Report` :  
> My objective here is to **completely Automate the manual Excel Work** to prepare the Weekly Conversion and TAT Report and Overall Status in **PPMC (Pre-Policy Medical Check-Up)** so that we can reduce the **One and Half of manual work to 4 Minutes** on Weekly basis for 25 Account Managers.


## 🔰 `Introduction` :
>**PPMC :** In the context of insurance, PPMC stands for Pre-Policy Medical Check-Up. This is a set of medical examinations and tests that a prospective policyholder must undergo before an insurance company approves their policy application. Here's a brief overview:

>**What is a Pre-Policy Medical Check-Up (PPMC) ?** <br>
A PPMC consists of various medical tests to assess the applicant's health status. These tests help the insurer determine the applicant's medical fitness and identify any pre-existing conditions. The results of these tests can influence the premium rates and coverage terms of the policy.

>**Importance of PPMC:** <br>
>1. Assessing Health Status: It provides a baseline health assessment for the policyholder.
>2. Identifying Pre-Existing Conditions: Helps insurers understand if there are any existing medical conditions that need to be considered.
>3. Determining Premiums: The results can affect the premium rates, ensuring they are appropriate for the individual's health status.
>4. Smooth Claim Settlement: The medical reports from the PPMC can be crucial during the claim settlement process.

>**Who Needs to Undergo PPMC ?** <br>
Typically, policyholders above a certain age (often 40 or 45 years) are required to undergo a PPMC. However, this can vary depending on the insurance company and the type of policy.


## 📊 `The Source Data` :
> Account manageers extract csv file from CRM in evening to prepare the WIP Report. This file have **185 Columns** and usually **lakhs of rows** based on their insurer volumes. They use the mainly following columns to prepare the next day appointment tracker :
>- **CorporateName :** Insurer Name
>- **PatientName :** Name of Insured who will go under the medicals.
>- **ApplicationId :** Insured Application Number
>- **AppointmentStatus :** Current starus of Appointment
>- **LastCallStatus :** Final Call Status
>- **NumberofAttempts :** No of Call Attempts
>- **DND :** Particular Case has been marked DND (Do Not Disturb) or not.


## ⭐ `Getting started` :
>We will first build the logic and will explain and make the program. And after the we will make a function to repeat this Weekly task, make it simple for the account manager who is not well verse with python and save the time.

### Import Libraries and Settings
**Settings Used : pd.set_option('display.max_columns', None) and pd.set_option('display.max_rows', 50)**

In [246]:
import pandas as pd
import datetime as dt
import numpy as np
from datetime import datetime
from pprint import pprint

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 50)

#### Import Raw Data
**Method Used : pd.read_csv("C:\\Users\\HP\\Downloads\\csv.csv", low_memory=False)**

In [247]:
# Import Raw Data
Raw_Data=pd.read_csv("C:\\Users\\HP\\Downloads\\csv.csv", low_memory=False)

In [248]:
col = ["CorporateName","RequestDate","PatientName","ApplicationId","OrderID","BookingId","PolicyNo","Age","Gender","RelationShip","EmailId",
       "ContactNo","PackageName","packageInvestigations","ApptCreatedDate","AppointmentDate","ApptTime","SecondPreferredDate",
       "SecondPreferredTime","VisitType","ProviderName","ProviderState","ProviderCity","ProviderLocation",
       "AppointmentStatus","ReportUploaded","reportUrl","QcApprovedDate","QC Approval Month","Photo Available & Type","ClientCity",
       "ClientState","ClientAddress","ClientPincode","AgentName","AgentCode","source","NumberofAttempts","lastCallDateTime","LastCallStatus",
       "ApptCreatedBySelfAllocation","ProductName","planType","loanId","mphName","PackageName","SplitDate","ApprovalType","AM_Name","Escort",
       "PriorityAssigned","DND"]

In [249]:
Raw_Data = Raw_Data[col]

#### Select Relevent Columns in relevant_data

In [250]:
# Relevant Columns to work with
relevant_columns=["RequestDate", "PatientName", "ApplicationId","ReportUploaded", "AppointmentStatus", "LastCallStatus",
                  "AppointmentDate", "QcApprovedDate", "NumberofAttempts", "DND"]

In [251]:
# Remove unnecessary columns and take relevant data in relevant_data
Relevant_Data = Raw_Data[relevant_columns]

In [252]:
Relevant_Data.head(3)

Unnamed: 0,RequestDate,PatientName,ApplicationId,ReportUploaded,AppointmentStatus,LastCallStatus,AppointmentDate,QcApprovedDate,NumberofAttempts,DND
0,18/10/2024,Rohit Makwana,GMDBA10030_1000000030_GL0301,No,Appointment Attended,,15/11/2024,,56.0,No
1,21/10/2024,Divya Bhalla,GMDMU11047_1000000017_GL0301,No,Appointment Attended,,15/11/2024,,50.0,No
2,07/11/2024,Sriteja Kolluri,GMDBA09838_1000000046_GL0301,No,Appointment Attended,Appointment Request Received,15/11/2024,,12.0,No


#### Select the Group Cases (Starts with "G" and ends with "01")
**Method Used : .str.startswith("G"), .str.endswith("01"), and(&), or(|)**

In [146]:
# Create group_case_index using .str.startswith("G") and .str.endswith("01"). I could have used .strip() also
group_case_index=Relevant_Data[((Relevant_Data["ApplicationId"].str.startswith("G"))&(Relevant_Data["ApplicationId"].str.endswith("01")))
|
((Relevant_Data["ApplicationId"].str.startswith(" G"))&(Relevant_Data["ApplicationId"].str.endswith("01")))].index

**Method Used : .loc[index]**

In [147]:
# Assign the Group Cases in df Data Frame
df = Relevant_Data.loc[group_case_index]

In [148]:
df.head(2)

Unnamed: 0,RequestDate,PatientName,ApplicationId,ReportUploaded,AppointmentStatus,LastCallStatus,AppointmentDate,QcApprovedDate,NumberofAttempts,DND
0,18/10/2024,Rohit Makwana,GMDBA10030_1000000030_GL0301,No,Appointment Attended,,15/11/2024,,56.0,No
1,07/11/2024,Sriteja Kolluri,GMDBA09838_1000000046_GL0301,No,Appointment Attended,Appointment Request Received,15/11/2024,,12.0,No


#### Check the data types and other Info of df
**Method Used : df.info()**

In [149]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 4572 entries, 0 to 7110
Data columns (total 10 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   RequestDate        4572 non-null   object 
 1   PatientName        4572 non-null   object 
 2   ApplicationId      4572 non-null   object 
 3   ReportUploaded     4572 non-null   object 
 4   AppointmentStatus  4572 non-null   object 
 5   LastCallStatus     2750 non-null   object 
 6   AppointmentDate    3693 non-null   object 
 7   QcApprovedDate     3334 non-null   object 
 8   NumberofAttempts   4094 non-null   float64
 9   DND                4572 non-null   object 
dtypes: float64(1), object(9)
memory usage: 392.9+ KB


#### Change the Data Types of Date Columns to Datetime
**Method Used : pd.to_datetime(df[col], format="%d/%m/%Y")**

In [150]:
# Changing the Data Types of Date Columns to Datetime
Date_Columns=["RequestDate", "AppointmentDate", "QcApprovedDate"]
for col in Date_Columns :
    df[col]= pd.to_datetime(df[col], format="%d/%m/%Y") 

#### Remove previous year Datas

In [151]:
# Previous financia years data index
previous_financial_years_data_index=df[df["RequestDate"]<="31/03/2024"].index

#removing Previous financia years data
df=df.drop(previous_financial_years_data_index)

**Method Used : df[Date_Columns].dtypes**

In [152]:
# Check Data types
df[Date_Columns].dtypes

RequestDate        datetime64[ns]
AppointmentDate    datetime64[ns]
QcApprovedDate     datetime64[ns]
dtype: object

#### Converted Data Frame Converted

**Method Used : df[ "col" ].isin ( [List of items] )**

In [153]:
# Index of completed Cases
completed_cases_index=df[df["AppointmentStatus"].isin(["QC Approved", "QC APPROVED", "Reports Uploaded", "Reports Uploaded by DC",
                                                       "Appointment Attended", "QC Rejected", "Sent For Interpretation"])].index

In [154]:
# Converted Data
Converted=df.loc[completed_cases_index, ["RequestDate", "PatientName", "ApplicationId","AppointmentDate","QcApprovedDate","ReportUploaded",
                                         "AppointmentStatus"]]

**Method Used : df[ "Date_Col" ].dt.days  and .astype( "O" )**

In [155]:
# Calculate the Appointment TAT and QC TAT and convert it into Object Type to Keep numerical as ell as string values
Converted["Appointment_TAT"]=(Converted["AppointmentDate"]-Converted["RequestDate"]).dt.days.astype("O")
Converted["QC_TAT"]=(Converted["QcApprovedDate"]-Converted["AppointmentDate"]).dt.days.astype("O")

**Method Used : df[ "col" ].fillna( value )**

In [156]:
# Replace Null values
Converted["QC_TAT"]=Converted["QC_TAT"].fillna("Pending")

**Method Used  isinstance( value, ( int, float ) ), if, elif, else, def, return**

In [157]:
# Define a function Set_TAT_Value to set appropriete TATA value
def Set_TAT_Value(value):
    if isinstance(value, (int, float)) and value<0 :
        return 0
    if isinstance(value, (int, float)) and value==0 :
        return "T0"
    elif isinstance(value, (int, float)) and value==1 :
        return "T+1"
    elif isinstance(value, (int, float)) and value==2 :
        return "T+2"
    elif isinstance(value, (int, float)) and value==3 :
        return "T+3"
    elif isinstance(value, (int, float)) and value==4 :
        return "T+4"
    elif isinstance(value, (int, float)) and value>4 :
        return ">T+4"
    else :
        return value 

**Method Used : df["col"].apply(customized function)**

In [158]:
# Apply the fuctions to the "Appointment_TAT" and "QC_TAT"
Converted["Appointment_TAT"] = Converted["Appointment_TAT"].apply(Set_TAT_Value)
Converted["QC_TAT"] = Converted["QC_TAT"].apply(Set_TAT_Value)

**Method Used : df[ "col" ].value_counts()**

In [159]:
# Check diffrent TAT Counts
Converted["QC_TAT"].value_counts()

QC_TAT
T+1        854
T0         468
T+2        333
T+3         81
Pending     25
T+4         24
>T+4        17
Name: count, dtype: int64

#### Pending Cases Data Frame pending
**Method Used : df[ "col" ].isna(), df[ "col" with condition ].index, df[ "col" ].isin( [ List of values ] )**

In [160]:
# Index of Pending Cases
pending_cases_index=df[~(df["AppointmentStatus"].isin(["QC Approved", "QC APPROVED", "Reports Uploaded",
                                                       "Appointment Attended", "QC Rejected", "Sent For Interpretation"]))].index

# Pending Cases
Pending=df.loc[pending_cases_index,["RequestDate", "PatientName", "ApplicationId", "AppointmentStatus", "LastCallStatus", "NumberofAttempts", "DND"]]

# Creating Status Column and and assigning Max Attempt Cases
Pending.loc[Pending["NumberofAttempts"]>30,"Status"]="Max Attempts"

# DND Cases Index
dnd_cases_index=Pending[(Pending["Status"].isna()) & (Pending["DND"]=="Yes")].index

# Assigning DND Cases to Status Column
Pending.loc[dnd_cases_index,"Status"]="DND"

# Appointment Status Cases Index
appointment_status_cases_index=Pending[(Pending["AppointmentStatus"].isin([
    "Cancelled", "Cancelled by insurer", "Cancelled By Provider", "Appointment Confirmed","Order sent to partner"]))&
(Pending["Status"].isna())].index 

# Assigning Appointment Status Cases to Status Column
Pending.loc[appointment_status_cases_index,"Status"] = Pending.loc[appointment_status_cases_index, "AppointmentStatus"]

# LastCallStatus index
LastCallStatus_index=Pending[Pending["Status"].isna()].index 

# Assigning LastCallStatus to Status Column
Pending.loc[LastCallStatus_index,"Status"]=Pending.loc[LastCallStatus_index,"LastCallStatus"]

# Replacing missing values in Status Column with Non Contactable
Pending["Status"]=Pending["Status"].fillna("Non Contactable")

# Working On It Data
Working_On_It_index=Pending[Pending["Status"].isin(["Appointment Request Received", "Direct Medical", "Location Constraint",
                                                    "Medical Done Report Awaited", "Reminder"])].index

Pending.loc[Working_On_It_index,"Status"]="Working On It"

# Workable Data
workable_index=Pending[Pending["Status"].isin(["Appointment Confirmed", "Callback", "Non Contactable", "Order sent to partner", "Working On It"])].index
Pending.loc[workable_index,"Type"]="Workable"

# Non-Workable Data
Pending.loc[Pending[Pending["Type"].isna()].index,"Type"]="Non_Workable"

**Method Used : df[ "col" ].value_counts()**

In [161]:
# Check the total Workable and Non-Workable Data
Pending["Type"].value_counts()

Type
Non_Workable    514
Workable        138
Name: count, dtype: int64

#### Non Workable DF Non_Workable_Data

In [162]:
# Non Workable Data
Non_Workable_Data=Pending.loc[Pending["Type"]=="Non_Workable"]

In [163]:
Converted.columns

Index(['RequestDate', 'PatientName', 'ApplicationId', 'AppointmentDate',
       'QcApprovedDate', 'ReportUploaded', 'AppointmentStatus',
       'Appointment_TAT', 'QC_TAT'],
      dtype='object')

### Prepare the Appt TAT Report

**Method Used : df[ "Date_Col" ].dt.strftime( '%b' ) to extract the Month**

In [164]:
# Format "RequestDate" as month names for grouping
Converted['Month'] = Converted['RequestDate'].dt.strftime('%b')

**Method Used : pd.pivot_table( df, values="Val_Col", index="Col to put in row", columns="Col to put in header", aggfunc='count', fill_value=0**

In [165]:
# Apply Pivot_Table function to get desired Output
Pivot_Output = pd.pivot_table(
    Converted,
    values='ApplicationId',          # Use a count for 'ApplicationId' or any other identifier
    index='Appointment_TAT',          # Group by 'Appointment_TAT'
    columns='Month',                  # Columns for each month
    aggfunc='count',                  # Count the number of occurrences
    fill_value=0                      # Fill NaNs with 0
)

**Method Used : df.reset_index(drop=False) to Reset the Index**

In [166]:
# Reset index 
Pivot_Output=Pivot_Output.reset_index(drop=False)

In [167]:
Pivot_Output

Month,Appointment_TAT,Apr,Aug,Jul,Jun,May,Nov,Oct,Sep
0,>T+4,134,163,175,165,147,20,135,174
1,T+1,9,16,10,23,21,9,15,21
2,T+2,13,33,29,24,27,20,31,29
3,T+3,14,24,19,26,20,7,21,31
4,T+4,13,23,28,25,25,1,24,28
5,T0,8,3,3,5,2,2,3,4


**Method Used : df.columns.name = None  to remove the text "Month"**

In [168]:
# Remove the column name "Month"
Pivot_Output.columns.name = None

In [169]:
Pivot_Output

Unnamed: 0,Appointment_TAT,Apr,Aug,Jul,Jun,May,Nov,Oct,Sep
0,>T+4,134,163,175,165,147,20,135,174
1,T+1,9,16,10,23,21,9,15,21
2,T+2,13,33,29,24,27,20,31,29
3,T+3,14,24,19,26,20,7,21,31
4,T+4,13,23,28,25,25,1,24,28
5,T0,8,3,3,5,2,2,3,4


In [170]:
# Reorder the index
Pivot_Output=Pivot_Output.loc[[5,1,2,3,4,0]]

In [171]:
Pivot_Output

Unnamed: 0,Appointment_TAT,Apr,Aug,Jul,Jun,May,Nov,Oct,Sep
5,T0,8,3,3,5,2,2,3,4
1,T+1,9,16,10,23,21,9,15,21
2,T+2,13,33,29,24,27,20,31,29
3,T+3,14,24,19,26,20,7,21,31
4,T+4,13,23,28,25,25,1,24,28
0,>T+4,134,163,175,165,147,20,135,174


**Method Used : df[ col ].cumsum()**

In [172]:
# Calculate and add the cumulative Sum Column
Pivot_Output["Apr1"]=Pivot_Output["Apr"].cumsum()
Pivot_Output["May1"]=Pivot_Output["May"].cumsum()
Pivot_Output["Jun1"]=Pivot_Output["Jun"].cumsum()
Pivot_Output["Jul1"]=Pivot_Output["Jul"].cumsum()
Pivot_Output["Aug1"]=Pivot_Output["Aug"].cumsum()
Pivot_Output["Sep1"]=Pivot_Output["Sep"].cumsum()
Pivot_Output["Oct1"]=Pivot_Output["Oct"].cumsum()
Pivot_Output["Nov1"]=Pivot_Output["Nov"].cumsum()

In [173]:
Pivot_Output

Unnamed: 0,Appointment_TAT,Apr,Aug,Jul,Jun,May,Nov,Oct,Sep,Apr1,May1,Jun1,Jul1,Aug1,Sep1,Oct1,Nov1
5,T0,8,3,3,5,2,2,3,4,8,2,5,3,3,4,3,2
1,T+1,9,16,10,23,21,9,15,21,17,23,28,13,19,25,18,11
2,T+2,13,33,29,24,27,20,31,29,30,50,52,42,52,54,49,31
3,T+3,14,24,19,26,20,7,21,31,44,70,78,61,76,85,70,38
4,T+4,13,23,28,25,25,1,24,28,57,95,103,89,99,113,94,39
0,>T+4,134,163,175,165,147,20,135,174,191,242,268,264,262,287,229,59


**Method Used : df.drop( columns=[ List of cols ],  inplace=True )**

In [174]:
# Remove the normal count month columns
Pivot_Output.drop(columns=["Apr", "Aug", "Jul", "Jun", "May", "Nov", "Oct", "Sep"], inplace=True)

In [175]:
Pivot_Output

Unnamed: 0,Appointment_TAT,Apr1,May1,Jun1,Jul1,Aug1,Sep1,Oct1,Nov1
5,T0,8,2,5,3,3,4,3,2
1,T+1,17,23,28,13,19,25,18,11
2,T+2,30,50,52,42,52,54,49,31
3,T+3,44,70,78,61,76,85,70,38
4,T+4,57,95,103,89,99,113,94,39
0,>T+4,191,242,268,264,262,287,229,59


**Dictionary used as mapper for renaming the col**

In [176]:
# Rename the cumulative Sum Column
new_col = { "Apr1" : "Apr",
          "May1" : "May",
          "Jun1" : "Jun",
          "Jul1" : "Jul",
          "Aug1" : "Aug",
          "Sep1" : "Sep",
          "Oct1" : "Oct",
          "Nov1" : "Nov"
         }       

**Method Used : df.rename( columns=new_col_Dict_Mapper, inplace=True)**

In [177]:
# Rename the cumulative Sum Column
Pivot_Output.rename(columns=new_col, inplace=True)

In [178]:
Pivot_Output

Unnamed: 0,Appointment_TAT,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov
5,T0,8,2,5,3,3,4,3,2
1,T+1,17,23,28,13,19,25,18,11
2,T+2,30,50,52,42,52,54,49,31
3,T+3,44,70,78,61,76,85,70,38
4,T+4,57,95,103,89,99,113,94,39
0,>T+4,191,242,268,264,262,287,229,59


In [179]:
Per_Col=["Apr_Per", "May_Per", "Jun_Per", "Jul_Per", "Aug_Per", "Sep_Per", "Oct_Per", "Nov_Per"]
Norm_Col=["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]

**Method Used: for x,y in zip( List_x, List_y ): to perform operation on two list of same size**

In [180]:
for i,j in zip(Per_Col,Norm_Col) :
    Pivot_Output[i]=Pivot_Output[j]/Pivot_Output.loc[0,j]

In [181]:
Pivot_Output

Unnamed: 0,Appointment_TAT,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Apr_Per,May_Per,Jun_Per,Jul_Per,Aug_Per,Sep_Per,Oct_Per,Nov_Per
5,T0,8,2,5,3,3,4,3,2,0.041885,0.008264,0.018657,0.011364,0.01145,0.013937,0.0131,0.033898
1,T+1,17,23,28,13,19,25,18,11,0.089005,0.095041,0.104478,0.049242,0.072519,0.087108,0.078603,0.186441
2,T+2,30,50,52,42,52,54,49,31,0.157068,0.206612,0.19403,0.159091,0.198473,0.188153,0.213974,0.525424
3,T+3,44,70,78,61,76,85,70,38,0.230366,0.289256,0.291045,0.231061,0.290076,0.296167,0.305677,0.644068
4,T+4,57,95,103,89,99,113,94,39,0.298429,0.392562,0.384328,0.337121,0.377863,0.393728,0.41048,0.661017
0,>T+4,191,242,268,264,262,287,229,59,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


In [182]:
# Assign the 
Appt_TAT_Report = Pivot_Output.loc[:,["Apr", "Apr_Per", "May", "May_Per", "Jun", "Jun_Per","Jul", "Jul_Per", "Aug", "Aug_Per", "Sep", "Sep_Per",
                                      "Oct", "Oct_Per", "Nov", "Nov_Per"]]

In [183]:
Appt_TAT_Report

Unnamed: 0,Apr,Apr_Per,May,May_Per,Jun,Jun_Per,Jul,Jul_Per,Aug,Aug_Per,Sep,Sep_Per,Oct,Oct_Per,Nov,Nov_Per
5,8,0.041885,2,0.008264,5,0.018657,3,0.011364,3,0.01145,4,0.013937,3,0.0131,2,0.033898
1,17,0.089005,23,0.095041,28,0.104478,13,0.049242,19,0.072519,25,0.087108,18,0.078603,11,0.186441
2,30,0.157068,50,0.206612,52,0.19403,42,0.159091,52,0.198473,54,0.188153,49,0.213974,31,0.525424
3,44,0.230366,70,0.289256,78,0.291045,61,0.231061,76,0.290076,85,0.296167,70,0.305677,38,0.644068
4,57,0.298429,95,0.392562,103,0.384328,89,0.337121,99,0.377863,113,0.393728,94,0.41048,39,0.661017
0,191,1.0,242,1.0,268,1.0,264,1.0,262,1.0,287,1.0,229,1.0,59,1.0


### Prepare the overall conversion report

In [184]:
df.columns

Index(['RequestDate', 'PatientName', 'ApplicationId', 'ReportUploaded',
       'AppointmentStatus', 'LastCallStatus', 'AppointmentDate',
       'QcApprovedDate', 'NumberofAttempts', 'DND'],
      dtype='object')

In [185]:
# Format month from the RequestDate and add new column 'Month'
df["Month"] = df["RequestDate"].dt.strftime("%b")

In [186]:
# Count Cases for Each Month
monthly_cases = df.groupby('Month').size().reset_index(name='Cases Received')

In [187]:
#Sort the Months in Calendar Order
month_order = ["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
monthly_cases['Month'] = pd.Categorical(monthly_cases['Month'], categories=month_order, ordered=True)
monthly_cases = monthly_cases.sort_values('Month')

In [188]:
# Calculate and Append the Grand Total

# Calculate the grand total
grand_total = monthly_cases['Cases Received'].sum()

# Create a DataFrame for the grand total row
grand_total_row = pd.DataFrame({'Month': ['Grand Total'], 'Cases Received': [grand_total]})

# Use pd.concat to add the grand total row to the monthly_cases DataFrame
monthly_cases = pd.concat([monthly_cases, grand_total_row], ignore_index=True)

In [189]:
monthly_cases

Unnamed: 0,Month,Cases Received
0,Apr,251
1,May,322
2,Jun,348
3,Jul,339
4,Aug,322
5,Sep,372
6,Oct,362
7,Nov,136
8,Grand Total,2452


#### Monthly Data For Non-Worakable Data

In [190]:
Non_Workable_Data.info()

<class 'pandas.core.frame.DataFrame'>
Index: 514 entries, 26 to 7104
Data columns (total 9 columns):
 #   Column             Non-Null Count  Dtype         
---  ------             --------------  -----         
 0   RequestDate        514 non-null    datetime64[ns]
 1   PatientName        514 non-null    object        
 2   ApplicationId      514 non-null    object        
 3   AppointmentStatus  514 non-null    object        
 4   LastCallStatus     406 non-null    object        
 5   NumberofAttempts   440 non-null    float64       
 6   DND                514 non-null    object        
 7   Status             514 non-null    object        
 8   Type               514 non-null    object        
dtypes: datetime64[ns](1), float64(1), object(7)
memory usage: 40.2+ KB


In [191]:
# Format month from the RequestDate and add new column 'Month'
Non_Workable_Data["Month"] = Non_Workable_Data["RequestDate"].dt.strftime("%b")

# Count Cases for Each Month
NW_monthly_cases = Non_Workable_Data.groupby('Month').size().reset_index(name='Cases Received')

#Sort the Months in Calendar Order
month_order = ["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
NW_monthly_cases['Month'] = pd.Categorical(NW_monthly_cases['Month'], categories=month_order, ordered=True)
NW_monthly_cases = NW_monthly_cases.sort_values('Month')

# Calculate and Append the Grand Total

# Calculate the grand total
grand_total = NW_monthly_cases['Cases Received'].sum()

# Create a DataFrame for the grand total row
grand_total_row = pd.DataFrame({'Month': ['Grand Total'], 'Cases Received': [grand_total]})

# Use pd.concat to add the grand total row to the monthly_cases DataFrame
NW_monthly_cases = pd.concat([NW_monthly_cases, grand_total_row], ignore_index=True)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  Non_Workable_Data["Month"] = Non_Workable_Data["RequestDate"].dt.strftime("%b")


In [192]:
# Rename the column
NW_monthly_cases = NW_monthly_cases.rename(columns = {"Cases Received" : "Non-Workable"})

In [193]:
NW_monthly_cases

Unnamed: 0,Month,Non-Workable
0,Apr,57
1,May,76
2,Jun,74
3,Jul,67
4,Aug,52
5,Sep,76
6,Oct,91
7,Nov,21
8,Grand Total,514


#### Monthly Data For Converted Data

In [194]:
Converted.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1802 entries, 0 to 6388
Data columns (total 10 columns):
 #   Column             Non-Null Count  Dtype         
---  ------             --------------  -----         
 0   RequestDate        1802 non-null   datetime64[ns]
 1   PatientName        1802 non-null   object        
 2   ApplicationId      1802 non-null   object        
 3   AppointmentDate    1802 non-null   datetime64[ns]
 4   QcApprovedDate     1777 non-null   datetime64[ns]
 5   ReportUploaded     1802 non-null   object        
 6   AppointmentStatus  1802 non-null   object        
 7   Appointment_TAT    1802 non-null   object        
 8   QC_TAT             1802 non-null   object        
 9   Month              1802 non-null   object        
dtypes: datetime64[ns](3), object(7)
memory usage: 154.9+ KB


In [195]:
# Format month from the RequestDate and add new column 'Month'
Converted["Month"] = Converted["RequestDate"].dt.strftime("%b")

# Count Cases for Each Month
c_monthly_cases = Converted.groupby('Month').size().reset_index(name='Converted Data')

#Sort the Months in Calendar Order
month_order = ["Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov"]
c_monthly_cases['Month'] = pd.Categorical(c_monthly_cases['Month'], categories=month_order, ordered=True)
c_monthly_cases = c_monthly_cases.sort_values('Month')

# Calculate and Append the Grand Total

# Calculate the grand total
grand_total = c_monthly_cases['Converted Data'].sum()

# Create a DataFrame for the grand total row
grand_total_row = pd.DataFrame({'Month': ['Grand Total'], 'Converted Data': [grand_total]})

# Use pd.concat to add the grand total row to the monthly_cases DataFrame
c_monthly_cases = pd.concat([c_monthly_cases, grand_total_row], ignore_index=True)

In [196]:
c_monthly_cases

Unnamed: 0,Month,Converted Data
0,Apr,191
1,May,242
2,Jun,268
3,Jul,264
4,Aug,262
5,Sep,287
6,Oct,229
7,Nov,59
8,Grand Total,1802


##### Overall Conversion

In [197]:
Overall_Conversion = monthly_cases

In [198]:
Overall_Conversion["Non-Workable"] = NW_monthly_cases ["Non-Workable"]

In [199]:
Overall_Conversion

Unnamed: 0,Month,Cases Received,Non-Workable
0,Apr,251,57
1,May,322,76
2,Jun,348,74
3,Jul,339,67
4,Aug,322,52
5,Sep,372,76
6,Oct,362,91
7,Nov,136,21
8,Grand Total,2452,514


In [200]:
Overall_Conversion["Workable"] = Overall_Conversion["Cases Received"] - Overall_Conversion["Non-Workable"]

In [201]:
Overall_Conversion

Unnamed: 0,Month,Cases Received,Non-Workable,Workable
0,Apr,251,57,194
1,May,322,76,246
2,Jun,348,74,274
3,Jul,339,67,272
4,Aug,322,52,270
5,Sep,372,76,296
6,Oct,362,91,271
7,Nov,136,21,115
8,Grand Total,2452,514,1938


In [202]:
Overall_Conversion["Converted Data"] = c_monthly_cases["Converted Data"]

In [203]:
Overall_Conversion

Unnamed: 0,Month,Cases Received,Non-Workable,Workable,Converted Data
0,Apr,251,57,194,191
1,May,322,76,246,242
2,Jun,348,74,274,268
3,Jul,339,67,272,264
4,Aug,322,52,270,262
5,Sep,372,76,296,287
6,Oct,362,91,271,229
7,Nov,136,21,115,59
8,Grand Total,2452,514,1938,1802


In [204]:
Overall_Conversion["Conversion on Workable Data"] = Overall_Conversion["Converted Data"]/Overall_Conversion["Workable"]

In [205]:
Overall_Conversion

Unnamed: 0,Month,Cases Received,Non-Workable,Workable,Converted Data,Conversion on Workable Data
0,Apr,251,57,194,191,0.984536
1,May,322,76,246,242,0.98374
2,Jun,348,74,274,268,0.978102
3,Jul,339,67,272,264,0.970588
4,Aug,322,52,270,262,0.97037
5,Sep,372,76,296,287,0.969595
6,Oct,362,91,271,229,0.845018
7,Nov,136,21,115,59,0.513043
8,Grand Total,2452,514,1938,1802,0.929825


### Export to Excel

In [206]:
# Get the current date and format it as dd-mm-yyyy
from datetime import datetime
current_date = datetime.now().strftime('%d-%m-%Y')
current_date

'15-11-2024'

In [207]:
# Define the destination path
destination= "C:/Users/HP/Downloads/" + "Report_" + current_date +".xlsx"
destination

'C:/Users/HP/Downloads/Report15-11-2024.xlsx'

In [208]:
# Create the writer variable
writer=pd.ExcelWriter(destination)
Raw_Data.to_excel(writer, sheet_name="Raw Data", index=False)
df.to_excel(writer, sheet_name="Cleaned Data", index=False)
Converted.to_excel(writer, sheet_name="Converted Data", index=False)
Pending.to_excel(writer, sheet_name="Pending Data", index=False)
Non_Workable_Data.to_excel(writer, sheet_name="Non_Workable Data", index=False)
Appt_TAT_Report.to_excel(writer, sheet_name="Appt_TAT_Report", index=False)
Overall_Conversion.to_excel(writer, sheet_name="Overall_Conversion", index=False)
writer.close()