# Surgery Process

### Final Analysis

The ER department has provided you with a data extract for all of the patients who received a Laparoscopy Appendectomy or a Laparoscopy Cholecystectomy in the same time period as our other datasets. Use this information to answer the following questions:

* Do you have enough information to check whether the following targets have been met?
* Have the current targets been met?
* How many patients are in/out of target?
* What is the average time for the patient journey from the time they are checked in at admitting to when they have surgery?
* Where is the longest wait between steps in the process?
* For both types of surgery, does a visit to Diagnostic Imaging add a significant amount of time to the overall process?

In [25]:
# Import necessary libraries
import pandas as pd
import numpy as np

* Do you have enough information to check whether the following targets have been met?

- Patients checked into the emergency department within 10 minutes.
**Answer:** **No**, because the extract has no timestamp earlier than check-in, so the wait before check-in cannot be measured.

- Patients seen by a triage nurse within 20 minutes.
**Answer:** **Yes**, this can be computed from 'Patient Admitting - Check In' and 'Patient Triagne Nurse Visit'.

- Patients admitted to the ER within 60 minutes.
**Answer:** **Yes**, this can be computed from 'Patient Admitting - Check In' and 'Patient Admit to ER'.
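
The reasoning above can be sketched as a small lookup: a target is checkable only if both of its endpoint timestamps exist in the extract. This is a minimal illustration, not part of the original analysis; the `TARGET_COLUMNS` mapping and the `computable_targets` helper are hypothetical names introduced here, and `None` stands for an event the extract does not record.

```python
# Hypothetical mapping from each target to the (start, end) columns needed to test it.
# None marks an event that the ER extract does not record (arrival before check-in).
TARGET_COLUMNS = {
    "Checked in within 10 minutes": (None, "Patient Admitting - Check In"),
    "Seen by triage nurse within 20 minutes": (
        "Patient Admitting - Check In",
        "Patient Triagne Nurse Visit",
    ),
    "Admitted to ER within 60 minutes": (
        "Patient Admitting - Check In",
        "Patient Admit to ER",
    ),
}


def computable_targets(available_columns):
    """Return {target: bool}, True only when both endpoint columns are available."""
    available = set(available_columns)
    return {
        target: all(col is not None and col in available for col in endpoints)
        for target, endpoints in TARGET_COLUMNS.items()
    }


# Column names as they appear in the ER patient log.
columns = [
    "HCID",
    "Patient Admitting - Check In",
    "Patient Triagne Nurse Visit",
    "Patient Admit to ER",
]
result = computable_targets(columns)
for target, ok in result.items():
    print(f"{target}: {'Yes' if ok else 'No'}")
```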

In [26]:
records = pd.read_csv('data/ER - Patient Log.csv')

display(records.head(5))
print(records.info())

| | HCID | Patient Admitting - Check In | Patient Triagne Nurse Visit | Patient Admit to ER |
|---|---|---|---|---|
| 0 | 1805294 | 2019-01-12 2:14:00 | 2019-01-12 2:24:00 | 2019-01-12 2:26:00 |
| 1 | 2233815 | 2019-01-21 12:17:25 | 2019-01-21 12:39:25 | 2019-01-21 13:10:25 |
| 2 | 1043375 | 2019-01-22 0:50:36 | 2019-01-22 0:56:36 | 2019-01-22 1:18:36 |
| 3 | 1203917 | 2019-01-29 15:43:00 | 2019-01-29 15:57:00 | 2019-01-29 16:09:00 |
| 4 | 2616633 | 2019-01-30 1:58:47 | 2019-01-30 2:28:47 | 2019-01-30 3:23:47 |


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1229 entries, 0 to 1228
Data columns (total 4 columns):
 #   Column                        Non-Null Count  Dtype 
---  ------                        --------------  ----- 
 0   HCID                          1229 non-null   int64 
 1   Patient Admitting - Check In  1229 non-null   object
 2   Patient Triagne Nurse Visit   1229 non-null   object
 3   Patient Admit to ER           1229 non-null   object
dtypes: int64(1), object(3)
memory usage: 38.5+ KB
None


In [27]:
# Convert to date time data type
records['Patient Admitting - Check In'] = pd.to_datetime(records['Patient Admitting - Check In'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
records['Patient Triagne Nurse Visit'] = pd.to_datetime(records['Patient Triagne Nurse Visit'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
records['Patient Admit to ER'] = pd.to_datetime(records['Patient Admit to ER'], format='%Y-%m-%d %H:%M:%S', errors='coerce')

records.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1229 entries, 0 to 1228
Data columns (total 4 columns):
 #   Column                        Non-Null Count  Dtype         
---  ------                        --------------  -----         
 0   HCID                          1229 non-null   int64         
 1   Patient Admitting - Check In  1229 non-null   datetime64[ns]
 2   Patient Triagne Nurse Visit   1229 non-null   datetime64[ns]
 3   Patient Admit to ER           1229 non-null   datetime64[ns]
dtypes: datetime64[ns](3), int64(1)
memory usage: 38.5 KB
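
Because the conversion uses `errors='coerce'`, any value that does not match the format silently becomes `NaT`. In this extract all 1229 rows parsed, but a quick count makes that explicit. The sketch below uses toy data with one deliberately malformed value; the `sample` frame and `n_unparsed` name are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Toy data standing in for the ER log; the second value is deliberately malformed.
sample = pd.DataFrame({
    "Patient Admitting - Check In": [
        "2019-01-12 2:14:00",
        "not a date",
        "2019-01-22 0:50:36",
    ],
})

# Same call pattern as the notebook: unparseable strings become NaT instead of raising.
parsed = pd.to_datetime(
    sample["Patient Admitting - Check In"],
    format="%Y-%m-%d %H:%M:%S",
    errors="coerce",
)

# Count the rows that were coerced to NaT so that data loss is visible.
n_unparsed = int(parsed.isna().sum())
print(f"Rows that failed to parse: {n_unparsed}")
```

Running the same check on each converted column before computing intervals would confirm that no timestamps were lost.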


* Calculate if current targets are being met.

In [28]:
# Calculate time interval
records['Actual ER Triage Nurse Wait Time(mins)'] = ((records['Patient Triagne Nurse Visit'] - records['Patient Admitting - Check In']).dt.total_seconds() / 60.0)
records['Actual ER Admission Wait Time(mins)'] = ((records['Patient Admit to ER'] - records['Patient Admitting - Check In']).dt.total_seconds() / 60.0)

# Flag records where both ER wait targets (triage <= 20 mins, admission <= 60 mins) were met
records['ER Wait Target Met'] = (records['Actual ER Triage Nurse Wait Time(mins)'] <= 20) & (records['Actual ER Admission Wait Time(mins)'] <= 60)
records.head(5)

| | HCID | Patient Admitting - Check In | Patient Triagne Nurse Visit | Patient Admit to ER | Actual ER Triage Nurse Wait Time(mins) | Actual ER Admission Wait Time(mins) | ER Wait Target Met |
|---|---|---|---|---|---|---|---|
| 0 | 1805294 | 2019-01-12 02:14:00 | 2019-01-12 02:24:00 | 2019-01-12 02:26:00 | 10.0 | 12.0 | True |
| 1 | 2233815 | 2019-01-21 12:17:25 | 2019-01-21 12:39:25 | 2019-01-21 13:10:25 | 22.0 | 53.0 | False |
| 2 | 1043375 | 2019-01-22 00:50:36 | 2019-01-22 00:56:36 | 2019-01-22 01:18:36 | 6.0 | 28.0 | True |
| 3 | 1203917 | 2019-01-29 15:43:00 | 2019-01-29 15:57:00 | 2019-01-29 16:09:00 | 14.0 | 26.0 | True |
| 4 | 2616633 | 2019-01-30 01:58:47 | 2019-01-30 02:28:47 | 2019-01-30 03:23:47 | 30.0 | 85.0 | False |


* How many patients are in/out of target?

In [29]:
in_target = records[(records['ER Wait Target Met'] == True)]
out_target = records[(records['ER Wait Target Met'] == False)]

print(f'Number of records that are within the target: {in_target.shape[0]}')
print(f'Number of records that are out of the target: {out_target.shape[0]}')

Number of records that are within the target: 585
Number of records that are out of the target: 644
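
The counts above combine both ER targets into a single flag. A per-target breakdown can be obtained by summing boolean columns, since `True` counts as 1. The sketch below is a minimal illustration on the five preview rows shown earlier; the column names `triage_wait`, `admission_wait`, and the `*_met` flags are shortened, hypothetical names.

```python
import pandas as pd

# Wait times (minutes) copied from the five preview rows of the ER log.
waits = pd.DataFrame({
    "triage_wait": [10.0, 22.0, 6.0, 14.0, 30.0],
    "admission_wait": [12.0, 53.0, 28.0, 26.0, 85.0],
})

# One boolean column per target, plus the combined flag used in the notebook.
targets = pd.DataFrame({
    "triage_met": waits["triage_wait"] <= 20,
    "admission_met": waits["admission_wait"] <= 60,
})
targets["both_met"] = targets["triage_met"] & targets["admission_met"]

# Summing booleans counts the True values; the mean gives the share in target.
summary = pd.DataFrame({
    "in_target": targets.sum(),
    "out_of_target": (~targets).sum(),
    "pct_in_target": targets.mean().mul(100).round(1),
})
print(summary)
```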


Merge the ER dataset with the DI and OR datasets to answer the questions below:

* What is the average time for the patient journey from the time they are checked in at admitting to when they have surgery?

In [30]:
# Read DI records
di1 = pd.read_excel('data/DI - Visits 1.3.xlsx')
di2 = pd.read_excel('data/DI - Visits 2.3.xlsx')
di3 = pd.read_excel('data/DI - Visits 3.3.xlsx')
di_records = pd.concat([di1, di2, di3], axis=0)
# Filter only necessary columns
di_records.drop(columns=['Pt Age', 'Requesting Physician', 'Req Type - Abdominal'], inplace=True)
# Merge with DI records
records = records.merge(di_records, how='left', on='HCID')

# Read OR records
or_booking = pd.read_csv('data/OR Booking.csv')
or_booking.rename(columns={'HCID ': 'HCID'}, inplace=True)
or_booking['OR Booking Req DT/Tm'] = pd.to_datetime(or_booking['OR Booking Req DT/Tm'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
or_booking['Proc DT'] = pd.to_datetime(or_booking['Proc DT'], format='%Y-%m-%d', errors='coerce')
or_booking['Pt OR Chk In'] = pd.to_datetime(or_booking['Pt OR Chk In'], format='%H%M', errors='coerce')
or_booking['Pt In OR'] = pd.to_datetime(or_booking['Pt In OR'], format='%H%M', errors='coerce')

# Filter only necessary columns
or_booking.drop(columns=['Pt Age', 'Req Proc Tm', 'Pt Loc', 'OR ', 'Pt Trns', 'ORR#'], inplace=True)

# Merge with ER records
records = records.merge(or_booking, how='left', on='HCID')

# Dropping null values
records = records[(records['Pt OR Chk In'].notna()) &
                  (records['Pt In OR'].notna())]

# Build full OR check-in and surgery timestamps (procedure date + clock time)
records['Actual OR Checkin'] = pd.to_datetime(records['Proc DT'].dt.date.astype(str) + ' ' + records['Pt OR Chk In'].dt.time.astype(str))
records['Actual OR Surgery'] = pd.to_datetime(records['Proc DT'].dt.date.astype(str) + ' ' + records['Pt In OR'].dt.time.astype(str))

# Time from ER admission to surgery (mins)
records['ER Admitting to Actual Surgery'] = ((records['Actual OR Surgery'] - records['Patient Admit to ER']).dt.total_seconds() / 60.0)

# Time from check-in at admitting to surgery (mins)
records['ER Checkin to Actual Surgery'] = ((records['Actual OR Surgery'] - records['Patient Admitting - Check In']).dt.total_seconds() / 60.0)

avg_er_admit_to_surgery = records['ER Admitting to Actual Surgery'].mean()
avg_er_checkin_to_surgery = records['ER Checkin to Actual Surgery'].mean()

print(f'Average waiting time from ER admission to actual surgery: {avg_er_admit_to_surgery:.2f} minutes')
print(f'Average waiting time from ER checkin to actual surgery: {avg_er_checkin_to_surgery:.2f} minutes')

Average waiting time from ER admission to actual surgery: 1285.33 minutes
Average waiting time from ER checkin to actual surgery: 1338.11 minutes
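
The cell above rebuilds full surgery timestamps by joining the procedure date and the clock time as strings. An alternative sketch, shown here on toy data, adds the clock time to the date as a timedelta. It assumes the clock times are `HHMM` strings; the `bookings` frame, the `check_in` column, and the values in it are hypothetical.

```python
import pandas as pd

# Toy OR bookings: a procedure date plus an HHMM clock string, and a check-in timestamp.
bookings = pd.DataFrame({
    "Proc DT": ["2019-01-12", "2019-01-22"],
    "Pt In OR": ["0930", "1415"],
    "check_in": ["2019-01-12 02:14:00", "2019-01-21 23:50:00"],
})

proc_date = pd.to_datetime(bookings["Proc DT"], format="%Y-%m-%d")

# Split HHMM into hours and minutes and turn them into a time-of-day offset.
clock = bookings["Pt In OR"].str.zfill(4)
offset = (
    pd.to_timedelta(clock.str[:2].astype(int), unit="h")
    + pd.to_timedelta(clock.str[2:].astype(int), unit="min")
)

# Full surgery timestamp = procedure date + time-of-day offset.
surgery_start = proc_date + offset
check_in = pd.to_datetime(bookings["check_in"])

# Journey length in minutes, from check-in to the patient entering the OR.
journey_mins = (surgery_start - check_in).dt.total_seconds() / 60.0
print(journey_mins.tolist())
print(f"Average journey: {journey_mins.mean():.2f} minutes")
```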


* Where is the longest wait between steps in the process?

**Answer:**

**Step6 OR Request to OR Checkin** has the longest wait, with an average wait time of **1100.42 minutes**, or **18.34 hours**.

In [31]:
# Step 1: check-in to triage nurse (mins)
records['Step1 Checkin to Triage Nurse'] = ((records['Patient Triagne Nurse Visit'] - records['Patient Admitting - Check In']).dt.total_seconds() / 60.0)

# Step 2: triage nurse to ER admission (mins)
records['Step2 Triage Nurse to ER Admission'] = ((records['Patient Admit to ER'] - records['Patient Triagne Nurse Visit']).dt.total_seconds() / 60.0)

# Step 3: ER admission to DI request (mins)
records['Step3 ER Admission to DI Request'] = ((records['DI Req - Time'] - records['Patient Admit to ER']).dt.total_seconds() / 60.0)

# Step 4: DI request to patient in DI suite (mins)
records['Step4 DI Request to Actual DI'] = ((records['DI - Pt in Suite'] - records['DI Req - Time']).dt.total_seconds() / 60.0)

# Step 5: patient in DI suite to OR booking request (mins)
records['Step5 Actual DI to OR Request'] = ((records['OR Booking Req DT/Tm'] - records['DI - Pt in Suite']).dt.total_seconds() / 60.0)

# Step 6: OR booking request to OR check-in (mins)
records['Step6 OR Request to OR Checkin'] = ((records['Actual OR Checkin'] - records['OR Booking Req DT/Tm']).dt.total_seconds() / 60.0)

# Step 7: OR check-in to patient in OR (mins)
records['Step7 OR Checkin to OR Surgery'] = ((records['Actual OR Surgery'] - records['Actual OR Checkin']).dt.total_seconds() / 60.0)

# Eliminate negative values
print(f'Total records before filtering negative time interval: {records.shape[0]}')
records = records.loc[~((records['Step1 Checkin to Triage Nurse'] < 0) |
                (records['Step2 Triage Nurse to ER Admission'] < 0) |
                (records['Step3 ER Admission to DI Request'] < 0) |
                (records['Step4 DI Request to Actual DI'] < 0) |
                (records['Step5 Actual DI to OR Request'] < 0) |
                (records['Step6 OR Request to OR Checkin'] < 0) |
                (records['Step7 OR Checkin to OR Surgery'] < 0))]
print(f'Total records after filtering negative time interval: {records.shape[0]}\n')

# Get the average time interval of each step
steps = [
    'Step1 Checkin to Triage Nurse',
    'Step2 Triage Nurse to ER Admission',
    'Step3 ER Admission to DI Request',
    'Step4 DI Request to Actual DI',
    'Step5 Actual DI to OR Request',
    'Step6 OR Request to OR Checkin',
    'Step7 OR Checkin to OR Surgery'
]

step_avg_time = {}
for step in steps:
    average = records[step].mean()
    step_avg_time[step] = average
    print(f'Average waiting time of {step}: {average:.2f} minutes')

# Find the step with the longest average wait
long_wait_step = max(step_avg_time, key=step_avg_time.get)
avg_wait_time_min = step_avg_time[long_wait_step]
avg_wait_time_hrs = avg_wait_time_min / 60.0
print(f'\n\t{long_wait_step} has the longest wait with the average wait time of {avg_wait_time_min:.2f} minutes or {avg_wait_time_hrs:.2f} hours')

Total records before filtering negative time interval: 1225
Total records after filtering negative time interval: 1208

Average waiting time of Step1 Checkin to Triage Nurse: 22.15 minutes
Average waiting time of Step2 Triage Nurse to ER Admission: 30.65 minutes
Average waiting time of Step3 ER Admission to DI Request: 85.21 minutes
Average waiting time of Step4 DI Request to Actual DI: 80.89 minutes
Average waiting time of Step5 Actual DI to OR Request: 45.20 minutes
Average waiting time of Step6 OR Request to OR Checkin: 1100.42 minutes
Average waiting time of Step7 OR Checkin to OR Surgery: 33.91 minutes

	Step6 OR Request to OR Checkin has the longest wait with the average wait time of 1100.42 minutes or 18.34 hours
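
The loop above can also be expressed with a Series and `idxmax`, which returns the label of the largest value. The sketch below is a minimal illustration that reuses the seven averages printed above as literal inputs; `step_means`, `longest_step`, and `longest_hours` are names introduced here.

```python
import pandas as pd

# Average wait per step (minutes), copied from the printed output above.
step_means = pd.Series({
    "Step1 Checkin to Triage Nurse": 22.15,
    "Step2 Triage Nurse to ER Admission": 30.65,
    "Step3 ER Admission to DI Request": 85.21,
    "Step4 DI Request to Actual DI": 80.89,
    "Step5 Actual DI to OR Request": 45.20,
    "Step6 OR Request to OR Checkin": 1100.42,
    "Step7 OR Checkin to OR Surgery": 33.91,
})

# idxmax gives the label of the step with the largest average wait.
longest_step = step_means.idxmax()
longest_hours = step_means.max() / 60.0
print(f"{longest_step}: {step_means.max():.2f} minutes or {longest_hours:.2f} hours")
```

On the real data, `records[steps].mean()` would produce the same kind of Series directly.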


Relative to all known targets for timeliness (the above and OR Booking Status), which parts of the process have the highest percentage of patients missing their target? (2 Marks)

**Answer:**
From the summary table below, we can see that, among the individual steps, the wait for the **Triage Nurse Visit** has the highest percentage of unmet targets, at **46.9%**.

In [32]:
# Get the target and actual time
records['Expected ER Triage Nurse Wait Time(mins)'] = 20.0
records['Expected ER Admission Wait Time(mins)'] = 60.0

# Get the target and actual time for OR Surgery
records['Expected OR Surgery Wait Time(mins)'] = records['Pt Priority'].str.extract('^.+-([0-6]+)H').astype(int)[0] * 60.0
records['Actual OR Surgery Wait Time(mins)'] = ((records['Actual OR Surgery'] - records['OR Booking Req DT/Tm']).dt.total_seconds() / 60.0)

# Get DI to OR Booking time
records['DI Request to OR Booking(mins) '] = ((records['OR Booking Req DT/Tm'] - records['DI Req - Time']).dt.total_seconds() / 60.0)

# Expected overall time: ER admission target + OR surgery target
records['Expected Overall Process(mins)'] = records['Expected ER Admission Wait Time(mins)'] + records['Expected OR Surgery Wait Time(mins)']

# Actual overall time: check-in at admitting to surgery
records['Actual Overall Process(mins)'] = ((records['Actual OR Surgery'] - records['Patient Admitting - Check In']).dt.total_seconds() / 60.0)

# Create new table that has the summarize time interval
timelines = records[[
    'HCID',
    'Proc Descr Mod',
    'Step1 Checkin to Triage Nurse',
    'Actual ER Triage Nurse Wait Time(mins)',
    'Expected ER Triage Nurse Wait Time(mins)',
    'Step2 Triage Nurse to ER Admission',
    'Actual ER Admission Wait Time(mins)',
    'Expected ER Admission Wait Time(mins)',
    'Step3 ER Admission to DI Request',
    'Step4 DI Request to Actual DI',
    'Step5 Actual DI to OR Request',
    'DI Request to OR Booking(mins) ',
    'Step6 OR Request to OR Checkin',
    'Step7 OR Checkin to OR Surgery',
    'Actual OR Surgery Wait Time(mins)',
    'Expected OR Surgery Wait Time(mins)',
    'Actual Overall Process(mins)',
    'Expected Overall Process(mins)'
]].copy()  # explicit copy so that adding columns below does not raise SettingWithCopyWarning

timelines['ER Triage Nurse Target Met'] = timelines['Actual ER Triage Nurse Wait Time(mins)'] <= timelines['Expected ER Triage Nurse Wait Time(mins)']
timelines['ER Admission Target Met'] = timelines['Actual ER Admission Wait Time(mins)'] <= timelines['Expected ER Admission Wait Time(mins)']
timelines['OR Surgery Target Met'] = timelines['Actual OR Surgery Wait Time(mins)'] <= timelines['Expected OR Surgery Wait Time(mins)']
timelines['Overall Target Met'] = timelines['Actual Overall Process(mins)'] <= timelines['Expected Overall Process(mins)']

triage_nurse = timelines['ER Triage Nurse Target Met'].value_counts()
er_admission = timelines['ER Admission Target Met'].value_counts()
or_surgery = timelines['OR Surgery Target Met'].value_counts()
overall = timelines['Overall Target Met'].value_counts()
count_tbl = pd.concat([triage_nurse, er_admission, or_surgery, overall], axis=1)

triage_nurse = timelines['ER Triage Nurse Target Met'].value_counts(normalize=True).mul(100).round(1).astype(str)+'%'
er_admission = timelines['ER Admission Target Met'].value_counts(normalize=True).mul(100).round(1).astype(str)+'%'
or_surgery = timelines['OR Surgery Target Met'].value_counts(normalize=True).mul(100.00).round(1).astype(str)+'%'
overall = timelines['Overall Target Met'].value_counts(normalize=True).mul(100).round(1).astype(str)+'%'
percent_tbl = pd.concat([triage_nurse, er_admission, or_surgery, overall], axis=1)

display('Total Count of each step processing time status:', count_tbl)
display('Total Percentage of each step processing time status:', percent_tbl)


'Total Count of each step processing time status:'

| | ER Triage Nurse Target Met | ER Admission Target Met | OR Surgery Target Met | Overall Target Met |
|---|---|---|---|---|
| True | 641 | 861 | 753 | 550 |
| False | 567 | 347 | 455 | 658 |


'Total Percentage of each step processing time status:'

| | ER Triage Nurse Target Met | ER Admission Target Met | OR Surgery Target Met | Overall Target Met |
|---|---|---|---|---|
| True | 53.1% | 71.3% | 62.3% | 45.5% |
| False | 46.9% | 28.7% | 37.7% | 54.5% |
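
The percentages can be recomputed from the count table, which is a useful cross-check on the answer. The sketch below copies the counts as literal inputs; the `met`/`missed` row labels and the `worst_step` name are assumptions introduced for illustration. The overall column is dropped before ranking because it describes the whole journey rather than one part of it.

```python
import pandas as pd

# Counts copied from the count table above; row labels renamed for readability.
counts = pd.DataFrame(
    {
        "ER Triage Nurse Target Met": [641, 567],
        "ER Admission Target Met": [861, 347],
        "OR Surgery Target Met": [753, 455],
        "Overall Target Met": [550, 658],
    },
    index=["met", "missed"],
)

# Share of patients missing each target, as a percentage of all patients.
pct_missed = (counts.loc["missed"] / counts.sum() * 100).round(1)
print(pct_missed)

# Rank only the individual parts of the process, not the overall journey.
worst_step = pct_missed.drop("Overall Target Met").idxmax()
print(f"Highest share of missed targets: {worst_step}")
```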


* For both types of surgery, does visit to Diagnostic Imaging add a significant amount of time to the overall process?

**Answer:**

Based on the summary table below, the time from DI request to OR booking request ('Step4 DI Request to Actual DI' + 'Step5 Actual DI to OR Request') accounts for approximately 5.74% of the total process duration (3.68% + 2.06%). In absolute terms that is about 126 minutes on average (80.89 + 45.20) for patients with DI timestamps, so Diagnostic Imaging adds a noticeable amount of time to the process even though its share is small next to Step6.
Optimizing the Diagnostic Imaging phase could therefore shorten the patient journey. These figures are pooled across both surgery types rather than broken down by procedure.
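
To answer the question per surgery type, one option is to compare the overall journey time of patients with and without a DI visit inside each procedure group. The sketch below is only an illustration on invented toy numbers, not results from the real data; the `di_mins`, `overall_mins`, and `di_visit` columns are hypothetical, and a missing `di_mins` value is assumed to mean the patient had no imaging visit.

```python
import pandas as pd

# Invented toy rows; a missing di_mins value marks a patient with no DI visit.
toy = pd.DataFrame({
    "Proc Descr Mod": ["Laparoscopy Appendectomy"] * 3
    + ["Laparoscopy Cholecystectomy"] * 3,
    "di_mins": [120.0, None, 90.0, 150.0, 110.0, None],
    "overall_mins": [600.0, 450.0, 540.0, 1800.0, 1700.0, 1500.0],
})

# Label each patient by whether a DI visit was recorded.
toy["di_visit"] = toy["di_mins"].notna().map({True: "with_di", False: "without_di"})

# Mean overall journey per procedure, split by DI visit, one column per group.
by_type = (
    toy.groupby(["Proc Descr Mod", "di_visit"])["overall_mins"]
    .mean()
    .unstack("di_visit")
)

# Extra minutes associated with a DI visit, per procedure.
by_type["difference"] = by_type["with_di"] - by_type["without_di"]
print(by_type)
```

A difference of this kind shows association only; patients sent to imaging may differ in other ways, such as priority.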


In [33]:
cols = [
    'Step1 Checkin to Triage Nurse',
    'Step2 Triage Nurse to ER Admission',
    'Step3 ER Admission to DI Request',
    'Step4 DI Request to Actual DI',
    'Step5 Actual DI to OR Request',
    'Step6 OR Request to OR Checkin',
    'Step7 OR Checkin to OR Surgery'
]
steps = timelines[cols].sum().reset_index()
steps.rename({'index': 'Steps', 0: 'Total Time (mins)'}, axis=1, inplace=True)
# Get overall process time
overall_time = steps['Total Time (mins)'].sum()
print(f'Total Overall time: {overall_time:.2f} minutes')
# Get percentage of process time
steps = steps.set_index('Steps')
percent = (steps.div(steps.sum(axis=0), axis=1) * 100).reset_index()
steps.reset_index(inplace=True)
steps['Total Time (%)'] = percent['Total Time (mins)']
display(steps)

Total Overall time: 1586628.07 minutes


| | Steps | Total Time (mins) | Total Time (%) |
|---|---|---|---|
| 0 | Step1 Checkin to Triage Nurse | 26759.0 | 1.686533 |
| 1 | Step2 Triage Nurse to ER Admission | 37030.0 | 2.33388 |
| 2 | Step3 ER Admission to DI Request | 61523.94 | 3.877654 |
| 3 | Step4 DI Request to Actual DI | 58401.3 | 3.680844 |
| 4 | Step5 Actual DI to OR Request | 32635.82 | 2.05693 |
| 5 | Step6 OR Request to OR Checkin | 1329309.0 | 83.782017 |
| 6 | Step7 OR Checkin to OR Surgery | 40969.0 | 2.582143 |


### Using aggregated data, what insights can we gain?

**Answer:**

* The current wait time for Triage Nurse visits falls well short of our target.
With 46.9% of cases missing the 20-minute target, immediate action is needed.
Of particular concern is the maximum wait time of 80 minutes, which is four times the 20-minute target.
This gap raises questions about possible resource shortages or inefficient nurse shift scheduling.

* The step from OR booking request to OR check-in has the longest process time, averaging about 1,100 minutes (18.3 hours).
However, 62.3% of cases still meet their OR wait-time target, which suggests that improvement is achievable.
Appendectomy patients make up nearly half of the cases (565 of 1,208), and most of them (544) carry a priority target of 360 minutes, or 6 hours.

* On a positive note, ER Admission is performing comparatively well: the triage-to-admission step averages about 30 minutes.
With 71.3% of patients admitted within the 60-minute target measured from check-in, this shows that efficiency can be achieved within our system.
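
One caution when reading the averages above: for Step6 the standard deviation (about 1,178 minutes) exceeds the mean, so a few very long waits can pull the average up. Reporting the median alongside the mean makes that visible. The sketch below uses invented toy waits purely to illustrate the effect; the values are not taken from the data.

```python
import statistics

# Invented toy waits (minutes): four moderate values and one extreme outlier.
waits = [15, 120, 300, 360, 6085]

mean_wait = statistics.mean(waits)
median_wait = statistics.median(waits)

# The outlier inflates the mean, while the median stays near the typical case.
print(f"Mean: {mean_wait:.0f} minutes, median: {median_wait:.0f} minutes")
```

On the real data, adding `'median'` to the list passed to `agg` in the summary cell would give the same comparison for every step.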

In [34]:
summary = timelines[cols].agg(['min', 'max', 'mean', 'std', 'sum'])
summary = summary.transpose().reset_index()
summary.rename({
                'index': 'Steps',
                'min': 'Minimum Time (mins)',
                'max': 'Maximum Time (mins)',
                'mean': 'Average Time (mins)',
                'std': 'Standard Deviation Time (mins)',
                'sum': 'Total Time (mins)',
                }, axis=1, inplace=True)
summary['Total Time (%)'] = percent['Total Time (mins)']
print('\nSummary of processing time for each step:')
display(summary)
print('\nTotal Percentage of each step processing time status:')
display(percent_tbl)
priority = records[['Proc Descr Mod', 'Expected OR Surgery Wait Time(mins)']]
print('\nTotal Patients per Procedure:')
display(priority.groupby(['Proc Descr Mod', 'Expected OR Surgery Wait Time(mins)']).value_counts())


Summary of processing time for each step:


| | Steps | Minimum Time (mins) | Maximum Time (mins) | Average Time (mins) | Standard Deviation Time (mins) | Total Time (mins) | Total Time (%) |
|---|---|---|---|---|---|---|---|
| 0 | Step1 Checkin to Triage Nurse | 0.0 | 80.0 | 22.15149 | 15.473292 | 26759.0 | 1.686533 |
| 1 | Step2 Triage Nurse to ER Admission | 0.0 | 119.0 | 30.653974 | 17.049841 | 37030.0 | 2.33388 |
| 2 | Step3 ER Admission to DI Request | 0.992133 | 234.997517 | 85.213217 | 41.913733 | 61523.94 | 3.877654 |
| 3 | Step4 DI Request to Actual DI | 0.53065 | 226.311267 | 80.88823 | 58.588332 | 58401.3 | 3.680844 |
| 4 | Step5 Actual DI to OR Request | 0.47835 | 119.883783 | 45.201969 | 28.914303 | 32635.82 | 2.05693 |
| 5 | Step6 OR Request to OR Checkin | 15.0 | 6085.0 | 1100.421358 | 1177.653495 | 1329309.0 | 83.782017 |
| 6 | Step7 OR Checkin to OR Surgery | 0.0 | 189.0 | 33.914735 | 18.26066 | 40969.0 | 2.582143 |



Total Percentage of each step processing time status:


| | ER Triage Nurse Target Met | ER Admission Target Met | OR Surgery Target Met | Overall Target Met |
|---|---|---|---|---|
| True | 53.1% | 71.3% | 62.3% | 45.5% |
| False | 46.9% | 28.7% | 37.7% | 54.5% |



Total Patients per Procedure:


Proc Descr Mod               Expected OR Surgery Wait Time(mins)
Laparoscopy Appendectomy     60.0                                     1
                             120.0                                    5
                             360.0                                  544
                             720.0                                    4
                             1440.0                                   7
                             2160.0                                   4
Laparoscopy Cholecystectomy  120.0                                    4
                             360.0                                    4
                             720.0                                   15
                             1440.0                                 226
                             2160.0                                 394
dtype: int64