# Estimating Time from Referral to Procurement

**Objective:** Predict the time interval between hospital referral and organ procurement.

**Features:**

- Patient Demographics: `Age`, `Gender`, `Race`
- Medical Details: `Cause_of_Death_UNOS`, `Mechanism_of_Death`, `Circumstances_of_Death`, `ABO_BloodType`, `ABO_Rh`, `HeightIn`, `WeightKg`
- Referral and Authorization Details: `Tissue_Referral`, `Eye_Referral`, `brain_death`, `approached`, `authorized`
- Timing Information: `time_brain_death`, `time_asystole`, `time_referred`, `time_approached`, `time_authorized`, `Referral_Year`, `Referral_DayofWeek`, `Procured_Year`
- Outcome: `outcome_heart`, `outcome_liver`, `outcome_kidney_left`, `outcome_kidney_right`, `outcome_lung_left`, `outcome_lung_right`, `outcome_intestine`, `outcome_pancreas`

**Target:** `time_procured` - `time_referred`
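
The subtraction above only works once both columns are real datetimes. `time_procured` loads as strings (dtype `object`), and this sketch assumes `time_referred` is stored the same way. The small `sample` frame and the `hours_to_procurement` column name are illustrative stand-ins, not part of the dataset.

```python
import pandas as pd

# Tiny stand-in frame; the timestamps are made up for illustration.
sample = pd.DataFrame({
    "time_referred": [
        "2032-11-05 08:30:00.000",
        "2032-11-06 12:00:00.000",
        "2032-11-07 09:15:00.000",
    ],
    "time_procured": [
        "2032-11-07 21:59:00.000",
        None,  # most referrals never reach procurement
        "2032-11-08 09:15:00.000",
    ],
})

# Parse the string columns; unparseable or missing values become NaT.
for col in ["time_referred", "time_procured"]:
    sample[col] = pd.to_datetime(sample[col], errors="coerce")

# Elapsed time in hours; rows without a procurement time become NaN.
sample["hours_to_procurement"] = (
    (sample["time_procured"] - sample["time_referred"]).dt.total_seconds() / 3600
)

# Only rows with a defined target can be used for training.
labeled = sample.dropna(subset=["hours_to_procurement"])
print(labeled[["time_referred", "time_procured", "hours_to_procurement"]])
```

Expressing the target in hours is a choice made here for readability; days or minutes would work equally well as long as the unit is kept consistent.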

**Model Type:** Regression (Linear Regression, Random Forest Regression, Gradient Boosting Regression)
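
One possible way to compare the three model families is to fit each on the same split and score them with a single metric. The sketch below uses scikit-learn and synthetic data, because the real feature matrix is not prepared at this point. The feature values, the coefficients, and the choice of mean absolute error are all assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded features and the target in hours.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 40 + 5 * X[:, 0] - 3 * X[:, 1] + rng.normal(scale=2.0, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each candidate on the same split and record its test error.
mae = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    mae[name] = mean_absolute_error(y_test, model.predict(X_test))

for name, score in mae.items():
    print(f"{name}: MAE = {score:.2f} hours")
```

On the real data, the categorical columns would need encoding and the missing values would need handling before any of these estimators can be fitted.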

In [1]:
import pandas as pd

df = pd.read_csv('data/referrals.csv')
df.head(5)



Unnamed: 0,OPO,PatientID,Age,Gender,Race,HospitalID,brain_death,Cause_of_Death_OPO,Cause_of_Death_UNOS,Mechanism_of_Death,...,Referral_Year,Procured_Year,outcome_heart,outcome_liver,outcome_kidney_left,outcome_kidney_right,outcome_lung_left,outcome_lung_right,outcome_intestine,outcome_pancreas
0,OPO1,OPO1_P320866,62.0,M,White / Caucasian,OPO1_H23456,False,,Head Trauma,,...,2018,,,,,,,,,
1,OPO1,OPO1_P549364,14.0,F,White / Caucasian,OPO1_H11908,False,,,,...,2021,,,,,,,,,
2,OPO1,OPO1_P536997,55.0,M,White / Caucasian,OPO1_H23111,False,,CVA/Stroke,ICH/Stroke,...,2015,,,,,,,,,
3,OPO1,OPO1_P463285,48.0,F,Black / African American,OPO1_H26589,False,,Anoxia,Cardiovascular,...,2019,,,,,,,,,
4,OPO1,OPO1_P284978,80.0,F,White / Caucasian,OPO1_H5832,False,,,,...,2018,,,,,,,,,


# Exploratory Data Analysis

In [2]:
df.shape

(133101, 38)

In [3]:
df.count()

OPO                       133101
PatientID                 133101
Age                       133017
Gender                    133040
Race                      133101
HospitalID                133101
brain_death               133101
Cause_of_Death_OPO         32396
Cause_of_Death_UNOS       103283
Mechanism_of_Death         98533
Circumstances_of_Death     98588
ABO_BloodType              18201
ABO_Rh                      8743
HeightIn                   98500
WeightKg                  123346
approached                133101
authorized                133101
procured                  133101
transplanted              133101
Tissue_Referral           133101
Eye_Referral              133101
time_asystole              89415
time_brain_death           11855
time_referred             133101
time_approached            16191
time_authorized            11665
time_procured               9502
Referral_DayofWeek        133101
Referral_Year             133101
Procured_Year               9543
outcome_he
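
The non-null counts above are easier to read as missing-value percentages. This sketch copies a few of the counts from the `df.count()` output, along with the 133101 rows reported by `df.shape`, and converts them. The selection of columns is arbitrary and only for illustration.

```python
import pandas as pd

n_rows = 133101  # from df.shape

# Non-null counts copied from the df.count() output.
non_null = pd.Series({
    "time_asystole": 89415,
    "time_brain_death": 11855,
    "time_approached": 16191,
    "time_authorized": 11665,
    "time_procured": 9502,
    "ABO_BloodType": 18201,
    "ABO_Rh": 8743,
})

# Share of rows where each column is missing, in percent, largest first.
missing_pct = ((1 - non_null / n_rows) * 100).round(1).sort_values(ascending=False)
print(missing_pct)
```

With the full frame loaded, `df.isna().mean() * 100` gives the same quantity for every column at once. Since `time_procured` is present in only 9502 rows, the regression target is defined for a small fraction of referrals.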

## Detect Outliers

In [8]:
df.loc[5, 'time_procured']

nan

In [6]:
df['time_procured'].describe()

count                        9502
unique                       9496
top       2032-11-07 21:59:00.000
freq                            2
Name: time_procured, dtype: object
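
Because `time_procured` has dtype `object`, `describe()` reports only count, unique, top, and freq, which says little about outliers. One common option, shown here as a sketch rather than the method this notebook settles on, is to apply the 1.5 × IQR rule to the referral-to-procurement duration once it is numeric. The `hours` values below are invented for illustration.

```python
import pandas as pd

# Made-up durations in hours; the last value is deliberately extreme.
hours = pd.Series(
    [20.0, 24.0, 30.0, 36.0, 41.0, 48.0, 55.0, 61.5, 70.0, 900.0],
    name="hours_to_procurement",
)

# Interquartile range and the conventional 1.5 * IQR fences.
q1, q3 = hours.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Flag values outside the fences; negative durations are also suspect,
# since procurement cannot precede referral.
is_outlier = (hours < lower) | (hours > upper) | (hours < 0)
print(hours[is_outlier])
```

Flagged rows are candidates for inspection, not automatic removal: a very long interval may be a data-entry error or a genuinely unusual case.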