### I. Research Questions:
1. Which psychiatric diagnoses are associated with higher ICU readmission rates? -> Do patients with mood disorders (296.x), psychotic disorders (295.x), or substance use disorders (303-305) have higher 30-day readmission rates than patients without these diagnoses?
2. Does discontinuation or poor management of psychotropic medications at discharge predict more readmissions? -> Is discontinuing psychotropic medication at discharge associated with increased readmission risk?
3. Can we combine patient-level features to predict ICU readmission risk? -> Which combination of demographics, ICU stay characteristics, severity scores, and medication changes is most predictive of readmission?

In [1]:
import pandas as pd

# Step 1: Load datasets
admissions = pd.read_csv('./data/hosp/admissions.csv')
patients = pd.read_csv('./data/hosp/patients.csv')
diagnoses = pd.read_csv('./data/hosp/diagnoses_icd.csv')
icustays = pd.read_csv('./data/icu/icustays.csv')
prescriptions = pd.read_csv('./data/hosp/prescriptions.csv')

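# "Notes" in these printouts is the number of rows in diagnoses_icd (diagnosis records), not clinical notes
print("Initial admissions:", admissions.shape[0], "Patients:", patients['subject_id'].nunique(), "Notes:", diagnoses.shape[0])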

# Step 2: Exclude patients under 18
patients_adult = patients[patients['anchor_age'] >= 18]
admissions = admissions[admissions['subject_id'].isin(patients_adult['subject_id'])].copy()
diagnoses = diagnoses[diagnoses['hadm_id'].isin(admissions['hadm_id'])].copy()

# Recalculate patients and notes counts
print("After excluding <18:", admissions.shape[0],
      "Patients:", admissions['subject_id'].nunique(),
      "Notes:", diagnoses.shape[0])

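# Step 3: Identify psychiatric conditions by ICD code prefix
# ICD-9: 295 (psychotic), 296 (mood), 303-305 (substance use)
# ICD-10: F10-F19 (substance use), F20-F29 (psychotic), F30-F39 (mood)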
psych_icd_codes = ('296', '295', '303', '304', '305', 'F20', 'F21', 'F22', 'F23', 'F24', 'F25',
                   'F28', 'F29', 'F30', 'F31', 'F32', 'F33', 'F34', 'F39', 'F10', 'F11', 'F12', 
                   'F13', 'F14', 'F15', 'F16', 'F17', 'F18', 'F19')

diagnoses['psych_flag'] = diagnoses['icd_code'].astype(str).str.startswith(psych_icd_codes)
psych_admissions = diagnoses[diagnoses['psych_flag']]['hadm_id'].unique()
admissions = admissions[admissions['hadm_id'].isin(psych_admissions)].copy()
diagnoses = diagnoses[diagnoses['hadm_id'].isin(admissions['hadm_id'])].copy()
print("Psych admissions:", admissions.shape[0], "Patients:", admissions['subject_id'].nunique(), "Notes:", diagnoses.shape[0])

# Sort admissions
admissions = admissions.sort_values(['subject_id', 'admittime']).copy()



Initial admissions: 431088 Patients: 299777 Notes: 4752265
After excluding <18: 431088 Patients: 180747 Notes: 4752265
Psych admissions: 117189 Patients: 55780 Notes: 1401206
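
Research question 1 compares mood, psychotic, and substance use disorders, whereas `psych_flag` pools every prefix into a single boolean. Below is a minimal sketch of how the same prefixes might be split into the three groups. The helper `psych_group` and the toy frame are hypothetical, and the sketch assumes `diagnoses_icd` has an `icd_version` column (9 or 10) so that ICD-9 and ICD-10 prefixes are only matched against codes of their own version.

```python
import pandas as pd

# Prefix-to-group maps built from the prefixes used in Step 3 (grouping is an illustration)
ICD9_GROUPS = {"295": "psychotic", "296": "mood",
               "303": "substance", "304": "substance", "305": "substance"}
ICD10_GROUPS = {}
ICD10_GROUPS.update({f"F{n}": "substance" for n in range(10, 20)})
ICD10_GROUPS.update({f"F{n}": "psychotic" for n in (20, 21, 22, 23, 24, 25, 28, 29)})
ICD10_GROUPS.update({f"F{n}": "mood" for n in (30, 31, 32, 33, 34, 39)})


def psych_group(icd_code, icd_version):
    """Hypothetical helper: return 'mood', 'psychotic', 'substance', or None."""
    prefix = str(icd_code).strip().upper()[:3]
    if icd_version == 9:
        return ICD9_GROUPS.get(prefix)
    if icd_version == 10:
        return ICD10_GROUPS.get(prefix)
    return None


# Toy rows standing in for diagnoses_icd (made-up values)
toy = pd.DataFrame({
    "hadm_id": [1, 1, 2, 3, 4],
    "icd_code": ["29620", "4019", "F209", "F1010", "I10"],
    "icd_version": [9, 9, 10, 10, 10],
})
toy["psych_group"] = [psych_group(c, v)
                      for c, v in zip(toy["icd_code"], toy["icd_version"])]

# One row per admission and group, ready to join onto a readmission flag
groups_per_admission = toy.dropna(subset=["psych_group"])[["hadm_id", "psych_group"]].drop_duplicates()
print(groups_per_admission)
```

Keeping the group as a column, rather than a single flag, would let readmission rates be compared per group with a plain `groupby`.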


In [2]:
# Step 4: Define ICU readmissions
print("ICU admissions:", icustays.shape[0], "Patients:", icustays['subject_id'].nunique())
# Restrict ICU stays to the psychiatric admissions via an inner join on hadm_id
icu_admissions = icustays.merge(admissions[['hadm_id', 'dischtime']], on='hadm_id', how='inner')
icu_admissions = icu_admissions.sort_values(['subject_id', 'intime']).copy()

icu_admissions['next_admission'] = icu_admissions.groupby('subject_id')['intime'].shift(-1)
icu_admissions['dischtime'] = pd.to_datetime(icu_admissions['dischtime'])
icu_admissions['next_admission'] = pd.to_datetime(icu_admissions['next_admission'])

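# Flag stays followed by another ICU admission within 30 days of hospital discharge.
# Caveat: a later ICU stay within the same hospital admission yields a negative gap and is flagged as well.
icu_admissions['readmitted_30d'] = (icu_admissions['next_admission'] - icu_admissions['dischtime']).dt.days <= 30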
icu_admissions['readmitted_30d'] = icu_admissions['readmitted_30d'].fillna(False)
print("ICU Psych readmissions (30d):", icu_admissions['readmitted_30d'].sum(), 
    "Unique Patients:", icu_admissions[icu_admissions['readmitted_30d']]['subject_id'].nunique())


ICU admissions: 73141 Patients: 50934
ICU Psych readmissions (30d): 3013 Unique Patients: 2191
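
The flag in Step 4 compares the next ICU `intime` with the hospital `dischtime`, so a second ICU stay inside the same hospital admission produces a negative gap and still satisfies `<= 30`. The sketch below shows one possible stricter definition on made-up data: the next ICU stay must belong to a different `hadm_id` and start between 0 and 30 days after discharge. The toy frame and the column names `next_intime` and `next_hadm_id` are assumptions for illustration, not part of the analysis above.

```python
import pandas as pd

# Toy ICU stays (made-up values); patient 1 has two ICU stays within admission 10
toy = pd.DataFrame({
    "subject_id": [1, 1, 1, 2, 2],
    "hadm_id": [10, 10, 11, 20, 21],
    "intime": pd.to_datetime(["2150-01-01", "2150-01-05", "2150-01-25",
                              "2150-03-01", "2150-06-01"]),
    "dischtime": pd.to_datetime(["2150-01-10", "2150-01-10", "2150-02-01",
                                 "2150-03-05", "2150-06-10"]),
})
toy = toy.sort_values(["subject_id", "intime"]).reset_index(drop=True)

# Look ahead to the same patient's next ICU stay
by_patient = toy.groupby("subject_id")
toy["next_intime"] = by_patient["intime"].shift(-1)
toy["next_hadm_id"] = by_patient["hadm_id"].shift(-1)

gap = toy["next_intime"] - toy["dischtime"]

# Definition used in Step 4: any gap of at most 30 days, including negative gaps
naive = gap.dt.days <= 30

# Stricter variant: different hospital admission and a non-negative gap
toy["readmitted_30d"] = (
    toy["next_hadm_id"].notna()
    & (toy["next_hadm_id"] != toy["hadm_id"])
    & (gap >= pd.Timedelta(0))
    & (gap <= pd.Timedelta(days=30))
)

print("Step 4 definition:", int(naive.sum()), "Stricter variant:", int(toy["readmitted_30d"].sum()))
```

Whether within-admission ICU returns should count as readmissions depends on the research question; the point is only that the two definitions can give different counts.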


In [3]:
# Step 5: Medication discontinuation
psych_meds = ['haloperidol', 'risperidone', 'quetiapine', 'olanzapine', 'lorazepam', 'diazepam', 'alprazolam',
              'fluoxetine', 'sertraline', 'citalopram', 'escitalopram', 'paroxetine', 'venlafaxine', 'duloxetine',
              'bupropion', 'mirtazapine', 'valproate', 'lithium']

# Total prescriptions before filtering
print("Total prescriptions:", prescriptions.shape[0], "Patients:", prescriptions['subject_id'].nunique())

# Filter prescriptions to adult psychiatric admissions
prescriptions = prescriptions[prescriptions['hadm_id'].isin(admissions['hadm_id'])].copy()
print("Prescriptions for psych admissions:", prescriptions.shape[0], "Patients:", prescriptions['subject_id'].nunique())

prescriptions['drug'] = prescriptions['drug'].astype(str).str.lower()
# Exact match on the lowercased drug name; entries with extra text in the name will not match
prescriptions['psych_med_flag'] = prescriptions['drug'].isin(psych_meds)
psych_prescriptions = prescriptions[prescriptions['psych_med_flag']].copy()
print("Psych prescriptions:", psych_prescriptions.shape[0], "Patients:", psych_prescriptions['subject_id'].nunique())

# Latest psych medication stop time per admission (missing or unparseable stoptime becomes NaT)
psych_prescriptions['stoptime'] = pd.to_datetime(psych_prescriptions['stoptime'], errors='coerce')
last_prescriptions = psych_prescriptions.groupby('hadm_id')['stoptime'].max().reset_index()
last_prescriptions.rename(columns={'stoptime': 'last_psych_med_time'}, inplace=True)

# Keep only admissions with at least one psych medication order
admissions_with_meds = admissions[admissions['hadm_id'].isin(last_prescriptions['hadm_id'])].copy()
admissions_with_meds = admissions_with_meds.merge(last_prescriptions, on='hadm_id', how='left')
# Caveat: this flag is True only when every psych medication stoptime for the admission is missing;
# it does not compare the last stop time with dischtime.
admissions_with_meds['med_discontinued'] = admissions_with_meds['last_psych_med_time'].isnull()
discontinued_prescriptions = admissions_with_meds[admissions_with_meds['med_discontinued']]
print("Psych Medication discontinued admissions:", discontinued_prescriptions.shape[0])
print("Unique patients with psych medication discontinued:", discontinued_prescriptions['subject_id'].nunique())

Total prescriptions: 15399811 Patients: 158422
Prescriptions for psych admissions: 4233384 Patients: 45660
Psych prescriptions: 181314 Patients: 31764
Psych Medication discontinued admissions: 12
Unique patients with psych medication discontinued: 12
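
Only 12 admissions are flagged because `med_discontinued` is set when `last_psych_med_time` is null, which happens only if no usable `stoptime` exists for the admission. A hedged sketch of one alternative follows, on made-up data: drug names are matched by whole-word substring rather than exact equality, and an admission counts as discontinued when the last psych medication stopped more than 24 hours before discharge. The 24-hour threshold, the toy frames `rx` and `adm`, and the `stoptime_missing` column are assumptions chosen for illustration, not a validated clinical definition.

```python
import re
import pandas as pd

psych_meds = ["quetiapine", "lorazepam", "sertraline"]  # subset of the list in Step 5
# Whole-word, case-insensitive match so longer drug names still hit
pattern = r"\b(?:" + "|".join(map(re.escape, psych_meds)) + r")\b"

# Toy prescriptions and admissions (made-up values)
rx = pd.DataFrame({
    "hadm_id": [1, 1, 2, 3],
    "drug": ["Quetiapine Fumarate", "Heparin", "LORazepam", "Sertraline"],
    "stoptime": pd.to_datetime(["2150-01-03 08:00", "2150-01-09 08:00",
                                "2150-02-10 20:00", None]),
})
adm = pd.DataFrame({
    "hadm_id": [1, 2, 3],
    "dischtime": pd.to_datetime(["2150-01-10 12:00", "2150-02-11 10:00",
                                 "2150-03-01 09:00"]),
})

rx["psych_med_flag"] = rx["drug"].str.contains(pattern, case=False, regex=True, na=False)

# Latest psych medication stop time per admission
last_stop = (
    rx[rx["psych_med_flag"]]
    .groupby("hadm_id")["stoptime"].max()
    .rename("last_psych_med_time")
    .reset_index()
)

out = adm.merge(last_stop, on="hadm_id", how="inner")
gap = out["dischtime"] - out["last_psych_med_time"]

# Assumed threshold: last psych medication stopped more than 24 hours before discharge
out["med_discontinued"] = gap > pd.Timedelta(hours=24)
# Track missing stop times separately instead of treating them as discontinuation
out["stoptime_missing"] = out["last_psych_med_time"].isna()

print(out[["hadm_id", "med_discontinued", "stoptime_missing"]])
```

Keeping missing stop times in their own column avoids mixing data quality problems with the clinical event of interest.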


In [None]:
# Save the processed tables to CSV
admissions.to_csv('./data/admissions_processed.csv', index=False)
diagnoses.to_csv('./data/diagnoses_processed.csv', index=False)
icu_admissions.to_csv('./data/icu_admissions_processed.csv', index=False)
last_prescriptions.to_csv('./data/last_prescriptions_processed.csv', index=False)

After data cleanup: 117189 Patients: 55780 Notes: 1401206
Final dataset: 117189 Patients: 55780 Notes: 1401206
