This Jupyter worksheet gives a quick overview of some statistics on all our patients and documents how the data were retrieved (with references to the fields in the Castor forms).

# Get all data from Castor database
Note that you need export rights for every individual center.

In [1]:
import covid19_import
study_data,study_struct,reports_data,reports_struct,complications_struct,optiongroups_structure = covid19_import.import_data()

Http Error: 500 Server Error: Internal Server Error for url: https://data.castoredc.com/api/study/2B1B90BA-69B8-48F5-BF2E-640284CB8709/export/data


NameError: error with api request (/study/2B1B90BA-69B8-48F5-BF2E-640284CB8709/export/data): 

## Export all variables to Excel

In [2]:
import pandas
target_excel = '/Users/wouterpotters/Desktop/covid19_variables.xlsx'
writer = pandas.ExcelWriter(target_excel, engine='xlsxwriter')

readme = 'This Excel workbook contains an overview of the variables that are used in the Castor EDC database for the COVID-19 project. \nThere are three tabs: \n(1) Admission variables: entered once and updated incidentally. \n(2) Daily reports: created once per day. \n(3) Complications: filled in as they arise.'
readme = pandas.DataFrame(readme.split('\n'))

# Write each dataframe to a different worksheet.
readme.to_excel(writer, sheet_name='README',index=False)
study_struct.to_excel(writer, sheet_name='AdmissionVariables',index=False)
reports_struct.to_excel(writer, sheet_name='DailyUpdateVariables',index=False)
complications_struct.to_excel(writer, sheet_name='ComplicationsVariables',index=False)
optiongroups_structure.to_excel(writer, sheet_name='OptionGroups',index=False)

writer.close() # save and close the Excel file (writer.save() was removed in pandas 2.0)

# Patient counts
## Total number of patients

In [3]:
print('Total number of patients (with any data) is: '+str(len(study_data)))

# HOSPITAL ADMISSION	ONSET & ADMISSION	admission_dt	Admission date at this facility:
has_admission_date = study_data['admission_dt'].notna().tolist()
print('Total number of patients with admission date: '+str(len(study_data[has_admission_date])))

# HOSPITAL ADMISSION	ONSET & ADMISSION	facility_transfer	Transfer from other facility?
# YES-facility is a study site	1
# YES-facility is not a study site	2
# No	3
# HOSPITAL ADMISSION	ONSET & ADMISSION	admission_facility_dt	Admission date at transfer facility
is_transferred = [(x == '1' or x == '2') for x in study_data['facility_transfer']]
has_transfer_admission_date = study_data['admission_facility_dt'].notna().tolist()
# count the original admission date only for patients who were actually transferred
n_transferred = sum(is_transferred)
n_transferred_with_date = sum(t and d for t, d in zip(is_transferred, has_transfer_admission_date))
print('  > Transferred from other center: '+str(n_transferred)+' (original admission date available in '+str(n_transferred_with_date)+'/'+str(n_transferred)+')')

# TREATMENT	TREATMENTS during admission	Admission_dt_icu_1	Admission date ICU
has_ICU_admission_date = study_data['Admission_dt_icu_1'].notna().tolist()
# TREATMENT	TREATMENTS during admission	Discharge_dt_icu_1	Discharge date ICU
print('  > ICU admissions: '+str(len(study_data[has_ICU_admission_date])))
has_ICU_discharge_date = study_data['Discharge_dt_icu_1'].notna().tolist()
# count patients with BOTH dates; `and` between two lists would only return the second list
print('  > Discharged ICU admissions: '+str(sum(a and b for a, b in zip(has_ICU_admission_date, has_ICU_discharge_date))))


Total number of patients (with any data) is: 232
Total number of patients with admission date: 224
  > Transferred from other center: 36 (original admission date available in 35/36)
  > ICU admissions: 50
  > Discharged ICU admissions: 15
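
When two inclusion masks are stored as plain Python lists, as in the cell above, they have to be combined element by element. The following is a minimal sketch with made-up masks for five hypothetical patients (not study data) that shows the difference between `and` on whole lists and an element-wise combination:

```python
# Made-up masks for five hypothetical patients (not real study data)
has_icu_admission = [True, True, False, True, False]
has_icu_discharge = [True, False, True, False, False]

# `and` between two non-empty lists evaluates to the second list unchanged
second_list_only = has_icu_admission and has_icu_discharge

# element-wise combination: True only where both masks are True
both_dates = [a and b for a, b in zip(has_icu_admission, has_icu_discharge)]

print(sum(second_list_only))  # counts every patient with a discharge date
print(sum(both_dates))        # counts patients with both dates
```

With pandas boolean Series instead of lists, the same element-wise combination is written with the `&` operator.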


## Duration of hospital stay
1) Select all patients who were admitted to the hospital AND were in the ICU at some point.



In [4]:
# HOSPITAL ADMISSION	ONSET & ADMISSION	admission_dt	Admission date at this facility:
has_admission_date = study_data['admission_dt'].notna().tolist()

# TREATMENT	TREATMENTS during admission	Admission_dt_icu_1	Admission date ICU
has_ICU_admission_date = study_data['Admission_dt_icu_1'].notna().tolist()


2) Calculate time (in days) from admission to ICU admission



In [21]:
import numpy
# convert strings to dates; the masks are combined element-wise (`and` between lists only returns the last list)
admitted_to_ICU = numpy.logical_and(has_admission_date, has_ICU_admission_date)
discharged_from_ICU = numpy.logical_and(admitted_to_ICU, has_ICU_discharge_date)
patient_ICU_dates_admissions = study_data.loc[admitted_to_ICU, ['admission_dt','Admission_dt_icu_1']].apply(pandas.to_datetime)
patient_ICU_dates_discharge_ICU = study_data.loc[discharged_from_ICU, ['Admission_dt_icu_1','Discharge_dt_icu_1']].apply(pandas.to_datetime)

time_from_hospital_admission_to_ICU = patient_ICU_dates_admissions['Admission_dt_icu_1'] - patient_ICU_dates_admissions['admission_dt']
time_from_ICU_admission_to_ICU_discharge = patient_ICU_dates_discharge_ICU['Discharge_dt_icu_1'] - patient_ICU_dates_discharge_ICU['Admission_dt_icu_1']

print('These patients have a negative time difference between admission and ICU admission:')
print(time_from_hospital_admission_to_ICU[[x.days < 0 for x in time_from_hospital_admission_to_ICU]])

print('These patients have a negative time difference between ICU admission and ICU discharge:')
print(time_from_ICU_admission_to_ICU_discharge[[x.days < 0 for x in time_from_ICU_admission_to_ICU_discharge]])

# average stay on ICU
valid_ICU_durations = time_from_ICU_admission_to_ICU_discharge[[x.days > 0 for x in time_from_ICU_admission_to_ICU_discharge]]
print('Mean +/- std ICU stay (n=' + str(len(valid_ICU_durations)) + ' discharged patients): '+ str(numpy.mean(valid_ICU_durations)) + ' +/- ' + str(numpy.std(valid_ICU_durations)))


These patients have a negative time difference between admission and ICU admission:
Record ID
120008     -3 days
130023    -29 days
140010   -173 days
dtype: timedelta64[ns]
These patients have a negative time difference between ICU admission and ICU discharge:
Record ID
130002     -1 days
140012   -263 days
dtype: timedelta64[ns]
Mean +/- std ICU stay (n=11 discharged patients): 3 days 10:54:32.727272 +/- 2 days 00:23:42.246209
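
The negative differences listed above point to data-entry errors in the date fields. As a minimal sketch of the same check on a small made-up table (the records `A1` to `A4` and their dates are invented; only the column names follow the Castor fields used above), the delay can be computed in whole days and the implausible records listed by Record ID:

```python
import pandas as pd

# Invented example records; only the column names match the Castor fields
records = pd.DataFrame({
    "Record ID": ["A1", "A2", "A3", "A4"],
    "admission_dt": ["2020-03-01", "2020-03-05", "2020-03-10", None],
    "Admission_dt_icu_1": ["2020-03-03", "2020-03-04", None, "2020-03-12"],
}).set_index("Record ID")

# convert the date strings; missing values become NaT
admission = pd.to_datetime(records["admission_dt"])
icu_admission = pd.to_datetime(records["Admission_dt_icu_1"])

# keep only patients for whom both dates are known
has_both = admission.notna() & icu_admission.notna()
days_to_icu = (icu_admission[has_both] - admission[has_both]).dt.days

# an ICU admission before the hospital admission is implausible
implausible = days_to_icu[days_to_icu < 0]
print(implausible)
```

Listing the Record IDs rather than silently dropping these rows makes it possible to report them back to the entering center for correction.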



3) Calculate time on ICU (with active and discharged patients)


In [27]:
from datetime import datetime
import numpy
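# combine the masks element-wise (`and` between lists only returns the last list)
all_ICU_patients = study_data.loc[numpy.logical_and(has_admission_date, has_ICU_admission_date), ['Admission_dt_icu_1','Discharge_dt_icu_1']].apply(pandas.to_datetime)
# patients still on the ICU have no discharge date; count their stay up to now
all_ICU_patients['Discharge_dt_icu_1'] = all_ICU_patients['Discharge_dt_icu_1'].fillna(datetime.now())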

time_on_ICU = all_ICU_patients['Discharge_dt_icu_1']-all_ICU_patients['Admission_dt_icu_1']
valid_ICU_durations = time_on_ICU[[x.days >= 0 for x in time_on_ICU]]

# ICU stays of < 0 days were excluded above
print('Mean +/- std ICU stay (n=' + str(len(valid_ICU_durations)) + ' discharged and nondischarged patients): '+ str(numpy.mean(valid_ICU_durations)) + ' +/- ' + str(numpy.std(valid_ICU_durations)))


Mean +/- std ICU stay (n=48 discharged and nondischarged patients): 6 days 04:44:30.597812 +/- 5 days 06:22:20.315115
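
For patients who are still on the ICU, the stay in this step is counted up to the moment the cell is run, so the result changes with every run. The sketch below uses invented records (`B1` to `B3`) and a fixed, hypothetical reference date instead of `datetime.now()` to make the calculation reproducible; it also keeps a flag for which stays are still ongoing:

```python
import pandas as pd

# Invented ICU records; a missing discharge date means the patient is still on the ICU
icu = pd.DataFrame({
    "Record ID": ["B1", "B2", "B3"],
    "Admission_dt_icu_1": ["2020-03-20", "2020-03-22", "2020-03-25"],
    "Discharge_dt_icu_1": ["2020-03-24", None, None],
}).set_index("Record ID")

admission = pd.to_datetime(icu["Admission_dt_icu_1"])
discharge = pd.to_datetime(icu["Discharge_dt_icu_1"])

# hypothetical fixed reference date (an assumption, used here instead of datetime.now())
reference_time = pd.Timestamp("2020-03-28")

still_on_icu = discharge.isna()
end_of_stay = discharge.fillna(reference_time)  # only the discharge date is filled in
days_on_icu = (end_of_stay - admission).dt.days

print(pd.DataFrame({"days_on_icu": days_on_icu, "still_on_icu": still_on_icu}))
```

Durations of ongoing stays are lower bounds on the final length of stay, so a mean that mixes them with completed stays should be read with that in mind.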


## Outcome so far
1) All discharged patients

2) Outcome of all discharged patients


In [29]:
# Statistics still to be implemented:
# - current number of COVID patients in the rest of the hospital (ZH)
# - probability that an average COVID ward / MC patient has to go to the ICU within < 24 hours, < 48 hours, < 72 hours
# - probability that an average COVID ward patient can be discharged within < 24 hours, < 48 hours, < 72 hours
# - probability that an average ICU patient has to stay on the ICU for another 24, 48 and 72 hours
# - duration of an average ward / MC admission
# - expected mean remaining duration of a ward / MC admission
# - expected mean remaining duration of an ICU admission
# - percentage of admitted patients with an expected good outcome (home), fair outcome (rehabilitation), moderate outcome (nursing home), poor outcome (palliative care, death)
# - outcome so far of all treated patients (home, rehabilitation, nursing home, palliative care, other center, unknown)
# - number of patients treated so far
# - code indicating the capacity status of the hospital (ZH), based on the beds available for Corona: green, orange, red, black
# - (number of patients receiving optimal treatment; controversial; not in version 1.0)
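
As a starting point for the discharge probabilities in the list above, a naive estimate is the fraction of completed stays that ended within each time horizon. The sketch below is an assumption about how this could be done, not an implemented part of this worksheet: the function name `fraction_within` and the durations are invented, and patients who are still admitted are ignored, which a proper time-to-event analysis would have to account for.

```python
def fraction_within(durations_hours, horizons=(24, 48, 72)):
    """Fraction of completed stays that lasted at most each horizon (in hours)."""
    if not durations_hours:
        raise ValueError("at least one completed stay is needed")
    n = len(durations_hours)
    return {h: sum(d <= h for d in durations_hours) / n for h in horizons}


# invented durations (in hours) of five completed ward stays
ward_stays_hours = [10, 30, 50, 70, 100]

fractions = fraction_within(ward_stays_hours)
for horizon, fraction in fractions.items():
    print('Discharged within ' + str(horizon) + ' hours: ' + str(round(100 * fraction)) + '%')
```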