![alt text](./Cerny_logo_1.jpg)

# Analysis of Cerny ventilation recordings

The data processed and analysed in this Notebook were collected by the **Neonatal Emergency and Transport Service of the Peter Cerny Foundation**, Budapest, Hungary.

**Author: Dr Gusztav Belteki**


## Further processing and analysis of ventilator parameters 

This notebook imports the preprocessed ventilator data from pickle archives and analyses all the ventilator parameter and alarm data obtained at a **0.5 Hz sampling rate** from the Fabian ventilators at the Cerny neonatal transport service. It exports descriptive statistics into Excel files and the further processed data as pickle archives.

Imported: **data_pars_1_150.pickle**, **data_pars_151_300.pickle**, **clin_df_1_300.pickle**, **Fabian_parameters.xlsx**

- Total: **246 cases** with ventilator data available (with >15 minutes recording time)
- Clinical data were not available for **4 cases** 
- Appropriate ventilator and clinical data are available for **242 cases**

Exported: dictionaries containing ventilation data as **data_pars_measurements_1_300.pickle**, **data_pars_settings_1_300.pickle**, **data_pars_alarms_1_300.pickle**

### Importing the necessary libraries and setting options

In [None]:
import IPython
import pandas as pd
import numpy as np
import scipy as sp
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn as sk

import os
import sys
import re
import pickle

from scipy import stats
from pandas import Series, DataFrame
from datetime import datetime, timedelta

%matplotlib inline
matplotlib.style.use('classic')
matplotlib.rcParams['figure.facecolor'] = 'w'

pd.set_option('display.max_rows', 100)
pd.set_option('display.max_columns', 100)
# pd.set_option('mode.chained_assignment', None) 

In [None]:
print("Python version: {}".format(sys.version))
print("pandas version: {}".format(pd.__version__))
print("matplotlib version: {}".format(matplotlib.__version__))
print("NumPy version: {}".format(np.__version__))
print("SciPy version: {}".format(sp.__version__))
print("IPython version: {}".format(IPython.__version__))
print("scikit-learn version: {}".format(sk.__version__))

### List and set the working directory and the directory to write out data

In [None]:
# Topic of the Notebook which will also be the name of the subfolder containing results
TOPIC = 'fabian'

# Name of the external hard drive
DRIVE = 'GUSZTI'

# Directory containing clinical and blood gas data
CWD = '/Users/guszti/ventilation_fabian'

DIR_WRITE = '%s/%s/%s' % (CWD, 'Analyses', 'analysis_1_300')
if not os.path.isdir(DIR_WRITE):
    os.makedirs(DIR_WRITE)

# Images and raw data will be written on an external hard drive
DATA_DUMP = '/Volumes/%s/data_dump/%s' % (DRIVE, TOPIC)
if not os.path.isdir(DATA_DUMP):
    os.makedirs(DATA_DUMP)

In [None]:
os.chdir(CWD)
os.getcwd()

In [None]:
DIR_WRITE, DATA_DUMP

### Import pickle archives

In [None]:
# Ventilator data

with open('%s/%s.pickle' % (DATA_DUMP, 'data_pars_1_150'), 'rb') as handle:
    data_pars_1_150 = pickle.load(handle)

with open('%s/%s.pickle' % (DATA_DUMP, 'data_pars_151_300'), 'rb') as handle:
    data_pars_151_300 = pickle.load(handle)
    
data_pars = {**data_pars_1_150, **data_pars_151_300}
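
Merging with `{**a, **b}` silently keeps the value from the second dictionary when a key occurs in both. The sketch below shows one way to check for such overlaps before merging. The dictionaries and case IDs are hypothetical stand-ins, not the notebook's data.

```python
# Hypothetical stand-ins for the two unpickled dictionaries (case ID -> recording)
part_a = {"case_001": "recording 1", "case_002": "recording 2"}
part_b = {"case_151": "recording 151"}

# Case IDs present in both halves would be overwritten by the second dictionary
overlap = set(part_a) & set(part_b)
if overlap:
    raise ValueError("Duplicate case IDs: {}".format(sorted(overlap)))

# Safe to merge: every key comes from exactly one of the two halves
merged = {**part_a, **part_b}
print(len(merged))
```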

In [None]:
len(data_pars)

In [None]:
# Clinical data

with open('%s/%s.pickle' % (DATA_DUMP, 'clin_df_1_300'), 'rb') as handle:
    clin_df = pickle.load(handle)

#### Limit ventilator data to cases for which clinical data and appropriate (>15 minutes) ventilator data are available




In [None]:
# Keep only the cases present in both the clinical table and the ventilator data
combined = sorted(set(clin_df.index) & set(data_pars))
data_pars = {key: value for key, value in data_pars.items() if key in combined}
cases = sorted(data_pars.keys())
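
The filtering step is a set intersection of the clinical index and the ventilator dictionary keys. A minimal sketch on made-up data follows. The names `clin_demo` and `vent_demo` and the case labels are assumptions for illustration only. It also lists the cases that would be dropped, which can help when reconciling the case counts.

```python
import pandas as pd

# Hypothetical clinical table indexed by case ID
clin_demo = pd.DataFrame({"Weight": [1200, 3400]}, index=["case_A", "case_B"])
# Hypothetical ventilator dictionary keyed by case ID
vent_demo = {"case_B": "recording B", "case_C": "recording C"}

# Cases with both clinical and ventilator data
common = sorted(set(clin_demo.index) & set(vent_demo))
# Cases with ventilator data but no clinical data (these are excluded)
vent_only = sorted(set(vent_demo) - set(clin_demo.index))

print(common, vent_only)
```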

In [None]:
len(data_pars)

### Import table for interpreting ventilator parameters

In [None]:
par_key_table = pd.read_excel('Fabian_parameters.xlsx')

In [None]:
par_key_table;

### Create a dictionary of DataFrames with measured ventilator parameters only

In [None]:
ventilator_measurements = ['PIP', 'MAP', 'PEEP', 'Ti_PSV', 'Cdyn', 'C20_C', 'R', 'MV', 'MVresp',
 'VTemand', 'VTemand_resp', 'VTespon_pat', 'Leak', 'RR', 'Trigger', 'VTimand', 'FiO2',
 'Flow_demand', 'Flow_insp', 'Flow_exp',]

In [None]:
data_pars_measurements = {}

for case in cases:
    data_pars_measurements[case] = data_pars[case][ventilator_measurements].copy()

In [None]:
# Replace textual data with np.nan
repl_dct = {'off': np.nan, 'not valid': np.nan, 'out of range': np.nan, 'unused': np.nan}
for case in cases:
    data_pars_measurements[case].replace(repl_dct, inplace = True)
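
Columns that mixed text placeholders with numbers may keep the `object` dtype after the replacement. The sketch below, on a made-up DataFrame, shows one possible way to blank the placeholders and then convert each column to a numeric dtype. It is an illustration under that assumption, not the notebook's method.

```python
import pandas as pd

# Hypothetical recording in which placeholders are mixed with numeric readings
demo = pd.DataFrame({"PIP": [20.5, "off", 21.0], "Leak": ["not valid", 12, 15]})
placeholders = ["off", "not valid", "out of range", "unused"]

# mask() sets every cell that matches a placeholder to NaN
cleaned = demo.mask(demo.isin(placeholders))
# Convert the remaining values to numbers; anything non-numeric becomes NaN
cleaned = cleaned.apply(pd.to_numeric, errors="coerce")

print(cleaned.dtypes)
```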

In [None]:
# Normalize relevant parameters to body weight
# (Weight is divided by 1000, presumably converting grams to kilograms)
pars_per_kg = ['MV', 'VTimand', 'VTemand', 'VTespon_pat', 'VTemand_resp']

for case in cases:
    weight_kg = clin_df.loc[case, 'Weight'] / 1000
    for par in pars_per_kg:
        data_pars_measurements[case]['%s_kg' % par] = data_pars_measurements[case][par] / weight_kg
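
As a worked example of the normalization, the sketch below uses a hypothetical infant of 1500 g and made-up tidal volumes. The unit (mL) is an assumption, since the source does not state it.

```python
import pandas as pd

weight_g = 1500  # hypothetical body weight in grams
demo = pd.DataFrame({"VTemand": [6.0, 7.5, 9.0]})  # made-up values, assumed to be in mL

# Convert grams to kilograms, then express each value per kg of body weight
weight_kg = weight_g / 1000
demo["VTemand_kg"] = demo["VTemand"] / weight_kg

print(demo)
```

With these numbers, 6.0 mL at 1.5 kg corresponds to 4.0 mL/kg.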

In [None]:
# Drop columns which only have NaN values
for case in cases:
    data_pars_measurements[case].dropna(axis = 1, how = 'all', inplace = True)
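
With `how='all'`, only columns in which every value is missing are removed, while columns with some missing values are kept. A small sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical recording: PIP is partly missing, Ti_PSV is entirely missing
demo = pd.DataFrame({"PIP": [20.0, np.nan], "Ti_PSV": [np.nan, np.nan]})

# Only the column without any valid value is dropped
trimmed = demo.dropna(axis=1, how="all")

print(list(trimmed.columns))
```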

#### Export dictionary containing measured ventilator parameters to a pickle archive

In [None]:
with open('%s/%s.pickle' % (DATA_DUMP, 'data_pars_measurements_1_300'), 'wb') as handle:
    pickle.dump(data_pars_measurements, handle, protocol=pickle.HIGHEST_PROTOCOL)
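
A quick round-trip check can confirm that an archive reloads to the same content. The sketch below writes a hypothetical one-case dictionary to a temporary directory rather than to `DATA_DUMP`, so it does not touch the real archives.

```python
import os
import pickle
import tempfile

import pandas as pd

# Hypothetical dictionary of DataFrames, keyed by case ID
demo = {"case_A": pd.DataFrame({"PIP": [20.0, 21.0]})}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.pickle")
    # Write the archive, then read it back
    with open(path, "wb") as handle:
        pickle.dump(demo, handle, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "rb") as handle:
        restored = pickle.load(handle)

# True if the reloaded DataFrame has the same shape and values as the original
round_trip_ok = restored["case_A"].equals(demo["case_A"])
print(round_trip_ok)
```

Note that archives written with `pickle.HIGHEST_PROTOCOL` may not be readable by older Python versions.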

### Create a dictionary of DataFrames with ventilator settings

In [None]:
ventilator_settings = ['Patient_range', 'Ventilator_mode', 'PIP_set', 'PEEP_set', 'PIP_set_PSV', 
'FiO2_set', 'Flow_insp_set','Slope_set', 'Flow_exp_set', 'Ti_set', 'Te_set', 'RR_set', 
'IE_I_set', 'IE_E_set', 'Volume_limit_set', 'VG_set', 'Term_criteria_PSV_set', 'Apnea_time_set', 
'RR_backup_set', 'Trigger_sens_set', 'Powerstate', 'MV_lim_high_set',
'MV_lim_low_set', 'PIP_lim_high_set', 'PIP_lim_low_set', 'RR_lim_set', 'Leakage_lim_set',
'Measuring_unit_pressure_set', 'Flow_sensor_state', 'Oxy_sensor_state',
'P_man_breath_CPAP_HFO_set', 'P_man_breath_duoPAP_NCPAP_set', 'FiO2_flush_time_set', 'FiO2_flush_set',
'Ventilation_stopped', 'VG_state', 'Volume_limit_state', 'Ventilator_range', 'Trigger_mode',
'Pressure_rise_control']

In [None]:
data_pars_settings = {}

for case in cases:
    data_pars_settings[case] = data_pars[case][ventilator_settings].copy()

In [None]:
# Replace textual data with np.nan
repl_dct = {'off': np.nan, 'not valid': np.nan, 'out of range': np.nan, 'unused': np.nan}
for case in cases:
    data_pars_settings[case].replace(repl_dct, inplace = True)

In [None]:
# Normalize relevant parameters to body weight
# (Weight is divided by 1000, presumably converting grams to kilograms)
pars_per_kg = ['Volume_limit_set', 'VG_set', 'MV_lim_high_set', 'MV_lim_low_set']

for case in cases:
    weight_kg = clin_df.loc[case, 'Weight'] / 1000
    for par in pars_per_kg:
        data_pars_settings[case]['%s_kg' % par] = data_pars_settings[case][par] / weight_kg

In [None]:
# Drop columns which only have NaN values
for case in cases:
    data_pars_settings[case].dropna(axis=1, how='all', inplace = True)

#### Export dictionary containing ventilator settings to a pickle archive

In [None]:
with open('%s/%s.pickle' % (DATA_DUMP, 'data_pars_settings_1_300'), 'wb') as handle:
    pickle.dump(data_pars_settings, handle, protocol=pickle.HIGHEST_PROTOCOL)

### Create a dictionary of DataFrames with ventilator alarms

In [None]:
ventilator_alarms = ['Alarm_susp', 'Alarm_Flat_battery', 'Alarm_Checksum_ctrl_PIC', 'Alarm_Checksum_monitor_PIC',
'Alarm_Safety_relay_defect', 'Alarm_Sens_dev_prox_pressure', 'Alarm_input_pressure_blender', 'Alarm_excess_pressure',
'Alarm_voltage_monit', 'Alarm_SPI_interface', 'Alarm_DIO2_interface', 'Alarm_COM_interface', 'Alarm_I2C_interface',
'Alarm_parallel_interface', 'Alarm_serial_tem_interface', 'Alarm_low_physical_memory', 'Alarm_Fan_defect',
'Alarm_CO2_interface', 'Alarm_blender_defect', 'Alarm_battery_defect', 'Alarm_input_pressure_O2_supply',
'Alarm_input_pressure_air_supply', 'Alarm_tube_occlusion', 'Alarm_patient_disconnected', 'Alarm_ETT_blocked',
'Alarm_flow_sensor_defect', 'Alarm_flow_sensor_clean', 'Alarm_flow_sensor_disconnected', 'Oxygen_sensor_defect',
'Oxygen_sensor_used_up', 'Oxyen_value_divergence', 'Alarm_O2_sensor_cal_error', 'Alarm_MV_high', 'Alarm_MV_low',
'Alarm_pressure_high', 'Alarm_pressure_low', 'Alarm_PEEP_high', 'Alarm_RR_high', 'Alarm_ETT_leak_high','Alarm_apnea',
'Alarm_DCO2_high', 'Alarm_DCO2_low', 'Alarm_etCO2_high','Alarm_etCO2_low', 'Alarm_PIP_not_reached',
'Alarm_limited_volume', 'Alarm_volume_not_reached', 'Alarm_power_failure', 'Alarm_charge_battery_60min',
'Alarm_charge_battery_30min', 'Alarm_charge_battery_15min', 'Alarm_nebulizer_disconnection',
'Alarm_nebulizer_system_error', 'Alarm_CO2_module_not_connected', 'Alarm_CO2_filterline_not_connected',
'Alarm_CO2_check_sampleline', 'Alarm_CO2_check_airway_adapter', 'Alarm_CO2_sensor_faulty',]

In [None]:
data_pars_alarms = {}

for case in cases:
    data_pars_alarms[case] = data_pars[case][ventilator_alarms].copy()

In [None]:
# Replace textual data with np.nan
repl_dct = {'off': np.nan, 'not valid': np.nan, 'out of range': np.nan, 'unused': np.nan}
for case in cases:
    data_pars_alarms[case].replace(repl_dct, inplace = True)

In [None]:
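# Drop alarm columns whose sum is zero, i.e. columns containing only zeros and/or NaN
# (these alarms never went off)
for case in cases:
    alarms = data_pars_alarms[case]
    # Collect the columns first so that the DataFrame is not modified while iterating
    never_triggered = [column for column in alarms.columns if alarms[column].sum() == 0]
    alarms.drop(columns=never_triggered, inplace=True)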
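
The zero-sum test removes both all-zero and all-NaN columns, because pandas skips NaN when summing and returns 0 for a column with no valid values. The sketch below illustrates this on made-up data, assuming alarms are coded as 0 (inactive) and 1 (active). That coding is an assumption, not stated in the source.

```python
import numpy as np
import pandas as pd

# Hypothetical alarm recording: one alarm fired once, one never fired, one is all NaN
demo = pd.DataFrame({
    "Alarm_MV_low": [0, 1, 0],
    "Alarm_apnea": [0, 0, 0],
    "Alarm_Fan_defect": [np.nan, np.nan, np.nan],
})

# Column sums; NaN values are skipped, so an all-NaN column sums to 0
sums = demo.sum()
# Keep only the alarms that went off at least once
kept = demo.loc[:, sums != 0]

print(list(kept.columns))
```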

#### Export dictionary containing ventilator alarms to a pickle archive

In [None]:
with open('%s/%s.pickle' % (DATA_DUMP, 'data_pars_alarms_1_300'), 'wb') as handle:
    pickle.dump(data_pars_alarms, handle, protocol=pickle.HIGHEST_PROTOCOL)