<a id='0'></a>

<h3>Table of contents</h3>

* [Introduction](#1)
* [Load Data](#2)        
* [Sample time series](#3)        
* [Distribution of variables](#4)     


**Work in progress. Please consider upvoting if it helps**

<a id='1'></a>
## <p style="background-color:#fdb913; font-family:Computer Modern;src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf'); font-size:100%; text-align:center">Introduction</p>
In this competition we need to predict the airway pressure in the respiratory circuit at each time step of a breath. The input features are the lung attributes and the ventilator control inputs.
<div>

<div align="center">
    <img src="https://raw.githubusercontent.com/google/deluca-lung/main/assets/2020-10-02%20Ventilator%20diagram.svg"  width="700" height="200">
</div>    

As shown in the diagram above (provided in the data section of the competition), two attributes describe the condition of the patient:
    
* R - lung attribute indicating how restricted the airway is (in cmH2O/L/s). Physically, this is the change in pressure per change in flow (air volume per time). Intuitively, one can imagine blowing up a balloon through a straw. We can change R by changing the diameter of the straw, with higher R being harder to blow through.
* C - lung attribute indicating how compliant the lung is (in mL/cmH2O). Physically, this is the change in volume per change in pressure. Intuitively, one can imagine the same balloon example. We can change C by changing the thickness of the balloon’s latex, with higher C meaning thinner latex that is easier to inflate.

The ventilator parameters are:
* u_in - the control input for the inspiratory solenoid valve. Ranges from 0 to 100.
* u_out - the control input for the expiratory solenoid valve. Either 0 or 1.

The target variable that we need to predict is:
    
* pressure - the airway pressure measured in the respiratory circuit, in cmH2O, at each time_step of the series.
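
To make the units of R and C concrete, here is a minimal sketch of a one-compartment linear lung model, a common textbook simplification in which pressure is the sum of a resistive term (R times flow) and an elastic term (volume divided by C). This is an assumption for illustration only: nothing here claims that the pressure in the competition data follows this formula. The function name `simulate_pressure`, the constant flow value and the `peep` baseline are all hypothetical.

```python
def simulate_pressure(R, C, flow_l_per_s, dt=0.01, n_steps=100, peep=0.0):
    """Pressure trace (cmH2O) of a toy one-compartment lung model.

    R: resistance in cmH2O/L/s, C: compliance in mL/cmH2O,
    flow_l_per_s: constant inspiratory flow in L/s (assumed),
    peep: baseline pressure in cmH2O (assumed).
    """
    volume_ml = 0.0
    pressures = []
    for _ in range(n_steps):
        # Integrate flow into volume; 1 L = 1000 mL.
        volume_ml += flow_l_per_s * dt * 1000.0
        # Resistive term + elastic term + baseline.
        pressures.append(R * flow_l_per_s + volume_ml / C + peep)
    return pressures


# Same airway resistance, two different compliances (illustrative values).
stiff = simulate_pressure(R=20, C=10, flow_l_per_s=0.5)
soft = simulate_pressure(R=20, C=50, flow_l_per_s=0.5)
print(round(stiff[-1], 2), round(soft[-1], 2))
```

In this toy model a lower C (a stiffer lung) gives a higher pressure for the same delivered volume, while a higher R raises the pressure by a constant offset for as long as air is flowing.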

[back to top](#0)

<a id='2'></a>
## <p style="background-color:#fdb913; font-family:Computer Modern;src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf'); font-size:100%; text-align:center"> Load Data</p>
<div>

[back to top](#0)    

In [None]:
# Standard library
import gc
import os
import warnings

# Data handling and plotting
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
from plotly.offline import iplot, init_notebook_mode
from plotly.subplots import make_subplots
from colorama import Fore, Back, Style

%matplotlib inline
init_notebook_mode(connected=True)

# Use the plotly_white template for all plotly visualizations
pio.templates.default = "plotly_white"

# Colour codes for printed output
y_ = Fore.YELLOW
r_ = Fore.RED
g_ = Fore.GREEN
b_ = Fore.BLUE
m_ = Fore.MAGENTA
c_ = Fore.CYAN
res = Style.RESET_ALL

warnings.filterwarnings('ignore')

In [None]:
train_df = pd.read_csv('/kaggle/input/ventilator-pressure-prediction/train.csv', index_col=None)
test_df = pd.read_csv('/kaggle/input/ventilator-pressure-prediction/test.csv', index_col=None)
sample_submission = pd.read_csv('/kaggle/input/ventilator-pressure-prediction/sample_submission.csv', index_col=None)

print(f"{y_}Train data shape - {train_df.shape}{res}\n{m_}Test data shape - {test_df.shape}{res}\n{c_}Sample submission shape - {sample_submission.shape}{res}")

In [None]:
train_df.info()

In [None]:
test_df.info()

In [None]:
sample_submission.info()

**No Missing values**
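
The same claim can be checked numerically, which is cheaper than rendering a heatmap over every row. The sketch below uses a small made-up frame so that it runs on its own; `count_missing` is a hypothetical helper name, and in this notebook it would be called on `train_df` and `test_df`.

```python
import numpy as np
import pandas as pd


def count_missing(df):
    """Return the number of missing values in each column."""
    return df.isna().sum()


# Toy frame standing in for train_df (values are made up).
demo = pd.DataFrame({
    "breath_id": [1, 1, 2, 2],
    "u_in": [0.0, 5.5, np.nan, 3.2],
    "pressure": [5.8, 6.1, 5.9, 6.4],
})

missing = count_missing(demo)
print(missing)
print("Any missing:", bool(missing.sum() > 0))
```

A result of all zeros for a frame confirms that it has no missing values.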

In [None]:
_ = sns.heatmap(train_df.isna())

In [None]:
_ = sns.heatmap(test_df.isna())

In [None]:
train_df.head(20)

In [None]:
test_df.head(20)

<a id='3'></a>
## <p style="background-color:#fdb913; font-family:Computer Modern;src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf'); font-size:100%; text-align:center"> Sample time series from training data</p>
<div>
    
The training data consists of time series. Each series represents one breath lasting approximately 3 seconds.
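
A quick way to verify the breath duration is to group by `breath_id` and look at the number of rows and the largest `time_step` in each group. This is a sketch on a made-up frame; `breath_summary` is a hypothetical helper, and the column names `breath_id` and `time_step` are the ones used by the plotting code in this notebook.

```python
import pandas as pd


def breath_summary(df):
    """Per-breath row count and duration (largest time_step)."""
    grouped = df.groupby("breath_id")["time_step"]
    return grouped.agg(n_steps="count", duration="max")


# Toy data: two short breaths with made-up time steps.
demo = pd.DataFrame({
    "breath_id": [1, 1, 1, 2, 2, 2],
    "time_step": [0.0, 1.0, 2.9, 0.0, 1.5, 2.7],
})

summary = breath_summary(demo)
print(summary)
```

On the real data, `breath_summary(train_df).describe()` would show how consistent the breath length and duration are across the training set.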
    
[back to top](#0)

In [None]:
# Plot the pressure trace of a single breath
breath1 = train_df.loc[train_df['breath_id'] == 1]
plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
font_size = 20
fig = plt.figure(figsize=(22,13), facecolor=background_color)
gs = fig.add_gridspec(1,1)
gs.update(wspace=0.3, hspace=0.4)

# Single axis; hide the top and right spines
ax0 = fig.add_subplot(gs[0, 0])
ax0.set_facecolor(background_color)
for s in ['right', 'top']:
    ax0.spines[s].set_visible(False)

sns.lineplot(ax=ax0, data=breath1, x="time_step", y="pressure", linewidth=3, color='#FC6238')

# Ticks every 0.25 s on x and every 1 cmH2O on y
x_ticks = np.arange(0, 3, 0.25)
y_min = round(breath1.pressure.min(), 2)
y_max = round(breath1.pressure.max(), 2)
y_ticks = np.round(np.arange(y_min, y_max), decimals=2)
ax0.set_xticks(x_ticks)
ax0.set_xticklabels(x_ticks, fontsize=font_size, fontweight='bold')
ax0.set_yticks(y_ticks)
ax0.set_yticklabels(y_ticks, fontsize=font_size, fontweight='bold')
ax0.set_xlabel('Time Step (seconds)', fontsize=font_size, fontweight='bold')
ax0.set_ylabel('Pressure', fontsize=font_size, fontweight='bold')

# Annotation box describing the plotted quantity
value = "Airway pressure in the respiratory circuit (cmH2O) at time step (second)"
ax0.text(1.15, 9.5, value, ha='left', va='center', fontsize=font_size, color='White', fontweight='bold',
         bbox=dict(facecolor='#41533b', edgecolor=None, boxstyle='round', linewidth=0.1))

gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()



**More Samples**

In [None]:
# Sample 10 distinct breaths (replace=False avoids drawing the same breath twice)
sample_ts = np.random.choice(train_df.breath_id.unique(), 10, replace=False)

# Plot the data
plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
fig = plt.figure(figsize=(22,22), facecolor=background_color)

cols = 2
rows = 5

gs = fig.add_gridspec(rows, cols)
gs.update(wspace=0.3, hspace=0.4)

# Create one axis per grid cell, in row-major order
axes_list = []
for row in range(rows):
    for col in range(cols):
        ax = fig.add_subplot(gs[row, col])
        ax.set_facecolor(background_color)
        for loc in ["top", "right"]:
            ax.spines[loc].set_visible(False)
        axes_list.append(ax)

# Draw one breath per axis
x_ticks = np.arange(0, 3, 0.25)
for ax, breath_id in zip(axes_list, sample_ts):
    breath_ts = train_df.loc[train_df['breath_id'] == breath_id]
    sns.lineplot(ax=ax, data=breath_ts, x="time_step", y="pressure", linewidth=3, color='#FC6238')
    y_min = round(float(breath_ts.pressure.min()), 2)
    y_max = round(float(breath_ts.pressure.max()), 2)
    y_ticks = np.round(np.arange(y_min, y_max, 4), decimals=2)
    ax.set_xticks(x_ticks)
    ax.set_xticklabels(x_ticks, fontsize=font_size, fontweight='bold')
    ax.set_yticks(y_ticks)
    ax.set_yticklabels(y_ticks, fontsize=font_size, fontweight='bold')
    ax.set_xlabel('Time Step (seconds)', fontsize=font_size, fontweight='bold')
    ax.set_ylabel('Pressure', fontsize=font_size, fontweight='bold')
gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()

**And a few more samples...**

In [None]:
def plot_samples(num_ts, rows, cols, figsize=(22,22), axes=True):
    # Draw num_ts distinct breaths (replace=False avoids duplicates)
    sample_ts = np.random.choice(train_df.breath_id.unique(), num_ts, replace=False)
    plt.rcParams['figure.dpi'] = 600
    background_color = '#E3D6C9'
    fig = plt.figure(figsize=figsize, facecolor=background_color)

    gs = fig.add_gridspec(rows, cols)
    gs.update(wspace=0.3, hspace=0.4)

    # Keep the axes in a list: writing to locals() inside a function is not reliable
    axes_list = []
    for row in range(rows):
        for col in range(cols):
            ax = fig.add_subplot(gs[row, col])
            ax.set_facecolor(background_color)
            for loc in ["top", "right"]:
                ax.spines[loc].set_visible(False)
            axes_list.append(ax)

    for ax, breath_id in zip(axes_list, sample_ts):
        breath_ts = train_df.loc[train_df['breath_id'] == breath_id]
        sns.lineplot(ax=ax, data=breath_ts, x="time_step", y="pressure", linewidth=3, color='#FC6238')
        y_min = round(float(breath_ts.pressure.min()), 2)
        y_max = round(float(breath_ts.pressure.max()), 2)
        if axes:
            # Detailed ticks and axis labels for large panels
            x_ticks = np.arange(0, 3, 0.25)
            y_ticks = np.round(np.arange(y_min, y_max, 4), decimals=2)
            ax.set_xticks(x_ticks)
            ax.set_xticklabels(x_ticks, fontsize=font_size, fontweight='bold')
            ax.set_yticks(y_ticks)
            ax.set_yticklabels(y_ticks, fontsize=font_size, fontweight='bold')
            ax.set_xlabel('Time Step (seconds)', fontsize=font_size, fontweight='bold')
            ax.set_ylabel('Pressure', fontsize=font_size, fontweight='bold')
        else:
            # Sparse ticks and no axis labels for small panels
            ax.set_xticks(np.arange(0, 3, 1))
            ax.set_yticks(np.round(np.arange(y_min, y_max, 10), decimals=2))
            ax.set_xlabel(None)
            ax.set_ylabel(None)
    gs.tight_layout(fig, rect=[0, 0, 1, 1])
    plt.show()

plot_samples(40, 4, 10, axes=False)

In [None]:
colors1 = ['#FC6238', '#FFD872','#F2D4CC','#E77577','#0065A2','#74737A']
colors2 = ['#3E7DCC', '#8F9CB3','#00C8C8','#F9D84A','#8CC0FF','#4D525A']
colors3 = ['#B29476', '#E3D6C9','#1F5C70','#FBA01D','#FCBC49','#393B45']
colors = ['#FC6238','#3E7DCC','#393B45']
colors_c = ['#E77577','#00C8C8','#1F5C70']
#sns.palplot(sns.color_palette(colors1),size=0.9)
#sns.palplot(sns.color_palette(colors2),size=0.9)
#sns.palplot(sns.color_palette(colors3),size=0.9)

### Let's look at the time series based on "R" values
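
One random breath per `R` value can also be drawn with a single `groupby`, rather than by growing an array with `np.append`. This is a minimal sketch on made-up data; `sample_breath_per_value` is a hypothetical helper, and the seed is only there to make the draw repeatable.

```python
import numpy as np
import pandas as pd


def sample_breath_per_value(df, column, seed=None):
    """Pick one random breath_id for each distinct value of `column`."""
    rng = np.random.default_rng(seed)
    # One row per breath, keeping the value it belongs to.
    breaths = df[["breath_id", column]].drop_duplicates()
    picked = breaths.groupby(column)["breath_id"].apply(
        lambda ids: int(rng.choice(ids.to_numpy()))
    )
    return picked.to_dict()


# Toy frame: six breaths, two per R value (values are made up).
demo = pd.DataFrame({
    "breath_id": [1, 1, 2, 3, 4, 4, 5, 6],
    "R": [5, 5, 5, 20, 20, 20, 50, 50],
})

picked = sample_breath_per_value(demo, "R", seed=0)
print(picked)
```

The selected ids can then be used as a filter, for example `train_df['breath_id'].isin(list(picked.values()))`. Because the helper takes the column name as an argument, the same function would work for `C`.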

In [None]:
plt.rc('legend',fontsize=24) 
def get_breath_ids_for_R(val, num_vals):
    val_list = list(train_df.loc[train_df['R'] == val]['breath_id'].unique())
    return np.random.choice(val_list, num_vals)

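# One random breath per distinct R value; train_df['R'].unique() gives array([20, 50, 5])
r_arr = np.array([])
for r in list(train_df['R'].unique()):
    r_arr = np.append(r_arr, get_breath_ids_for_R(r, 1))
# np.int was removed from NumPy; the built-in int is the replacement
r_arr = r_arr.astype(int)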

samples_ts = train_df.loc[train_df['breath_id'].isin(r_arr)]

plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
fig = plt.figure(figsize=(22,13), facecolor=background_color)
font_size = 20
gs = fig.add_gridspec(1,1)
gs.update(wspace=0.3, hspace=0.4)


locals()["ax"+str(0)] = fig.add_subplot(gs[0, 0])
locals()["ax"+str(0)].set_facecolor(background_color)
for s in [ 'right', 'top']:
    locals()["ax"+str(0)].spines[s].set_visible(False)

    #41533b
sns.lineplot(ax=locals()["ax"+str(0)],data=samples_ts, x="time_step", y="pressure", hue='R', linewidth = 3,palette=colors)
locals()["ax0"].set_xticks(np.arange(0,3,0.25))
locals()["ax0"].set_xticklabels(np.arange(0,3,0.25), fontsize=font_size, fontweight='bold')
y_min = round(samples_ts.pressure.min(),2)
y_max = round(samples_ts.pressure.max(),2)
y_ticks = np.round(np.arange(y_min,y_max, 4), decimals=2)

locals()["ax0"].set_yticks(y_ticks)
locals()["ax0"].set_yticklabels(y_ticks, fontsize=font_size, fontweight='bold')
locals()["ax0"].set_xlabel('Time Step (seconds)',fontsize=font_size,fontweight='bold')
locals()["ax0"].set_ylabel('Pressure',fontsize=font_size,fontweight='bold')
#value = "Airway pressure in the respiratory circuit (cmH2O) at time step (second)"
#locals()["ax"+str(0)].text(1.15, 9.5, value, ha='left', va='center', fontsize=font_size, color='White',fontweight='bold',
#            bbox=dict(facecolor='#41533b', edgecolor=None, boxstyle='round', linewidth=0.1)
#                              )

gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()                                                                


In [None]:
plt.rc('legend',fontsize=24) 
#Plot the data
plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
fig = plt.figure(figsize=(22,22), facecolor='#E3D6C9')

cols = 2
rows = 5

gs = fig.add_gridspec(rows,cols)
gs.update(wspace=0.3, hspace=0.4)

cell_count = 0
for row in range(0, rows):
    for col in range(0, cols):
        locals()["ax"+str(cell_count)] = fig.add_subplot(gs[row, col])
        locals()["ax"+str(cell_count)].set_facecolor(background_color)
        for loc in ["top","right"]:
            locals()["ax"+str(cell_count)].spines[loc].set_visible(False)
        cell_count += 1

cell_count = 0
for i in range(0,10): 
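    # Draw a fresh random breath for each distinct R value
    r_arr = np.array([])
    for r in list(train_df['R'].unique()):
        r_arr = np.append(r_arr, get_breath_ids_for_R(r, 1))
    # np.int was removed from NumPy; the built-in int is the replacement
    r_arr = r_arr.astype(int)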

    samples_ts = train_df.loc[train_df['breath_id'].isin(r_arr)]
    sns.lineplot(ax=locals()["ax"+str(cell_count)],data=samples_ts, x="time_step", y="pressure", linewidth = 3, hue='R', palette=colors)
    locals()["ax"+str(cell_count)].set_xticks(np.arange(0,3,0.25))
    locals()["ax"+str(cell_count)].set_xticklabels(np.arange(0,3,0.25), fontsize=font_size, fontweight='bold')
    y_min = round(float(samples_ts.pressure.min()),2)
    y_max = round(float(samples_ts.pressure.max()),2)
    y_ticks = np.round(np.arange(y_min,y_max, 4), decimals=2)
    locals()["ax"+str(cell_count)].set_yticks(y_ticks)
    locals()["ax"+str(cell_count)].set_yticklabels(y_ticks, fontsize=font_size, fontweight='bold')
    locals()["ax"+str(cell_count)].set_xlabel('Time Step (seconds)',fontsize=font_size,fontweight='bold')
    locals()["ax"+str(cell_count)].set_ylabel('Pressure',fontsize=font_size,fontweight='bold')
    cell_count +=1
gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()                                                                

### Let's look at the time series based on "C" values

In [None]:
plt.rc('legend',fontsize=24) 
def get_breath_ids_for_C(val, num_vals):
    val_list = list(train_df.loc[train_df['C'] == val]['breath_id'].unique())
    return np.random.choice(val_list, num_vals)

# One random breath per distinct C value
c_arr = np.array([])
for c in list(train_df['C'].unique()):
    c_arr = np.append(c_arr, get_breath_ids_for_C(c, 1))
# np.int was removed from NumPy; the built-in int is the replacement
c_arr = c_arr.astype(int)
samples_ts = train_df.loc[train_df['breath_id'].isin(c_arr)]

plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
fig = plt.figure(figsize=(22,13), facecolor=background_color)
font_size = 20
gs = fig.add_gridspec(1,1)
gs.update(wspace=0.3, hspace=0.4)


locals()["ax"+str(0)] = fig.add_subplot(gs[0, 0])
locals()["ax"+str(0)].set_facecolor(background_color)
for s in [ 'right', 'top']:
    locals()["ax"+str(0)].spines[s].set_visible(False)

    #41533b
sns.lineplot(ax=locals()["ax"+str(0)],data=samples_ts, x="time_step", y="pressure", hue='C', linewidth = 3,palette=colors_c)
locals()["ax0"].set_xticks(np.arange(0,3,0.25))
locals()["ax0"].set_xticklabels(np.arange(0,3,0.25), fontsize=font_size, fontweight='bold')
y_min = round(samples_ts.pressure.min(),2)
y_max = round(samples_ts.pressure.max(),2)
y_ticks = np.round(np.arange(y_min,y_max, 4), decimals=2)

locals()["ax0"].set_yticks(y_ticks)
locals()["ax0"].set_yticklabels(y_ticks, fontsize=font_size, fontweight='bold')
locals()["ax0"].set_xlabel('Time Step (seconds)',fontsize=font_size,fontweight='bold')
locals()["ax0"].set_ylabel('Pressure',fontsize=font_size,fontweight='bold')
#value = "Airway pressure in the respiratory circuit (cmH2O) at time step (second)"
#locals()["ax"+str(0)].text(1.15, 9.5, value, ha='left', va='center', fontsize=font_size, color='White',fontweight='bold',
#            bbox=dict(facecolor='#41533b', edgecolor=None, boxstyle='round', linewidth=0.1)
#                              )

gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()                                                                


In [None]:
plt.rc('legend',fontsize=24) 
#Plot the data
plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
fig = plt.figure(figsize=(22,22), facecolor='#E3D6C9')

cols = 2
rows = 5

gs = fig.add_gridspec(rows,cols)
gs.update(wspace=0.3, hspace=0.4)

cell_count = 0
for row in range(0, rows):
    for col in range(0, cols):
        locals()["ax"+str(cell_count)] = fig.add_subplot(gs[row, col])
        locals()["ax"+str(cell_count)].set_facecolor(background_color)
        for loc in ["top","right"]:
            locals()["ax"+str(cell_count)].spines[loc].set_visible(False)
        cell_count += 1

cell_count = 0
for i in range(0,10): 
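    # Draw a fresh random breath for each distinct C value
    c_arr = np.array([])
    for c in list(train_df['C'].unique()):
        c_arr = np.append(c_arr, get_breath_ids_for_C(c, 1))
    # np.int was removed from NumPy; the built-in int is the replacement
    c_arr = c_arr.astype(int)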
    samples_ts = train_df.loc[train_df['breath_id'].isin(c_arr)]
    sns.lineplot(ax=locals()["ax"+str(cell_count)],data=samples_ts, x="time_step", y="pressure", linewidth = 3, hue='C', palette=colors_c)
    locals()["ax"+str(cell_count)].set_xticks(np.arange(0,3,0.25))
    locals()["ax"+str(cell_count)].set_xticklabels(np.arange(0,3,0.25), fontsize=font_size, fontweight='bold')
    y_min = round(float(samples_ts.pressure.min()),2)
    y_max = round(float(samples_ts.pressure.max()),2)
    y_ticks = np.round(np.arange(y_min,y_max, 4), decimals=2)
    locals()["ax"+str(cell_count)].set_yticks(y_ticks)
    locals()["ax"+str(cell_count)].set_yticklabels(y_ticks, fontsize=font_size, fontweight='bold')
    locals()["ax"+str(cell_count)].set_xlabel('Time Step (seconds)',fontsize=font_size,fontweight='bold')
    locals()["ax"+str(cell_count)].set_ylabel('Pressure',fontsize=font_size,fontweight='bold')
    cell_count +=1
gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()                                                                

<a id='4'></a>
## <p style="background-color:#fdb913; font-family:Computer Modern;src: url('http://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf'); font-size:100%; text-align:center">Distribution of variables</p>
<div>
    
[back to top](#0)
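
Because every breath contributes many rows, `value_counts()` on the raw frame counts rows rather than breaths. The sketch below counts distinct breaths per (R, C) combination instead. It runs on a made-up frame, and `breaths_per_setting` is a hypothetical helper name.

```python
import pandas as pd


def breaths_per_setting(df):
    """Table of distinct breath counts for each (R, C) pair."""
    # Keep one row per breath so each breath is counted once.
    per_breath = df.drop_duplicates("breath_id")
    return pd.crosstab(per_breath["R"], per_breath["C"])


# Toy frame: four breaths with two rows each (values are made up).
demo = pd.DataFrame({
    "breath_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "R": [5, 5, 5, 5, 20, 20, 50, 50],
    "C": [10, 10, 50, 50, 10, 10, 10, 10],
})

table = breaths_per_setting(demo)
print(table)
```

Calling `breaths_per_setting(train_df)` would show whether the lung settings are evenly represented in the training data.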

In [None]:
YELLOW = '#fdb913'
plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
fig = plt.figure(figsize=(22,13), facecolor=background_color)
font_size = 20
gs = fig.add_gridspec(1,1)
gs.update(wspace=0.3, hspace=0.4)


locals()["ax"+str(0)] = fig.add_subplot(gs[0, 0])
locals()["ax"+str(0)].set_facecolor(background_color)
for s in [ 'right', 'top']:
    locals()["ax"+str(0)].spines[s].set_visible(False)

    #41533b
sns.kdeplot(ax=locals()["ax0"],data = train_df, x = 'pressure',color=YELLOW, fill=True,  #cut=0, bw_method=0.20, 
                lw=1.4, edgecolor='#9e9a75',alpha=1) 
locals()["ax0"].set_xlabel('Pressure',fontsize=font_size, fontweight='bold')
locals()["ax0"].set_ylabel('Density',fontsize=font_size, fontweight='bold')

gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()                                                                


In [None]:
train_df['R'].value_counts()

In [None]:
train_df['C'].value_counts()

In [None]:
plt.rcParams['figure.dpi'] = 600
background_color = '#E3D6C9'
fig = plt.figure(figsize=(22,13), facecolor=background_color)
font_size = 20
gs = fig.add_gridspec(1,1)
gs.update(wspace=0.3, hspace=0.4)


locals()["ax"+str(0)] = fig.add_subplot(gs[0, 0])
locals()["ax"+str(0)].set_facecolor(background_color)
for s in [ 'right', 'top']:
    locals()["ax"+str(0)].spines[s].set_visible(False)

    #41533b
sns.kdeplot(ax=locals()["ax0"],data = train_df, x = 'u_in',color='#41533b', fill=True,  #cut=0, bw_method=0.20, 
                lw=1.4, edgecolor='#9e9a75',alpha=1) 
locals()["ax0"].set_xlabel('u_in',fontsize=font_size, fontweight='bold')
locals()["ax0"].set_ylabel('Density',fontsize=font_size, fontweight='bold')

gs.tight_layout(fig, rect=[0, 0, 1, 1])
plt.show()                                                                


**To be continued..**