<font color="magenta" size=9>Simulation of CDAI Dataset</font>

Here we try to simulate the CDAI dataset using a stochastic process. We assume that CDAI follows Brownian motion, so an individual's health moves randomly: at every step the patient is as likely to get worse as to get better. We also assume that if a patient gets considerably worse at some timepoint, they are removed from the study.
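
Written out, the model described above could be stated as follows. This is only a sketch: the symbols $X_{i,t}$ (value of subject $i$ at timestep $t$), $\varepsilon_{i,t}$ (the random step) and $c$ (the removal cutoff) are notation introduced here, and the removal rule assumes that a larger value means worse health, which matches how the simulation cells in this notebook apply the cutoff.

$$
X_{i,0} \sim \mathcal{N}(0,1), \qquad
X_{i,t+1} = X_{i,t} + \varepsilon_{i,t}, \qquad
\varepsilon_{i,t} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0,1)
$$

$$
\text{subject } i \text{ is removed at the first } t \text{ with } \varepsilon_{i,t} > c
$$

Because $\varepsilon_{i,t}$ is symmetric around zero, $P(\varepsilon_{i,t} > 0) = P(\varepsilon_{i,t} < 0) = \tfrac{1}{2}$, which is the "same chance of getting worse or better" assumption.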

# <font color="gray">Setup Notebook

Import basic packages

In [None]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Import packages for creating widget app

In [None]:
import ipywidgets as ipw
from ipywidgets import interact_manual
from IPython.display import HTML, display, Javascript, clear_output

# <font color="blue">Sanity Check: Graph Random Walk

##  <font color="blue">Create Dataset

Decide the number of subjects and the number of timesteps for the simulation

<font color="red">Note: this can later be replaced by an ipywidget

In [None]:
num_subjects = 200
num_steps  =  30

Create starting values

In [None]:
data = pd.DataFrame( np.random.normal(0,1, num_subjects) )

In [None]:
plt.hist(data[0])

Iteratively add data

In [None]:
for i in range(num_steps-1):
    data[ data.shape[1] ] = data[ data.shape[1]-1 ] + np.random.normal(0,1, num_subjects) 
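
The loop above adds one column per timestep. As a possible alternative, the same kind of random walk can be built in one shot with a cumulative sum. This is a minimal sketch, not a replacement for the cell above: the names `rng`, `steps` and `walk` are introduced here for illustration, and a seeded `np.random.default_rng` is assumed so that the result is reproducible.

```python
import numpy as np
import pandas as pd

# Hypothetical names for this sketch: rng, steps, walk.
rng = np.random.default_rng(0)          # seeded generator, so reruns give the same walk
num_subjects, num_steps = 200, 30

# One row per subject, one column per timestep.
# Column 0 is the starting value; every later column adds one N(0, 1) step.
steps = rng.normal(0, 1, size=(num_subjects, num_steps))
walk = pd.DataFrame(np.cumsum(steps, axis=1))

print(walk.shape)
```

The resulting frame has the same layout as `data` (integer column labels, one column per timestep), so the plotting cells would work on it unchanged.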

In [None]:
data

##  <font color="blue">Plot Dataset

### <font color="blue">Line Plot

Transpose the dataset to make plotting easier


In [None]:
data_graph = data.T

In [None]:
data_graph

Plot each subject into graph

In [None]:
plt.figure(figsize=(18, 10) )
for i in range(data_graph.shape[1]):
    plt.plot(range(data_graph.shape[0]), data_graph[i], color="gray", alpha=0.5, linewidth=.5)
plt.show()


### <font color="blue">Box Plot

In [None]:
data.boxplot( figsize=(18,10) )
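
The boxes should widen from left to right. One way to turn that visual impression into a numeric sanity check: at timestep `t` (counting from 0) each value is a sum of `t + 1` independent unit-variance draws, so the spread across subjects should be close to `sqrt(t + 1)`. The sketch below assumes its own seeded data and uses more subjects than the 200 in this section to reduce sampling noise; `walk`, `observed_sd` and `expected_sd` are names introduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
num_subjects, num_steps = 5000, 30      # larger sample than the notebook's, to reduce noise

# Random walk: rows are subjects, columns are timesteps.
walk = np.cumsum(rng.normal(0, 1, size=(num_subjects, num_steps)), axis=1)

observed_sd = walk.std(axis=0)                       # spread across subjects at each step
expected_sd = np.sqrt(np.arange(1, num_steps + 1))   # sd of a sum of t+1 unit-variance terms

max_rel_error = np.max(np.abs(observed_sd / expected_sd - 1))
print(f"largest relative deviation from sqrt(t+1): {max_rel_error:.3f}")
```

A small deviation suggests the walk behaves like Brownian motion sampled at integer times; a large one would point to a bug in how the steps are accumulated.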

# <font color="green">CDAI Simulation

## <font color="darkRed">Create Dataset

Decide the number of subjects, the number of timesteps, and the removal cutoff for the simulation

<font color="red">Note: this can later be replaced by an ipywidget 

In [None]:
num_subjects = 300
num_steps  =  15
sd_cutoff = 1.7

Create starting data

In [None]:
data = pd.DataFrame( np.random.normal( 0,1, num_subjects) )

In [None]:
plt.hist(data[0])

Iteratively add data

In [None]:
for i in range(num_steps-1):
    rand_noise = np.random.normal(0,1, num_subjects)               # create noise for next step
    rand_noise[ np.where( rand_noise > sd_cutoff )[0] ] = np.nan   # make NaN values according to sd_cutoff
    data[ data.shape[1] ] = data[ data.shape[1]-1 ] + rand_noise   # combine noise with previous values 
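
The removal rule relies on `NaN` propagating: once a step is set to `NaN`, every later value for that subject is `NaN` too, because `NaN` plus anything is `NaN`. The sketch below shows the same idea in vectorized form and checks that a removed subject never reappears. It is an illustration under its own seeded data, with `start`, `noise`, `sim` and `stays_removed` as hypothetical names.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
num_subjects, num_steps, sd_cutoff = 300, 15, 1.7

start = rng.normal(0, 1, size=(num_subjects, 1))              # starting values
noise = rng.normal(0, 1, size=(num_subjects, num_steps - 1))  # one step per later timepoint
noise[noise > sd_cutoff] = np.nan       # boolean mask: steps above the cutoff become NaN

# cumsum propagates NaN, so everything after a masked step is NaN as well.
sim = pd.DataFrame(np.cumsum(np.hstack([start, noise]), axis=1))

missing = sim.isna().to_numpy()
# True if no missing value is ever followed by an observed one.
stays_removed = not np.any(missing[:, :-1] & ~missing[:, 1:])
print(stays_removed)
```

Note that the cutoff is applied to the size of a single step, not to the subject's current level, so a subject with a high value is not removed unless one step exceeds `sd_cutoff`.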

In [None]:
data

## <font color="lightGreen">Graph Dataset

### <font color="lightGreen">Line Plot

Transpose the dataset to make plotting easier

In [None]:
data_graph = data.T

Plot each subject into graph

In [None]:
plt.figure(figsize=(18, 10) )
for i in range(data_graph.shape[1]):
    plt.plot(range(data_graph.shape[0]), data_graph[i], color="gray", alpha=0.5, linewidth=.5)
plt.show()


### <font color="lightGreen">Box Plot

In [None]:
data.boxplot( figsize=(18,10) )

### <font color="lightGreen">Calculate Key Statistics

Calculate the fraction of subjects removed from the study by the final timestep (multiply by 100 for a percentage)

In [None]:
sum( pd.isna( data[ data.shape[1]-1 ]  ) ) / data.shape[0]
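
The simulated fraction can be compared against what the model predicts. Under the assumptions used here (independent standard normal steps, removal when a single step exceeds the cutoff), a subject survives one step with probability Φ(`sd_cutoff`) and must survive `num_steps - 1` steps. The helper below is a sketch with a hypothetical name, `expected_removed_fraction`; it uses only the standard library.

```python
from math import erf, sqrt

def expected_removed_fraction(num_steps: int, sd_cutoff: float) -> float:
    """Probability that at least one of the num_steps - 1 noise draws exceeds sd_cutoff."""
    p_stay = 0.5 * (1 + erf(sd_cutoff / sqrt(2)))   # standard normal CDF at the cutoff
    return 1 - p_stay ** (num_steps - 1)

predicted = expected_removed_fraction(num_steps=15, sd_cutoff=1.7)
print(f"predicted fraction removed: {predicted:.3f}")
```

With the settings used in this section (15 timesteps, cutoff 1.7) this comes out a little under one half, so the simulated value should land in that neighbourhood, give or take sampling noise from the 300 subjects.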

# <font color="magenta">Widget Application

You can use this tool to **simulate the CDAI Dataset**.  

In the simulation you can choose the length of the experiment, the number of subjects, and the cutoff value doctors use to remove patients from the study.

In [None]:
@interact_manual( num_subjects=(10,2000), num_steps=(5,50), sd_cutoff=(.5,2) )
def simulator(num_subjects, num_steps, sd_cutoff  ):
    
    # Create Dataset for Brownian motion
    display(HTML("<h1 style='color:brown'> Brownian Motion </h1>"))
    data = pd.DataFrame( np.random.normal(0,1, num_subjects) )
    for i in range(num_steps-1):
        data[ data.shape[1] ] = data[ data.shape[1]-1 ] + np.random.normal(0,1, num_subjects) 
    
    # Create Graphs for Brownian Motion
    plt.figure(figsize=(14, 7) )
    for i in range(data.T.shape[1]):
        plt.plot(  range(1,data.T.shape[0]+1), data.T[i], color="gray", alpha=0.2, linewidth=.3  )
    data.boxplot(figsize=(14,7))
    plt.show()
    
    # Create Dataset for CDAI Simulation
    display(HTML("<h1 style='color:magenta'> CDAI Simulator </h1>"))
    data = pd.DataFrame( np.random.normal(0,1, num_subjects) )
    for i in range(num_steps-1):
        rand_noise = np.random.normal(0,1, num_subjects)               # create noise for next step
        rand_noise[ np.where( rand_noise > sd_cutoff )[0] ] = np.nan   # make NaN values according to sd_cutoff
        data[ data.shape[1] ] = data[ data.shape[1]-1 ] + rand_noise   # combine noise with previous values 
    
    # Create Graph For CDAI Simulation
    plt.figure(figsize=(14, 7) )
    removed = sum( pd.isna( data[ data.shape[1]-1 ]  ) ) / data.shape[0]
    display(HTML(f"<h3>Subjects Removed : {int(removed*100)}%</h3>"))
    for i in range(data.T.shape[1]):
        plt.plot(  range(1,data.T.shape[0]+1), data.T[i], color="gray", alpha=0.2, linewidth=.3  )
    data.boxplot(figsize=(14,7))   # draw on the current axes; no display() needed, plt.show() renders it
    plt.show()
