# FNN: Feedforward Neural Network

This notebook feeds the features of LR_RMH into a neural network.

Medical data is generally not linearly separable, and LR cannot model non-linear relationships on its own, so we implemented deep neural networks, which can perform better given enough training data. We started with a feedforward neural network (FNN) fed with the same features used by the LR model, and its performance was indeed higher. We implemented the NN models using Keras with TensorFlow as the backend on Google Cloud Platform.
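To make the non-linearity point concrete, here is a minimal sketch (not part of the project code) of a feedforward network with two hidden units that computes XOR, a function that no single linear decision boundary can separate. The name `xor_net` and the hand-picked weights are illustrative assumptions; a real network would learn its weights from data.

```python
# Illustrative only: hand-set weights, not learned and not taken from NN_model.
def relu(x):
    """Rectified linear unit: passes positive values, clips negatives to 0."""
    return max(0.0, x)


def xor_net(x1, x2):
    """Tiny feedforward net: 2 inputs -> 2 ReLU hidden units -> 1 linear output."""
    h1 = relu(x1 + x2)        # grows with the number of active inputs
    h2 = relu(x1 + x2 - 1.0)  # non-zero only when both inputs are active
    return h1 - 2.0 * h2      # subtracting 2 * h2 cancels the (1, 1) case


outputs = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```

The hidden layer is what makes this possible: without the ReLU units the output would be a linear function of the inputs and could not return 0 for both (0, 0) and (1, 1) while returning 1 for the mixed cases.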

In [1]:
import pandas as pd
import numpy as np
import json
# from importlib import reload  # uncomment on Python 3, where reload() is not a builtin
# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it
from sklearn.model_selection import train_test_split

from keras.utils import plot_model
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

import sys
# Make the project's model and feature modules importable
sys.path.append("../../src/models/train_model")
import NN_model
sys.path.append("../../src/features")
import build_features, vital_signs_features, age_features, RFV_features

%matplotlib inline

Using TensorFlow backend.


In [2]:
pd.options.mode.chained_assignment = None  # default='warn'

## Model Training

In [18]:
with open('../../fileConfig.json') as config_file:
    fileConfig = json.load(config_file)

In [19]:
NN_model.FNN_model_training(fileConfig, 'ED_TOTAL_2009_2009.csv')

AUROC: 84.67%
AUROC: 84.37%
AUROC: 84.60%
AUROC: 88.42%
AUROC: 87.71%
AUROC: 85.27%
AUROC: 87.31%
AUROC: 84.94%
AUROC: 84.71%
AUROC: 83.40%
ROC AUC: 85.54% (+/- 0.02%)
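Each `AUROC` line reports the area under the ROC curve, which can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below implements that definition directly in plain Python. The helper `auroc` and the toy labels and scores are hypothetical; how `NN_model` computes the metric is not shown in this notebook.

```python
def auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
toy_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)
```

This pairwise form is quadratic in the number of samples, so it is meant for illustration rather than for datasets of this size.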


## Model Training, step by step

### Reading CDC File

In [5]:
# Read the processed CDC file
processedDirectory = fileConfig['dataDirectory'] + fileConfig['processedDirectory']
cdc_input = pd.read_csv(processedDirectory + 'ED_TOTAL_2009_2009.csv')

### Feature Engineering 

In [6]:
reload(RFV_features)
reload(build_features)
# Note: normalize=True scales the numerical values, which was required for the NN to converge.
predictors, target = build_features.get_all_features(cdc_input, normalize=True)
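The normalized vital-sign columns printed by `predictors.head()` all fall between 0 and 1, which is consistent with min-max scaling, although the actual logic inside `build_features.get_all_features` is not shown here. As a sketch under that assumption, min-max scaling of one numeric column looks like this (the function name and the pulse values are made up for illustration):

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


pulse = [60, 72, 95, 140]  # made-up raw pulse readings
scaled = min_max_scale(pulse)
print(scaled)
```

Putting all numeric inputs on a comparable scale keeps any one feature from dominating the gradient updates, which is a common reason unscaled inputs slow down or prevent convergence.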

In [7]:
list(predictors)

['Temp_Baseline',
 'Pulse_Baseline',
 'Sys_BP_Baseline',
 'Resp_Rate_Baseline',
 'Oxygen_Sat_Baseline',
 'Reason_Chest_Pain',
 'Reason_Abdominal_Pain',
 'Reason_Headache',
 'Reason_Shortness_of_Breath',
 'Reason_Back_Pain',
 'Reason_Cough',
 'Reason_Nausea_Vomiting',
 'Reason_Fever_Chills',
 'Reason_Syncope',
 'Reason_Dizziness',
 'Reason_Psychiatric_Complaint',
 'Reason_Nervous_System',
 'Reason_Cardiovascular_Other',
 'Reason_Ears_Eyes_Complaint',
 'Reason_Respiratory_Other',
 'Reason_Gastrointestinal_Other',
 'Reason_Genitourinary_Other',
 'Reason_Skin_Hair_Nails_Complaint',
 'Reason_Musculoskeletal_Other',
 'Reason_Injury_Poisoning',
 'Reason_Other',
 'Hypothermia',
 'Hyperthermia',
 'Bradycardia',
 'Mild_Tachycardia',
 'Moderate_Tachycardia',
 'Severe_Tachycardia',
 'Hypotension',
 'Hypertension',
 'Bradypnea',
 'Moderate_Tachypnea',
 'Severe_Tachypnea',
 'Mild_Hypoxia',
 'Severe_Hypoxia',
 'Age_18_30',
 'Age_31_40',
 'Age_41_50',
 'Age_51_60',
 'Age_61_70',
 'Age_71_80',
 'Age_81

In [8]:
pd.set_option('display.max_columns', 0)

In [9]:
predictors.head()

Unnamed: 0,Temp_Baseline,Pulse_Baseline,Sys_BP_Baseline,Resp_Rate_Baseline,Oxygen_Sat_Baseline,Reason_Chest_Pain,Reason_Abdominal_Pain,Reason_Headache,Reason_Shortness_of_Breath,Reason_Back_Pain,Reason_Cough,Reason_Nausea_Vomiting,Reason_Fever_Chills,Reason_Syncope,Reason_Dizziness,Reason_Psychiatric_Complaint,Reason_Nervous_System,Reason_Cardiovascular_Other,Reason_Ears_Eyes_Complaint,Reason_Respiratory_Other,Reason_Gastrointestinal_Other,Reason_Genitourinary_Other,Reason_Skin_Hair_Nails_Complaint,Reason_Musculoskeletal_Other,Reason_Injury_Poisoning,Reason_Other,Hypothermia,Hyperthermia,Bradycardia,Mild_Tachycardia,Moderate_Tachycardia,Severe_Tachycardia,Hypotension,Hypertension,Bradypnea,Moderate_Tachypnea,Severe_Tachypnea,Mild_Hypoxia,Severe_Hypoxia,Age_18_30,Age_31_40,Age_41_50,Age_51_60,Age_61_70,Age_71_80,Age_81_Above,Male_Flag,Female_Flag,Ambulance_Arrival,Other_Arrival,Unknown_Arrival,rfv1_1,rfv1_2,rfv1_3,rfv1_4,rfv1_5,rfv2_1,rfv2_2,rfv2_3,rfv2_4,rfv2_5,rfv3_1,rfv3_2,rfv3_3,rfv3_4,rfv3_5,MSA_1,MSA_2,CHF,DIABETES_1,DIABETES_0
0,0.602941,0.09018,0.444828,0.108108,0.108108,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,1,0,0,0.1,0.9,0.0,0.0,0.1,0.1,0.9,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.0,1,0,0,0,1
1,0.544118,0.071142,0.575862,0.108108,0.108108,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,1,0,0.1,0.9,0.2,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,0,0,1
2,0.577206,0.089178,0.406897,0.135135,0.135135,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,1,0,1,0,0.1,0.5,0.4,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,0,0,1
3,0.5625,0.087174,0.468966,0.121622,0.121622,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,1,0,0.1,0.0,0.5,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,0,0,1
4,0.5625,0.086172,0.62069,0.135135,0.135135,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,0,1,1,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,0,0,1,0


### NN model

In [10]:
X_train, X_dev, y_train, y_dev = train_test_split(predictors, target, test_size=0.1)
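With `test_size=0.1`, scikit-learn rounds the held-out count up and assigns the remaining rows to training. Assuming the 24,321 rows implied by the Keras training log further down (21,888 training plus 2,433 validation samples), the split sizes can be checked with a few lines of arithmetic:

```python
import math

n_samples = 21888 + 2433  # total rows implied by the Keras training log
test_size = 0.1

n_dev = math.ceil(test_size * n_samples)  # held-out count is rounded up
n_train = n_samples - n_dev               # the rest goes to training
print(n_train, n_dev)
```

Because the call above passes no `random_state`, the rows that land in each set change from run to run, so single-split AUROC values are not exactly reproducible.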

In [11]:
nn_model = NN_model.create_model(X_train.shape[1:], l2=0.01)
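The `l2=0.01` argument presumably sets the strength of an L2 weight penalty; `create_model` itself is not shown, so this is an assumption. In Keras, `keras.regularizers.l2(0.01)` adds `0.01 * sum(w ** 2)` over a layer's weights to the loss. A plain-Python sketch of that term, using made-up weights:

```python
def l2_penalty(weights, strength=0.01):
    """L2 regularization term: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]  # made-up weights for illustration
penalty = l2_penalty(weights)
print(penalty)
```

Larger weights are penalized quadratically, which discourages the network from fitting noise in the training set.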

In [12]:
nn_model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_41 (Dense)             (None, 100)               7200      
_________________________________________________________________
dense_42 (Dense)             (None, 100)               10100     
_________________________________________________________________
dense_43 (Dense)             (None, 100)               10100     
_________________________________________________________________
dense_44 (Dense)             (None, 1)                 101       
=================================================================
Total params: 27,501
Trainable params: 27,501
Non-trainable params: 0
_________________________________________________________________
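The parameter counts in the summary follow from the layer sizes: a dense layer has one weight per input-unit pair plus one bias per unit. The first layer's 7,200 parameters therefore imply 71 inputs, which matches the 71 feature columns shown by `predictors.head()`. A short check (the helper `dense_params` is only for illustration):

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer: weights plus one bias per unit."""
    return n_inputs * n_units + n_units


# 71 input features, three hidden layers of 100 units, one output unit
layer_sizes = [71, 100, 100, 100, 1]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)
```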


In [14]:
roc_auc = NN_model.train_cdc_model(X_train, y_train, X_dev, y_dev, num_epochs=150,
                                   network=nn_model, verbose_flag=True)

Train on 21888 samples, validate on 2433 samples
Epoch 1/150
...
Epoch 74/15

Example of results with manual tuning:

```
epochs = 250             AUROC: 83.81%
epochs = 235             AUROC: 83.81%
epochs = 220             AUROC: 84.68%
epochs = 200             AUROC: 84.63%

epochs = 220, l2 = 0.01  AUROC: 84.20%
epochs = 100, l2 = 0.01  AUROC: 83.48%
epochs = 150, l2 = 0.01  AUROC: 85.41%
```

### Cross Validation

In [15]:
NN_model.cross_Validation(nepochs=100, predictors=predictors, target=target, l2=0.01, units_n=100, n_layers=3)

AUROC: 84.61%
AUROC: 84.39%
AUROC: 84.94%
AUROC: 88.34%
AUROC: 87.72%
AUROC: 85.27%
AUROC: 87.33%
AUROC: 84.75%
AUROC: 84.66%
AUROC: 83.32%
ROC AUC: 85.53% (+/- 0.02%)
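The summary line can be re-derived from the ten per-fold values above. The sketch below recomputes the mean and the population standard deviation. The spread comes out at roughly 1.6 percentage points, so the printed `+/- 0.02%` is most likely the standard deviation on a 0 to 1 scale (about 0.016, rounded to 0.02) shown with a percent sign. That reading is an interpretation of the output, not something confirmed by the `NN_model` source.

```python
import statistics

# Per-fold AUROC values (in percent) copied from the output above
fold_auroc = [84.61, 84.39, 84.94, 88.34, 87.72,
              85.27, 87.33, 84.75, 84.66, 83.32]

mean_auroc = statistics.mean(fold_auroc)
std_auroc = statistics.pstdev(fold_auroc)  # population standard deviation

print(f"mean = {mean_auroc:.2f}%, std = {std_auroc:.2f} percentage points")
```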


Cross-validation results from the w210 runs:
```
NN_model.cross_Validation(150,predictors, target,l2=0.01)
AUROC: 84.62%
AUROC: 84.31%
AUROC: 84.48%
AUROC: 88.14%
AUROC: 87.97%
AUROC: 85.40%
AUROC: 87.13%
AUROC: 84.85%
AUROC: 84.62%
AUROC: 83.36%
ROC AUC: 85.49% (+/- 0.02%)
```