<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Load-libraries" data-toc-modified-id="Load-libraries-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Load libraries</a></span></li><li><span><a href="#Import-X_transformed-and-y" data-toc-modified-id="Import-X_transformed-and-y-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Import X_transformed and y</a></span></li><li><span><a href="#Selecting-X-columns" data-toc-modified-id="Selecting-X-columns-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Selecting X columns</a></span></li><li><span><a href="#Splitting-data" data-toc-modified-id="Splitting-data-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Splitting data</a></span></li><li><span><a href="#Deeplearning" data-toc-modified-id="Deeplearning-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Deeplearning</a></span></li><li><span><a href="#Running-baseline-model" data-toc-modified-id="Running-baseline-model-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Running baseline model</a></span></li><li><span><a href="#Re-running-baseline-model-with-data-preparation-(scaling)" data-toc-modified-id="Re-running-baseline-model-with-data-preparation-(scaling)-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Re-running baseline model with data preparation (scaling)</a></span><ul class="toc-item"><li><span><a href="#StandardScaler" data-toc-modified-id="StandardScaler-7.1"><span class="toc-item-num">7.1&nbsp;&nbsp;</span>StandardScaler</a></span></li><li><span><a href="#MinMaxScaler" data-toc-modified-id="MinMaxScaler-7.2"><span class="toc-item-num">7.2&nbsp;&nbsp;</span>MinMaxScaler</a></span></li></ul></li><li><span><a href="#Tuning-layers-and-number-of-neurons-with-baseline" data-toc-modified-id="Tuning-layers-and-number-of-neurons-with-baseline-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Tuning layers and number of neurons with baseline</a></span><ul class="toc-item"><li><span><a href="#Smaller-network" 
data-toc-modified-id="Smaller-network-8.1"><span class="toc-item-num">8.1&nbsp;&nbsp;</span>Smaller network</a></span></li><li><span><a href="#Larger-network" data-toc-modified-id="Larger-network-8.2"><span class="toc-item-num">8.2&nbsp;&nbsp;</span>Larger network</a></span></li><li><span><a href="#small-with-minmax-and-0.1%-dropout" data-toc-modified-id="small-with-minmax-and-0.1%-dropout-8.3"><span class="toc-item-num">8.3&nbsp;&nbsp;</span>small with minmax and 0.1% dropout</a></span></li><li><span><a href="#Batch-normalization" data-toc-modified-id="Batch-normalization-8.4"><span class="toc-item-num">8.4&nbsp;&nbsp;</span>Batch normalization</a></span></li></ul></li><li><span><a href="#Summary" data-toc-modified-id="Summary-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Summary</a></span></li></ul></div>

### Load libraries

In [313]:
# Importing libraries
import numpy as np
import pandas as pd

# plotting
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# modelling
from sklearn.model_selection import train_test_split, StratifiedKFold, cross_val_score
from sklearn.preprocessing import  LabelEncoder, StandardScaler, MinMaxScaler
from sklearn.pipeline import Pipeline

# deeplearning
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
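from tensorflow.keras import layers  # needed for layers.Dropout and layers.BatchNormalization below
from tensorflow.keras.layers import Dense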

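# Note: this wrapper module was removed in newer TensorFlow releases;
# the separate scikeras package provides a KerasClassifier replacement.
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier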

import random
random.seed(1234)
np.random.seed(1234)
tf.random.set_seed(1234)

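# evaluation metrics
# Note: plot_confusion_matrix and plot_roc_curve were removed in scikit-learn 1.2
# (ConfusionMatrixDisplay and RocCurveDisplay replace them); they are not used below.
from sklearn.metrics import plot_confusion_matrix, confusion_matrix, classification_report
from sklearn.metrics import plot_roc_curve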

import warnings
warnings.filterwarnings('ignore')

### Import X_transformed and y

In [314]:
X_transformed = pd.read_csv("X_transformed.csv")

y = pd.read_csv("y.csv")

### Selecting X columns

In [315]:
X_lines = X_transformed[['joey_lines', 'chandler_lines', 'ross_lines', 'monica_lines', 'rachel_lines', 'phoebe_lines', 'janice_lines','Gunther_lines', 'ugly_naked_guy']]

### Splitting data

In [316]:
X_train, X_test, y_train, y_test = train_test_split(X_lines, y, stratify=y, test_size=0.2, random_state=42)

### Running baseline model

In [326]:
# baseline model
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(9, input_dim=9, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate the baseline model on the raw (unscaled) features
# estimator = KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)
estimator = KerasClassifier(build_fn=create_baseline, epochs=100, verbose=0)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(estimator, X_lines, y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Baseline: 47.22% (7.11%)


### Re-running baseline model with data preparation (scaling)
#### StandardScaler

In [327]:
# evaluate baseline model with standardized dataset
estimators = []
estimators.append(('standardize', StandardScaler()))
# estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 43.54% (8.97%)


#### MinMaxScaler

In [328]:
# evaluate baseline model with min-max scaled dataset
estimators = []
# estimators.append(('standardize', StandardScaler()))
estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 50.91% (9.87%)


The achieved results were:
- Baseline: 47.22%
- StandardScaler: 43.54%
- MinMaxScaler: 50.91%

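I will continue with the model that uses the MinMax scaler. The fold-to-fold standard deviations (7 to 10 percentage points) are about as large as the gaps between these means, so the ranking is suggestive rather than conclusive. Next, I plan to alter the size of the network to see if I can increase the CV accuracy.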
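
To make the difference between the two scalers concrete, here is a minimal NumPy sketch of what each one computes per column. The input values are made up for illustration (they are not taken from `X_transformed.csv`), and the formulas are written out by hand rather than calling scikit-learn.

```python
import numpy as np

# Hypothetical line counts for three characters across four episodes
# (illustrative values only, not the real data).
X = np.array([
    [10.0, 200.0, 3.0],
    [20.0, 220.0, 0.0],
    [30.0, 240.0, 1.0],
    [40.0, 260.0, 0.0],
])

# StandardScaler: subtract the column mean, divide by the column
# standard deviation (population std, ddof=0).
standardized = (X - X.mean(axis=0)) / X.std(axis=0)

# MinMaxScaler: map each column linearly onto the range [0, 1].
col_min, col_max = X.min(axis=0), X.max(axis=0)
minmax = (X - col_min) / (col_max - col_min)

print("standardized:\n", standardized.round(2))
print("min-max scaled:\n", minmax.round(2))
```

Standardized columns end up centred on 0 with unit spread, while min-max scaled columns stay non-negative and bounded by 1, which may suit count-like features such as lines per character.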

### Tuning layers and number of neurons with baseline
#### Smaller network

In [341]:
def create_smaller():
    # create model
    model = Sequential()
    # 5 = rounded(9/2)
    model.add(Dense(5, input_dim=9, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate smaller model with min-max scaled dataset
estimators = []
# estimators.append(('standardize', StandardScaler()))
estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 55.45% (7.31%)


#### Larger network

In [342]:
def create_larger():
    # create model
    model = Sequential()
    # first hidden layer: 9 units (one per input feature)
    model.add(Dense(9, input_dim=9, activation='relu'))
    # second hidden layer: 5 = rounded(9/2)
    model.add(Dense(5, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model


# evaluate larger model with min-max scaled dataset
estimators = []
# estimators.append(('standardize', StandardScaler()))
estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_larger, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_larger, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 47.22% (7.70%)


The achieved results were:
- Base with minmax: 50.91%
- Smaller base with minmax: 55.45% 
- Larger base with minmax: 47.22%

It seems that shrinking the model a bit, together with the MinMax scaler, did me quite the favour. I will continue with this model and see how **dropout** affects it.
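
To put "smaller" and "larger" into numbers, the sketch below counts the trainable parameters (weights plus biases) of the three architectures defined above. The helper `dense_param_count` is a hypothetical name introduced here for illustration; it only does the arithmetic and does not build any Keras model.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by the units of each Dense layer.
    Each layer contributes n_in * n_out weights and n_out biases.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

architectures = {
    "baseline (9 -> 9 -> 1)": [9, 9, 1],
    "smaller (9 -> 5 -> 1)": [9, 5, 1],
    "larger (9 -> 9 -> 5 -> 1)": [9, 9, 5, 1],
}

for name, sizes in architectures.items():
    print(f"{name}: {dense_param_count(sizes)} trainable parameters")
```

The smaller network has roughly half the parameters of the baseline, which may help on a small dataset, although the CV results alone cannot confirm that this is the reason it scored higher.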

#### small with minmax and 0.1% dropout

In [346]:
def create_smaller():
    # create model
    model = Sequential()
    # 5 = rounded(9/2)
    model.add(Dense(5, input_dim=9, activation='relu'))
    model.add(layers.Dropout(0.1)) # Randomly set 10% of the hidden activations to 0 during training.
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate smaller model (dropout 0.1) with min-max scaled dataset
estimators = []
# estimators.append(('standardize', StandardScaler()))
estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 48.62% (5.46%)


In [430]:
def create_smaller():
    # create model
    model = Sequential()
    # 5 = rounded(9/2)
    model.add(Dense(5, input_dim=9, activation='relu'))
    model.add(layers.Dropout(0.2)) # Randomly set 20% of the hidden activations to 0 during training.
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate smaller model (dropout 0.2) with min-max scaled dataset
estimators = []
# estimators.append(('standardize', StandardScaler()))
estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 53.57% (11.11%)


In [431]:
def create_smaller():
    # create model
    model = Sequential()
    # 5 = rounded(9/2)
    model.add(Dense(5, input_dim=9, activation='relu'))
    model.add(layers.Dropout(0.3)) # Randomly set 30% of the hidden activations to 0 during training.
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate smaller model (dropout 0.3) with min-max scaled dataset
estimators = []
# estimators.append(('standardize', StandardScaler()))
estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 49.51% (5.12%)


I tested dropout rates from 10% to 30%, but each gave a lower CV accuracy (48.62%, 53.57% and 49.51%) than the smaller model with the MinMax scaler and no dropout (55.45%).
Next, I will test batch normalization in the hope of improving the CV accuracy a bit more.
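
For reference, this is a minimal NumPy sketch of what a dropout layer does during training, assuming the usual "inverted dropout" scheme in which the surviving activations are scaled up by `1 / (1 - rate)`. The function name `dropout_forward` and the all-ones input are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def dropout_forward(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of the activations
    and rescale the rest so the expected value stays the same."""
    if not training or rate == 0.0:
        # At inference time dropout is a no-op.
        return activations
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(1234)
hidden = np.ones((1000, 5))  # 1000 samples, 5 hidden units (as in create_smaller)
dropped = dropout_forward(hidden, rate=0.2, rng=rng)

print("fraction zeroed:", (dropped == 0).mean())
print("mean activation:", dropped.mean())
```

With only 5 hidden units, a rate of 0.2 removes one unit per sample on average, which is a large share of such a small layer and may be part of why dropout did not help here.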

#### Batch normalization

In [433]:
def create_smaller():
    # create model
    model = Sequential()
    # 5 = rounded(9/2)
    model.add(Dense(5, input_dim=9, activation='relu'))
    model.add(layers.BatchNormalization())
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# evaluate smaller model (batch normalization) with min-max scaled dataset
estimators = []
# estimators.append(('standardize', StandardScaler()))
estimators.append(('standardize', MinMaxScaler()))
# estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, batch_size=5, verbose=0)))
estimators.append(('mlp', KerasClassifier(build_fn=create_smaller, epochs=100, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
results = cross_val_score(pipeline, X_lines, y, cv=kfold)
print("Standardized: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Standardized: 46.30% (5.39%)


Batch normalization didn't improve accuracy: 46.30% versus 55.45% for the same smaller model without it.
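
As a rough illustration of what the batch normalization layer computes in training mode, here is a NumPy sketch. It assumes per-feature statistics over the batch, a small `eps` of 1e-3, and learnable scale (`gamma`) and shift (`beta`) left at their starting values of 1 and 0. The function name `batch_norm_train` and the random input are hypothetical; at inference time the real layer uses moving averages instead of batch statistics, which this sketch leaves out.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, roughly unit variance
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
batch = rng.normal(loc=3.0, scale=2.0, size=(32, 5))  # 32 samples, 5 hidden units
out = batch_norm_train(batch, gamma=np.ones(5), beta=np.zeros(5))

print("per-feature mean:", out.mean(axis=0).round(3))
print("per-feature std: ", out.std(axis=0).round(3))
```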

### Summary
To sum things up, the model with the highest 5-fold CV accuracy (55.45%) was the smaller model with the MinMax scaler. This was a great way to get a sense of how altering a model affects its CV accuracy. That being said, deep learning is definitely something I need to deep (pun intended) dive into to explore the possibilities. It was also great to see the implementation of deep learning in Python rather than coding the algorithms from scratch in MATLAB (aaah school).
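
The sketch below gathers the CV results reported above and flags which models fall within one fold standard deviation of the best mean. This is only a rough heuristic that I am assuming for illustration, not a formal significance test, and the dictionary labels are shorthand introduced here.

```python
# (mean %, std %) of the 5-fold CV accuracies printed in the cells above
cv_results = {
    "baseline, no scaling": (47.22, 7.11),
    "StandardScaler": (43.54, 8.97),
    "MinMax": (50.91, 9.87),
    "smaller + MinMax": (55.45, 7.31),
    "larger + MinMax": (47.22, 7.70),
    "smaller + MinMax + dropout 0.1": (48.62, 5.46),
    "smaller + MinMax + dropout 0.2": (53.57, 11.11),
    "smaller + MinMax + dropout 0.3": (49.51, 5.12),
    "smaller + MinMax + batch norm": (46.30, 5.39),
}

# Pick the model with the highest mean accuracy.
best_name, (best_mean, best_std) = max(cv_results.items(), key=lambda item: item[1][0])

# Heuristic: a model is "within one std of best" if its mean is no more than
# the best model's fold standard deviation below the best mean.
within = [name for name, (mean, _) in cv_results.items()
          if best_mean - mean <= best_std]

for name, (mean, std) in sorted(cv_results.items(), key=lambda item: -item[1][0]):
    flag = "within one std of best" if name in within else "more than one std below best"
    print(f"{name:32s} {mean:6.2f}% ({std:5.2f}%)  {flag}")
```

Several variants land within one standard deviation of the best model, so more folds or repeated CV runs would probably be needed before treating the ranking as settled.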

I welcome suggestions and/or resources on how to improve my code and thinking process. Thank you in advance! :)  