### Bake a Deep Learning Classifier with Keras
---------------------------------------------------

Keras is a library that simplifies the construction of neural networks.

This notebook will highlight how to construct a simple feed-forward neural network to predict the final rankings of bakers from episode 5.

The features used in the model include the mean ranking for technical challenges and the ranking of the technical challenge for episode 5

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import vapeplot
import seaborn as sns
import scipy.stats
from datetime import datetime
%matplotlib inline

In [2]:
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense, Activation, Embedding, Flatten, Dropout
from keras.activations import relu, sigmoid, tanh

from sklearn.preprocessing import QuantileTransformer
from sklearn.metrics import roc_curve, auc
import warnings
warnings.filterwarnings("ignore")

def timestamp(): return datetime.today().strftime('%Y%m%d')

def quantile_scale(df,feats):
    qua = df
    scaler = QuantileTransformer(
        n_quantiles=10,
        random_state=42,
        ignore_implicit_zeros=True, #sparse matrix
    )
    # fit the scaler
    scaler.fit(qua[feats])
    # transform values
    qua[feats] = scaler.transform(qua[feats])
    return qua

def calc_95ci(a,confidence=0.95):
    a = 1.0 * np.array(a)
    n = len(a)
    m, se = np.nanmean(a), scipy.stats.sem(a)
    h = se * scipy.stats.t.ppf((1 + confidence) / 2., n-1)
    return h

def return_feats(df,feats,label):
    df = df.sample(frac=1.)
    X = np.matrix(df[feats])
    y = np.array(df[label])
    return X,y


Using TensorFlow backend.


In [3]:
# load data
episode=5
season=7
tech = pd.read_csv("../RESULTS/gbbo.techinical.data.20190907.tsv",sep='\t')

## since the minimum number of bakers for a season is 9 (season 1)
## we will convert all places greater than 9 to the value '9'
## this keeps the number of possible ranking positions the same (number of classes)
classes = np.array(tech['place'])
classes = np.where(classes<=7, classes, 8)
tech['place']=classes
feats = ['tech_mean','tech']
tech = tech.loc[tech['episode']==episode]

tech = quantile_scale(tech,feats)
X,y = return_feats(tech,feats,'place')
X_test, y_test = return_feats(tech.loc[tech['season']==season],feats,'place')
X_train, y_train = return_feats(tech.loc[tech['season']!=season],feats,'place')

In [4]:
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

def create_model(layers, activations):
    n_dims = X.shape[1]
    n_classes = len(set(y))
    model = Sequential()
    for i,nodes in enumerate(layers):
        if i==0:
            model.add(Dense(n_dims,input_dim=n_dims))
            model.add(Activation(activations))
        else:
            model.add(Dense(nodes))
            model.add(Activation(activations))
    # output layer needs to have the same number of neurons
    # as the number of classes to predict
    model.add(Dense(n_classes))
    
    # binary_crossentropy is for binary models
    # categorical_crossentropy is for multiclass models
    model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model,verbose=0)

#### Hyperparameters
----------------------

Hyperparameters are model settings that are defined before training. 
For Neural Networks, this include the learning rate, the number of hidden layers, number of neurons in hidden layers, and neuron activation functions

We will evaluate the performance of a neural network across different hyperparameter conditions

In [5]:
n_dims = np.matrix(X).shape[1]
n_classes = len(set(y))
print('Number of Bakers: {}'.format(n_classes))

#################
# hyperparamters
## number of hidden layers is the length of the entry
## the value is the number of neurons for each layer
layers= [[n_dims], [40,20], [45,30,15] ]
## activation functions for neurons
activations = ['sigmoid','relu','tanh']
## number of times the complete dataset is passed through
## the model. underfit if too low, overfit if too high
epochs = [10,25,50]
## size of subset for each epoch, determines the number of
## iterations for each epoch
batch_size = [10,25,50]
################
param_grid = dict(
    layers=layers,
    activations=activations,
    epochs=epochs,
    batch_size=batch_size
    )

# Leave One (Season) Out Cross Validation
# leave one out CV
from sklearn.model_selection import LeaveOneGroupOut
loo = LeaveOneGroupOut()
cv=loo.split(X,groups=tech['season'])

grid = GridSearchCV(estimator=model,
                    param_grid=param_grid,
                    cv=cv,
                    verbose=2,
                    n_jobs=8,
                   )

print('N Jobs:',len(layers)*len(activations)*len(epochs)*len(batch_size)*9)

Number of Bakers: 8
N Jobs: 729


Now we do the Leave One Out Cross Validation over all the different combinations of hyperparameters. 

-------------------------------------------------
#### This will take a while so let it bake!
-------------------------------------------------

In [6]:
grid_result = grid.fit(X,y)

Fitting 9 folds for each of 81 candidates, totalling 729 fits


[Parallel(n_jobs=8)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  25 tasks      | elapsed:    5.2s
[Parallel(n_jobs=8)]: Done 146 tasks      | elapsed:   33.5s
[Parallel(n_jobs=8)]: Done 349 tasks      | elapsed:  2.2min
[Parallel(n_jobs=8)]: Done 632 tasks      | elapsed:  6.6min







Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where



[Parallel(n_jobs=8)]: Done 729 out of 729 | elapsed:  8.5min finished


In [11]:
grid_result.best_params_

{'activations': 'sigmoid', 'batch_size': 50, 'epochs': 50, 'layers': [40, 20]}