# Model Definition 2

In this model, we feed the entire dataset (route sequences plus numerical features) into the 1D CNN for training, to see whether any information was lost by feeding in only the route data.

Let's set things up.

In [1]:
import pickle
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import joblib

import plaidml.keras as pk
pk.install_backend()

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.metrics import roc_auc_score, accuracy_score, classification_report
from keras.callbacks import ReduceLROnPlateau, ModelCheckpoint, EarlyStopping

from keras.models import Sequential, Model, load_model
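# BatchNormalization is used in build_model() below, so it must be imported here.
from keras.layers import (Dense, Conv1D, Dropout, LeakyReLU, MaxPooling1D, Embedding,
                          Flatten, Input, Concatenate, BatchNormalization)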

### Reading in training and validation data

In [2]:
with open('./sequence_data.pickle', 'rb') as f:
    sequence_data = pickle.load(f)
    
with open('./numerical_data.pickle', 'rb') as f:
    numeric_data = pickle.load(f)

In [3]:
X_seq, y = sequence_data

In [4]:
X_num, y = numeric_data

In [5]:
print(X_seq.shape)
print(X_num.shape)
print(y.shape)

(68486, 5)
(68486, 8)
(68486,)


In [6]:
# Numeric columns come first (0-7), route-sequence columns last (8-12).
X = np.concatenate([X_num, X_seq], axis=1)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=.33)

In [7]:
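# Sequence-only and numeric-only splits. Each call shuffles independently,
# so rows of xs_* and xn_* do not line up with each other or with x_*.
xs_train, xs_test, ys_train, ys_test = train_test_split(X_seq, y, test_size=.33)
xn_train, xn_test, yn_train, yn_test = train_test_split(X_num, y, test_size=.33)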

### Deep Learning Methods

In [8]:
vocab_size = 300
max_length = 5
batch_size = 32

#### Convolutional NN, Updated

The baseline logistic regressor from version 1 of model_def performed better than our baseline CNN, so I'm going to update the CNN's features to include the numerical data, since I believe the logistic regressor's higher performance was due to these data being included.

I'll update the input length to 13 (5 for the route data + 8 for the numerical data), then use x_train and x_test rather than xs_train and xs_test, which contained only sequence data. The longer input also leaves room for a second convolutional layer.
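
To see why the longer input leaves room for a second convolutional layer, the sketch below traces the sequence length through the Conv1D and MaxPooling1D blocks. It assumes the Keras defaults used in this notebook (unpadded "valid" convolutions with stride 1, and pooling with stride equal to `pool_size`); the helper names are hypothetical and not part of Keras.

```python
def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of an unpadded ('valid') 1D convolution."""
    if length < kernel_size:
        raise ValueError(f"input length {length} is shorter than kernel size {kernel_size}")
    return (length - kernel_size) // stride + 1


def maxpool1d_out_len(length, pool_size):
    """Output length of unpadded max pooling with stride equal to pool_size."""
    if length < pool_size:
        raise ValueError(f"input length {length} is shorter than pool size {pool_size}")
    return (length - pool_size) // pool_size + 1


def trace_lengths(input_length, n_blocks, kernel_size=3, pool_size=2):
    """Sequence length after each conv and pool step of n_blocks conv+pool blocks."""
    lengths = [input_length]
    for _ in range(n_blocks):
        lengths.append(conv1d_out_len(lengths[-1], kernel_size))
        lengths.append(maxpool1d_out_len(lengths[-1], pool_size))
    return lengths


print(trace_lengths(13, n_blocks=2))  # [13, 11, 5, 3, 1]
print(trace_lengths(5, n_blocks=1))   # [5, 3, 1]
```

With an input of length 5, one conv + pool block already reduces the sequence to a single position, so a second kernel-size-3 convolution has nothing to slide over (`trace_lengths(5, n_blocks=2)` raises a `ValueError`). With length 13, two blocks fit exactly.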

In [9]:
cnn = Sequential()

cnn.add(Embedding(vocab_size, 3, input_length=13))

cnn.add(Conv1D(128, kernel_size=3, strides=1))
cnn.add(LeakyReLU())
cnn.add(MaxPooling1D(pool_size=2))

cnn.add(Conv1D(256, kernel_size=3, strides=1))
cnn.add(LeakyReLU())
cnn.add(MaxPooling1D(pool_size=2))

cnn.add(Flatten())

cnn.add(Dense(1024))
cnn.add(LeakyReLU())
cnn.add(Dropout(.5))

cnn.add(Dense(512))
cnn.add(LeakyReLU())
cnn.add(Dropout(.5))

cnn.add(Dense(256))
cnn.add(LeakyReLU())
cnn.add(Dropout(.5))

cnn.add(Dense(1, activation='sigmoid'))

cnn.compile(optimizer='adam', loss='binary_crossentropy',
             metrics=['acc'])

INFO:plaidml:Opening device "metal_amd_radeon_pro_570x.0"


In [10]:
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc', patience=3, 
                                            verbose=2, factor=0.5, min_lr=0.00001)

best_model = ModelCheckpoint('./cnn.2.1.h5', monitor='val_acc', verbose=2, 
                             save_best_only=True, mode='max')

early_stopping = EarlyStopping(monitor='val_loss', min_delta=1e-10, 
                               patience=5, restore_best_weights=True)

In [11]:
hist = cnn.fit(x_train, y_train,
         batch_size=batch_size,
         epochs=50,
         validation_data=(x_test, y_test),
         callbacks=[learning_rate_reduction,best_model,early_stopping],
         verbose=1)

Train on 45885 samples, validate on 22601 samples
Epoch 1/50

Epoch 00001: val_acc improved from -inf to 0.86155, saving model to ./cnn.2.1.h5
Epoch 2/50

Epoch 00002: val_acc improved from 0.86155 to 0.87173, saving model to ./cnn.2.1.h5
Epoch 3/50

Epoch 00003: val_acc did not improve from 0.87173
Epoch 4/50

Epoch 00004: val_acc did not improve from 0.87173
Epoch 5/50

Epoch 00005: val_acc improved from 0.87173 to 0.87248, saving model to ./cnn.2.1.h5
Epoch 6/50

Epoch 00006: val_acc improved from 0.87248 to 0.87916, saving model to ./cnn.2.1.h5
Epoch 7/50

Epoch 00007: val_acc improved from 0.87916 to 0.88208, saving model to ./cnn.2.1.h5
Epoch 8/50

Epoch 00008: val_acc improved from 0.88208 to 0.88284, saving model to ./cnn.2.1.h5
Epoch 9/50

Epoch 00009: val_acc did not improve from 0.88284
Epoch 10/50

Epoch 00010: val_acc improved from 0.88284 to 0.88562, saving model to ./cnn.2.1.h5
Epoch 11/50

Epoch 00011: val_acc did not improve from 0.88562
Epoch 12/50

Epoch 00012: val_a

We see a marked improvement, from 87.7% validation accuracy in version 1 to 89.4% validation accuracy in this version.

### Mixed Model

We will update our combined model to include the update to our CNN and see how the performance is impacted.

In [12]:
def build_model():

    # cnn
    seq_input = Input(shape=(13,))

    x = Embedding(vocab_size, 3, input_length=13)(seq_input)

    x = Conv1D(256, kernel_size=3, strides=1)(x)
    x = LeakyReLU()(x)
    x = MaxPooling1D(pool_size=2)(x)
    
    x = Conv1D(512, kernel_size=3, strides=1)(x)
    x = LeakyReLU()(x)
    x = MaxPooling1D(pool_size=2)(x)
    x = BatchNormalization()(x)

    x = Flatten()(x)

    x = Dense(1024)(x)
    x = LeakyReLU()(x)
    x = Dropout(.5)(x)

    x = Dense(1024)(x)
    x = LeakyReLU()(x)
    x = Dropout(.5)(x)

    x = Dense(512)(x)
    x = LeakyReLU()(x)
    x = Dropout(.5)(x)
    
    x = Dense(512)(x)
    x = LeakyReLU()(x)
    x = Dropout(.5)(x)

    seq_output = Dense(64, activation='relu')(x)

    cnn = Model(inputs=seq_input, outputs=seq_output)

    # mlp
    num_input = Input(shape=(8,))
    
    y = Dense(64)(num_input)
    y = LeakyReLU()(y)
    y = Dropout(.5)(y)

    y = Dense(1024)(y)
    y = LeakyReLU()(y)
    y = Dropout(.5)(y)

    y = Dense(1024)(y)
    y = LeakyReLU()(y)
    y = Dropout(.5)(y)

    y = Dense(512)(y)
    y = LeakyReLU()(y)
    y = Dropout(.5)(y)

    mlp_output = Dense(64, activation='relu')(y)

    mlp = Model(inputs=num_input, outputs=mlp_output)

    # combine
    combined = Concatenate()([cnn.output, mlp.output])
    
    z = Dense(512)(combined)
    z = LeakyReLU()(z)
    z = Dropout(.5)(z)
    
    z = Dense(512)(z)
    z = LeakyReLU()(z)
    z = Dropout(.2)(z)
    
    z = Dense(256)(z)
    z = LeakyReLU()(z)
    z = Dropout(.2)(z)
    
    z = Dense(64)(z)
    z = LeakyReLU()(z)
    z = Dropout(.2)(z)
    
    output = Dense(1, activation='sigmoid')(z)

    final_model = Model(inputs=mlp.inputs + cnn.inputs, outputs=[output])

    final_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])
    
    return final_model

In [13]:
model = build_model()

In [14]:
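# The MLP branch takes only the 8 numeric columns.
num_train = x_train[:, :8]
# The CNN branch takes all 13 columns; slicing with [:, 8:] would keep only the route columns.
seq_train = x_train

num_test = x_test[:, :8]
seq_test = x_test

# Input order must match Model(inputs=mlp.inputs + cnn.inputs): numeric first, then sequence.
xc_train = [num_train, seq_train]
xc_test = [num_test, seq_test]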

In [15]:
learning_rate_reduction_combined = ReduceLROnPlateau(monitor='val_acc', patience=3, 
                                            verbose=2, factor=0.5, min_lr=0.00001)

best_model_combined = ModelCheckpoint('./combined_cnn_mlp_model.2.1.h5', monitor='val_acc', verbose=2, 
                             save_best_only=True, mode='max')

early_stopping_combined = EarlyStopping(monitor='val_loss', min_delta=1e-10, 
                               patience=10, restore_best_weights=True)
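
As a rough check on the `ReduceLROnPlateau` settings above (`factor=0.5`, `min_lr=0.00001`), the sketch below lists the learning rates the callback can step through. It assumes training starts at 0.001 (Adam's default in Keras, and consistent with the first reduction to 0.0005 reported in the training log); `lr_steps` is a hypothetical helper, not part of Keras.

```python
def lr_steps(initial_lr, factor, min_lr):
    """List the successive learning rates a reduce-on-plateau rule can reach.

    Each plateau multiplies the current rate by `factor`, floored at `min_lr`.
    """
    steps = [initial_lr]
    while steps[-1] > min_lr:
        steps.append(max(steps[-1] * factor, min_lr))
    return steps


# Assumed starting rate: 0.001 (Adam's default).
for lr in lr_steps(initial_lr=0.001, factor=0.5, min_lr=0.00001):
    print(lr)
```

Under this assumption the rate can be halved six times before the next reduction hits the `min_lr` floor, so the values near 3.125e-05 and 1.5625e-05 in the log would be the fifth and sixth halvings. After that, further plateaus can only drop the rate to 1e-05.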

In [16]:
hist = model.fit(xc_train, y_train,
         batch_size=batch_size,
         epochs=50,
         validation_data=(xc_test, y_test),
         callbacks = [learning_rate_reduction_combined, best_model_combined, early_stopping_combined],
         verbose=1
)

Train on 45885 samples, validate on 22601 samples
Epoch 1/50

Epoch 00001: val_acc improved from -inf to 0.88425, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 2/50

Epoch 00002: val_acc did not improve from 0.88425
Epoch 3/50

Epoch 00003: val_acc improved from 0.88425 to 0.89434, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 4/50

Epoch 00004: val_acc did not improve from 0.89434
Epoch 5/50

Epoch 00005: val_acc did not improve from 0.89434
Epoch 6/50

Epoch 00006: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.

Epoch 00006: val_acc did not improve from 0.89434
Epoch 7/50

Epoch 00007: val_acc improved from 0.89434 to 0.89447, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 8/50

Epoch 00008: val_acc improved from 0.89447 to 0.89646, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 9/50

Epoch 00009: val_acc improved from 0.89646 to 0.89753, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 10/50

Epoch 00010: val_acc improved from 0


Epoch 00037: val_acc improved from 0.90505 to 0.90554, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 38/50

Epoch 00038: val_acc did not improve from 0.90554
Epoch 39/50

Epoch 00039: val_acc did not improve from 0.90554
Epoch 40/50

Epoch 00040: ReduceLROnPlateau reducing learning rate to 3.125000148429535e-05.

Epoch 00040: val_acc did not improve from 0.90554
Epoch 41/50

Epoch 00041: val_acc improved from 0.90554 to 0.90558, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 42/50

Epoch 00042: val_acc did not improve from 0.90558
Epoch 43/50

Epoch 00043: ReduceLROnPlateau reducing learning rate to 1.5625000742147677e-05.

Epoch 00043: val_acc improved from 0.90558 to 0.90562, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 44/50

Epoch 00044: val_acc improved from 0.90562 to 0.90571, saving model to ./combined_cnn_mlp_model.2.1.h5
Epoch 45/50

Epoch 00045: val_acc did not improve from 0.90571
Epoch 46/50

Epoch 00046: val_acc improved from 0.90571 to 0.90576, sa

With the changes to the CNN part of our model, the overall validation accuracy has decreased from 91.0% to 90.5%.

Keeping in mind that our logistic regressor performed similarly to the MLP we built, let's try simplifying our CNN and MLP to see if, in this case, less could be more.
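
To put the size of the current networks in perspective, the sketch below counts the trainable parameters of the standalone CNN defined in cell 9, using the standard formulas for embedding, Conv1D, and Dense layers. It assumes the flattened output has 256 features (sequence length 1 after the second pooling layer, times 256 filters); the helper names are hypothetical.

```python
def embedding_params(vocab_size, dim):
    return vocab_size * dim


def conv1d_params(in_channels, filters, kernel_size):
    # One kernel of shape (kernel_size, in_channels) per filter, plus a bias per filter.
    return kernel_size * in_channels * filters + filters


def dense_params(n_in, n_out):
    # Weight matrix plus a bias per output unit.
    return n_in * n_out + n_out


layer_params = {
    "Embedding(300, 3)": embedding_params(300, 3),
    "Conv1D(128, kernel_size=3)": conv1d_params(3, 128, 3),
    "Conv1D(256, kernel_size=3)": conv1d_params(128, 256, 3),
    "Dense(1024)": dense_params(256, 1024),  # assumed input: 1 position * 256 filters
    "Dense(512)": dense_params(1024, 512),
    "Dense(256)": dense_params(512, 256),
    "Dense(1)": dense_params(256, 1),
}

total = sum(layer_params.values())
for name, count in layer_params.items():
    print(f"{name:<28}{count:>10,}")
print(f"{'Total':<28}{total:>10,}")

train_samples = 45885  # from the "Train on 45885 samples" log line
print(f"Parameters per training sample: {total / train_samples:.1f}")
```

Under these assumptions the standalone CNN has roughly a million parameters for about 46 thousand training rows, almost all of them in the dense layers. This is only a rough indication of how much room there is to simplify, not evidence that the model is overfitting.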