# MLP:
The first half of this notebook may be used to train an MLP

In [68]:
import pandas as pd
import numpy as np
from sklearn.decomposition import PCA
from src.utils import get_batches, shuffle, train_val_split, preds_to_scores, scores_to_preds
from src.mlp import MLP
from src.rnn import RNN

%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [2]:
# Define the path to the data. This is the training dataframe saved from the preprocessing notebook.
data_path = './data/train_df.pkl'
train_df = pd.read_pickle(data_path)

In [50]:
# To further isolate our data, we will only examine essays from a single set
# Feel free to experiment with different essay sets!

set = 3
df = train_df.loc[train_df['essay_set'] == set]
df.head()

Unnamed: 0,essay_id,essay_set,rater1_domain1,rater2_domain1,domain1_score,essays_embed,word_count,min_score,max_score,norm_score
3583,5978,3,1,1,1,"[[-0.045849, 0.085148, -0.12276, -0.39789, 0.7...",35,0.0,3.0,4
3585,5980,3,1,1,1,"[[0.52791, 0.34362, 0.15408, 0.24317, -0.39083...",85,0.0,3.0,4
3587,5982,3,2,2,2,"[[0.46549, 0.1526, -0.52178, -0.40354, -0.1565...",104,0.0,3.0,8
3588,5983,3,1,1,1,"[[0.66193, 0.16192, -0.090129, -0.59287, 0.153...",63,0.0,3.0,4
3590,5985,3,0,0,0,"[[0.14956, -0.25978, 0.66754, 0.83551, 0.41318...",89,0.0,3.0,0


In [62]:
# We should get a plot here to examine score distribution for this set

# In order to avoid bias toward more common scores, we will limit the number
# of essays from each scoring bucket to a set value
score_df = None
min_score = int(df['min_score'].min())
max_score = int(df['max_score'].max())

n_max = 500
for i in range(min_score,max_score+1):
    if score_df is None:
        score_df = df.loc[df['domain1_score'] == i][:n_max]
    else:
        temp_df = df.loc[df['domain1_score'] == i][:n_max]
        score_df = pd.concat([score_df, temp_df])
df = score_df

In [52]:
# Extract essay vectors and corresponding scores
X = np.array(df['essays_embed'])
y = np.array(df['domain1_score'])
X = np.stack(X, axis=0)
print('There are {} training essays, each of shape {} x {}'.format(X.shape[0], X.shape[1], X.shape[2]))

There are 979 training essays, each of shape 137 x 200


These essays are too large to feed directly into an MLP. Therefore, the next steps are to flatten the essays and subsequently perform Principle Component Analysis (PCA) in order to reduce the dimensionality. More info on PCA here if you are unfamiliar: https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues

In [54]:
X_flatten = np.reshape(X, [X.shape[0], -1])
X_shuff, y = shuffle(X_flatten, y)

# Consider adding an assertion that n_components must be less than
# the number of essays in X_shuff

# Train the PCA model. Feel free to experiment with different numbers of 
# principle components to optimize the model
X_shuff = X_shuff[:300]
pca = PCA(n_components=300)
pca.fit(X_shuff)

# Perform the PCA transformation on the flattened data
X_PCA = pca.transform(X_flatten)
print('After PCA, there are {} essays each represented by a {} length vector'\
      .format(X_PCA.shape[0], X_PCA.shape[1]))

After PCA, there are 979 essays each represented by a 300 length vector


The next step is to shuffle the data and separate it into training and validation sets.

In [55]:
X, y = shuffle(X_PCA, y)
X_train, y_train, X_val, y_val = train_val_split(X, y, train_prop=0.85)

Here we need to transform the labels to the form that the network will predict. For example, in set 1, the essays are graded on a scale from 2-12, therefore there are 11 classes into which the network will try to classify each essay. However, the network will classify essays into the scale 0-10. Therefore, this step will perform this shift on the labels.

In [56]:
y_train_adj = scores_to_preds(y_train, min_score)
print('Training labels shifted from a scale of ({},{}) to ({},{})'\
      .format(min(y_train),max(y_train), min(y_train_adj), max(y_train_adj)))
y_val_adj = scores_to_preds(y_val, min_score)
print('Validation labels shifted from a scale of ({},{}) to ({},{})'\
      .format(min(y_val),max(y_val), min(y_val_adj), max(y_val_adj)))

Training labels shifted from a scale of (0,3) to (0,3)
Validation labels shifted from a scale of (0,3) to (0,3)


In [57]:
batch_size = 32
n_batches = round(X_train.shape[0]/batch_size)
input_dim = X_train.shape[1]
num_classes = max_score-min_score + 1
batch_gen = get_batches(X_train, y_train_adj, batch_size, net_type='mlp')

mlp_net = MLP(input_dim=input_dim, hidden_dims=[128,32], num_classes=num_classes, regression=False, l2_reg=1e-4)

In [58]:
mlp_net.train(gen=batch_gen, X_val=X_val, y_val=y_val_adj, n_epochs=500, n_batches=n_batches, lr=1e-3)

Epoch 10, Batch 1 -- Loss: 0.8917707800865173 Validation accuracy: 0.4109589159488678
[2 2 2 2 1 2 2 2 1 2 2 1 2 1 2 2 1 2 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]
Best validation accuracy! accuracy: 0.4109589159488678%
Model Saved


Epoch 20, Batch 1 -- Loss: 0.2889643609523773 Validation accuracy: 0.4383561611175537
[1 1 1 1 1 1 2 2 1 1 2 1 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]
Best validation accuracy! accuracy: 0.4383561611175537%
Model Saved


Epoch 30, Batch 1 -- Loss: 0.29769450426101685 Validation accuracy: 0.4041095972061157
[1 1 1 1 1 2 2 2 1 1 2 2 2 1 1 2 3 1 1 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 40, Batch 1 -- Loss: 0.10581934452056885 Validation accuracy: 0.39726027846336365
[1 1 1 1 1 2 2 2 1 1 2 2 2 1 1 1 3 1 1 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 50, Batch 1 -- Loss: 0.13126574456691742 Validation accuracy: 0.4041095972061157
[1 1 1 1 3 2 2 1 1 1 2 2 2 1 1 1 3 1 1 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 60, Batch 1 -- Loss: 0.013781010173261166 Validation accuracy: 0.36986300349235535
[1 1 1 1 3 2 2 1 2 1 2 3 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 70, Batch 1 -- Loss: 0.10488999634981155 Validation accuracy: 0.3835616409778595
[1 1 1 1 3 2 2 1 2 1 2 3 2 1 1 1 3 1 1 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 80, Batch 1 -- Loss: 0.17736192047595978 Validation accuracy: 0.3835616409778595
[1 1 1 1 3 2 2 1 2 1 2 3 2 1 1 1 3 1 1 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 90, Batch 1 -- Loss: 0.04077163338661194 Validation accuracy: 0.4109589159488678
[1 1 1 1 3 2 2 1 2 1 2 2 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 100, Batch 1 -- Loss: 0.009091063402593136 Validation accuracy: 0.4109589159488678
[1 1 1 1 1 2 2 1 2 1 2 2 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 110, Batch 1 -- Loss: 0.00876113586127758 Validation accuracy: 0.4109589159488678
[1 1 1 1 1 2 2 1 2 1 2 2 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 120, Batch 1 -- Loss: 0.00806284137070179 Validation accuracy: 0.4109589159488678
[1 1 1 2 2 2 2 1 2 1 2 2 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 130, Batch 1 -- Loss: 0.007894589565694332 Validation accuracy: 0.4178082048892975
[1 1 1 2 2 2 2 1 2 1 2 2 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 140, Batch 1 -- Loss: 0.0074392505921423435 Validation accuracy: 0.4178082048892975
[1 1 1 2 2 2 2 1 3 1 2 2 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 150, Batch 1 -- Loss: 0.006933273747563362 Validation accuracy: 0.42465752363204956
[1 1 1 2 2 2 2 1 3 1 2 2 2 1 1 1 3 1 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 160, Batch 1 -- Loss: 0.0066261859610676765 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 2 1 3 1 2 2 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 170, Batch 1 -- Loss: 0.00647774338722229 Validation accuracy: 0.4452054798603058
[1 1 1 2 2 2 2 1 3 1 2 2 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]
Best validation accuracy! accuracy: 0.4452054798603058%
Model Saved


Epoch 180, Batch 1 -- Loss: 0.005972813814878464 Validation accuracy: 0.4452054798603058
[1 1 1 2 2 2 2 1 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 190, Batch 1 -- Loss: 0.005702212452888489 Validation accuracy: 0.43150684237480164
[1 1 1 2 2 2 2 1 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 200, Batch 1 -- Loss: 0.005618404597043991 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 2 1 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 210, Batch 1 -- Loss: 0.005199503619223833 Validation accuracy: 0.43150684237480164
[1 1 1 2 2 2 2 1 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 220, Batch 1 -- Loss: 0.004908915143460035 Validation accuracy: 0.4109589159488678
[1 1 1 2 2 2 2 1 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 230, Batch 1 -- Loss: 0.004739752504974604 Validation accuracy: 0.39726027846336365
[1 1 1 2 2 2 2 2 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 240, Batch 1 -- Loss: 0.00463905232027173 Validation accuracy: 0.3904109597206116
[1 1 1 2 2 2 2 2 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 250, Batch 1 -- Loss: 0.004337003920227289 Validation accuracy: 0.3835616409778595
[1 1 1 2 2 2 2 2 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 260, Batch 1 -- Loss: 0.0041592977941036224 Validation accuracy: 0.39726027846336365
[1 1 1 2 2 2 2 2 3 1 2 3 2 1 1 1 3 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 270, Batch 1 -- Loss: 0.024469729512929916 Validation accuracy: 0.4383561611175537
[1 1 1 1 2 2 2 1 1 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 280, Batch 1 -- Loss: 0.010750755667686462 Validation accuracy: 0.4452054798603058
[1 1 1 1 2 2 2 1 1 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 290, Batch 1 -- Loss: 0.009202134795486927 Validation accuracy: 0.45890411734580994
[1 1 1 2 2 2 1 1 1 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]
Best validation accuracy! accuracy: 0.45890411734580994%
Model Saved


Epoch 300, Batch 1 -- Loss: 0.008234532549977303 Validation accuracy: 0.45890411734580994
[1 1 1 2 2 2 1 1 1 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 310, Batch 1 -- Loss: 0.00799835380166769 Validation accuracy: 0.45890411734580994
[1 1 1 2 2 2 1 1 1 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 320, Batch 1 -- Loss: 0.00788875576108694 Validation accuracy: 0.45205479860305786
[1 1 1 2 2 2 1 1 1 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 330, Batch 1 -- Loss: 0.0076670353300869465 Validation accuracy: 0.45205479860305786
[1 1 1 2 2 2 1 1 1 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 340, Batch 1 -- Loss: 0.007765050046145916 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 350, Batch 1 -- Loss: 0.007505909539759159 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 360, Batch 1 -- Loss: 0.007398518733680248 Validation accuracy: 0.4452054798603058
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 370, Batch 1 -- Loss: 0.007235968019813299 Validation accuracy: 0.4452054798603058
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 380, Batch 1 -- Loss: 0.007136772386729717 Validation accuracy: 0.4452054798603058
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 390, Batch 1 -- Loss: 0.00683285016566515 Validation accuracy: 0.4452054798603058
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 400, Batch 1 -- Loss: 0.006639535538852215 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 410, Batch 1 -- Loss: 0.0063898940570652485 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 1 1 2 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 420, Batch 1 -- Loss: 0.006227334029972553 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 1 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 430, Batch 1 -- Loss: 0.005969890393316746 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 1 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 440, Batch 1 -- Loss: 0.005764879751950502 Validation accuracy: 0.4383561611175537
[1 1 1 2 2 2 1 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 450, Batch 1 -- Loss: 0.0055386899039149284 Validation accuracy: 0.43150684237480164
[1 1 1 2 2 2 1 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 460, Batch 1 -- Loss: 0.005315128713846207 Validation accuracy: 0.43150684237480164
[1 1 1 2 2 2 1 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 470, Batch 1 -- Loss: 0.005101803690195084 Validation accuracy: 0.43150684237480164
[1 1 1 2 2 2 1 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 480, Batch 1 -- Loss: 0.004919599741697311 Validation accuracy: 0.4178082048892975
[1 1 1 2 2 2 2 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 490, Batch 1 -- Loss: 0.004662978928536177 Validation accuracy: 0.4178082048892975
[1 1 1 2 2 2 2 1 3 2 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]


Epoch 500, Batch 1 -- Loss: 0.004472282249480486 Validation accuracy: 0.42465752363204956
[1 1 1 2 2 2 2 1 3 1 2 3 2 1 1 1 2 3 2 2]
[2 1 2 2 3 2 1 1 1 1 3 3 2 2 1 2 2 2 2 2]
Best validation accuracy over the training period was: 0.45890411734580994%


In [59]:
preds = mlp_net.predict('./model/best_model_mlp', X_val)
# We need to map predictions from classes in the model to actual scores
preds = preds_to_scores(preds, min_score=min_score)

INFO:tensorflow:Restoring parameters from ./model/best_model_mlp


In [60]:
from src.preprocess import quadratic_weighted_kappa

quadratic_weighted_kappa(y_val, preds)

0.07015079142081104

# RNN:
The second half of this notebook may be used for training an RNN - specifically an LSTM or GRU

In [99]:
# Define the path to the data
data_path = './data/train_df.pkl'
train_df = pd.read_pickle(data_path)

# To further isolate our data, we will only examine essays from a single set
# Feel free to experiment with different essay sets!
set = 4
df = train_df.loc[train_df['essay_set'] == set]
df.head()

Unnamed: 0,essay_id,essay_set,rater1_domain1,rater2_domain1,domain1_score,essays_embed,word_count,min_score,max_score,norm_score
5309,8863,4,0,0,0,"[[-0.46765, 0.46369, 0.21761, -0.63619, 0.2019...",43,0.0,4.0,0
5310,8864,4,0,0,0,"[[0.15637, -0.40361, -0.29629, -0.34259, -0.18...",28,0.0,4.0,0
5311,8865,4,3,2,3,"[[-0.46765, 0.46369, 0.21761, -0.63619, 0.2019...",111,0.0,4.0,9
5312,8866,4,1,2,2,"[[-0.46765, 0.46369, 0.21761, -0.63619, 0.2019...",52,0.0,4.0,6
5313,8867,4,2,2,2,"[[-0.46765, 0.46369, 0.21761, -0.63619, 0.2019...",104,0.0,4.0,6


In [100]:
# In order to avoid bias toward more common scores, we will limit the number
# of essays from each scoring bucket to a set value
score_df = None
min_score = int(df['min_score'].min())
max_score = int(df['max_score'].max())

n_max = 500
for i in range(min_score,max_score+1):
    if score_df is None:
        score_df = df.loc[df['domain1_score'] == i][:n_max]
    else:
        temp_df = df.loc[df['domain1_score'] == i][:n_max]
        score_df = pd.concat([score_df, temp_df])
df = score_df

In [101]:
# Extract essay vectors and corresponding scores
X = np.array(df['essays_embed'])
y = np.array(df['domain1_score'])
X = np.stack(X, axis=0)
print('There are {} training essays, each of shape {} x {}'.format(X.shape[0], X.shape[1], X.shape[2]))

There are 1201 training essays, each of shape 120 x 200


The next step is to shuffle the data and separate it into training and validation sets.

In [102]:
X, y = shuffle(X, y)
X_train, y_train, X_val, y_val = train_val_split(X, y, train_prop=0.8)

Here we need to transform the labels to the form that the network will predict. For example, in set 1, the essays are graded on a scale from 2-12, therefore there are 11 classes into which the network will try to classify each essay. However, the network will classify essays into the scale 0-10. Therefore, this step will perform this shift on the labels.

In [103]:
y_train_adj = scores_to_preds(y_train, min_score)
print('Training labels shifted from a scale of ({},{}) to ({},{})'\
      .format(min(y_train),max(y_train), min(y_train_adj), max(y_train_adj)))
y_val_adj = scores_to_preds(y_val, min_score)
print('Validation labels shifted from a scale of ({},{}) to ({},{})'\
      .format(min(y_val),max(y_val), min(y_val_adj), max(y_val_adj)))

Training labels shifted from a scale of (0,3) to (0,3)
Validation labels shifted from a scale of (0,3) to (0,3)


In [104]:
batch_size = 32
n_batches = round(X_train.shape[0]/batch_size)
num_classes = max_score-min_score + 1
seq_length = X_train.shape[1]
embed_size = X_train.shape[2]
batch_gen = get_batches(X_train, y_train_adj, batch_size, net_type='gru')

my_net = RNN(num_classes, batch_size, seq_length, embed_size=embed_size, cell_type='gru',
                 rnn_size=128, num_layers=2, learning_rate=0.005, train_keep_prob=1, sampling=False)

In [105]:
X_val_short = X_val[:batch_size]
y_val_short = y_val[:batch_size]
n_epochs = 10

print('Training Network...')
my_net.train(batch_gen, X_val_short, y_val_short, n_epochs, n_batches)

Training Network...
Initializing training


Epoch 1, step 10 loss: 1.0048  validation accuracy: 0.375  0.1702 sec/batch
Best validation accuracy! - Saving Model
[4 1 1 1 3 0 4 4 1 1 4 1 1 1 4 1 1 1 3 1 1 4 1 1 2 1 4 4 4 4 1 0]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 1, step 20 loss: 1.2582  validation accuracy: 0.5  0.1813 sec/batch
Best validation accuracy! - Saving Model
[2 1 1 1 2 2 2 2 1 1 2 2 1 1 2 1 2 1 2 1 1 2 1 2 2 2 2 2 2 2 1 2]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 1, step 30 loss: 0.9866  validation accuracy: 0.5  0.1743 sec/batch
[2 1 1 1 2 2 2 2 1 1 2 2 1 1 2 2 1 1 2 1 1 2 1 1 2 2 2 1 2 2 1 2]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 2, step 10 loss: 0.7969  validation accuracy: 0.59375  0.3064 sec/batch
Best validation accuracy! - Saving Model
[2 1 1 1 2 1 2 2 1 0 1 1 1 1 1 1 1 1 2 1 1 2 1 1 2 1 2 1 2 1 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 2, step 20 loss: 0.7228  validation accuracy: 0.625  0.2063 sec/batch
Best validation accuracy! - Saving Model
[2 1 1 1 2 1 2 2 1 1 1 1 1 1 1 1 1 1 2 1 1 2 1 1 2 1 2 1 2 1 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 2, step 30 loss: 0.7920  validation accuracy: 0.53125  0.1795 sec/batch
[2 1 1 1 2 2 2 2 1 1 2 2 1 1 2 1 1 1 2 2 1 2 1 2 2 2 2 1 2 2 1 2]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 3, step 10 loss: 0.7502  validation accuracy: 0.5625  0.2111 sec/batch
[2 1 1 1 2 2 2 2 1 1 2 2 1 1 2 1 1 1 2 2 1 2 1 2 2 2 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 3, step 20 loss: 0.8479  validation accuracy: 0.5625  0.1720 sec/batch
[2 1 1 1 2 2 2 2 1 1 2 2 1 1 2 1 1 1 2 2 1 2 1 2 2 2 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 3, step 30 loss: 0.7431  validation accuracy: 0.625  0.1865 sec/batch
[2 1 1 1 2 2 2 2 1 1 2 2 1 1 2 1 1 1 2 2 1 2 1 2 2 1 2 1 2 1 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 4, step 10 loss: 0.8415  validation accuracy: 0.5  0.1943 sec/batch
[2 0 1 0 2 2 2 2 1 0 2 2 0 0 2 1 0 1 2 2 0 2 0 2 2 2 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 4, step 20 loss: 1.0323  validation accuracy: 0.71875  0.1794 sec/batch
Best validation accuracy! - Saving Model
[2 1 1 0 2 2 2 2 1 1 2 2 0 1 1 1 1 1 2 2 1 2 1 1 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 4, step 30 loss: 0.8829  validation accuracy: 0.6875  0.1775 sec/batch
[2 1 1 1 2 2 2 2 1 1 2 1 1 1 1 1 1 1 2 2 1 2 1 1 3 1 2 1 2 1 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 5, step 10 loss: 0.8396  validation accuracy: 0.65625  0.1735 sec/batch
[2 1 1 0 2 2 2 2 1 0 2 2 0 0 1 1 0 1 2 2 1 2 1 2 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 5, step 20 loss: 0.6839  validation accuracy: 0.6875  0.1931 sec/batch
[2 1 1 0 2 2 2 2 1 1 2 2 0 0 1 1 1 1 2 2 1 2 0 2 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 5, step 30 loss: 0.5486  validation accuracy: 0.75  0.1745 sec/batch
Best validation accuracy! - Saving Model
[2 1 1 0 2 2 2 2 1 1 2 1 0 0 1 1 1 1 2 2 1 2 1 2 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 6, step 10 loss: 0.5566  validation accuracy: 0.71875  0.1755 sec/batch
[2 1 1 0 2 2 2 2 1 1 2 1 0 0 2 1 1 1 2 2 1 2 1 2 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 6, step 20 loss: 0.6026  validation accuracy: 0.78125  0.1725 sec/batch
Best validation accuracy! - Saving Model
[2 1 1 0 2 2 2 2 1 1 2 1 0 0 1 1 1 1 2 2 1 1 0 1 3 1 2 1 2 1 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 6, step 30 loss: 0.6879  validation accuracy: 0.78125  0.1811 sec/batch
[2 1 1 0 2 2 2 2 1 1 2 1 0 0 1 1 1 1 2 2 1 1 1 2 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 7, step 10 loss: 0.6050  validation accuracy: 0.65625  0.1833 sec/batch
[2 1 1 0 2 2 2 2 0 1 2 1 0 0 2 1 1 1 2 2 1 2 1 2 2 1 2 1 2 2 1 0]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 7, step 20 loss: 0.4346  validation accuracy: 0.78125  0.1733 sec/batch
[2 1 1 0 2 2 2 2 1 0 2 1 0 0 1 1 1 1 2 2 1 2 1 1 2 1 2 1 2 1 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 7, step 30 loss: 0.4874  validation accuracy: 0.65625  0.2194 sec/batch
[2 1 1 0 2 2 2 2 1 0 2 1 0 0 0 1 1 1 2 2 1 2 0 2 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 8, step 10 loss: 0.6169  validation accuracy: 0.6875  0.1703 sec/batch
[2 1 1 0 2 2 2 2 0 0 2 1 0 0 0 1 1 1 2 2 1 1 1 2 2 1 2 0 2 1 1 0]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 8, step 20 loss: 0.4548  validation accuracy: 0.71875  0.1744 sec/batch
[2 1 1 0 2 2 2 2 0 1 2 1 0 0 1 1 0 1 2 2 1 1 1 2 3 1 2 1 2 1 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 8, step 30 loss: 0.4165  validation accuracy: 0.6875  0.1768 sec/batch
[2 1 1 0 2 2 2 2 1 1 2 1 1 0 1 1 1 1 2 2 1 2 1 2 3 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 9, step 10 loss: 0.4393  validation accuracy: 0.6875  0.1723 sec/batch
[2 1 1 0 2 2 2 2 1 1 2 1 0 0 1 1 1 1 2 2 1 2 0 2 2 1 2 1 2 2 1 0]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 9, step 20 loss: 0.3469  validation accuracy: 0.625  0.1758 sec/batch
[2 1 1 0 2 2 2 2 1 0 2 1 0 0 0 1 0 1 2 2 1 1 1 2 3 1 2 1 2 2 1 0]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 9, step 30 loss: 0.4279  validation accuracy: 0.6875  0.1822 sec/batch
[1 1 1 0 2 2 2 2 1 0 2 1 0 0 0 1 1 1 2 2 1 1 1 1 3 1 2 1 2 1 1 0]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 10, step 10 loss: 0.5123  validation accuracy: 0.65625  0.2112 sec/batch
[2 1 1 0 2 2 2 2 1 0 2 1 0 0 0 1 0 2 2 2 1 1 0 2 3 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 10, step 20 loss: 0.3319  validation accuracy: 0.71875  0.1926 sec/batch
[2 1 1 1 2 2 2 2 1 1 2 1 0 0 1 1 1 1 2 2 1 2 1 2 2 1 2 1 2 2 1 1]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


Epoch 10, step 30 loss: 0.2391  validation accuracy: 0.6875  0.1732 sec/batch
[2 1 1 0 2 2 2 2 1 0 2 2 0 0 1 1 0 1 2 2 1 1 1 2 2 1 2 0 2 1 1 0]
[2 1 1 0 2 2 3 1 1 1 2 1 0 0 1 1 1 2 2 2 1 1 1 1 2 1 2 2 2 1 2 1]


In [106]:
batch_size = X_val.shape[0]
seq_length = X_val.shape[1]
embed_size = X_val.shape[2]

pred_net = RNN(num_classes, batch_size, seq_length, embed_size=embed_size, cell_type='gru',
                 rnn_size=128, num_layers=2, learning_rate=0.005, train_keep_prob=1, sampling=False)
preds = pred_net.predict('./model/best_model_rnn', X_val)
preds = scores_to_preds(preds, min_score)

INFO:tensorflow:Restoring parameters from ./model/best_model_rnn


Running network predictions


In [107]:
from src.preprocess import quadratic_weighted_kappa
quadratic_weighted_kappa(preds[0], y_val)

0.6913285600636436