# Part 1.2 - Extract Predictions for RNN 
In this notebook, we load a pre-trained RNN model and run both our train and test timeseries samples through it, extracting the bottleneck features: the outputs of the layer just before the fully connected classification layer(s).

These features serve as a learned embedding of each sample's timeseries. They will be concatenated with other features to train our final XGBoost classification model.
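To make the idea of bottleneck features concrete, here is a minimal NumPy sketch of a toy two-layer network. It is not the architecture in `rnn.py`: the layer sizes, the ReLU activation, and the `forward` helper are all assumptions chosen for illustration. The point is only that the same forward pass can either continue to class probabilities or stop one layer early and return the hidden activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained network: one hidden ("bottleneck") layer
# followed by a classification layer. All sizes are illustrative.
n_samples, n_inputs, n_bottleneck, n_classes = 5, 8, 16, 14
W_hidden = rng.normal(size=(n_inputs, n_bottleneck))
W_out = rng.normal(size=(n_bottleneck, n_classes))


def forward(x, return_bottleneck=False):
    """Run the toy network; optionally stop before the classifier."""
    hidden = np.maximum(x @ W_hidden, 0.0)  # ReLU activations (assumed)
    if return_bottleneck:
        return hidden  # features, not predictions
    logits = hidden @ W_out
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)  # softmax probabilities


x = rng.normal(size=(n_samples, n_inputs))
bottleneck = forward(x, return_bottleneck=True)
probs = forward(x)
print(bottleneck.shape, probs.shape)  # (5, 16) (5, 14)
```

The bottleneck array has one row per sample and one column per hidden unit, which is the shape a downstream model can consume as ordinary tabular features.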

In [23]:
import warnings
warnings.filterwarnings("ignore")
import math
import pandas as pd
import numpy as np
import time
import tensorflow as tf
from rnn import PlasticcRNN
import matplotlib.pyplot as plt
%matplotlib inline

print(tf.__version__)

1.13.1


### Load Train & Test Data

In [24]:
train = pd.read_pickle('train_rnn.pkl')
test = pd.read_pickle('test_rnn.pkl')

### Load pre-trained RNN model

You can find the code for the RNN model in `rnn.py` if you'd like to look further into the implementation.

In [25]:
model = PlasticcRNN('weight/rnn.npy')

Call `predict_bottleneck` to feed each training example through the pre-trained RNN model and extract the outputs of the layer just before the final classification layer.

In [26]:
train_bn = model.predict_bottleneck(train)

restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/gates/kernel:0
restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/gates/bias:0
restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/candidate/kernel:0
restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/candidate/bias:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/gates/kernel:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/gates/bias:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/candidate/kernel:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/candidate/bias:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/gates/kernel:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/gates/bias:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/candidate/kernel:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/candidate/bias:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/kernel:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/bia

  0%|          | 0/3 [00:00<?, ?it/s]

restore RNN/rnn4/bidirectional_rnn/bw/gru_cell/candidate/kernel:0
restore RNN/rnn4/bidirectional_rnn/bw/gru_cell/candidate/bias:0


4it [00:03,  1.03s/it]                       


Call `predict_bottleneck` to do the same with the testing data. This can take a little time, so it might be worthwhile to move on to the next notebook and return to this one once it's complete.

In [27]:
test_bn = model.predict_bottleneck(test)

restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/gates/kernel:0
restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/gates/bias:0
restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/candidate/kernel:0
restore RNN/rnn3/bidirectional_rnn/fw/gru_cell/candidate/bias:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/gates/kernel:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/gates/bias:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/candidate/kernel:0
restore RNN/rnn3/bidirectional_rnn/bw/gru_cell/candidate/bias:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/gates/kernel:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/gates/bias:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/candidate/kernel:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/gru_cell/candidate/bias:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/kernel:0
restore RNN/rnn5/bidirectional_rnn/fw/output_projection_wrapper/bia

  0%|          | 0/1 [00:00<?, ?it/s]

restore RNN/rnn4/bidirectional_rnn/bw/gru_cell/candidate/kernel:0
restore RNN/rnn4/bidirectional_rnn/bw/gru_cell/candidate/bias:0


2it [00:02,  1.32s/it]                       


Let's verify that we have embedded each of our timeseries into a 16-dimensional space.

In [28]:
print(train_bn.shape)
print(test_bn.shape)

(7848, 16)
(3036, 16)
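Beyond eyeballing the printed shapes, the same check can be made explicit. The sketch below assumes that the embedding should contain exactly one row per unique `object_id` and only finite values; `check_embedding` is a hypothetical helper, not part of `rnn.py`, and the toy data merely stands in for `train` and `train_bn`.

```python
import numpy as np
import pandas as pd


def check_embedding(features, samples, id_col="object_id", dim=16):
    """Raise if `features` is not one `dim`-wide row per unique object."""
    n_objects = samples[id_col].nunique()
    if features.shape != (n_objects, dim):
        raise ValueError(f"expected {(n_objects, dim)}, got {features.shape}")
    if not np.isfinite(features).all():
        raise ValueError("embedding contains NaN or infinite values")
    return True


# Toy data: three objects, several observations each.
toy_samples = pd.DataFrame({"object_id": [615, 615, 713, 730, 730, 730]})
toy_features = np.zeros((3, 16))
check_embedding(toy_features, toy_samples)
```

In the notebook this would be called as `check_embedding(train_bn, train)` while `train_bn` is still a NumPy array, assuming `train` has an `object_id` column as the later cells suggest.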


### Convert Bottleneck Features to DataFrames

We will need to concatenate these bottleneck features with other features stored in a DataFrame, so let's create DataFrames for the train and test datasets now, keeping `object_id` as a column.

In [29]:
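# Name the columns bottleneck0 ... bottleneck15.
train_bn = pd.DataFrame(train_bn, columns=['bottleneck%d' % i for i in range(train_bn.shape[1])])
# Attach object IDs; this assumes the rows of train_bn follow the order of train.object_id.unique().
train_bn['object_id'] = train.object_id.unique().astype("int32")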

In [30]:
train_bn.head()

,bottleneck0,bottleneck1,bottleneck2,bottleneck3,bottleneck4,bottleneck5,bottleneck6,bottleneck7,bottleneck8,bottleneck9,bottleneck10,bottleneck11,bottleneck12,bottleneck13,bottleneck14,bottleneck15,object_id
0,39.626457,0.497825,9.50353,0.61219,0.370296,36.647839,0.00145,0.050136,0.076503,7.458235,0.147237,0.000974,0.472516,56.94091,0.144692,20.955284,615
1,2.315192,6.36234,8.646493,0.750931,2.506892,8.227393,7.570269,9.884037,3.889832,1.148435,2.220537,0.156296,5.254804,1.756874,3.062057,19.185877,713
2,2.349471,29.854164,0.534167,1.965333,7.866547,0.414091,21.975964,3.617615,3.761395,2.10056,0.536902,11.514246,3.76677,0.182644,9.025029,3.807947,730
3,14.26315,32.955151,3.364045,11.65851,2.128215,0.905252,25.994488,2.800233,9.684278,8.299098,0.385034,8.098384,4.9815,0.281322,15.864852,2.367192,745
4,9.047191,27.102867,8.031672,6.901597,2.862491,1.167179,23.666925,1.425559,7.036583,7.211462,0.203276,9.391024,2.969179,0.419447,8.057733,1.478815,1124
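Attaching IDs with `train.object_id.unique()` only labels rows correctly if the embedding rows come out in that same order. Whether `predict_bottleneck` guarantees this depends on `rnn.py`, which is not shown here. What can be stated with certainty is how the two common "unique" functions order their results, illustrated below on made-up IDs.

```python
import numpy as np
import pandas as pd

ids = pd.Series([730, 615, 730, 713, 615])

# pandas keeps the order of first appearance.
appearance_order = ids.unique()
print(appearance_order)  # [730 615 713]

# NumPy returns the unique values sorted.
sorted_order = np.unique(ids)
print(sorted_order)  # [615 713 730]
```

If the input is already sorted by `object_id`, the two agree; otherwise, swapping one for the other would silently assign features to the wrong objects.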


In [31]:
# Same conversion for the test set; again assumes row order matches test.object_id.unique().
test_bn = pd.DataFrame(test_bn, columns=['bottleneck%d' % i for i in range(test_bn.shape[1])])
test_bn['object_id'] = test.object_id.unique().astype("int32")

In [32]:
test_bn.head()

,bottleneck0,bottleneck1,bottleneck2,bottleneck3,bottleneck4,bottleneck5,bottleneck6,bottleneck7,bottleneck8,bottleneck9,bottleneck10,bottleneck11,bottleneck12,bottleneck13,bottleneck14,bottleneck15,object_id
0,5.314604,18.732872,0.109989,0.152188,20.551414,0.765192,24.027542,2.453748,7.116918,1.107046,5.734407,22.341146,11.663008,0.54258,17.808409,1.945663,13
1,5.084124,15.651435,1.266997,1.708105,1.504252,4.000192,14.55086,7.445744,4.548276,2.846517,3.295013,2.950758,6.132604,0.039986,12.180595,2.511426,14
2,3.302041,12.145276,2.155739,0.452114,2.118812,2.622386,15.338934,5.113633,6.906394,0.760518,1.073105,5.979373,4.356204,0.027025,7.365407,3.111598,17
3,0.603152,25.041437,1.588372,2.659944,7.473477,1.960465,20.626919,3.034693,4.911191,8.276651,0.550741,4.876296,6.742918,0.043288,4.671526,2.382284,23
4,0.928996,20.801142,4.854119,2.758434,2.354545,3.899116,8.542637,3.583603,8.108053,4.079149,1.013585,3.548541,3.466989,2.487076,6.708286,3.600367,34


### Store Features to Disk

In [33]:
train_bn.to_pickle('train_bn.pkl')
test_bn.to_pickle('test_bn.pkl')
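As a rough sketch of how a later step might combine these stored features with other per-object features, the example below merges two toy DataFrames on `object_id`. The second frame and its `hypothetical_feature` column are invented for illustration, and the real pipeline may join differently.

```python
import pandas as pd

# Toy bottleneck frame with the same layout as `train_bn` (2 of 16 columns).
bn = pd.DataFrame({
    "bottleneck0": [39.6, 2.3, 2.3],
    "bottleneck1": [0.5, 6.4, 29.9],
    "object_id": [615, 713, 730],
})

# Hypothetical per-object features; the column name is made up.
other = pd.DataFrame({
    "object_id": [713, 615, 730],
    "hypothetical_feature": [0.1, 0.2, 0.3],
})

# validate="one_to_one" raises MergeError if either side repeats an object_id.
merged = other.merge(bn, on="object_id", how="left", validate="one_to_one")
print(merged.shape)  # (3, 4)
```

Merging on the key rather than concatenating column-wise by position means the result does not depend on the two frames sharing a row order.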