## Exercise 3: Recurrent Networks


#### Instructions:
Train an LSTM-based recurrent network that approximates the given function. Submit one notebook for each of the metrics.

a)	Let N be the length of your last name. Let S = (N % 5) + 1. Generate 10000 samples of training data with lengths S up to S + 5. Generate 1000 samples with the same parameters for testing.

b)	Train an LSTM that predicts each of the following metrics of the generated data: skewness, geometric mean, and harmonic mean. Use a separate notebook for each metric, so that each metric has its own generated data. Train the network until the mean absolute error on the testing data is below 0.1.

c)	For each notebook, report the mean squared error, the mean absolute error, and the decile errors. Decile errors refer to the 0th, 10th, 20th, …, 100th percentiles of the errors.
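
As a reference for part (c), the decile errors can be sketched in plain Python. This is a minimal illustration that assumes linear interpolation between sorted values, which is the default behaviour of `np.percentile`; the function names `percentile` and `decile_errors` are hypothetical and are not part of the notebook.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    xs = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    pos = (len(xs) - 1) * q / 100.0
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    return xs[lo] + (xs[hi] - xs[lo]) * frac


def decile_errors(errors):
    """Return the 0th, 10th, ..., 100th percentiles of a list of errors."""
    return [percentile(errors, q) for q in range(0, 101, 10)]


if __name__ == "__main__":
    sample_errors = [0.30, 0.05, 0.10, 0.20, 0.00]
    for q, value in zip(range(0, 101, 10), decile_errors(sample_errors)):
        print('%ith percentile error: %f' % (q, value))
```

The 0th and 100th entries are simply the smallest and largest errors, so the list also shows the best and worst case on the test set.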

In [1]:
import numpy as np
import tensorflow as tf
import scipy as sp
import scipy.stats  # make sure sp.stats (and sp.stats.mstats) is loaded
from sklearn.metrics import mean_squared_error as mse
from sklearn.metrics import mean_absolute_error as mae

In [2]:
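class EarlyStoppingAt(tf.keras.callbacks.Callback):
    """Stop training once the monitored metric drops to `value` or below."""

    def __init__(self, monitor='val_loss', value=0.00001, verbose=0):
        # Call the parent initializer (the original call skipped it).
        super().__init__()
        self.monitor = monitor
        self.value = value
        self.verbose = verbose

    def on_epoch_end(self, epoch, logs=None):
        current = (logs or {}).get(self.monitor)
        # Skip the check if the metric was not logged for this epoch.
        if current is not None and current <= self.value:
            self.model.stop_training = True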

## ---------------- settings input

In [3]:
mode_y = 'skew'
N = len('Marohom')
S = (N % 5) + 1
print('S = %s' % S)
print('solve for %s' % mode_y)

S = 3
solve for skew


## ---------------- data gen and prep

In [4]:
N_SAMPLES = 10_000
N_TEST = 1_000
MAX_TIMESTEPS = S + 5
MASK_VALUE = -1

train_X = np.random.uniform(size=(N_SAMPLES, MAX_TIMESTEPS, 1))
# Sequence lengths from S to S + 5 inclusive (randint's upper bound is exclusive).
train_L = np.random.randint(S, MAX_TIMESTEPS + 1, N_SAMPLES)

test_X = np.random.uniform(size=(N_TEST, MAX_TIMESTEPS, 1))
test_L = np.random.randint(S, MAX_TIMESTEPS + 1, N_TEST)

In [5]:
for i in range(N_SAMPLES):
    train_X[i, train_L[i]:] = MASK_VALUE

In [6]:
for i in range(N_TEST):
    test_X[i, test_L[i]:] = MASK_VALUE
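
The padding step above can be illustrated on a single short sequence. This is a plain-Python sketch, not the notebook's NumPy code; `pad_sequence` and `valid_values` are hypothetical helper names. It relies on the same assumption as the notebook: the samples lie in [0, 1), so a sentinel of -1 can never collide with real data.

```python
MASK_VALUE = -1.0  # sentinel outside the [0, 1) range of the generated samples


def pad_sequence(values, max_len, mask_value=MASK_VALUE):
    """Right-pad a variable-length sequence to max_len with the sentinel."""
    if len(values) > max_len:
        raise ValueError('sequence is longer than max_len')
    return list(values) + [mask_value] * (max_len - len(values))


def valid_values(padded, mask_value=MASK_VALUE):
    """Recover the real values by dropping the padded positions."""
    return [v for v in padded if v != mask_value]


if __name__ == "__main__":
    padded = pad_sequence([0.2, 0.7, 0.5], max_len=5)
    print(padded)                # real values followed by sentinels
    print(valid_values(padded))  # only the real values
```

The Keras `Masking` layer uses the same sentinel to skip padded timesteps, and the masked arrays used for the targets exclude them from the statistics.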

In [7]:
modeList = ['skew','gmean','hmean']

if mode_y == modeList[0]:
    train_y = sp.stats.skew(np.ma.masked_array(train_X, train_X==MASK_VALUE),axis=1, bias=True)
    test_y = sp.stats.skew(np.ma.masked_array(test_X, test_X==MASK_VALUE),axis=1, bias=True)
elif mode_y == modeList[1]:
    train_y = sp.stats.mstats.gmean(np.ma.masked_array(train_X, train_X==MASK_VALUE),axis=1)
    test_y = sp.stats.mstats.gmean(np.ma.masked_array(test_X, test_X==MASK_VALUE),axis=1)
elif mode_y == modeList[2]:
    train_y = sp.stats.mstats.hmean(np.ma.masked_array(train_X, train_X==MASK_VALUE),axis=1)
    test_y = sp.stats.mstats.hmean(np.ma.masked_array(test_X, test_X==MASK_VALUE),axis=1)
else:
    train_y = np.ma.masked_array(train_X, train_X==MASK_VALUE).std(axis=1).data
    test_y = np.ma.masked_array(test_X, test_X==MASK_VALUE).std(axis=1).data
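
For reference, the three target statistics can be written out directly. The notebook itself relies on SciPy; the following is only a plain-Python sketch with hypothetical function names. `skewness` uses the biased (population) estimator, which corresponds to `bias=True`, and returning 0.0 for a constant sequence is a choice made for this sketch rather than SciPy's behaviour.

```python
import math


def skewness(xs):
    """Biased sample skewness: third central moment over variance ** 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    # Assumption of this sketch: treat a constant sequence as having zero skew.
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def geometric_mean(xs):
    """exp of the mean of logs; requires strictly positive values."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    """Reciprocal of the mean of reciprocals; requires non-zero values."""
    return len(xs) / sum(1.0 / x for x in xs)


if __name__ == "__main__":
    seq = [0.2, 0.7, 0.5]
    print('skew  = %f' % skewness(seq))
    print('gmean = %f' % geometric_mean(seq))
    print('hmean = %f' % harmonic_mean(seq))
```

Skewness is the only one of the three that can be negative, and its range is wider than that of the two means, which stay within the range of the inputs.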

## ---------------- model architecture

In [8]:
input_ = tf.keras.Input(shape=(None, 1))
masked = tf.keras.layers.Masking(MASK_VALUE)(input_)
lstm1 = tf.keras.layers.LSTM(15, return_sequences=True)(masked)
lstm2 = tf.keras.layers.LSTM(15)(lstm1)
output = tf.keras.layers.Dense(1)(lstm2)

model = tf.keras.Model(inputs=input_, outputs=output)
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, None, 1)]         0         
_________________________________________________________________
masking (Masking)            (None, None, 1)           0         
_________________________________________________________________
lstm (LSTM)                  (None, None, 15)          1020      
_________________________________________________________________
lstm_1 (LSTM)                (None, 15)                1860      
_________________________________________________________________
dense (Dense)                (None, 1)                 16        
Total params: 2,896
Trainable params: 2,896
Non-trainable params: 0
_________________________________________________________________
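
The parameter counts in the summary can be checked by hand. An LSTM layer has four gates, each with an input kernel, a recurrent kernel, and a bias vector. The helper names below are hypothetical and serve only as an arithmetic sketch.

```python
def lstm_param_count(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_param_count(input_dim, units):
    """One weight per input-output pair plus one bias per output."""
    return input_dim * units + units


if __name__ == "__main__":
    lstm1 = lstm_param_count(input_dim=1, units=15)    # first LSTM layer
    lstm2 = lstm_param_count(input_dim=15, units=15)   # second LSTM layer
    dense = dense_param_count(input_dim=15, units=1)   # output layer
    print(lstm1, lstm2, dense, lstm1 + lstm2 + dense)
```

These values agree with the 1,020, 1,860, and 16 parameters listed per layer and with the total of 2,896.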


In [9]:
model.compile('adam', 'mse', metrics=['mae'])

In [10]:
#filepath = r'models/'
modelPath = ['skew/','gmean/','hmean/']
bestFilePath = r'models/best_model/' + modelPath[modeList.index(mode_y)]

cbList = [
    EarlyStoppingAt(monitor='val_mae', value=0.1),
    tf.keras.callbacks.ModelCheckpoint(bestFilePath, save_best_only=True)
         ]

In [11]:
hist = model.fit(train_X, train_y, validation_data = (test_X, test_y), callbacks=cbList, epochs=25)

Train on 10000 samples, validate on 1000 samples
Epoch 1/25
INFO:tensorflow:Assets written to: models/best_model/skew/assets
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25


In [12]:
prediction = model.predict(test_X)

## ---------------- metrics

In [13]:
# Per-sample errors. Each sample has a single target value, so its squared
# error and absolute error are its per-sample MSE and MAE.
errors = np.ma.getdata(test_y).reshape(-1) - prediction.reshape(-1)
mseList = errors ** 2
maeList = np.abs(errors)

In [14]:
print('Mean Squared Error: %f' % mse(test_y, prediction))
print('Mean Absolute Error: %f' % mae(test_y, prediction))

Mean Squared Error: 0.013894
Mean Absolute Error: 0.085636


In [15]:
for q in range(0, 101, 10):
    print('MSE %ith Percentile Error: %f ' % (q, np.percentile(mseList, q)))


MSE 0th Percentile Error: 0.000000 
MSE 10th Percentile Error: 0.000151 
MSE 20th Percentile Error: 0.000548 
MSE 30th Percentile Error: 0.001339 
MSE 40th Percentile Error: 0.002475 
MSE 50th Percentile Error: 0.003915 
MSE 60th Percentile Error: 0.006308 
MSE 70th Percentile Error: 0.010137 
MSE 80th Percentile Error: 0.016375 
MSE 90th Percentile Error: 0.034050 
MSE 100th Percentile Error: 0.329885 


In [16]:
for q in range(0, 101, 10):
    print('MAE %ith Percentile Error: %f ' % (q, np.percentile(maeList, q)))

MAE 0th Percentile Error: 0.000048 
MAE 10th Percentile Error: 0.012290 
MAE 20th Percentile Error: 0.023410 
MAE 30th Percentile Error: 0.036592 
MAE 40th Percentile Error: 0.049747 
MAE 50th Percentile Error: 0.062572 
MAE 60th Percentile Error: 0.079420 
MAE 70th Percentile Error: 0.100684 
MAE 80th Percentile Error: 0.127964 
MAE 90th Percentile Error: 0.184527 
MAE 100th Percentile Error: 0.574356 
