# **LSTM**

### Package requirements

In [None]:
!pip install focal-loss

In [None]:
!pip install plot_keras_history

Load training set

In [None]:
!unzip data_fly.zip

In [39]:
import numpy as np  # used below for np.vstack

from Load import *
from Train import *
from Utils import *
from Data_augmentation import *
%matplotlib inline
%load_ext autoreload
%autoreload 2



Unzip the LSTM model weights. The extracted folder should then be moved into the Results folder.

In [None]:
!unzip Results/opt_LSTM_model.zip

## **Building training / validation / testing data sets**

### Load training data in full length

In [40]:
X, Y = load_training_data()
train, test, val = train_te_val_split(X,Y)
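The implementation of `train_te_val_split` lives in the project modules and is not shown here. As a rough illustration only, a shuffle-and-cut split that returns `(inputs, labels)` pairs in the same `train, test, val` order could look like the sketch below. The function name `split_train_test_val`, the 60/20/20 proportions, and the array shapes are all assumptions made for the example.

```python
import numpy as np

def split_train_test_val(X, Y, test_frac=0.2, val_frac=0.2, seed=0):
    """Hypothetical stand-in for train_te_val_split: shuffle, then cut into three parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    # Each split is an (inputs, labels) pair, matching the train[0] / train[1] usage.
    return ((X[train_idx], Y[train_idx]),
            (X[test_idx], Y[test_idx]),
            (X[val_idx], Y[val_idx]))

# Toy data: 100 sequences, 10 time steps, 3 features, 4 classes (assumed shapes).
X_demo = np.arange(100 * 10 * 3, dtype=float).reshape(100, 10, 3)
Y_demo = np.zeros((100, 10, 4))
train_demo, test_demo, val_demo = split_train_test_val(X_demo, Y_demo)
print(train_demo[0].shape, test_demo[0].shape, val_demo[0].shape)
```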


### Load split data
(the size T of the split can be tuned)

In [None]:
X, Y = load_training_data()
T=2
train, test, val = data_augmentation(X,Y,T)

### Load augmented training data (Gaussian noise)

In [None]:
X, Y = load_training_data()
train, test, val = data_augmentation_2(X,Y)
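The two data preparation strategies named in this section (adding Gaussian noise, and splitting sequences into shorter windows controlled by T) are implemented in `Data_augmentation`, which is not shown here. The sketch below is only an illustration of the general idea. The helper names, the noise scale `sigma`, the reading of T as the number of windows per sequence, and the per-time-step label layout are assumptions.

```python
import numpy as np

def add_gaussian_noise(X, sigma=0.01, seed=0):
    """Hypothetical helper: return a noisy copy of X (zero-mean Gaussian noise)."""
    rng = np.random.default_rng(seed)
    return X + rng.normal(0.0, sigma, size=X.shape)

def split_into_windows(X, Y, T):
    """Hypothetical helper: cut each sequence into T consecutive windows.

    Assumes X has shape (n, length, features) and Y has shape (n, length, classes).
    Time steps that do not fit into T equal windows are dropped.
    """
    n, length = X.shape[0], X.shape[1]
    win = length // T
    X_cut = X[:, :win * T].reshape(n * T, win, *X.shape[2:])
    Y_cut = Y[:, :win * T].reshape(n * T, win, *Y.shape[2:])
    return X_cut, Y_cut

X_toy = np.arange(2 * 10 * 3, dtype=float).reshape(2, 10, 3)
Y_toy = np.zeros((2, 10, 4))
X_noisy = add_gaussian_noise(X_toy)
X_win, Y_win = split_into_windows(X_toy, Y_toy, T=2)
print(X_noisy.shape, X_win.shape, Y_win.shape)
```

Splitting multiplies the number of training sequences by T while shortening each one, which is why the window count is worth tuning.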

### Evaluate on test set
Stack the training and validation data so the model can be trained on both and evaluated against the test set.

In [None]:
X_tr = np.vstack((train[0], val[0]))
Y_tr = np.vstack((train[1], val[1]))
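For reference, `np.vstack` concatenates arrays along the first axis, so stacking adds the validation sequences as extra samples and leaves the remaining dimensions unchanged. A small illustration with made-up shapes:

```python
import numpy as np

# Made-up shapes: (samples, time steps, features).
train_x = np.zeros((6, 10, 3))
val_x = np.ones((2, 10, 3))

stacked = np.vstack((train_x, val_x))
print(stacked.shape)  # (8, 10, 3)
```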

## **Building a model**

In [None]:
run_exp_hist(train[0], train[1], val[0], val[1], repeats = 5, gamma = 2, node = 100)
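The notebook installs the `focal-loss` package and passes `gamma = 2` to `run_exp_hist`, so `gamma` is presumably the focusing parameter of the focal loss. That reading is an assumption, since `run_exp_hist` is not shown here. Under that assumption, the following NumPy sketch shows how the loss down-weights examples the model already classifies well. It is not the package's implementation.

```python
import numpy as np

def focal_loss(y_true, y_prob, gamma=2.0, eps=1e-7):
    """Illustrative multi-class focal loss: mean of -(1 - p_t)^gamma * log(p_t).

    y_true is one-hot, y_prob holds predicted class probabilities.
    """
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    p_t = np.sum(y_true * y_prob, axis=-1)  # probability assigned to the true class
    return np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t))

y_true_demo = np.array([[1.0, 0.0], [0.0, 1.0]])
y_prob_demo = np.array([[0.9, 0.1], [0.6, 0.4]])

ce = focal_loss(y_true_demo, y_prob_demo, gamma=0.0)   # gamma = 0 is plain cross-entropy
fl = focal_loss(y_true_demo, y_prob_demo, gamma=2.0)
print(ce, fl)
```

With `gamma = 0` the expression reduces to the usual cross-entropy. Larger values shrink the contribution of confident, correct predictions, which can help with imbalanced classes.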

### Cross validation 
Training and validation loss curves are printed for each parameter value.
When CV is set to False, the model is trained on X_train, Y_train only.

Cross validation on the number of nodes and model type
(averaged over 5 repeats -> 5-fold cross validation)

In [None]:
for node in [100,200,300,400,500]:
  print('Node number: ', node)
  run_exp_hist(X_tr,Y_tr, test[0], test[1], node = node, m_type=0)

In [None]:
for node in [600,700,800,900,1000]:
  print('Node number: ', node)
  run_exp_hist(X_tr,Y_tr, test[0], test[1], node = node, m_type=0, CV = True)
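How `run_exp_hist` builds its folds when `CV = True` is not visible in this notebook. As a generic illustration of 5-fold cross validation, the sketch below generates index pairs so that every sample is used for validation exactly once. The helper name `k_fold_indices` and the sample count are hypothetical.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val_idx = folds[i]
        # Train on every fold except the one held out for validation.
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

fold_pairs = list(k_fold_indices(23, k=5))
for fold, (tr_idx, va_idx) in enumerate(fold_pairs):
    print(fold, len(tr_idx), len(va_idx))
```

The reported score for one hyperparameter value would then be the average of the k validation scores.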

Cross validation on dropout.

In [None]:
for d in [0,0.1,0.2,0.3]:
  print('Dropout value : ', d)
  run_exp_hist(X_tr,Y_tr, test[0], test[1], node = 100, m_type=0, dropout = d, CV = True)

Cross validation on model type

In [None]:
for m in [0,1,2]:
  print('Model type : ', m)
  run_exp_hist(X_tr,Y_tr, test[0], test[1], dropout = 0.1, node = 100, m_type=m, CV = True)

Cross validation on data structure (split windows)

In [None]:
for T in [4,7,9]:
  X, Y = load_training_data()
  train, test, val = data_augmentation(X,Y,T)
  print('Split : ', T)
  run_exp_hist(train[0],train[1], val[0], val[1], node = 100, m_type=1, dropout = 0.1, CV = False)

In [None]:
X, Y = load_training_data()
train, test, val = data_augmentation(X,Y,5)
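for reg in [1e-7, 1e-6, 1e-5, 1e-4]:
  print("reg: ", reg)
  # NOTE: reg is never passed to run_exp_hist below, so each iteration trains the same configuration.
  # Forward it through the regularization argument of run_exp_hist for this sweep to have an effect.
  run_exp_hist(train[0],train[1], val[0], val[1], node = 100, m_type=1, dropout = 0.1, CV = False)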


Cross validation on data structure (data augmented)

In [None]:
X, Y = load_training_data()
train, test, val = data_augmentation_2(X,Y)
X_tr = np.vstack((train[0], val[0]))
Y_tr = np.vstack((train[1], val[1])) 
run_exp_hist(X_tr,Y_tr, test[0], test[1], node = 100, m_type=1, dropout=0.1, CV = True)


## **Evaluating on test set**

Stack the train and validation sets to build the final training set.

In [45]:
X_tr = np.vstack((train[0], val[0]))
Y_tr = np.vstack((train[1], val[1]))

In [None]:
run_exp_hist(X_tr, Y_tr, test[0], test[1], repeats = 2, gamma = 2, node=600, dropout = 0.1, m_type=1, LSTM = True, CV = False)


Single run on the test set with the best model architecture.

In [None]:
hist, loss, accuracy, wf1, wf1_, mf1, F1_tab, Ptab, Rtab = evaluate_model(X_tr, Y_tr, test[0], test[1], verbose = 1, plot = 1, single_run = 1)
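The returned names `wf1`, `mf1`, `F1_tab`, `Ptab` and `Rtab` suggest weighted F1, macro F1, and per-class F1, precision and recall tables. That interpretation is an assumption, since `evaluate_model` is not shown here. Under it, the metrics could be computed along the lines of this sketch, where `f1_scores` is a hypothetical helper operating on integer class labels.

```python
import numpy as np

def f1_scores(y_true, y_pred, n_classes):
    """Return per-class F1, macro F1 (plain mean) and weighted F1 (support-weighted mean)."""
    f1 = np.zeros(n_classes)
    support = np.zeros(n_classes)
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp > 0 else 0.0
        recall = tp / (tp + fn) if tp + fn > 0 else 0.0
        if precision + recall > 0:
            f1[c] = 2 * precision * recall / (precision + recall)
        support[c] = np.sum(y_true == c)
    macro = f1.mean()
    weighted = np.sum(f1 * support) / support.sum()
    return f1, macro, weighted

y_true_lbl = np.array([0, 0, 0, 1])
y_pred_lbl = np.array([0, 0, 1, 1])
per_class, macro_f1, weighted_f1 = f1_scores(y_true_lbl, y_pred_lbl, n_classes=2)
print(per_class, macro_f1, weighted_f1)
```

Macro F1 treats every class equally, while weighted F1 follows the class frequencies, so the two diverge when classes are imbalanced.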

Build and save predictions

In [None]:
predict(test[0], test[1],"LSTM")

In [None]:
from google.colab import files
files.download('Results/LSTM_Annotation.csv')
! zip -r Results.zip Results/opt_LSTM_model
files.download("Results.zip")


Google Colab bash shell

In [None]:
!bash 