# Python3 Part of the Data Analysis for the Journal Article Titled:
# <i>Natural language processing for cognitive behavioral therapy: extracting schemas from thought records</i>

>This script accompanies the journal article with the title stated above. The main aim of the research is to determine whether an algorithm can label utterances expressed in thought records with regard to the schema(s) they reflect. Thought record forms are a tool in cognitive therapy with which patients are meant to gain insight into their maladaptive thought processes. According to the theory underlying cognitive therapy, it is these maladaptive thought processes that result in the respective mental illness.  
This script complements an R/KnitR script that consists of the following sections:
    <ol> 
    <li>Preparing data for testing Hypothesis 1</li>
    <li>Testing Hypothesis 2</li>
    <li>Testing Hypothesis 3</li>
    <li>Testing Hypothesis 4</li>
    </ol>
Details concerning the hypotheses, the project background, the raw data, and the data collection process can all be found in the R/KnitR script.<br>
<br>
The modules below need to be installed before running the code:
    <ol>
    <li>gensim==3.8.3</li>
    <li>talos==0.6.3</li>
    <li>tensorflow==2.3.2</li>
    <li>statsmodels==0.10.2</li>
    <li>scipy==1.4.1</li>
    <li>scikit-learn==0.23.2</li>
    <li>numpy==1.16.3</li>
    <li>pandas==0.25.3</li>
    </ol>
<br>
The following inputs are required and can be found in the DataRepository/AnalysisArticle/Data directory:
    <ol>
    <li>glove.6B directory</li>
    <li>DatasetsForH1 directory</li>
    </ol>
<br>
Additionally, the following output is generated:
    <ol>
    <li>data_for_H2.csv file</li>
    <li>per_schema_models directory</li>
    </ol>
with the latter containing all trained per-schema RNN models in .h5 file format.  
<br>
The purpose of this script is to test Hypothesis 1, i.e. to see whether an algorithm can attach the correct schema label to thought record utterances more often than would be expected by chance. A thought record utterance could reflect none, any one, or multiple of 9 possible schemas. Additionally, labels are not binary (does or does not reflect schema) but ordinal (0 - has nothing to do with schema, 1 - has a little bit to do with the schema, 2 - has to do with the schema, 3 - fits perfectly with the schema). <br>
Utterances are in natural language format. It is therefore necessary to preprocess these pieces of text, which we do in R. We also split the entire raw dataset into training, validation, and test sets. The test set is created by taking 15% of the raw data; the validation set is created by taking 15% of the remaining data. <br>
Three algorithms are explored: k-nearest neighbors, support vector machines, and recurrent neural networks. We arrived at the first two by following the decision tree presented by scikit-learn (https://scikit-learn.org/stable/tutorial/machine_learning_map/): the data are ordinal and labeled, and we have fewer than 100k samples. Recurrent neural networks are a logical choice for natural language data, since they allow modelling the temporal aspect that is inherent to sentences as sequences of words.<br>
Wall-clock runtimes are given in the first comment of cells that contain time-intensive code. Additionally, the cell magic "%%time" in these cells prints the actual runtime, so that it can be compared with the reported runtime to obtain appropriate estimates when running on a different machine.<br>
We import the following packages and functions:

In [13]:
import sys
# Add GPU paths
sys.path.append("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/bin")
sys.path.append("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/extras/CUPTI/lib64")
sys.path.append("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/include")
sys.path.append("C:/tools/cuda/bin")

In [14]:
#set seed
seed = 57839
import os
os.environ['PYTHONHASHSEED']=str(seed)

import random
random.seed(seed)

import numpy as np
np.random.seed(seed)

import csv
import pandas as pd
import scipy
import scipy.stats as stats
import functools
import gensim
from gensim.models.doc2vec import Doc2Vec, TaggedDocument


import sklearn
from sklearn.neighbors import NearestNeighbors, KNeighborsClassifier, KNeighborsRegressor
from sklearn import metrics, preprocessing, svm
from sklearn.utils import resample
from sklearn.preprocessing import StandardScaler

import statsmodels.api as sm
import statsmodels.formula.api as smf

import tensorflow as tf
tf.random.set_seed(seed)

from tensorflow.python.keras.metrics import Metric
from tensorflow import keras
import talos
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.utils import np_utils
from keras.layers import Dense, Flatten, Embedding, SimpleRNN, LSTM, GRU, Bidirectional,Dropout,Input

from keras import backend as K
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)

import joblib 
import tensorflow_hub as hub
import tensorflow_text as text  # Imports TF ops for preprocessing.
import bert

In [15]:
print(sys.version)

3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)]


In [16]:
#list packages and their version numbers as used in this script (code is taken from 
#https://stackoverflow.com/questions/40428931/package-for-listing-version-of-packages-used-in-a-jupyter-notebook)
import pkg_resources
import types
def get_imports():
    for name, val in globals().items():
        if isinstance(val, types.ModuleType):
            # Split ensures you get root package, 
            # not just imported function
            name = val.__name__.split(".")[0]

        elif isinstance(val, type):
            name = val.__module__.split(".")[0]

        # Some packages are weird and have different
        # imported names vs. system/pip names. Unfortunately,
        # there is no systematic way to get pip names from
        # a package's imported name. You'll have to add
        # exceptions to this list manually!
        poorly_named_packages = {
            "sklearn": "scikit-learn"
        }
        if name in poorly_named_packages.keys():
            name = poorly_named_packages[name]

        yield name
imports = list(set(get_imports()))

# The only way I found to get the version of the root package
# from only the name of the package is to cross-check the names 
# of installed packages vs. imported packages
requirements = []
for m in pkg_resources.working_set:
    if m.project_name in imports and m.project_name!="pip":
        requirements.append((m.project_name, m.version))

for r in requirements:
    print("{}=={}".format(*r))

tensorflow==2.8.0
talos==1.2.3
statsmodels==0.13.2
scipy==1.8.0
scikit-learn==1.0.2
pandas==1.4.2
numpy==1.22.3
keras==2.8.0
joblib==1.1.0
gensim==4.1.2


> We also check for available GPUs and set the working directory:

In [17]:
# Check for available GPUs
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs," , len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        print(e)

In [18]:
os.chdir("C:/Users/conno/Documents/Repo2")

## Importing the datasets from csv
> The preprocessed utterances are split into three sets in the R script. They are saved in three separate csv files. Additionally, the manually assigned labels that correspond with the utterances are saved in three separate csv files.
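
> The split itself is done in the R script. As a small worked sketch of the proportions described in the introduction (15% of the raw data for testing, then 15% of the remainder for validation), the shares can be computed as follows. The function name `split_fractions` is hypothetical and not part of the original analysis.

```python
def split_fractions(test_frac=0.15, val_frac_of_rest=0.15):
    """Return the (train, validation, test) shares of the raw data."""
    rest = 1.0 - test_frac            # what is left after removing the test set
    val = rest * val_frac_of_rest     # validation share, relative to the raw data
    train = rest - val                # everything else is used for training
    return train, val, test_frac

train_share, val_share, test_share = split_fractions()
print(f"train={train_share:.4f}, validation={val_share:.4f}, test={test_share:.4f}")
```

> The validation set is therefore somewhat smaller than the test set (about 12.75% versus 15% of the raw data), and about 72.25% remains for training.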

In [19]:
# read in datasets (already pre-processed)
def readcsv(fname,istext):
    if istext:
        with open(fname,'rt') as f:
            reader=csv.reader(f)
            next(reader)
            data = []
            for row in reader:
                for item in row:
                    data.append(item)
            f.close()
    else:
        with open(fname,'r') as f:
            reader=csv.reader(f,delimiter=';')
            next(reader)
            data = list(reader)
            data = np.asarray(data, dtype='int')
            f.close()
    return data 

# read in training, validation, and test set utterances
train_text = readcsv('Data/DatasetsForH1/H1_train_texts.csv',True)
val_text = readcsv('Data/DatasetsForH1/H1_validate_texts.csv', True)
test_text = readcsv('Data/DatasetsForH1/H1_test_texts.csv',True)

# read in training, validation, and test set labels
train_labels = readcsv('Data/DatasetsForH1/H1_train_labels.csv',False)[:,0:9]
val_labels = readcsv('Data/DatasetsForH1/H1_validate_labels.csv', False)[:,0:9]
test_labels = readcsv('Data/DatasetsForH1/H1_test_labels.csv',False)[:,0:9]

In [20]:
print(train_text[0:5])

['lot people may think well lot people might not like me', 'might not working fast enough their standards', 'may not able graduate', 'would get bad performance review', 'friends will get annoyed by me']


In [21]:
print(train_labels[0:5,:])

[[2 0 0 0 0 0 0 0 3]
 [0 3 0 0 0 0 0 0 0]
 [0 3 0 0 0 0 0 0 0]
 [0 3 0 0 0 0 0 0 0]
 [2 0 0 0 0 0 0 0 3]]


> As can be seen, some utterances have multiple schemas assigned. Overall, however, the label matrices are sparse. The first column of the labels corresponds to the "Attachment" schema, the second to the "Competence" schema, and the ninth (the last column kept after slicing) to the "Other's views on self" schema.
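
> To make the notions of sparsity and multi-label rows concrete, the following sketch inspects the five label rows printed above. The variable names are illustrative only; the same expressions could be applied to `train_labels`.

```python
import numpy as np

# First five training label rows, copied from the printed output above
labels = np.array([
    [2, 0, 0, 0, 0, 0, 0, 0, 3],
    [0, 3, 0, 0, 0, 0, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 0, 0, 0],
    [2, 0, 0, 0, 0, 0, 0, 0, 3],
])

# Share of non-zero cells: a low value means the matrix is sparse
density = np.count_nonzero(labels) / labels.size

# Number of schemas with a label > 0 for each utterance
schemas_per_utterance = (labels > 0).sum(axis=1)

# Utterances that reflect more than one schema
multi_label_rows = int((schemas_per_utterance > 1).sum())

print(density, schemas_per_utterance, multi_label_rows)
```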

In [22]:
#for later use
schemas = ["Attach","Comp","Global","Health","Control","MetaCog","Others","Hopeless","OthViews"]

## Embedding the utterances using BERT 
>One of the directions the paper mentions as interesting to try is a more modern approach such as BERT for classifying the data. We use a pretrained model from TensorFlow Hub that was trained on English Wikipedia to encode the utterances and then train the model for classification.

Preprocessor: https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3 <br>
Encoder: https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/1 <br>
References: https://www.tensorflow.org/text/tutorials/classify_text_with_bert <br>
Paper: https://arxiv.org/abs/1908.08962

In [24]:
preprocessing_layer = hub.load("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")

OSError: SavedModel file does not exist at: C:\Users\conno\AppData\Local\Temp\tfhub_modules\602d30248ff7929470db09f7385fc895e9ceb4c0\{saved_model.pbtxt|saved_model.pb}

In [12]:
%%time
train_encoder_inputs = preprocessing_layer(np.array(train_text))
test_encoder_inputs = preprocessing_layer(np.array(test_text))
val_encoder_inputs = preprocessing_layer(np.array(val_text))

Wall time: 359 ms


In [13]:
# Save progress BERT encoder inputs
joblib.dump(train_encoder_inputs, 'Data/Embeddings/BERT/train_encoder_inputs.pkl')
joblib.dump(test_encoder_inputs, 'Data/Embeddings/BERT/test_encoder_inputs.pkl')
joblib.dump(val_encoder_inputs, 'Data/Embeddings/BERT/val_encoder_inputs.pkl')

['Data/Embeddings/BERT/val_encoder_inputs.pkl']

In [14]:
train_encoder_inputs = joblib.load("Data/Embeddings/BERT/train_encoder_inputs.pkl")
test_encoder_inputs = joblib.load("Data/Embeddings/BERT/test_encoder_inputs.pkl")
val_encoder_inputs = joblib.load("Data/Embeddings/BERT/val_encoder_inputs.pkl")

In [15]:
train_encoder_inputs

{'input_type_ids': <tf.Tensor: shape=(4151, 128), dtype=int32, numpy=
 array([[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]])>,
 'input_mask': <tf.Tensor: shape=(4151, 128), dtype=int32, numpy=
 array([[1, 1, 1, ..., 0, 0, 0],
        [1, 1, 1, ..., 0, 0, 0],
        [1, 1, 1, ..., 0, 0, 0],
        ...,
        [1, 1, 1, ..., 0, 0, 0],
        [1, 1, 1, ..., 0, 0, 0],
        [1, 1, 1, ..., 0, 0, 0]])>,
 'input_word_ids': <tf.Tensor: shape=(4151, 128), dtype=int32, numpy=
 array([[ 101, 2843, 2111, ...,    0,    0,    0],
        [ 101, 2453, 2025, ...,    0,    0,    0],
        [ 101, 2089, 2025, ...,    0,    0,    0],
        ...,
        [ 101, 2228, 3685, ...,    0,    0,    0],
        [ 101, 2572, 2025, ...,    0,    0,    0],
        [ 101, 2016, 2467, ...,    0,    0,    0]])>}

### Load BERT Encoder From TensorFlow

In [16]:
# Load BERT Embeddings Model
encoder = hub.KerasLayer("https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/1", trainable=True, name='BERT_encoder')

In [17]:
%%time
train_outputs = encoder(train_encoder_inputs)
joblib.dump(train_outputs, 'Data/Embeddings/BERT/train_outputs.pkl')

Wall time: 2min 48s


['Data/Embeddings/BERT/train_outputs.pkl']

In [18]:
%%time
test_outputs = encoder(test_encoder_inputs)
joblib.dump(test_outputs, 'Data/Embeddings/BERT/test_outputs.pkl')

Wall time: 33.8 s


['Data/Embeddings/BERT/test_outputs.pkl']

In [19]:
%%time
val_outputs = encoder(val_encoder_inputs)
joblib.dump(val_outputs, 'Data/Embeddings/BERT/val_outputs.pkl')

Wall time: 28.6 s


['Data/Embeddings/BERT/val_outputs.pkl']

In [25]:
train_outputs = joblib.load('Data/Embeddings/BERT/train_outputs.pkl')
val_outputs = joblib.load('Data/Embeddings/BERT/val_outputs.pkl')
test_outputs = joblib.load('Data/Embeddings/BERT/test_outputs.pkl')

In [26]:
train_outputs["sequence_output"]

<tf.Tensor: shape=(4151, 128, 512), dtype=float32, numpy=
array([[[-0.25892833,  0.02584415,  0.9365302 , ..., -0.96089983,
          2.0598676 ,  0.74836355],
        [-0.30464268,  0.5291489 , -0.0649645 , ...,  0.25906444,
          0.9800552 ,  0.8583957 ],
        [-0.6293181 ,  0.09620693, -0.60556984, ..., -0.3956486 ,
          0.29027948,  0.43135193],
        ...,
        [-0.4576985 , -0.2561149 ,  0.09631284, ...,  0.02169307,
          1.255548  ,  1.0473273 ],
        [-0.10746137, -0.33916706,  0.22457744, ...,  0.25563493,
          1.5337087 ,  0.67972326],
        [-0.17170031, -0.42424932,  0.33490247, ...,  0.05244198,
          1.0300156 ,  0.34129512]],

       [[ 0.2336044 ,  0.43632537, -0.19059956, ..., -1.0399663 ,
          0.70788187,  0.4525274 ],
        [-0.37154025,  0.6752021 , -0.50089115, ..., -0.88574004,
         -0.36298618,  0.60263026],
        [ 0.17977479,  0.6047257 , -0.3719903 , ...,  0.09685966,
         -0.520507  ,  0.9380712 ],
        .

In [27]:
train_outputs["pooled_output"]

<tf.Tensor: shape=(4151, 512), dtype=float32, numpy=
array([[ 0.9896333 ,  0.9338632 , -0.11650713, ...,  0.12776163,
        -0.29113525, -0.8954528 ],
       [ 0.9701939 , -0.12391764, -0.20218222, ..., -0.08036453,
        -0.37490538, -0.90642047],
       [ 0.8312658 , -0.80702513, -0.29396096, ..., -0.0246753 ,
        -0.52115893, -0.7763568 ],
       ...,
       [ 0.96101385,  0.80138934, -0.3839846 , ...,  0.4392669 ,
        -0.27659366, -0.9367416 ],
       [ 0.98007244,  0.0676232 , -0.00312316, ...,  0.04044244,
        -0.67166525, -0.7636678 ],
       [ 0.88983876,  0.97235435, -0.07664951, ...,  0.3174023 ,
        -0.6655615 , -0.8582763 ]], dtype=float32)>

## Embedding the utterances using GLoVE
> We have opted for representing the words in utterances as word vectors. We adopt the GLoVE word vector space that has been created with Wikipedia 2014. First, we tokenize the top 2000 words of the training set.  

In [28]:
# prepare tokenizer
max_words = 2000
t = Tokenizer(num_words = max_words)
t.fit_on_texts(train_text)
vocab_size = len(t.word_index) + 1
print(vocab_size)

2624


> The tokenizer takes the words and indexes them based on frequency. For the recurrent neural net, we need padded utterance sequences. texts_to_sequences simply represents each utterance as a vector of token indices. Padding ensures that all vectors are of the same length by appending 0s to the end of shorter vectors. We pad to a length of 25 words.

In [29]:
# integer encode all utterances
encoded_train = t.texts_to_sequences(train_text)
encoded_validate = t.texts_to_sequences(val_text)
encoded_test = t.texts_to_sequences(test_text)

# pad documents to a max length of 25 words
max_length = 25

padded_train = pad_sequences(encoded_train, maxlen=max_length, padding='post')
padded_validate = pad_sequences(encoded_validate, maxlen=max_length, padding='post')
padded_test = pad_sequences(encoded_test, maxlen=max_length, padding='post')

print(encoded_train[0:5])

[[147, 28, 48, 37, 101, 147, 28, 32, 1, 8, 5], [32, 1, 155, 658, 14, 125, 568], [48, 1, 19, 448], [2, 11, 53, 449, 659], [50, 6, 11, 373, 98, 5]]


In [30]:
print(padded_train[0:5])

[[147  28  48  37 101 147  28  32   1   8   5   0   0   0   0   0   0   0
    0   0   0   0   0   0   0]
 [ 32   1 155 658  14 125 568   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0]
 [ 48   1  19 448   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0]
 [  2  11  53 449 659   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0]
 [ 50   6  11 373  98   5   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0]]
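
> The padding step can be illustrated without Keras. The sketch below is a hypothetical helper (`pad_post`) that mimics `pad_sequences(..., padding='post')`, assuming the Keras default of cutting sequences longer than `maxlen` at the front. It is meant as an illustration, not as a replacement for the call above.

```python
import numpy as np

def pad_post(sequences, maxlen, value=0):
    """Pad with `value` at the end; cut overly long sequences at the front."""
    out = np.full((len(sequences), maxlen), value, dtype="int32")
    for row, seq in enumerate(sequences):
        kept = seq[-maxlen:]            # keep the last `maxlen` token indices
        out[row, :len(kept)] = kept     # write them left-aligned, zeros stay behind
    return out

# Two encoded utterances taken from the printed output above
encoded = [[48, 1, 19, 448], [2, 11, 53, 449, 659]]
padded = pad_post(encoded, maxlen=6)
print(padded)
```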


> We can now load the GLoVE embeddings into memory.

In [31]:
%%time
# wall time to run: ~ 10sec
# load all embeddings into memory
embeddings_index = dict()
f = open('Data/glove.6B/glove.6B.100d.txt', encoding="utf8")
for line in f:
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
f.close()
print('Loaded %s word vectors.' % len(embeddings_index))

Loaded 400000 word vectors.
Wall time: 6.37 s


> We can then create an embedding matrix by taking each word of the training set and finding the corresponding word vector in the GLoVE data. We only work with 100 dimensional representations.

In [32]:
vec_dims = 100
embedding_matrix = np.zeros((vocab_size, vec_dims))
for word, i in t.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
embedding_matrix.shape

(2624, 100)

In [33]:
# create tfidf weighted encoding matrix of utterances
train_sequences = t.texts_to_matrix(train_text,mode='tfidf')
val_sequences =  t.texts_to_matrix(val_text,mode='tfidf')
test_sequences = t.texts_to_matrix(test_text,mode='tfidf')
print(train_sequences[0:5])
print(train_sequences.shape)

[[0.         1.29214445 0.         ... 0.         0.         0.        ]
 [0.         1.29214445 0.         ... 0.         0.         0.        ]
 [0.         1.29214445 0.         ... 0.         0.         0.        ]
 [0.         0.         1.69021763 ... 0.         0.         0.        ]
 [0.         0.         0.         ... 0.         0.         0.        ]]
(4151, 2000)


In [34]:
# we want to normalize the word vectors
def normalize(v):
    norm = np.linalg.norm(v)
    if norm == 0: 
       return v
    return v / norm

In [35]:
# create utterance embeddings as tfidf weighted average of normalized word vectors
def seq2vec(datarow,embedmat):
  #initialize an empty utterance vector of the same length as the GLoVE word vectors
  seqvec = np.zeros((100,))
  #counter that serves as the divisor of the average (starts at 1)
  wordcount = 1
  #we iterate over the 2000 possible words in a given utterance (index 0 is never assigned by the tokenizer)
  wordind = 1
  while (wordind < len(datarow)):
    #the tf-idf weight is saved in the cells of datarow
    tfidfweight = datarow[wordind]
    #tf-idf weights are floats and never None, so this branch runs for every index
    if tfidfweight is not None:
      wordembed = tfidfweight * embedmat[wordind,]
      seqvec = seqvec + normalize(wordembed)
      wordcount = wordcount + 1
    wordind = wordind + 1
  return seqvec/wordcount

In [36]:
# go through the matrix and embed each utterance
def embed_utts(sequences,embedmat):
  vecseq = [seq2vec(seq,embedmat)for seq in sequences]
  return vecseq
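
> The loop in `seq2vec` can also be written with array operations. The sketch below is a hypothetical vectorized variant (`seq2vec_vectorized`) checked against a compact re-statement of the loop on toy data; the toy matrix and tf-idf row are made up for illustration. It relies on two observations about the code above: index 0 is never visited, and the divisor ends up equal to `len(datarow)`.

```python
import numpy as np

def normalize(v):
    norm = np.linalg.norm(v)
    return v if norm == 0 else v / norm

def seq2vec_loop(datarow, embedmat):
    # Reference: same arithmetic as the notebook's seq2vec loop
    seqvec = np.zeros(embedmat.shape[1])
    for wordind in range(1, len(datarow)):
        seqvec = seqvec + normalize(datarow[wordind] * embedmat[wordind])
    return seqvec / len(datarow)

def seq2vec_vectorized(datarow, embedmat):
    # Weight every word vector at once; index 0 is skipped as in the loop
    weighted = datarow[1:, None] * embedmat[1:len(datarow)]
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    # Rows with zero norm stay all-zero, as in normalize()
    unit = weighted / np.where(norms == 0, 1.0, norms)
    return unit.sum(axis=0) / len(datarow)

rng = np.random.default_rng(0)
embedmat = rng.normal(size=(8, 4))                   # toy embedding matrix
datarow = np.array([0.0, 1.3, 0.0, 2.1, 0.0, 0.7])   # toy tf-idf row

a = seq2vec_loop(datarow, embedmat)
b = seq2vec_vectorized(datarow, embedmat)
c = seq2vec_vectorized(5.0 * datarow, embedmat)      # all weights scaled by 5
print(np.allclose(a, b), np.allclose(b, c))
```

> The last comparison illustrates a property of this formulation: because each weighted word vector is rescaled to unit length, the magnitude of a positive tf-idf weight cancels, so the weight effectively only marks whether a word occurs in the utterance.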

> We now have everything needed to create the utterance embeddings.

In [32]:
%%time
# wall time to run: ~ 1min 14s
# embed all three datasets
train_embedutts = embed_utts(train_sequences,embedding_matrix)
val_embedutts = embed_utts(val_sequences,embedding_matrix)
test_embedutts = embed_utts(test_sequences,embedding_matrix)
print(train_embedutts[0])

[-3.44478099e-05  3.48539840e-04  3.35962753e-04 -3.70855457e-04
 -2.63147438e-04  1.07227074e-04 -1.66559109e-04  1.94234500e-05
  7.42420570e-05 -1.80615841e-04  1.80387578e-05  4.92242034e-05
  2.75006568e-04  2.34416192e-05  8.31148165e-05 -2.93833280e-04
 -7.15121389e-05  2.98592314e-04 -4.55134987e-04  4.72657153e-04
  2.57585086e-04  1.69741478e-04  7.75960265e-05 -2.15817394e-04
 -4.34789085e-05  7.24571212e-05 -1.54585404e-04 -4.98166781e-04
  1.93941088e-04 -1.74921206e-04  2.37557331e-05  4.85150809e-04
  3.08554881e-05 -4.62293641e-05  1.35110613e-04  2.80284189e-04
 -3.22980711e-05  3.12968134e-04  8.27704500e-05 -2.40951546e-04
 -3.13527886e-04 -1.35440392e-04 -2.05195768e-05 -4.81099111e-04
 -2.75375333e-04 -1.27601856e-04  2.50256011e-04 -2.50631136e-04
 -1.51297680e-04 -8.33555219e-04  3.54382525e-05 -8.74190709e-05
  1.05239327e-05  8.00132559e-04 -1.52039351e-04 -1.90058573e-03
  9.49153013e-05 -1.17238522e-05  1.18845110e-03  3.93093667e-04
 -1.88908628e-04  9.94003

In [33]:
# Save embeddings for quick loading later
joblib.dump(train_embedutts, 'Data/Embeddings/GLoVE/train_embedutts.pkl')
joblib.dump(val_embedutts, 'Data/Embeddings/GLoVE/val_embedutts.pkl')
joblib.dump(test_embedutts, 'Data/Embeddings/GLoVE/test_embedutts.pkl')

['Data/Embeddings/GLoVE/test_embedutts.pkl']

In [37]:
# Load them back in
train_embedutts = joblib.load('Data/Embeddings/GLoVE/train_embedutts.pkl')
val_embedutts = joblib.load('Data/Embeddings/GLoVE/val_embedutts.pkl')
test_embedutts = joblib.load('Data/Embeddings/GLoVE/test_embedutts.pkl')

## Model evaluation
> We use the Spearman correlation to evaluate the models and choose the best one, because it can be used for both the regression and the classification outcomes. This is not the case for a weighted Cohen's Kappa, for example, which only works for class labels.

In [38]:
#### Goodness of Fit
def gof_spear(X,Y):
    #spearman correlation of columns (schemas)
    gof_spear = np.zeros(X.shape[1])    
    for schema in range(9):
        rho,p = scipy.stats.spearmanr(X[:,schema],Y[:,schema])
        gof_spear[schema]=rho
    return gof_spear
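
> The following sketch shows why the Spearman correlation suits both kinds of output: it only depends on ranks, so it can be computed for class predictions and for continuous regression predictions alike. All prediction vectors are made-up toy values. The last case shows that the correlation is undefined when a model predicts the same value for every utterance, which is one plausible source of NaN entries in result tables, although we have not verified this for each case.

```python
import warnings

import numpy as np
from scipy.stats import spearmanr

y_true = np.array([0, 1, 2, 3, 0, 2])        # ordinal labels for one schema
pred_class = np.array([0, 1, 2, 2, 0, 3])    # hypothetical class predictions
pred_reg = 0.5 * y_true + 0.1                # hypothetical regression output
pred_const = np.zeros_like(y_true)           # model that always predicts 0

rho_class, _ = spearmanr(pred_class, y_true)
rho_reg, _ = spearmanr(pred_reg, y_true)

# A constant input has no rank variation; scipy warns and returns nan
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    rho_const, _ = spearmanr(pred_const, y_true)

print(rho_class, rho_reg, rho_const)
```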

## Bootstrapping for confidence intervals
> Since all models are expensive to run, we only run a small bootstrap to obtain some insight into how confident we can be about the predictions.

In [39]:
# we adopt the algorithm from the following website:
# https://machinelearningmastery.com/calculate-bootstrap-confidence-intervals-machine-learning-results-python/
# def bootstrap
def bootstrap(train_X, train_y, iterations, sample_size, sample_embeds, sample_labels, classification, model, embedding="GLoVE"):
    stats = np.zeros((iterations,9))
    for l in range(iterations):
        # prepare bootstrap sample
        bootstrap_sample_indx = random.sample(list(enumerate(sample_embeds)), sample_size)
        bootstrap_sample_utts = [sample_embeds[i] for (i,j) in bootstrap_sample_indx]
        bootstrap_sample_labels = [sample_labels[i] for (i,j) in bootstrap_sample_indx]
        # evaluate model
        if model=="knn":
            model_gof=my_kNN(train_X, train_y, np.array(bootstrap_sample_utts),np.array(bootstrap_sample_labels),classification)
        elif model=="svm":
            model_gof=my_svm(train_y, np.array(bootstrap_sample_utts),np.array(bootstrap_sample_labels),classification, embedding)
        elif model=="rnn":
            model_gof=my_rnn_fixed(train_X, train_y, np.array(bootstrap_sample_utts),np.array(bootstrap_sample_labels),classification)
        stats[l,:] = model_gof
    # confidence intervals
    cis = np.zeros((2,9))
    alpha = 0.95
    p = ((1.0-alpha)/2.0) * 100
    cis[0,:] = [max(0.0, np.percentile(stats[:,i], p)) for i in range(9)]
    p = (alpha+((1.0-alpha)/2.0)) * 100
    cis[1,:] = [min(1.0, np.percentile(stats[:,i], p)) for i in range(9)]
    return cis

# configure bootstrap
n_iterations = 100
n_size = int(len(val_text) * 0.75)
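
> The interval computation at the end of `bootstrap` can be looked at in isolation. The sketch below is a hypothetical helper (`percentile_ci`) that takes a matrix of per-iteration statistics and returns the lower and upper percentile bounds per column; unlike the function above it does not clip the bounds to [0, 1]. Note also that `random.sample` draws without replacement, so each iteration above evaluates a 75% subsample of the validation set.

```python
import numpy as np

def percentile_ci(stats, alpha=0.95):
    """Percentile interval per column of `stats` (iterations x schemas)."""
    lower_p = (1.0 - alpha) / 2.0 * 100              # 2.5 for alpha = 0.95
    upper_p = (alpha + (1.0 - alpha) / 2.0) * 100    # 97.5 for alpha = 0.95
    lower = np.percentile(stats, lower_p, axis=0)
    upper = np.percentile(stats, upper_p, axis=0)
    return np.vstack([lower, upper])                 # row 0: lower, row 1: upper

# Toy statistics: 101 iterations, 2 schemas, values 0..100 in each column
toy_stats = np.tile(np.arange(101.0)[:, None], (1, 2))
ci = percentile_ci(toy_stats)
print(ci)
```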

## k-nearest Neighbors Classification and Regression
> Since we have ordinal labels for our data, we train both classification and regression algorithms and see which one performs better. We also have multi-label data, and therefore write a custom kNN algorithm. We use the cosine distance to find the nearest neighbors.

In [40]:
# cosine distance
def cosine_dist(X,Y):
    return scipy.spatial.distance.cosine(X,Y)
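
> As a brief illustration of the distance used here: the cosine distance is one minus the cosine of the angle between two vectors, so it ignores vector length. The sketch compares a manual computation (the helper `cosine_dist_manual` is hypothetical) with the SciPy function on toy vectors.

```python
import numpy as np
from scipy.spatial.distance import cosine

def cosine_dist_manual(x, y):
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    # 1 - (x . y) / (|x| * |y|)
    return 1.0 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

u = np.array([1.0, 0.0])
v = np.array([0.0, 2.0])   # orthogonal to u
w = np.array([3.0, 0.0])   # same direction as u, different length

print(cosine_dist_manual(u, v), cosine(u, v))
print(cosine_dist_manual(u, w), cosine(u, w))
```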

In [41]:
#kNN algorithm
def knn_custom(train_X,test_X,train_y,test_y,k,dist,classification):
    #empty array to collect the results (should have shape of samples to classify)
    votes = np.zeros(test_y.shape)
    #fit the knn
    knn=NearestNeighbors(n_neighbors=k, metric=dist)
    knn.fit(train_X)
    #collect neighbors
    i=0 # index to collect votes of the neighbors
    for sample in test_X:
        neighbors=knn.kneighbors([sample],k,return_distance=False)[0]
        if classification:
            output_y = np.zeros((k,test_y.shape[1]))
            j=0
            for neighbor in neighbors:
                output_y[j,:] = train_y[neighbor,:]
                j=j+1
            votes[i,:] = stats.mode(output_y,nan_policy='omit')[0]
        else:
            output_y = np.zeros(test_y.shape[1])
            for neighbor in neighbors:
                output_y += train_y[neighbor,:]
            # average the summed neighbor labels once, after the loop
            votes[i,:] = np.divide(output_y,k)
        i=i+1
    return votes
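
> The aggregation step for a single utterance can be sketched on its own. The hypothetical helper `neighbor_vote` below takes the label rows of the k nearest neighbors and returns either the column-wise majority label (classification) or the column-wise mean (regression). It uses `np.bincount` rather than `stats.mode`, and in this sketch ties resolve to the smallest label.

```python
import numpy as np

def neighbor_vote(neighbor_labels, classification):
    neighbor_labels = np.asarray(neighbor_labels)
    if classification:
        # most frequent label per schema; argmax picks the smallest on ties
        return np.array([np.bincount(col).argmax() for col in neighbor_labels.T])
    # regression: average label per schema
    return neighbor_labels.mean(axis=0)

# Toy labels of k=3 neighbors for 3 schemas
toy_neighbors = np.array([[2, 0, 3],
                          [2, 0, 0],
                          [0, 3, 0]])

class_vote = neighbor_vote(toy_neighbors, classification=True)
reg_vote = neighbor_vote(toy_neighbors, classification=False)
print(class_vote, reg_vote)
```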

> To evaluate choices for k, we summarize the per-schema Spearman correlations obtained for each choice of k in a single performance metric: their frequency-weighted mean. As weights we use the frequencies of schemas (# of utterances with labels > 0 for a given schema/total number of utterances) in the training set.

In [42]:
# weighting model output (spearman correlations) by schema frequencies in training set and returning mean over schemas
def performance(train_y,output):
    train_y = np.array(train_y)
    train_y[train_y>0]=1
    weighting = train_y.sum(axis=0)/train_y.shape[0]
    perf = output * weighting
    return np.nanmean(np.array(perf), axis=0)
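
> A small worked example may help to read this metric. The sketch repeats the definition of `performance` so that it runs on its own and applies it to made-up labels for two schemas and made-up correlations.

```python
import numpy as np

def performance(train_y, output):
    train_y = np.array(train_y)          # copy, so the caller's labels stay intact
    train_y[train_y > 0] = 1             # 1 if the schema is present at all
    weighting = train_y.sum(axis=0) / train_y.shape[0]
    perf = output * weighting
    return np.nanmean(np.array(perf), axis=0)

# Toy training labels: schema 1 occurs in 2 of 4 utterances, schema 2 in 1 of 4
toy_labels = np.array([[2, 0], [0, 3], [1, 0], [0, 0]])

score = performance(toy_labels, np.array([0.6, 0.4]))         # mean of 0.30 and 0.10
score_nan = performance(toy_labels, np.array([0.6, np.nan]))  # NaN schema is skipped
print(score, score_nan)
```

> Note that the weights are not divided by their sum, so the value is the plain mean of the weighted correlations. This is sufficient for comparing choices of k, since the weights are the same for every k.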

In [43]:
# finding best k by testing some values for k
def find_k(train_X, test_X, train_y, test_y, dist, classification):
    perf = 0
    best_k = 0
    for k in [2,3,4,5,6,7,8,9,10,30,100]:
        knn_k = knn_custom(train_X, test_X, train_y, test_y,k,dist,classification)
        knn_gof_spear = gof_spear(knn_k,test_y)
        print('Results for choice of k is %s.' % k)
        print(pd.DataFrame(data=knn_gof_spear,index=schemas,columns=['gof']))
        if perf < performance(train_y,knn_gof_spear):
            perf = performance(train_y,knn_gof_spear)
            best_k = k
    return best_k

In [45]:
%%time
# wall time to run: ~ 15min
# find best k for classification
knn_class_k_glove = find_k(train_embedutts,val_embedutts,train_labels,val_labels,cosine_dist,1)

Wall time: 0 ns


In [42]:
print('Best choice for classification k is with GLoVE: %s' % knn_class_k_glove)

Best choice for classification k is with GLoVE: 100


In [43]:
knn_class_k_bert = find_k(train_outputs["pooled_output"],val_outputs["pooled_output"],train_labels,val_labels,cosine_dist,1) 

Results for choice of k is 2.
               gof
Attach    0.520207
Comp      0.538653
Global    0.440397
Health    0.517527
Control   0.067278
MetaCog  -0.008059
Others   -0.008982
Hopeless  0.501311
OthViews  0.498057
Results for choice of k is 3.
               gof
Attach    0.532753
Comp      0.513828
Global    0.496175
Health    0.523938
Control   0.136950
MetaCog  -0.005695
Others         NaN
Hopeless  0.496209
OthViews  0.508845
Results for choice of k is 4.
               gof
Attach    0.546332
Comp      0.579449
Global    0.508682
Health    0.517595
Control   0.125264
MetaCog        NaN
Others   -0.006347
Hopeless  0.560365
OthViews  0.500715
Results for choice of k is 5.
               gof
Attach    0.495804
Comp      0.553438
Global    0.507869
Health    0.517595
Control   0.140895
MetaCog        NaN
Others         NaN
Hopeless  0.511332
OthViews  0.464652
Results for choice of k is 6.
               gof
Attach    0.510742
Comp      0.570426
Global    0.502463
Health    0.51

In [44]:
print('Best choice for classification k is with BERT Embeddings: %s' % knn_class_k_bert)

Best choice for classification k is with BERT Embeddings: 7


In [45]:
%%time
# wall time to run: ~ 15min
# find best k for regression
knn_reg_k_glove = find_k(train_embedutts,val_embedutts,train_labels,val_labels,cosine_dist,0)

Results for choice of k is 2.
               gof
Attach    0.301481
Comp      0.172417
Global    0.421375
Health    0.275585
Control        NaN
MetaCog        NaN
Others    0.039193
Hopeless  0.324713
OthViews  0.245623
Results for choice of k is 3.
               gof
Attach    0.309176
Comp      0.181459
Global    0.409544
Health    0.275585
Control   0.107267
MetaCog        NaN
Others    0.028961
Hopeless  0.324815
OthViews  0.257459
Results for choice of k is 4.
               gof
Attach    0.543883
Comp      0.643099
Global    0.526240
Health    0.550825
Control   0.236496
MetaCog  -0.023015
Others    0.022758
Hopeless  0.524707
OthViews  0.447674
Results for choice of k is 5.
               gof
Attach    0.579190
Comp      0.652242
Global    0.485816
Health    0.583227
Control   0.222230
MetaCog  -0.031264
Others    0.065061
Hopeless  0.527806
OthViews  0.466995
Results for choice of k is 6.
               gof
Attach    0.591500
Comp      0.643973
Global    0.480999
Health    0.55

In [46]:
print('Best choice for regression k for GLoVE is: %s' % knn_reg_k_glove)

Best choice for regression k for GLoVE is: 8


In [47]:
%%time
# wall time to run: ~ 15min
# find best k for regression
knn_reg_bert_k = find_k(train_outputs["pooled_output"],val_outputs["pooled_output"],train_labels,val_labels,cosine_dist,0) 

Results for choice of k is 2.
               gof
Attach    0.560861
Comp      0.550576
Global    0.423506
Health    0.424523
Control   0.192486
MetaCog   0.068470
Others    0.077364
Hopeless  0.450919
OthViews  0.512912
Results for choice of k is 3.
               gof
Attach    0.576818
Comp      0.545873
Global    0.454746
Health    0.401647
Control   0.205284
MetaCog   0.087813
Others    0.087105
Hopeless  0.441195
OthViews  0.498680
Results for choice of k is 4.
               gof
Attach    0.576737
Comp      0.558685
Global    0.442498
Health    0.380061
Control   0.236177
MetaCog   0.111981
Others    0.066656
Hopeless  0.454449
OthViews  0.497457
Results for choice of k is 5.
               gof
Attach    0.577037
Comp      0.552242
Global    0.439031
Health    0.365531
Control   0.235795
MetaCog   0.099507
Others    0.134306
Hopeless  0.444075
OthViews  0.503337
Results for choice of k is 6.
               gof
Attach    0.566563
Comp      0.560409
Global    0.430962
Health    0.37

In [48]:
print('Best choice for regression k for BERT is: %s' % knn_reg_bert_k)

Best choice for regression k for BERT is: 4


> Since this is needed for the bootstrapping algorithm, we define a function that takes a test set and its labels and returns the goodness of fit. The function fixes k at 4 for classification and at 5 for regression. We print the results on the test set.

In [49]:
def my_kNN(train_X, train_y, test_X,test_y,classification):
    if classification:
        my_knn=knn_custom(train_X,test_X,train_y,test_y,4,cosine_dist,1)
    else:
        my_knn=knn_custom(train_X,test_X,train_y,test_y,5,cosine_dist,0)
    return gof_spear(my_knn,test_y)

In [50]:
%%time
#wall time to run: ~ 2.5 min
output_kNN_class_glove = my_kNN(train_embedutts, train_labels, test_embedutts,test_labels,1)
output_kNN_reg_glove = my_kNN(train_embedutts, train_labels, test_embedutts,test_labels,0)

Wall time: 2min


In [51]:
# Save the test-set goodness-of-fit results
joblib.dump(output_kNN_class_glove, 'Data/KNN/output_kNN_class_glove.pkl')
joblib.dump(output_kNN_reg_glove, 'Data/KNN/output_kNN_reg_glove.pkl')

['Data/KNN/output_kNN_reg_glove.pkl']

In [52]:
%%time
#wall time to run: ~ 4min
output_kNN_class_bert = my_kNN(train_outputs["pooled_output"], train_labels, test_outputs["pooled_output"],test_labels,1)
output_kNN_reg_bert = my_kNN(train_outputs["pooled_output"], train_labels, test_outputs["pooled_output"],test_labels,0)

Wall time: 3min 7s


In [53]:
# Save kNN goodness-of-fit results
joblib.dump(output_kNN_class_bert, 'Data/KNN/output_kNN_class_bert.pkl')
joblib.dump(output_kNN_reg_bert, 'Data/KNN/output_kNN_reg_bert.pkl')

['Data/KNN/output_kNN_reg_bert.pkl']

In [46]:
# Load saved kNN goodness-of-fit results back in
output_kNN_class_glove = joblib.load('Data/KNN/output_kNN_class_glove.pkl')
output_kNN_reg_glove = joblib.load('Data/KNN/output_kNN_reg_glove.pkl')
output_kNN_class_bert = joblib.load('Data/KNN/output_kNN_class_bert.pkl')
output_kNN_reg_bert = joblib.load('Data/KNN/output_kNN_reg_bert.pkl')

In [55]:
print('KNN Classification Prediction GLoVE')
print(pd.DataFrame(data=output_kNN_class_glove,index=schemas,columns=['estimate']))

KNN Classification Prediction GLoVE
          estimate
Attach    0.130606
Comp      0.135201
Global    0.204418
Health    0.249344
Control  -0.011459
MetaCog        NaN
Others         NaN
Hopeless  0.167857
OthViews  0.157289
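
> The `NaN` entries deserve a note. A rank correlation is undefined when one of the two variables is constant, and `scipy.stats.spearmanr` returns `nan` in that case. One plausible reading, assuming `gof_spear` computes a per-schema Spearman correlation (the function is defined outside this section), is that the classifier predicted the same label for every test utterance on those schemas. The pure-Python sketch below illustrates the effect; the function names and the toy labels are made up for this example.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = (start + end) / 2.0 + 1.0
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        # a constant input has no rank variation, so the correlation is undefined
        return float("nan")
    return cov / math.sqrt(var_x * var_y)

# toy ordinal labels (made up)
true_labels = [0, 1, 2, 3, 0, 1]
varied_pred = [0, 1, 3, 3, 0, 2]
constant_pred = [0, 0, 0, 0, 0, 0]

rho_varied = spearman(varied_pred, true_labels)
rho_constant = spearman(constant_pred, true_labels)
print(rho_varied, rho_constant)
```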


In [56]:
print('KNN Classification Prediction BERT')
print(pd.DataFrame(data=output_kNN_class_bert,index=schemas,columns=['estimate']))

KNN Classification Prediction BERT
          estimate
Attach    0.531913
Comp      0.587256
Global    0.375899
Health    0.500031
Control   0.102623
MetaCog        NaN
Others    0.183972
Hopeless  0.491567
OthViews  0.401721


In [57]:
print('KNN Regression Prediction GloVE')
print(pd.DataFrame(data=output_kNN_reg_glove,index=schemas,columns=['estimate']))

KNN Regression Prediction GloVE
          estimate
Attach    0.606499
Comp      0.701906
Global    0.417896
Health    0.656053
Control   0.216933
MetaCog   0.019173
Others    0.237087
Hopeless  0.534698
OthViews  0.461305


In [58]:
print('KNN Regression Prediction BERT')
print(pd.DataFrame(data=output_kNN_reg_bert,index=schemas,columns=['estimate']))

KNN Regression Prediction BERT
          estimate
Attach    0.527163
Comp      0.604087
Global    0.399843
Health    0.406143
Control   0.198838
MetaCog   0.042465
Others    0.142093
Hopeless  0.459757
OthViews  0.437244


In [59]:
%%time
# wall time to run: ~ 3h 30min
# bootstrap confidence intervals for kNN regression and classification
bs_knn_reg_glove = bootstrap(train_embedutts, train_labels, n_iterations,n_size,test_embedutts,test_labels,0,"knn")
bs_knn_class_glove = bootstrap(train_embedutts, train_labels, n_iterations,n_size,test_embedutts,test_labels,1,"knn")

Wall time: 2h 7min 22s
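
> The `bootstrap` helper called above is not shown in this section. As a rough illustration of the general technique only, the sketch below computes a percentile bootstrap confidence interval for a simple statistic. It is an assumption-laden simplification: the real helper also receives training data and a model type, so it presumably refits the model on each resample, which this sketch does not do. All names and numbers here are hypothetical.

```python
import random

def percentile(sorted_vals, q):
    """Nearest-rank percentile of an already sorted list, q in [0, 1]."""
    idx = int(round(q * (len(sorted_vals) - 1)))
    return sorted_vals[min(len(sorted_vals) - 1, max(0, idx))]

def bootstrap_ci(sample, statistic, n_iterations=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `statistic` over `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    stats = []
    for _ in range(n_iterations):
        # draw n items with replacement and recompute the statistic
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        stats.append(statistic(resample))
    stats.sort()
    return percentile(stats, alpha / 2), percentile(stats, 1 - alpha / 2)

def mean(xs):
    return sum(xs) / len(xs)

# made-up scores, for illustration only
toy_scores = [0.52, 0.61, 0.47, 0.58, 0.55, 0.49, 0.63, 0.57]
ci_low, ci_high = bootstrap_ci(toy_scores, mean)
print(ci_low, ci_high)
```

> With `alpha=0.05` the two returned values are the 2.5th and 97.5th percentiles of the resampled statistic, which corresponds to the `low` and `high` columns reported in the tables of this section.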


In [60]:
print(f'KNN Classification 95% Confidence Intervals with GLoVE')
print(pd.DataFrame(data=np.transpose(bs_knn_class_glove),index=schemas,columns=['low','high']))

KNN Classification 95% Confidence Intervals with GLoVE
               low      high
Attach    0.086165  0.165345
Comp      0.089736  0.169913
Global    0.119456  0.247420
Health    0.000000  1.000000
Control   0.000000  1.000000
MetaCog   0.000000  1.000000
Others    0.000000  1.000000
Hopeless  0.000000  1.000000
OthViews  0.000000  1.000000


In [61]:
print(f'KNN Regression 95% Confidence Intervals with GloVE')
print(pd.DataFrame(data=np.transpose(bs_knn_reg_glove),index=schemas,columns=['low','high']))

KNN Regression 95% Confidence Intervals with GloVE
               low      high
Attach    0.556876  0.641433
Comp      0.666403  0.742867
Global    0.356754  0.471224
Health    0.587198  0.722349
Control   0.151941  0.271973
MetaCog   0.000000  0.065638
Others    0.131293  0.319599
Hopeless  0.487370  0.580630
OthViews  0.419942  0.506835


In [62]:
%%time
# wall time to run: ~ 3h 30min
# bootstrap confidence intervals for kNN regression and classification
bs_knn_reg_bert = bootstrap(train_outputs["pooled_output"], train_labels, n_iterations,n_size,test_outputs["pooled_output"],test_labels,0,"knn", "BERT")
bs_knn_class_bert = bootstrap(train_outputs["pooled_output"], train_labels, n_iterations,n_size,test_outputs["pooled_output"],test_labels,1,"knn", "BERT")

Wall time: 3h 19min 38s


In [63]:
print(f'Bert KNN Classification 95% Confidence Intervals')
print(pd.DataFrame(data=np.transpose(bs_knn_class_bert),index=schemas,columns=['low','high']))

Bert KNN Classification 95% Confidence Intervals
               low      high
Attach    0.488830  0.581333
Comp      0.537804  0.629931
Global    0.320721  0.432018
Health    0.395903  0.594334
Control   0.038999  0.175488
MetaCog   0.000000  1.000000
Others    0.000000  1.000000
Hopeless  0.436791  0.553120
OthViews  0.343502  0.449071


In [64]:
print(f'Bert KNN Regression 95% Confidence Intervals')
print(pd.DataFrame(data=np.transpose(bs_knn_reg_bert),index=schemas,columns=['low','high']))

Bert KNN Regression 95% Confidence Intervals
               low      high
Attach    0.481005  0.557272
Comp      0.567890  0.629916
Global    0.350038  0.460044
Health    0.335964  0.468665
Control   0.147965  0.260360
MetaCog   0.000000  0.106342
Others    0.056678  0.220705
Hopeless  0.415600  0.498845
OthViews  0.380510  0.476266


In [65]:
# Save Results for quick loading later if project stops
joblib.dump(bs_knn_reg_glove, 'Data/BootstrapResults/KNN/bs_knn_reg_glove.pkl')
joblib.dump(bs_knn_class_glove, 'Data/BootstrapResults/KNN/bs_knn_class_glove.pkl')
joblib.dump(bs_knn_reg_bert, 'Data/BootstrapResults/KNN/bs_knn_reg_bert.pkl')
joblib.dump(bs_knn_class_bert, 'Data/BootstrapResults/KNN/bs_knn_class_bert.pkl')

['Data/BootstrapResults/KNN/bs_knn_class_bert.pkl']

In [47]:
bs_knn_reg_glove = joblib.load('Data/BootstrapResults/KNN/bs_knn_reg_glove.pkl')
bs_knn_class_glove = joblib.load('Data/BootstrapResults/KNN/bs_knn_class_glove.pkl')
bs_knn_reg_bert = joblib.load('Data/BootstrapResults/KNN/bs_knn_reg_bert.pkl')
bs_knn_class_bert = joblib.load('Data/BootstrapResults/KNN/bs_knn_class_bert.pkl')

## Support vector machine
> The second algorithm we chose is the support vector machine (SVM). Again, we train both a support vector classification (SVC) and a support vector regression (SVR). We try all three standard kernel types but do no additional parameter tuning. Just like the kNN, the support vector machine takes as input the utterances encoded as averages of word vectors. Support vector classification and regression do not allow for multilabel output. We therefore train separate models, one for each schema.<br>
For both types of SVM, we first standardize the inputs, since the algorithm expects features centered around 0 with a standard deviation of 1.

In [31]:
#SVM/SVR
def svm_scaler(train_X):
    #fit the scaler on the training data only
    scaler_texts = StandardScaler()
    scaler_texts = scaler_texts.fit(train_X)
    return scaler_texts

In [32]:
scaler_texts_glove = svm_scaler(train_embedutts)

In [33]:
scaler_texts_bert = svm_scaler(train_outputs["pooled_output"])
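
> The point of fitting the scaler on the training data and reusing it later is that validation and test inputs are shifted and scaled with the training statistics, not their own. The sketch below spells out that arithmetic in plain Python. It is an illustration with made-up numbers and hypothetical helper names, not a replacement for `StandardScaler`; like `StandardScaler`, it uses the population standard deviation and leaves zero-variance columns unscaled.

```python
from statistics import fmean, pstdev

def fit_scaler(rows):
    """Compute per-column mean and population standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # fall back to 1.0 for constant columns to avoid dividing by zero
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def transform(rows, means, stds):
    """Standardize rows with previously fitted statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# made-up two-feature data
toy_train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
toy_test = [[2.0, 40.0]]

means, stds = fit_scaler(toy_train)            # statistics come from the training set
train_scaled = transform(toy_train, means, stds)
test_scaled = transform(toy_test, means, stds)  # test set reuses them unchanged
print(train_scaled)
print(test_scaled)
```

> After the transform, each training column has mean 0 and standard deviation 1, while the test row is only guaranteed to be on the same scale.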

>Since SVMs, unlike kNNs, can be trained once and reused, we write one function that returns all 9 models and a separate one for the predictions.

In [34]:
def svm_custom(train_X,train_y,text_scaler,kern,classification):
        models=[]
        train_X = text_scaler.transform(train_X)
        #fit a new support vector model (SVC or SVR) for each schema
        for schema in range(9):
            if classification:
                model = svm.SVC(kernel=kern)
            else:
                model = svm.SVR(kernel=kern)
            model.fit(train_X, train_y[:,schema])
            models.append(model)
        return models

In [35]:
def svm_predict(svm_models,test_X,train_y,test_y,text_scaler):
    #empty array to collect the results (should have shape of samples to classify)
    votes = np.zeros(test_y.shape)
    for schema in range(9):
        svm_model=svm_models[schema]
        prediction = svm_model.predict(text_scaler.transform(test_X))
        votes[:,schema] = prediction
    out = votes
    gof = gof_spear(out,test_y)
    perf = performance(train_y,gof)
    return out,perf

In [36]:
def svm_models(train_X,train_Y,val_X, val_Y, scaler_texts, sv, classification):
    svm_rbf_models =  svm_custom(train_X,train_Y,scaler_texts,'rbf',classification)
    svm_rbf_out, svm_rbf_perf = svm_predict(svm_rbf_models,val_X,train_Y,val_Y,scaler_texts)
    svm_lin_models = svm_custom(train_X,train_Y,scaler_texts,'linear',classification)
    svm_lin_out, svm_lin_perf = svm_predict(svm_lin_models,val_X,train_Y,val_Y,scaler_texts)
    svm_poly_models = svm_custom(train_X,train_Y,scaler_texts,'poly',classification)
    svm_poly_out, svm_poly_perf = svm_predict(svm_poly_models,val_X,train_Y,val_Y,scaler_texts)
    print(pd.DataFrame(data=[svm_rbf_perf,svm_lin_perf,svm_poly_perf],index=['rbf','lin','poly'],columns=[sv]))
    models = {'rbf': svm_rbf_models, 'lin': svm_lin_models, 'poly': svm_poly_models}
    return models

In [76]:
%%time
# wall time to run: ~ 45sec
# svm
print('GLoVE SVM Results: ')
glove_svm = svm_models(train_embedutts,train_labels,val_embedutts, val_labels, scaler_texts_glove, 'svm', 1)
# Save Results for quick loading later if project stops
joblib.dump(glove_svm, 'Data/SVM/glove_svm_model.pkl')

GLoVE SVM Results: 
Wall time: 3.61 s


['Data/SVM/glove_svm_model.pkl']

In [77]:
%%time
# wall time to run: ~ 1 min
# svm
print('BERT SVM Results: ')
bert_svm = svm_models(train_outputs["pooled_output"],train_labels,val_outputs["pooled_output"], val_labels, scaler_texts_bert, 'svm', 1)
joblib.dump(bert_svm, 'Data/SVM/bert_svm_model.pkl')

BERT SVM Results: 
Wall time: 9.09 s


['Data/SVM/bert_svm_model.pkl']

In [74]:
%%time
# wall time to run: ~ 1 min
print('GLoVE SVR Results: ')
glove_svr = svm_models(train_embedutts,train_labels,val_embedutts, val_labels, scaler_texts_glove, 'svr', 0)
joblib.dump(glove_svr, 'Data/SVM/glove_svr_model.pkl')

GLoVE SVR Results: 
           svr
rbf   0.076675
lin   0.064361
poly  0.066954
Wall time: 58.7 s


['Data/SVM/glove_svr_model.pkl']

In [78]:
%%time
# wall time to run: ~ 45sec
# svr
print('BERT SVR Results: ')
bert_svr = svm_models(train_outputs["pooled_output"],train_labels,val_outputs["pooled_output"], val_labels, scaler_texts_bert, 'svr', 0)
joblib.dump(bert_svr, 'Data/SVM/bert_svr_model.pkl')

BERT SVR Results: 
Wall time: 8.53 s


['Data/SVM/bert_svr_model.pkl']

In [48]:
# Load saved models back in for quick access if the project stops
glove_svm = joblib.load('Data/SVM/glove_svm_model.pkl')
bert_svm = joblib.load('Data/SVM/bert_svm_model.pkl')
glove_svr = joblib.load('Data/SVM/glove_svr_model.pkl')
bert_svr = joblib.load('Data/SVM/bert_svr_model.pkl')

> For both SVC and SVR, the radial basis function (rbf) kernel outperformed the linear and polynomial kernels on the validation set. We therefore opt for the rbf kernel when predicting the labels of the test dataset.
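
> The kernel choice can also be read off programmatically. The short sketch below uses the validation scores printed for the GLoVE SVR run and picks the kernel with the highest value. It assumes that a higher `performance` score is better, which is consistent with the conclusion stated above; the variable names are hypothetical.

```python
# validation performance per kernel, copied from the GLoVE SVR output
val_perf = {"rbf": 0.076675, "lin": 0.064361, "poly": 0.066954}

# pick the kernel with the highest validation score (assumes higher is better)
best_kernel = max(val_perf, key=val_perf.get)
print(best_kernel)
```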

In [38]:
%%time
# wall time to run: ~10sec

def my_svm(train_y, test_X, test_y,classification, embedding):
    if embedding == 'GLoVE':
        svm_rbf_models = glove_svm['rbf']
        svr_rbf_models = glove_svr['rbf']
        scaler_texts = scaler_texts_glove
    elif embedding == 'BERT':
        svm_rbf_models = bert_svm['rbf']
        svr_rbf_models = bert_svr['rbf']
        scaler_texts = scaler_texts_bert
        
    if classification:
        my_svm_out, my_svm_perf=svm_predict(svm_rbf_models,test_X,train_y,test_y,scaler_texts)
    else:
        my_svm_out, my_svm_perf=svm_predict(svr_rbf_models,test_X,train_y,test_y,scaler_texts)
    return gof_spear(my_svm_out,test_y)

output_SVC_glove = my_svm(train_labels, test_embedutts,test_labels,1, 'GLoVE')
output_SVR_glove = my_svm(train_labels, test_embedutts,test_labels,0, 'GLoVE')
output_SVC_bert = my_svm(train_labels, test_outputs["pooled_output"],test_labels,1, 'BERT')
output_SVR_bert = my_svm(train_labels, test_outputs["pooled_output"],test_labels,0, 'BERT')

Wall time: 7.1 s


In [101]:
print('SVM Classification Prediction with GLoVE')
print(pd.DataFrame(data=output_SVC_glove,index=schemas,columns=['estimate']))

SVM Classification Prediction with GLoVE
          estimate
Attach    0.647714
Comp      0.684661
Global    0.357601
Health    0.729181
Control        NaN
MetaCog        NaN
Others         NaN
Hopeless  0.489903
OthViews  0.476297


In [102]:
print('SVM Regression Prediction with GLoVE')
print(pd.DataFrame(data=output_SVR_glove,index=schemas,columns=['estimate']))

SVM Regression Prediction with GLoVE
          estimate
Attach    0.675340
Comp      0.640866
Global    0.489372
Health    0.349064
Control   0.310007
MetaCog   0.114894
Others    0.185827
Hopeless  0.535979
OthViews  0.516635


In [103]:
print('SVM Classification Prediction with BERT')
print(pd.DataFrame(data=output_SVC_bert,index=schemas,columns=['estimate']))

SVM Classification Prediction with BERT
          estimate
Attach    0.577022
Comp      0.689898
Global    0.373424
Health    0.561481
Control  -0.011459
MetaCog        NaN
Others         NaN
Hopeless  0.517952
OthViews  0.471861


In [104]:
print('SVM Regression Prediction with BERT')
print(pd.DataFrame(data=output_SVR_bert,index=schemas,columns=['estimate']))

SVM Regression Prediction with BERT
          estimate
Attach    0.663786
Comp      0.672748
Global    0.506108
Health    0.305131
Control   0.306820
MetaCog   0.141697
Others    0.114909
Hopeless  0.521259
OthViews  0.491448


In [95]:
%%time
# wall time to run: ~ 3min 15sec
# bootstrap confidence intervals for SVR and SVC
bs_svc_glove = bootstrap(train_embedutts, train_labels, n_iterations,n_size,test_embedutts,test_labels, 1,"svm", "GLoVE")
bs_svr_glove = bootstrap(train_embedutts, train_labels, n_iterations,n_size,test_embedutts,test_labels, 0,"svm", "GLoVE")

Wall time: 2min 11s


In [96]:
print(f'SVM Classification 95% Confidence Intervals with GLoVE')
print(pd.DataFrame(data=np.transpose(bs_svc_glove),index=schemas,columns=['low','high']))

SVM Classification 95% Confidence Intervals with GLoVE
               low      high
Attach    0.611903  0.690208
Comp      0.645037  0.727340
Global    0.296369  0.407668
Health    0.648348  0.797075
Control   0.000000  1.000000
MetaCog   0.000000  1.000000
Others    0.000000  1.000000
Hopeless  0.421759  0.545510
OthViews  0.424901  0.537975


In [97]:
print(f'SVM Regression 95% Confidence Intervals with GLoVE')
print(pd.DataFrame(data=np.transpose(bs_svr_glove),index=schemas,columns=['low','high']))

SVM Regression 95% Confidence Intervals with GLoVE
               low      high
Attach    0.649124  0.700444
Comp      0.612465  0.667220
Global    0.454683  0.524970
Health    0.302367  0.392112
Control   0.265696  0.341165
MetaCog   0.068038  0.154142
Others    0.117043  0.231642
Hopeless  0.494234  0.563619
OthViews  0.483961  0.547114


In [39]:
%%time
# wall time to run: ~ 3min 15sec
# bootstrap confidence intervals for SVR and SVC
bs_svc_bert = bootstrap(train_outputs["pooled_output"], train_labels, n_iterations,n_size,test_outputs["pooled_output"],test_labels,1,"svm", "BERT")
bs_svr_bert = bootstrap(train_outputs["pooled_output"], train_labels, n_iterations,n_size,test_outputs["pooled_output"],test_labels,0,"svm", "BERT")

Wall time: 5min 56s


In [108]:
print(f'SVM Classification 95% Confidence Intervals with BERT')
print(pd.DataFrame(data=np.transpose(bs_svc_bert),index=schemas,columns=['low','high']))

SVM Classification 95% Confidence Intervals with BERT
               low      high
Attach    0.540790  0.615402
Comp      0.644692  0.728376
Global    0.303498  0.435311
Health    0.472830  0.633118
Control   0.000000  1.000000
MetaCog   0.000000  1.000000
Others    0.000000  1.000000
Hopeless  0.455907  0.579318
OthViews  0.408834  0.525559


In [109]:
print(f'SVM Regression 95% Confidence Intervals with BERT')
print(pd.DataFrame(data=np.transpose(bs_svr_bert),index=schemas,columns=['low','high']))

SVM Regression 95% Confidence Intervals with BERT
               low      high
Attach    0.637180  0.686531
Comp      0.646569  0.698269
Global    0.474639  0.535724
Health    0.245164  0.360440
Control   0.262643  0.342105
MetaCog   0.103795  0.179311
Others    0.056081  0.167729
Hopeless  0.489317  0.545555
OthViews  0.444821  0.529859


In [110]:
# Save Results for quick loading later if project stops
joblib.dump(bs_svc_glove, 'Data/BootstrapResults/SVM/bs_svc_glove.pkl')
joblib.dump(bs_svr_glove, 'Data/BootstrapResults/SVM/bs_svr_glove.pkl')
joblib.dump(bs_svc_bert, 'Data/BootstrapResults/SVM/bs_svc_bert.pkl')
joblib.dump(bs_svr_bert, 'Data/BootstrapResults/SVM/bs_svr_bert.pkl')

['Data/BootstrapResults/SVM/bs_svr_bert.pkl']

In [49]:
bs_svc_glove = joblib.load('Data/BootstrapResults/SVM/bs_svc_glove.pkl')
bs_svr_glove = joblib.load('Data/BootstrapResults/SVM/bs_svr_glove.pkl')
bs_svc_bert = joblib.load('Data/BootstrapResults/SVM/bs_svc_bert.pkl')
bs_svr_bert = joblib.load('Data/BootstrapResults/SVM/bs_svr_bert.pkl')

## Recurrent neural networks

> We train two types of recurrent neural networks: a multilabel RNN that predicts all 9 schemas simultaneously and a set of 9 single-label RNNs that predict the labels for each schema separately. Each RNN consists of 4 layers: an embedding layer, a bidirectional LSTM layer, a dropout layer, and an output layer.

### Training Multilabel RNN
> The architecture of all RNNs was inspired by the paper: Kshirsagar, R., Morris, R., & Bowman, S. (2017). Detecting and explaining crisis. arXiv preprint arXiv:1705.09585. However, we used a long short-term memory (LSTM) layer instead of a gated recurrent unit (GRU).

In [50]:
# define multilabel model GLoVE
def multilabel_model_glove(train_X, train_y, test_X, test_y,params):
    # build the model
    model = Sequential()
    e = Embedding(vocab_size, 100, weights=[embedding_matrix], input_length=max_length, trainable=False)
    #embedding layer
    model.add(e)
    #LSTM layer
    model.add(Bidirectional(LSTM(params['lstm_units'])))
    #dropout layer
    model.add(Dropout(params['dropout']))
    #output layer
    model.add(Dense(9, activation='sigmoid'))
    # compile the model
    model.compile(optimizer=params['optimizer'], loss=params['losses'], metrics=['mean_absolute_error'])
    # summarize the model
    print(model.summary())
    # fit the model
    out = model.fit(train_X, train_y, 
                    validation_data=[test_X,test_y],
                    batch_size=params['batch_size'], 
                    epochs=params['epochs'], 
                    verbose=0)
    return out, model
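
> As a sanity check on this architecture, the parameter counts that Keras reports for the GLoVE model in the grid search output can be reproduced by hand. The sketch below is plain arithmetic and not part of the original pipeline. The helper names are hypothetical, and the embedding size of 262,400 is taken from the printed model summary rather than derived.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def bidirectional_lstm_params(input_dim, units):
    """A bidirectional wrapper holds one forward and one backward LSTM."""
    return 2 * lstm_params(input_dim, units)

def dense_params(input_dim, outputs):
    """Weight matrix plus one bias per output unit."""
    return input_dim * outputs + outputs

embedding_dim = 100         # GLoVE vector size used in the Embedding layer
embedding_params = 262400   # frozen embedding weights, as printed in the summary
n_schemas = 9

param_counts = {}
for units in (50, 100):
    bilstm = bidirectional_lstm_params(embedding_dim, units)
    # the dense layer sees the concatenated forward and backward states
    dense = dense_params(2 * units, n_schemas)
    param_counts[units] = (bilstm, dense, embedding_params + bilstm + dense)
    print(units, param_counts[units])
```

> For 50 LSTM units this gives 60,400 and 909 parameters with a total of 323,709, and for 100 units it gives 160,800 and 1,809 with a total of 425,009, matching the summaries printed during the grid search.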

In [51]:
# define multilabel model BERT
def multilabel_model_bert(train_X, train_y, test_X, test_y,params):
    # build the model
    model = Sequential()
    #no embedding layer: the BERT sequence output is fed in directly
    model.add(Input(shape=(128, 512,)))
    #LSTM layer
    model.add(Bidirectional(LSTM(params['lstm_units'])))
    #dropout layer
    model.add(Dropout(params['dropout']))
    #output layer
    model.add(Dense(9, activation='sigmoid'))
    # compile the model
    model.compile(optimizer=params['optimizer'], loss=params['losses'], metrics=['mean_absolute_error'])
    # summarize the model
    print(model.summary())
    # fit the model
    out = model.fit(train_X, train_y, 
                    validation_data=[test_X,test_y],
                    batch_size=params['batch_size'], 
                    epochs=params['epochs'], 
                    verbose=0)
    return out, model

In [52]:
def grid_search(train_X, test_X, train_y, test_y, multilabel_model, exp_name):
    #define hyperparameter grid
    p={'lstm_units':[50,100],
       'optimizer':['rmsprop','Adam'],
       'losses':['binary_crossentropy','categorical_crossentropy','mean_absolute_error'],
       'dropout':[0.1,0.5],
       'batch_size': [32,64],
       'epochs':[100]} 
    #scan the grid
    tal=talos.Scan(x=train_X,
                   y=train_y,
                   x_val=test_X,
                   y_val=test_y,
                   model=multilabel_model,
                   params=p,
                   experiment_name= exp_name,
                   print_params=True,
                   clear_session=True)
    return tal
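
> To see how large this grid is, the combinations can be enumerated with the standard library. This is only an illustrative sketch: it does not call talos, and sorting the keys alphabetically is an assumption made here to get a stable order.

```python
from itertools import product

# same hyperparameter grid as in grid_search above
param_grid = {
    'lstm_units': [50, 100],
    'optimizer': ['rmsprop', 'Adam'],
    'losses': ['binary_crossentropy', 'categorical_crossentropy', 'mean_absolute_error'],
    'dropout': [0.1, 0.5],
    'batch_size': [32, 64],
    'epochs': [100],
}

# build every combination as a dict, iterating keys in alphabetical order
keys = sorted(param_grid)
combinations = [dict(zip(keys, values))
                for values in product(*(param_grid[k] for k in keys))]

print(len(combinations))
print(combinations[0])
```

> The grid contains 2 x 2 x 3 x 2 x 2 x 1 = 48 combinations, which matches the 48 rounds shown in the talos progress bar.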

#### GLoVE Embeddings RNN Parameter Tuning

In [114]:
# wall time to run grid search: ~ 2h 10min
#run the small grid search
%time tal = grid_search(padded_train, padded_validate, train_labels, val_labels, multilabel_model_glove, 'multilabel_rnn_glove')
#analyze the outcome
analyze_object=talos.Analyze(tal)
analysis_results = analyze_object.data
#let's have a look at the results of the grid search
print(analysis_results)

  0%|                                                                                           | 0/48 [00:00<?, ?it/s]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
____________________________________________________

  2%|█▋                                                                                 | 1/48 [01:13<57:54, 73.92s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
_______________________________________________________

  4%|███▍                                                                               | 2/48 [02:25<55:43, 72.68s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
__________________________________________________

  6%|█████                                                                            | 3/48 [04:08<1:04:44, 86.33s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_____________________________________________________

  8%|██████▊                                                                          | 4/48 [05:51<1:08:09, 92.95s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
_______________________________________________

 10%|████████▍                                                                        | 5/48 [07:03<1:01:17, 85.52s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
__________________________________________________

 12%|██████████▍                                                                        | 6/48 [08:18<57:18, 81.87s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_____________________________________________

 15%|███████████▊                                                                     | 7/48 [10:14<1:03:38, 93.14s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
________________________________________________

 17%|█████████████▎                                                                  | 8/48 [12:34<1:12:03, 108.08s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
____________________________________________________

 19%|███████████████                                                                 | 9/48 [14:10<1:07:44, 104.22s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
_______________________________________________________

 21%|████████████████▋                                                               | 10/48 [15:32<1:01:37, 97.31s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________

 23%|██████████████████                                                             | 11/48 [17:45<1:06:40, 108.13s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}

 25%|███████████████████▊                                                           | 12/48 [19:49<1:07:47, 112.99s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}

 27%|█████████████████████▍                                                         | 13/48 [21:13<1:00:49, 104.28s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}

 29%|███████████████████████▋                                                         | 14/48 [22:48<57:28, 101.42s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}

 31%|████████████████████████▋                                                      | 15/48 [26:05<1:11:40, 130.30s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}

 33%|██████████████████████████▎                                                    | 16/48 [29:07<1:17:45, 145.80s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}

 35%|███████████████████████████▉                                                   | 17/48 [30:53<1:09:07, 133.77s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}

 38%|█████████████████████████████▋                                                 | 18/48 [32:36<1:02:21, 124.70s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}

 40%|███████████████████████████████▎                                               | 19/48 [35:58<1:11:26, 147.82s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}

 42%|████████████████████████████████▉                                              | 20/48 [39:31<1:18:08, 167.45s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}

 44%|██████████████████████████████████▌                                            | 21/48 [41:24<1:08:02, 151.19s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}

 46%|████████████████████████████████████▏                                          | 22/48 [43:23<1:01:19, 141.53s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}

 48%|█████████████████████████████████████▊                                         | 23/48 [47:31<1:12:10, 173.23s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}

 50%|███████████████████████████████████████▌                                       | 24/48 [52:01<1:20:59, 202.47s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}

 52%|█████████████████████████████████████████▏                                     | 25/48 [53:24<1:03:47, 166.43s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}

 54%|███████████████████████████████████████████▉                                     | 26/48 [54:44<51:34, 140.67s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}

 56%|█████████████████████████████████████████████▌                                   | 27/48 [57:57<54:41, 156.27s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}

 58%|██████████████████████████████████████████████                                 | 28/48 [1:01:11<55:49, 167.49s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}

 60%|███████████████████████████████████████████████▋                               | 29/48 [1:02:47<46:19, 146.27s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}

 62%|█████████████████████████████████████████████████▍                             | 30/48 [1:04:22<39:14, 130.83s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}

 65%|███████████████████████████████████████████████████                            | 31/48 [1:07:51<43:44, 154.40s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}

 67%|████████████████████████████████████████████████████▋                          | 32/48 [1:11:19<45:24, 170.28s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}

 69%|██████████████████████████████████████████████████████▎                        | 33/48 [1:12:59<37:16, 149.13s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}

 71%|███████████████████████████████████████████████████████▉                       | 34/48 [1:14:21<30:09, 129.24s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}

 73%|█████████████████████████████████████████████████████████▌                     | 35/48 [1:17:20<31:12, 144.05s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}

 75%|███████████████████████████████████████████████████████████▎                   | 36/48 [1:20:28<31:26, 157.22s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}

 77%|████████████████████████████████████████████████████████████▉                  | 37/48 [1:22:07<25:37, 139.77s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}

 79%|██████████████████████████████████████████████████████████████▌                | 38/48 [1:23:47<21:18, 127.87s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________

 81%|████████████████████████████████████████████████████████████████▏              | 39/48 [1:27:17<22:51, 152.44s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________

 83%|█████████████████████████████████████████████████████████████████▊             | 40/48 [1:30:49<22:41, 170.19s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
_________________________________________________________________

 85%|███████████████████████████████████████████████████████████████████▍           | 41/48 [1:32:37<17:42, 151.77s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
_________________________________________________________________

 88%|█████████████████████████████████████████████████████████████████████▏         | 42/48 [1:34:53<14:41, 146.88s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________

 90%|██████████████████████████████████████████████████████████████████████▊        | 43/48 [1:38:17<13:39, 163.95s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________

 92%|████████████████████████████████████████████████████████████████████████▍      | 44/48 [1:41:42<11:46, 176.54s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
_________________________________________________________________

 94%|██████████████████████████████████████████████████████████████████████████     | 45/48 [1:44:13<08:25, 168.62s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 100)              60400     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 323,709
Trainable params: 61,309
Non-trainable params: 262,400
_________________________________________________________________

 96%|███████████████████████████████████████████████████████████████████████████▋   | 46/48 [1:46:00<05:00, 150.36s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________

 98%|█████████████████████████████████████████████████████████████████████████████▎ | 47/48 [1:49:46<02:52, 172.90s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 25, 100)           262400    
                                                                 
 bidirectional (Bidirectiona  (None, 200)              160800    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________

100%|███████████████████████████████████████████████████████████████████████████████| 48/48 [1:54:15<00:00, 142.82s/it]

Wall time: 1h 54min 15s
              start              end    duration  round_epochs        loss  \
0   05/10/22-203225  05/10/22-203338   73.721319           100  -93.232689   
1   05/10/22-203339  05/10/22-203450   71.585853           100  -74.255096   
2   05/10/22-203450  05/10/22-203633  102.349468           100 -187.947327   
3   05/10/22-203633  05/10/22-203816  102.885891           100 -147.428848   
4   05/10/22-203816  05/10/22-203928   72.092832           100   46.448235   
5   05/10/22-203928  05/10/22-204043   74.571780           100   44.170319   
6   05/10/22-204043  05/10/22-204239  116.094220           100   84.605247   
7   05/10/22-204239  05/10/22-204459  139.855941           100   80.855774   
8   05/10/22-204500  05/10/22-204635   95.492358           100    0.317915   
9   05/10/22-204635  05/10/22-204757   81.612673           100    0.317915   
10  05/10/22-204757  05/10/22-205010  132.454166           100    0.317915   
11  05/10/22-205010  05/10/22-205214  12
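
The parameter counts in the model summaries above can be reproduced by hand, which is a useful sanity check on the layer sizes. The sketch below uses the standard parameter formulas for Keras `Embedding`, `LSTM` (wrapped in `Bidirectional`), and `Dense` layers. The helper names are hypothetical, and the vocabulary size of 2,624 is an assumption inferred from the embedding count of 262,400 at 100 dimensions; it is not stated in this part of the notebook.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * (units * (input_dim + units) + units)


def bidirectional_lstm_params(input_dim, units):
    # A Bidirectional wrapper holds one forward and one backward LSTM.
    return 2 * lstm_params(input_dim, units)


def dense_params(input_dim, outputs):
    # Weight matrix plus one bias per output unit.
    return input_dim * outputs + outputs


def embedding_params(vocab_size, dim):
    # One vector of length `dim` per vocabulary entry.
    return vocab_size * dim


# Assumed vocabulary size (inferred, see lead-in); 100-dimensional GloVe vectors.
ASSUMED_VOCAB_SIZE = 2624

for units in (50, 100):
    emb = embedding_params(ASSUMED_VOCAB_SIZE, 100)
    bi = bidirectional_lstm_params(100, units)
    dense = dense_params(2 * units, 9)  # 9 schemas, concatenated LSTM outputs
    print(units, emb, bi, dense, emb + bi + dense)
```

The totals agree with the summaries for 50 and 100 LSTM units. The non-trainable count equals the embedding count, which suggests that the embedding layer is frozen.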




In [115]:
# We choose the best model of the grid search based on the MAE metric; lower values are better.
mlm_model_glove = tal.best_model(metric='mean_absolute_error', asc=True)
# To get an idea of how the best model performs, we check its predictions on the validation set.
prediction_mlm_val_glove = mlm_model_glove.predict(padded_validate)
output_mlm_val_glove = gof_spear(prediction_mlm_val_glove, val_labels)

In [116]:
print(pd.DataFrame(data=output_mlm_val_glove, index=schemas, columns=['estimate']))

          estimate
Attach   -0.001602
Comp     -0.043018
Global    0.070571
Health   -0.056575
Control  -0.075047
MetaCog  -0.146498
Others   -0.042516
Hopeless -0.046862
OthViews  0.125471
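
`gof_spear` is defined earlier in the notebook and is not reproduced here. As a rough illustration of the kind of per-schema estimate printed above, the sketch below computes one Spearman rank correlation per label column with `scipy.stats.spearmanr`. The function name `per_schema_spearman` and the toy arrays are hypothetical, and the notebook's own function may differ in detail.

```python
import numpy as np
from scipy.stats import spearmanr


def per_schema_spearman(predictions, labels):
    """Return one Spearman correlation per column (hypothetical helper)."""
    predictions = np.asarray(predictions, dtype=float)
    labels = np.asarray(labels, dtype=float)
    estimates = []
    for j in range(labels.shape[1]):
        # spearmanr returns (correlation, p-value); only the correlation is kept.
        rho, _ = spearmanr(predictions[:, j], labels[:, j])
        estimates.append(rho)
    return np.array(estimates)


# Toy data: 4 utterances, 2 schemas, ordinal labels from 0 to 3.
toy_labels = np.array([[0, 3], [1, 2], [2, 1], [3, 0]])
toy_predictions = np.array([[0.1, 0.2], [0.4, 0.1], [0.5, 0.7], [0.9, 0.8]])
print(per_schema_spearman(toy_predictions, toy_labels))
```

Note that `spearmanr` returns `nan` for a column whose predictions are all identical, so a model that outputs a constant for a schema cannot be scored this way.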


In [117]:
# The predictions make sense considering what we got from KNN and SVM, so we deploy the model.
talos.Deploy(tal, 'mlm_rnn_glove', metric='mean_absolute_error', asc=True)

Deploy package mlm_rnn_glove have been saved.


<talos.commands.deploy.Deploy at 0x26b0e137d30>

#### Checkpoint After Parameter Analysis

In [53]:
# We restore the deployed Talos experiment.
restore_glove = talos.Restore('Data/mlm_rnn_glove.zip')
# To find the best-performing parameters, we retrieve the results of the Talos experiment.
scan_results_glove = restore_glove.results

In [54]:
# Select the row(s) with the smallest mean absolute error.
print(scan_results_glove[scan_results_glove.mean_absolute_error == scan_results_glove.mean_absolute_error.min()])

              start              end    duration  round_epochs      loss  \
32  05/10/22-214344  05/10/22-214523   99.546668           100  0.317915   
34  05/10/22-214647  05/10/22-214945  178.382386           100  0.317915   
44  05/10/22-221408  05/10/22-221637  149.898346           100  0.317915   
46  05/10/22-221825  05/10/22-222211  225.284788           100  0.317915   

    mean_absolute_error  val_loss  val_mean_absolute_error  batch_size  \
32             0.317915  0.315901                 0.315901          64   
34             0.317915  0.315901                 0.315901          64   
44             0.317915  0.315901                 0.315901          64   
46             0.317915  0.315901                 0.315901          64   

    dropout  epochs               losses  lstm_units optimizer  
32      0.1     100  mean_absolute_error          50   rmsprop  
34      0.1     100  mean_absolute_error         100   rmsprop  
44      0.5     100  mean_absolute_error          50 
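
The filter above returns several rows because multiple parameter combinations share the same minimum. A minimal sketch of how such ties could be made explicit and resolved is shown below, using a small hypothetical DataFrame whose column names mirror the Talos results. Breaking ties by the shortest `duration` is an assumption made for illustration, not a rule used in the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the Talos results table (values are made up).
toy_results = pd.DataFrame({
    'mean_absolute_error': [0.40, 0.32, 0.32, 0.35],
    'duration': [70.0, 150.0, 99.5, 80.0],
    'lstm_units': [50, 100, 50, 100],
})

# All rows that reach the minimum MAE, i.e. the tied candidates.
best_mae = toy_results['mean_absolute_error'].min()
tied = toy_results[toy_results['mean_absolute_error'] == best_mae]
print(len(tied), 'rows share the minimum MAE')

# Assumed tie-breaker: prefer the candidate that trained fastest.
chosen = tied.sort_values('duration').iloc[0]
print(chosen)
```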

### BERT Embeddings RNN Parameter Tuning
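
Each grid search in this notebook runs 48 rounds, which matches a full grid over the parameter values that appear in the printed round dictionaries. The sketch below enumerates that grid with `itertools.product`. The dictionary `assumed_params` is reconstructed from the printed output and is an assumption, since the parameter dictionary actually passed to `grid_search` is defined elsewhere in the notebook.

```python
from itertools import product

# Assumed search space, reconstructed from the printed round dictionaries.
assumed_params = {
    'batch_size': [32, 64],
    'dropout': [0.1, 0.5],
    'epochs': [100],
    'losses': ['binary_crossentropy', 'categorical_crossentropy', 'mean_absolute_error'],
    'lstm_units': [50, 100],
    'optimizer': ['rmsprop', 'Adam'],
}

# Cartesian product over all value lists; the last key varies fastest.
keys = list(assumed_params)
rounds = [dict(zip(keys, values))
          for values in product(*(assumed_params[k] for k in keys))]

print(len(rounds))
print(rounds[0])
```

Under this assumption, the first enumerated combination matches the first round printed by the grid search.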

In [55]:
# Wall time to run the grid search: ~ 2h 10min
# Run the small grid search.
%time tal = grid_search(train_outputs["sequence_output"], val_outputs["sequence_output"], train_labels, val_labels, multilabel_model_bert, 'multilabel_rnn_bert')
# Analyze the outcome.
analyze_object = talos.Analyze(tal)
analysis_results = analyze_object.data
# Have a look at the results of the grid search.
print(analysis_results)

  0%|                                                                                           | 0/48 [00:00<?, ?it/s]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


  2%|█▋                                                                              | 1/48 [06:52<5:22:57, 412.28s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


  4%|███▎                                                                            | 2/48 [13:52<5:19:36, 416.88s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


  6%|█████                                                                           | 3/48 [24:20<6:24:57, 513.27s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


  8%|██████▋                                                                         | 4/48 [35:23<6:59:43, 572.35s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 10%|████████▎                                                                       | 5/48 [46:08<7:08:58, 598.58s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 12%|██████████                                                                      | 6/48 [56:11<7:00:02, 600.06s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 15%|███████████▍                                                                  | 7/48 [1:10:58<7:54:07, 693.85s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 17%|█████████████                                                                 | 8/48 [1:26:15<8:30:05, 765.13s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 19%|██████████████▋                                                               | 9/48 [1:37:48<8:02:34, 742.44s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 21%|████████████████                                                             | 10/48 [1:49:25<7:41:20, 728.43s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 23%|█████████████████▋                                                           | 11/48 [2:05:56<8:18:38, 808.61s/it]

{'batch_size': 32, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 25%|███████████████████▎                                                         | 12/48 [2:22:12<8:35:49, 859.72s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 27%|████████████████████▊                                                        | 13/48 [2:33:43<7:51:41, 808.62s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 29%|██████████████████████▍                                                      | 14/48 [2:47:20<7:39:35, 811.04s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 31%|████████████████████████                                                     | 15/48 [3:04:01<7:57:36, 868.37s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 33%|█████████████████████████▋                                                   | 16/48 [3:19:01<7:48:15, 877.98s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 35%|███████████████████████████▎                                                 | 17/48 [3:31:32<7:13:48, 839.63s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 38%|████████████████████████████▉                                                | 18/48 [3:44:17<6:48:38, 817.29s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 40%|██████████████████████████████▍                                              | 19/48 [4:06:55<7:53:32, 979.74s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 42%|████████████████████████████████                                             | 20/48 [4:24:09<7:44:48, 996.03s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 44%|█████████████████████████████████▋                                           | 21/48 [4:36:56<6:57:15, 927.24s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 46%|███████████████████████████████████▎                                         | 22/48 [4:49:33<6:19:41, 876.20s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 48%|████████████████████████████████████▉                                        | 23/48 [5:03:07<5:57:17, 857.51s/it]

{'batch_size': 32, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 50%|██████████████████████████████████████▌                                      | 24/48 [5:20:10<6:02:51, 907.14s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 52%|████████████████████████████████████████                                     | 25/48 [5:30:01<5:11:22, 812.30s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 54%|█████████████████████████████████████████▋                                   | 26/48 [5:39:39<4:32:04, 742.01s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 56%|███████████████████████████████████████████▎                                 | 27/48 [5:58:05<4:57:52, 851.06s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 58%|████████████████████████████████████████████▉                                | 28/48 [6:15:30<5:03:07, 909.39s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 60%|██████████████████████████████████████████████▌                              | 29/48 [6:26:02<4:21:39, 826.28s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 62%|████████████████████████████████████████████████▏                            | 30/48 [6:36:01<3:47:21, 757.87s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 65%|█████████████████████████████████████████████████▋                           | 31/48 [6:52:49<3:56:02, 833.07s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 67%|███████████████████████████████████████████████████▎                         | 32/48 [7:12:01<4:07:40, 928.79s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 69%|████████████████████████████████████████████████████▉                        | 33/48 [7:22:27<3:29:27, 837.83s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 71%|██████████████████████████████████████████████████████▌                      | 34/48 [7:32:14<2:57:55, 762.53s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 73%|████████████████████████████████████████████████████████▏                    | 35/48 [7:51:13<3:09:43, 875.65s/it]

{'batch_size': 64, 'dropout': 0.1, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 75%|█████████████████████████████████████████████████████████▊                   | 36/48 [8:08:30<3:04:46, 923.84s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 77%|███████████████████████████████████████████████████████████▎                 | 37/48 [8:18:21<2:31:03, 823.95s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 79%|████████████████████████████████████████████████████████████▉                | 38/48 [8:27:57<2:04:55, 749.57s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 81%|██████████████████████████████████████████████████████████████▌              | 39/48 [8:45:51<2:07:02, 846.98s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'binary_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 83%|████████████████████████████████████████████████████████████████▏            | 40/48 [9:04:14<2:03:11, 923.97s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 85%|█████████████████████████████████████████████████████████████████▊           | 41/48 [9:15:24<1:38:53, 847.69s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 88%|███████████████████████████████████████████████████████████████████▍         | 42/48 [9:26:41<1:19:38, 796.33s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 90%|████████████████████████████████████████████████████████████████████▉        | 43/48 [9:46:54<1:16:46, 921.29s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'categorical_crossentropy', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 92%|█████████████████████████████████████████████████████████████████████▋      | 44/48 [10:04:50<1:04:30, 967.72s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 94%|█████████████████████████████████████████████████████████████████████████▏    | 45/48 [10:14:23<42:28, 849.34s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 50, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 100)              225200    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 100)               0         
                                                                 
 dense (Dense)               (None, 9)                 909       
                                                                 
Total params: 226,109
Trainable params: 226,109
Non-trainable params: 0
_________________________________________________________________
None


 96%|██████████████████████████████████████████████████████████████████████████▊   | 46/48 [10:23:57<25:33, 766.85s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'rmsprop'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


 98%|████████████████████████████████████████████████████████████████████████████▍ | 47/48 [10:43:54<14:55, 895.96s/it]

{'batch_size': 64, 'dropout': 0.5, 'epochs': 100, 'losses': 'mean_absolute_error', 'lstm_units': 100, 'optimizer': 'Adam'}
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional (Bidirectiona  (None, 200)              490400    
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 200)               0         
                                                                 
 dense (Dense)               (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None


100%|██████████████████████████████████████████████████████████████████████████████| 48/48 [11:03:28<00:00, 829.35s/it]

Wall time: 11h 3min 28s
              start              end     duration  round_epochs        loss  \
0   05/11/22-105921  05/11/22-110613   412.157741           100 -117.364006   
1   05/11/22-110613  05/11/22-111313   419.906254           100  -75.218109   
2   05/11/22-111314  05/11/22-112341   627.789979           100 -235.665237   
3   05/11/22-112342  05/11/22-113444   662.715708           100 -146.606827   
4   05/11/22-113444  05/11/22-114529   644.905038           100   77.595299   
5   05/11/22-114530  05/11/22-115532   602.767111           100   60.911186   
6   05/11/22-115532  05/11/22-121019   886.779396           100  127.547699   
7   05/11/22-121019  05/11/22-122537   917.544783           100   93.097160   
8   05/11/22-122537  05/11/22-123709   692.355245           100    0.222509   
9   05/11/22-123710  05/11/22-124847   696.886122           100    0.317915   
10  05/11/22-124847  05/11/22-130517   990.257118           100    0.209205   
11  05/11/22-130517  05/11/2
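
The parameter counts printed in the model summaries above can be checked by hand. The sketch below assumes the standard Keras LSTM parameter formula (four gates, each with input weights, recurrent weights, and a bias vector) and an input feature size of 512. That size is inferred from the reported counts and is not stated in the output, and the helper names are hypothetical.

```python
def lstm_params(units, input_dim):
    # four gates, each with input weights, recurrent weights and one bias vector
    return 4 * (units * (input_dim + units) + units)


def bilstm_params(units, input_dim):
    # a bidirectional wrapper holds one forward and one backward LSTM
    return 2 * lstm_params(units, input_dim)


def dense_params(n_inputs, n_outputs):
    # weight matrix plus one bias per output unit
    return n_inputs * n_outputs + n_outputs


INPUT_DIM = 512  # assumption: inferred from the counts in the summaries, not stated there
N_SCHEMAS = 9

for units in (50, 100):
    bi = bilstm_params(units, INPUT_DIM)
    dense = dense_params(2 * units, N_SCHEMAS)  # both directions are concatenated
    print(f"lstm_units={units}: bidirectional={bi}, dense={dense}, total={bi + dense}")
```

Under these assumptions the two configurations reproduce the totals of 226,109 and 492,209 parameters shown in the summaries.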




In [65]:
# choose the best model from the grid search by mean absolute error (MAE); lower values are better
mlm_model_bert = tal.best_model(metric='mean_absolute_error', asc=True)
# to get an idea of how the best model performs, check its predictions on the validation set
prediction_mlm_val_bert = mlm_model_bert.predict(val_outputs["sequence_output"])
output_mlm_val_bert = gof_spear(prediction_mlm_val_bert, val_labels)

In [66]:
print(pd.DataFrame(data=output_mlm_val_bert,index=schemas,columns=['estimate']))

          estimate
Attach    0.673755
Comp      0.638589
Global    0.544970
Health    0.348355
Control   0.055208
MetaCog   0.044337
Others    0.001242
Hopeless  0.489562
OthViews  0.529834


#### Checkpoint After Parameter Analysis

In [74]:
#the validation correlations are in line with the KNN and SVM results, so we deploy the model
talos.Deploy(tal,'mlm_rnn_bert',metric='mean_absolute_error',asc=True)

Deploy package mlm_rnn_bert have been saved.
data is not 2d, dummy data written instead.


<talos.commands.deploy.Deploy at 0x23bf9506830>

In [75]:
#we restore the deployed Talos experiment
restore_bert = talos.Restore('Data/mlm_rnn_bert.zip')
#to get the best performing parameters, we get the results of the Talos experiment
scan_results_bert = restore_bert.results

EmptyDataError: No columns to parse from file

In [73]:
#select the row with the smallest mean absolute error
print(scan_results_bert[scan_results_bert.mean_absolute_error == scan_results_bert.mean_absolute_error.min()]) 

NameError: name 'scan_results_bert' is not defined

### Train the Optimal Model Based On Tuned Parameters
#### Optimal GloVe Multilabel Model

In [None]:
def mlm_fixed_glove(train_X, train_y, test_X, test_y):
    # build the model
    model = Sequential()
    e = Embedding(vocab_size, 100, weights=[embedding_matrix], input_length=max_length, trainable=False)
    #embedding layer
    model.add(e)
    #LSTM layer
    model.add(Bidirectional(LSTM(100)))
    #dropout layer
    model.add(Dropout(0.1))
    #output layer
    model.add(Dense(9, activation='sigmoid'))
    # compile the model
    model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['mean_absolute_error'])
    # summarize the model
    print(model.summary())
    # fit the model
    out = model.fit(train_X, train_y,
                    validation_data=(test_X, test_y),  # Keras documents a tuple here
                    batch_size=32,
                    epochs=100,
                    verbose=0)
    return out, model

In [220]:
%%time
# wall time to run: ~ 10min
#we train the model
res, model = mlm_fixed_glove(padded_train, train_labels, padded_validate, val_labels)
#we save models to files to free up working memory
model_name = 'Data/MLMs/mlm_model_glove'
model.save(model_name + '.h5')

Model: "sequential_25"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_13 (Embedding)    (None, 25, 100)           262400    
                                                                 
 bidirectional_25 (Bidirecti  (None, 200)              160800    
 onal)                                                           
                                                                 
 dropout_25 (Dropout)        (None, 200)               0         
                                                                 
 dense_25 (Dense)            (None, 9)                 1809      
                                                                 
Total params: 425,009
Trainable params: 162,609
Non-trainable params: 262,400
_________________________________________________________________
None
Epoch 1/100
...
Epoch 100/100
Wall time: 9min 38s
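
The parameter counts in the model summary above can be checked by hand. The sketch below uses the standard formulas for an embedding layer, an LSTM with bias terms, and a dense layer. The helper names are hypothetical, and the vocabulary size of 2,624 is an assumption: it is not stated in this section and is inferred here from the 262,400 embedding parameters at 100 dimensions.

```python
def embedding_params(vocab_size, embedding_dim):
    # one embedding vector per vocabulary entry
    return vocab_size * embedding_dim


def lstm_params(input_dim, units):
    # four gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)


def bidirectional_lstm_params(input_dim, units):
    # a forward and a backward LSTM with separate weights
    return 2 * lstm_params(input_dim, units)


def dense_params(input_dim, units):
    # weight matrix plus one bias per output unit
    return input_dim * units + units


glove_counts = {
    "embedding": embedding_params(2624, 100),  # vocabulary size inferred (assumption)
    "bidirectional": bidirectional_lstm_params(100, 100),
    "dense": dense_params(200, 9),
}
# the BERT variant feeds 512-dimensional token vectors into the same LSTM
bert_bidirectional = bidirectional_lstm_params(512, 100)

print(glove_counts, sum(glove_counts.values()), bert_bidirectional)
```

The same formulas account for the 804 parameters of the four-unit output layer used in the per-schema models.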


#### Optimal BERT Multilabel Model

In [169]:
def mlm_fixed_bert(train_X, train_y, test_X, test_y):
    # build the model
    model = Sequential()
    
    model.add(Input(shape=(128, 512,)))
    #LSTM layer
    model.add(Bidirectional(LSTM(100)))
    #dropout layer
    model.add(Dropout(0.1))
    #output layer
    model.add(Dense(9, activation='sigmoid'))
    
    model.build()
    # compile the model
    model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['mean_absolute_error'])
    # summarize the model
    print(model.summary())
    # fit the model
    out = model.fit(train_X, train_y,
                    validation_data=(test_X, test_y),  # Keras documents a tuple here
                    batch_size=32,
                    epochs=100,
                    verbose=0)
    return out, model

In [170]:
%%time
# wall time to run: ~ 45min
#we train the model
res, model = mlm_fixed_bert(train_outputs["sequence_output"], train_labels, val_outputs["sequence_output"], val_labels)
#we save models to files to free up working memory
model_name = 'Data/MLMs/mlm_model_bert'
model.save(model_name + '.h5')

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional_3 (Bidirectio  (None, 200)              490400    
 nal)                                                            
                                                                 
 dropout_3 (Dropout)         (None, 200)               0         
                                                                 
 dense_3 (Dense)             (None, 9)                 1809      
                                                                 
Total params: 492,209
Trainable params: 492,209
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/100
...
Epoch 100/100
Wall time: 39min 23s


### Multilabel Predictions

In [60]:
#generate predictions with a saved multilabel model and score them with gof_spear
def predict_schema_mlm(test_text, test_labels, model_name):
    #load the trained multilabel model from file
    model_path = "Data/MLMs/mlm_model_" + model_name + '.h5'
    model = keras.models.load_model(model_path)
    #predict the nine schema scores for every utterance
    all_preds = model.predict(test_text)
    #compare the predictions with the labels
    all_gofs = gof_spear(all_preds,test_labels)
    return all_gofs,all_preds

output_mlm_glove,idx_mlm_glove = predict_schema_mlm(padded_test,test_labels, "glove")
print(output_mlm_glove)

[nan nan nan nan nan nan nan nan nan]
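
A Spearman correlation is undefined when one of its inputs is constant, and a NaN is the usual result in that case. The source does not say why all nine values above are NaN; constant predictions from the loaded model are one possible cause, and predictions that are themselves NaN are another. The sketch below is a pure-Python stand-in written for illustration. It is not the notebook's `gof_spear`, whose definition lies outside this section, and the function names and numbers are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(predictions, labels):
    """Spearman's rho as the Pearson correlation of the ranks; NaN if undefined."""
    rank_p, rank_l = average_ranks(predictions), average_ranks(labels)
    n = len(rank_p)
    mean_p, mean_l = sum(rank_p) / n, sum(rank_l) / n
    cov = sum((p - mean_p) * (l - mean_l) for p, l in zip(rank_p, rank_l))
    var_p = sum((p - mean_p) ** 2 for p in rank_p)
    var_l = sum((l - mean_l) ** 2 for l in rank_l)
    if var_p == 0 or var_l == 0:
        return math.nan  # a constant column has no rank variation
    return cov / math.sqrt(var_p * var_l)


labels = [0, 1, 3, 2, 0]                                  # made-up ordinal labels
varying = spearman([0.1, 0.4, 0.9, 0.6, 0.2], labels)     # predictions that vary
constant = spearman([0.5, 0.5, 0.5, 0.5, 0.5], labels)    # identical predictions
print(varying, constant)
```

Checking whether each column of the prediction matrix contains more than one distinct value is a quick way to tell the two causes apart.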


### Training Per-Schema RNNs
> We also train a separate RNN for each schema. In these models, the output layer computes a probability for each of the four possible labels (0 to 3), so the labels are treated as separate classes rather than as ordinal scores. We reuse the hyperparameter values of the multilabel model for the number of LSTM units, the dropout rate, the loss function, the evaluation metric, the batch size, and the number of epochs. To obtain a probability per class, the output layer uses a softmax activation function. For the evaluation, the class with the highest probability is chosen per model. The trained models are written to files and loaded again for prediction.
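
The encoding and decoding steps described above can be sketched without Keras. The helper names `to_one_hot` and `most_probable_label` are hypothetical, and the probabilities are made-up numbers rather than model output. `to_one_hot` is meant to mirror what `np_utils.to_categorical` produces for label values 0 to 3.

```python
def to_one_hot(labels, num_classes=4):
    """Turn ordinal labels (0-3) into one-hot rows with one column per class."""
    return [[1.0 if label == cls else 0.0 for cls in range(num_classes)]
            for label in labels]


def most_probable_label(probabilities):
    """Pick, for every utterance, the class with the highest softmax probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


# labels of four utterances for a single schema (illustrative values)
schema_labels = [0, 3, 1, 2]
targets = to_one_hot(schema_labels)

# rows as a softmax output layer would produce them: each row sums to 1
softmax_output = [
    [0.70, 0.20, 0.05, 0.05],
    [0.05, 0.10, 0.25, 0.60],
    [0.10, 0.55, 0.30, 0.05],
    [0.20, 0.20, 0.45, 0.15],
]
predicted = most_probable_label(softmax_output)
print(targets, predicted)
```

Taking the most probable class turns the four probabilities back into a single label per utterance, which can then be compared with the original ordinal labels.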

In [174]:
#define separate models
def perschema_models_glove(train_X, train_y, test_X, test_y):
    model = Sequential()
    e = Embedding(vocab_size, 100, weights=[embedding_matrix], input_length=max_length, trainable=False)
    model.add(e)
    model.add(Bidirectional(LSTM(100)))
    model.add(Dropout(0.1))
    model.add(Dense(4, activation='softmax'))
    # compile the model
    model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['mean_absolute_error'])
    # summarize the model
    print(model.summary())
    # fit the model
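    model.fit(train_X, train_y,
              validation_data=(test_X, test_y),  # Keras documents a tuple here
              batch_size=32,
              epochs=100,
              verbose=1)
    out = model.predict(test_X)
    # axis=None flattens both arrays, so class probabilities and one-hot labels are correlated as two long vectors
    gof, p = scipy.stats.spearmanr(out, test_y, axis=None)
    return gof, model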

In [176]:
%%time
#Wall time: 52min 36s
directory_name = "Data/PSMs/per_schema_models_glove"
for i in range(9):
    #one-hot encode the labels of schema i; num_classes=4 keeps all four columns even if a label value is absent
    train_label_schema = np_utils.to_categorical(train_labels[:,i], num_classes=4)
    val_label_schema = np_utils.to_categorical(val_labels[:,i], num_classes=4)
    val_output_slm, model = perschema_models_glove(padded_train,train_label_schema,padded_validate,val_label_schema)
    #we write trained models to files to free up working memory
    model_name = '/schema_model_' + schemas[i]
    save_model_under = directory_name + model_name
    model.save(save_model_under + '.h5')

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_4 (Embedding)     (None, 25, 100)           262400    
                                                                 
 bidirectional_6 (Bidirectio  (None, 200)              160800    
 nal)                                                            
                                                                 
 dropout_6 (Dropout)         (None, 200)               0         
                                                                 
 dense_6 (Dense)             (None, 4)                 804       
                                                                 
Total params: 424,004
Trainable params: 161,604
Non-trainable params: 262,400
_________________________________________________________________
None
Epoch 1/100
...
Epoch 100/100
[The output continues with the same summary and a 100-epoch log for each of the remaining eight per-schema models, "sequential_7" to "sequential_14".]

In [179]:
#define separate models
def perschema_models_bert(train_X, train_y, test_X, test_y):
    model = Sequential()
    model.add(Input(shape=(128, 512,)))
    model.add(Bidirectional(LSTM(100)))
    model.add(Dropout(0.1))
    model.add(Dense(4, activation='softmax'))
    # compile the model
    model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['mean_absolute_error'])
    # summarize the model
    print(model.summary())
    # fit the model
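    model.fit(train_X, train_y,
              validation_data=(test_X, test_y),  # Keras documents a tuple here
              batch_size=32,
              epochs=100,
              verbose=1)
    out = model.predict(test_X)
    # axis=None flattens both arrays, so class probabilities and one-hot labels are correlated as two long vectors
    gof, p = scipy.stats.spearmanr(out, test_y, axis=None)
    return gof, model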

In [180]:
%%time
directory_name = "Data/PSMs/per_schema_models_bert"
for i in range(9):
    #one-hot encode the labels of schema i; num_classes=4 keeps all four columns even if a label value is absent
    train_label_schema = np_utils.to_categorical(train_labels[:,i], num_classes=4)
    val_label_schema = np_utils.to_categorical(val_labels[:,i], num_classes=4)
    val_output_slm, model = perschema_models_bert(train_outputs["sequence_output"],train_label_schema,val_outputs["sequence_output"],val_label_schema)
    #we write trained models to files to free up working memory
    model_name = '/schema_model_' + schemas[i]
    save_model_under = directory_name + model_name
    model.save(save_model_under + '.h5')

Model: "sequential_16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 bidirectional_16 (Bidirecti  (None, 200)              490400    
 onal)                                                           
                                                                 
 dropout_16 (Dropout)        (None, 200)               0         
                                                                 
 dense_16 (Dense)            (None, 4)                 804       
                                                                 
Total params: 491,204
Trainable params: 491,204
Non-trainable params: 0
_________________________________________________________________
None
Epoch 1/100
...
Epoch 100/100
[The output continues with the same summary and epoch log for the following per-schema models, starting with "sequential_17".]

Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78/100
Epoch 79/100
Epoch 80/100
Epoch 81/100
Epoch 82/100
Epoch 83/100
Epoch 84/100
Epoch 85/100
Epoch 86/100
Epoch 87/100
Epoch 88/100
Epoch 89/100

In [64]:
# Load the trained per-schema models (one .h5 file per schema) from `directory`
def load_single_models(directory):
    single_models = []
    for schema in schemas:
        # each model is stored as <directory>/schema_model_<schema>.h5
        model_path = directory + '/schema_model_' + schema + '.h5'
        single_models.append(keras.models.load_model(model_path))
    return single_models

In [61]:
# Generate predictions with the per-schema models
def predict_schema_psm(test_text, test_labels, model_name):
    directory_name = "Data/PSMs/per_schema_models_" + model_name
    all_preds = np.zeros(test_labels.shape)
    all_gofs = []
    single_models = load_single_models(directory_name)
    for i in range(9):
        model = single_models[i]
        # pick the most probable of the four ordinal classes (0-3)
        out = model.predict(test_text).argmax(axis=1)
        all_preds[:, i] = out
        # goodness of fit: Spearman correlation between predicted and true labels
        gof, p = scipy.stats.spearmanr(out, test_labels[:, i])
        all_gofs.append(gof)
    return all_gofs, all_preds

### Generate Testset Predictions with the RNN Models

In [65]:
# single=1 uses the per-schema models, single=0 the multilabel model
def my_rnn(test_X, test_y, single, model_name):
    if single:
        gof, preds = predict_schema_psm(test_X, test_y, model_name)
    else:
        gof, preds = predict_schema_mlm(test_X, test_y, model_name)
    return gof, preds

In [66]:
%%time
# wall time to run: ~  43.2 s
# predicting testset with the multilabel model (GloVe input)
output_mlm_glove,idx_mlm_glove = my_rnn(padded_test,test_labels, 0, "glove")
# predicting testset with the multilabel model (BERT input)
output_mlm_bert,idx_mlm_bert = my_rnn(test_outputs["sequence_output"],test_labels, 0, "bert")
# predicting testset with the per-schema models (GloVe input)
output_psm_glove,idx_psm_glove = my_rnn(padded_test,test_labels, 1, "glove")
# predicting testset with the per-schema models (BERT input)
output_psm_bert,idx_psm_bert = my_rnn(test_outputs["sequence_output"],test_labels, 1, "bert")

Wall time: 15.3 s


In [71]:
print('RNN Multilabel Model Testset Output GloVe')
print(pd.DataFrame(data=output_mlm_glove,index=schemas,columns=['estimate']))

RNN Multilabel Model Testset Output GloVe
          estimate
Attach         NaN
Comp           NaN
Global         NaN
Health         NaN
Control        NaN
MetaCog        NaN
Others         NaN
Hopeless       NaN
OthViews       NaN


In [68]:
print('RNN Multilabel Model Testset Output BERT')
print(pd.DataFrame(data=output_mlm_bert,index=schemas,columns=['estimate']))

RNN Multilabel Model Testset Output BERT
          estimate
Attach         NaN
Comp           NaN
Global         NaN
Health         NaN
Control        NaN
MetaCog        NaN
Others         NaN
Hopeless       NaN
OthViews       NaN
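
The NaN estimates above are what `scipy.stats.spearmanr` reports when one of its inputs has no variance, for example when a model predicts the same label for every utterance. Whether this is what happened to the multilabel models is an assumption, since their predictions are not shown here. A minimal sketch with made-up arrays (all names below are hypothetical):

```python
import warnings

import numpy as np
import scipy.stats

# Hypothetical ground-truth ordinal labels (0-3) for six utterances.
true_labels = np.array([0, 1, 2, 3, 0, 2])

# Hypothetical class probabilities from a model that always favours class 0.
constant_probs = np.tile([0.7, 0.1, 0.1, 0.1], (6, 1))
constant_preds = constant_probs.argmax(axis=1)  # every prediction is 0

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the warning about the constant input
    rho_constant, _ = scipy.stats.spearmanr(constant_preds, true_labels)

# Predictions that vary across utterances give a defined correlation.
varying_preds = np.array([0, 1, 1, 3, 0, 2])
rho_varying, _ = scipy.stats.spearmanr(varying_preds, true_labels)

print(np.isnan(rho_constant), rho_varying)
```

If this is the cause, inspecting the distribution of predicted labels per schema (for instance with `np.unique(preds, return_counts=True)`) would confirm it.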


In [69]:
print('RNN Per-Schema Testset Output GloVe')
print(pd.DataFrame(data=output_psm_glove,index=schemas,columns=['estimate']))

RNN Per-Schema Testset Output GloVe
          estimate
Attach    0.689987
Comp      0.720595
Global    0.565708
Health    0.780698
Control   0.277295
MetaCog  -0.009075
Others    0.180899
Hopeless  0.590422
OthViews  0.600660


In [70]:
print('RNN Per-Schema Testset Output BERT')
print(pd.DataFrame(data=output_psm_bert,index=schemas,columns=['estimate']))

RNN Per-Schema Testset Output BERT
          estimate
Attach    0.737940
Comp      0.767731
Global    0.565223
Health    0.747472
Control   0.213274
MetaCog   0.178619
Others    0.177825
Hopeless  0.639479
OthViews  0.621209


In [None]:
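# variant of my_rnn for bootstrapping: the embedding is fixed to GloVe
# (model_name is ignored) and only the goodness-of-fit values are returned
def my_rnn_fixed(test_X, test_y, single, model_name):
    if single:
        gof, preds = predict_schema_psm(test_X, test_y, "glove")
    else:
        gof, preds = predict_schema_mlm(test_X, test_y, "glove")
    return gof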

In [None]:
%%time
# wall time to run: ~ 37min
#bootstrapping the 95% confidence intervals
#bs_mlm = bootstrap(n_iterations,n_size,padded_test,test_labels,0,"rnn")
bs_psm = bootstrap(n_iterations,n_size,padded_test,test_labels,1,"rnn")
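
# The `bootstrap` helper is defined elsewhere in this notebook. The following is
# only a sketch of a percentile bootstrap for Spearman's rho, under the assumption
# that predictions are computed once and then resampled. `bootstrap_spearman_ci`
# and the toy arrays are hypothetical and not part of the original analysis.

```python
import numpy as np
import scipy.stats


def bootstrap_spearman_ci(preds, labels, n_iterations=1000, n_size=None, seed=0):
    """Percentile bootstrap 95% CI for Spearman's rho (illustrative helper)."""
    rng = np.random.RandomState(seed)
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    n_size = len(labels) if n_size is None else n_size
    estimates = []
    for _ in range(n_iterations):
        # Resample utterance indices with replacement.
        idx = rng.randint(0, len(labels), size=n_size)
        rho, _ = scipy.stats.spearmanr(preds[idx], labels[idx])
        estimates.append(rho)
    # nanpercentile skips resamples in which rho was undefined.
    low, high = np.nanpercentile(estimates, [2.5, 97.5])
    return low, high


# Toy data: noisy predictions of ordinal labels 0-3.
toy_rng = np.random.RandomState(1)
toy_labels = toy_rng.randint(0, 4, size=200)
toy_preds = np.clip(toy_labels + toy_rng.randint(-1, 2, size=200), 0, 3)
low, high = bootstrap_spearman_ci(toy_preds, toy_labels, n_iterations=500)
print(low, high)
```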

In [None]:
# Save results for quick loading later if the project stops
joblib.dump(bs_psm, 'Data/BootstrapResults/RNN/bs_psm.pkl')

In [None]:
# the multilabel bootstrap (bs_mlm) is commented out above, so its CIs are not printed
#print('Multilabel RNN Classification 95% Confidence Intervals')
#print(pd.DataFrame(data=np.transpose(bs_mlm),index=schemas,columns=['low','high']))
print('Per-Schema RNN Classification 95% Confidence Intervals')
print(pd.DataFrame(data=np.transpose(bs_psm),index=schemas,columns=['low','high']))

## Results of Hypothesis 1

In [226]:
# output_psm_glove is already a flat list with one estimate per schema,
# so it only needs to be converted to a 1-D array
output_psm_glove_flat = np.ravel(output_psm_glove)
#output_mlm_flat = [item for sublist in output_mlm for item in sublist]

In [79]:
print('Estimates of all models with GloVe Embeddings')
# one column per model; the multilabel model (output_mlm_glove) is left out
outputs = np.concatenate((output_kNN_class_glove,output_kNN_reg_glove,output_SVC_glove, output_SVR_glove, output_psm_glove))
outputs=np.reshape(outputs,(9,5),order='F')
pd.DataFrame(data=outputs,index=schemas,columns=['kNN_class','kNN_reg','SVC','SVR','PSM'])

Estimates of all models with GloVe Embeddings


          kNN_class   kNN_reg       SVC       SVR       PSM
Attach     0.130606  0.606499  0.647714  0.675340  0.689987
Comp       0.135201  0.701906  0.684661  0.640866  0.720595
Global     0.204418  0.417896  0.357601  0.489372  0.565708
Health     0.249344  0.656053  0.729181  0.349064  0.780698
Control   -0.011459  0.216933       NaN  0.310007  0.277295
MetaCog         NaN  0.019173       NaN  0.114894 -0.009075
Others          NaN  0.237087       NaN  0.185827  0.180899
Hopeless   0.167857  0.534698  0.489903  0.535979  0.590422
OthViews   0.157289  0.461305  0.476297  0.516635  0.600660
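
The table is built by concatenating one list of nine estimates per model and reshaping with `order='F'`. The following sketch shows on a smaller, made-up example why column-major order is needed so that each model ends up in its own column; the schema and model names are hypothetical.

```python
import numpy as np
import pandas as pd

# Three hypothetical schemas and two hypothetical models.
toy_schemas = ["A", "B", "C"]
model_one = [0.1, 0.2, 0.3]  # one estimate per schema
model_two = [0.7, 0.8, 0.9]

# Length-6 vector: all of model_one, then all of model_two.
stacked = np.concatenate((model_one, model_two))

# Column-major ('F') filling puts the first three values in column 0 and
# the next three in column 1, so each model becomes one column.
table = np.reshape(stacked, (3, 2), order="F")
toy_df = pd.DataFrame(table, index=toy_schemas, columns=["model_one", "model_two"])
print(toy_df)

# With the default row-major order the models would be interleaved instead.
interleaved = np.reshape(stacked, (3, 2))
```

With the default `order='C'`, the first row would hold the first two estimates of `model_one`, which would silently mislabel the columns.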


In [78]:
print('Estimates of all models with BERT Embeddings')
# one column per model; the multilabel model (output_mlm_bert) is left out
outputs = np.concatenate((output_kNN_class_bert,output_kNN_reg_bert,output_SVC_bert, output_SVR_bert, output_psm_bert))
outputs=np.reshape(outputs,(9,5),order='F')
pd.DataFrame(data=outputs,index=schemas,columns=['kNN_class','kNN_reg','SVC','SVR','PSM'])

Estimates of all models with BERT Embeddings


          kNN_class   kNN_reg       SVC       SVR       PSM
Attach     0.531913  0.527163  0.577022  0.663786  0.737940
Comp       0.587256  0.604087  0.689898  0.672748  0.767731
Global     0.375899  0.399843  0.373424  0.506108  0.565223
Health     0.500031  0.406143  0.561481  0.305131  0.747472
Control    0.102623  0.198838 -0.011459  0.306820  0.213274
MetaCog         NaN  0.042465       NaN  0.141697  0.178619
Others     0.183972  0.142093       NaN  0.114909  0.177825
Hopeless   0.491567  0.459757  0.517952  0.521259  0.639479
OthViews   0.401721  0.437244  0.471861  0.491448  0.621209


In [None]:
# note: bs_mlm is only available if the multilabel bootstrap has been run
print('Lower CIs of all models')
lower_cis = np.concatenate((bs_knn_class[0],bs_knn_reg[0],bs_svc[0], bs_svr[0], bs_psm[0], bs_mlm[0]))
lower_cis=np.reshape(lower_cis,(9,6),order='F')
print(pd.DataFrame(data=lower_cis,index=schemas,columns=['kNN_class','kNN_reg','SVC','SVR','PSM','MLM']))

In [None]:
# note: bs_mlm is only available if the multilabel bootstrap has been run
print('Upper CIs of all models')
upper_cis = np.concatenate((bs_knn_class[1],bs_knn_reg[1],bs_svc[1], bs_svr[1], bs_psm[1], bs_mlm[1]))
upper_cis=np.reshape(upper_cis,(9,6),order='F')
print(pd.DataFrame(data=upper_cis,index=schemas,columns=['kNN_class','kNN_reg','SVC','SVR','PSM','MLM']))

## Generate Dataset for Testing Hypothesis 2
Finally, we use the best-performing algorithm, the per-schema RNNs, to generate predictions on the test set and write them to a file so that they can be used to test Hypothesis 2.

In [None]:
# padded_test holds the GloVe-encoded test set, so the GloVe per-schema models are loaded
gofH2,predsH2=predict_schema_psm(padded_test,test_labels,"glove")

In [None]:
predsH2 = predsH2.astype(int)
print(predsH2[:,0:5])
# per-utterance Spearman correlation between predicted and true labels across the 9 schemas
diag_rho = [scipy.stats.spearmanr(predsH2[i,:], test_labels[i,0:9], nan_policy='omit')[0] for i in range(predsH2.shape[0])]


In [None]:
df_predsH2 = pd.DataFrame(data=predsH2,columns=['AttachPred','CompPred',"GlobalPred","HealthPred","ControlPred","MetaCogPred","OthersPred","HopelessPred","OthViewsPred"])
# assign the list directly; wrapping it in a DataFrame is unnecessary
df_predsH2["Corr"] = diag_rho

In [None]:
print(df_predsH2.head())

In [None]:
df_predsH2.to_csv("Data/PredictionsH2.csv", sep=';', header=True, index=False, mode='w')
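
The export steps above can be sketched end to end on made-up data. The example below assumes three hypothetical schemas instead of nine and writes to an in-memory buffer rather than to `Data/PredictionsH2.csv`; all names in it are illustrative only.

```python
import io

import numpy as np
import pandas as pd
import scipy.stats

# Hypothetical predictions and labels: 4 utterances x 3 schemas, values 0-3.
toy_preds = np.array([[0, 1, 3], [2, 0, 1], [3, 3, 0], [1, 2, 2]], dtype=float)
toy_labels = np.array([[0, 2, 3], [3, 0, 1], [2, 3, 0], [0, 1, 3]])

toy_preds = toy_preds.astype(int)
# Per-utterance agreement between the predicted and true schema profiles.
row_rho = [
    scipy.stats.spearmanr(toy_preds[i, :], toy_labels[i, :])[0]
    for i in range(toy_preds.shape[0])
]

toy_df = pd.DataFrame(toy_preds, columns=["APred", "BPred", "CPred"])
toy_df["Corr"] = row_rho

# Write with ';' as separator, then read the result back the same way.
buffer = io.StringIO()
toy_df.to_csv(buffer, sep=";", header=True, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer, sep=";")
print(restored)
```

Because the file uses `;` rather than `,` as the field separator, whatever reads it for the Hypothesis 2 analysis has to be told about the separator explicitly.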