<a href="https://colab.research.google.com/github/michealman114/Natural-Language-Models-for-Hate-Speech-Classification/blob/main/TestingLSTMsForHateSpeech.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Here's most of our code for lazily testing different approaches. We still need to add the k-fold cross validation functions from the original experiments and the title context functionality.


To use this notebook you will need:
- to be able to mount your Google Drive in the 4th code cell (the folder containing the folder of stored tensors must be in your Drive home directory)
- to upload CommentsDatasets.py, models.py, and TrainingLoops.py through the file manager in the left sidebar

In [1]:
import torch
import torch.nn as nn 
import torch.nn.functional as F
import torch.optim as optim
import torch.utils.data as torch_data

In [2]:
from torch import cuda

if cuda.is_available():
    device = 'cuda'
    seed = 4814
    torch.cuda.manual_seed_all(seed)
    print("running on GPU:", torch.cuda.get_device_name(0))
else:
    device = 'cpu'  # later cells pass `device` to kfold_crossvalidation
    print("running on CPU")

running on GPU: Tesla T4


In [3]:
from CommentsDatasets import * # torch dataset setup
from models import * # all our LSTM based models
from TrainingLoops import * # our training and k-fold cross validation methods

In [4]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [5]:
!ls '/content/drive/MyDrive/Natural Language Models for Hate Speech Classification'

'Classification with BERT.ipynb'   StoredTensors
'Embeddings from BERT.ipynb'	   TestingLSTMsForHateSpeech.ipynb
'LSTM Full Model with Attention'


Note to self (and anybody else who wants to try this): The two groups of cells below load pregenerated Word2Vec (embed_dim = 300) and BERT (embed_dim = 768) embeddings respectively. Pick the one that you want, and don't bother running the other.

Word2Vec Embeddings Results (from a lazy training loop)
- base model: Loss: 3.535007230937481
- bidirectional LSTM: Loss: 1.8555335476994514
- bidirectional LSTM with Attention: Loss: 0.5990170016884804

**Word2Vec Embeddings 10-fold cross validation (30 epochs)**
- base model:
    - Accuracy: 0.7289473684210527
    - Precision, Recall, F1: (0.5273311897106109, 0.3822843822843823, 0.4432432432432432, None)
- bidirectional model:
    - Accuracy: 0.7263157894736842
    - Precision, Recall, F1: (0.5226480836236934, 0.34965034965034963, 0.41899441340782123, None)
- bidirectional model with attention:
    - Accuracy: 0.7157894736842105
    - Precision, Recall, F1: (0.49584487534626037, 0.4172494172494173, 0.4531645569620253, None)

**Word2Vec Embeddings 10-fold cross validation (40 epochs)**
- base model:
    - Accuracy: 0.7335526315789473
    - Precision, Recall, F1: (0.5298507462686567, 0.4965034965034965, 0.5126353790613718, None)

- bidirectional model:
    - Accuracy: 0.7342105263157894
    - Precision, Recall, F1: (0.5319693094629157, 0.48484848484848486, 0.5073170731707317, None)


- bidirectional model with attention:
    - Accuracy: 0.7217105263157895
    - Precision, Recall, F1: (0.5086206896551724, 0.4125874125874126, 0.4555984555984556, None)
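The four-element tuples above have the shape that scikit-learn's `precision_recall_fscore_support` returns when an `average` is given (the last slot, support, is `None`). That the evaluation code in TrainingLoops.py actually uses it is an assumption, since that file isn't shown here. As a reminder of what the first three numbers mean, here is a minimal pure-Python sketch; the function name and the counts are made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Binary precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, not taken from any run in this notebook.
p, r, f1 = precision_recall_f1(tp=18, fp=18, fn=18)
print(p, r, f1)
```

Because accuracy also counts true negatives, a model can sit around 0.73 accuracy while its F1 on the positive class stays near 0.5, which is the pattern in the results above.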

In [5]:
#Word2Vec 
embed_dim = 300

train_comments = torch.load('/content/drive/MyDrive/Natural Language Models for Hate Speech Classification/StoredTensors/w2v_train_comments_embeddings.pt')
train_labels = torch.load('/content/drive/MyDrive/Natural Language Models for Hate Speech Classification/StoredTensors/w2v_train_labels.pt')
train_comments_padding_masks = torch.load('/content/drive/MyDrive/Natural Language Models for Hate Speech Classification/StoredTensors/w2v_train_comments_padding_masks.pt')

In [6]:
"""
quick sanity checks:
input data should be: (batch_size, max_length, embed_dim)
labels should be: (batch_size,)
"""
print(train_comments.shape, train_comments_padding_masks.shape, train_labels.shape)

torch.Size([1528, 244, 300]) torch.Size([1528, 244]) torch.Size([1528])


The following cells correspond to loading data for pregenerated BERT (embed_dim = 768) embeddings.

BERT Embeddings Results (from a lazy training loop)
- base model: Loss: 858.4895858764648
- bidirectional LSTM: Loss: 341.25000190734863
- bidirectional LSTM with Attention: Loss: 0.0014144671586109325

These results are both intuitive and surprising. It is amusingly surprising that the LSTMs without attention are as ludicrously terrible as they are, but it kind of makes sense.

The enormous performance jump suggests that the attention mechanism (which in this application is just a very simple set of FC layers) is doing most of the work (even without being attention masked - which is something we need to fix pretty urgently). 
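For reference, here is a rough sketch of what an attention-masked version of that kind of FC-layer attention could look like. This is not the code in models.py: the class name, the layer sizes, and the mask convention (1 = real token, 0 = padding, as in BERT attention masks) are all assumptions.

```python
import torch
import torch.nn as nn


class MaskedAttentionPool(nn.Module):
    """Scores each timestep with small FC layers, then takes a softmax-weighted
    average of the LSTM outputs while ignoring padded positions."""

    def __init__(self, hidden_dim, attn_dim=64):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(hidden_dim, attn_dim),
            nn.Tanh(),
            nn.Linear(attn_dim, 1),
        )

    def forward(self, lstm_out, mask):
        # lstm_out: (batch, seq_len, hidden_dim); mask: (batch, seq_len)
        scores = self.score(lstm_out).squeeze(-1)  # (batch, seq_len)
        # Padded positions get -inf so softmax gives them exactly zero weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=1)
        pooled = torch.bmm(weights.unsqueeze(1), lstm_out).squeeze(1)
        return pooled, weights


torch.manual_seed(0)
pool = MaskedAttentionPool(hidden_dim=8)
demo_out = torch.randn(2, 5, 8)  # stand-in for LSTM outputs
demo_mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
pooled, weights = pool(demo_out, demo_mask)
print(pooled.shape, weights[0])
```

One caveat of this sketch: a sequence whose mask is all zeros would produce NaN weights, so fully empty comments would need to be filtered out first.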

I'd be willing to bet that when working on BERT embeddings we can just trivially slap a couple of linear layers on top and get really good performance. The suspicion here is that BERT more or less makes the LSTM obsolete - ironic.
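A sketch of what "a couple of linear layers on top" might look like, to make the bet concrete. Nothing like this has been run in this notebook; the class name, hidden size, and the choice of masked mean pooling are all assumptions.

```python
import torch
import torch.nn as nn


class MeanPoolClassifier(nn.Module):
    """Averages token embeddings over non-padded positions, then applies a small MLP."""

    def __init__(self, embed_dim=768, hidden_dim=128):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, embeddings, mask):
        # embeddings: (batch, seq_len, embed_dim); mask: (batch, seq_len), 1 = real token
        mask = mask.unsqueeze(-1).to(embeddings.dtype)
        summed = (embeddings * mask).sum(dim=1)
        counts = mask.sum(dim=1).clamp(min=1.0)  # avoid dividing by zero
        pooled = summed / counts
        return self.head(pooled).squeeze(-1)  # one logit per comment


torch.manual_seed(0)
clf = MeanPoolClassifier(embed_dim=16, hidden_dim=8)
demo_emb = torch.randn(4, 5, 16)  # stand-in for BERT embeddings
demo_emb_mask = torch.ones(4, 5)
demo_emb_mask[0, 3:] = 0  # pretend the first comment has two padded positions
logits = clf(demo_emb, demo_emb_mask)
print(logits.shape)
```

The logits would go through a sigmoid (or `nn.BCEWithLogitsLoss` during training) for the binary label.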

Also, when we were cleaning up the dataset before feeding it into word2vec, we originally did some classical preprocessing (removing stopwords, punctuation, etc.) that removes important embedding context, especially since some stopwords like "no", "never", and "not" substantially change the meaning of a sentence. This probably also explains a good amount of why BERT performs so much better. Fixing this really obvious data processing mistake isn't too difficult, but we can do that later.
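One possible shape for that fix, as a minimal sketch: keep the usual stopword filter but whitelist negations. The stopword set below is a tiny hand-written stand-in, not the list the original preprocessing used, and `clean` is a hypothetical helper.

```python
import re

# Tiny illustrative stopword list; a real one (e.g. from NLTK) is much longer.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "to", "of", "and", "no", "not", "never"}
NEGATIONS = {"no", "not", "never"}


def clean(text, keep_negations=True):
    """Lowercase, strip punctuation, and drop stopwords (optionally keeping negations)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    drop = STOPWORDS - NEGATIONS if keep_negations else STOPWORDS
    return [t for t in tokens if t not in drop]


print(clean("This is not a joke", keep_negations=False))  # ['this', 'joke']
print(clean("This is not a joke"))                        # ['this', 'not', 'joke']
```

With the negation dropped, "not a joke" and "a joke" collapse to the same token list, which is exactly the lost context described above.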

**BERT Embeddings 10-fold cross validation**
- 30 epochs: bidirectional model with attention:
    - Accuracy: 0.7532894736842105
    - Precision, Recall, F1: (0.5875370919881305, 0.45622119815668205, 0.5136186770428015, None)
- 15 epochs: bidirectional model with attention:
    - Accuracy: 0.7526315789473684
    - Precision, Recall, F1: (0.5759162303664922, 0.5069124423963134, 0.5392156862745099, None)
- 10 epochs: bidirectional model with attention:
    - Accuracy: 0.75
    - Precision, Recall, F1: (0.5658536585365853, 0.5345622119815668, 0.5497630331753554, None)
- 5 epochs: bidirectional model with attention:
    - Accuracy: 0.7381578947368421
    - Precision, Recall, F1: (0.5481283422459893, 0.47235023041474655, 0.5074257425742573, None)


Going by F1, training on BERT embeddings works best with 10 epochs (15 epochs isn't bad either, but 10 just slightly outperforms it); accuracy is essentially flat from 10 to 30 epochs. Yay for no more overfitting.
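Which epoch count counts as "best" depends on the metric. A quick check over the numbers listed above (copied verbatim from the BERT 10-fold results; the variable names are just for illustration):

```python
# epochs -> (accuracy, F1), copied from the BERT 10-fold results above
results = {
    30: (0.7532894736842105, 0.5136186770428015),
    15: (0.7526315789473684, 0.5392156862745099),
    10: (0.75, 0.5497630331753554),
    5: (0.7381578947368421, 0.5074257425742573),
}

best_by_f1 = max(results, key=lambda epochs: results[epochs][1])
best_by_accuracy = max(results, key=lambda epochs: results[epochs][0])
print("best by F1:", best_by_f1)
print("best by accuracy:", best_by_accuracy)
```

F1 picks 10 epochs while accuracy nudges toward 30, though the accuracy gap is well under a percentage point.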

In [4]:
embed_dim = 768

train_comments = torch.load('drive/MyDrive/Natural Language Models for Hate Speech Classification/StoredTensors/BERT_train_comments_embeddings.pt')
train_comments_padding_masks = torch.load('drive/MyDrive/Natural Language Models for Hate Speech Classification/StoredTensors/BERT_train_comments_attention_masks.pt')

train_labels = torch.load('drive/MyDrive/Natural Language Models for Hate Speech Classification/StoredTensors/BERT_train_labels.pt')

In [5]:
"""
quick sanity checks:
input data should be: (batch_size, max_length, embed_dim)
labels should be: (batch_size,)
"""
print(train_comments.shape, train_comments_padding_masks.shape, train_labels.shape)

torch.Size([1528, 512, 768]) torch.Size([1528, 512]) torch.Size([1528])
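The `train(...)` calls further down take a `training_data` argument that this notebook never defines; presumably it comes from the dataset classes in CommentsDatasets.py, which aren't shown. As a stand-in, here is one way the three tensors could be bundled, assuming a plain `TensorDataset` is acceptable (the helper name is hypothetical, and the real `train` may expect something else):

```python
import torch
import torch.utils.data as torch_data


def make_training_data(comments, padding_masks, labels):
    """Bundle embeddings, masks and labels so each item is a (comment, mask, label) triple."""
    assert comments.shape[0] == padding_masks.shape[0] == labels.shape[0]
    assert comments.shape[:2] == padding_masks.shape
    return torch_data.TensorDataset(comments, padding_masks, labels)


# Tiny stand-in tensors with the same layout as the real ones:
# (n_samples, max_length, embed_dim), (n_samples, max_length), (n_samples,)
demo_data = make_training_data(torch.randn(6, 4, 8), torch.ones(6, 4), torch.zeros(6))
loader = torch_data.DataLoader(demo_data, batch_size=2, shuffle=True)
batch_comments, batch_masks, batch_labels = next(iter(loader))
print(batch_comments.shape, batch_masks.shape, batch_labels.shape)
```

With the real tensors this would be `make_training_data(train_comments, train_comments_padding_masks, train_labels)`.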


Now for a simple training loop experiment:

In [11]:
num_epochs = 30 #running on colab, use 30-40ish, see what works

In [None]:
trained_model,training_losses = train(training_data, num_epochs, 128, Full_LSTM_Model, embed_dim = embed_dim, bidi = False, attention = False)

In [None]:
trained_model,training_losses = train(training_data, num_epochs, 128, Full_LSTM_Model, embed_dim = embed_dim, bidi = True, attention = False)

In [None]:
trained_model,training_losses = train(training_data, num_epochs, 128, Full_LSTM_Model, embed_dim = embed_dim, bidi = True, attention = True)



Now that we've got training working, we also set up a k-fold evaluation loop, which is used as follows:

Starting with some Word2Vec stuff (it also works for BERT, but it's pointless to run the base and non-attention models on BERT embeddings):

In [None]:
kfold_crossvalidation(10, train_comments, train_comments_padding_masks, train_labels, Full_LSTM_Model, device, n_epochs = 30, embed_dim = embed_dim)

In [None]:
kfold_crossvalidation(10, train_comments, train_comments_padding_masks, train_labels, Full_LSTM_Model, device, n_epochs = 30, embed_dim = embed_dim, bidi = True)

In [None]:
kfold_crossvalidation(10, train_comments, train_comments_padding_masks, train_labels, Full_LSTM_Model, device, n_epochs = 15, embed_dim = embed_dim, bidi = True, attention = True)
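`kfold_crossvalidation` lives in TrainingLoops.py and isn't reproduced here. For anyone unfamiliar with the idea, this is a minimal sketch of the index splitting that any k-fold loop needs; the function name, the shuffling, and the way the remainder is spread across folds are assumptions, and the real helper may split differently.

```python
import random


def kfold_indices(n_samples, k, seed=0):
    """Yield (train_indices, val_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


folds = list(kfold_indices(1528, 10))
print([len(val) for _, val in folds])
```

Each sample lands in exactly one validation fold, so a fresh model is trained k times and every comment is evaluated once by a model that never saw it.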

This is the stuff I used for BERT evaluation:


In [None]:
kfold_crossvalidation(10, train_comments, train_comments_padding_masks, train_labels, Full_LSTM_Model, device, n_epochs = 15, embed_dim = embed_dim, bidi = True, attention = True, intermediate_checks = True)

In [6]:
kfold_crossvalidation(10, train_comments, train_comments_padding_masks, train_labels, Full_LSTM_Model, device, n_epochs = 5, embed_dim = embed_dim, bidi = True, attention = True)

  "num_layers={}".format(dropout, num_layers))




running split 1:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:07,  1.94s/it]

Loss: 6.741696655750275


 40%|████      | 2/5 [00:04<00:06,  2.16s/it]

Loss: 6.235944449901581


 60%|██████    | 3/5 [00:05<00:03,  1.88s/it]

Loss: 5.861877769231796


 80%|████████  | 4/5 [00:07<00:01,  1.70s/it]

Loss: 5.171745449304581


100%|██████████| 5/5 [00:08<00:00,  1.73s/it]


Loss: 4.7218102514743805
0.5347775220870972
Final Evaluation for model 1
Accuracy: 0.7302631578947368
Precision, Recall, F1: (0.5918367346938775, 0.58, 0.5858585858585857, None)


running split 2:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:05,  1.41s/it]

Loss: 6.649314999580383


 40%|████      | 2/5 [00:02<00:04,  1.41s/it]

Loss: 6.229241788387299


 60%|██████    | 3/5 [00:04<00:02,  1.41s/it]

Loss: 5.749286741018295


 80%|████████  | 4/5 [00:05<00:01,  1.42s/it]

Loss: 5.177586525678635


100%|██████████| 5/5 [00:07<00:00,  1.42s/it]


Loss: 4.557841867208481
0.5659334063529968
Final Evaluation for model 2
Accuracy: 0.6907894736842105
Precision, Recall, F1: (0.4909090909090909, 0.5869565217391305, 0.5346534653465347, None)


running split 3:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:05,  1.42s/it]

Loss: 6.668006241321564


 40%|████      | 2/5 [00:02<00:04,  1.42s/it]

Loss: 6.27809602022171


 60%|██████    | 3/5 [00:04<00:02,  1.43s/it]

Loss: 5.9165584444999695


 80%|████████  | 4/5 [00:05<00:01,  1.43s/it]

Loss: 5.305051535367966


100%|██████████| 5/5 [00:07<00:00,  1.43s/it]


Loss: 4.597853809595108
0.5419158935546875
Final Evaluation for model 3
Accuracy: 0.7105263157894737
Precision, Recall, F1: (0.5652173913043478, 0.2765957446808511, 0.37142857142857144, None)


running split 4:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:05,  1.44s/it]

Loss: 6.742801189422607


 40%|████      | 2/5 [00:02<00:04,  1.44s/it]

Loss: 6.299060165882111


 60%|██████    | 3/5 [00:04<00:02,  1.45s/it]

Loss: 5.822777509689331


 80%|████████  | 4/5 [00:05<00:01,  1.46s/it]

Loss: 5.239582777023315


100%|██████████| 5/5 [00:07<00:00,  1.47s/it]


Loss: 4.704946666955948
0.46380966901779175
Final Evaluation for model 4
Accuracy: 0.756578947368421
Precision, Recall, F1: (0.4857142857142857, 0.4722222222222222, 0.47887323943661975, None)


running split 5:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:05,  1.45s/it]

Loss: 6.734560817480087


 40%|████      | 2/5 [00:02<00:04,  1.46s/it]

Loss: 6.235641628503799


 60%|██████    | 3/5 [00:04<00:02,  1.47s/it]

Loss: 5.905038058757782


 80%|████████  | 4/5 [00:05<00:01,  1.47s/it]

Loss: 5.364112287759781


100%|██████████| 5/5 [00:07<00:00,  1.47s/it]


Loss: 4.856743633747101
0.5772454142570496
Final Evaluation for model 5
Accuracy: 0.6644736842105263
Precision, Recall, F1: (0.4186046511627907, 0.4090909090909091, 0.4137931034482759, None)


running split 6:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:05,  1.47s/it]

Loss: 6.6113201379776


 40%|████      | 2/5 [00:02<00:04,  1.48s/it]

Loss: 6.119948506355286


 60%|██████    | 3/5 [00:04<00:02,  1.48s/it]

Loss: 5.704057604074478


 80%|████████  | 4/5 [00:05<00:01,  1.48s/it]

Loss: 5.15896201133728


100%|██████████| 5/5 [00:07<00:00,  1.49s/it]


Loss: 4.522872358560562
0.5860206484794617
Final Evaluation for model 6
Accuracy: 0.7368421052631579
Precision, Recall, F1: (0.7333333333333333, 0.4074074074074074, 0.5238095238095238, None)


running split 7:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:05,  1.49s/it]

Loss: 6.90861189365387


 40%|████      | 2/5 [00:02<00:04,  1.50s/it]

Loss: 6.457770645618439


 60%|██████    | 3/5 [00:04<00:02,  1.50s/it]

Loss: 6.186989784240723


 80%|████████  | 4/5 [00:05<00:01,  1.50s/it]

Loss: 5.682197391986847


100%|██████████| 5/5 [00:07<00:00,  1.50s/it]


Loss: 5.1802297830581665
0.32413312792778015
Final Evaluation for model 7
Accuracy: 0.881578947368421
Precision, Recall, F1: (0.8095238095238095, 0.5483870967741935, 0.6538461538461537, None)


running split 8:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:06,  1.51s/it]

Loss: 6.667836785316467


 40%|████      | 2/5 [00:03<00:04,  1.51s/it]

Loss: 6.233171701431274


 60%|██████    | 3/5 [00:04<00:03,  1.51s/it]

Loss: 5.947250813245773


 80%|████████  | 4/5 [00:06<00:01,  1.52s/it]

Loss: 5.417760252952576


100%|██████████| 5/5 [00:07<00:00,  1.52s/it]


Loss: 4.809996277093887
0.515805184841156
Final Evaluation for model 8
Accuracy: 0.75
Precision, Recall, F1: (0.5625, 0.42857142857142855, 0.4864864864864864, None)


running split 9:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:06,  1.51s/it]

Loss: 6.662729322910309


 40%|████      | 2/5 [00:03<00:04,  1.53s/it]

Loss: 6.211628794670105


 60%|██████    | 3/5 [00:04<00:03,  1.53s/it]

Loss: 5.680173248052597


 80%|████████  | 4/5 [00:06<00:01,  1.53s/it]

Loss: 5.122129917144775


100%|██████████| 5/5 [00:07<00:00,  1.54s/it]


Loss: 4.642663687467575
0.5205357074737549
Final Evaluation for model 9
Accuracy: 0.7105263157894737
Precision, Recall, F1: (0.48214285714285715, 0.6428571428571429, 0.5510204081632654, None)


running split 10:

LSTM with bidirectional = True, attention = True


 20%|██        | 1/5 [00:01<00:06,  1.51s/it]

Loss: 6.745823383331299


 40%|████      | 2/5 [00:03<00:04,  1.50s/it]

Loss: 6.285453200340271


 60%|██████    | 3/5 [00:04<00:03,  1.50s/it]

Loss: 5.838353395462036


 80%|████████  | 4/5 [00:06<00:01,  1.50s/it]

Loss: 5.234912604093552


100%|██████████| 5/5 [00:07<00:00,  1.50s/it]

Loss: 4.620144546031952
0.5495288968086243
Final Evaluation for model 10
Accuracy: 0.75
Precision, Recall, F1: (0.5666666666666667, 0.40476190476190477, 0.4722222222222222, None)

===Aggregate Stats===
Accuracy: 0.7381578947368421
Precision, Recall, F1: (0.5481283422459893, 0.47235023041474655, 0.5074257425742573, None)





In [7]:
kfold_crossvalidation(10, train_comments, train_comments_padding_masks, train_labels, Full_LSTM_Model, device, n_epochs = 10, embed_dim = embed_dim, bidi = True, attention = True)




running split 1:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.48s/it]

Loss: 6.783921897411346


 20%|██        | 2/10 [00:02<00:11,  1.48s/it]

Loss: 6.415431499481201


 30%|███       | 3/10 [00:04<00:10,  1.48s/it]

Loss: 6.05392399430275


 40%|████      | 4/10 [00:05<00:08,  1.48s/it]

Loss: 5.53213107585907


 50%|█████     | 5/10 [00:07<00:07,  1.48s/it]

Loss: 4.960976392030716


 60%|██████    | 6/10 [00:08<00:05,  1.48s/it]

Loss: 4.530582100152969


 70%|███████   | 7/10 [00:10<00:04,  1.48s/it]

Loss: 3.8154749125242233


 80%|████████  | 8/10 [00:11<00:02,  1.48s/it]

Loss: 3.1231046319007874


 90%|█████████ | 9/10 [00:13<00:01,  1.48s/it]

Loss: 2.5247548073530197


100%|██████████| 10/10 [00:14<00:00,  1.49s/it]


Loss: 1.8445914685726166
0.7031214237213135
Final Evaluation for model 1
Accuracy: 0.7828947368421053
Precision, Recall, F1: (0.7741935483870968, 0.48, 0.5925925925925926, None)


running split 2:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.46s/it]

Loss: 6.605070114135742


 20%|██        | 2/10 [00:02<00:11,  1.47s/it]

Loss: 6.104359984397888


 30%|███       | 3/10 [00:04<00:10,  1.47s/it]

Loss: 5.577102035284042


 40%|████      | 4/10 [00:05<00:08,  1.47s/it]

Loss: 4.904650539159775


 50%|█████     | 5/10 [00:07<00:07,  1.47s/it]

Loss: 4.224044322967529


 60%|██████    | 6/10 [00:08<00:05,  1.47s/it]

Loss: 3.5375994741916656


 70%|███████   | 7/10 [00:10<00:04,  1.47s/it]

Loss: 2.736216112971306


 80%|████████  | 8/10 [00:11<00:02,  1.47s/it]

Loss: 1.764616183936596


 90%|█████████ | 9/10 [00:13<00:01,  1.47s/it]

Loss: 1.0980910025537014


100%|██████████| 10/10 [00:14<00:00,  1.47s/it]


Loss: 0.6409009285271168
1.2052271366119385
Final Evaluation for model 2
Accuracy: 0.6776315789473685
Precision, Recall, F1: (0.47619047619047616, 0.6521739130434783, 0.5504587155963303, None)


running split 3:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.47s/it]

Loss: 6.743395030498505


 20%|██        | 2/10 [00:02<00:11,  1.47s/it]

Loss: 6.265676558017731


 30%|███       | 3/10 [00:04<00:10,  1.47s/it]

Loss: 5.924173533916473


 40%|████      | 4/10 [00:05<00:08,  1.48s/it]

Loss: 5.318146705627441


 50%|█████     | 5/10 [00:07<00:07,  1.48s/it]

Loss: 4.65463462471962


 60%|██████    | 6/10 [00:08<00:05,  1.48s/it]

Loss: 4.038762032985687


 70%|███████   | 7/10 [00:10<00:04,  1.48s/it]

Loss: 3.149392232298851


 80%|████████  | 8/10 [00:11<00:03,  1.50s/it]

Loss: 2.369959905743599


 90%|█████████ | 9/10 [00:13<00:01,  1.49s/it]

Loss: 2.023649074137211


100%|██████████| 10/10 [00:14<00:00,  1.49s/it]


Loss: 1.539672702550888
0.7902793884277344
Final Evaluation for model 3
Accuracy: 0.7236842105263158
Precision, Recall, F1: (0.5925925925925926, 0.3404255319148936, 0.4324324324324324, None)


running split 4:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.47s/it]

Loss: 6.800923228263855


 20%|██        | 2/10 [00:02<00:11,  1.48s/it]

Loss: 6.305029273033142


 30%|███       | 3/10 [00:04<00:10,  1.48s/it]

Loss: 5.93771767616272


 40%|████      | 4/10 [00:05<00:08,  1.48s/it]

Loss: 5.461984038352966


 50%|█████     | 5/10 [00:07<00:07,  1.48s/it]

Loss: 4.880969822406769


 60%|██████    | 6/10 [00:08<00:05,  1.48s/it]

Loss: 4.351214915513992


 70%|███████   | 7/10 [00:10<00:04,  1.48s/it]

Loss: 3.6658969074487686


 80%|████████  | 8/10 [00:11<00:02,  1.48s/it]

Loss: 3.128068894147873


 90%|█████████ | 9/10 [00:13<00:01,  1.48s/it]

Loss: 2.2808870300650597


100%|██████████| 10/10 [00:14<00:00,  1.48s/it]


Loss: 1.5986247658729553
0.6269927620887756
Final Evaluation for model 4
Accuracy: 0.7631578947368421
Precision, Recall, F1: (0.5, 0.5, 0.5, None)


running split 5:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.48s/it]

Loss: 6.613054990768433


 20%|██        | 2/10 [00:02<00:11,  1.49s/it]

Loss: 6.0395936369895935


 30%|███       | 3/10 [00:04<00:10,  1.50s/it]

Loss: 5.469045728445053


 40%|████      | 4/10 [00:05<00:08,  1.49s/it]

Loss: 4.916662663221359


 50%|█████     | 5/10 [00:07<00:07,  1.50s/it]

Loss: 4.348067551851273


 60%|██████    | 6/10 [00:08<00:05,  1.50s/it]

Loss: 3.531091883778572


 70%|███████   | 7/10 [00:10<00:04,  1.50s/it]

Loss: 2.9724753499031067


 80%|████████  | 8/10 [00:11<00:02,  1.50s/it]

Loss: 2.2599014937877655


 90%|█████████ | 9/10 [00:13<00:01,  1.49s/it]

Loss: 1.6109944805502892


100%|██████████| 10/10 [00:14<00:00,  1.50s/it]


Loss: 0.9706889241933823
0.8931474089622498
Final Evaluation for model 5
Accuracy: 0.7171052631578947
Precision, Recall, F1: (0.5087719298245614, 0.6590909090909091, 0.5742574257425743, None)


running split 6:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.48s/it]

Loss: 6.614535987377167


 20%|██        | 2/10 [00:02<00:11,  1.48s/it]

Loss: 6.152480453252792


 30%|███       | 3/10 [00:04<00:10,  1.49s/it]

Loss: 5.771692663431168


 40%|████      | 4/10 [00:05<00:08,  1.49s/it]

Loss: 5.218861550092697


 50%|█████     | 5/10 [00:07<00:07,  1.49s/it]

Loss: 4.807599991559982


 60%|██████    | 6/10 [00:08<00:05,  1.49s/it]

Loss: 4.255495637655258


 70%|███████   | 7/10 [00:10<00:04,  1.49s/it]

Loss: 3.4808487445116043


 80%|████████  | 8/10 [00:11<00:02,  1.49s/it]

Loss: 2.4944699108600616


 90%|█████████ | 9/10 [00:13<00:01,  1.49s/it]

Loss: 1.5808917880058289


100%|██████████| 10/10 [00:14<00:00,  1.49s/it]


Loss: 0.9207409471273422
0.926649808883667
Final Evaluation for model 6
Accuracy: 0.7368421052631579
Precision, Recall, F1: (0.6666666666666666, 0.5185185185185185, 0.5833333333333334, None)


running split 7:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.48s/it]

Loss: 6.766563594341278


 20%|██        | 2/10 [00:02<00:11,  1.49s/it]

Loss: 6.382769525051117


 30%|███       | 3/10 [00:04<00:10,  1.49s/it]

Loss: 5.972908020019531


 40%|████      | 4/10 [00:05<00:08,  1.49s/it]

Loss: 5.414547145366669


 50%|█████     | 5/10 [00:07<00:07,  1.49s/it]

Loss: 4.866333782672882


 60%|██████    | 6/10 [00:08<00:05,  1.49s/it]

Loss: 4.299627900123596


 70%|███████   | 7/10 [00:10<00:04,  1.49s/it]

Loss: 3.936822384595871


 80%|████████  | 8/10 [00:11<00:02,  1.49s/it]

Loss: 3.1191279888153076


 90%|█████████ | 9/10 [00:13<00:01,  1.49s/it]

Loss: 2.371863067150116


100%|██████████| 10/10 [00:14<00:00,  1.49s/it]


Loss: 1.8216080591082573
0.4453512132167816
Final Evaluation for model 7
Accuracy: 0.8552631578947368
Precision, Recall, F1: (0.6363636363636364, 0.6774193548387096, 0.65625, None)


running split 8:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.47s/it]

Loss: 6.716046512126923


 20%|██        | 2/10 [00:02<00:11,  1.48s/it]

Loss: 6.290104985237122


 30%|███       | 3/10 [00:04<00:10,  1.49s/it]

Loss: 5.830421149730682


 40%|████      | 4/10 [00:05<00:08,  1.49s/it]

Loss: 5.171818941831589


 50%|█████     | 5/10 [00:07<00:07,  1.49s/it]

Loss: 4.580947518348694


 60%|██████    | 6/10 [00:08<00:05,  1.49s/it]

Loss: 3.8696189522743225


 70%|███████   | 7/10 [00:10<00:04,  1.49s/it]

Loss: 3.1105483323335648


 80%|████████  | 8/10 [00:11<00:02,  1.49s/it]

Loss: 2.32100810110569


 90%|█████████ | 9/10 [00:13<00:01,  1.49s/it]

Loss: 1.5681634545326233


100%|██████████| 10/10 [00:14<00:00,  1.49s/it]


Loss: 1.1448244974017143
0.6648249626159668
Final Evaluation for model 8
Accuracy: 0.7631578947368421
Precision, Recall, F1: (0.5652173913043478, 0.6190476190476191, 0.5909090909090909, None)


running split 9:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.47s/it]

Loss: 6.71686065196991


 20%|██        | 2/10 [00:02<00:11,  1.49s/it]

Loss: 6.232078433036804


 30%|███       | 3/10 [00:04<00:10,  1.49s/it]

Loss: 5.836347937583923


 40%|████      | 4/10 [00:05<00:08,  1.49s/it]

Loss: 5.19204568862915


 50%|█████     | 5/10 [00:07<00:07,  1.49s/it]

Loss: 4.670345991849899


 60%|██████    | 6/10 [00:08<00:05,  1.49s/it]

Loss: 3.99908185005188


 70%|███████   | 7/10 [00:10<00:04,  1.49s/it]

Loss: 3.413888931274414


 80%|████████  | 8/10 [00:11<00:02,  1.49s/it]

Loss: 2.5037889629602432


 90%|█████████ | 9/10 [00:13<00:01,  1.49s/it]

Loss: 1.6444427147507668


100%|██████████| 10/10 [00:14<00:00,  1.49s/it]


Loss: 1.0501855053007603
0.7791993021965027
Final Evaluation for model 9
Accuracy: 0.7302631578947368
Precision, Recall, F1: (0.5128205128205128, 0.47619047619047616, 0.49382716049382713, None)


running split 10:

LSTM with bidirectional = True, attention = True


 10%|█         | 1/10 [00:01<00:13,  1.48s/it]

Loss: 6.607423961162567


 20%|██        | 2/10 [00:02<00:11,  1.48s/it]

Loss: 6.166712760925293


 30%|███       | 3/10 [00:04<00:10,  1.48s/it]

Loss: 5.777746647596359


 40%|████      | 4/10 [00:05<00:08,  1.49s/it]

Loss: 5.10493677854538


 50%|█████     | 5/10 [00:07<00:07,  1.49s/it]

Loss: 4.606020390987396


 60%|██████    | 6/10 [00:08<00:05,  1.49s/it]

Loss: 3.9191214740276337


 70%|███████   | 7/10 [00:10<00:04,  1.49s/it]

Loss: 3.0925460010766983


 80%|████████  | 8/10 [00:11<00:02,  1.49s/it]

Loss: 2.2106542140245438


 90%|█████████ | 9/10 [00:13<00:01,  1.49s/it]

Loss: 1.501184195280075


100%|██████████| 10/10 [00:14<00:00,  1.49s/it]

Loss: 1.201873518526554
0.864876925945282
Final Evaluation for model 10
Accuracy: 0.75
Precision, Recall, F1: (0.5555555555555556, 0.47619047619047616, 0.5128205128205129, None)

===Aggregate Stats===
Accuracy: 0.75
Precision, Recall, F1: (0.5658536585365853, 0.5345622119815668, 0.5497630331753554, None)



