<a href="https://colab.research.google.com/github/LudoCatt/FakeNewsDetection/blob/main/Emotions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sentiment Analysis using Flair

This notebook is largely inspired by [this Medium tutorial on pre-trained sentiment analysis](https://medium.com/@b.terryjack/nlp-pre-trained-sentiment-analysis-1eb52a9d742c) and [this Section tutorial on building an NLP application with Flair](https://www.section.io/engineering-education/how-to-create-nlp-application-with-flair/#splitting-the-dataset).

## load dataset from [Sem-Eval 2018](https://www.scielo.org.mx/pdf/cys/v24n3/1405-5546-cys-24-03-1159.pdf) competition

### install libraries

In [None]:
#!pip uninstall protobuf

In [None]:
!pip install protobuf==3.20.3 --quiet
!pip install flair --quiet


### import flair NLP library; visualisation of pretrained classifier

In [None]:
import flair
# 'en-sentiment' is a pre-trained text classifier that labels text as
# POSITIVE or NEGATIVE and returns a confidence score
flair_sentiment = flair.models.TextClassifier.load('en-sentiment')

2023-05-23 12:31:00,331 https://nlp.informatik.hu-berlin.de/resources/models/sentiment-curated-distilbert/sentiment-en-mix-distillbert_4.pt not found in cache, downloading to /tmp/tmpictlw8cg


100%|██████████| 253M/253M [00:12<00:00, 21.7MB/s]

2023-05-23 12:31:13,010 copying /tmp/tmpictlw8cg to cache at /root/.flair/models/sentiment-en-mix-distillbert_4.pt





2023-05-23 12:31:13,542 removing temp file /tmp/tmpictlw8cg



In [None]:
sentence = "green apples make me angry"
s = flair.data.Sentence(sentence)
print(s)

Sentence[5]: "green apples make me angry"


In [None]:
flair_sentiment.predict(s)
total_sentiment = s.labels
print(total_sentiment)

['Sentence[5]: "green apples make me angry"'/'NEGATIVE' (0.9969)]


### We create our own model based on a dataset with multi-label sentiment classification

In [None]:
from datasets import load_dataset
from pprint import pprint
import pandas as pd

We import a dataset from the Sem-Eval 2018 competition containing roughly 11k tweets in English. Each tweet is annotated with a boolean flag for each of 11 emotions, so a tweet can carry several emotions at once (or none).

In [None]:
dataset_train = load_dataset("sem_eval_2018_task_1", "subtask5.english", split="train")
dataset_val = load_dataset("sem_eval_2018_task_1", "subtask5.english", split="validation")
dataset_test = load_dataset("sem_eval_2018_task_1", "subtask5.english", split="test")

Downloading builder script:   0%|          | 0.00/6.29k [00:00<?, ?B/s]

Downloading metadata:   0%|          | 0.00/7.76k [00:00<?, ?B/s]

Downloading readme:   0%|          | 0.00/10.6k [00:00<?, ?B/s]

Downloading and preparing dataset sem_eval_2018_task_1/subtask5.english to /root/.cache/huggingface/datasets/sem_eval_2018_task_1/subtask5.english/1.1.0/a7c0de8b805f1988b118882fb289ccfbbeb9085c7820b6f046b5887e234af182...


Downloading data files:   0%|          | 0/1 [00:00<?, ?it/s]

Downloading data:   0%|          | 0.00/5.98M [00:00<?, ?B/s]

Extracting data files:   0%|          | 0/1 [00:00<?, ?it/s]

Generating train split:   0%|          | 0/6838 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/3259 [00:00<?, ? examples/s]

Generating validation split:   0%|          | 0/886 [00:00<?, ? examples/s]

Dataset sem_eval_2018_task_1 downloaded and prepared to /root/.cache/huggingface/datasets/sem_eval_2018_task_1/subtask5.english/1.1.0/a7c0de8b805f1988b118882fb289ccfbbeb9085c7820b6f046b5887e234af182. Subsequent calls will reuse this data.




In [None]:
dataset_train.set_format(type="pandas")
dataset_val.set_format(type="pandas")
dataset_test.set_format(type="pandas")

In [None]:
# view the columns of the dataset
dataset_train

Dataset({
    features: ['ID', 'Tweet', 'anger', 'anticipation', 'disgust', 'fear', 'joy', 'love', 'optimism', 'pessimism', 'sadness', 'surprise', 'trust'],
    num_rows: 6838
})

In [None]:
dataset_train['anger'].head()

0    False
1    False
2     True
3    False
4     True
Name: anger, dtype: bool

### preprocess the data to feed it to the model

In [None]:
df_train = pd.DataFrame(dataset_train[:])
df_val = pd.DataFrame(dataset_val[:])
df_test = pd.DataFrame(dataset_test[:])

In [None]:
# example from the dataset
print(f"text: {df_train.iloc[5]['Tweet']}")
df_train.iloc[5]

text: No but that's so cute. Atsu was probably shy about photos before but cherry helped her out uwu


ID                                                  2017-En-22190
Tweet           No but that's so cute. Atsu was probably shy a...
anger                                                       False
anticipation                                                False
disgust                                                     False
fear                                                        False
joy                                                          True
love                                                        False
optimism                                                    False
pessimism                                                   False
sadness                                                     False
surprise                                                    False
trust                                                       False
Name: 5, dtype: object
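
Each row carries one boolean flag per emotion, so reading the annotation of a tweet means collecting the columns that are set to `True`. A minimal sketch of that lookup, using the flags printed above for row 5 (the helper name `active_emotions` is made up for illustration and is not part of the notebook):

```python
# Emotion columns, in the order they appear in the dataset.
EMOTIONS = ["anger", "anticipation", "disgust", "fear", "joy", "love",
            "optimism", "pessimism", "sadness", "surprise", "trust"]


def active_emotions(row):
    """Return the names of all emotions flagged True in a row (hypothetical helper)."""
    return [name for name in EMOTIONS if row[name]]


# Flags copied from the printed row 5 above: only 'joy' is True.
row_5 = {name: False for name in EMOTIONS}
row_5["joy"] = True

print(active_emotions(row_5))
```

On the real data the same helper could be applied row by row, for example with `df_train.apply(active_emotions, axis=1)`.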

In [None]:
emotions = ["anger", "anticipation", "disgust", "fear", "joy", "love", "optimism", 
            "pessimism", "sadness", "surprise", "trust"]

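# collapse the 11 boolean columns into a single "label" column for each dataset.
# idxmax returns the first True column in `emotions` order, so tweets with
# several emotions keep only the first one, and tweets with no emotion set
# fall back to "anger" (the first column)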
df_train["label"] = df_train[emotions].idxmax(axis=1)
labeled_df_train = df_train.drop(emotions, axis=1)

df_test["label"] = df_test[emotions].idxmax(axis=1)
labeled_df_test = df_test.drop(emotions, axis=1)

df_val["label"] = df_val[emotions].idxmax(axis=1)
labeled_df_val = df_val.drop(emotions, axis=1)
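
Because `idxmax` returns the first column that holds the row maximum, this step turns the multi-label annotation into a single label in a specific way. The toy frame below (hypothetical rows and a shortened emotion list, not taken from the dataset) is a small sketch of the three cases worth knowing about:

```python
import pandas as pd

emotions = ["anger", "joy", "sadness"]  # shortened list, for illustration only

toy = pd.DataFrame(
    {
        "anger":   [False, False, False],
        "joy":     [True,  True,  False],
        "sadness": [False, True,  False],
    },
    index=["one emotion", "two emotions", "no emotion"],
)

# Same call as in the cell above: first True column wins.
labels = toy[emotions].idxmax(axis=1)
print(labels.to_dict())

# Rows without any emotion can be identified before collapsing the labels.
has_emotion = toy[emotions].any(axis=1)
print(has_emotion.to_dict())
```

The row with two emotions loses `sadness`, and the row with no emotion is labelled `anger` only because that is the first column. Filtering on something like `has_emotion` before calling `idxmax` would be one way to keep such rows out of the `anger` class.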


In [None]:
# format the columns to feed into the model
labeled_df_train['label'] = '__label__' + labeled_df_train['label'].astype(str)
labeled_df_test['label'] = '__label__' + labeled_df_test['label'].astype(str)
labeled_df_val['label'] = '__label__' + labeled_df_val['label'].astype(str)

df_train = labeled_df_train.rename(columns={"Tweet": "text"}).drop("ID", axis=1)
df_test = labeled_df_test.rename(columns={"Tweet": "text"}).drop("ID", axis=1)
df_val = labeled_df_val.rename(columns={"Tweet": "text"}).drop("ID", axis=1)

In [None]:
df_train.head()

|   | text | label |
|---|------|-------|
| 0 | “Worry is a down payment on a problem you may ... | __label__anticipation |
| 1 | Whatever you decide to do make sure it makes y... | __label__joy |
| 2 | @Max_Kellerman  it also helps that the majorit... | __label__anger |
| 3 | Accept the challenges so that you can literall... | __label__joy |
| 4 | My roommate: it's okay that we can't spell bec... | __label__anger |


## Model training on multilabel dataset

In [None]:
!mkdir -p data_fst

In [None]:
df_train.to_csv("data_fst/train.csv")
df_test.to_csv("data_fst/test.csv")
df_val.to_csv("data_fst/dev.csv")

In [None]:
from flair.data import Corpus
from flair.embeddings import WordEmbeddings, FlairEmbeddings, DocumentRNNEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer
from pathlib import Path
from datasets import Dataset
from flair.datasets import CSVClassificationCorpus

In [None]:
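# load the datasets
data_folder = 'data_fst/'
# to_csv also wrote the DataFrame index as column 0, so the text is in column 1 and the label in column 2
column_name_map = {2: "label_topic", 1: "text"}
# create a corpus class instance from our data to then feed into the model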
corpus_csv = CSVClassificationCorpus(data_folder, 
                                     column_name_map=column_name_map,
                                     skip_header=True,
                                     delimiter=',',
                                     label_type='question_class')

2023-05-23 12:32:06,081 Reading data from data_fst
2023-05-23 12:32:06,082 Train: data_fst/train.csv
2023-05-23 12:32:06,083 Dev: data_fst/dev.csv
2023-05-23 12:32:06,084 Test: data_fst/test.csv
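
The indices in `column_name_map` follow from how the CSV files were written: `to_csv` writes the DataFrame index by default, which becomes column 0 and shifts `text` and `label` to columns 1 and 2. A small sketch on a made-up two-row frame, written to an in-memory buffer instead of `data_fst/`:

```python
import csv
import io

import pandas as pd

# Two made-up rows in the same layout as df_train.
toy = pd.DataFrame({
    "text": ["green apples make me angry", "what a lovely day"],
    "label": ["__label__anger", "__label__joy"],
})

buffer = io.StringIO()
toy.to_csv(buffer)  # index=True by default, so the index is written first
buffer.seek(0)

rows = list(csv.reader(buffer))
header, first_row = rows[0], rows[1]
print(header)
print(first_row)
```

Passing `index=False` to `to_csv` would drop the extra column, in which case the indices in the map would have to be shifted down by one.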


In [None]:
# create a label dictionary from our data
label_dict_csv = corpus_csv.make_label_dictionary(label_type='question_class')
print(label_dict_csv)

2023-05-23 12:32:06,132 Computing label dictionary. Progress:


6838it [00:03, 2001.31it/s]

2023-05-23 12:32:09,561 Dictionary created for label 'question_class' with 11 values: __label__anger (seen 2748 times), __label__joy (seen 1660 times), __label__anticipation (seen 811 times), __label__fear (seen 569 times), __label__disgust (seen 452 times), __label__pessimism (seen 194 times), __label__optimism (seen 180 times), __label__sadness (seen 180 times), __label__love (seen 27 times), __label__surprise (seen 17 times)
Dictionary with 11 tags: <unk>, __label__anger, __label__joy, __label__anticipation, __label__fear, __label__disgust, __label__pessimism, __label__optimism, __label__sadness, __label__love, __label__surprise
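
The counts in the log above show a strongly skewed label distribution, which gives a useful reference point for the scores reported during training. As a rough sketch, with the counts copied from that log line, the share of the most frequent label is the accuracy that a classifier always predicting that label would reach on the training set:

```python
# Label counts copied from the "Dictionary created" log line above.
label_counts = {
    "anger": 2748, "joy": 1660, "anticipation": 811, "fear": 569,
    "disgust": 452, "pessimism": 194, "optimism": 180, "sadness": 180,
    "love": 27, "surprise": 17,
}

total = sum(label_counts.values())
majority_label = max(label_counts, key=label_counts.get)
majority_share = label_counts[majority_label] / total

print(f"{total} training examples")
print(f"majority label: {majority_label} ({majority_share:.1%})")
```

Note that `trust` does not appear among the listed labels. It is the last entry in `emotions`, so `idxmax` only selects it when it is the sole emotion set, which apparently never happens in the training split.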





In [None]:
# Flair embeddings: contextual string embeddings from forward and backward character-level language models
word_embeddings = [FlairEmbeddings('news-forward-fast'), FlairEmbeddings('news-backward-fast')]
# document embeddings: an RNN over the word embeddings produces a single vector for the whole tweet
document_embeddings = DocumentRNNEmbeddings(word_embeddings, hidden_size=512, reproject_words=True, reproject_words_dimension=256)


2023-05-23 12:32:10,227 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-news-english-forward-1024-v0.2rc.pt not found in cache, downloading to /tmp/tmphz8pxr83


100%|██████████| 18.8M/18.8M [00:01<00:00, 10.1MB/s]

2023-05-23 12:32:12,587 copying /tmp/tmphz8pxr83 to cache at /root/.flair/embeddings/lm-news-english-forward-1024-v0.2rc.pt
2023-05-23 12:32:12,607 removing temp file /tmp/tmphz8pxr83





2023-05-23 12:32:14,004 https://flair.informatik.hu-berlin.de/resources/embeddings/flair/lm-news-english-backward-1024-v0.2rc.pt not found in cache, downloading to /tmp/tmpt2cxredy


100%|██████████| 18.8M/18.8M [00:01<00:00, 10.7MB/s]

2023-05-23 12:32:16,234 copying /tmp/tmpt2cxredy to cache at /root/.flair/embeddings/lm-news-english-backward-1024-v0.2rc.pt
2023-05-23 12:32:16,251 removing temp file /tmp/tmpt2cxredy





In [None]:
# define the classifier model by creating a TextClassifier instance.
# note: multi_label=True is set although every tweet carries exactly one label after the idxmax step
classifier = TextClassifier(document_embeddings, label_dictionary=label_dict_csv,
                            label_type='question_class', multi_label=True)
trainer = ModelTrainer(classifier, corpus_csv)

In [None]:
# train the model with the Adam optimizer; checkpoints and logs are written to data_fst/
import torch
trainer.train('data_fst/',
              max_epochs=50,
              optimizer=torch.optim.Adam,
              learning_rate=1.0e-3,
              mini_batch_size=16, 
              )
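
The `BAD EPOCHS` counter and the learning-rate drop in the log of this run are consistent with an anneal-on-plateau schedule: the rate is halved once the dev score has failed to improve for more than three epochs in a row. The sketch below is not Flair's implementation. It is a plain-Python simulation with `patience=3` and `factor=0.5` assumed from the log, fed with the dev F1 scores of epochs 1 to 18 copied from that log.

```python
def simulate_plateau_schedule(scores, lr=1.0e-3, patience=3, factor=0.5):
    """Replay dev scores through a simple anneal-on-plateau rule.

    Returns, per epoch, the bad-epoch counter as it would be logged and the
    learning rate in effect after that epoch. patience and factor are
    assumptions inferred from the log, not values read from the library.
    """
    best = float("-inf")
    bad_epochs = 0
    history = []
    for score in scores:
        if score > best:
            best = score
            bad_epochs = 0
        else:
            bad_epochs += 1
        logged_bad_epochs = bad_epochs
        if bad_epochs > patience:
            lr *= factor    # anneal the learning rate
            bad_epochs = 0  # start counting again at the new rate
        history.append((logged_bad_epochs, lr))
    return history


# Dev micro-F1 for epochs 1..18, copied from the training log.
dev_f1 = [0.381, 0.4517, 0.486, 0.4459, 0.5292, 0.5254, 0.5257, 0.5371, 0.5148,
          0.5515, 0.5651, 0.5469, 0.5448, 0.5621, 0.5468, 0.5354, 0.5686, 0.5674]

for epoch, (bad, lr) in enumerate(simulate_plateau_schedule(dev_f1), start=1):
    print(f"epoch {epoch:2d}: bad epochs {bad}, lr {lr:.6f}")
```

Under these assumptions the counter reaches 4 at epoch 15, the first epoch after which the log reports the rate being reduced to 5.0000e-04.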

2023-05-23 12:32:16,343 ----------------------------------------------------------------------------------------------------
2023-05-23 12:32:16,345 Model: "TextClassifier(
  (embeddings): DocumentRNNEmbeddings(
    (embeddings): StackedEmbeddings(
      (list_embedding_0): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.25, inplace=False)
          (encoder): Embedding(275, 100)
          (rnn): LSTM(100, 1024)
        )
      )
      (list_embedding_1): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.25, inplace=False)
          (encoder): Embedding(275, 100)
          (rnn): LSTM(100, 1024)
        )
      )
    )
    (word_reprojection_map): Linear(in_features=2048, out_features=256, bias=True)
    (rnn): GRU(256, 512, batch_first=True)
    (dropout): Dropout(p=0.5, inplace=False)
  )
  (decoder): Linear(in_features=512, out_features=11, bias=True)
  (dropout): Dropout(p=0.0, inplace=False)
  (locked_dropout): LockedDropout(p=

100%|██████████| 56/56 [00:05<00:00, 10.74it/s]

2023-05-23 12:32:56,252 Evaluating as a multi-label problem: False
2023-05-23 12:32:56,269 DEV : loss 2.39920711517334 - f1-score (micro avg)  0.381





2023-05-23 12:32:56,553 BAD EPOCHS (no improvement): 0
2023-05-23 12:32:56,555 saving best model
2023-05-23 12:32:56,629 ----------------------------------------------------------------------------------------------------
2023-05-23 12:32:59,591 epoch 2 - iter 42/428 - loss 2.39651133 - time (sec): 2.96 - samples/sec: 226.95 - lr: 0.001000
2023-05-23 12:33:02,591 epoch 2 - iter 84/428 - loss 2.37764266 - time (sec): 5.96 - samples/sec: 225.48 - lr: 0.001000
2023-05-23 12:33:06,267 epoch 2 - iter 126/428 - loss 2.35983649 - time (sec): 9.64 - samples/sec: 209.20 - lr: 0.001000
2023-05-23 12:33:09,692 epoch 2 - iter 168/428 - loss 2.36651862 - time (sec): 13.06 - samples/sec: 205.80 - lr: 0.001000
2023-05-23 12:33:12,734 epoch 2 - iter 210/428 - loss 2.37481027 - time (sec): 16.10 - samples/sec: 208.65 - lr: 0.001000
2023-05-23 12:33:15,740 epoch 2 - iter 252/428 - loss 2.36979537 - time (sec): 19.11 - samples/sec: 211.00 - lr: 0.001000
2023-05-23 12:33:19,410 epoch 2 - iter 294/428 - lo

100%|██████████| 56/56 [00:05<00:00, 10.07it/s]

2023-05-23 12:33:35,541 Evaluating as a multi-label problem: False
2023-05-23 12:33:35,569 DEV : loss 2.187305212020874 - f1-score (micro avg)  0.4517





2023-05-23 12:33:36,060 BAD EPOCHS (no improvement): 0
2023-05-23 12:33:36,063 saving best model
2023-05-23 12:33:36,186 ----------------------------------------------------------------------------------------------------
2023-05-23 12:33:39,622 epoch 3 - iter 42/428 - loss 2.22016976 - time (sec): 3.43 - samples/sec: 195.77 - lr: 0.001000
2023-05-23 12:33:42,807 epoch 3 - iter 84/428 - loss 2.22741128 - time (sec): 6.62 - samples/sec: 203.09 - lr: 0.001000
2023-05-23 12:33:45,875 epoch 3 - iter 126/428 - loss 2.20576352 - time (sec): 9.69 - samples/sec: 208.14 - lr: 0.001000
2023-05-23 12:33:49,779 epoch 3 - iter 168/428 - loss 2.21419727 - time (sec): 13.59 - samples/sec: 197.79 - lr: 0.001000
2023-05-23 12:33:53,150 epoch 3 - iter 210/428 - loss 2.18870742 - time (sec): 16.96 - samples/sec: 198.10 - lr: 0.001000
2023-05-23 12:33:56,192 epoch 3 - iter 252/428 - loss 2.19207144 - time (sec): 20.00 - samples/sec: 201.57 - lr: 0.001000
2023-05-23 12:33:59,219 epoch 3 - iter 294/428 - lo

100%|██████████| 56/56 [00:04<00:00, 13.74it/s]

2023-05-23 12:34:14,172 Evaluating as a multi-label problem: True
2023-05-23 12:34:14,218 DEV : loss 2.0205698013305664 - f1-score (micro avg)  0.486





2023-05-23 12:34:14,970 BAD EPOCHS (no improvement): 0
2023-05-23 12:34:14,972 saving best model
2023-05-23 12:34:15,071 ----------------------------------------------------------------------------------------------------
2023-05-23 12:34:18,384 epoch 4 - iter 42/428 - loss 2.07872371 - time (sec): 3.31 - samples/sec: 202.96 - lr: 0.001000
2023-05-23 12:34:21,767 epoch 4 - iter 84/428 - loss 2.09706441 - time (sec): 6.69 - samples/sec: 200.76 - lr: 0.001000
2023-05-23 12:34:24,746 epoch 4 - iter 126/428 - loss 2.09102628 - time (sec): 9.67 - samples/sec: 208.42 - lr: 0.001000
2023-05-23 12:34:27,775 epoch 4 - iter 168/428 - loss 2.09130881 - time (sec): 12.70 - samples/sec: 211.61 - lr: 0.001000
2023-05-23 12:34:31,475 epoch 4 - iter 210/428 - loss 2.09377178 - time (sec): 16.40 - samples/sec: 204.86 - lr: 0.001000
2023-05-23 12:34:35,013 epoch 4 - iter 252/428 - loss 2.07334146 - time (sec): 19.94 - samples/sec: 202.21 - lr: 0.001000
2023-05-23 12:34:38,131 epoch 4 - iter 294/428 - lo

100%|██████████| 56/56 [00:04<00:00, 13.21it/s]

2023-05-23 12:34:53,126 Evaluating as a multi-label problem: False
2023-05-23 12:34:53,143 DEV : loss 2.039571523666382 - f1-score (micro avg)  0.4459





2023-05-23 12:34:53,421 BAD EPOCHS (no improvement): 1
2023-05-23 12:34:53,422 ----------------------------------------------------------------------------------------------------
2023-05-23 12:34:57,021 epoch 5 - iter 42/428 - loss 1.97466031 - time (sec): 3.60 - samples/sec: 186.88 - lr: 0.001000
2023-05-23 12:35:00,123 epoch 5 - iter 84/428 - loss 1.95844950 - time (sec): 6.70 - samples/sec: 200.66 - lr: 0.001000
2023-05-23 12:35:03,585 epoch 5 - iter 126/428 - loss 1.99406543 - time (sec): 10.16 - samples/sec: 198.42 - lr: 0.001000
2023-05-23 12:35:06,726 epoch 5 - iter 168/428 - loss 2.01156936 - time (sec): 13.30 - samples/sec: 202.09 - lr: 0.001000
2023-05-23 12:35:09,792 epoch 5 - iter 210/428 - loss 2.01162042 - time (sec): 16.37 - samples/sec: 205.28 - lr: 0.001000
2023-05-23 12:35:13,433 epoch 5 - iter 252/428 - loss 2.02013459 - time (sec): 20.01 - samples/sec: 201.52 - lr: 0.001000
2023-05-23 12:35:16,773 epoch 5 - iter 294/428 - loss 2.01887779 - time (sec): 23.35 - sampl

100%|██████████| 56/56 [00:04<00:00, 12.19it/s]

2023-05-23 12:35:31,869 Evaluating as a multi-label problem: True





2023-05-23 12:35:31,932 DEV : loss 1.9613715410232544 - f1-score (micro avg)  0.5292
2023-05-23 12:35:32,427 BAD EPOCHS (no improvement): 0
2023-05-23 12:35:32,431 saving best model
2023-05-23 12:35:32,557 ----------------------------------------------------------------------------------------------------
2023-05-23 12:35:35,717 epoch 6 - iter 42/428 - loss 1.93138262 - time (sec): 3.16 - samples/sec: 212.90 - lr: 0.001000
2023-05-23 12:35:39,302 epoch 6 - iter 84/428 - loss 1.96096347 - time (sec): 6.74 - samples/sec: 199.37 - lr: 0.001000
2023-05-23 12:35:42,411 epoch 6 - iter 126/428 - loss 1.94602751 - time (sec): 9.85 - samples/sec: 204.66 - lr: 0.001000
2023-05-23 12:35:45,763 epoch 6 - iter 168/428 - loss 1.96862693 - time (sec): 13.20 - samples/sec: 203.60 - lr: 0.001000
2023-05-23 12:35:49,027 epoch 6 - iter 210/428 - loss 1.95214656 - time (sec): 16.47 - samples/sec: 204.04 - lr: 0.001000
2023-05-23 12:35:52,107 epoch 6 - iter 252/428 - loss 1.96101228 - time (sec): 19.55 - s

100%|██████████| 56/56 [00:04<00:00, 11.87it/s]

2023-05-23 12:36:10,785 Evaluating as a multi-label problem: True
2023-05-23 12:36:10,820 DEV : loss 2.008732318878174 - f1-score (micro avg)  0.5254





2023-05-23 12:36:11,098 BAD EPOCHS (no improvement): 1
2023-05-23 12:36:11,099 ----------------------------------------------------------------------------------------------------
2023-05-23 12:36:14,445 epoch 7 - iter 42/428 - loss 1.91467122 - time (sec): 3.34 - samples/sec: 200.97 - lr: 0.001000
2023-05-23 12:36:17,717 epoch 7 - iter 84/428 - loss 1.93474520 - time (sec): 6.62 - samples/sec: 203.17 - lr: 0.001000
2023-05-23 12:36:21,280 epoch 7 - iter 126/428 - loss 1.96481416 - time (sec): 10.18 - samples/sec: 198.06 - lr: 0.001000
2023-05-23 12:36:24,311 epoch 7 - iter 168/428 - loss 1.93228840 - time (sec): 13.21 - samples/sec: 203.48 - lr: 0.001000
2023-05-23 12:36:27,552 epoch 7 - iter 210/428 - loss 1.93100298 - time (sec): 16.45 - samples/sec: 204.25 - lr: 0.001000
2023-05-23 12:36:31,002 epoch 7 - iter 252/428 - loss 1.93193160 - time (sec): 19.90 - samples/sec: 202.60 - lr: 0.001000
2023-05-23 12:36:34,066 epoch 7 - iter 294/428 - loss 1.92058797 - time (sec): 22.96 - sampl

100%|██████████| 56/56 [00:04<00:00, 13.54it/s]

2023-05-23 12:36:49,095 Evaluating as a multi-label problem: False
2023-05-23 12:36:49,112 DEV : loss 1.9906941652297974 - f1-score (micro avg)  0.5257





2023-05-23 12:36:49,875 BAD EPOCHS (no improvement): 2
2023-05-23 12:36:49,877 ----------------------------------------------------------------------------------------------------
2023-05-23 12:36:52,975 epoch 8 - iter 42/428 - loss 1.82936975 - time (sec): 3.10 - samples/sec: 217.06 - lr: 0.001000
2023-05-23 12:36:56,107 epoch 8 - iter 84/428 - loss 1.81872621 - time (sec): 6.23 - samples/sec: 215.80 - lr: 0.001000
2023-05-23 12:36:59,661 epoch 8 - iter 126/428 - loss 1.84851393 - time (sec): 9.78 - samples/sec: 206.10 - lr: 0.001000
2023-05-23 12:37:03,233 epoch 8 - iter 168/428 - loss 1.84108864 - time (sec): 13.35 - samples/sec: 201.28 - lr: 0.001000
2023-05-23 12:37:06,243 epoch 8 - iter 210/428 - loss 1.85396464 - time (sec): 16.36 - samples/sec: 205.32 - lr: 0.001000
2023-05-23 12:37:09,279 epoch 8 - iter 252/428 - loss 1.87410328 - time (sec): 19.40 - samples/sec: 207.83 - lr: 0.001000
2023-05-23 12:37:12,797 epoch 8 - iter 294/428 - loss 1.87180204 - time (sec): 22.92 - sample

100%|██████████| 56/56 [00:04<00:00, 12.05it/s]

2023-05-23 12:37:28,018 Evaluating as a multi-label problem: False
2023-05-23 12:37:28,044 DEV : loss 1.9675838947296143 - f1-score (micro avg)  0.5371





2023-05-23 12:37:29,055 BAD EPOCHS (no improvement): 0
2023-05-23 12:37:29,062 saving best model
2023-05-23 12:37:29,162 ----------------------------------------------------------------------------------------------------
2023-05-23 12:37:32,271 epoch 9 - iter 42/428 - loss 1.79058003 - time (sec): 3.11 - samples/sec: 216.29 - lr: 0.001000
2023-05-23 12:37:35,292 epoch 9 - iter 84/428 - loss 1.79484985 - time (sec): 6.13 - samples/sec: 219.34 - lr: 0.001000
2023-05-23 12:37:38,302 epoch 9 - iter 126/428 - loss 1.79552568 - time (sec): 9.14 - samples/sec: 220.64 - lr: 0.001000
2023-05-23 12:37:41,718 epoch 9 - iter 168/428 - loss 1.79983061 - time (sec): 12.55 - samples/sec: 214.12 - lr: 0.001000
2023-05-23 12:37:45,615 epoch 9 - iter 210/428 - loss 1.79548844 - time (sec): 16.45 - samples/sec: 204.25 - lr: 0.001000
2023-05-23 12:37:48,790 epoch 9 - iter 252/428 - loss 1.81299393 - time (sec): 19.63 - samples/sec: 205.45 - lr: 0.001000
2023-05-23 12:37:51,816 epoch 9 - iter 294/428 - lo

100%|██████████| 56/56 [00:04<00:00, 13.82it/s]

2023-05-23 12:38:06,778 Evaluating as a multi-label problem: True
2023-05-23 12:38:06,815 DEV : loss 1.986078143119812 - f1-score (micro avg)  0.5148





2023-05-23 12:38:07,079 BAD EPOCHS (no improvement): 1
2023-05-23 12:38:07,081 ----------------------------------------------------------------------------------------------------
2023-05-23 12:38:10,490 epoch 10 - iter 42/428 - loss 1.77352553 - time (sec): 3.41 - samples/sec: 197.31 - lr: 0.001000
2023-05-23 12:38:14,325 epoch 10 - iter 84/428 - loss 1.76602150 - time (sec): 7.24 - samples/sec: 185.60 - lr: 0.001000
2023-05-23 12:38:17,312 epoch 10 - iter 126/428 - loss 1.78906243 - time (sec): 10.23 - samples/sec: 197.09 - lr: 0.001000
2023-05-23 12:38:20,340 epoch 10 - iter 168/428 - loss 1.78707887 - time (sec): 13.26 - samples/sec: 202.77 - lr: 0.001000
2023-05-23 12:38:23,558 epoch 10 - iter 210/428 - loss 1.79985516 - time (sec): 16.47 - samples/sec: 203.95 - lr: 0.001000
2023-05-23 12:38:27,554 epoch 10 - iter 252/428 - loss 1.81268065 - time (sec): 20.47 - samples/sec: 196.97 - lr: 0.001000
2023-05-23 12:38:30,569 epoch 10 - iter 294/428 - loss 1.80969644 - time (sec): 23.48 

100%|██████████| 56/56 [00:04<00:00, 11.90it/s]

2023-05-23 12:38:45,512 Evaluating as a multi-label problem: True
2023-05-23 12:38:45,542 DEV : loss 1.9659820795059204 - f1-score (micro avg)  0.5515





2023-05-23 12:38:45,816 BAD EPOCHS (no improvement): 0
2023-05-23 12:38:45,817 saving best model
2023-05-23 12:38:45,899 ----------------------------------------------------------------------------------------------------
2023-05-23 12:38:48,968 epoch 11 - iter 42/428 - loss 1.71553803 - time (sec): 3.07 - samples/sec: 219.16 - lr: 0.001000
2023-05-23 12:38:52,883 epoch 11 - iter 84/428 - loss 1.76226138 - time (sec): 6.98 - samples/sec: 192.51 - lr: 0.001000
2023-05-23 12:38:56,180 epoch 11 - iter 126/428 - loss 1.76661322 - time (sec): 10.28 - samples/sec: 196.16 - lr: 0.001000
2023-05-23 12:38:59,222 epoch 11 - iter 168/428 - loss 1.75837706 - time (sec): 13.32 - samples/sec: 201.80 - lr: 0.001000
2023-05-23 12:39:02,227 epoch 11 - iter 210/428 - loss 1.75817344 - time (sec): 16.32 - samples/sec: 205.82 - lr: 0.001000
2023-05-23 12:39:05,324 epoch 11 - iter 252/428 - loss 1.75212540 - time (sec): 19.42 - samples/sec: 207.60 - lr: 0.001000
2023-05-23 12:39:09,649 epoch 11 - iter 294/

100%|██████████| 56/56 [00:05<00:00, 10.11it/s]

2023-05-23 12:39:24,891 Evaluating as a multi-label problem: True
2023-05-23 12:39:24,927 DEV : loss 1.963733434677124 - f1-score (micro avg)  0.5651





2023-05-23 12:39:25,197 BAD EPOCHS (no improvement): 0
2023-05-23 12:39:25,201 saving best model
2023-05-23 12:39:25,288 ----------------------------------------------------------------------------------------------------
2023-05-23 12:39:28,318 epoch 12 - iter 42/428 - loss 1.74068641 - time (sec): 3.03 - samples/sec: 221.97 - lr: 0.001000
2023-05-23 12:39:31,341 epoch 12 - iter 84/428 - loss 1.67438968 - time (sec): 6.05 - samples/sec: 222.11 - lr: 0.001000
2023-05-23 12:39:35,060 epoch 12 - iter 126/428 - loss 1.69127247 - time (sec): 9.77 - samples/sec: 206.36 - lr: 0.001000
2023-05-23 12:39:38,548 epoch 12 - iter 168/428 - loss 1.70081159 - time (sec): 13.26 - samples/sec: 202.74 - lr: 0.001000
2023-05-23 12:39:41,575 epoch 12 - iter 210/428 - loss 1.71772906 - time (sec): 16.28 - samples/sec: 206.33 - lr: 0.001000
2023-05-23 12:39:44,625 epoch 12 - iter 252/428 - loss 1.71521563 - time (sec): 19.33 - samples/sec: 208.54 - lr: 0.001000
2023-05-23 12:39:47,683 epoch 12 - iter 294/4

100%|██████████| 56/56 [00:04<00:00, 13.82it/s]

2023-05-23 12:40:02,556 Evaluating as a multi-label problem: True
2023-05-23 12:40:02,609 DEV : loss 1.9898380041122437 - f1-score (micro avg)  0.5469





2023-05-23 12:40:03,751 BAD EPOCHS (no improvement): 1
2023-05-23 12:40:03,753 ----------------------------------------------------------------------------------------------------
2023-05-23 12:40:07,308 epoch 13 - iter 42/428 - loss 1.73209752 - time (sec): 3.55 - samples/sec: 189.11 - lr: 0.001000
2023-05-23 12:40:10,308 epoch 13 - iter 84/428 - loss 1.71901194 - time (sec): 6.55 - samples/sec: 205.07 - lr: 0.001000
2023-05-23 12:40:13,299 epoch 13 - iter 126/428 - loss 1.72878577 - time (sec): 9.54 - samples/sec: 211.22 - lr: 0.001000
2023-05-23 12:40:16,853 epoch 13 - iter 168/428 - loss 1.71942394 - time (sec): 13.10 - samples/sec: 205.21 - lr: 0.001000
2023-05-23 12:40:20,347 epoch 13 - iter 210/428 - loss 1.69672327 - time (sec): 16.59 - samples/sec: 202.50 - lr: 0.001000
2023-05-23 12:40:23,538 epoch 13 - iter 252/428 - loss 1.69930948 - time (sec): 19.78 - samples/sec: 203.81 - lr: 0.001000
2023-05-23 12:40:26,516 epoch 13 - iter 294/428 - loss 1.70216324 - time (sec): 22.76 -

100%|██████████| 56/56 [00:04<00:00, 13.81it/s]

2023-05-23 12:40:41,372 Evaluating as a multi-label problem: True
2023-05-23 12:40:41,403 DEV : loss 1.9912744760513306 - f1-score (micro avg)  0.5448





2023-05-23 12:40:42,191 BAD EPOCHS (no improvement): 2
2023-05-23 12:40:42,193 ----------------------------------------------------------------------------------------------------
2023-05-23 12:40:45,316 epoch 14 - iter 42/428 - loss 1.57514710 - time (sec): 3.12 - samples/sec: 215.51 - lr: 0.001000
2023-05-23 12:40:48,725 epoch 14 - iter 84/428 - loss 1.63848866 - time (sec): 6.53 - samples/sec: 205.89 - lr: 0.001000
2023-05-23 12:40:51,902 epoch 14 - iter 126/428 - loss 1.65431599 - time (sec): 9.70 - samples/sec: 207.74 - lr: 0.001000
2023-05-23 12:40:55,499 epoch 14 - iter 168/428 - loss 1.64262016 - time (sec): 13.30 - samples/sec: 202.09 - lr: 0.001000
2023-05-23 12:40:58,528 epoch 14 - iter 210/428 - loss 1.65626655 - time (sec): 16.33 - samples/sec: 205.75 - lr: 0.001000
2023-05-23 12:41:01,788 epoch 14 - iter 252/428 - loss 1.65082092 - time (sec): 19.59 - samples/sec: 205.81 - lr: 0.001000
2023-05-23 12:41:05,207 epoch 14 - iter 294/428 - loss 1.65883514 - time (sec): 23.01 -

100%|██████████| 56/56 [00:04<00:00, 12.18it/s]

2023-05-23 12:41:20,493 Evaluating as a multi-label problem: True
2023-05-23 12:41:20,529 DEV : loss 1.9482636451721191 - f1-score (micro avg)  0.5621





2023-05-23 12:41:21,275 BAD EPOCHS (no improvement): 3
2023-05-23 12:41:21,280 ----------------------------------------------------------------------------------------------------
2023-05-23 12:41:24,430 epoch 15 - iter 42/428 - loss 1.58750966 - time (sec): 3.15 - samples/sec: 213.62 - lr: 0.001000
2023-05-23 12:41:27,513 epoch 15 - iter 84/428 - loss 1.60256910 - time (sec): 6.23 - samples/sec: 215.76 - lr: 0.001000
2023-05-23 12:41:30,812 epoch 15 - iter 126/428 - loss 1.58822467 - time (sec): 9.53 - samples/sec: 211.59 - lr: 0.001000
2023-05-23 12:41:34,270 epoch 15 - iter 168/428 - loss 1.59951493 - time (sec): 12.99 - samples/sec: 206.99 - lr: 0.001000
2023-05-23 12:41:38,023 epoch 15 - iter 210/428 - loss 1.59858718 - time (sec): 16.74 - samples/sec: 200.73 - lr: 0.001000
2023-05-23 12:41:41,159 epoch 15 - iter 252/428 - loss 1.62245118 - time (sec): 19.87 - samples/sec: 202.87 - lr: 0.001000
2023-05-23 12:41:44,393 epoch 15 - iter 294/428 - loss 1.61852223 - time (sec): 23.11 -

100%|██████████| 56/56 [00:04<00:00, 13.12it/s]

2023-05-23 12:41:59,382 Evaluating as a multi-label problem: True





2023-05-23 12:41:59,442 DEV : loss 2.04893159866333 - f1-score (micro avg)  0.5468
2023-05-23 12:41:59,905 Epoch    15: reducing learning rate of group 0 to 5.0000e-04.
2023-05-23 12:41:59,909 BAD EPOCHS (no improvement): 4
2023-05-23 12:41:59,911 ----------------------------------------------------------------------------------------------------
2023-05-23 12:42:03,913 epoch 16 - iter 42/428 - loss 1.54100920 - time (sec): 4.00 - samples/sec: 167.98 - lr: 0.000500
2023-05-23 12:42:06,928 epoch 16 - iter 84/428 - loss 1.44914411 - time (sec): 7.02 - samples/sec: 191.58 - lr: 0.000500
2023-05-23 12:42:09,985 epoch 16 - iter 126/428 - loss 1.49812336 - time (sec): 10.07 - samples/sec: 200.15 - lr: 0.000500
2023-05-23 12:42:13,072 epoch 16 - iter 168/428 - loss 1.51090670 - time (sec): 13.16 - samples/sec: 204.27 - lr: 0.000500
2023-05-23 12:42:17,270 epoch 16 - iter 210/428 - loss 1.52883316 - time (sec): 17.36 - samples/sec: 193.58 - lr: 0.000500
2023-05-23 12:42:20,246 epoch 16 - iter 

100%|██████████| 56/56 [00:04<00:00, 13.80it/s]

2023-05-23 12:42:38,178 Evaluating as a multi-label problem: True
2023-05-23 12:42:38,212 DEV : loss 2.030203104019165 - f1-score (micro avg)  0.5354





2023-05-23 12:42:38,507 BAD EPOCHS (no improvement): 1
2023-05-23 12:42:38,509 ----------------------------------------------------------------------------------------------------
2023-05-23 12:42:41,554 epoch 17 - iter 42/428 - loss 1.42168542 - time (sec): 3.04 - samples/sec: 221.00 - lr: 0.000500
2023-05-23 12:42:45,696 epoch 17 - iter 84/428 - loss 1.41389597 - time (sec): 7.18 - samples/sec: 187.12 - lr: 0.000500
2023-05-23 12:42:48,747 epoch 17 - iter 126/428 - loss 1.41910080 - time (sec): 10.23 - samples/sec: 196.99 - lr: 0.000500
2023-05-23 12:42:51,802 epoch 17 - iter 168/428 - loss 1.44149446 - time (sec): 13.29 - samples/sec: 202.28 - lr: 0.000500
2023-05-23 12:42:54,840 epoch 17 - iter 210/428 - loss 1.44389473 - time (sec): 16.33 - samples/sec: 205.80 - lr: 0.000500
2023-05-23 12:42:59,040 epoch 17 - iter 252/428 - loss 1.46745523 - time (sec): 20.53 - samples/sec: 196.43 - lr: 0.000500
2023-05-23 12:43:02,131 epoch 17 - iter 294/428 - loss 1.47459200 - time (sec): 23.62 

100%|██████████| 56/56 [00:04<00:00, 13.30it/s]

2023-05-23 12:43:17,360 Evaluating as a multi-label problem: True
2023-05-23 12:43:17,391 DEV : loss 2.0385875701904297 - f1-score (micro avg)  0.5686





2023-05-23 12:43:17,674 BAD EPOCHS (no improvement): 0
2023-05-23 12:43:17,676 saving best model
2023-05-23 12:43:17,763 ----------------------------------------------------------------------------------------------------
2023-05-23 12:43:20,810 epoch 18 - iter 42/428 - loss 1.38990946 - time (sec): 3.04 - samples/sec: 220.99 - lr: 0.000500
2023-05-23 12:43:24,431 epoch 18 - iter 84/428 - loss 1.37810132 - time (sec): 6.66 - samples/sec: 201.75 - lr: 0.000500
2023-05-23 12:43:27,979 epoch 18 - iter 126/428 - loss 1.39503641 - time (sec): 10.21 - samples/sec: 197.45 - lr: 0.000500
2023-05-23 12:43:31,043 epoch 18 - iter 168/428 - loss 1.41495305 - time (sec): 13.27 - samples/sec: 202.51 - lr: 0.000500
2023-05-23 12:43:34,068 epoch 18 - iter 210/428 - loss 1.43064558 - time (sec): 16.30 - samples/sec: 206.15 - lr: 0.000500
2023-05-23 12:43:37,715 epoch 18 - iter 252/428 - loss 1.45209466 - time (sec): 19.95 - samples/sec: 202.15 - lr: 0.000500
2023-05-23 12:43:41,098 epoch 18 - iter 294/

100%|██████████| 56/56 [00:05<00:00, 10.57it/s]

2023-05-23 12:43:56,234 Evaluating as a multi-label problem: True
2023-05-23 12:43:56,285 DEV : loss 2.106293201446533 - f1-score (micro avg)  0.5674





2023-05-23 12:43:56,788 BAD EPOCHS (no improvement): 1
2023-05-23 12:43:56,791 ----------------------------------------------------------------------------------------------------
2023-05-23 12:43:59,818 epoch 19 - iter 42/428 - loss 1.36377330 - time (sec): 3.03 - samples/sec: 222.14 - lr: 0.000500
2023-05-23 12:44:02,839 epoch 19 - iter 84/428 - loss 1.42869591 - time (sec): 6.05 - samples/sec: 222.28 - lr: 0.000500
2023-05-23 12:44:06,408 epoch 19 - iter 126/428 - loss 1.38814493 - time (sec): 9.62 - samples/sec: 209.66 - lr: 0.000500
2023-05-23 12:44:09,846 epoch 19 - iter 168/428 - loss 1.38901966 - time (sec): 13.05 - samples/sec: 205.92 - lr: 0.000500
2023-05-23 12:44:12,970 epoch 19 - iter 210/428 - loss 1.38166408 - time (sec): 16.18 - samples/sec: 207.70 - lr: 0.000500
2023-05-23 12:44:16,015 epoch 19 - iter 252/428 - loss 1.38455515 - time (sec): 19.22 - samples/sec: 209.75 - lr: 0.000500
2023-05-23 12:44:19,622 epoch 19 - iter 294/428 - loss 1.39713035 - time (sec): 22.83 -

100%|██████████| 56/56 [00:04<00:00, 11.81it/s]

2023-05-23 12:44:34,454 Evaluating as a multi-label problem: True





2023-05-23 12:44:34,510 DEV : loss 2.119152069091797 - f1-score (micro avg)  0.551
2023-05-23 12:44:34,963 BAD EPOCHS (no improvement): 2
2023-05-23 12:44:34,967 ----------------------------------------------------------------------------------------------------
2023-05-23 12:44:38,365 epoch 20 - iter 42/428 - loss 1.24780105 - time (sec): 3.40 - samples/sec: 197.90 - lr: 0.000500
2023-05-23 12:44:41,365 epoch 20 - iter 84/428 - loss 1.33658324 - time (sec): 6.40 - samples/sec: 210.13 - lr: 0.000500
2023-05-23 12:44:44,941 epoch 20 - iter 126/428 - loss 1.36590402 - time (sec): 9.97 - samples/sec: 202.15 - lr: 0.000500
2023-05-23 12:44:47,912 epoch 20 - iter 168/428 - loss 1.38877647 - time (sec): 12.94 - samples/sec: 207.68 - lr: 0.000500
2023-05-23 12:44:51,377 epoch 20 - iter 210/428 - loss 1.37286587 - time (sec): 16.41 - samples/sec: 204.78 - lr: 0.000500
2023-05-23 12:44:54,370 epoch 20 - iter 252/428 - loss 1.36159615 - time (sec): 19.40 - samples/sec: 207.82 - lr: 0.000500
2023

100%|██████████| 56/56 [00:04<00:00, 13.69it/s]

2023-05-23 12:45:12,279 Evaluating as a multi-label problem: True
2023-05-23 12:45:12,310 DEV : loss 2.159588575363159 - f1-score (micro avg)  0.5461





2023-05-23 12:45:13,101 BAD EPOCHS (no improvement): 3
2023-05-23 12:45:13,104 ----------------------------------------------------------------------------------------------------
2023-05-23 12:45:16,441 epoch 21 - iter 42/428 - loss 1.29421930 - time (sec): 3.33 - samples/sec: 201.86 - lr: 0.000500
2023-05-23 12:45:19,746 epoch 21 - iter 84/428 - loss 1.31943493 - time (sec): 6.63 - samples/sec: 202.58 - lr: 0.000500
2023-05-23 12:45:22,759 epoch 21 - iter 126/428 - loss 1.30429303 - time (sec): 9.65 - samples/sec: 208.97 - lr: 0.000500
2023-05-23 12:45:26,372 epoch 21 - iter 168/428 - loss 1.34433825 - time (sec): 13.26 - samples/sec: 202.72 - lr: 0.000500
2023-05-23 12:45:29,532 epoch 21 - iter 210/428 - loss 1.35556835 - time (sec): 16.42 - samples/sec: 204.63 - lr: 0.000500
2023-05-23 12:45:32,916 epoch 21 - iter 252/428 - loss 1.34930024 - time (sec): 19.80 - samples/sec: 203.60 - lr: 0.000500
2023-05-23 12:45:35,952 epoch 21 - iter 294/428 - loss 1.34476739 - time (sec): 22.84 -

100%|██████████| 56/56 [00:04<00:00, 13.63it/s]

2023-05-23 12:45:50,873 Evaluating as a multi-label problem: True
2023-05-23 12:45:50,906 DEV : loss 2.0870018005371094 - f1-score (micro avg)  0.5654





2023-05-23 12:45:51,669 Epoch    21: reducing learning rate of group 0 to 2.5000e-04.
2023-05-23 12:45:51,673 BAD EPOCHS (no improvement): 4
2023-05-23 12:45:51,681 ----------------------------------------------------------------------------------------------------
2023-05-23 12:45:54,815 epoch 22 - iter 42/428 - loss 1.27131705 - time (sec): 3.13 - samples/sec: 214.64 - lr: 0.000250
2023-05-23 12:45:58,228 epoch 22 - iter 84/428 - loss 1.27449861 - time (sec): 6.54 - samples/sec: 205.39 - lr: 0.000250
2023-05-23 12:46:01,401 epoch 22 - iter 126/428 - loss 1.26855700 - time (sec): 9.72 - samples/sec: 207.47 - lr: 0.000250
2023-05-23 12:46:04,413 epoch 22 - iter 168/428 - loss 1.25684416 - time (sec): 12.73 - samples/sec: 211.18 - lr: 0.000250
2023-05-23 12:46:07,991 epoch 22 - iter 210/428 - loss 1.26252720 - time (sec): 16.31 - samples/sec: 206.05 - lr: 0.000250
2023-05-23 12:46:11,253 epoch 22 - iter 252/428 - loss 1.25660905 - time (sec): 19.57 - samples/sec: 206.04 - lr: 0.000250
2

100%|██████████| 56/56 [00:04<00:00, 13.00it/s]

2023-05-23 12:46:29,408 Evaluating as a multi-label problem: True
2023-05-23 12:46:29,441 DEV : loss 2.1754400730133057 - f1-score (micro avg)  0.5683





2023-05-23 12:46:29,716 BAD EPOCHS (no improvement): 1
2023-05-23 12:46:29,718 ----------------------------------------------------------------------------------------------------
2023-05-23 12:46:33,248 epoch 23 - iter 42/428 - loss 1.17395907 - time (sec): 3.53 - samples/sec: 190.53 - lr: 0.000250
2023-05-23 12:46:36,294 epoch 23 - iter 84/428 - loss 1.21230033 - time (sec): 6.57 - samples/sec: 204.47 - lr: 0.000250
2023-05-23 12:46:39,754 epoch 23 - iter 126/428 - loss 1.21139878 - time (sec): 10.03 - samples/sec: 200.94 - lr: 0.000250
2023-05-23 12:46:42,847 epoch 23 - iter 168/428 - loss 1.20985182 - time (sec): 13.13 - samples/sec: 204.78 - lr: 0.000250
2023-05-23 12:46:46,512 epoch 23 - iter 210/428 - loss 1.21611415 - time (sec): 16.79 - samples/sec: 200.10 - lr: 0.000250
2023-05-23 12:46:49,485 epoch 23 - iter 252/428 - loss 1.23384307 - time (sec): 19.76 - samples/sec: 204.01 - lr: 0.000250
2023-05-23 12:46:52,899 epoch 23 - iter 294/428 - loss 1.24197978 - time (sec): 23.18 

100%|██████████| 56/56 [00:04<00:00, 12.15it/s]

2023-05-23 12:47:07,811 Evaluating as a multi-label problem: True
2023-05-23 12:47:07,841 DEV : loss 2.1913187503814697 - f1-score (micro avg)  0.5606





2023-05-23 12:47:08,109 BAD EPOCHS (no improvement): 2
2023-05-23 12:47:08,111 ----------------------------------------------------------------------------------------------------
2023-05-23 12:47:11,664 epoch 24 - iter 42/428 - loss 1.15071005 - time (sec): 3.55 - samples/sec: 189.36 - lr: 0.000250
2023-05-23 12:47:14,665 epoch 24 - iter 84/428 - loss 1.20289959 - time (sec): 6.55 - samples/sec: 205.18 - lr: 0.000250
2023-05-23 12:47:17,722 epoch 24 - iter 126/428 - loss 1.21852847 - time (sec): 9.61 - samples/sec: 209.84 - lr: 0.000250
2023-05-23 12:47:21,160 epoch 24 - iter 168/428 - loss 1.20868317 - time (sec): 13.05 - samples/sec: 206.05 - lr: 0.000250
2023-05-23 12:47:24,178 epoch 24 - iter 210/428 - loss 1.19575918 - time (sec): 16.06 - samples/sec: 209.17 - lr: 0.000250
2023-05-23 12:47:27,792 epoch 24 - iter 252/428 - loss 1.20183222 - time (sec): 19.68 - samples/sec: 204.91 - lr: 0.000250
2023-05-23 12:47:30,823 epoch 24 - iter 294/428 - loss 1.20284656 - time (sec): 22.71 -

100%|██████████| 56/56 [00:04<00:00, 11.64it/s]

2023-05-23 12:47:45,688 Evaluating as a multi-label problem: True
2023-05-23 12:47:45,738 DEV : loss 2.2344441413879395 - f1-score (micro avg)  0.5553





2023-05-23 12:47:46,216 BAD EPOCHS (no improvement): 3
2023-05-23 12:47:46,219 ----------------------------------------------------------------------------------------------------
2023-05-23 12:47:49,487 epoch 25 - iter 42/428 - loss 1.27491760 - time (sec): 3.27 - samples/sec: 205.75 - lr: 0.000250
2023-05-23 12:47:52,498 epoch 25 - iter 84/428 - loss 1.20959468 - time (sec): 6.28 - samples/sec: 214.11 - lr: 0.000250
2023-05-23 12:47:56,085 epoch 25 - iter 126/428 - loss 1.21579155 - time (sec): 9.86 - samples/sec: 204.38 - lr: 0.000250
2023-05-23 12:47:59,183 epoch 25 - iter 168/428 - loss 1.23064799 - time (sec): 12.96 - samples/sec: 207.38 - lr: 0.000250
2023-05-23 12:48:02,532 epoch 25 - iter 210/428 - loss 1.22945604 - time (sec): 16.31 - samples/sec: 206.00 - lr: 0.000250
2023-05-23 12:48:05,577 epoch 25 - iter 252/428 - loss 1.21040737 - time (sec): 19.36 - samples/sec: 208.31 - lr: 0.000250
2023-05-23 12:48:08,611 epoch 25 - iter 294/428 - loss 1.21948337 - time (sec): 22.39 -

100%|██████████| 56/56 [00:04<00:00, 11.75it/s]

2023-05-23 12:48:24,114 Evaluating as a multi-label problem: True
2023-05-23 12:48:24,147 DEV : loss 2.2455615997314453 - f1-score (micro avg)  0.5564





2023-05-23 12:48:24,423 Epoch    25: reducing learning rate of group 0 to 1.2500e-04.
2023-05-23 12:48:24,425 BAD EPOCHS (no improvement): 4
2023-05-23 12:48:24,429 ----------------------------------------------------------------------------------------------------
2023-05-23 12:48:27,779 epoch 26 - iter 42/428 - loss 1.07435248 - time (sec): 3.35 - samples/sec: 200.66 - lr: 0.000125
2023-05-23 12:48:30,940 epoch 26 - iter 84/428 - loss 1.11658505 - time (sec): 6.51 - samples/sec: 206.47 - lr: 0.000125
2023-05-23 12:48:33,977 epoch 26 - iter 126/428 - loss 1.12960278 - time (sec): 9.55 - samples/sec: 211.17 - lr: 0.000125
2023-05-23 12:48:37,572 epoch 26 - iter 168/428 - loss 1.11982320 - time (sec): 13.14 - samples/sec: 204.54 - lr: 0.000125
2023-05-23 12:48:40,881 epoch 26 - iter 210/428 - loss 1.12105509 - time (sec): 16.45 - samples/sec: 204.25 - lr: 0.000125
2023-05-23 12:48:44,121 epoch 26 - iter 252/428 - loss 1.12707524 - time (sec): 19.69 - samples/sec: 204.77 - lr: 0.000125
2

100%|██████████| 56/56 [00:04<00:00, 13.64it/s]

2023-05-23 12:49:01,975 Evaluating as a multi-label problem: True
2023-05-23 12:49:02,006 DEV : loss 2.2323734760284424 - f1-score (micro avg)  0.5619





2023-05-23 12:49:02,791 BAD EPOCHS (no improvement): 1
2023-05-23 12:49:02,798 ----------------------------------------------------------------------------------------------------
2023-05-23 12:49:05,916 epoch 27 - iter 42/428 - loss 1.16829394 - time (sec): 3.12 - samples/sec: 215.69 - lr: 0.000125
2023-05-23 12:49:09,381 epoch 27 - iter 84/428 - loss 1.13539895 - time (sec): 6.58 - samples/sec: 204.23 - lr: 0.000125
2023-05-23 12:49:12,459 epoch 27 - iter 126/428 - loss 1.11352138 - time (sec): 9.66 - samples/sec: 208.74 - lr: 0.000125
2023-05-23 12:49:15,555 epoch 27 - iter 168/428 - loss 1.11926377 - time (sec): 12.75 - samples/sec: 210.75 - lr: 0.000125
2023-05-23 12:49:19,196 epoch 27 - iter 210/428 - loss 1.11349876 - time (sec): 16.40 - samples/sec: 204.94 - lr: 0.000125
2023-05-23 12:49:22,617 epoch 27 - iter 252/428 - loss 1.11135773 - time (sec): 19.82 - samples/sec: 203.47 - lr: 0.000125
2023-05-23 12:49:25,713 epoch 27 - iter 294/428 - loss 1.12990772 - time (sec): 22.91 -

100%|██████████| 56/56 [00:04<00:00, 13.43it/s]

2023-05-23 12:49:40,592 Evaluating as a multi-label problem: True
2023-05-23 12:49:40,631 DEV : loss 2.2462480068206787 - f1-score (micro avg)  0.5589





2023-05-23 12:49:40,904 BAD EPOCHS (no improvement): 2
2023-05-23 12:49:40,906 ----------------------------------------------------------------------------------------------------
2023-05-23 12:49:44,498 epoch 28 - iter 42/428 - loss 1.09621649 - time (sec): 3.59 - samples/sec: 187.20 - lr: 0.000125
2023-05-23 12:49:47,626 epoch 28 - iter 84/428 - loss 1.11123014 - time (sec): 6.72 - samples/sec: 200.06 - lr: 0.000125
2023-05-23 12:49:51,040 epoch 28 - iter 126/428 - loss 1.10912815 - time (sec): 10.13 - samples/sec: 198.97 - lr: 0.000125
2023-05-23 12:49:54,050 epoch 28 - iter 168/428 - loss 1.12080869 - time (sec): 13.14 - samples/sec: 204.54 - lr: 0.000125
2023-05-23 12:49:57,640 epoch 28 - iter 210/428 - loss 1.12560749 - time (sec): 16.73 - samples/sec: 200.82 - lr: 0.000125
2023-05-23 12:50:00,663 epoch 28 - iter 252/428 - loss 1.13385848 - time (sec): 19.75 - samples/sec: 204.11 - lr: 0.000125
2023-05-23 12:50:04,120 epoch 28 - iter 294/428 - loss 1.12708016 - time (sec): 23.21 

100%|██████████| 56/56 [00:04<00:00, 12.22it/s]

2023-05-23 12:50:19,018 Evaluating as a multi-label problem: True
2023-05-23 12:50:19,049 DEV : loss 2.3105244636535645 - f1-score (micro avg)  0.552





2023-05-23 12:50:19,331 BAD EPOCHS (no improvement): 3
2023-05-23 12:50:19,333 ----------------------------------------------------------------------------------------------------
2023-05-23 12:50:22,375 epoch 29 - iter 42/428 - loss 1.06874142 - time (sec): 3.04 - samples/sec: 221.08 - lr: 0.000125
2023-05-23 12:50:25,918 epoch 29 - iter 84/428 - loss 1.09057317 - time (sec): 6.58 - samples/sec: 204.18 - lr: 0.000125
2023-05-23 12:50:29,113 epoch 29 - iter 126/428 - loss 1.09201170 - time (sec): 9.78 - samples/sec: 206.18 - lr: 0.000125
2023-05-23 12:50:32,422 epoch 29 - iter 168/428 - loss 1.08813491 - time (sec): 13.09 - samples/sec: 205.39 - lr: 0.000125
2023-05-23 12:50:35,448 epoch 29 - iter 210/428 - loss 1.09309612 - time (sec): 16.11 - samples/sec: 208.53 - lr: 0.000125
2023-05-23 12:50:39,113 epoch 29 - iter 252/428 - loss 1.10451405 - time (sec): 19.78 - samples/sec: 203.86 - lr: 0.000125
2023-05-23 12:50:42,267 epoch 29 - iter 294/428 - loss 1.10575512 - time (sec): 22.93 -

100%|██████████| 56/56 [00:05<00:00, 11.10it/s]

2023-05-23 12:50:57,300 Evaluating as a multi-label problem: True





2023-05-23 12:50:57,360 DEV : loss 2.3183412551879883 - f1-score (micro avg)  0.5506
2023-05-23 12:50:57,869 Epoch    29: reducing learning rate of group 0 to 6.2500e-05.
2023-05-23 12:50:57,875 BAD EPOCHS (no improvement): 4
2023-05-23 12:50:57,877 ----------------------------------------------------------------------------------------------------
2023-05-23 12:50:57,880 ----------------------------------------------------------------------------------------------------
2023-05-23 12:50:57,882 learning rate too small - quitting training!
2023-05-23 12:50:57,884 ----------------------------------------------------------------------------------------------------
2023-05-23 12:50:57,982 ----------------------------------------------------------------------------------------------------


100%|██████████| 204/204 [00:13<00:00, 15.12it/s]

2023-05-23 12:51:11,723 Evaluating as a multi-label problem: True





2023-05-23 12:51:11,851 0.6993	0.4931	0.5784	0.4863
2023-05-23 12:51:11,854 
Results:
- F-score (micro) 0.5784
- F-score (macro) 0.2245
- Accuracy 0.4863

By class:
                       precision    recall  f1-score   support

       __label__anger     0.7193    0.6624    0.6897      1176
         __label__joy     0.7397    0.6483    0.6910      1052
__label__anticipation     0.4079    0.0914    0.1494       339
        __label__fear     0.6357    0.3787    0.4747       235
     __label__disgust     0.2500    0.0102    0.0196       196
   __label__pessimism     0.5000    0.1860    0.2712        86
     __label__sadness     0.0000    0.0000    0.0000        86
    __label__optimism     0.3636    0.1143    0.1739        70
        __label__love     0.0000    0.0000    0.0000        14
    __label__surprise     0.0000    0.0000    0.0000         4
       __label__trust     0.0000    0.0000    0.0000         1

            micro avg     0.6993    0.4931    0.5784      3259
            ma
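
The gap between the micro F-score (0.5784) and the macro F-score (0.2245) in the report above comes from how each is averaged. Micro F1 is computed from the pooled precision and recall, so the frequent classes (anger, joy) dominate. Macro F1 is the unweighted mean of the per-class F1 values, so the rare classes scored at 0.0 pull it down. The sketch below recomputes both from the rounded numbers printed in the report, so it can only agree to about four decimals; the variable and function names are illustrative and not part of the notebook.

```python
# Per-class F1 values copied from the "By class" report above.
per_class_f1 = {
    "anger": 0.6897,
    "joy": 0.6910,
    "anticipation": 0.1494,
    "fear": 0.4747,
    "disgust": 0.0196,
    "pessimism": 0.2712,
    "sadness": 0.0000,
    "optimism": 0.1739,
    "love": 0.0000,
    "surprise": 0.0000,
    "trust": 0.0000,
}

# Micro-averaged precision and recall from the "micro avg" row.
micro_precision, micro_recall = 0.6993, 0.4931


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


micro_f1 = f1(micro_precision, micro_recall)
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

print(f"micro F1: {micro_f1:.4f}")
print(f"macro F1: {macro_f1:.4f}")
```

Seven of the eleven classes have an F1 below 0.3, which is why the macro score is less than half of the micro score.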

{'test_score': 0.5783696238977866,
 'dev_score_history': [0.381020848310568,
  0.45170660856935363,
  0.48600883652430044,
  0.44588045234248785,
  0.5292479108635098,
  0.5253648366921474,
  0.5257142857142858,
  0.5370629370629371,
  0.514822848879248,
  0.5515358361774745,
  0.5651006711409395,
  0.5469277515192439,
  0.5448323066392882,
  0.5620827770360481,
  0.546831955922865,
  0.5354439091534756,
  0.5685752330226365,
  0.5674121405750798,
  0.5510071474983755,
  0.5460526315789475,
  0.5653896961690885,
  0.5682560418027434,
  0.5605670103092784,
  0.5553372626064178,
  0.5563517915309446,
  0.5618556701030929,
  0.5588615782664942,
  0.5519897304236201,
  0.5505761843790012],
 'train_loss_history': [2.6792705183874346,
  2.335255660240885,
  2.1629621679234066,
  2.092928094799737,
  2.003150710239505,
  1.970020895291574,
  1.9144010635871362,
  1.8842396075072154,
  1.8410915942107002,
  1.8239836784858179,
  1.7693277292343634,
  1.7548125390596436,
  1.7125222162044198,
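
The `dev_score_history` above can be replayed to see why training stopped after epoch 29. The log shows the learning rate being halved whenever the dev score fails to improve for several epochs in a row, and training ends once the rate becomes too small. The sketch below is a plain-Python simulation of such an anneal-on-plateau schedule, not Flair's own implementation. The starting learning rate (0.001), the patience (3), the factor (0.5) and the minimum learning rate (1e-4) are assumptions chosen to be consistent with the log; the starting rate in particular is not visible in this part of the output.

```python
# Dev micro-F1 per epoch, copied from 'dev_score_history' above.
dev_scores = [
    0.381020848310568, 0.45170660856935363, 0.48600883652430044,
    0.44588045234248785, 0.5292479108635098, 0.5253648366921474,
    0.5257142857142858, 0.5370629370629371, 0.514822848879248,
    0.5515358361774745, 0.5651006711409395, 0.5469277515192439,
    0.5448323066392882, 0.5620827770360481, 0.546831955922865,
    0.5354439091534756, 0.5685752330226365, 0.5674121405750798,
    0.5510071474983755, 0.5460526315789475, 0.5653896961690885,
    0.5682560418027434, 0.5605670103092784, 0.5553372626064178,
    0.5563517915309446, 0.5618556701030929, 0.5588615782664942,
    0.5519897304236201, 0.5505761843790012,
]


def simulate_anneal(scores, lr, patience=3, factor=0.5, min_lr=1e-4):
    """Replay an anneal-on-plateau schedule over a list of dev scores.

    Returns the epoch at which training stops, the final learning rate,
    and a list of (epoch, new_lr) pairs for every reduction.
    """
    best = float("-inf")
    bad_epochs = 0
    reductions = []
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best, bad_epochs = score, 0
        else:
            bad_epochs += 1
        # Reduce once the number of bad epochs exceeds the patience.
        if bad_epochs > patience:
            lr *= factor
            bad_epochs = 0
            reductions.append((epoch, lr))
        # Stop as soon as the learning rate drops below the minimum.
        if lr < min_lr:
            return epoch, lr, reductions
    return len(scores), lr, reductions


# ASSUMPTION: initial lr of 0.001; it is not shown in this part of the log.
stop_epoch, final_lr, reductions = simulate_anneal(dev_scores, lr=0.001)
print("stopped after epoch", stop_epoch, "with lr", final_lr)
for epoch, lr in reductions:
    print(f"epoch {epoch}: lr reduced to {lr:.4e}")
```

Under these assumptions the simulated reductions at epochs 21, 25 and 29 line up with the `reducing learning rate of group 0` messages in the log, and the last one takes the rate to 6.25e-05, below the assumed minimum. The best dev score (0.5686) was reached at epoch 17, which is the checkpoint saved as the best model.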
 

## making predictions

In [None]:
from flair.data import Sentence
from flair.models import TextClassifier
# load the best checkpoint saved during training
new_clf = TextClassifier.load('data_fst/best-model.pt')


In [None]:
s1 = Sentence("I just finished watching Shrek for the third time! It's the small things in life...")

In [None]:
s2 = Sentence("I'm really looking forward to moving to Zurich! Can't wait to see some old friends")

In [None]:
s3 = Sentence("Bruh the current president is a clown")

In [None]:
s4 = Sentence("With things as they currently stand, I'm afraid I don't see a future for our kids")

In [None]:
sentences = [s1, s2, s3, s4]

In [None]:
for s in sentences:
  new_clf.predict(s)  # adds the predicted labels to the sentence in place
  print(s.labels)

['Sentence[18]: "I just finished watching Shrek for the third time! It's the small things in life..."'/'__label__joy' (0.6652)]
['Sentence[18]: "I'm really looking forward to moving to Zurich! Can't wait to see some old friends"'/'__label__anticipation' (0.6501)]
['Sentence[7]: "Bruh the current president is a clown"'/'__label__anger' (0.7535)]
['Sentence[19]: "With things as they currently stand, I'm afraid I don't see a future for our kids"'/'__label__fear' (0.7026)]
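
The printed labels carry the `__label__` prefix from the training files and a confidence score. For downstream use it can be handy to strip the prefix and keep only the confident predictions. The helper below is a hypothetical sketch (its name and the 0.5 threshold are assumptions, not part of the notebook). It works on plain `(value, score)` pairs so it runs without Flair; with Flair the pairs could presumably be collected as `[(l.value, l.score) for l in s.labels]`.

```python
def clean_predictions(pairs, threshold=0.5, prefix="__label__"):
    """Strip the label prefix and keep (emotion, score) pairs at or above threshold.

    The result is sorted by score, highest first.
    """
    cleaned = []
    for value, score in pairs:
        emotion = value[len(prefix):] if value.startswith(prefix) else value
        if score >= threshold:
            cleaned.append((emotion, score))
    return sorted(cleaned, key=lambda pair: pair[1], reverse=True)


# (value, score) pairs copied from the printed output above, one per sentence.
predictions = [
    [("__label__joy", 0.6652)],
    [("__label__anticipation", 0.6501)],
    [("__label__anger", 0.7535)],
    [("__label__fear", 0.7026)],
]

for pairs in predictions:
    print(clean_predictions(pairs))
```

Because the model was trained as a multi-label classifier, a sentence may receive several labels or none, so the helper returns a list rather than a single value.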
