<br><br>

## **Import necessary Python libraries and modules**

First, we will import the Python libraries and modules we need: pandas and NumPy for data manipulation, scikit-learn (`sklearn`) for evaluation metrics, and PyTorch (`torch`) for deep learning.

In [1]:

# For data manipulation and analysis
import pandas as pd
import numpy as np

# For machine learning tools and evaluation
from sklearn.metrics import accuracy_score, precision_recall_fscore_support, classification_report


# For deep learning
# https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
import torch



To use the HuggingFace [`transformers` Python library](https://huggingface.co/transformers/installation.html), we will install it with `pip`.

In [2]:
!pip install transformers

Collecting transformers
  Downloading transformers-4.16.2-py3-none-any.whl (3.5 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.47-py2.py3-none-any.whl (895 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting huggingface-hub<1.0,>=0.1.0
  Downloading huggingface_hub-0.4.0-py3-none-any.whl (67 kB)
Collecting tokenizers!=0.11.3,>=0.10.1
  Downloading tokenizers-0.11.4-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.8 MB)
Installing collected packages: pyyaml, tokenizers, sacremoses, huggingface-hub, transformers
  Attempting uninstall: pyyaml
    Foun

Once `transformers` is installed, we will import modules for `DistilBert`, a *distilled* or smaller version of a BERT model that runs more quickly and uses less computing power. This makes it ideal for those just getting started with BERT.

In [3]:
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
from transformers import Trainer, TrainingArguments

<br><br>

## **Set parameters and file paths**

In [4]:
# This is the name of the pre-trained model that we want to use.
# We're using DistilBERT to save space (it's a distilled version of the full BERT model).
# We use the multilingual, cased (vs. uncased) checkpoint; our data contains English, Italian, and Korean sentences.
model_name = 'distilbert-base-multilingual-cased'

# 'cuda' is the PyTorch device name for NVIDIA GPUs (CUDA is NVIDIA's GPU computing platform). We're going to send our model there.
device_name = 'cuda'

# This is the maximum number of tokens in any document sent to BERT.
max_length = 512                                                        

# This is the name of the directory where we'll save our model. You can name it whatever you want.
#cached_model_directory_name = 'ABSA_FineTuning_BERT'  
cached_model_directory_name = 'Emotion_12'  

In [5]:
# Are we using the GPU?
import torch

# If there's a GPU available...
if torch.cuda.is_available():    

    # Tell PyTorch to use the GPU.    
    device = torch.device("cuda")

    print('There are %d GPU(s) available.' % torch.cuda.device_count())

    print('We will use the GPU:', torch.cuda.get_device_name(0))

# If not...
else:
    print('No GPU available, using the CPU instead.')
    device = torch.device("cpu")

There are 1 GPU(s) available.
We will use the GPU: Tesla K80


In [6]:
train=pd.read_csv("train12_emo.csv")
train

Unnamed: 0.1,Unnamed: 0,Sentence,Emotions
0,0,"Ariel, ever since the day I met you, I knew th...",Joy
1,1,"Definitely doesn't like you. But, I wouldn't t...",Anger
2,2,I'd forgotten how much I loved happy hour. Jul...,Joy
3,3,My two and half hours of class every other day...,Neutral
4,4,"She smiles, clasping her hands together and le...",Joy
...,...,...,...
14304,14304,이게 영화인가?,Fear
14305,14305,로버트 레드포드가 48살때 찍은 영화구나.. 중후하고 농후한 남자의 향기,Anger
14306,14306,신랑이랑 기분좋게 보는 드라마 입니다 가족의 따뜻함도 알게 해주고 웃기도하고 울기도...,Joy
14307,14307,누군가를 가르쳐 본 적이 있는 사람이라면 100% 공감하고 감동받을 수 있는 이야기!,Fear


In [7]:
test=pd.read_csv("test1_emo.csv")
test

Unnamed: 0.1,Unnamed: 0,Sentence,Emotions
0,0,"No, it's okay. It's the same feeling he had, t...",Surprise
1,1,"At last, after many passionate kisses and whis...",Joy
2,2,Maybe I just wanted to feel your soft lips aga...,Joy
3,3,With her gentle demeanour and sudden courage s...,Joy
4,4,She agreed with Professor Snape that it was un...,Neutral
...,...,...,...
5133,5133,이것은 절대 1점이 아니다! 10점으로도 모자라 11점을 주고 싶은 내마음이다!!,Fear
5134,5134,낸시 알렌 이쁘고 섹시하다,Joy
5135,5135,영화 안봤는데 방금 라디오 듣고 어떤영화인지 대강 알것 같네요ㅋㅋㅋ,Anger
5136,5136,러셀크로우 왕멋있고 암튼 큰기대는 안했는데 잼썼음ㅎㅎ,Joy


In [8]:
# Create the train and test texts and labels (14,309 training and 5,138 test examples)
train_texts=train['Sentence']
train_labels=train['Emotions']
test_texts=test['Sentence']
test_labels=test['Emotions']


<br><br>

## **Check the training and test sets**

In [9]:
len(train_texts), len(train_labels), len(test_texts), len(test_labels)

(14309, 14309, 5138, 5138)

Here's an example of a training label and sentence:

In [10]:
train_labels[0], train_texts[0]

('Joy',
 'Ariel, ever since the day I met you, I knew that you were going to be the death of me, Shane said as he held my hands. "I knew that I wanted to spend the rest of my life with you, and now I want to make it official. So," he said as he knelt down on one knee and pulled out a small black box from his pocket.')

<br><br>

## **Encode data for BERT**

We're going to transform our texts and labels into a format that BERT (via Huggingface and PyTorch) will understand. This is called *encoding* the data.

Here are the steps we need to follow:

1. The labels, in this case emotion categories such as `Joy` or `Anger`, need to be turned into integers rather than strings.

2. The texts, in this case sentences in English, Italian, and Korean, need to be truncated if they're longer than 512 tokens and padded so that all sequences have the same length. The words in the texts also need to be split into "word pieces" and matched to their embedding vectors.

3. We need to add special tokens to help BERT:

| BERT special token | Explanation |
| --------------| ---------|
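| [CLS] | Start token of every document |
| [SEP] | Separator marking the end of a sentence or segment |
| [PAD] | Padding added at the end of shorter documents as many times as necessary, up to 512 tokens |
|  &#35;&#35; | Prefix of a "word piece" that continues the preceding piece (a marker inside tokens rather than a token of its own) |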
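
To make the table concrete, here is a toy sketch of how special tokens, truncation, and padding fit together. It is only an illustration: `toy_encode` is a hypothetical helper, the word pieces are typed in by hand rather than produced by a learned WordPiece vocabulary, and it pads to a fixed `max_length`, whereas the real tokenizer call with `padding=True` pads to the longest sequence it is given.

```python
# Toy illustration only: a real tokenizer derives word pieces from a learned vocabulary.
def toy_encode(word_pieces, max_length):
    """Add [CLS]/[SEP], truncate to max_length, then pad with [PAD]."""
    # Reserve two positions for the [CLS] and [SEP] special tokens.
    body = word_pieces[: max_length - 2]
    tokens = ["[CLS]"] + body + ["[SEP]"]
    # The attention mask marks real tokens with 1 and padding with 0.
    attention_mask = [1] * len(tokens)
    n_pad = max_length - len(tokens)
    tokens += ["[PAD]"] * n_pad
    attention_mask += [0] * n_pad
    return tokens, attention_mask

pieces = ["he", "kn", "##elt", "down"]  # "knelt" split into two word pieces
tokens, mask = toy_encode(pieces, max_length=8)
print(tokens)  # ['[CLS]', 'he', 'kn', '##elt', 'down', '[SEP]', '[PAD]', '[PAD]']
print(mask)    # [1, 1, 1, 1, 1, 1, 0, 0]
```

The attention mask is what lets the model ignore the `[PAD]` positions.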




Here we will load `DistilBertTokenizerFast` from the HuggingFace library, which will do all the work of encoding the texts for us. The `tokenizer()` will break word tokens into word pieces, truncate to 512 tokens, and add padding and special BERT tokens.

In [11]:
tokenizer = DistilBertTokenizerFast.from_pretrained(model_name) # The model_name needs to match our pre-trained model.


Here we will create a map of our labels (the emotion categories) to integer keys. We take the unique labels, and then we make a dictionary that associates each label/tag with an integer.

**Note:** HuggingFace documentation sometimes refers to "labels" as "tags" but these are the same thing. We use "labels" throughout this notebook for clarity.

In [12]:
unique_labels = set(label for label in train_labels)
label2id = {label: id for id, label in enumerate(unique_labels)}
id2label = {id: label for label, id in label2id.items()}

In [13]:
unique_labels

{'Anger', 'Disgust', 'Fear', 'Joy', 'Neutral', 'Sadness', 'Surprise'}

In [14]:
label2id.keys()

dict_keys(['Sadness', 'Anger', 'Disgust', 'Surprise', 'Neutral', 'Fear', 'Joy'])

In [15]:
id2label.keys()

dict_keys([0, 1, 2, 3, 4, 5, 6])
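
One caveat: the mapping above is built from a `set`, and Python randomizes string hashing per process by default, so the label-to-integer assignment can differ from one session to the next. That matters if a saved model is reloaded in a fresh session. The sketch below shows one possible way to get a stable mapping by sorting the labels first. It is not what this notebook does, the name `build_label_maps` and the toy labels are hypothetical, and switching to it would mean fine-tuning again, because the integer ids would no longer match the outputs shown here.

```python
def build_label_maps(labels):
    """Map each distinct label to an integer id in sorted (alphabetical) order."""
    unique_sorted = sorted(set(labels))
    label2id = {label: idx for idx, label in enumerate(unique_sorted)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label

toy_labels = ["Joy", "Anger", "Joy", "Neutral", "Fear"]  # hypothetical sample
stable_label2id, stable_id2label = build_label_maps(toy_labels)
print(stable_label2id)  # {'Anger': 0, 'Fear': 1, 'Joy': 2, 'Neutral': 3}
```

Because the order depends only on the label names, the same labels always produce the same ids.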

Now let's encode our texts and labels!

In [16]:
train_encodings = tokenizer(train_texts.tolist(), truncation=True, padding=True, max_length=max_length)
test_encodings  = tokenizer(test_texts.tolist(), truncation=True, padding=True, max_length=max_length)

train_labels_encoded = [label2id[y] for y in train_labels.tolist()]
test_labels_encoded  = [label2id[y] for y in test_labels.tolist()]

**Examine a sentence in the training set after encoding**

In [17]:
' '.join(train_encodings[29].tokens[0:100])

'[CLS] They kept warm with the body heat radi ##ating off of each other . Sc ##all ##i would be sp ##oon ##ed by Ve ##git ##o in a lock ##ed hu ##g . The warm ##th of his em ##bra ##ce caused a sm ##ile to form on her face , her gray sp ##ark ##ling eyes focusing on the screen of her phone that they watched little videos on , the sai ##yan girl ig ##nor ##ing the thu ##mps in her heart and the burning felt on her face . [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]'

**Examine a sentence in the test set after encoding**

In [18]:
' '.join(test_encodings[0].tokens[0:100])

"[CLS] No , it ' s oka ##y . It ' s the same feeling he had , too . He kept doing it up because he ' s just such an ups ##tand ##ing gent ##leman who didn ' t want to int ##rude on me getting an education or something like that , she said , her voice heavy with sa ##rca ##sm . I per ##ked up slightly , taking that as a good sign . A dry sense of humor seemed like a moderate improvement from c ##ry ##ing so hard it was difficult to br"

**Examine the training labels after encoding**

In [19]:
set(train_labels_encoded)

{0, 1, 2, 3, 4, 5, 6}

**Examine the test labels after encoding**

In [20]:
set(test_labels_encoded)

{0, 1, 2, 3, 4, 5, 6}

<br><br>

## **Make a custom Torch dataset**

Here we combine the encoded labels and texts into dataset objects. We use the custom Torch `MyDataset` class to make a `train_dataset` object from the `train_encodings` and `train_labels_encoded`. We also make a `test_dataset` object from `test_encodings` and `test_labels_encoded`.

In [21]:
class MyDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __getitem__(self, idx):
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

In [22]:
train_dataset = MyDataset(train_encodings, train_labels_encoded)
test_dataset = MyDataset(test_encodings, test_labels_encoded)

In [23]:
' '.join(train_dataset.encodings[0].tokens[0:100])

'[CLS] Ariel , ever since the day I met you , I knew that you were going to be the death of me , Shane said as he held my hands . " I knew that I wanted to spend the rest of my life with you , and now I want to make it official . So , " he said as he kn ##elt down on one knee and pulled out a small black box from his poc ##ket . [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]'

In [24]:
' '.join(test_dataset.encodings[1].tokens[0:100])

"[CLS] At last , after many passion ##ate kis ##ses and w ##his ##pere ##d words of love they gave in to their desire to be together . He could not wait and needed her with an intensity that was slowly driving him out of his mind . She couldn ' t think beyond this moment as he care ##ssed every inch of her . Da ##zed with jo ##y she surrendered completely to him . [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]"

<br><br>

## **Load pre-trained BERT model**

Here we load a pre-trained DistilBERT model and send it to CUDA.

**Note:** If you decide to repeat fine-tuning after already running the following cells, make sure that you re-run this cell to re-load the original pre-trained model before fine-tuning again.

In [25]:
# The model_name needs to match the name used for the tokenizer above.
# `device` was set in the GPU check above, so this falls back to the CPU when no GPU is available.
model = DistilBertForSequenceClassification.from_pretrained(model_name, num_labels=len(id2label)).to(device)

Some weights of the model checkpoint at distilbert-base-multilingual-cased were not used when initializing DistilBertForSequenceClassification: ['vocab_layer_norm.bias', 'vocab_transform.weight', 'vocab_projector.bias', 'vocab_transform.bias', 'vocab_layer_norm.weight', 'vocab_projector.weight']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-multilingual-cased and are newly initialized: ['classifier.bias', 'classifier.we

<br><br>

## **Set the BERT fine-tuning parameters**

These are the arguments we'll set in the HuggingFace `TrainingArguments` object, which we'll then pass to the HuggingFace `Trainer` object. There are many more possible arguments, but here we highlight the basics and some common gotchas.

When training your own model, you should search over these parameters to find the best settings for your particular dataset. You should use a held-out set of validation data for this step.

| Parameter | Explanation |
|-----------| ------------|
| num_train_epochs | total number of training epochs (how many times to pass through the entire dataset; too many can cause overfitting) |
| per_device_train_batch_size | batch size per device during training |
| per_device_eval_batch_size | batch size per device during evaluation |
| learning_rate | initial learning rate for the optimizer |
| warmup_steps | number of warmup steps for the learning rate scheduler |
| weight_decay | strength of weight decay (shrinks the weights, a form of regularization) |
| output_dir | output directory for the fine-tuned model and configuration files |
| logging_dir | directory for storing logs |
| logging_steps | how often, in steps, to print logging output (so that we can stop training early if the loss isn't going down) |
| evaluation_strategy | when to evaluate during training; `'steps'` evaluates periodically so that we can see the accuracy going up |
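
To see what `warmup_steps` does, here is a minimal sketch of a linear warmup followed by linear decay. It assumes the `Trainer`'s default linear schedule is in effect (the cell below does not override the scheduler type), and it plugs in the values used in this notebook: a learning rate of 2e-5, 100 warmup steps, and the 1792 total optimization steps reported in the training log. The function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=100, total_steps=1792):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 50, 100, 946, 1792):
    print(step, linear_schedule_lr(step))
```

The rate starts at zero, peaks at step 100, is back to half its peak at step 946, and reaches zero at the last step.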

In [26]:
training_args=TrainingArguments(
    num_train_epochs=4,              # total number of training epochs
    per_device_train_batch_size=32,  # batch size per device during training
    per_device_eval_batch_size=32,   # batch size for evaluation
    learning_rate=2e-5,              # initial learning rate for the AdamW optimizer (the Trainer default)
    warmup_steps=100,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    output_dir='./results',          # output directory
    logging_dir='./logs',            # directory for storing logs
    logging_steps=100,               # print logging output every 100 steps
    evaluation_strategy='steps'      # evaluate during fine-tuning so that we can see progress
)
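
As a quick sanity check, we can work out how many optimization steps these settings imply. The sketch below assumes a single device and no gradient accumulation, and it uses the training set size of 14,309 examples reported earlier in the notebook.

```python
import math

num_examples = 14309   # size of the training set
batch_size = 32        # per_device_train_batch_size, on a single device
num_epochs = 4         # num_train_epochs
logging_steps = 100    # with evaluation_strategy='steps', evaluation runs at this interval too

steps_per_epoch = math.ceil(num_examples / batch_size)  # the last batch is smaller
total_steps = steps_per_epoch * num_epochs
num_evaluations = total_steps // logging_steps

print(steps_per_epoch, total_steps, num_evaluations)  # 448 1792 17
```

So the 100 warmup steps cover roughly the first 6% of training.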

<br><br>

## **Fine-tune the BERT model**

First, we define a custom evaluation function that returns the accuracy. You could modify this function to return precision, recall, F1, and/or other metrics.

In [27]:
def compute_metrics(pred):
  labels = pred.label_ids
  preds = pred.predictions.argmax(-1)
  acc = accuracy_score(labels, preds)
  return {
      'accuracy': acc,
  }
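
As one possible variation, the sketch below also reports macro-averaged precision, recall, and F1, using the `precision_recall_fscore_support` function imported at the top of the notebook. The name `compute_metrics_macro` and the toy inputs are hypothetical, and a `SimpleNamespace` stands in for the prediction object that the `Trainer` would pass in.

```python
from types import SimpleNamespace

import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics_macro(pred):
    labels = pred.label_ids
    preds = pred.predictions.argmax(-1)
    # Macro averaging gives every class the same weight, however rare it is.
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average='macro', zero_division=0
    )
    return {
        'accuracy': accuracy_score(labels, preds),
        'precision': precision,
        'recall': recall,
        'f1': f1,
    }

# Toy stand-in: 4 examples, 3 classes; class 2 is never predicted.
toy_pred = SimpleNamespace(
    predictions=np.array([[2.0, 0.1, 0.3],
                          [0.2, 1.5, 0.1],
                          [0.1, 0.9, 0.4],
                          [0.0, 2.2, 0.3]]),
    label_ids=np.array([0, 1, 2, 1]),
)
toy_metrics = compute_metrics_macro(toy_pred)
print(toy_metrics)
```

With unbalanced classes, macro F1 can be much lower than accuracy, which makes it a useful second number to watch.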

Then we create a HuggingFace `Trainer` object using the `TrainingArguments` object that we created above. We also send our `compute_metrics` function to the `Trainer` object, along with our test and train datasets.

**Note:** This is what we've been aiming for this whole time! All the work of tokenizing, creating datasets, and setting the training arguments was for this cell.

In [28]:
trainer = Trainer(
    model=model,                         # the instantiated 🤗 Transformers model to be trained
    args=training_args,                  # training arguments, defined above
    train_dataset=train_dataset,         # training dataset
    eval_dataset=test_dataset,           # evaluation dataset (usually a validation set; here we just send our test set)
    compute_metrics=compute_metrics      # our custom evaluation function 
)

Time to finally fine-tune! 

Be patient; even on a GPU this can take a while (the run shown below took about 79 minutes on a Tesla K80), and if you're running on CPU, it can take many hours.

After every 100 steps (the `logging_steps` value we specified in the `TrainingArguments` object), the trainer will output the current state of the model, including the training loss, validation ("test") loss, and accuracy (from our `compute_metrics` function).

You should see the loss going down and the accuracy going up. If instead they are staying the same or oscillating, you probably need to change the fine-tuning parameters.

In [29]:
trainer.train()       

***** Running training *****
  Num examples = 14309
  Num Epochs = 4
  Instantaneous batch size per device = 32
  Total train batch size (w. parallel, distributed & accumulation) = 32
  Gradient Accumulation steps = 1
  Total optimization steps = 1792


Step,Training Loss,Validation Loss,Accuracy
100,1.8688,1.752915,0.287661
200,1.691,1.597382,0.380888
300,1.5917,1.521355,0.410666
400,1.5167,1.460044,0.43441
500,1.4575,1.417258,0.454457
600,1.4005,1.377405,0.482873
700,1.3776,1.336748,0.491242
800,1.3211,1.311992,0.50798
900,1.3304,1.297125,0.514402
1000,1.208,1.287446,0.523161


***** Running Evaluation *****
  Num examples = 5138
  Batch size = 32
Saving model checkpoint to ./results/checkpoint-500
Configuration saved in ./results/checkpoint-500/config.json
Model weights saved in ./results/checkpoint-500/pytorch_model.bin
Saving model checkpoint to ./results/checkpoint-1000
Configuration saved in ./results/checkpoint-1000/config.jso

TrainOutput(global_step=1792, training_loss=1.3167115535054887, metrics={'train_runtime': 4747.5421, 'train_samples_per_second': 12.056, 'train_steps_per_second': 0.377, 'total_flos': 4516966656336840.0, 'train_loss': 1.3167115535054887, 'epoch': 4.0})

<br><br>

## **Save fine-tuned model**

The following cell will save the model and its configuration files to a directory in Colab. To preserve this model for future use, you should download the model to your computer.

In [30]:
trainer.save_model(cached_model_directory_name)

Saving model checkpoint to Emotion_12
Configuration saved in Emotion_12/config.json
Model weights saved in Emotion_12/pytorch_model.bin


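(Optional) If you've already fine-tuned and saved the model, you can reload it with `DistilBertForSequenceClassification.from_pretrained(cached_model_directory_name)`, as done in the reload cell later in this notebook. You don't have to run fine-tuning every time you want to evaluate.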

<br><br>

## **Evaluate fine-tuned model**

The following method of the `Trainer` object will run the built-in evaluation, including our `compute_metrics` function.

In [31]:
trainer.evaluate()

***** Running Evaluation *****
  Num examples = 5138
  Batch size = 32


{'epoch': 4.0,
 'eval_accuracy': 0.5720124562086415,
 'eval_loss': 1.172942876815796,
 'eval_runtime': 103.1286,
 'eval_samples_per_second': 49.821,
 'eval_steps_per_second': 1.561}

But we might want to do more fine-grained analysis of the model, so we extract the predicted labels.

In [32]:
predicted_results = trainer.predict(test_dataset)

***** Running Prediction *****
  Num examples = 5138
  Batch size = 32


In [33]:
predicted_results.predictions.shape

(5138, 7)

In [34]:
predicted_labels = predicted_results.predictions.argmax(-1) # Get the highest probability prediction
predicted_labels = predicted_labels.flatten().tolist()      # Flatten the predictions into a 1D list
predicted_labels = [id2label[l] for l in predicted_labels]  # Convert from integers back to strings for readability
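
A note on the `argmax` step: for a sequence classification model, the `predictions` array should hold raw scores (logits), one per class, rather than probabilities. Taking the `argmax` of the logits picks the same class as taking the `argmax` of the probabilities, because softmax preserves the ordering. Here is a minimal sketch with made-up logits; the `softmax` helper is written out by hand for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert each row of logits into probabilities that sum to 1."""
    # Subtracting the row maximum avoids overflow without changing the result.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

toy_logits = np.array([[1.2, -0.3, 0.1],
                       [0.0, 2.0, 1.0]])  # made-up scores: 2 examples, 3 classes
probs = softmax(toy_logits)

print(probs.round(3))
print(probs.argmax(-1), toy_logits.argmax(-1))  # same classes either way
```

Converting to probabilities is only needed if you want a confidence score alongside each predicted label.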

In [35]:
len(predicted_labels)

5138

In [36]:
print(classification_report(test_labels, 
                            predicted_labels))

              precision    recall  f1-score   support

       Anger       0.58      0.62      0.60       801
     Disgust       0.41      0.37      0.39       427
        Fear       0.49      0.57      0.53       681
         Joy       0.66      0.70      0.68      1157
     Neutral       0.70      0.69      0.70       708
     Sadness       0.52      0.60      0.56       826
    Surprise       0.45      0.19      0.27       538

    accuracy                           0.57      5138
   macro avg       0.54      0.53      0.53      5138
weighted avg       0.57      0.57      0.56      5138
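
To see where the last two rows come from, we can recompute them from the per-class F1 scores and supports in the report above. This is a small arithmetic check; because it starts from values already rounded to two decimals, it only reproduces the report to that precision.

```python
# (f1-score, support) per class, copied from the report above.
report = {
    'Anger':    (0.60, 801),
    'Disgust':  (0.39, 427),
    'Fear':     (0.53, 681),
    'Joy':      (0.68, 1157),
    'Neutral':  (0.70, 708),
    'Sadness':  (0.56, 826),
    'Surprise': (0.27, 538),
}

total_support = sum(n for _, n in report.values())

# Macro average: plain mean over classes, so rare classes count as much as common ones.
macro_f1 = sum(f1 for f1, _ in report.values()) / len(report)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * n for f1, n in report.values()) / total_support

print(total_support, round(macro_f1, 2), round(weighted_f1, 2))  # 5138 0.53 0.56
```

The macro average is lower because weak classes such as `Surprise` and `Disgust` pull it down as much as the larger, better-predicted classes pull it up.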



In [37]:
# Overall, together with 13, this is the best model for strategy 1

In [38]:
#RELOAD THE MODEL

In [39]:
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification
tok = DistilBertTokenizerFast.from_pretrained(model_name) # The model_name needs to match our pre-trained model.
mod = DistilBertForSequenceClassification.from_pretrained(cached_model_directory_name)

loading file https://huggingface.co/distilbert-base-multilingual-cased/resolve/main/vocab.txt from cache at /root/.cache/huggingface/transformers/28e5b750bf4f39cc620367720e105de1501cf36ec4ca7029eba82c1d2cc47caf.6c5b6600e968f4b5e08c86d8891ea99e51537fc2bf251435fb46922e8f7a7b29
loading file https://huggingface.co/distilbert-base-multilingual-cased/resolve/main/tokenizer.json from cache at /root/.cache/huggingface/transformers/5cbdf121f196be5f1016cb102b197b0c34009e1e658f513515f2eebef9f38093.b33e51591f94f17c238ee9b1fac75b96ff2678cbaed6e108feadb3449d18dc24
loading file https://huggingface.co/distilbert-base-multilingual-cased/resolve/main/added_tokens.json from cache at None
loading file https://huggingface.co/distilbert-base-multilingual-cased/resolve/main/special_tokens_map.json from cache at None
loading file https://huggingface.co/distilbert-base-multilingual-cased/resolve/main/tokenizer_config.json from cache at /root/.cache/huggingface/transformers/47087d99feeb3bc6184d7576ff089c52f7fbe

In [40]:
# Evaluation by domain

In [41]:
def prediction_test(dataset, modello):
    # `dataset` is the DataFrame (or slice) we want to evaluate; `modello` is the fine-tuned model to use.
    # A separate function for predicting without labels will come later.
    tok = DistilBertTokenizerFast.from_pretrained('distilbert-base-multilingual-cased')
    test_texts = dataset['Sentence']
    test_labels = dataset['Emotions']

    # Rebuild the label maps from the training labels (a global defined above) so that,
    # within the same session, the integer ids match the ones used during fine-tuning.
    unique_labels = set(label for label in train_labels)
    label2id = {label: id for id, label in enumerate(unique_labels)}
    id2label = {id: label for label, id in label2id.items()}

    test_encodings = tok(test_texts.tolist(), truncation=True, padding=True, max_length=max_length)
    test_labels_encoded = [label2id[y] for y in test_labels.tolist()]

    class MyDataset(torch.utils.data.Dataset):
        def __init__(self, encodings, labels):
            self.encodings = encodings
            self.labels = labels

        def __getitem__(self, idx):
            item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
            item['labels'] = torch.tensor(self.labels[idx])
            return item

        def __len__(self):
            return len(self.labels)

    test_dataset = MyDataset(test_encodings, test_labels_encoded)

    training_args = TrainingArguments(
        num_train_epochs=4,              # total number of training epochs
        per_device_train_batch_size=32,  # batch size per device during training
        per_device_eval_batch_size=32,   # batch size for evaluation
        learning_rate=2e-5,              # initial learning rate for the optimizer
        warmup_steps=100,                # number of warmup steps for learning rate scheduler
        weight_decay=0.01,               # strength of weight decay
        output_dir='./results',          # output directory
        logging_dir='./logs',            # directory for storing logs
        logging_steps=500,               # print logging output every 500 steps
        evaluation_strategy='steps',     # evaluate during fine-tuning so that we can see progress
    )

    trainer = Trainer(model=modello, args=training_args)  # passing the model is enough for prediction

    predicted_results = trainer.predict(test_dataset)

    predicted_labels = predicted_results.predictions.argmax(-1) # Get the highest probability prediction
    predicted_labels = predicted_labels.flatten().tolist()      # Flatten the predictions into a 1D list
    predicted_labels = [id2label[l] for l in predicted_labels]  # Convert from integers back to strings for readability

    print(classification_report(test_labels, predicted_labels))

In [42]:
prediction_test(test[:2172],mod)   # table for the English sentences    #0.86

              precision    recall  f1-score   support

       Anger       0.69      0.71      0.70       326
     Disgust       0.00      0.00      0.00        17
        Fear       0.64      0.73      0.69       341
         Joy       0.79      0.86      0.82       619
     Neutral       0.74      0.64      0.68       436
     Sadness       0.68      0.77      0.72       348
    Surprise       0.60      0.04      0.07        85

    accuracy                           0.72      2172
   macro avg       0.59      0.54      0.53      2172
weighted avg       0.71      0.72      0.70      2172



  _warn_prf(average, modifier, msg_start, len(result))


In [None]:
# Best for strategy 1, since more room is given to the Italian data while the Korean data is reduced

In [43]:
prediction_test(test[2172:2938],mod)   # table for the Italian sentences (social media and hotel review domains)     #0.95

              precision    recall  f1-score   support

       Anger       0.46      0.44      0.45        54
     Disgust       0.71      0.10      0.18        48
        Fear       0.50      0.12      0.20        16
         Joy       0.70      0.57      0.63       151
     Neutral       0.67      0.79      0.73       272
     Sadness       0.60      0.77      0.67        83
    Surprise       0.55      0.60      0.57       142

    accuracy                           0.63       766
   macro avg       0.60      0.49      0.49       766
weighted avg       0.63      0.63      0.61       766



In [44]:
prediction_test(test[2938:],mod)   # table for the Korean sentences (movie review domain)     #0.81


              precision    recall  f1-score   support

       Anger       0.52      0.56      0.54       421
     Disgust       0.41      0.43      0.42       362
        Fear       0.34      0.42      0.38       324
         Joy       0.44      0.49      0.47       387
     Neutral       0.00      0.00      0.00         0
     Sadness       0.35      0.42      0.38       395
    Surprise       0.21      0.05      0.08       311

    accuracy                           0.41      2200
   macro avg       0.32      0.34      0.32      2200
weighted avg       0.39      0.41      0.39      2200
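
The three per-language reports can be tied back to the overall result with a little arithmetic: weighting each subset's accuracy by its number of test sentences should give back the overall accuracy. The sketch below uses the rounded accuracies from the three reports above, so it only matches to two decimals.

```python
# (accuracy, number of test sentences) for each language subset, from the reports above.
subsets = {
    'English': (0.72, 2172),
    'Italian': (0.63, 766),
    'Korean':  (0.41, 2200),
}

total = sum(n for _, n in subsets.values())
overall_accuracy = sum(acc * n for acc, n in subsets.values()) / total

print(total, round(overall_accuracy, 2))  # 5138 0.57
```

This agrees with the `eval_accuracy` of about 0.57 on the full test set, and it shows how strongly the large Korean subset pulls the overall figure down.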





In [None]:
# Overall, this model performs better than Emotion11
# Once again, fully leveling the data does not lead to great results

In [None]:
# Leveling the Korean data benefits Italian and English, at the expense of Korean

In [None]:
# This model is the winner for the first strategy, yet it does not deliver good results

In [None]:
# We get passable results for EN and IT, seriously inadequate results for KO, and inadequate results overall, on a par with 13