# **LLMs Analysis on IMDb Dataset**

## 1. **SETUP**

### We check the GPU status

In [1]:
!nvidia-smi

### We verify that no earlier copy of our project is left in the Colab working directory.

In [1]:
%ls

sample_data/


### We remove any existing copy of the project directory, then clone the files needed for training and evaluation from GitHub.

In [2]:
!test -d DLA_LLMSANALYSIS && rm -rf DLA_LLMSANALYSIS
!git clone https://github.com/wakaflocka17/DLA_LLMSANALYSIS.git
%cd DLA_LLMSANALYSIS

Cloning into 'DLA_LLMSANALYSIS'...
remote: Enumerating objects: 391, done.
remote: Counting objects: 100% (90/90), done.
remote: Compressing objects: 100% (54/54), done.
remote: Total 391 (delta 66), reused 57 (delta 36), pack-reused 301 (from 1)
Receiving objects: 100% (391/391), 82.22 KiB | 20.56 MiB/s, done.
Resolving deltas: 100% (284/284), done.
/content/DLA_LLMSANALYSIS


### We now create our virtual environment with virtualenv.

In [3]:
!pip install virtualenv
!python -m virtualenv venv
!source venv/bin/activate

Collecting virtualenv
  Downloading virtualenv-20.30.0-py3-none-any.whl.metadata (4.5 kB)
Collecting distlib<1,>=0.3.7 (from virtualenv)
  Downloading distlib-0.3.9-py2.py3-none-any.whl.metadata (5.2 kB)
Downloading virtualenv-20.30.0-py3-none-any.whl (4.3 MB)
Downloading distlib-0.3.9-py2.py3-none-any.whl (468 kB)
Installing collected packages: distlib, virtualenv
Successfully installed distlib-
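
Each `!` command in a notebook runs in its own shell process, so the effect of `source venv/bin/activate` ends with that line; this is why the later cells call `venv/bin/pip` and `venv/bin/python` explicitly. A minimal sketch of that behaviour, assuming a POSIX `sh` is available and using a hypothetical variable name `DLA_DEMO_VAR`:

```python
import os
import subprocess

# Start from the current environment, minus the demo variable (hypothetical name).
base_env = {k: v for k, v in os.environ.items() if k != "DLA_DEMO_VAR"}


def run_in_fresh_shell(command: str) -> str:
    """Run a command in a new shell process and return its trimmed stdout."""
    result = subprocess.run(
        ["sh", "-c", command],
        env=base_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The variable is visible inside the shell that sets it...
inside = run_in_fresh_shell("export DLA_DEMO_VAR=active; echo $DLA_DEMO_VAR")
# ...but a second, separate shell starts without it.
after = run_in_fresh_shell('echo "${DLA_DEMO_VAR:-unset}"')

print(inside, after)
```

The same reasoning applies to an activated environment: the activation only changes variables in the shell that ran it.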

### We install all the libraries listed in our requirements.txt file.

In [5]:
!venv/bin/pip install -r requirements.txt

Collecting numpy<2.0.0 (from -r requirements.txt (line 1))
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Collecting scikit-learn==1.2.2 (from -r requirements.txt (line 2))
  Downloading scikit_learn-1.2.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Collecting transformers==4.28.0 (from -r requirements.txt (line 3))
  Downloading transformers-4.28.0-py3-none-any.whl.metadata (109 kB)
Collecting datasets==2.10.0 (from -r requirements.txt (line 4))
  Downloading datasets-2.10.0-py3-none-any.whl.metadata (20 kB)
Collecting torch==2.0.0 (from -r requirements.txt (line 5))
  Downloading torch-2.0.0-cp311-cp311-manylinux1_x86_64.whl.metadata (24 kB)
Collecting tensorflow==2.12.0 (from -r requirements.txt (line 6))
  Downloading tensorflow-2.12.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.4 kB)
Collecting fsspec==2023.1.0 (from -r requirements.txt (line 7))
  Downloading fsspec-

## 2. **HUGGING FACE LOGIN USING AN ACCOUNT TOKEN**

In [7]:
from huggingface_hub import notebook_login

notebook_login()


## 3. **TRAINING AND EVALUATION MODELS**

### 3.1 **Train & Validation encoder-only**: google-bert/bert-base-uncased

In [9]:
!venv/bin/python main.py --model_config_key bert_base_uncased --mode train

INFO:__main__:Using the default configuration: {'model_name': 'google-bert/bert-base-uncased', 'epochs': 5, 'train_batch_size': 8, 'eval_batch_size': 4, 'learning_rate': 2e-05, 'repo_pretrained': 'models/pretrained/bert-base-uncased', 'repo_finetuned': 'models/finetuned/bert-base-uncased-imdb'}
INFO:__main__:Models will be saved in: models/bert_base_uncased
2025-04-16 16:07:03.778649: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 16:07:03.835406: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Down

### 3.1 **Evaluation encoder-only pre-trained**: google-bert/bert-base-uncased

In [10]:
!venv/bin/python main.py --model_config_key bert_base_uncased --mode eval --eval_type pretrained --output_json_path "results/evaluation/pretrained/bert-base-uncased-imdb.json"

INFO:__main__:Using the default configuration: {'model_name': 'google-bert/bert-base-uncased', 'epochs': 5, 'train_batch_size': 8, 'eval_batch_size': 4, 'learning_rate': 2e-05, 'repo_pretrained': 'models/pretrained/bert-base-uncased', 'repo_finetuned': 'models/finetuned/bert-base-uncased-imdb'}
INFO:__main__:Models will be saved in: models/bert_base_uncased
2025-04-16 16:24:57.783420: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 16:24:57.842510: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Some

### 3.1 **Evaluation encoder-only fine-tuned**: google-bert/bert-base-uncased

In [11]:
!venv/bin/python main.py --model_config_key bert_base_uncased --mode eval --eval_type fine_tuned --output_json_path "results/evaluation/finetuned/bert-base-uncased-imdb.json"

INFO:__main__:Using the default configuration: {'model_name': 'google-bert/bert-base-uncased', 'epochs': 5, 'train_batch_size': 8, 'eval_batch_size': 4, 'learning_rate': 2e-05, 'repo_pretrained': 'models/pretrained/bert-base-uncased', 'repo_finetuned': 'models/finetuned/bert-base-uncased-imdb'}
INFO:__main__:Models will be saved in: models/bert_base_uncased
2025-04-16 16:28:57.770326: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 16:28:57.830208: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Some

### 3.2 **Train & Validation encoder-decoder**: facebook/bart-base

In [12]:
!venv/bin/python main.py --model_config_key bart_base --mode train

INFO:__main__:Using the default configuration: {'model_name': 'facebook/bart-base', 'epochs': 5, 'train_batch_size': 8, 'eval_batch_size': 4, 'learning_rate': 2e-05, 'repo_pretrained': 'models/pretrained/bart-base', 'repo_finetuned': 'models/finetuned/bart-base-imdb'}
INFO:__main__:Models will be saved in: models/bart_base
2025-04-16 16:32:57.259728: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 16:32:57.319807: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Downloading vocab.json: 100% 899k/899k 

### 3.2 **Evaluation encoder-decoder pre-trained**: facebook/bart-base

In [13]:
!venv/bin/python main.py --model_config_key bart_base --mode eval --eval_type pretrained --output_json_path "results/evaluation/pretrained/bart-base-imdb.json"

INFO:__main__:Using the default configuration: {'model_name': 'facebook/bart-base', 'epochs': 5, 'train_batch_size': 8, 'eval_batch_size': 4, 'learning_rate': 2e-05, 'repo_pretrained': 'models/pretrained/bart-base', 'repo_finetuned': 'models/finetuned/bart-base-imdb'}
INFO:__main__:Models will be saved in: models/bart_base
2025-04-16 16:53:06.586690: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 16:53:06.647920: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Some weights of BartForSequenceClassifi

### 3.2 **Evaluation encoder-decoder fine-tuned**: facebook/bart-base

In [14]:
!venv/bin/python main.py --model_config_key bart_base --mode eval --eval_type fine_tuned --output_json_path "results/evaluation/finetuned/bart-base-imdb.json"

INFO:__main__:Using the default configuration: {'model_name': 'facebook/bart-base', 'epochs': 5, 'train_batch_size': 8, 'eval_batch_size': 4, 'learning_rate': 2e-05, 'repo_pretrained': 'models/pretrained/bart-base', 'repo_finetuned': 'models/finetuned/bart-base-imdb'}
INFO:__main__:Models will be saved in: models/bart_base
2025-04-16 17:02:27.095064: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 17:02:27.156016: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Some weights of BartForSequenceClassifi

### 3.3 **Train & Validation decoder-only**: EleutherAI/gpt-neo-2.7B

In [19]:
!venv/bin/python main.py --model_config_key gpt_neo_2_7b --mode train

INFO:__main__:Using the default configuration: {'model_name': 'EleutherAI/gpt-neo-2.7B', 'epochs': 3, 'train_batch_size': 1, 'eval_batch_size': 1, 'learning_rate': 0.0005, 'gradient_accumulation_steps': 8, 'repo_pretrained': 'models/pretrained/gpt-neo-2.7B', 'repo_finetuned': 'models/finetuned/gpt-neo-2.7B-imdb'}
INFO:__main__:Models will be saved in: models/gpt_neo_2_7b
2025-04-16 17:51:04.049820: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 17:51:04.109433: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compil
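
The logged GPT-Neo configuration uses a `train_batch_size` of 1 with `gradient_accumulation_steps` of 8, whereas the BERT and BART configurations use a batch size of 8 and list no accumulation. Assuming gradients are accumulated in the usual way (one optimizer step per 8 micro-batches) on a single device, the effective batch size comes out the same. A small sketch of that arithmetic; the helper name is hypothetical and not part of the project:

```python
def effective_batch_size(
    per_device_batch_size: int,
    gradient_accumulation_steps: int = 1,
    num_devices: int = 1,
) -> int:
    """Number of examples that contribute to each optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values copied from the logged configurations; a single device is assumed.
bert_cfg = {"train_batch_size": 8}
gpt_neo_cfg = {"train_batch_size": 1, "gradient_accumulation_steps": 8}

bert_effective = effective_batch_size(bert_cfg["train_batch_size"])
gpt_neo_effective = effective_batch_size(
    gpt_neo_cfg["train_batch_size"],
    gpt_neo_cfg["gradient_accumulation_steps"],
)

print(bert_effective, gpt_neo_effective)
```

Accumulation trades wall-clock time for memory: only one example's activations are held at a time, which matters for a 2.7B-parameter model.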

### 3.3 **Evaluation decoder-only pre-trained**: EleutherAI/gpt-neo-2.7B

In [26]:
!venv/bin/python main.py --model_config_key gpt_neo_2_7b --mode eval --eval_type pretrained --output_json_path "results/evaluation/pretrained/gpt-neo-2.7b-imdb.json"

INFO:__main__:Using the default configuration: {'model_name': 'EleutherAI/gpt-neo-2.7B', 'epochs': 3, 'train_batch_size': 1, 'eval_batch_size': 1, 'learning_rate': 0.0005, 'gradient_accumulation_steps': 8, 'repo_pretrained': 'models/pretrained/gpt-neo-2.7B', 'repo_finetuned': 'models/finetuned/gpt-neo-2.7B-imdb'}
INFO:__main__:Models will be saved in: models/gpt_neo_2_7b
2025-04-16 22:10:52.248205: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 22:10:52.306812: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compil

### 3.3 **Evaluation decoder-only fine-tuned**: EleutherAI/gpt-neo-2.7B

In [27]:
!venv/bin/python main.py --model_config_key gpt_neo_2_7b --mode eval --eval_type fine_tuned --output_json_path "results/evaluation/finetuned/gpt-neo-2.7b-imdb.json"

INFO:__main__:Using the default configuration: {'model_name': 'EleutherAI/gpt-neo-2.7B', 'epochs': 3, 'train_batch_size': 1, 'eval_batch_size': 1, 'learning_rate': 0.0005, 'gradient_accumulation_steps': 8, 'repo_pretrained': 'models/pretrained/gpt-neo-2.7B', 'repo_finetuned': 'models/finetuned/gpt-neo-2.7B-imdb'}
INFO:__main__:Models will be saved in: models/gpt_neo_2_7b
2025-04-16 22:22:23.454327: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 22:22:23.515412: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compil

### 3.4 **Evaluation Ensemble (Majority Voting) fine-tuned**: encoder-only, encoder-decoder and decoder-only

In [32]:
!venv/bin/python main.py --model_config_key ensemble_majority_voting --mode eval --eval_type fine_tuned --output_json_path "results/evaluation/finetuned/ensemble-majority-voting-imdb.json"

INFO:__main__:Using the default configuration: {'model_names': ['bart_base', 'bert_base_uncased', 'gpt_neo_2_7b'], 'train_batch_size': 8, 'eval_batch_size': 4, 'epochs': 3, 'repo': 'models/ensemble/majority-voting-imdb'}
INFO:__main__:Models will be saved in: models/ensemble_majority_voting
2025-04-16 22:40:12.908177: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 22:40:12.968194: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Traceback (most recent call last):
  File "/content/DLA_LLMSANALYSIS/mai
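
Majority voting over three binary classifiers assigns each review the label predicted by at least two of the models. A minimal sketch of that rule; the function name and the toy predictions are illustrative and not taken from the project's code, whose implementation may differ:

```python
from collections import Counter
from typing import Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> list[int]:
    """Combine per-model label lists into one list by majority vote.

    predictions[m][i] is the label that model m assigns to example i.
    With an odd number of models and binary labels there are no ties;
    otherwise Counter.most_common keeps the first label seen among ties.
    """
    if not predictions:
        return []
    n_examples = len(predictions[0])
    if any(len(p) != n_examples for p in predictions):
        raise ValueError("all models must predict the same number of examples")
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Toy predictions (made up) for four reviews: 1 = positive, 0 = negative.
bert_preds = [1, 0, 1, 1]
bart_preds = [1, 1, 0, 1]
gpt_neo_preds = [0, 0, 1, 1]

ensemble_preds = majority_vote([bert_preds, bart_preds, gpt_neo_preds])
print(ensemble_preds)
```

Using three models keeps the vote free of ties on a two-class task such as IMDb sentiment.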

## 4. **UPLOADING ALL MODELS TO THE HUGGING FACE HUB**

In [25]:
!venv/bin/python src/upload_models.py

2025-04-16 22:00:22.417936: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-04-16 22:00:22.477982: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
🚀 Uploading: bert-base-uncased-imdb
✅ Copied evaluation file: results/evaluation/finetuned/bert-base-uncased-imdb.json
✅ Copied validation file: results/validation/finetuned/bert-base-uncased-imdb_metrics.json
INFO:root:Uploading ./models/finetuned/bert-base-uncased-imdb to Hugging Face as wakaflocka17/bert-imdb-finetuned...
pytorch_model.bin:   0% 0.00/

## 5. **AGGREGATE THE JSON RESULTS AND PLOT THEM**

In [None]:
!venv/bin/python aggregate_results.py \
    --input_dir results \
    --output_file results/aggregate_results.json
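
The aggregation script itself is not shown in this notebook. As a rough illustration of what such a step could look like, the sketch below collects every JSON file under a directory into one dictionary keyed by relative path and writes it out. The function name, the key layout, and the sample metric values are all assumptions made for the example, not the project's actual behaviour:

```python
import json
import tempfile
from pathlib import Path


def aggregate_json_results(input_dir: str, output_file: str) -> dict:
    """Merge all *.json files under input_dir into one dict and save it."""
    root = Path(input_dir)
    out_path = Path(output_file)
    merged = {}
    for path in sorted(root.rglob("*.json")):
        # Skip the output file in case it lives inside input_dir.
        if path.resolve() == out_path.resolve():
            continue
        with path.open(encoding="utf-8") as fh:
            merged[path.relative_to(root).as_posix()] = json.load(fh)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as fh:
        json.dump(merged, fh, indent=2)
    return merged


# Demo on a throwaway directory with made-up metric values.
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results"
    (results / "evaluation").mkdir(parents=True)
    (results / "evaluation" / "model_a.json").write_text(json.dumps({"accuracy": 0.5}))
    (results / "evaluation" / "model_b.json").write_text(json.dumps({"accuracy": 0.75}))
    demo = aggregate_json_results(str(results), str(Path(tmp) / "aggregated.json"))

print(sorted(demo))
```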

In [None]:
!venv/bin/python src/plot_results.py \
    --results_file "results/aggregate_results.json" \
    --output_dir "plots"


Traceback (most recent call last):
  File "/content/DLA_LLMSANALYSIS/src/plot_results.py", line 1, in <module>
    import matplotlib.pyplot as plt
  File "/content/DLA_LLMSANALYSIS/venv/lib/python3.11/site-packages/matplotlib/__init__.py", line 1296, in <module>
    rcParams['backend'] = os.environ.get('MPLBACKEND')
    ~~~~~~~~^^^^^^^^^^^
  File "/content/DLA_LLMSANALYSIS/venv/lib/python3.11/site-packages/matplotlib/__init__.py", line 771, in __setitem__
    raise ValueError(f"Key {key}: {ve}") from None
ValueError: Key backend: 'module://matplotlib_inline.backend_inline' is not a valid value for backend; supported values are ['gtk3agg', 'gtk3cairo', 'gtk4agg', 'gtk4cairo', 'macosx', 'nbagg', 'notebook', 'qtagg', 'qtcairo', 'qt5agg', 'qt5cairo', 'tkagg', 'tkcairo', 'webagg', 'wx', 'wxagg', 'wxcairo', 'agg', 'cairo', 'pdf', 'pgf', 'ps', 'svg', 'template']
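
The traceback shows that the `MPLBACKEND` environment variable holds `module://matplotlib_inline.backend_inline`, a value that the matplotlib installed in the virtual environment rejects. One possible workaround, not taken from the original notebook, is to override the variable with the non-interactive `Agg` backend (listed among the supported values above) for the child process only. A minimal sketch, with a hypothetical helper name:

```python
import os
import subprocess
import sys


def run_with_agg_backend(command: list[str]) -> subprocess.CompletedProcess:
    """Run a command with MPLBACKEND forced to 'Agg' in the child process only."""
    env = dict(os.environ, MPLBACKEND="Agg")
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Self-check that does not need matplotlib: the child process sees the override.
check = run_with_agg_backend(
    [sys.executable, "-c", "import os; print(os.environ['MPLBACKEND'])"]
)
print(check.stdout.strip())
```

In a notebook cell, prefixing the shell command with `MPLBACKEND=Agg` should have the same effect, since the plots are written to files rather than displayed inline.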


In [None]:
!venv/bin/python ensemble_analysis.py --ensemble_file results/ensemble/majority-voting-imdb.json --output_dir plots
