# **LLMs - Training and Evaluation**

## 1. **SETUP**

### Check the GPU status

In [None]:
!nvidia-smi
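
The same check can be made from Python, which is useful when a later cell should stop early on a CPU-only runtime. This is a minimal sketch using only the standard library; the helper name `gpu_summary` is hypothetical and not part of the project.

```python
import shutil
import subprocess


def gpu_summary():
    """Return one line per visible GPU, or an empty list if none is found."""
    # nvidia-smi is absent on CPU-only runtimes.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    # A non-zero exit code usually means the driver is not loaded.
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = gpu_summary()
print(gpus if gpus else "No GPU detected.")
```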

### Verify that no earlier copy of the project is left in the Colab working directory

In [None]:
%ls

### Remove any existing copy of the project directory, then clone the files needed for training and evaluation from GitHub

In [None]:
!test -d DLA_LLMSANALYSIS && rm -rf DLA_LLMSANALYSIS
!git clone https://github.com/wakaflocka17/DLA_LLMSANALYSIS.git
%cd DLA_LLMSANALYSIS

### Create a virtual environment with virtualenv

In [None]:
!pip install virtualenv
!python -m virtualenv venv
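# No activation step: each `!` command runs in its own subshell, so
# `source venv/bin/activate` would not persist. Later cells call venv/bin/pip and venv/bin/python directly.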

### Install the libraries listed in requirements.txt

In [None]:
!venv/bin/pip install -r requirements.txt
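
Because every `!` command runs in a separate subshell, the notebook relies on explicit paths such as `venv/bin/pip` rather than on an activated environment. The sketch below shows one way to resolve such a path and confirm the environment exists before running anything; `venv_executable` is a hypothetical helper, not part of the project.

```python
import os
from pathlib import Path


def venv_executable(venv_dir, name):
    """Return the path of an executable inside a virtual environment."""
    # POSIX layouts use bin/, Windows layouts use Scripts/ with an .exe suffix.
    if os.name == "nt":
        return Path(venv_dir) / "Scripts" / f"{name}.exe"
    return Path(venv_dir) / "bin" / name


python_path = venv_executable("venv", "python")
print(python_path, "exists" if python_path.exists() else "is missing")
```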

## 2. **HUGGING FACE LOGIN USING AN ACCESS TOKEN**

In [None]:
from huggingface_hub import notebook_login

notebook_login()

## 3. **MODEL TRAINING AND EVALUATION**

### 3.1 **Train & Validation encoder-only**: google-bert/bert-base-uncased

In [None]:
!venv/bin/python main.py --model_config_key bert_base_uncased --mode train

### 3.1 **Evaluation encoder-only pre-trained**: google-bert/bert-base-uncased

In [None]:
!venv/bin/python main.py --model_config_key bert_base_uncased --mode eval --eval_type pretrained --output_json_path "results/evaluation/pretrained/bert-base-uncased-imdb.json"

### 3.1 **Evaluation encoder-only fine-tuned**: google-bert/bert-base-uncased

In [None]:
!venv/bin/python main.py --model_config_key bert_base_uncased --mode eval --eval_type fine_tuned --output_json_path "results/evaluation/finetuned/bert-base-uncased-imdb.json"

### 3.2 **Train & Validation encoder-decoder**: facebook/bart-base

In [None]:
!venv/bin/python main.py --model_config_key bart_base --mode train

### 3.2 **Evaluation encoder-decoder pre-trained**: facebook/bart-base

In [None]:
!venv/bin/python main.py --model_config_key bart_base --mode eval --eval_type pretrained --output_json_path "results/evaluation/pretrained/bart-base-imdb.json"

### 3.2 **Evaluation encoder-decoder fine-tuned**: facebook/bart-base

In [None]:
!venv/bin/python main.py --model_config_key bart_base --mode eval --eval_type fine_tuned --output_json_path "results/evaluation/finetuned/bart-base-imdb.json"

### 3.3 **Train & Validation decoder-only**: EleutherAI/gpt-neo-2.7B

In [None]:
!venv/bin/python main.py --model_config_key gpt_neo_2_7b --mode train

### 3.3 **Evaluation decoder-only pre-trained**: EleutherAI/gpt-neo-2.7B

In [None]:
!venv/bin/python main.py --model_config_key gpt_neo_2_7b --mode eval --eval_type pretrained --output_json_path "results/evaluation/pretrained/gpt-neo-2.7b-imdb.json"

### 3.3 **Evaluation decoder-only fine-tuned**: EleutherAI/gpt-neo-2.7B

In [None]:
!venv/bin/python main.py --model_config_key gpt_neo_2_7b --mode eval --eval_type fine_tuned --output_json_path "results/evaluation/finetuned/gpt-neo-2.7b-imdb.json"
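
The nine cells in this section follow one pattern: for each model, one training run, then one evaluation of the pre-trained checkpoint and one of the fine-tuned checkpoint. The sketch below rebuilds those commands from the config keys and output paths used above. The names `MODELS`, `EVAL_DIRS`, and `build_commands` are hypothetical and only illustrate the pattern.

```python
MODELS = {
    # config key -> file stem used in the --output_json_path arguments above
    "bert_base_uncased": "bert-base-uncased-imdb",
    "bart_base": "bart-base-imdb",
    "gpt_neo_2_7b": "gpt-neo-2.7b-imdb",
}

# --eval_type value -> results subdirectory, as used in the cells above
EVAL_DIRS = {"pretrained": "pretrained", "fine_tuned": "finetuned"}


def build_commands(python="venv/bin/python"):
    """Return the train and eval commands for every model, in notebook order."""
    commands = []
    for key, stem in MODELS.items():
        commands.append([python, "main.py", "--model_config_key", key, "--mode", "train"])
        for eval_type, subdir in EVAL_DIRS.items():
            commands.append([
                python, "main.py",
                "--model_config_key", key,
                "--mode", "eval",
                "--eval_type", eval_type,
                "--output_json_path", f"results/evaluation/{subdir}/{stem}.json",
            ])
    return commands


for cmd in build_commands():
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to run the whole sequence from a single cell and stop at the first failure.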

## 4. **UPLOADING ALL MODELS TO THE HUGGING FACE REPOSITORY**

In [None]:
!venv/bin/python src/upload_models.py --only bert-base-uncased-imdb

In [None]:
!venv/bin/python src/upload_models.py --only bart-base-imdb

In [None]:
!venv/bin/python src/upload_models.py --only gpt-neo-2.7B-imdb

## 5. **RESULT AGGREGATION AND ANALYSIS**

In [None]:
!venv/bin/python aggregate_results.py \
  --input_dir results \
  --output_file results_aggregati.json
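
To inspect what the aggregation step has to work with, the evaluation files can be gathered by hand. The sketch below is not the logic of `aggregate_results.py`, whose behavior is not shown here; it assumes only that the results are JSON files somewhere under the input directory, and `collect_json` is a hypothetical helper.

```python
import json
from pathlib import Path


def collect_json(input_dir):
    """Map each JSON file under input_dir (recursively) to its parsed content."""
    root = Path(input_dir)
    collected = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            # Key by POSIX-style relative path so each entry stays traceable to its file.
            collected[path.relative_to(root).as_posix()] = json.load(handle)
    return collected


if Path("results").is_dir():
    print(list(collect_json("results")))
```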

In [None]:
!venv/bin/python plot_results.py --results_file "results_aggregati.json" --output_dir "plots"

In [None]:
!venv/bin/python ensemble_analysis.py --ensemble_file results/ensemble/majority-voting-imdb.json --output_dir plots
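
The ensemble file name refers to majority voting, where each model casts one vote per example and the most frequent label wins. The sketch below illustrates the idea on made-up binary sentiment labels; it is not the project's ensemble code, and `majority_vote` and the sample predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per example by majority."""
    ensemble = []
    # zip(*...) groups the predictions that all models made for the same example.
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the most frequent label; ties go to the label seen first.
        ensemble.append(Counter(votes).most_common(1)[0][0])
    return ensemble


# Made-up predictions for four reviews (1 = positive, 0 = negative).
bert = [1, 0, 1, 1]
bart = [1, 1, 0, 1]
gpt_neo = [0, 0, 1, 1]
print(majority_vote([bert, bart, gpt_neo]))  # [1, 0, 1, 1]
```

With three models and two labels a tie cannot occur, so the tie-breaking rule only matters for an even number of voters or more than two classes.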