# Homework 1 Part 4

## Course Name: Large Language Models
#### Lecturers: Dr. Soleimani, Dr. Rohban, Dr. Asgari

---

#### Notebooks Supervised By: MohammadAli SadraeiJavaheri
#### Notebooks Prepared By: Zeinab Sadat Taghavi, Hamed Jamshidian, Seyed Mohammad Reza Modarres

**Contact**: Ask your questions in Quera

---

### Instructions:
- Complete all exercises presented in this notebook.
- Ensure you run each cell after you've entered your solution.
- After completing the exercises, save the notebook and <font color='red'>follow the submission guidelines provided in the PDF.</font>


---

**Note**: Replace the placeholders (between `## Your code begins ##` and `## Your code ends ##`) with the appropriate details.


## Introduction

In this notebook you have to use [Adapter Hub](https://docs.adapterhub.ml/overview.html) to define and train your adapters.

### Requirements

In [None]:
%%capture
! pip install datasets adapter-transformers

### Imports

In [None]:
from tqdm.notebook import tqdm
from IPython import display

import numpy as np
import pandas as pd
import math

from sklearn.metrics import accuracy_score

import torch
import torch.nn as nn

from datasets import load_dataset
from transformers import T5TokenizerFast, T5ForConditionalGeneration, DataCollatorForSeq2Seq
from transformers.models.t5.modeling_t5 import T5LayerFF

### Constants

We will use `t5-small` from Hugging Face ([HF_Link](https://huggingface.co/t5-small)) as our base model. The adapter bottleneck size is set later through `BOTTLENECK_SIZE` (`8` in the Model section and `1` in the Bottleneck effect section).

In [None]:
#####################################
###### DO NOT CHANGE THIS CELL ######
#####################################

BASE_MODEL_NAME = 't5-small'

BATCH_SIZE = 32
LEARNING_RATE = 1e-4
EPOCHS = 10

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
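
For a sense of scale, the constants above imply the number of optimizer steps computed below. This is only a rough sketch: it assumes the 25,000-example IMDB training split, a single device, and no gradient accumulation, none of which is fixed by this cell.

```python
import math

BATCH_SIZE = 32
EPOCHS = 10
TRAIN_EXAMPLES = 25_000  # assumed size of the IMDB train split

# The last, smaller batch still counts as one step, hence the ceiling.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
print(steps_per_epoch, total_steps)
```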

## Dataset

### Load dataset

The `imdb` dataset is a well-known NLP sentiment dataset. Each row is labeled either `negative` or `positive`.

In [None]:
dataset = load_dataset('imdb')
dataset.pop('unsupervised')
print(dataset)

### Define related functions

Because `T5` is a sequence-to-sequence model, we map the integer labels to label names before training and map the generated names back to integers when computing metrics.

The functions `id2label` and `label2id` are defined to do this.

In [None]:
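def id2label(ids):
    # Map integer class ids to the label strings T5 is trained to generate.
    label_names = ['negative', 'positive']
    return [label_names[i] for i in ids]

def label2id(labels):
    # Map label strings back to class ids; anything that is not a known
    # label name gets the out-of-range id 2, so it never matches a gold label.
    label_names_dict = {
        'negative': 0,
        'positive': 1
    }
    return [
        label_names_dict.get(label, 2)
        for label in labels
    ]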
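
As a quick sanity check, the round trip below maps ids to label names and back. The function bodies are repeated here only so the snippet runs on its own; the string `'neutral'` is a hypothetical stand-in for any generated text that is not a known label name, which falls back to id `2`.

```python
def id2label(ids):
    label_names = ['negative', 'positive']
    return [label_names[i] for i in ids]

def label2id(labels):
    label_names_dict = {'negative': 0, 'positive': 1}
    return [label_names_dict.get(label, 2) for label in labels]

names = id2label([0, 1, 1])        # ids -> label names
ids = label2id(names)              # label names -> ids (round trip)
unknown = label2id(['neutral'])    # hypothetical unexpected output
print(names, ids, unknown)
```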

## Tokenizer

### Load tokenizer

In [None]:
tokenizer = T5TokenizerFast.from_pretrained(BASE_MODEL_NAME)

### Process dataset using tokenizer

In this step we get our dataset ready for training.

We preprocess the `text` column, then tokenize both `text` and `label`.

In [None]:
def preprocess_input(text):
    text = text.lower()
    text = text.replace('<br />', ' ')
    return text

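def map_function(row):
    # `row` is a batch (batched=True), so `row['text']` is a list of reviews.
    processed_input = [
        preprocess_input(text)
        for text in row['text']
    ]
    # Truncate each review to at most 256 tokens.
    input_info = tokenizer(processed_input, truncation=True, max_length=256)
    # Tokenize the label names; their token ids become the decoder targets.
    output_info = tokenizer(id2label(row['label']))
    return {
        **input_info,
        'labels': output_info.input_ids
    }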


dataset = dataset.map(map_function, batched=True)
dataset.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])
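
To see what the text cleaning step does before tokenization, the snippet below applies `preprocess_input` to a made-up review (the sample string is an assumption, not taken from the dataset). The function is repeated so the snippet runs without the tokenizer.

```python
def preprocess_input(text):
    text = text.lower()
    text = text.replace('<br />', ' ')
    return text

# Hypothetical review containing the HTML line breaks found in raw IMDB text.
sample = 'A GREAT film.<br /><br />Loved it!'
cleaned = preprocess_input(sample)
print(repr(cleaned))
```

Each `<br />` tag becomes a single space, so two consecutive tags leave two spaces in the cleaned text.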

## Model

### Load model and create adapter

In the next cell, create a `PfeifferConfig` that takes `BOTTLENECK_SIZE` into account.

Complete the cell using the `add_adapter` and `train_adapter` methods.

Report the final performance on the test data.

Read [this page](https://docs.adapterhub.ml/training.html) for more information.
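
As far as the AdapterHub documentation describes it, bottleneck adapter configs do not take the bottleneck width directly; the width is derived from the model's hidden size divided by the config's `reduction_factor`. The sketch below only shows that arithmetic. It assumes `t5-small` has a hidden size (`d_model`) of 512, and `reduction_factor_for` is a hypothetical helper, not part of the library.

```python
D_MODEL = 512  # assumed hidden size of t5-small

def reduction_factor_for(bottleneck_size, hidden_size=D_MODEL):
    """Return the reduction factor that yields the given bottleneck width."""
    if hidden_size % bottleneck_size != 0:
        raise ValueError('hidden size should be divisible by bottleneck size')
    return hidden_size // bottleneck_size

factor_8 = reduction_factor_for(8)
factor_1 = reduction_factor_for(1)
print(factor_8, factor_1)
```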

In [None]:
from transformers import PfeifferConfig

BOTTLENECK_SIZE = 8

model = T5ForConditionalGeneration.from_pretrained(BASE_MODEL_NAME)

######### Your code begins #########
####################################
####################################

## Train and evaluate

Complete the next cell to train your adapter using `AdapterTrainer`.

In [None]:
from transformers import TrainingArguments, AdapterTrainer

col_fn = DataCollatorForSeq2Seq(
    tokenizer, return_tensors='pt', padding='longest',
)

training_args = TrainingArguments(
    './temp',
    learning_rate=LEARNING_RATE,
    num_train_epochs=EPOCHS,
    logging_strategy='epoch',
    evaluation_strategy='epoch',
    save_strategy='epoch',
    per_device_train_batch_size=BATCH_SIZE,
    per_device_eval_batch_size=BATCH_SIZE
)

######### Your code begins #########
####################################
####################################
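
Because the model generates text, accuracy has to be computed after mapping decoded strings back to ids. The following is one possible sketch of that step, written in plain Python so it runs without the model. `text_accuracy` is a hypothetical helper, and the prediction strings stand in for whatever `tokenizer.batch_decode` might return for generated outputs.

```python
def label2id(labels):
    label_names_dict = {'negative': 0, 'positive': 1}
    return [label_names_dict.get(label, 2) for label in labels]

def text_accuracy(pred_texts, gold_texts):
    """Fraction of decoded predictions whose label id matches the gold id."""
    pred_ids = label2id([t.strip() for t in pred_texts])
    gold_ids = label2id([t.strip() for t in gold_texts])
    correct = sum(p == g for p, g in zip(pred_ids, gold_ids))
    return correct / len(gold_ids)

# Made-up decoded outputs; 'positiv' is malformed and maps to id 2.
preds = ['positive', 'negative', 'positiv', 'negative']
golds = ['positive', 'negative', 'positive', 'positive']
acc = text_accuracy(preds, golds)
print(acc)
```

The same id lists could be passed to `accuracy_score` from `sklearn.metrics`, which is already imported in this notebook.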

## Bottleneck effect

Check the model's performance with `BOTTLENECK_SIZE = 1` and report the result.
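
To get a feel for how much capacity is removed by shrinking the bottleneck, the sketch below counts the weights in a single bottleneck adapter block. It assumes a down-projection and an up-projection, each with a bias, and a hidden size of 512 for `t5-small`. It ignores any library-specific extras, so treat the numbers as rough estimates; `adapter_param_count` is a hypothetical helper.

```python
def adapter_param_count(hidden_size, bottleneck_size):
    """Weights and biases of one down-projection plus one up-projection."""
    down = hidden_size * bottleneck_size + bottleneck_size
    up = bottleneck_size * hidden_size + hidden_size
    return down + up

params_8 = adapter_param_count(512, 8)
params_1 = adapter_param_count(512, 1)
print(params_8, params_1)
```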

In [None]:
from transformers import PfeifferConfig

BOTTLENECK_SIZE = 1

model = T5ForConditionalGeneration.from_pretrained(BASE_MODEL_NAME)

######### Your code begins #########
####################################
####################################