<details><summary style="display:list-item; font-size:16px; color:blue;">Jupyter Help</summary>
    
Having trouble testing your work? Double-check that you have followed the steps below to write, run, save, and test your code!
    
[Click here for a walkthrough GIF of the steps below](https://static-assets.codecademy.com/Courses/ds-python/jupyter-help.gif)

Run all initial cells to import libraries and datasets. Then follow these steps for each question:
    
1. Add your solution to the cell with `## YOUR SOLUTION HERE ##`.
2. Run the cell by selecting the `Run` button or the `Shift`+`Enter` keys.
3. Save your work by selecting the `Save` button, the `command`+`s` keys (Mac), or `control`+`s` keys (Windows).
4. Select the `Test Work` button at the bottom left to test your work.

![Screenshot of the buttons at the top of a Jupyter Notebook. The Run and Save buttons are highlighted](https://static-assets.codecademy.com/Paths/ds-python/jupyter-buttons.png)

</details>

**Setup**

Run the cell below to import the libraries and set up our finetuning run, reusing the code from our two previous finetunes.

We're keeping the training arguments identical to those used for LoRA and the full finetune so that the results are comparable.

In [None]:
import torch
import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments, BitsAndBytesConfig
from datasets import Dataset
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training, LoftQConfig
import random

random.seed(42)
torch.manual_seed(42)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

data = pd.read_csv('imdb_data.csv')

training_set = Dataset.from_pandas(data[data['dataset'] == 'train'])
test_set = Dataset.from_pandas(data[data['dataset'] == 'test'])

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased") # this tokenizer will work for our smaller model

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_training_set = training_set.map(tokenize_function, batched=True)
tokenized_test_set = test_set.map(tokenize_function, batched=True)

training_args = TrainingArguments(
    output_dir="./temp_results",
    num_train_epochs=3,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=12,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    learning_rate=1e-4,
    logging_steps=10,
    save_strategy="no",
)
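
The `warmup_steps=500` and `learning_rate=1e-4` settings above interact through the learning rate schedule. The sketch below assumes the `Trainer`'s default linear schedule (linear warmup followed by linear decay). The `total_steps` value is hypothetical, since the real number depends on the dataset size, batch size, and number of epochs.

```python
def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=500, total_steps=6000):
    """Learning rate at `step` under linear warmup followed by linear decay.

    total_steps=6000 is a made-up value used only for illustration.
    """
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 250, 500, 3250, 6000):
    print(step, linear_schedule_lr(step))
```

The rate peaks at `learning_rate` exactly when warmup ends, so a run with fewer than 500 total steps would never reach the configured value.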

#### Checkpoint 1/4
Now, on to the components unique to quantization.

There are several options for how many bits to quantize the model to, with 8 bits being a common choice. However, to keep things simple, we'll quantize our BERT model to 4 bits.

First, let's fill in the configuration for our quantization library, `BitsAndBytes`.
We will be using the following parameters:
 
`load_in_4bit`: This parameter, when set to `True`, will load the model in 4-bit precision.
 
`bnb_4bit_quant_type`: This parameter specifies the type of quantization. We will use `"nf4"`, which stands for 4-bit NormalFloat, a data type designed for weights that are roughly normally distributed.
  
`bnb_4bit_use_double_quant`: This parameter, when set to `True`, will use double quantization. Double quantization also quantizes the quantization constants themselves, which saves a little more memory per parameter.
  
`bnb_4bit_compute_dtype`: This parameter specifies the data type used for computation. We will use `torch.bfloat16`, a 16-bit floating-point format. The weights are stored in 4 bits and temporarily dequantized to this precision whenever they are used in a computation.
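
To make the idea behind these parameters concrete, here is a minimal pure-Python sketch of blockwise 4-bit quantization. It is not how `BitsAndBytes` is implemented: the helper names are hypothetical, and the 16 levels are evenly spaced for simplicity, whereas real NF4 places its levels at quantiles of a normal distribution.

```python
# 16 evenly spaced levels in [-1, 1]: one for each possible 4-bit value.
# Assumption for illustration only; NF4 uses normal-distribution quantiles.
LEVELS = [-1 + 2 * i / 15 for i in range(16)]


def quantize_block(weights):
    """Return (4-bit indices, scale) for one block of float weights."""
    scale = max(abs(w) for w in weights) or 1.0  # avoid dividing by zero
    indices = [
        min(range(16), key=lambda i: abs(LEVELS[i] - w / scale))
        for w in weights
    ]
    return indices, scale


def dequantize_block(indices, scale):
    """Recover approximate float weights, as is done before computation."""
    return [LEVELS[i] * scale for i in indices]


block = [0.12, -0.5, 0.03, 0.31, -0.27, 0.0]
indices, scale = quantize_block(block)
restored = dequantize_block(indices, scale)
print(indices)
print([round(w, 3) for w in restored])
```

Each weight is stored as an index from 0 to 15 plus one shared `scale` per block. That per-block constant is the value that double quantization compresses further.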
 
Fill in these parameters in the `BitsAndBytesConfig` in the cell below.

Don't forget to run the cell and save the notebook before selecting `Test Work`! Open the `Jupyter Help` toggle at the top of the notebook for more details.

In [None]:

config = BitsAndBytesConfig(
## YOUR SOLUTION HERE ##
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)
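
As a rough check on what these settings buy us, the arithmetic below estimates bits per parameter with and without double quantization. The block sizes (64 and 256) and constant widths (32-bit and 8-bit) are the values reported in the QLoRA paper and are assumptions here. The 7-billion-parameter model is hypothetical.

```python
def constant_overhead_bits(block_size=64, double_quant=False,
                           second_block_size=256):
    """Extra bits per parameter spent on storing quantization constants."""
    if not double_quant:
        # One 32-bit constant shared by every block of weights.
        return 32 / block_size
    # 8-bit constants per block, plus 32-bit constants for blocks of constants.
    return 8 / block_size + 32 / (block_size * second_block_size)


def weight_memory_mb(num_params, bits_per_param):
    """Approximate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6


num_params = 7_000_000_000  # hypothetical model size, for illustration only
for dq in (False, True):
    bits = 4 + constant_overhead_bits(double_quant=dq)
    print(f"double_quant={dq}: {bits:.3f} bits/param, "
          f"{weight_memory_mb(num_params, bits):,.0f} MB")
```

The saving per parameter is small, but it adds up on models with billions of parameters. On a model as small as ours it is barely noticeable.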

#### Checkpoint 2/4

Now that we have our quantization configuration set up with BitsAndBytes, it's time to prepare our quantized model for finetuning with QLoRA.

This time, when we instantiate our BERT model, we'll pass in a `quantization_config` argument that specifies the quantization configuration we defined earlier (the `config` in the previous cell).

Hugging Face's PEFT library also supports LoftQ, a quantization technique designed with LoRA finetuning in mind: it initializes the LoRA adapters to compensate for the error that quantization introduces.

We'll define a `LoftQConfig` that specifies that we want to quantize our model to 4 bits and assign it to `loftq_config`. We'll do so by passing `loftq_bits=4` to the `LoftQConfig()` constructor.

The `prepare_model_for_kbit_training` function we imported during setup prepares our quantized model for training. `kbit` here just refers to a model quantized to some number of bits (k). To prepare the model, simply pass `model` to `prepare_model_for_kbit_training()`.

Pass the quantization config argument to the model instantiation, initialize the LoftQConfig, and prepare the model for kbit training in the lines below.

Don't forget to run the cell and save the notebook before selecting `Test Work`! Open the `Jupyter Help` toggle at the top of the notebook for more details.

In [None]:
model = AutoModelForSequenceClassification.from_pretrained("prajjwal1/bert-tiny", num_labels=2, quantization_config=config) 
## YOUR SOLUTION HERE ##
loftq_config = LoftQConfig(loftq_bits=4)
model = prepare_model_for_kbit_training(model)

#### Checkpoint 3/4

We've mostly finished adapting our code to run QLoRA. The final remaining difference is that we pass our `loftq_config` as a named argument to the `LoraConfig`.

Let's also take the opportunity to practice setting up our LoRA hyperparameters once again.

In the `LoraConfig()` call below, pass `loftq_config` to the named argument `loftq_config`, then set up our LoRA with a rank of 16, an alpha of 32, and a dropout of 0.1.

Don't forget to run the cell and save the notebook before selecting `Test Work`! Open the `Jupyter Help` toggle at the top of the notebook for more details.

In [None]:
peft_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
## YOUR SOLUTION HERE ##
    loftq_config=loftq_config,
    r=16,  # Rank of low-rank matrices
    lora_alpha=32,  # Scaling factor; the LoRA update is scaled by lora_alpha / r
    lora_dropout=0.1  # Dropout rate, helps prevent overfitting
)
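
To see how the rank and alpha interact, here is a small pure-Python sketch of the LoRA update, where the effective weight is `W + (lora_alpha / r) * B @ A`. The helper names and the tiny matrices are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(W, A, B, lora_alpha, r):
    """Frozen weight W plus the scaled low-rank update B @ A."""
    scaling = lora_alpha / r
    delta = matmul(B, A)  # (d_out x r) @ (r x d_in) -> (d_out x d_in)
    return [[w + scaling * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 base weight
A = [[1.0, 2.0]]              # r x d_in, with r = 1
B = [[0.5], [0.0]]            # d_out x r
print(lora_effective_weight(W, A, B, lora_alpha=2, r=1))

# With the values used in this exercise, the update is scaled by:
print(32 / 16)
```

Only `A` and `B` receive gradients during training. `W` stays frozen, which is why it can be kept in 4-bit form.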

Execute the cell below to instantiate the QLoRA model and print the trainable parameters.

In [None]:
qlora_model = get_peft_model(model, peft_config)

def print_trainable_parameters(model):
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )

print_trainable_parameters(qlora_model)

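Quantization shrinks the memory needed to store the frozen base model, while LoRA restricts training to the small adapter matrices, so QLoRA trains an incredibly small number of parameters. Because 4-bit weights are stored packed, the `all params` count printed above may also be lower than the original model's parameter count. While this may result in disappointing performance on our IMDB BERT experiment, this exact technique can be used on the latest LLMs with great results.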
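
The number of LoRA parameters can be estimated by hand: each adapted weight matrix of shape `d_out x d_in` gains `r * (d_in + d_out)` trainable values. The sketch below assumes a hidden size of 128, 2 layers, and 2 adapted matrices per layer (query and value). These are assumptions about `bert-tiny` and PEFT's defaults, so the printed notebook output may differ, for example because the classification head is also trained.

```python
def lora_params(d_in, d_out, r):
    """Trainable values added by one LoRA adapter: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


def trainable_percent(trainable, total):
    """Share of parameters that receive gradients, as a percentage."""
    return 100 * trainable / total


hidden = 128           # assumed hidden size
layers = 2             # assumed number of transformer layers
adapted_per_layer = 2  # assumed: query and value projections

lora_total = layers * adapted_per_layer * lora_params(hidden, hidden, r=16)
full_total = layers * adapted_per_layer * hidden * hidden

print("LoRA parameters:", lora_total)
print("Parameters in the adapted matrices:", full_total)
print("Ratio (%):", trainable_percent(lora_total, full_total))
```

On a model this small, a rank of 16 is a sizeable fraction of the hidden size, so the savings are modest. The ratio falls sharply as the hidden size grows, which is where QLoRA pays off.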

#### Checkpoint 4/4
Alright, let's get our QLoRA model trained!

To start, set up our trainer by passing it the `qlora_model` as `model`, with the `training_args` as `args`, the `tokenized_training_set` as `train_dataset`, and the `tokenized_test_set` as `eval_dataset`.

Then, under the second `## YOUR SOLUTION HERE ##` comment, call `train()` and then `evaluate()` on the `trainer` using the syntax we've already covered. If you've forgotten what that looks like, check the hint.

Don't forget to run the cell and save the notebook before selecting `Test Work`! Open the `Jupyter Help` toggle at the top of the notebook for more details.

In [None]:
trainer = Trainer(
## YOUR SOLUTION HERE ##
    model=qlora_model,
    args=training_args,
    train_dataset=tokenized_training_set,
    eval_dataset=tokenized_test_set
)

## YOUR SOLUTION HERE ##
trainer.train()
trainer.evaluate()
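
Because no `compute_metrics` argument is passed to the `Trainer` above, `trainer.evaluate()` typically reports the evaluation loss and runtime statistics rather than accuracy. Below is a minimal sketch of an accuracy function that could be passed as `Trainer(compute_metrics=compute_metrics, ...)`. The `FakeEvalPrediction` tuple is a hypothetical stand-in used only to try the function without running a model.

```python
from collections import namedtuple


def compute_metrics(eval_pred):
    """Accuracy from an object exposing `predictions` (logits) and `label_ids`."""
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    # The predicted class is the index of the largest logit in each row.
    predicted = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predicted, labels))
    return {"accuracy": correct / len(predicted)}


# Hypothetical stand-in for the object the Trainer passes to compute_metrics.
FakeEvalPrediction = namedtuple("FakeEvalPrediction",
                                ["predictions", "label_ids"])
example = FakeEvalPrediction(
    predictions=[[0.2, 1.5], [2.0, -1.0], [0.1, 0.3]],
    label_ids=[1, 0, 0],
)
print(compute_metrics(example))
```

Reporting accuracy alongside the loss makes the QLoRA run easier to compare against the LoRA and full finetune results.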
