# Lightweight Fine-Tuning Project

The choices made for this project are:

* PEFT technique: QLoRA
* Model: mistralai/Mistral-7B-v0.1
* Evaluation approach: ROUGE score
* Fine-tuning dataset: codeparrot/github-code

In [2]:
!pip install -r requirements.txt



## Loading and Evaluating a Foundation Model

In the cells below, I load the pre-trained Hugging Face model and evaluate its performance prior to fine-tuning. This step includes loading an appropriate tokenizer and dataset.

In [3]:
from datasets import load_dataset

In [4]:
train_size=100_000

In [5]:
val_size=train_size//10

In [6]:
test_size=val_size

In [7]:
seed=42

In [8]:
ds = load_dataset(
    "codeparrot/github-code",
    streaming=True,
    trust_remote_code=True,
    split="train",
).shuffle(seed=seed, buffer_size=train_size + val_size + test_size)



In [9]:
train_ds=ds.take(train_size)

In [10]:
val_ds=ds.skip(train_size).take(val_size)

In [11]:
test_ds=ds.skip(train_size+val_size).take(test_size)
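
The `take`/`skip` calls above carve three non-overlapping windows out of one shuffled stream. A minimal sketch of the same arithmetic on a plain Python iterable follows, with `itertools.islice` standing in for the streaming dataset. The `take` and `skip` helpers, the `stream` function, and the toy sizes are illustrative assumptions, not the `datasets` API.

```python
from itertools import islice

def take(iterable, n):
    """First n items, analogous to IterableDataset.take(n)."""
    return islice(iterable, n)

def skip(iterable, n):
    """Everything after the first n items, analogous to IterableDataset.skip(n)."""
    return islice(iterable, n, None)

# Toy sizes with the same 10:1:1 ratio as the notebook.
toy_train_size = 10
toy_val_size = toy_train_size // 10
toy_test_size = toy_val_size

def stream():
    # Stand-in for the shuffled stream; re-created (replayed) for each split.
    return iter(range(100))

train = list(take(stream(), toy_train_size))
val = list(take(skip(stream(), toy_train_size), toy_val_size))
test = list(take(skip(stream(), toy_train_size + toy_val_size), toy_test_size))
```

The splits stay disjoint only if every pass replays the stream in the same order, which in the notebook depends on the fixed `seed`. Each split also re-reads the stream from the start, so the later splits pay for skipping the earlier ones.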

In [58]:
from evaluate import evaluator, load

In [14]:
metric_name="rouge"

In [15]:
metric=load(metric_name)

In [16]:
from transformers import AutoModelForCausalLM, AutoTokenizer

In [17]:
model_id = "mistralai/Mistral-7B-v0.1"

In [18]:
tokenizer = AutoTokenizer.from_pretrained(model_id)

In [28]:
if tokenizer.pad_token is None:
    # Reuse EOS for padding when the tokenizer defines no pad token.
    print("pad_token was None; falling back to eos_token")
    tokenizer.pad_token = tokenizer.eos_token

In [20]:
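from transformers import BitsAndBytesConfig  # passing load_in_4bit directly to from_pretrained is deprecated
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", quantization_config=BitsAndBytesConfig(load_in_4bit=True))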


In [21]:
model

MistralForCausalLM(
  (model): MistralModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x MistralDecoderLayer(
        (self_attn): MistralSdpaAttention(
          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (v_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): MistralRotaryEmbedding()
        )
        (mlp): MistralMLP(
          (gate_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (up_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (down_proj): Linear4bit(in_features=14336, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): MistralRMSNorm()
        (post_attention_layernorm): MistralRMSNorm()
      )
    )

In [22]:
from functools import partial
from datasets import Dataset

def gen_from_iterable_dataset(iterable_ds):
    yield from iterable_ds

test_set = Dataset.from_generator(partial(gen_from_iterable_dataset, test_ds), features=test_ds.features)

In [60]:
test_=test_set.select(range(1))

In [23]:
from transformers import pipeline

In [71]:
task="text-generation"

In [50]:
pipe = pipeline(task=task, model=model, tokenizer=tokenizer, max_new_tokens=100)

In [72]:
task_evaluator=evaluator(task=task)

In [73]:
results = task_evaluator.compute(model_or_pipeline=pipe,
                          data=test_,
                          metric=metric,
                          random_state=seed,
                          input_column="code")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


ValueError: Evaluation module cache file doesn't exist. Please make sure that you call `add` or `add_batch` at least once before calling `compute`.
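
The error above most likely arises because ROUGE compares each prediction against a reference text, while the `text-generation` evaluator call supplies only an input column. One possible workaround, sketched here under assumptions, is to split each code sample into a prompt and a held-out continuation, generate from the prompt, and score the generation against the continuation. The helper names (`split_sample`, `rouge1_f1`), the 50% split point, and the whitespace tokenization are all illustrative. `rouge1_f1` is a simplified unigram-overlap score, not the implementation behind `load("rouge")`.

```python
from collections import Counter

def split_sample(code, prompt_fraction=0.5):
    """Split a code string into (prompt, reference) at a whitespace-token boundary."""
    tokens = code.split()
    cut = max(1, int(len(tokens) * prompt_fraction))
    return " ".join(tokens[:cut]), " ".join(tokens[cut:])

def rouge1_f1(prediction, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap on whitespace tokens."""
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Toy sample standing in for one row of the "code" column.
sample = "def add(a, b): return a + b"
prompt, reference = split_sample(sample)
```

With the real pipeline, the prediction would be the text generated for `prompt`, and the loaded metric could then be called directly as `metric.compute(predictions=[...], references=[...])` instead of going through the evaluator.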

## Performing Parameter-Efficient Fine-Tuning

TODO: In the cells below, create a PEFT model from your loaded model, run a training loop, and save the PEFT model weights.
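
As a rough guide to what "parameter-efficient" means here, the sketch below counts the weights a LoRA adapter would add, using the layer shapes from the model printout earlier in this notebook. The rank `r = 8` and the choice of `q_proj` and `v_proj` as target modules are assumptions for illustration, not settings taken from this notebook. LoRA adds two low-rank matrices per targeted linear layer, of shapes `(r, in_features)` and `(out_features, r)`.

```python
# (in_features, out_features) per attention projection, from the printed architecture.
layer_shapes = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}
num_layers = 32

def lora_param_count(shapes, targets, r, layers):
    """Parameters added by LoRA: r * (in_features + out_features) per targeted layer."""
    per_layer = sum(r * (shapes[name][0] + shapes[name][1]) for name in targets)
    return per_layer * layers

# Hypothetical configuration: rank 8 on the query and value projections.
trainable = lora_param_count(layer_shapes, ["q_proj", "v_proj"], r=8, layers=num_layers)
print(f"{trainable:,} trainable parameters")  # 3,407,872 trainable parameters
```

Under these assumptions the adapter is well under 0.1% of the roughly 7 billion base weights, which stay frozen in 4-bit form during QLoRA training.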

## Performing Inference with a PEFT Model

TODO: In the cells below, load the saved PEFT model weights and evaluate the performance of the trained PEFT model. Be sure to compare the results to the results from prior to fine-tuning.
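
For the before/after comparison, one simple option is to report the per-metric difference between the two result dictionaries. The `rouge_delta` helper below is hypothetical, and the numbers in the usage lines are placeholders chosen only to show the shape of the output, not measured results.

```python
def rouge_delta(before, after):
    """Per-metric change (after - before) for the keys present in both dicts."""
    return {key: after[key] - before[key] for key in before if key in after}

# Placeholder values, NOT measured results.
baseline = {"rouge1": 0.25, "rougeL": 0.125}
finetuned = {"rouge1": 0.5, "rougeL": 0.25}
delta = rouge_delta(baseline, finetuned)
```

A positive entry in `delta` would indicate that the fine-tuned model scored higher on that ROUGE variant for the same test samples.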