# Parameter-Efficient Fine-Tuning

This notebook checks several models in the 7 to 8 billion parameter range to see how they behave under QLoRA, specifically how many trainable parameters each one has.

In [1]:
import os
os.chdir("../")

## Set cache directory

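Hugging Face libraries use the `HF_HOME` environment variable as the root directory for cached models and related files. Set it before importing `transformers` so the setting takes effect.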

In [2]:
os.environ['HF_HOME'] = '.cache/'

In [3]:
def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )
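
One caveat when the helper above is applied to a 4-bit model: bitsandbytes stores quantized weights in packed form, two 4-bit values per stored element with the default one-byte storage, so `param.numel()` reports roughly half of the logical weight count for those tensors. A minimal sketch of a packing-aware variant follows. It assumes the quantized parameter class is named `Params4bit` and that one-byte storage is in use; `count_parameters` is a hypothetical helper, not part of any library.

```python
def count_parameters(named_parameters):
    """Return (trainable, total) parameter counts, un-packing 4-bit weights.

    Assumption: quantized tensors are instances of a class named
    ``Params4bit`` that packs two 4-bit values into each stored element.
    """
    trainable, total = 0, 0
    for _, param in named_parameters:
        n = param.numel()
        if param.__class__.__name__ == "Params4bit":
            n *= 2  # two 4-bit weights per stored byte (assumed uint8 storage)
        total += n
        if param.requires_grad:
            trainable += n
    return trainable, total
```

It could be called as `trainable, total = count_parameters(model.named_parameters())`. Returning the counts rather than printing them makes it easier to compare models in a table later.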

## QLoRA Configurations

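QLoRA loads the base model in 4-bit precision through bitsandbytes and trains small LoRA adapters on top of it. [Read more here](https://huggingface.co/blog/4bit-transformers-bitsandbytes)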

In [4]:
import torch
from transformers import BitsAndBytesConfig

In [5]:
nf4_config = BitsAndBytesConfig(
   load_in_4bit=True,
   bnb_4bit_quant_type="nf4",
   bnb_4bit_use_double_quant=True,
   bnb_4bit_compute_dtype=torch.bfloat16
)
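
To make the effect of `bnb_4bit_use_double_quant` more concrete, here is a back-of-the-envelope estimate of how many bits each weight costs once the quantization constants are included. The block sizes (64 weights per block, 256 constants per second-level block) and the 32-bit and 8-bit constant widths are assumptions taken from the QLoRA paper's description, not values read from this notebook's configuration.

```python
def bits_per_param(double_quant, block_size=64, second_block_size=256):
    """Approximate storage cost in bits per weight for 4-bit quantization."""
    weight_bits = 4.0
    if double_quant:
        # One 8-bit constant per block, plus one 32-bit constant
        # shared by each group of `second_block_size` blocks.
        overhead = 8 / block_size + 32 / (block_size * second_block_size)
    else:
        # One 32-bit scaling constant per block.
        overhead = 32 / block_size
    return weight_bits + overhead


print(bits_per_param(double_quant=False))  # 4.5
print(bits_per_param(double_quant=True))   # roughly 4.127
```

Under these assumptions, double quantization saves a little under 0.4 bits per weight.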

## Loading Model

In [6]:
ROOT_MODEL_PATH = "/media/ishrak/volume_1/Projects/mining-misconceptions-in-math/.cache/"

In [7]:
from transformers import AutoModelForCausalLM
from peft import prepare_model_for_kbit_training
from peft import LoraConfig, get_peft_model
import gc
import time

## Check Models

In [9]:
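def check_models():
    # Each entry in ROOT_MODEL_PATH is expected to be one model checkpoint.
    models = os.listdir(ROOT_MODEL_PATH)

    for model_name in models:
        # "hub" is the Hugging Face hub cache folder, not a model.
        if model_name == "hub":
            continue

        # Load the base model with the 4-bit NF4 quantization config.
        model = AutoModelForCausalLM.from_pretrained(
            os.path.join(ROOT_MODEL_PATH, model_name),
            quantization_config=nf4_config,
            device_map="auto",
            trust_remote_code=True,
        )

        print(model_name)
        # First line of output: counts for the quantized base model.
        print_trainable_parameters(model)

        model.gradient_checkpointing_enable()
        model = prepare_model_for_kbit_training(model)

        # LoRA adapters on the attention query and value projections only.
        config = LoraConfig(
            r=8,
            lora_alpha=32,
            target_modules=["q_proj", "v_proj"],
            lora_dropout=0.05,
            bias="none",
            task_type="CAUSAL_LM",
        )

        model = get_peft_model(model, config)

        # Second line of output: counts after wrapping the model with LoRA.
        print_trainable_parameters(model)

        # Drop the reference and collect garbage first, then release
        # cached GPU memory before loading the next model.
        del model
        gc.collect()
        torch.cuda.empty_cache()
        time.sleep(5)


check_models()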

Loading checkpoint shards: 100%|██████████| 2/2 [00:05<00:00,  2.87s/it]


Mistral-7B-v0.1
trainable params: 262410240 || all params: 3752071168 || trainable%: 6.993743675173274
trainable params: 3407872 || all params: 3755479040 || trainable%: 0.09074400266124238


Loading checkpoint shards: 100%|██████████| 4/4 [00:07<00:00,  1.77s/it]


Qwen2.5-7B-Instruct
trainable params: 1090199040 || all params: 4352972288 || trainable%: 25.04493407884521
trainable params: 2523136 || all params: 4355495424 || trainable%: 0.05792994262137927


Loading checkpoint shards: 100%|██████████| 4/4 [00:07<00:00,  1.77s/it]


Llama-3.1-8B-Instruct
trainable params: 1050939392 || all params: 4540600320 || trainable%: 23.145384264959926
trainable params: 3407872 || all params: 4544008192 || trainable%: 0.07499704789264605
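
The second line printed for each model (the trainable count after LoRA) can be reproduced by hand. Each adapted linear layer with input size `d_in` and output size `d_out` gains two low-rank matrices of shapes `r x d_in` and `d_out x r`, which is `r * (d_in + d_out)` parameters. The sketch below assumes the hidden sizes, key/value projection widths, and layer counts from the models' published configurations; these values are not printed anywhere in this notebook, so treat them as assumptions. `lora_params` and `ARCH` are hypothetical names used only for illustration.

```python
def lora_params(r, hidden, kv_dim, n_layers):
    """LoRA parameter count when only q_proj and v_proj are adapted."""
    q_proj = r * (hidden + hidden)  # q_proj maps hidden -> hidden
    v_proj = r * (hidden + kv_dim)  # v_proj maps hidden -> kv_dim
    return n_layers * (q_proj + v_proj)


# Assumed architecture values (from the published model configs).
ARCH = {
    "Mistral-7B-v0.1": dict(hidden=4096, kv_dim=1024, n_layers=32),
    "Qwen2.5-7B-Instruct": dict(hidden=3584, kv_dim=512, n_layers=28),
    "Llama-3.1-8B-Instruct": dict(hidden=4096, kv_dim=1024, n_layers=32),
}

for name, arch in ARCH.items():
    print(name, lora_params(r=8, **arch))
```

With `r=8` this gives 3407872 for Mistral and Llama and 2523136 for Qwen, matching the output above. Qwen ends up with fewer adapter weights because it has fewer layers and narrower projections.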
