# Lab 1: LoRA & QLoRA - Fine-Tuning a Llama-2 Model

**Goal:** This lab provides a deep dive into one of the most widely used parameter-efficient fine-tuning (PEFT) methods, **Low-Rank Adaptation (LoRA)**, and its quantized, memory-efficient variant, **QLoRA**.

**You will learn to:**
-   Load a large language model (`Llama-2-7B`) in 4-bit precision.
-   Understand and configure `LoraConfig` from the `peft` library in detail.
-   Apply the LoRA adapter to the base model.
-   Use the `transformers.Trainer` to fine-tune the model with LoRA.
-   Perform inference with the fine-tuned LoRA model.
-   Merge the LoRA adapter back into the base model for deployment.

---
## Notebook 1: Environment Setup

This first notebook ensures your environment is ready for the lab. It will:
1.  Check for GPU availability.
2.  Install all required Python libraries from Hugging Face and the broader AI ecosystem.
3.  Verify that the installations are successful.


### Step 1: Check for GPU

For fine-tuning large models (even with PEFT), a GPU is essential. The command below uses `nvidia-smi` to check for and display the status of any available NVIDIA GPUs. If this command fails or shows no devices, you may need to enable a GPU runtime (e.g., in Google Colab) or ensure your local environment has a correctly configured GPU.


In [2]:
!nvidia-smi


Tue Oct 14 12:53:54 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.03              Driver Version: 575.64.03      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA RTX 2000 Ada Gene...    Off |   00000000:01:00.0 Off |                  Off |
| 30%   34C    P8              7W /   70W |    3005MiB /  16380MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
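
The same check can be made from Python, which is convenient when a later cell should stop early if no GPU is present. The following is a minimal sketch using only the standard library. The helper name `list_nvidia_gpus` is hypothetical, and the sketch assumes that `nvidia-smi` is on the `PATH` whenever an NVIDIA driver is installed.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one "name, total memory" string per visible NVIDIA GPU, or [] if none."""
    # Assumption: a missing nvidia-smi binary means no usable NVIDIA driver.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=30,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # nvidia-smi exists but failed (e.g., driver not loaded) or timed out.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
if gpus:
    for index, description in enumerate(gpus):
        print(f"GPU {index}: {description}")
else:
    print("No NVIDIA GPU detected. Enable a GPU runtime before continuing.")
```

Once PyTorch is installed, `torch.cuda.is_available()` gives a complementary check that the framework itself can reach the device.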
                                                

### Step 2: Install Core Libraries

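Now, we will install the key Python libraries. The cell uses the `%pip` magic, which installs packages into the environment of the running kernel, together with the `-q` flag for a "quiet" installation that reduces the verbosity of the output.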

-   **`transformers`**: The core Hugging Face library. It provides the `AutoModelForCausalLM` and `AutoTokenizer` classes for loading our base model, as well as the powerful `Trainer` API.
-   **`peft`**: The Parameter-Efficient Fine-Tuning library. This provides the `LoraConfig` and other tools to create and manage our adapters.
-   **`datasets`**: Used to easily load datasets from the Hugging Face Hub.
-   **`accelerate`**: A library from Hugging Face that simplifies running PyTorch code across different hardware configurations (CPU, single GPU, multiple GPUs, TPUs). The `Trainer` uses it under the hood.
-   **`bitsandbytes`**: This is a crucial library for quantization. It allows us to load our model in 4-bit precision (`load_in_4bit=True`), which is the key to making QLoRA work on consumer-grade hardware.
-   **`sentencepiece` & `protobuf`**: These are common dependencies for the tokenizers used by many modern LLMs, including Llama-2.
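
To see why 4-bit loading matters, it helps to estimate how much memory the model weights alone occupy at different precisions. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes a nominal 7 billion parameters and ignores quantization metadata, activations, optimizer state, the LoRA adapter weights, and CUDA overhead. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate size of the raw weights in GiB at the given precision."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Assumption: a nominal "7B" parameter count; the exact count differs slightly.
NUM_PARAMS = 7_000_000_000

for label, bits in [("float32", 32), ("float16 / bfloat16", 16), ("4-bit", 4)]:
    print(f"{label:>20}: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights need roughly 13 GiB in 16-bit precision but a little over 3 GiB in 4-bit. The first leaves almost no headroom on the GPU shown in Step 1, which reports 16380MiB in total; the second leaves room for activations and the adapter.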


In [2]:
%pip install -q transformers peft datasets accelerate bitsandbytes sentencepiece protobuf



[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.1.1[0m[39;49m -> [0m[32;49m25.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


### Step 3: Verify Installation

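Let's import the main libraries and print their versions to confirm that they were installed correctly. This cell checks `torch`, `transformers`, `peft`, and `datasets`; it does not import `accelerate`, `bitsandbytes`, `sentencepiece`, or `protobuf`. If it runs without errors, the core of your environment is ready to go!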


In [3]:
import torch
import transformers
import peft
import datasets

print(f"PyTorch version: {torch.__version__}")
print(f"Transformers version: {transformers.__version__}")
print(f"PEFT version: {peft.__version__}")
print(f"Datasets version: {datasets.__version__}")

print("\n✅ Environment is set up correctly!")


PyTorch version: 2.6.0+cu124
Transformers version: 4.57.0
PEFT version: 0.17.1
Datasets version: 4.1.1

✅ Environment is set up correctly!
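
The cell above covers only four packages. A hedged sketch of a broader check follows, using the standard-library `importlib.metadata` module to look up every package named in Step 2, plus `torch`. The helper name `installed_versions` and the `REQUIRED` list are illustrative, not part of the original lab.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to pip in Step 2, plus torch.
REQUIRED = [
    "torch",
    "transformers",
    "peft",
    "datasets",
    "accelerate",
    "bitsandbytes",
    "sentencepiece",
    "protobuf",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(REQUIRED)
for name, ver in versions.items():
    print(f"{name:<14} {ver if ver is not None else 'NOT INSTALLED'}")

missing = [name for name, ver in versions.items() if ver is None]
if missing:
    print(f"\nMissing packages: {', '.join(missing)}")
```

This reads package metadata without importing anything, so it confirms that a package is present but not that it loads correctly. For example, `bitsandbytes` can be installed yet still fail at import time if it cannot find a compatible CUDA setup.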


In [4]:
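# Sanity check: confirm which transformers build is active and that
# TrainingArguments resolves to the module inside that package.
import transformers
from transformers import TrainingArguments

print("Transformers version:", transformers.__version__)
print("TrainingArguments from:", TrainingArguments.__module__)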


Transformers version: 4.57.0
TrainingArguments from: transformers.training_args
