# Recitation 9: Low-Rank Adaptation
_Date_: 12/4/2025

## Background

### Parameter-Efficient Fine-Tuning (PEFT)

LLMs have a very large number of parameters, so "tuning" every parameter to fit a new set of observations (data) is computationally very expensive.

(Imagine a Lego model built from thousands of pieces that can be rearranged into many different looks: it would be crazy to take every piece apart and reassemble them all just to get a new look.)

**Question**: Can we plug a magical glue/adaptor into this giant "machine" so that we only need to tweak a small number of "pieces"?

**Answer**: Yes. This is the idea behind PEFT.

**Question**: How? And where in the model do we place the adaptor?

### Rank (of a matrix)

A matrix has two kinds of rank: column rank and row rank. The two are always equal, so we can simply speak of the rank of a matrix.

_**Definition**_ (Column rank): the maximum number of linearly independent column vectors of a matrix

_**Definition**_ (Linear independence): Given a matrix written as column vectors $M = [v_1, v_2, \cdots, v_n]$, a column vector $v_j$ is linearly dependent if it is a linear combination of the other column vectors. The columns are linearly independent if no column is linearly dependent on the others.
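
To make the definitions concrete, here is a minimal sketch that computes the rank by Gaussian elimination with exact fractions. The helper name `matrix_rank` and the example matrix are illustrative choices, not part of the recitation. The third column of `M` is the sum of the first two, so only two columns are linearly independent.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Return the rank of a matrix given as a list of rows (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below `rank` with a non-zero entry in this column.
        pivot = next((i for i in range(rank, n_rows) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot: this column is a combination of earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every row below the pivot row.
        for i in range(rank + 1, n_rows):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


# Columns: v1 = (1, 0, 2), v2 = (0, 1, 3), v3 = v1 + v2 = (1, 1, 5)
M = [
    [1, 0, 1],
    [0, 1, 1],
    [2, 3, 5],
]
M_T = [list(col) for col in zip(*M)]  # transpose: columns become rows

print(matrix_rank(M))    # rank of M
print(matrix_rank(M_T))  # rank of the transpose
```

Both calls return the same value, which illustrates that row rank and column rank coincide.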

## LoRA

LoRA keeps the pre-trained weight $W \in \mathbb{R}^{d \times d}$ frozen and adds a trainable update $\Delta W = B \times A$ to it, where $A \in \mathbb{R}^{r \times d}$ and $B \in \mathbb{R}^{d \times r}$ with $r \ll d$. Since $\Delta W$ is a product of these two thin matrices, its rank is at most $r$.
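
The saving from $r \ll d$ can be checked with simple arithmetic. The sketch below compares the number of entries in a full $d \times d$ update with the number of entries in $A$ and $B$ together. The values `d = 768` and `r = 8` are illustrative assumptions, and the function names are hypothetical.

```python
def full_update_params(d):
    """Entries in a dense d x d update matrix."""
    return d * d


def lora_update_params(d, r):
    """Entries in B (d x r) plus entries in A (r x d)."""
    return d * r + r * d


d, r = 768, 8  # illustrative sizes, assumed for this example
full = full_update_params(d)
lora = lora_update_params(d, r)

print(f"full update: {full:,} parameters")   # 589,824
print(f"LoRA update: {lora:,} parameters")   # 12,288
print(f"ratio:       {lora / full:.2%}")     # 2.08%
```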

LoRA is often applied to the transformations that produce $Q$ and $V$, each with its own update: $Q = X \cdot (W_Q + \Delta W_Q)$ and $V = X \cdot (W_V + \Delta W_V)$.
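
The projection above can be sketched in a few lines. This is a toy version using plain Python lists so that it runs without dependencies. It is not how `peft` implements it. Two details are assumptions beyond the formula in these notes: the update is scaled by `alpha / r` (mirroring the `lora_alpha` and `r` settings used later), and $B$ starts at zero, a common initialization that makes the adapted model initially identical to the pre-trained one. The names `matmul` and `lora_project` are hypothetical.

```python
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_project(X, W, A, B, alpha, r):
    """Compute X (W + (alpha / r) * B A) without forming the d x d update."""
    base = matmul(X, W)               # frozen pre-trained path
    update = matmul(matmul(X, B), A)  # low-rank path: (n x d)(d x r)(r x d)
    scale = alpha / r
    return [
        [b + scale * u for b, u in zip(base_row, update_row)]
        for base_row, update_row in zip(base, update)
    ]


random.seed(0)
n, d, r, alpha = 2, 4, 2, 4  # toy sizes, assumed for this example

X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]     # inputs
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]     # frozen weight
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]  # r x d, small random
B = [[0.0] * r for _ in range(d)]                                  # d x r, zeros

Q = lora_project(X, W, A, B, alpha, r)
print(Q == matmul(X, W))  # with B = 0 the adapter leaves the output unchanged
```

Computing $(X B) A$ instead of $X (B A)$ keeps every intermediate result at width $r$, which is where the efficiency comes from.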

_**Question**_: Why is LoRA applied to the layers that compute self-attention?

_**Question**_: What is the hyperparameter for LoRA?

> NOTE: Besides the projections that compute $Q$ and $V$, it is also worth trying LoRA on the FFN layers or on the key projection $K$

## Implementation

Besides instantiating and loading a model with the `transformers` library, loading a LoRA adaptor requires another library, `peft`.

In [None]:
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

In [None]:
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

In [None]:
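lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification
    inference_mode=False,        # training mode: adapter weights stay trainable
    r=8,                         # rank of the update matrices A and B
    lora_alpha=16,               # scaling factor; the update is scaled by lora_alpha / r
    lora_dropout=0.1,            # dropout applied on the LoRA branch
    # BERT names its attention projections 'query' and 'value';
    # 'q_proj'/'v_proj' do not exist in bert-base-uncased and would not match.
    target_modules=['query', 'value']
)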

In [None]:
lora_model = get_peft_model(model, lora_config)
lora_model.print_trainable_parameters()
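
As a sanity check on the printed number, the adapter parameters can be counted by hand. The sketch below assumes adapters are attached to the query and value projections of every encoder layer of `bert-base-uncased` (12 layers, hidden size 768) with `r=8`. The variable names are illustrative.

```python
hidden_size = 768      # bert-base-uncased hidden dimension
num_layers = 12        # bert-base-uncased encoder layers
r = 8                  # LoRA rank from the config above
targets_per_layer = 2  # query and value projections

# Each adapted projection gets one A (r x hidden) and one B (hidden x r).
params_per_adapter = r * hidden_size + hidden_size * r
adapter_params = num_layers * targets_per_layer * params_per_adapter

print(f"{adapter_params:,}")  # 294,912
```

The total reported by `print_trainable_parameters()` may be somewhat larger than this, since for a classification task the task head is typically kept trainable as well.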

In [None]:
print(lora_model)