This notebook demonstrates how to fine-tune a pre-trained **Flan-T5** model for a summarization task using **LoRA**, a Parameter-Efficient Fine-Tuning (PEFT) technique. Instead of updating all model parameters, we train only small adapter matrices injected into selected layers of the transformer, which reduces compute and memory cost.

Theoretical Background

What is PEFT?

**Parameter-Efficient Fine-Tuning (PEFT)** refers to methods that adapt large language models (LLMs) by training only a **small subset of parameters**.

Traditional fine-tuning updates **all parameters** of the model, which is costly in both memory and compute for LLMs. PEFT techniques avoid this by introducing a small number of **trainable parameters** (e.g., adapters, prompts, LoRA layers) and leaving the backbone model **frozen**.

What is LoRA?

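**Low-Rank Adaptation (LoRA)** is a PEFT method that is most commonly applied to the weight matrices of the **attention layers** of transformer models; the configuration used in this notebook also enables it on the feed-forward layers (`intermediate_lora`, `output_lora`).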

Instead of fine-tuning the full weight matrix \( W \), LoRA freezes it and injects a low-rank decomposition:

$$
W' = W + \Delta W = W + AB
$$

Where:

$$
A \in \mathbb{R}^{d \times r}
$$

$$
B \in \mathbb{R}^{r \times k}
$$

$$
r \ll \min(d, k)
$$

- The **rank** \( r \) is typically between 4 and 16.
- Only matrices **A** and **B** are trainable, making this approach very lightweight.
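
To make the decomposition concrete, here is a minimal pure-Python sketch (not the `adapters` implementation). It assumes a row-vector convention, y = xW, with W of shape d x k to match the shapes above. The toy sizes and the helper names `matmul`, `add`, and `rand_matrix` are illustrative only. It checks that keeping the low-rank branch separate gives the same output as folding AB into W.

```python
import random


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def add(X, Y):
    """Element-wise sum of two equally shaped matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d, k, r = 6, 5, 2  # toy sizes; real layers are far larger

W = rand_matrix(d, k, rng)  # frozen pre-trained weight, d x k
A = rand_matrix(d, r, rng)  # trainable factor, d x r
B = rand_matrix(r, k, rng)  # trainable factor, r x k
x = rand_matrix(1, d, rng)  # a single input row vector

# Adapter path: W stays untouched, the low-rank branch is added at run time
y_adapter = add(matmul(x, W), matmul(matmul(x, A), B))

# Merged path: fold AB into W once, then use a single matrix
W_merged = add(W, matmul(A, B))
y_merged = matmul(x, W_merged)

max_diff = max(abs(a - b) for a, b in zip(y_adapter[0], y_merged[0]))
print(max_diff < 1e-9)
```

The two paths agree up to floating-point rounding, which is why a trained adapter can later be merged into the base weights without changing the model's outputs.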

**LoRA advantages**:
- Reduces the number of trainable parameters, often by 10x–1000x depending on the rank and on which layers are adapted
- Compatible with most transformer models
- Easy to merge back into base model after training
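
As a rough worked example of the parameter savings, the sketch below counts parameters for a single weight matrix. The helper `lora_param_counts` is hypothetical, and the values d = k = 768 and r = 8 are illustrative (768 is the hidden size of `flan-t5-base`).

```python
def lora_param_counts(d, k, r):
    """Return (full, lora, ratio) parameter counts for one d x k weight matrix."""
    full = d * k          # parameters updated by full fine-tuning
    lora = r * (d + k)    # parameters in A (d x r) plus B (r x k)
    return full, lora, full / lora


full, lora, ratio = lora_param_counts(d=768, k=768, r=8)
print(full, lora, ratio)  # 589824 12288 48.0
```

This is the ratio for one adapted matrix. The reduction over the whole model is usually larger, because many weight matrices receive no adapter at all.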

---
Project Pipeline

1. **Model & Tokenizer**: Load `google/flan-t5-base` with `AutoAdapterModel`.
2. **Dataset**: Use `"neural-bridge/rag-dataset-12000"` with context, question, and answer fields.
3. **Preprocessing**:
   - Concatenate question + context
   - Add task prefix `"summarize: "`
   - Tokenize inputs/targets
4. **Adapter Injection**:
   - Define `LoRAConfig` (e.g., `r=8`, `alpha=16`)
   - Add and activate LoRA adapter
5. **Training**:
   - Use `AdapterTrainer` with 2 epochs and small batch size
6. **Inference**:
   - Test on custom story + question
   - Use `generate()` for summarization
7. **Adapter Merge**:
   - Merge trained LoRA weights into base model for deployment


In [None]:
from transformers import AutoTokenizer

base_model = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(base_model)
prefix = 'summarize: '

In [None]:
def encode_batch(examples):
    """Build "summarize: <question> <context>" inputs and tokenize inputs and targets."""
    context_column = 'context'
    question_column = 'question'
    answer_column = 'answer'

    padding = "max_length"

    inputs, targets = [], []
    rows = zip(examples[context_column], examples[question_column], examples[answer_column])
    for context, question, answer in rows:
        # Skip rows with a missing or empty field
        if context and question and answer:
            # Task prefix + question + context
            inputs.append(prefix + question + " " + context)
            targets.append(answer)

    # Pad/truncate inputs to 512 tokens and targets to 128 tokens
    model_inputs = tokenizer(inputs, max_length=512, padding=padding, truncation=True)
    labels = tokenizer(targets, max_length=128, padding=padding, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
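
`encode_batch` pads the targets to 128 tokens and uses the padded ids directly as labels. A common variation, sketched below, replaces padding positions with -100, the value that PyTorch's `CrossEntropyLoss` ignores by default. Whether the loss used by this notebook's model head relies on that default is an assumption, and `mask_pad_labels` is a hypothetical helper, not part of any library.

```python
IGNORE_INDEX = -100  # target value skipped by PyTorch's cross-entropy loss by default


def mask_pad_labels(label_ids, pad_token_id):
    """Replace padding ids with IGNORE_INDEX in a batch of label id lists."""
    return [
        [IGNORE_INDEX if tok == pad_token_id else tok for tok in seq]
        for seq in label_ids
    ]


# Toy batch; 0 stands in for the pad id (use tokenizer.pad_token_id in practice)
batch = [[71, 1712, 1, 0, 0], [363, 19, 8, 1, 0]]
masked = mask_pad_labels(batch, pad_token_id=0)
print(masked)
```

Inside `encode_batch` this would be applied as `model_inputs["labels"] = mask_pad_labels(labels["input_ids"], tokenizer.pad_token_id)`. Losses computed with and without the masking are not directly comparable.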

In [None]:
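from datasets import load_dataset


def load_split(split_name, max_items):
    """Load one split, keep the first `max_items` usable rows, and tokenize them."""
    dataset = load_dataset("neural-bridge/rag-dataset-12000")[split_name]

    # Drop rows with a missing context or answer
    dataset = dataset.filter(lambda example: example['context'] is not None and example['answer'] is not None)

    # Keep only the first `max_items` rows
    dataset = dataset.filter(lambda _, idx: idx < max_items, with_indices=True)

    # Tokenize in batches and drop the raw text columns
    dataset = dataset.map(
        encode_batch,
        batched=True,
        remove_columns=dataset.column_names,
        desc="Running tokenizer on " + split_name + " dataset",
    )

    # Note: "attention_mask" is not kept here, so padded input positions are not
    # masked during training; add it to `columns` to mask them.
    dataset.set_format(type="torch", columns=["input_ids", "labels"])

    return dataset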

In [None]:
from adapters import AutoAdapterModel, LoRAConfig

# Load Flan-T5 through the adapters library so a LoRA adapter can be added to it
model = AutoAdapterModel.from_pretrained(base_model)

# LoRA settings: rank 8, alpha 16; also adapt the feed-forward layers
config = LoRAConfig(
    r=8,
    alpha=16,
    intermediate_lora=True,
    output_lora=True
)
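
The `alpha` value in the configuration above is a scaling hyperparameter. In the original LoRA formulation the low-rank update is multiplied by alpha / r. Assuming the `adapters` implementation follows that convention, `r=8` with `alpha=16` scales the update by 2. The helpers below are hypothetical and use plain lists in place of tensors.

```python
def lora_scale(alpha, r):
    """Scaling factor applied to the low-rank update in the LoRA formulation."""
    return alpha / r


def scaled_update(A, B, alpha):
    """Return (alpha / r) * A @ B for matrices given as lists of rows."""
    r = len(B)  # the rank equals the number of rows of B
    s = lora_scale(alpha, r)
    return [[s * sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


print(lora_scale(alpha=16, r=8))  # 2.0

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 x 2
B = [[1.0, 2.0], [3.0, 4.0]]              # 2 x 2
delta = scaled_update(A, B, alpha=4)      # scale = 4 / 2 = 2
print(delta)  # [[2.0, 4.0], [6.0, 8.0], [8.0, 12.0]]
```

Under this convention, doubling `alpha` at a fixed rank doubles the size of the update.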




In [5]:
print(type(model))

<class 'adapters.models.t5.adapter_model.T5AdapterModel'>


In [None]:
# Add a LoRA adapter with the configuration defined above
model.add_adapter(adapter_name="my_summary_adapter", config=config)

# Freeze the base model so only the adapter weights are trained, then activate the adapter
model.train_adapter("my_summary_adapter")
model.set_active_adapters("my_summary_adapter")

In [None]:
from transformers import TrainingArguments
from adapters import AdapterTrainer
from datasets import load_dataset

batch_size = 2

training_args = TrainingArguments(
    learning_rate=3e-4,
    num_train_epochs=2,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    logging_steps=50,
    output_dir="./training_output",
    overwrite_output_dir=True,
    remove_unused_columns=False,
)

trainer = AdapterTrainer(
    model=model,
    args=training_args,
    tokenizer=tokenizer,
    train_dataset=load_split("train", 1000),
    eval_dataset=load_split("test", 100),
)

trainer.train()



Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.48.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.


| Step | Training Loss |
|------|---------------|
| 50   | 23.367        |
| 100  | 5.4429        |
| 150  | 2.8273        |
| 200  | 1.0929        |
| 250  | 0.5771        |
| 300  | 0.5632        |
| 350  | 0.5436        |
| 400  | 0.6298        |
| 450  | 0.5274        |
| 500  | 0.454         |


Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.


TrainOutput(global_step=1000, training_loss=2.032686315536499, metrics={'train_runtime': 383.374, 'train_samples_per_second': 5.217, 'train_steps_per_second': 2.608, 'total_flos': 1381594300416000.0, 'train_loss': 2.032686315536499, 'epoch': 2.0})
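
The reported `global_step` and throughput can be cross-checked with simple arithmetic. The sketch below assumes a single device and no gradient accumulation (the `TrainingArguments` defaults). `total_optimizer_steps` is a hypothetical helper, and the runtime is copied from the output above.

```python
import math


def total_optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps for a run on one device without gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


steps = total_optimizer_steps(num_examples=1000, batch_size=2, epochs=2)
train_runtime = 383.374  # seconds, from the TrainOutput above

steps_per_second = round(steps / train_runtime, 3)
samples_per_second = round(1000 * 2 / train_runtime, 3)  # examples seen over 2 epochs

print(steps, steps_per_second, samples_per_second)  # 1000 2.608 5.217
```

All three values match `global_step`, `train_steps_per_second`, and `train_samples_per_second` in the reported metrics.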

In [12]:
trainer.evaluate()

{'eval_loss': 0.3849603533744812,
 'eval_runtime': 6.2349,
 'eval_samples_per_second': 16.039,
 'eval_steps_per_second': 8.019,
 'epoch': 2.0}

In [None]:

# Merge the trained LoRA weights into the base model weights
model.merge_adapter("my_summary_adapter")

In [None]:
context = """
Once upon a time, there were two brothers — one was rich, and the other was poor. The poor brother ran out of food and went to his rich brother, begging for something to eat.

The rich brother, not happy about helping, said, “I’ll give you this ham, but you must take it to Dead Man’s Hall.”

Grateful for the food, the poor brother agreed. He walked all day and finally reached a large building at dusk. Outside, an old man was chopping wood.

“Excuse me, sir,” said the poor brother. “Is this the way to Dead Man’s Hall?”

“Yes, you’ve arrived,” replied the old man. “Inside, they will want to buy your ham. But don’t sell it unless they give you the hand-mill that stands behind the door.”

The poor brother thanked the old man, went inside, and everything happened just as the old man had said. The poor brother left with the hand-mill and asked the old man how to use it. Then, he set off home.

The hand-mill was magical. When the poor brother got home, he asked it to grind a feast of food and drink. To stop the mill, he simply had to say, “Thank you, magic mill, you can stop now.”

When the rich brother saw that his brother was no longer poor, he became jealous. “Give me that mill!” he demanded. The poor brother, having everything he needed, agreed to sell it but didn’t tell his rich brother how to stop it.

The rich brother eagerly asked the mill to grind food when he got home, but because he didn’t know how to stop it, the mill kept grinding until food overflowed from the house and across the fields. In a panic, he ran to his poor brother’s house. “Please take it back!” he cried. “If it doesn’t stop, the whole town will be buried!”

The poor brother took the mill back and was never poor or hungry again.

Soon, the story of the magic mill spread far and wide. One day, a sailor knocked at the poor brother’s door. “Does the mill grind salt?” he asked.

“Of course,” replied the brother. “It will grind anything you ask.”

The sailor, eager to stop traveling far for salt, offered a thousand coins for the mill. Though the brother was hesitant, he eventually agreed.

In his hurry, the sailor forgot to ask how to stop the mill. Once at sea, he placed the mill on deck and commanded, “Grind salt, and grind quickly!”

The mill obeyed, but it didn’t stop. The pile of salt grew and grew until the ship sank under its weight.

The mill still lies at the bottom of the sea, grinding salt to this day, and that’s why the sea is salty.

"""
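question = "Summarize the story."

# Build the prompt the same way as during training: prefix + question + context
input_text = prefix + question + " " + context

# Tokenize (inputs longer than the tokenizer's maximum length are truncated)
# and move the tensors to the model's device
inputs = tokenizer(input_text, return_tensors="pt", truncation=True).to(model.device)

# Generate an output of at most 128 tokens
output = model.generate(**inputs, max_length=128)

generated_summary = tokenizer.decode(output[0], skip_special_tokens=True)
print("Input:\n", input_text)
print("\nGenerated Summary:\n", generated_summary)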


Input:
 summarize: Summarize the story. 
Once upon a time, there were two brothers — one was rich, and the other was poor. The poor brother ran out of food and went to his rich brother, begging for something to eat.

The rich brother, not happy about helping, said, “I’ll give you this ham, but you must take it to Dead Man’s Hall.”

Grateful for the food, the poor brother agreed. He walked all day and finally reached a large building at dusk. Outside, an old man was chopping wood.

“Excuse me, sir,” said the poor brother. “Is this the way to Dead Man’s Hall?”

“Yes, you’ve arrived,” replied the old man. “Inside, they will want to buy your ham. But don’t sell it unless they give you the hand-mill that stands behind the door.”

The poor brother thanked the old man, went inside, and everything happened just as the old man had said. The poor brother left with the hand-mill and asked the old man how to use it. Then, he set off home.

The hand-mill was magical. When the poor brother got home,