# Lightweight Fine Tuning

This project loads a pre-trained model and evaluates its performance, applies parameter-efficient fine-tuning (PEFT) to that model, and then runs inference with the fine-tuned model to compare its performance against the original.

- **PEFT Technique**:
    - Parameter Efficient Fine Tuning Methods
    - This project will use LoRA: [Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)
- **Model**:
    - GPT-2: [OpenAI's open source Generative Pre-trained Transformer](https://huggingface.co/openai-community/gpt2)
- **Evaluation Approach**:
    - The `evaluate` method with a Hugging Face `Trainer` will be used.
    - The key requirement for the evaluation is that 
- **Dataset**:
    - [Wikitext2](https://huggingface.co/datasets/mindchain/wikitext2): The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. WikiText-2 is the smaller variant of that collection, at roughly 2 million training tokens. The dataset is available under the Creative Commons Attribution-ShareAlike License.
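
Because the `evaluate` method of a `Trainer` reports a mean cross-entropy loss for a causal language model, one common way to compare the base and fine-tuned models is perplexity, the exponential of that loss. The sketch below is a minimal illustration rather than this project's evaluation code: `perplexity_from_loss` is a hypothetical helper, and the two loss values are placeholders, not measured results.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Placeholder losses for illustration only; they are not results from this project.
base_eval_loss = 3.5
tuned_eval_loss = 3.2

base_ppl = perplexity_from_loss(base_eval_loss)
tuned_ppl = perplexity_from_loss(tuned_eval_loss)

# Lower perplexity means the model assigns higher probability to the held-out text.
print(f"base perplexity:  {base_ppl:.2f}")
print(f"tuned perplexity: {tuned_ppl:.2f}")
```

In practice the input would be the `eval_loss` entry of the dictionary returned by `trainer.evaluate()`, computed on the same validation split for both models so that the two numbers are comparable.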

## Training with PEFT

### Importing the modules

In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments, DataCollatorForLanguageModeling
from peft import get_peft_model, LoraConfig, TaskType, AutoPeftModelForCausalLM
from datasets import load_dataset

### Setting Up the Model and Tokenizer

In [2]:
tokenizer = AutoTokenizer.from_pretrained("gpt2")
# GPT-2 ships without a padding token, so reuse the end-of-text token for padding
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

### Creating a PEFT Config

In [3]:
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,  # Causal language modeling for GPT-2
    r=8,                           # Rank of update matrices
    lora_alpha=32,                 # Alpha parameter for LoRA scaling
    lora_dropout=0.1,              # Dropout probability for LoRA layers
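    # Target the attention (c_attn, c_proj) and MLP (c_fc, c_proj) projections in each GPT-2 block
    target_modules=["c_attn", "c_proj", "c_fc"],
    bias="none",                   # Leave bias terms frozen
    fan_in_fan_out=True,           # GPT-2's Conv1D layers store weights as (in, out)
    inference_mode=False,          # Keep the adapter weights trainable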
)
lora_model = get_peft_model(model, peft_config)
# Check trainable parameters
lora_model.print_trainable_parameters()

trainable params: 1,179,648 || all params: 125,619,456 || trainable%: 0.9391
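
The trainable-parameter count printed above can be checked by hand. For each adapted projection with `d_in` inputs and `d_out` outputs, LoRA adds two low-rank matrices holding `r * (d_in + d_out)` weights in total. The sketch below redoes that arithmetic; the layer shapes are assumptions about the `gpt2` (small) checkpoint, and `lora_params` is a hypothetical helper, not part of the `peft` API.

```python
# GPT-2 (small) dimensions, stated here as assumptions about the "gpt2" checkpoint.
HIDDEN = 768
N_LAYERS = 12
RANK = 8          # r in the LoraConfig above
LORA_ALPHA = 32   # lora_alpha in the LoraConfig above

# (in_features, out_features) of every adapted projection inside one transformer block.
# "c_proj" matches two modules per block: the attention output and the MLP output.
adapted_shapes = {
    "attn.c_attn": (HIDDEN, 3 * HIDDEN),
    "attn.c_proj": (HIDDEN, HIDDEN),
    "mlp.c_fc": (HIDDEN, 4 * HIDDEN),
    "mlp.c_proj": (4 * HIDDEN, HIDDEN),
}


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r), i.e. r * (d_in + d_out) weights."""
    return r * (d_in + d_out)


per_block = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in adapted_shapes.values())
trainable = per_block * N_LAYERS
total = 125_619_456  # "all params" from the printed summary

print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / total:.4f}")
print(f"LoRA scaling (lora_alpha / r): {LORA_ALPHA / RANK}")
```

Under these assumptions the total comes to 1,179,648, which agrees with the summary printed by `print_trainable_parameters()`.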


### Training with a PEFT Model

In [18]:
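# Define training arguments
training_args = TrainingArguments(
    output_dir="peft_model_output",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=2,  # Accumulate gradients: effective batch size of 2 * 2 = 4 per device
    warmup_steps=50,
    learning_rate=2e-5,
    logging_steps=10,
    eval_strategy="steps",          # Named evaluation_strategy in older transformers releases
    eval_steps=100,
    save_strategy="steps",
    save_steps=200,                 # Must be a multiple of eval_steps when loading the best model
    load_best_model_at_end=True,    # Reload the checkpoint with the lowest evaluation loss
)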

# Load the dataset and split into train and validation sets
subset_size = 1000
dataset = load_dataset(
    "wikitext", 
    "wikitext-2-raw-v1",
    split={
        'train': f'train[:{subset_size}]',
        'validation': f'validation[:{subset_size//10}]'  # Smaller validation set
    }
)

# Tokenize the dataset
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=512)

tokenized_dataset = dataset.map(tokenize_function, batched=True, batch_size=32, desc="Tokenizing")

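# Record the number of real (non-padding) tokens per example.
# len(x["input_ids"]) would always be 512 here because of the max_length padding.
# Trainer only uses this column when group_by_length=True is set.
tokenized_dataset = tokenized_dataset.map(
    lambda x: {"length": sum(x["attention_mask"])},
    desc="Adding sequence lengths"
)


# Define data collator for causal language modeling (mlm=False):
# labels are a copy of input_ids with padding positions set to -100
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=False,
)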

# Initialize trainer
trainer = Trainer(
    model=lora_model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],
    data_collator=data_collator,
)

# Start training
trainer.train()

# Save the final model
lora_model.save_pretrained("gpt2-lora")
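
The training arguments above determine how many optimizer steps the run takes and at which steps evaluation happens. The following sketch works through that arithmetic; it assumes a single training device and approximates the `Trainer`'s own bookkeeping rather than calling it.

```python
import math

# Values copied from the training cell; a single training device is an assumption.
num_examples = 1000                 # subset_size
per_device_train_batch_size = 2
gradient_accumulation_steps = 2
num_train_epochs = 3
eval_steps = 100
num_devices = 1                     # assumption

# Number of examples consumed per optimizer step.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs

# Steps at which an evaluation is scheduled.
eval_points = list(range(eval_steps, total_steps + 1, eval_steps))

print(effective_batch, steps_per_epoch, total_steps)
print(eval_points)
```

With 1000 training rows this gives 250 steps per epoch and 750 steps overall, so evaluations are scheduled at steps 100 through 700.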

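| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 100  | 0.0           |                 |
| 200  | 0.0           |                 |
| 300  | 0.0           |                 |
| 400  | 0.0           |                 |
| 500  | 0.0           |                 |
| 600  | 0.0           |                 |
| 700  | 0.0           |                 |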
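
A training loss of exactly 0.0 at every logged step, together with an empty Validation Loss column, suggests that the run did not produce a usable loss rather than that the model fit the data perfectly. The cause is not established here. One plausible contributor is label masking: with `mlm=False`, the data collator sets the label of every padding position to -100 so the loss ignores it, and because the pad token is GPT-2's end-of-text token, a blank input line padded to `max_length` ends up with no supervised positions at all. WikiText's raw files contain blank lines between headings and paragraphs. The sketch below imitates that masking in plain Python; `mask_pad_labels` and `supervised_fraction` are hypothetical helpers, the short row uses made-up token ids, and none of this is the library's actual implementation.

```python
IGNORE_INDEX = -100      # label value that PyTorch's cross-entropy loss skips
GPT2_EOS_ID = 50256      # GPT-2's end-of-text id, reused as the pad id in this project


def mask_pad_labels(input_ids, pad_id=GPT2_EOS_ID):
    """Simplified imitation of the collator's causal-LM labelling: copy ids, mask pads."""
    return [IGNORE_INDEX if tok == pad_id else tok for tok in input_ids]


def supervised_fraction(labels):
    """Share of positions that still contribute to the loss."""
    return sum(label != IGNORE_INDEX for label in labels) / len(labels)


max_length = 8  # shortened from 512 to keep the illustration readable

# A blank line tokenizes to zero tokens, so padding fills the whole sequence.
blank_row = [GPT2_EOS_ID] * max_length
# Hypothetical token ids standing in for a short line of text, followed by padding.
short_row = [101, 202, 303] + [GPT2_EOS_ID] * (max_length - 3)

for name, row in [("blank", blank_row), ("short", short_row)]:
    labels = mask_pad_labels(row)
    print(name, labels, supervised_fraction(labels))
```

If this is what is happening, two untested remedies are worth trying: dropping blank rows with `dataset.filter(...)` before tokenizing, and passing `label_names=["labels"]` to `TrainingArguments` in case the `Trainer` cannot infer the label column from the PEFT-wrapped model. Both are suggestions under the assumptions stated above, not confirmed fixes for this run.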
