<a href="https://colab.research.google.com/github/BF667/ipynb/blob/main/LLM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LLM Fine-tuning with TRL

This notebook demonstrates supervised fine-tuning of a causal language model with the TRL (Transformer Reinforcement Learning) library, training LoRA adapters on top of a 4-bit quantized base model.

In [None]:
# Install required packages
!pip install torch transformers datasets tokenizers trl peft accelerate bitsandbytes

In [None]:
import os
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    TrainingArguments,
    BitsAndBytesConfig
)
from trl import SFTTrainer
from peft import LoraConfig  # SFTTrainer applies the adapter itself when given peft_config

# Disable Weights & Biases
os.environ["WANDB_DISABLED"] = "true"

In [None]:
# Configuration
MODEL_NAME = "microsoft/DialoGPT-small"  # You can change this to any causal LM
DATASET_NAME = "timdettmers/openassistant-guanaco"  # Example dataset for instruction tuning
OUTPUT_DIR = "./results"
MAX_SEQ_LENGTH = 512

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # Set pad token

In [None]:
# Load and prepare dataset
dataset = load_dataset(DATASET_NAME)

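# This dataset already ships a single "text" column in which each conversation
# is written as "### Human: ...### Assistant: ...", so it needs no reformatting.
# For a dataset with separate "instruction" and "response" columns, build the
# "text" field with a mapping function like this one:
def format_instruction(example):
    return {"text": f"### Instruction: {example['instruction']}\n### Response: {example['response']}"}

if "instruction" in dataset["train"].column_names:
    dataset = dataset.map(format_instruction)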

print("Dataset sample:")
print(dataset['train'][0]['text'])

In [None]:
# Quantization configuration for memory efficiency (optional)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)
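
As a rough sanity check on what 4-bit loading saves, the sketch below estimates weight storage from a parameter count. The helper name `estimate_weight_memory_gb` and the 7-billion-parameter figure are illustrative assumptions, and the estimate ignores quantization constants, activations, gradients, and optimizer state.

```python
def estimate_weight_memory_gb(num_params: int, bits_per_param: float) -> float:
    """Approximate storage for the model weights alone, in gigabytes (10**9 bytes)."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9

# Hypothetical 7-billion-parameter model, chosen only for illustration.
n = 7_000_000_000
fp16_gb = estimate_weight_memory_gb(n, 16)
nf4_gb = estimate_weight_memory_gb(n, 4)
print(f"fp16: {fp16_gb:.1f} GB, 4-bit: {nf4_gb:.1f} GB")
```

Real memory use is higher than this figure, since training also needs room for activations and the adapter's optimizer state.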

In [None]:
# LoRA configuration for efficient fine-tuning
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["c_attn"]  # GPT-2 style attention projection used by DialoGPT; LLaMA-style models use names such as "q_proj" and "v_proj"
)
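
Because `target_modules` must match layer names that actually exist in the model, it can help to list the candidate names before choosing. The sketch below is one possible way to do that; the helper `candidate_target_modules` is hypothetical, and the class names it filters on (`Linear`, `Conv1D`, `Linear4bit`) are assumptions about what a loaded model may contain.

```python
from collections import Counter

def candidate_target_modules(named_layers, layer_types=("Linear", "Conv1D", "Linear4bit")):
    """Count the last name component of every layer whose class name is in layer_types.

    named_layers: iterable of (dotted_name, class_name) pairs.
    """
    counts = Counter()
    for name, class_name in named_layers:
        if class_name in layer_types:
            counts[name.split(".")[-1]] += 1
    return counts

# Hand-written pairs imitating one GPT-2 style block. With a real model, build
# the pairs with: [(n, type(m).__name__) for n, m in model.named_modules()]
example = [
    ("transformer.h.0.attn.c_attn", "Conv1D"),
    ("transformer.h.0.attn.c_proj", "Conv1D"),
    ("transformer.h.0.mlp.c_fc", "Conv1D"),
    ("transformer.h.0.mlp.c_proj", "Conv1D"),
    ("transformer.h.0.ln_1", "LayerNorm"),
]
counts = candidate_target_modules(example)
print(counts)
```

Names that appear once per transformer block are the usual candidates for `target_modules`.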

In [None]:
# Training arguments
training_args = TrainingArguments(
    output_dir=OUTPUT_DIR,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=3,
    logging_steps=10,
    save_steps=500,
    fp16=True,
    optim="paged_adamw_32bit",
    report_to="none",  # Disable all logging integrations including wandb
    remove_unused_columns=False,
)
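
The batch size and accumulation settings above combine into an effective batch size per optimizer step. The sketch below shows that arithmetic; the helper name and the 10,000-example dataset size are assumptions for illustration, and the step count is an approximation of what the trainer reports.

```python
import math

def steps_per_epoch(num_examples, per_device_batch, grad_accum, num_devices=1):
    """Return (effective batch size, approximate optimizer steps per epoch)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    return effective_batch, math.ceil(num_examples / effective_batch)

# Settings from the cell above; the dataset size is a made-up example value.
effective, steps = steps_per_epoch(num_examples=10_000, per_device_batch=4, grad_accum=4)
print(f"effective batch size: {effective}, steps per epoch: {steps}")
```

With packing enabled, the number of training examples is the number of packed blocks rather than the number of rows in the dataset, so the real step count will differ.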

In [None]:
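# Initialize SFTTrainer
# Note: these keyword arguments follow older TRL releases. Newer releases expect
# dataset_text_field, packing, and the sequence-length limit on SFTConfig and
# rename tokenizer= to processing_class=, so check the installed version's docs.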
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    dataset_text_field="text",
    max_seq_length=MAX_SEQ_LENGTH,
    tokenizer=tokenizer,
    peft_config=peft_config,
    packing=True,  # Efficiently pack multiple sequences
)
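
To illustrate what `packing=True` refers to, here is a simplified sketch of the idea: tokenized examples are joined into one stream with an end-of-sequence separator and cut into fixed-size blocks, so no block is padded. The function `pack_sequences` is hypothetical and is not TRL's implementation, which may handle leftovers and boundaries differently.

```python
def pack_sequences(token_lists, eos_id, block_size):
    """Concatenate token lists with an EOS separator and cut into equal blocks.

    Leftover tokens that do not fill a final block are dropped.
    """
    stream = []
    for tokens in token_lists:
        stream.extend(tokens)
        stream.append(eos_id)
    n_blocks = len(stream) // block_size
    return [stream[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

# Toy token ids; 0 stands in for the EOS token.
packed = pack_sequences([[5, 6, 7], [8, 9], [10, 11, 12, 13]], eos_id=0, block_size=4)
print(packed)
```

Note that the second block mixes the end of one example with the start of the next, which is the trade-off packing makes for throughput.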

In [None]:
# Start training
print("Starting training...")
trainer.train()

In [None]:
# Save the trained model
trainer.save_model()
tokenizer.save_pretrained(OUTPUT_DIR)
print("Model saved successfully!")

In [None]:
# Test the trained model
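def generate_response(prompt, max_new_tokens=100):
    # Tokenize with an attention mask so generate() does not have to infer padding
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    model.eval()  # Turn off dropout (including LoRA dropout) for inference
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,  # Counts generated tokens only, not the prompt
            num_return_sequences=1,
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )

    # The decoded text contains the prompt followed by the generated continuation
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return response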

# Test with a sample prompt in the "### Human: / ### Assistant:" format used by the training data
test_prompt = "### Human: Explain what machine learning is.### Assistant:"
response = generate_response(test_prompt)
print("Generated response:")
print(response)
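
The `temperature=0.7` setting rescales the logits before sampling. The sketch below shows the effect on a toy distribution; the function name and the three logit values are assumptions for illustration and are independent of the model above.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # toy values
p_default = softmax_with_temperature(logits, 1.0)
p_sharp = softmax_with_temperature(logits, 0.7)
print("T=1.0:", [round(p, 3) for p in p_default])
print("T=0.7:", [round(p, 3) for p in p_sharp])
```

A temperature below 1 concentrates probability on the highest-scoring tokens, which makes sampled text less varied.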

In [None]:
# Alternative: Using your own text file dataset
# Uncomment and modify the following code if you want to use your own text file:

# from datasets import load_dataset
# 
# # Load your custom text file
# custom_dataset = load_dataset("text", data_files={"train": "/content/your_data.txt"})
# 
# # The "text" loader yields one example per line in a "text" column, so this
# # function is a pass-through; edit it to add a prompt template if needed
# def format_custom_text(example):
#     return {"text": example["text"]}
# 
# custom_dataset = custom_dataset.map(format_custom_text)
# 
# # Then pass custom_dataset["train"] to SFTTrainer as train_dataset instead of dataset["train"]

## Key Features of This Notebook:

1. **TRL Integration**: Uses `SFTTrainer` from TRL for supervised fine-tuning
2. **No WandB**: Weights & Biases logging is disabled through `WANDB_DISABLED` and `report_to="none"`
3. **QLoRA Support**: Loads the base model in 4-bit NF4 precision with bitsandbytes
4. **LoRA Fine-tuning**: Trains small low-rank adapter matrices while the base weights stay frozen
5. **Instruction Tuning**: Trains on instruction-response text stored in a single `text` field
6. **Memory Optimized**: Uses gradient accumulation, fp16 mixed precision, and a paged AdamW optimizer
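
To make the LoRA point concrete, the sketch below counts the parameters a rank-`r` adapter adds to one weight matrix. The 4096 by 4096 layer size is a hypothetical value; `r=16` and `lora_alpha=32` are the values from the LoRA configuration cell.

```python
def lora_param_count(d_in, d_out, r):
    """Parameters in the two low-rank factors: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

d_in = d_out = 4096  # hypothetical layer size, for illustration only
r, lora_alpha = 16, 32

full = d_in * d_out
lora = lora_param_count(d_in, d_out, r)
scaling = lora_alpha / r  # factor applied to the adapter's output

print(f"full matrix: {full:,} params, adapter: {lora:,} params ({lora / full:.2%})")
print(f"adapter scaling factor: {scaling}")
```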

## To Use Your Own Data:

1. Upload your text file to Colab
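2. Uncomment the last cell, set the path in `data_files` to your file, and run it before the `SFTTrainer` cell
3. Pass `custom_dataset["train"]` as `train_dataset` when creating the `SFTTrainer`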
4. Adjust the formatting function as needed
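
The `text` loader treats every line of the file as a separate example, which is awkward when one example spans several lines. One possible workaround, sketched below, is to store each example as a JSON object on its own line. The helper `write_jsonl`, the file name `my_data.jsonl`, and the two sample pairs are assumptions for illustration; the template is the one from `format_instruction`.

```python
import json

def write_jsonl(pairs, path):
    """Write (instruction, response) pairs as one JSON object per line with a "text" field."""
    with open(path, "w", encoding="utf-8") as f:
        for instruction, response in pairs:
            text = f"### Instruction: {instruction}\n### Response: {response}"
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")

# Made-up sample data; replace with your own pairs.
pairs = [("Say hello.", "Hello!"), ("Add 2 and 3.", "5")]
write_jsonl(pairs, "my_data.jsonl")
```

A file written this way can be loaded with `load_dataset("json", data_files={"train": "my_data.jsonl"})`. The prompt used at inference time should follow the same template as the training text.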