<a href="https://colab.research.google.com/github/calmrocks/master-machine-learning-engineer/blob/main/GenAI/FineTune.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fine-tuning Large Language Models: A Practical Guide

This notebook demonstrates how to fine-tune a pre-trained language model on custom data. We'll use a smaller open-source model for demonstration purposes.

## Table of Contents
1. Setup and Dependencies
2. Loading the Pre-trained Model
3. Preparing the Dataset
4. Fine-tuning Configuration
5. Training Process
6. Evaluation
7. Saving and Loading the Fine-tuned Model

In [1]:
# Install required packages (Trainer with PyTorch also needs accelerate)
!pip install transformers datasets torch evaluate accelerate


## 1. Setup and Dependencies

We'll use the following libraries:
- `transformers`: Hugging Face's library for working with pre-trained models
- `datasets`: For data handling and preprocessing
- `torch`: Deep learning framework
- `evaluate`: For model evaluation

In [None]:
import math  # used for math.exp when computing perplexity

import numpy as np
import torch
import evaluate
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

## 2. Loading the Pre-trained Model

We'll use GPT-2 small (the `gpt2` checkpoint, about 124M parameters) as our base model. GPT-2 has no padding token of its own, so we reuse the end-of-sequence token for padding.

In [None]:
# Load model and tokenizer
model_name = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
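
The 124M figure can be checked by hand. The sketch below counts the parameters of a GPT-2 style decoder from its hyperparameters, assuming GPT-2 small's published configuration (50,257 vocabulary entries, 1,024 positions, hidden size 768, 12 blocks) and an output layer that shares weights with the token embedding. The function name `gpt2_param_count` is made up for this illustration.

```python
# Hyperparameters of GPT-2 small (the "gpt2" checkpoint)
VOCAB_SIZE = 50257   # BPE vocabulary size
N_POSITIONS = 1024   # maximum sequence length
N_EMBD = 768         # hidden size
N_LAYER = 12         # number of transformer blocks


def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Count parameters of a GPT-2 style decoder with a tied output layer."""
    # Token embeddings plus learned position embeddings
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Each layer norm has a scale and a bias vector
    layer_norm = 2 * n_embd
    # Attention: fused QKV projection, then output projection (weights + biases)
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4x the hidden size, then project back (weights + biases)
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    per_block = 2 * layer_norm + attention + mlp
    # One final layer norm after the last block; the LM head adds nothing (tied)
    return embeddings + n_layer * per_block + layer_norm


total = gpt2_param_count(VOCAB_SIZE, N_POSITIONS, N_EMBD, N_LAYER)
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

If the assumptions hold, this should agree with `sum(p.numel() for p in model.parameters())` on the loaded model.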

## 3. Preparing the Dataset

For this example, we'll use a 1,000-row slice of the WikiText-2 training split. Each row is tokenized and then padded or truncated to a fixed length of 128 tokens, which is the format the model expects.

In [None]:
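# Load dataset (example using a small subset of WikiText)
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1000]")

# WikiText contains many blank rows; drop them so no example is pure padding
dataset = dataset.filter(lambda example: len(example["text"].strip()) > 0)

def tokenize_function(examples):
    # Pad or truncate every example to exactly 128 tokens
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)

# Tokenize the dataset and drop the raw text column, which the model does not consume
tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=["text"])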
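
To make the effect of `padding="max_length"` and `truncation=True` concrete, here is a minimal pure-Python sketch of what happens to one list of token ids. It is an illustration only, not the tokenizer's actual implementation: `pad_or_truncate` is a hypothetical helper, the token ids are made up, and right-side padding is assumed.

```python
def pad_or_truncate(token_ids, max_length, pad_id):
    """Return (input_ids, attention_mask), both exactly max_length long."""
    kept = token_ids[:max_length]                    # truncation: drop the overflow
    n_pad = max_length - len(kept)
    input_ids = kept + [pad_id] * n_pad              # padding: fill up on the right
    attention_mask = [1] * len(kept) + [0] * n_pad   # 0 marks padded positions
    return input_ids, attention_mask


PAD_ID = 50256  # GPT-2's end-of-sequence id, reused as the pad token

# A short example gets padded, a long one gets cut (ids are made up)
short_ids, short_mask = pad_or_truncate([10, 11, 12], max_length=5, pad_id=PAD_ID)
long_ids, long_mask = pad_or_truncate([10, 11, 12, 13, 14, 15, 16], max_length=5, pad_id=PAD_ID)

print(short_ids, short_mask)
print(long_ids, long_mask)
```

Fixed-length examples are simple to batch, at the cost of spending compute on padding when most rows are short.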

## 4. Fine-tuning Configuration

We'll set up the training arguments that control the fine-tuning process. Key parameters include:
- Learning rate
- Number of epochs
- Batch size
- Training steps

In [None]:
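training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=5e-5,  # the Trainer default, stated explicitly
    warmup_steps=50,     # must stay well below the total step count (at most 375 here)
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
)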
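
These parameters interact: the number of optimizer steps follows from dataset size, batch size, and epochs, and the warmup has to fit inside it. The sketch below works through that arithmetic, assuming a single device, no gradient accumulation, and a linear warmup followed by linear decay (the Trainer's default schedule type). The helper names are hypothetical.

```python
import math


def total_training_steps(num_examples, batch_size, num_epochs):
    """Optimizer steps on one device without gradient accumulation."""
    return math.ceil(num_examples / batch_size) * num_epochs


def linear_schedule_lr(step, peak_lr, warmup_steps, total_steps):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, remaining)


# Up to 1,000 examples, batch size 8, 3 epochs
steps = total_training_steps(num_examples=1000, batch_size=8, num_epochs=3)
print(f"total steps: {steps}")

# Compare a warmup that fits inside the run with one that is longer than the run
peak_lr = 5e-5
highest_lr = {}
for warmup in (50, 500):
    highest_lr[warmup] = max(
        linear_schedule_lr(s, peak_lr, warmup, steps) for s in range(steps)
    )
    print(f"warmup={warmup}: highest learning rate reached = {highest_lr[warmup]:.2e}")
```

When the warmup is longer than the whole run, training ends while the learning rate is still ramping up, so the configured peak is never reached.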

## 5. Training Process

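Now we'll create a `Trainer` instance and start fine-tuning. A causal language model only returns a loss when the batch contains `labels`, so we pass a data collator that builds them from `input_ids`.

In [None]:
from transformers import DataCollatorForLanguageModeling

# mlm=False selects causal (next-token) language modeling; labels are copied from input_ids
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
    tokenizer=tokenizer,  # recent transformers releases call this argument processing_class
)

# Start training
trainer.train()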
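
A causal language model learns to predict each token from the ones before it, so training needs a `labels` sequence alongside `input_ids`. The sketch below shows one common way to build those labels in plain Python: copy the inputs and mark padded positions with `-100`, the index the cross-entropy loss skips. It is an illustration under assumptions, not the library's code; in particular it masks by attention mask, whereas `DataCollatorForLanguageModeling` masks by pad token id. Helper names and token ids are made up.

```python
IGNORE_INDEX = -100  # positions with this label are skipped by the loss


def build_causal_lm_labels(input_ids, attention_mask):
    """Copy input_ids into labels, masking padded positions."""
    return [
        tok if keep == 1 else IGNORE_INDEX
        for tok, keep in zip(input_ids, attention_mask)
    ]


def next_token_pairs(input_ids, labels):
    """(context token, target token) pairs after the one-position shift."""
    # The model shifts internally: position i is trained to predict label i + 1
    return [
        (ctx, tgt)
        for ctx, tgt in zip(input_ids[:-1], labels[1:])
        if tgt != IGNORE_INDEX
    ]


# Three real tokens followed by two pad tokens (ids are made up)
input_ids = [10, 11, 12, 50256, 50256]
attention_mask = [1, 1, 1, 0, 0]

labels = build_causal_lm_labels(input_ids, attention_mask)
pairs = next_token_pairs(input_ids, labels)
print(labels)
print(pairs)
```

Only the pairs that survive the mask contribute to the loss, so padding does not distort training.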

## 6. Evaluation

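After training, we'll evaluate the model on a held-out slice of the test split and report perplexity, the exponential of the average cross-entropy loss. Lower is better.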

In [None]:
# Load test dataset
test_dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="test[:100]")
tokenized_test_dataset = test_dataset.map(tokenize_function, batched=True)

# Evaluate
eval_results = trainer.evaluate(eval_dataset=tokenized_test_dataset)
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")
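
Perplexity is the exponential of the mean negative log-likelihood per token, which is why the cell above applies `math.exp` to `eval_loss`. A minimal sketch of the definition follows, assuming natural-log probabilities; the `perplexity` helper and the probabilities are made up for illustration.

```python
import math


def perplexity(token_log_probs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 1/4 to every correct token
uniform_log_probs = [math.log(1 / 4)] * 10
uniform_ppl = perplexity(uniform_log_probs)
print(f"uniform over 4 choices: {uniform_ppl:.2f}")

# A more confident model gets a lower perplexity
confident_log_probs = [math.log(0.9)] * 10
confident_ppl = perplexity(confident_log_probs)
print(f"probability 0.9 per token: {confident_ppl:.2f}")
```

One way to read the number: a perplexity of 4 means the model is, on average, as uncertain as if it were choosing uniformly among 4 tokens.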

## 7. Saving and Loading the Fine-tuned Model

Finally, we'll save our fine-tuned model and show how to load it back.

In [None]:
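# Save the model weights and config
model_path = "./fine_tuned_gpt2"
trainer.save_model(model_path)
# Save the tokenizer explicitly so it can be reloaded from the same directory
tokenizer.save_pretrained(model_path)

# Load the fine-tuned model (if needed later)
loaded_model = AutoModelForCausalLM.from_pretrained(model_path)
loaded_tokenizer = AutoTokenizer.from_pretrained(model_path)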

## Testing the Fine-tuned Model

Let's test our fine-tuned model with some example prompts.

In [None]:
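def generate_text(prompt, max_length=100):
    # Move the inputs to the model's device (the model sits on the GPU after training there)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,  # passes both input_ids and attention_mask
        max_length=max_length,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        do_sample=True,  # temperature only takes effect when sampling
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)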

# Test the model
prompt = "The artificial intelligence revolution"
generated_text = generate_text(prompt)
print(f"Prompt: {prompt}")
print(f"Generated: {generated_text}")
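
The `temperature` setting rescales the model's logits before a token is sampled, so it only matters when tokens are sampled rather than picked greedily. The sketch below shows the effect on a toy distribution in plain Python. The logits are made up and `softmax_with_temperature` is a hypothetical helper, not part of `transformers`.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens

probs_default = softmax_with_temperature(logits, temperature=1.0)
probs_sharper = softmax_with_temperature(logits, temperature=0.7)

print([round(p, 3) for p in probs_default])
print([round(p, 3) for p in probs_sharper])
```

Temperatures below 1 concentrate probability on the highest-scoring tokens, which gives more conservative text. Temperatures above 1 flatten the distribution and increase variety.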

## Conclusion

In this notebook, we've covered:
1. Setting up the necessary dependencies
2. Loading a pre-trained model
3. Preparing and preprocessing data
4. Configuring and executing the fine-tuning process
5. Evaluating the model's performance
6. Saving and loading the fine-tuned model
7. Testing the model with example prompts

Remember that this is a basic example. For your own use case, expect to adjust the dataset size, sequence length, learning rate, batch size, and number of epochs.