
### **Fine-Tuning Meta's Llama 3.2 1B Model on a Meta-Review Summarization Task**
This notebook demonstrates fine-tuning the Meta Llama 3.2 1B model (`meta-llama/Llama-3.2-1B`) to summarize academic paper meta-reviews. We'll go through the entire pipeline, from setting up the environment to evaluating the model's performance.


**First, let's install the necessary libraries**

In [1]:
!pip install -qU transformers datasets evaluate rouge_score trl peft bitsandbytes accelerate xformers bert-score nltk

In [2]:
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments, Trainer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from trl import SFTTrainer
import evaluate
import numpy as np
from tqdm.auto import tqdm
from nltk.translate.bleu_score import sentence_bleu
from bert_score import score as bert_score
from accelerate import Accelerator
from huggingface_hub import notebook_login
from transformers import pipeline
import os
import plotly.express as px
import plotly.graph_objects as go

In [3]:
# Authenticate with the Hugging Face Hub (the Llama weights are gated)
notebook_login()


In [4]:
# Cap the size of blocks the CUDA caching allocator may split, to reduce memory fragmentation
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Initialize accelerator
accelerator = Accelerator()

In [5]:
# Load and explore the dataset
dataset = load_dataset("zqz979/meta-review")
print(f"Dataset size: {len(dataset['train'])} train, {len(dataset['validation'])} validation, {len(dataset['test'])} test")

print("\nSample Meta-Review:")
print(dataset['train'][0]['Input'][:500] + "...")
print("\nSample Summary:")
print(dataset['train'][0]['Output'])

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Dataset size: 7692 train, 1648 validation, 1649 test

Sample Meta-Review:
In this paper, the author investigates how to utilize large-scale human video to train dexterous robot manipulation skills. To leverage the information from the Internet videos, the author proposes a handful of techniques to pre-process the video data to extract the action information. Then the network is trained on the extracted hand data and deployed to the real robot with some human demonstration collected by teleoperation for fine-tuning. Experiments show that the proposed pipeline can solve...

Sample Summary:
This paper studies how to learn dexterous manipulation from human videos.    In the initial review, the reviewer appreciated the direction and real-world experiment but also raised  concerns about the need of special sensor for tracking. During rebuttal, the authors effectively addressed this concern by providing additional experiment results, and reviewers were satisfied with the response.  AC would l

In [7]:
# Load tokenizer
model_name = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# Configure quantization for faster training and lower memory usage
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16  # Use bf16 for computation
)

# Load model with 4-bit quantization
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
    token=True  # use the token stored by notebook_login(); replaces the deprecated use_auth_token
)

# Enable gradient checkpointing and disable caching for memory efficiency
model.config.use_cache = False
model.config.pretraining_tp = 1
model.gradient_checkpointing_enable()

# Leave the base weights frozen here: prepare_model_for_kbit_training freezes them
# and get_peft_model marks only the LoRA adapter weights as trainable.

RuntimeError: CUDA is required but not available for bitsandbytes. Please consider installing the multi-platform enabled version of bitsandbytes, which is currently a work in progress. Please check currently supported platforms and installation instructions at https://huggingface.co/docs/bitsandbytes/main/en/installation#multi-backend
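
The `RuntimeError` above comes from requesting 4-bit bitsandbytes quantization on a runtime without a CUDA GPU. A minimal sketch of a guard that could run before the model is loaded follows. `cuda_available` and `quantization_plan` are hypothetical helper names, not part of any library, and the float32 fallback is an assumption meant for smoke tests rather than something this notebook does.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if torch is importable and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def quantization_plan(has_cuda: bool) -> dict:
    """Hypothetical helper: decide whether 4-bit loading should be attempted."""
    if has_cuda:
        # Mirrors the notebook's settings: NF4 weights with bf16 compute.
        return {"load_in_4bit": True, "quant_type": "nf4", "compute_dtype": "bfloat16"}
    # Assumed fallback: skip bitsandbytes entirely when no GPU is present.
    return {"load_in_4bit": False, "quant_type": None, "compute_dtype": "float32"}


plan = quantization_plan(cuda_available())
print(plan)
```

In Colab, switching the runtime type to a GPU before running the loading cell avoids the error altogether.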

In [None]:
# Improved prompt
def generate_summary_prompt(meta_review):
    return f"Summarize the following meta-review of an academic paper. Focus on the overall assessment, strengths, weaknesses, and final decision:\n\n{meta_review}\n\nSummary:"

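# Preprocess function: a causal LM needs labels aligned token-for-token with input_ids,
# so prompt and summary are concatenated and only the summary tokens are scored.
def preprocess_function(examples, max_length=1024, max_target_length=128):
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for review, summary in zip(examples["Input"], examples["Output"]):
        target = tokenizer(" " + summary, add_special_tokens=False, truncation=True,
                           max_length=max_target_length - 1)["input_ids"] + [tokenizer.eos_token_id]
        prompt = tokenizer(generate_summary_prompt(review), truncation=True,
                           max_length=max_length - len(target))["input_ids"]
        pad = max_length - len(prompt) - len(target)
        batch["input_ids"].append(prompt + target + [tokenizer.pad_token_id] * pad)
        batch["attention_mask"].append([1] * (len(prompt) + len(target)) + [0] * pad)
        # -100 marks prompt and padding positions that the loss should ignore
        batch["labels"].append([-100] * len(prompt) + target + [-100] * pad)
    return batch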

# Tokenize datasets
tokenized_train = dataset['train'].map(preprocess_function, batched=True, remove_columns=dataset['train'].column_names)
tokenized_eval = dataset['validation'].map(preprocess_function, batched=True, remove_columns=dataset['validation'].column_names)
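
For a causal language model, `labels` must line up position by position with `input_ids`; positions that should not contribute to the loss are set to `-100`, the index that PyTorch's cross-entropy loss ignores by default. The sketch below illustrates that layout on made-up token IDs. `build_example` and the ID values are hypothetical, and no tokenizer is involved.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def build_example(prompt_ids, summary_ids, eos_id, pad_id, max_length):
    """Concatenate prompt and summary, pad to max_length, and mask non-summary labels."""
    target = summary_ids + [eos_id]
    # Keep the whole target; trim the prompt if the pair does not fit.
    prompt = prompt_ids[: max_length - len(target)]
    pad = max_length - len(prompt) - len(target)
    return {
        "input_ids": prompt + target + [pad_id] * pad,
        "attention_mask": [1] * (len(prompt) + len(target)) + [0] * pad,
        "labels": [IGNORE_INDEX] * len(prompt) + target + [IGNORE_INDEX] * pad,
    }


example = build_example(prompt_ids=[11, 12, 13], summary_ids=[21, 22], eos_id=2, pad_id=0, max_length=8)
print(example["input_ids"])  # [11, 12, 13, 21, 22, 2, 0, 0]
print(example["labels"])     # [-100, -100, -100, 21, 22, 2, -100, -100]
```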

In [None]:
# Configure LoRA
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj",]
)
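
LoRA leaves each targeted weight matrix frozen and trains two low-rank factors alongside it, so a layer mapping `d_in` to `d_out` features gains `r * (d_in + d_out)` trainable parameters. The sketch below estimates the adapter size for this configuration. The layer shapes are assumptions based on the published Llama 3.2 1B architecture (hidden size 2048, MLP size 8192, 16 layers, key/value projections of width 512) and should be checked against `model.config`; `lora_params` is a hypothetical helper.

```python
def lora_params(r: int, d_in: int, d_out: int) -> int:
    """Trainable parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumed (d_in, d_out) shapes per decoder layer; verify against model.config.
ASSUMED_SHAPES = {
    "q_proj": (2048, 2048),
    "k_proj": (2048, 512),
    "v_proj": (2048, 512),
    "o_proj": (2048, 2048),
    "gate_proj": (2048, 8192),
    "up_proj": (2048, 8192),
    "down_proj": (8192, 2048),
}
NUM_LAYERS = 16  # assumed
RANK = 16        # r from the LoraConfig above

per_layer = sum(lora_params(RANK, d_in, d_out) for d_in, d_out in ASSUMED_SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"per layer: {per_layer:,}  total: {total:,}")  # per layer: 704,512  total: 11,272,192
```

Under these assumed shapes the adapters come to roughly 11.3M parameters. Calling `model.print_trainable_parameters()` on the PEFT-wrapped model reports the actual count.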

In [None]:
# Prepare model for training
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)

training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=6,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=0,
    logging_steps=500,
    learning_rate=2e-4,
    weight_decay=0.001,
    bf16=True,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="cosine",
    report_to="tensorboard",
    gradient_checkpointing=True
)
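
With `max_steps=-1`, the number of optimizer steps follows from the dataset size, batch size, and epoch count, and `warmup_ratio` is converted to a step count from that total. A rough sketch of the arithmetic, assuming a single device and the 7,692 training examples reported above. `training_schedule` and `cosine_lr` are hypothetical helpers, and the rounding reflects my reading of how the Hugging Face `Trainer` behaves rather than a guarantee.

```python
import math


def training_schedule(num_examples, batch_size, grad_accum, epochs, warmup_ratio, num_devices=1):
    """Estimate optimizer steps per epoch, total steps, and warmup steps."""
    batches_per_epoch = math.ceil(num_examples / (batch_size * num_devices))
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps


def cosine_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup followed by cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(warmup_steps, 1)
    progress = (step - warmup_steps) / max(total_steps - warmup_steps, 1)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


per_epoch, total, warmup = training_schedule(7692, batch_size=8, grad_accum=1, epochs=6, warmup_ratio=0.03)
print(per_epoch, total, warmup)  # 962 5772 174
print(cosine_lr(warmup, total, warmup, peak_lr=2e-4))  # learning rate at the end of warmup
```

At `logging_steps=500`, a run of this length would log about eleven times.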

In [None]:
# Evaluation metrics
rouge = evaluate.load('rouge')
meteor = evaluate.load('meteor')

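def compute_metrics(pred):
    labels_ids = pred.label_ids
    pred_ids = pred.predictions

    # The trainer hands over raw logits (batch, seq, vocab); reduce them to token ids
    if pred_ids.ndim == 3:
        pred_ids = pred_ids.argmax(axis=-1)
    # Logits at position t predict token t + 1, so align predictions with the shifted labels
    pred_ids, labels_ids = pred_ids[:, :-1], labels_ids[:, 1:]
    # Swap ignored positions (-100) for the pad token so both sides decode cleanly
    mask = labels_ids != -100
    pred_ids = np.where(mask, pred_ids, tokenizer.pad_token_id)
    labels_ids = np.where(mask, labels_ids, tokenizer.pad_token_id)

    pred_str = tokenizer.batch_decode(pred_ids, skip_special_tokens=True)
    label_str = tokenizer.batch_decode(labels_ids, skip_special_tokens=True)

    rouge_output = rouge.compute(predictions=pred_str, references=label_str, use_stemmer=True)
    meteor_output = meteor.compute(predictions=pred_str, references=label_str)

    # evaluate's ROUGE metric returns aggregated F-measures as plain floats
    return {
        'rouge1': rouge_output['rouge1'],
        'rouge2': rouge_output['rouge2'],
        'rougeL': rouge_output['rougeL'],
        'meteor': meteor_output['meteor'],
    }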

# Initialize trainer
trainer = SFTTrainer(
    model=model,
    args=training_arguments,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

# Train model
print("Starting model training...")
trainer.train()

In [None]:
# Save the fine-tuned model
trainer.save_model("./fine_tuned_model")
print("Fine-tuned model saved.")

In [None]:
# Evaluation on test set
test_dataset = dataset['test']

def generate_summary(review):
    prompt = generate_summary_prompt(review)
    inputs = tokenizer(prompt, return_tensors="pt", max_length=1024, truncation=True).to(model.device)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=150, do_sample=True, top_p=0.95, top_k=50,
                                pad_token_id=tokenizer.pad_token_id)
    # Decode only the newly generated tokens so the prompt is never echoed into the summary
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

# Generate summaries
print("Generating summaries...")
generated_summaries = []
references = []

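# Iterate over reviews and references together (the original loop used an undefined index `i`)
for review, reference in tqdm(zip(test_dataset['Input'], test_dataset['Output']), total=len(test_dataset)):
    summary = generate_summary(review)
    generated_summaries.append(summary)
    references.append(reference)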

# Evaluate ROUGE
rouge_scores = rouge.compute(predictions=generated_summaries, references=references, use_stemmer=True)
print("Test Set ROUGE Scores:", rouge_scores)

# Evaluate METEOR
meteor_scores = meteor.compute(predictions=generated_summaries, references=references)
print("Test Set METEOR Score:", meteor_scores)

# Evaluate BLEU
bleu_scores = [sentence_bleu([ref.split()], gen.split()) for ref, gen in zip(references, generated_summaries)]
print("Test Set BLEU Score:", np.mean(bleu_scores))

# Evaluate BERTScore
_, _, f1 = bert_score(generated_summaries, references, lang="en")
print("Test Set BERTScore:", torch.mean(f1).item())
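
ROUGE-1 measures unigram overlap between a generated summary and its reference. The sketch below is a simplified illustration of the F-measure on whitespace tokens; it omits the stemming and tokenization rules that the `rouge_score` package applies, so its numbers will not match the library's exactly. `rouge1_f` is a hypothetical helper.

```python
from collections import Counter


def rouge1_f(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 on lowercased whitespace tokens (no stemming)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    overlap = sum((pred_counts & ref_counts).values())  # clipped matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


pred = "the paper is accepted"
ref = "the paper is accepted after rebuttal"
print(round(rouge1_f(pred, ref), 3))  # 0.8
```

Here all four predicted words appear in the six-word reference, so precision is 1.0, recall is 4/6, and the F-measure is 0.8.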

In [None]:
# Post-processing function
def postprocess_summary(summary, max_length=150):
    # Truncate to max_length characters (not tokens)
    summary = summary[:max_length]

    # Ensure the summary ends with a complete sentence
    last_period = summary.rfind('.')
    if last_period != -1:
        summary = summary[:last_period + 1]

    # Remove any trailing whitespace
    summary = summary.strip()

    return summary

# Apply post-processing
processed_summaries = [postprocess_summary(summary) for summary in generated_summaries]

# Print sample summaries
print("\nSample Summaries:")
for i in range(3):
    print(f"\nOriginal Summary: {generated_summaries[i]}")
    print(f"Processed Summary: {processed_summaries[i]}")
    print(f"Reference Summary: {references[i]}")
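
`postprocess_summary` cuts at 150 characters, whereas generation is capped at 150 new tokens, so a summary that uses its full token budget will usually be much longer than the character limit. A possible variant that trims to a word budget and then backs off to the last complete sentence is sketched below. `truncate_to_sentence` and its `max_words` default are hypothetical, not part of the notebook.

```python
def truncate_to_sentence(summary: str, max_words: int = 100) -> str:
    """Keep at most max_words words, then back off to the last full sentence."""
    words = summary.split()
    if len(words) <= max_words:
        return summary.strip()
    clipped = " ".join(words[:max_words])
    last_period = clipped.rfind(".")
    # Fall back to the raw clip when no sentence boundary is found.
    return clipped[: last_period + 1] if last_period != -1 else clipped


text = "The paper is solid. Reviewers agreed on acceptance. Minor issues remain in the"
print(truncate_to_sentence(text, max_words=10))
# The paper is solid. Reviewers agreed on acceptance.
```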