# Lightweight Fine-Tuning Project

TODO: In this cell, describe your choices for each of the following

* PEFT technique: none applied in the cells below (`GPT2ForSequenceClassification` is the model class, not a PEFT method)
* Model: gpt2
* Evaluation approach: classification accuracy on the imdb test split
* Fine-tuning dataset: imdb

## Loading and Evaluating a Foundation Model

TODO: In the cells below, load your chosen pre-trained Hugging Face model and evaluate its performance prior to fine-tuning. This step includes loading an appropriate tokenizer and dataset.

In [1]:
from datasets import load_dataset

dataset = load_dataset("imdb")

In [2]:
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

def tokenize_function(examples):
    # Pad or truncate every review to exactly 512 tokens.
    # map() stores plain lists, so return_tensors is not needed here.
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=512)

tokenized_dataset = dataset.map(tokenize_function, batched=True)
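
To make the effect of `padding="max_length"` and `truncation=True` concrete, here is a minimal plain-Python sketch of what happens to a single sequence of token ids. The helper `pad_or_truncate` is hypothetical and is not part of `transformers`; it assumes right-side padding (the GPT-2 tokenizer default) and uses id 50256, GPT-2's end-of-text token, as the pad id because the cell above sets `pad_token = eos_token`.

```python
from typing import Dict, List

GPT2_EOS_ID = 50256  # GPT-2's end-of-text token id, reused here as the pad id


def pad_or_truncate(token_ids: List[int], max_length: int,
                    pad_id: int = GPT2_EOS_ID) -> Dict[str, List[int]]:
    """Hypothetical helper: mimic padding="max_length" plus truncation=True."""
    kept = token_ids[:max_length]      # truncation: drop tokens past max_length
    n_pad = max_length - len(kept)     # padding: number of slots still empty
    return {
        "input_ids": kept + [pad_id] * n_pad,              # pad on the right
        "attention_mask": [1] * len(kept) + [0] * n_pad,   # 0 marks padding
    }


# Toy token ids, not real tokenizer output
short = pad_or_truncate([10, 11, 12], max_length=5)
long = pad_or_truncate(list(range(8)), max_length=5)
print(short)
print(long)
```

Every example ends up with the same length, which is what lets the examples be stacked into one rectangular batch later on.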


## Performing Parameter-Efficient Fine-Tuning

TODO: In the cells below, create a PEFT model from your loaded model, run a training loop, and save the PEFT model weights.
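
As a rough illustration of why a PEFT method such as LoRA is "lightweight", the sketch below counts the parameters LoRA would add to GPT-2's fused attention projection (`c_attn`, which maps 768 to 2304 features in the small `gpt2` checkpoint). The rank of 8 and the choice of `c_attn` as the only adapted layer are assumptions made for this example, not settings taken from this notebook; in practice the `peft` library's `LoraConfig` and `get_peft_model` would create the adapter.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


HIDDEN = 768     # hidden size of the small gpt2 checkpoint
N_LAYERS = 12    # transformer blocks in the small gpt2 checkpoint
RANK = 8         # hypothetical LoRA rank, chosen only for illustration

per_layer = lora_params(HIDDEN, 3 * HIDDEN, RANK)   # adapter on c_attn (768 -> 2304)
frozen_per_layer = HIDDEN * 3 * HIDDEN              # c_attn weight matrix, bias excluded
total = per_layer * N_LAYERS

print(f"LoRA parameters per layer: {per_layer}")
print(f"LoRA parameters in total:  {total}")
print(f"Share of each c_attn weight: {per_layer / frozen_per_layer:.2%}")
```

Under these assumptions the adapter trains a small fraction of the weights it modifies, while the original matrices stay frozen.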

In [3]:
from transformers import GPT2ForSequenceClassification

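model = GPT2ForSequenceClassification.from_pretrained("gpt2", num_labels=2)
# GPT-2 defines no padding token; reuse the tokenizer's pad token so batches larger than 1 can be classified
model.config.pad_token_id = tokenizer.pad_token_id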

Some weights of GPT2ForSequenceClassification were not initialized from the model checkpoint at gpt2 and are newly initialized: ['score.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [4]:
from transformers import DataCollatorWithPadding, TrainingArguments, Trainer

training_args = TrainingArguments(
    output_dir='./results',            # Directory where the results are saved
    num_train_epochs=3,                # Total number of training epochs
    per_device_train_batch_size=8,     # Batch size per device during training
    per_device_eval_batch_size=8,      # Batch size for evaluation
    warmup_steps=500,                  # Number of warmup steps for learning rate scheduler
    weight_decay=0.01,                 # Weight decay coefficient for the optimizer
    logging_dir='./logs',              # Directory for storing logs
    evaluation_strategy="epoch",       # Evaluate at the end of every epoch
    save_strategy="epoch",             # Save the model at the end of every epoch
    load_best_model_at_end=True,       # Load the best model at the end of training
    remove_unused_columns=False,       # Keep every dataset column, including the raw "text" strings
)
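
To see what `warmup_steps=500` means over three epochs, the following sketch works out the step count and the learning rate at a few points. It assumes a single device, no gradient accumulation, the 25,000-example imdb training split, and the `Trainer` defaults of a linear schedule with a base learning rate of 5e-5. The function `linear_schedule_lr` is a hypothetical stand-in written for this example, not a library call.

```python
import math


def linear_schedule_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TRAIN_EXAMPLES = 25_000   # size of the imdb training split
BATCH_SIZE = 8            # per_device_train_batch_size, assuming one device
EPOCHS = 3
BASE_LR = 5e-5            # assumed: the TrainingArguments default
WARMUP = 500

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

for step in (0, 250, 500, total_steps // 2, total_steps):
    lr = linear_schedule_lr(step, BASE_LR, WARMUP, total_steps)
    print(f"step {step:>5}: lr = {lr:.2e}")
```

With these numbers the warmup covers only the first few percent of training, so most updates happen during the decay phase.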

In [5]:
import numpy as np

def compute_metrics(eval_pred):
    # eval_pred unpacks into the model's logits and the reference labels
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)
    return {"accuracy": (predictions == labels).mean()}

trainer = Trainer(
    model=model,                              # GPT-2 classifier loaded above (no PEFT adapter applied)
    args=training_args,                       # Training arguments
    train_dataset=tokenized_dataset["train"], # Training dataset
    eval_dataset=tokenized_dataset["test"],   # Evaluation dataset
    tokenizer=tokenizer,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
    compute_metrics=compute_metrics,
)
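
The accuracy metric above takes the arg-max of each row of logits and compares it with the label. The same computation is sketched below in plain Python on made-up logits, so the arithmetic can be checked by hand; the helper names and the numbers are illustrative only.

```python
from typing import List, Sequence


def argmax(row: Sequence[float]) -> int:
    """Index of the largest value in a row of logits."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(logits: List[Sequence[float]], labels: List[int]) -> float:
    """Fraction of rows whose arg-max class equals the reference label."""
    predictions = [argmax(row) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical logits for four reviews (column 0 = negative, column 1 = positive)
toy_logits = [[2.0, -1.0], [0.1, 0.3], [1.5, 1.4], [-0.2, 0.9]]
toy_labels = [0, 1, 1, 1]

toy_accuracy = accuracy(toy_logits, toy_labels)
print(toy_accuracy)
```

Here the third review is misclassified (its larger logit is in column 0 while its label is 1), so three of the four predictions are correct.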

In [6]:
trainer.train()

ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True' to have batched tensors with the same length. Perhaps your features (`text` in this case) have excessive nesting (inputs type `list` where type `int` is expected).
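
The error message names the `text` column. A likely cause is that `remove_unused_columns=False` leaves the raw review strings in every batch, and the data collator cannot turn strings into a tensor. The sketch below is a small diagnostic in plain Python that flags such columns in one example; `non_tensor_columns` is a hypothetical helper, and the token ids in the sample row are made up.

```python
from numbers import Number
from typing import Any, Dict, List


def is_tensorizable(value: Any) -> bool:
    """True for numbers and (nested) lists of numbers, False for strings and other objects."""
    if isinstance(value, Number):
        return True
    if isinstance(value, (list, tuple)):
        return all(is_tensorizable(v) for v in value)
    return False


def non_tensor_columns(example: Dict[str, Any]) -> List[str]:
    """Names of the columns a padding collator could not convert to a tensor."""
    return [name for name, value in example.items() if not is_tensorizable(value)]


# One row shaped like the tokenized dataset (token ids are made up)
row = {
    "text": "A great film.",
    "label": 1,
    "input_ids": [32, 1049, 2646, 13],
    "attention_mask": [1, 1, 1, 1],
}
bad = non_tensor_columns(row)
print(bad)
```

If this diagnosis holds, two possible fixes are to drop the column when tokenizing (`dataset.map(tokenize_function, batched=True, remove_columns=["text"])`) or to leave `remove_unused_columns` at its default of `True` so the `Trainer` drops it automatically.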

In [None]:
trainer.evaluate()

## Performing Inference with a PEFT Model

TODO: In the cells below, load the saved PEFT model weights and evaluate the performance of the trained PEFT model. Be sure to compare the results to the results from prior to fine-tuning.

In [None]:
from transformers import GPT2Tokenizer, GPT2ForSequenceClassification
import torch

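# Load the fine-tuned model and tokenizer.
# Assumes both were saved directly into ./results (for example with trainer.save_model("./results"));
# by default the Trainer writes checkpoints to subdirectories of output_dir.
tokenizer = GPT2Tokenizer.from_pretrained("./results")
model = GPT2ForSequenceClassification.from_pretrained("./results")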

def predict_sentiment(text):
    # Encode the input text
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

    # Perform inference
    with torch.no_grad():
        outputs = model(**inputs)

    # Extract logits
    logits = outputs.logits

    # Convert to probabilities (softmax)
    probs = torch.softmax(logits, dim=1)

    # In the imdb dataset, label 0 is negative and label 1 is positive
    sentiment = "Positive" if torch.argmax(probs, dim=1).item() == 1 else "Negative"
    confidence = torch.max(probs).item()

    return sentiment, confidence

# Example usage
text = "This movie was an amazing journey from start to finish. Loved every moment!"
sentiment, confidence = predict_sentiment(text)
print(f"Sentiment: {sentiment}, Confidence: {confidence:.2f}")
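
The reported confidence is the largest softmax probability. As a worked example, the sketch below applies a numerically stable softmax to one pair of hypothetical logits in plain Python; the logit values are invented for illustration and are not model output.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [-1.2, 2.3]           # hypothetical [negative, positive] logits
example_probs = softmax(example_logits)

predicted = example_probs.index(max(example_probs))
example_sentiment = "Positive" if predicted == 1 else "Negative"
example_confidence = max(example_probs)
print(f"Sentiment: {example_sentiment}, Confidence: {example_confidence:.2f}")
```

Because softmax depends only on the difference between logits, a large gap between the two values yields a confidence close to 1, which should not be read as a calibrated probability.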