# **GPT-2 Sentiment-Controlled Persian Review: Guided Demo**

This notebook walks through the from-scratch GPT-2 project for generating sentiment-controlled Persian food reviews. It uses the scripts in the `scripts/` directory to train the model and generate text.

**Note:** Before running this notebook, install all dependencies from `requirements.txt` in your virtual environment.

## 1. Setup and Data Download

First, we need to authenticate with Kaggle and Hugging Face to download the dataset and tokenizer.

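1.  **Hugging Face:** Run `huggingface-cli login` in your terminal and enter your access token. This is required for the `meta-llama/Llama-3.3-70B-Instruct` tokenizer, which lives in a gated repository, so your account must also have been granted access on its model page.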
2.  **Kaggle:** Make sure your `kaggle.json` API token is set up. You can download it from your Kaggle account page and place it in `~/.kaggle/kaggle.json`.

Once you are authenticated, the `train.py` script downloads the data automatically.
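Before starting a long training run, it can help to confirm that the Kaggle token is where the Kaggle client expects it. The following is a minimal sketch, not part of the project's scripts: the helper name `kaggle_credentials_ok` is hypothetical, and it only checks that the file exists and contains the `username` and `key` fields of a standard `kaggle.json`.

```python
import json
from pathlib import Path


def kaggle_credentials_ok(path=Path.home() / ".kaggle" / "kaggle.json"):
    """Return True if `path` is a JSON file with 'username' and 'key' fields.

    Hypothetical helper for this demo; it does not contact Kaggle.
    """
    path = Path(path)
    if not path.is_file():
        return False
    try:
        data = json.loads(path.read_text())
    except (ValueError, OSError):
        # Unreadable file or invalid JSON.
        return False
    return isinstance(data, dict) and {"username", "key"} <= data.keys()


print("Kaggle credentials found:", kaggle_credentials_ok())
```

If this prints `False`, download a fresh token from your Kaggle account page and place it at `~/.kaggle/kaggle.json` as described above.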

## 2. Training the Model

Now, let's run the training script. We will train for just **5 epochs** with a **batch size of 32** for this demo.

This script will:
1.  Download the Snappfood dataset (if not present).
2.  Load the Llama 3.3 tokenizer and add sentiment tokens.
3.  Build the from-scratch GPT-2 model.
4.  Run training and evaluation for 5 epochs.
5.  Save the best model to `models/best_gpt2_model.pt`.
6.  Save loss plots to `models/training_loss_plots.png`.
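Step 2 is what makes the model controllable: each review is paired with a sentiment token, so the model learns to continue text that matches that token. The sketch below illustrates the idea on a toy vocabulary. It is an assumption about the general technique, not the code in `train.py`: the token strings `<|positive|>` and `<|negative|>` and both helper functions are hypothetical.

```python
# Hypothetical control tokens; the project's actual token strings may differ.
SENTIMENT_TOKENS = {"positive": "<|positive|>", "negative": "<|negative|>"}


def add_sentiment_tokens(vocab):
    """Add the control tokens to a token -> id mapping and return their ids."""
    ids = {}
    for token in SENTIMENT_TOKENS.values():
        if token not in vocab:
            vocab[token] = len(vocab)  # next free id
        ids[token] = vocab[token]
    return ids


def format_example(review, sentiment):
    """Prefix a review with its sentiment token, as seen during training."""
    if sentiment not in SENTIMENT_TOKENS:
        raise ValueError(f"unknown sentiment: {sentiment!r}")
    return f"{SENTIMENT_TOKENS[sentiment]} {review}"


toy_vocab = {"food": 0, "good": 1, "bad": 2}
print(add_sentiment_tokens(toy_vocab))
print(format_example("food good", "positive"))
```

At generation time the same prefix serves as the prompt: starting from a sentiment token alone asks the model for a review with that sentiment.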

In [None]:
!python ../scripts/train.py \
    --epochs 5 \
    --batch_size 32 \
    --lr 1e-4 \
    --n_embd 192 \
    --n_layer 3 \
    --n_head 3
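
The values of `--n_embd` and `--n_head` are linked: in multi-head attention the embedding is split evenly across heads, so `n_embd` must be divisible by `n_head`. The sketch below checks this for the demo settings and estimates the size of the transformer blocks. The parameter formula assumes a standard GPT-2 block (fused QKV projection, 4x MLP expansion, biases, two LayerNorms); the project's model may differ, and embedding parameters are left out because they depend on the vocabulary size.

```python
def head_dim(n_embd, n_head):
    """Per-head dimension; the embedding must split evenly across heads."""
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    return n_embd // n_head


def block_params(n_embd):
    """Parameter count of one block, assuming the standard GPT-2 layout."""
    attention = 4 * n_embd**2 + 4 * n_embd  # QKV and output projections + biases
    mlp = 8 * n_embd**2 + 5 * n_embd        # two linear layers, 4x expansion + biases
    layer_norms = 4 * n_embd                # two LayerNorms (scale and shift)
    return attention + mlp + layer_norms


n_embd, n_layer, n_head = 192, 3, 3  # values from the training command
print("head_dim:", head_dim(n_embd, n_head))
print("parameters in transformer blocks:", n_layer * block_params(n_embd))
```

For the demo settings this gives a head dimension of 64 and roughly 1.3 million parameters in the blocks, which is small enough to train quickly.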

### Check Training Artifacts

After the script finishes, you should see the saved model and the loss plots. Let's display the loss plot.

In [None]:
from IPython.display import Image, display
import os

plot_path = "../models/training_loss_plots.png"

if os.path.exists(plot_path):
    print(f"Displaying loss plot from {plot_path}")
    display(Image(filename=plot_path))
else:
    print("Loss plot not found. Did the training script run correctly?")

## 3. Generating Text

Now we can use the `generate.py` script to generate new comments using our trained model. We'll test both positive and negative sentiments.

### 3.1. Generate Positive Comments

In [None]:
!python ../scripts/generate.py \
    --sentiment positive \
    --num_samples 5 \
    --model_path "../models/best_gpt2_model.pt"

### 3.2. Generate Negative Comments

In [None]:
!python ../scripts/generate.py \
    --sentiment negative \
    --num_samples 5 \
    --model_path "../models/best_gpt2_model.pt"

## 4. Hyperparameter Experiments

The `generate.py` script exposes three sampling parameters that control generation: `--temperature`, `--top_k`, and `--top_p`. Let's try a few combinations to see their effect.

### 4.1. Low Temperature (Conservative)

A low temperature (e.g., 0.3) sharpens the next-token distribution, so sampling becomes closer to deterministic and less creative. The model sticks to high-probability, common tokens.
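Temperature is usually applied by dividing the logits by it before the softmax. The sketch below shows that effect on three made-up logits. It is a plain-Python illustration of the general technique, not the code in `generate.py`, and the function name is hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.3, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}:", [round(p, 3) for p in probs])
```

At `T=0.3` almost all probability mass lands on the top token, while at `T=1.5` the three candidates are much closer together.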

In [None]:
!python ../scripts/generate.py \
    --sentiment positive \
    --num_samples 3 \
    --temperature 0.3 \
    --top_k 50 \
    --top_p 0.9

### 4.2. High Temperature (Creative/Risky)

A high temperature (e.g., 1.5) flattens the next-token distribution, so sampling becomes more random. The output is more likely to be interesting, but also more likely to be nonsensical.
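One way to quantify "more random" is the entropy of the next-token distribution: the higher it is, the less predictable each sampled token. The following sketch compares the two temperatures used in this section on made-up logits. The function is hypothetical and only illustrates the trend.

```python
import math


def entropy_bits(logits, temperature):
    """Entropy (in bits) of the softmax distribution at a given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log2(p) for p in probs if p > 0)


logits = [3.0, 2.0, 1.0, 0.0]  # made-up scores for four candidate tokens
for t in (0.3, 1.5):
    print(f"T={t}: {entropy_bits(logits, t):.2f} bits")
```

The entropy at `T=1.5` is noticeably higher than at `T=0.3`, and it can never exceed `log2(4) = 2` bits for four candidates.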

In [None]:
!python ../scripts/generate.py \
    --sentiment positive \
    --num_samples 3 \
    --temperature 1.5 \
    --top_k 50 \
    --top_p 0.9

### 4.3. Low Top-K (Restrictive)

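Using a very small `top_k` (e.g., 5) restricts the model to sampling from only the 5 most likely next tokens at each step. This can lead to very repetitive text. Setting `top_p` to 1.0 keeps the full probability mass, so `top_k` is the only restriction being tested here.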
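Top-k filtering can be sketched in a few lines: keep the `k` most probable tokens, zero out the rest, and renormalize. This is a plain-Python illustration of the general technique, not the code in `generate.py`. Treating `k <= 0` as "no filtering" is an assumption made for this sketch.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero the others, and renormalize."""
    if k <= 0 or k >= len(probs):
        return list(probs)  # assumption: non-positive k disables the filter
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


probs = [0.5, 0.2, 0.15, 0.1, 0.05]  # made-up next-token probabilities
print([round(p, 3) for p in top_k_filter(probs, 2)])
```

With `k=2`, only the first two tokens remain candidates and their probabilities are rescaled to sum to 1.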

In [None]:
!python ../scripts/generate.py \
    --sentiment negative \
    --num_samples 3 \
    --temperature 0.8 \
    --top_k 5 \
    --top_p 1.0

### 4.4. Low Top-P (Nucleus Sampling)

Using a `top_p` of 0.5 means the model samples only from the smallest set of most likely tokens whose cumulative probability reaches 50%. This restricts the candidates adaptively: the set is small when the model is confident and larger when it is uncertain.
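The adaptive behavior is easiest to see by applying the same `top_p` to a flat and a peaked distribution. The sketch below is a plain-Python illustration of nucleus filtering under the definition above, not the code in `generate.py`. The function name and the example probabilities are made up.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break  # the nucleus is complete
    keep = set(kept)
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


flat = [0.4, 0.3, 0.2, 0.1]        # uncertain model: several plausible tokens
peaked = [0.9, 0.05, 0.03, 0.02]   # confident model: one dominant token
print([round(x, 3) for x in top_p_filter(flat, 0.5)])
print([round(x, 3) for x in top_p_filter(peaked, 0.5)])
```

With `p=0.5`, the flat distribution keeps two candidates while the peaked one keeps only a single token, which is the difference from a fixed `top_k`.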

In [None]:
!python ../scripts/generate.py \
    --sentiment positive \
    --num_samples 3 \
    --temperature 0.8 \
    --top_k 0 \
    --top_p 0.5