## Installing Dependencies

**Key Concept**: We'll use the Hugging Face `transformers` library, which provides pre-trained models and tools for fine-tuning.

In [1]:
# Install required packages for model evaluation, training acceleration, and UI
!pip install rouge-score      # For ROUGE evaluation metrics
!pip install accelerate -U    # For faster training with GPU acceleration
!pip install gradio          # For creating interactive web interfaces



### **Note**: After installing, restart the runtime to ensure all packages are properly loaded.
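One way to confirm, after the restart, that the packages are visible to the kernel is to query their installed versions. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the workbook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Package names taken from the pip commands above
for name in ["rouge-score", "accelerate", "gradio"]:
    print(f"{name}: {installed_version(name) or 'NOT INSTALLED'}")
```

If any line prints `NOT INSTALLED`, rerun the install cell and restart the runtime again.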

# **Data Preprocessing: Cleaning Indian Food Recipe Dataset**

**Key Concept**: Data cleaning is crucial for training language models. Clean, consistent text leads to better model performance.

In [2]:
# Import essential libraries for data manipulation and text processing
import pandas as pd  # For data manipulation and analysis
import re           # For regular expressions and text cleaning

In [3]:
# Check what files are available in the current directory
!ls

'GPT2_trained_indian_food_recipe - Workbook.ipynb'   test_dataset.csv
 IndianFoodDataset.xlsx				     train_dataset.csv
 cleaned_IndianFoodDataset.csv			     validation_dataset.csv
 gpt2-indian-food


In [4]:
!pip install openpyxl -q
!pip install scikit-learn -q
!pip install transformers -q
!pip install torch -q
!pip install datasets -q
!pip install ipywidgets -q  # optional


In [5]:
import torch
print(torch.__version__)

2.10.0+cu128


In [6]:
# Load the Indian Food dataset
# Note: Make sure your file name matches the one you uploaded
data = pd.read_excel("IndianFoodDataset.xlsx")

### **TODO 2**: Explore the Dataset
<details>
<summary>üí° Hint: Essential pandas methods for data exploration</summary>

Use these methods to understand your data:
- `.head()` - View first few rows
- `.info()` - Get data types and memory usage
- `.shape` - Get dimensions
- `.describe()` - Get statistical summary
</details>

In [7]:
# TODO: Display the first 5 rows of the dataset
data.head()

Unnamed: 0,Srno,RecipeName,TranslatedRecipeName,Ingredients,TranslatedIngredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,Instructions,TranslatedInstructions,URL
0,1,Masala Karela Recipe,Masala Karela Recipe,"6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...","6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...",15,30,f,6,Indian,Side Dish,Diabetic Friendly,"To make the Masala Karela Recipe,de-seed the k...","To make the Masala Karela Recipe,de-seed the k...",https://www.archanaskitchen.com/masala-karela-...
1,2,टमाटर पुलियोगरे रेसिपी - Spicy Tomato Rice (Re...,Spicy Tomato Rice (Recipe),"2-1/2 कप चावल - पका ले,3 टमाटर,3 छोटा चमच्च बी...","2-1 / 2 cups rice - cooked, 3 tomatoes, 3 teas...",5,10,15,3,South Indian Recipes,Main Course,Vegetarian,टमाटर पुलियोगरे बनाने के लिए सबसे पहले टमाटर क...,"To make tomato puliogere, first cut the tomato...",http://www.archanaskitchen.com/spicy-tomato-ri...
2,3,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,"1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...","1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...",20,30,50,4,South Indian Recipes,South Indian Breakfast,High Protein Vegetarian,"To make the Ragi Vermicelli Recipe, first stea...","To make the Ragi Vermicelli Recipe, first stea...",http://www.archanaskitchen.com/ragi-vermicelli...
3,4,Gongura Chicken Curry Recipe - Andhra Style Go...,Gongura Chicken Curry Recipe - Andhra Style Go...,"500 grams Chicken,2 Onion - chopped,1 Tomato -...","500 grams Chicken,2 Onion - chopped,1 Tomato -...",15,30,45,4,Andhra,Lunch,Non Vegeterian,To make Gongura Chicken Curry Recipe first pre...,To make Gongura Chicken Curry Recipe first pre...,http://www.archanaskitchen.com/gongura-chicken...
4,5,आंध्रा स्टाइल आलम पचड़ी रेसिपी - Adrak Chutney ...,Andhra Style Alam Pachadi Recipe - Adrak Chutn...,"1 बड़ा चमच्च चना दाल,1 बड़ा चमच्च सफ़ेद उरद दाल,2...","1 tablespoon chana dal, 1 tablespoon white ura...",10,20,30,4,Andhra,South Indian Breakfast,Vegetarian,आंध्रा स्टाइल आलम पचड़ी बनाने के लिए सबसे पहले ...,"To make Andhra Style Alam Pachadi, first heat ...",https://www.archanaskitchen.com/andhra-style-a...


In [8]:
# TODO: Get information about the dataset structure
data.info()

<class 'pandas.DataFrame'>
RangeIndex: 1186 entries, 0 to 1185
Data columns (total 15 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   Srno                    1186 non-null   int64 
 1   RecipeName              1186 non-null   str   
 2   TranslatedRecipeName    1186 non-null   str   
 3   Ingredients             1185 non-null   str   
 4   TranslatedIngredients   1185 non-null   str   
 5   PrepTimeInMins          1186 non-null   int64 
 6   CookTimeInMins          1186 non-null   int64 
 7   TotalTimeInMins         1186 non-null   object
 8   Servings                1186 non-null   int64 
 9   Cuisine                 1186 non-null   str   
 10  Course                  1186 non-null   str   
 11  Diet                    1186 non-null   str   
 12  Instructions            1186 non-null   str   
 13  TranslatedInstructions  1186 non-null   str   
 14  URL                     1186 non-null   str   
dtypes: int64(4), ob

### **Understanding Data Selection for Language Models**

**Key Concept**: For training a language model, we need to decide which columns contain the text we want the model to learn from. In this case, we're focusing on recipe instructions.

In [9]:
# Remove columns we don't need for language model training
# We drop the original-language text columns and keep their translated (English) versions
data = data.drop(['Srno', 'RecipeName', 'Ingredients', 'Instructions'], axis=1)

# Why drop these? The translated columns give consistent English text,
# and 'Srno' is just a serial number

In [10]:
data.head()

Unnamed: 0,TranslatedRecipeName,TranslatedIngredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,TranslatedInstructions,URL
0,Masala Karela Recipe,"6 Karela (Bitter Gourd/ Pavakkai) - deseeded,S...",15,30,f,6,Indian,Side Dish,Diabetic Friendly,"To make the Masala Karela Recipe,de-seed the k...",https://www.archanaskitchen.com/masala-karela-...
1,Spicy Tomato Rice (Recipe),"2-1 / 2 cups rice - cooked, 3 tomatoes, 3 teas...",5,10,15,3,South Indian Recipes,Main Course,Vegetarian,"To make tomato puliogere, first cut the tomato...",http://www.archanaskitchen.com/spicy-tomato-ri...
2,Ragi Semiya Upma Recipe - Ragi Millet Vermicel...,"1-1/2 cups Rice Vermicelli Noodles (Thin),1 On...",20,30,50,4,South Indian Recipes,South Indian Breakfast,High Protein Vegetarian,"To make the Ragi Vermicelli Recipe, first stea...",http://www.archanaskitchen.com/ragi-vermicelli...
3,Gongura Chicken Curry Recipe - Andhra Style Go...,"500 grams Chicken,2 Onion - chopped,1 Tomato -...",15,30,45,4,Andhra,Lunch,Non Vegeterian,To make Gongura Chicken Curry Recipe first pre...,http://www.archanaskitchen.com/gongura-chicken...
4,Andhra Style Alam Pachadi Recipe - Adrak Chutn...,"1 tablespoon chana dal, 1 tablespoon white ura...",10,20,30,4,Andhra,South Indian Breakfast,Vegetarian,"To make Andhra Style Alam Pachadi, first heat ...",https://www.archanaskitchen.com/andhra-style-a...


### Handle Missing Data
<details>
<summary>Methods for handling missing values</summary>

Common approaches:
- `.isnull().sum()` - Count missing values
- `.dropna()` - Remove rows with missing values
- `.fillna()` - Fill missing values with specific values
</details>

In [11]:
# TODO: Check for missing values in each column
data.isnull().sum()

TranslatedRecipeName      0
TranslatedIngredients     1
PrepTimeInMins            0
CookTimeInMins            0
TotalTimeInMins           0
Servings                  0
Cuisine                   0
Course                    0
Diet                      0
TranslatedInstructions    0
URL                       0
dtype: int64

In [12]:
# Identify specific rows that have missing values
# This helps us understand the extent of missing data

data[data["TranslatedIngredients"].isnull()]

Unnamed: 0,TranslatedRecipeName,TranslatedIngredients,PrepTimeInMins,CookTimeInMins,TotalTimeInMins,Servings,Cuisine,Course,Diet,TranslatedInstructions,URL
287,Pear And Walnut Salad Recipe,,10,30,40,2,Continental,Appetizer,Vegetarian,"To make the Pear And Walnut Salad Recipe, firs...",https://www.archanaskitchen.com/pear-and-walnu...


In [13]:
# TODO: Remove rows with missing values
data = data.dropna()

# Alternative approach: You could also fill missing values
# data = data.fillna("Unknown recipe step")

### **Text Cleaning for NLP**

**Key Concept**: Text cleaning standardizes the input format, removing noise that could confuse the model during training.

In [14]:
def clean_text(text):
    """
    Clean text by removing special characters and normalizing whitespace.
    
    Args:
        text (str): Input text to clean
        
    Returns:
        str: Cleaned text
    """
    # Keep only ASCII letters, digits, whitespace, and basic punctuation (. , ! ? ' -); drop everything else
    text = re.sub(r"[^a-zA-Z0-9\s.,!?'-]", "", text)
    
    # Replace multiple whitespaces with single space and strip leading/trailing spaces
    text = re.sub(r"\s+", " ", text).strip()
    
    return text

# Apply text cleaning to all text columns
text_columns = ['TranslatedRecipeName', 'TranslatedIngredients', 'TranslatedInstructions']
for column in text_columns:
    data[column] = data[column].apply(clean_text)
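To see what this cleaning step does in practice, the sketch below reapplies the same two regular expressions to a few made-up strings (the sample inputs are illustrative, not rows from the dataset). It also surfaces two side effects worth knowing about: the slash in fractions is removed, so `1/2` becomes `12`, and any non-ASCII text such as Devanagari is dropped entirely.

```python
import re


def clean_text(text):
    # Same two substitutions as the notebook's clean_text
    text = re.sub(r"[^a-zA-Z0-9\s.,!?'-]", "", text)
    return re.sub(r"\s+", " ", text).strip()


# Made-up sample containing a non-breaking space, brackets, and symbols
sample = "To make the\u00a0Masala Karela (Bitter Gourd) @ home:   add 1/2 tsp salt & stir!"
cleaned = clean_text(sample)
print(cleaned)

# Side effects: fractions lose their slash, non-ASCII script is removed
print(clean_text("1-1/2 cups rice"))
print(clean_text("टमाटर rice"))
```

If fractions matter for your use case, one option is to add `/` to the allowed character set before training.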

In [15]:
# Save the cleaned dataset for later use
data.to_csv("cleaned_IndianFoodDataset.csv", index=False)
print(f"Cleaned dataset saved with {len(data)} recipes")

Cleaned dataset saved with 1185 recipes


# **Fine-tuning GPT-2 Model**

**Key Concept**: Fine-tuning involves taking a pre-trained model and training it further on domain-specific data (Indian food recipes) to make it specialized for that domain.

In [16]:
# Import libraries for model training and data handling
import pandas as pd
from sklearn.model_selection import train_test_split
from transformers import (
    GPT2LMHeadModel,              # Pre-trained GPT-2 model for text generation
    GPT2Tokenizer,               # Tokenizer to convert text to tokens                 
    DataCollatorForLanguageModeling,  # Handles batching for language modeling
    Trainer,                     # High-level training interface
    TrainingArguments           # Configuration for training parameters
)



In [17]:
# Load the cleaned dataset
data = pd.read_csv("cleaned_IndianFoodDataset.csv")

### Select Training Data
<details>
<summary>Choosing the right column for language modeling</summary>

For recipe generation, we want the model to learn cooking instructions. Consider:
- `TranslatedInstructions` - Step-by-step cooking process
- `TranslatedIngredients` - List of ingredients
- `TranslatedRecipeName` - Recipe names

Instructions are most suitable for generating coherent cooking steps.
</details>

In [18]:
# TODO: Select the column to train the language model on
text_data = data['TranslatedInstructions']

In [19]:
# Examine the selected text data
text_data

0       To make the Masala Karela Recipe,de-seed the k...
1       To make tomato puliogere, first cut the tomato...
2       To make the Ragi Vermicelli Recipe, first stea...
3       To make Gongura Chicken Curry Recipe first pre...
4       To make Andhra Style Alam Pachadi, first heat ...
                              ...                        
1180    To make the Cheesy Pasta Casserole With Brocco...
1181    To make the Insalata Caprese Salad Recipe, we ...
1182    To make the Mutton Matar Keema Recipe, firstly...
1183    To make the Rajma Wrap, soak the rajma overnig...
1184    To make the spinach chana dal recipe, first so...
Name: TranslatedInstructions, Length: 1185, dtype: str

### **Data Splitting Strategy**

**Key Concept**: We split data into train/validation/test sets to:
- **Train**: Learn patterns from the data
- **Validation**: Monitor training progress and prevent overfitting
- **Test**: Final evaluation on unseen data

In [20]:
# Split the dataset into train, validation, and test sets
# Using 80-10-10 split for train-val-test
SEED = 7  # For reproducible results

# First split: 80% train, 20% temporary
train_data, temp_data = train_test_split(text_data, test_size=0.2, random_state=SEED)

# Second split: Split the 20% into 10% test and 10% validation
test_data, validation_data = train_test_split(temp_data, test_size=0.5, random_state=SEED)

print(f"Training samples: {len(train_data)}")
print(f"Validation samples: {len(validation_data)}")
print(f"Test samples: {len(test_data)}")

Training samples: 948
Validation samples: 119
Test samples: 118
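The counts printed above follow from simple rounding arithmetic. The sketch below reproduces them under the assumption that the held-out portion of each split is rounded up and the remainder goes to the other side, which is consistent with the printed sizes. The helper `split_sizes` is hypothetical and only mirrors the arithmetic; it does not shuffle or split any data.

```python
import math


def split_sizes(n_samples, held_out_fraction):
    """Return (kept, held_out) sizes, rounding the held-out part up."""
    n_held_out = math.ceil(held_out_fraction * n_samples)
    return n_samples - n_held_out, n_held_out


# First split: 1185 recipes into train and a temporary 20% pool
n_train, n_temp = split_sizes(1185, 0.2)

# Second split: the pool is halved; the second returned part is the validation set
n_test, n_validation = split_sizes(n_temp, 0.5)

print(f"train={n_train}, validation={n_validation}, test={n_test}")
```

Because 237 is odd, the two halves cannot be equal, which is why validation ends up one sample larger than test.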


In [21]:
# Save the split datasets as CSV files
# This allows us to reload them later without re-splitting
train_data.to_csv("train_dataset.csv", index=False)
validation_data.to_csv("validation_dataset.csv", index=False)
test_data.to_csv("test_dataset.csv", index=False)

In [22]:
!pip uninstall torch torchvision torchaudio -y
!pip install torch


Found existing installation: torch 2.10.0
Uninstalling torch-2.10.0:
  Successfully uninstalled torch-2.10.0
Collecting torch
  Using cached torch-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (31 kB)
Using cached torch-2.10.0-cp312-cp312-manylinux_2_28_x86_64.whl (915.7 MB)
Installing collected packages: torch
Successfully installed torch-2.10.0


In [23]:
import torch
print(torch.__version__)
print(torch.cuda.is_available())


2.10.0+cu128
False


### **Model and Tokenizer Setup**

**Key Concept**: 
- **Tokenizer**: Converts text into numbers (tokens) that the model can understand
- **Model**: The neural network that learns to generate text

In [24]:
# Load the pre-trained GPT-2 tokenizer and model
# These are trained on general English text and will be fine-tuned on our recipe data
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

print("Model loaded successfully!")


# Add padding token (GPT-2 doesn't have one by default)
tokenizer.pad_token = tokenizer.eos_token

Loading weights: 100%|██████████| 148/148 [00:00<00:00, 305.27it/s, Materializing param=transformer.wte.weight]
GPT2LMHeadModel LOAD REPORT from: gpt2
Key                  | Status     |  | 
---------------------+------------+--+-
h.{0...11}.attn.bias | UNEXPECTED |  | 

Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.


Model loaded successfully!


### Understanding Block Size
<details>
<summary>What is block_size in language modeling?</summary>

`block_size` is the maximum sequence length:
- **128**: Good for shorter texts, faster training
- **512**: Better for longer contexts, slower training
- **1024**: Maximum context, very slow training

For recipe instructions, 128 tokens is usually sufficient.
</details>
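A fixed block size means every example is either cut off or padded to the same length. The sketch below illustrates that idea on plain lists of integers; it is not the tokenizer's implementation, and `pad_or_truncate` is a hypothetical helper. The pad id 50256 is GPT-2's end-of-text token, which this notebook reuses as the padding token.

```python
def pad_or_truncate(token_ids, block_size, pad_id):
    """Cut token_ids to block_size, or right-pad with pad_id up to block_size."""
    clipped = token_ids[:block_size]
    n_pad = block_size - len(clipped)
    input_ids = clipped + [pad_id] * n_pad
    attention_mask = [1] * len(clipped) + [0] * n_pad  # 0 marks padding
    return input_ids, attention_mask


PAD_ID = 50256  # GPT-2 end-of-text id, used as the pad token

short_ids, short_mask = pad_or_truncate([11, 22, 33], block_size=5, pad_id=PAD_ID)
long_ids, long_mask = pad_or_truncate(list(range(8)), block_size=5, pad_id=PAD_ID)

print(short_ids, short_mask)  # padded on the right
print(long_ids, long_mask)    # tokens beyond the block are discarded
```

Anything past the block size never reaches the model, so it is worth checking how many recipes exceed the chosen length before settling on a value.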

In [25]:
# # Create TextDataset objects for training
# # These handle the conversion of text to tokenized sequences

# train_dataset = TextDataset(
#     tokenizer=tokenizer,           # The tokenizer to convert text into tokens
#     file_path="train_dataset.csv", # Path to the CSV file containing training data
#     block_size=128                 # Maximum length of each input sequence (in tokens)
#                                   # TODO: Try experimenting with different values: 64, 256, 512
# )

# validation_dataset = TextDataset(
#     tokenizer=tokenizer,
#     file_path="validation_dataset.csv",
#     block_size=128
# )

# test_dataset = TextDataset(
#     tokenizer=tokenizer,
#     file_path="test_dataset.csv",
#     block_size=128
# )

In [26]:
from datasets import load_dataset
from transformers import DataCollatorForLanguageModeling

dataset = load_dataset(
    "csv",
    data_files={
        "train": "train_dataset.csv",
        "validation": "validation_dataset.csv",
        "test": "test_dataset.csv"
    }
)

# Check available columns
print("Columns:", dataset["train"].column_names)


Generating train split: 948 examples [00:00, 31374.87 examples/s]
Generating validation split: 119 examples [00:00, 12983.43 examples/s]
Generating test split: 118 examples [00:00, 19314.26 examples/s]

Columns: ['TranslatedInstructions']





In [27]:
def tokenize_function(examples):
    return tokenizer(
        examples["TranslatedInstructions"],  # change if needed
        truncation=True,
        padding="max_length",
        max_length=128
    )

tokenized_dataset = dataset.map(
    tokenize_function,
    batched=True,
    remove_columns=dataset["train"].column_names
)


Map: 100%|██████████| 948/948 [00:00<00:00, 3906.64 examples/s]
Map: 100%|██████████| 119/119 [00:00<00:00, 2260.97 examples/s]
Map: 100%|██████████| 118/118 [00:00<00:00, 3052.00 examples/s]


In [28]:
print("Train size:", len(tokenized_dataset["train"]))
print("Validation size:", len(tokenized_dataset["validation"]))
print("Test size:", len(tokenized_dataset["test"]))


Train size: 948
Validation size: 119
Test size: 118


### **Training Configuration**

**Key Concept**: Training arguments control how the model learns. Key parameters include:
- **Epochs**: How many times to go through the entire dataset
- **Batch size**: How many examples to process together
- **Learning rate**: How fast the model updates its weights

In [29]:
# TODO: Configure training arguments
from transformers import TrainingArguments
training_args = TrainingArguments(
    output_dir="./gpt2-indian-food",     # Directory to save model checkpoints
    num_train_epochs=3,                  # TODO: Try 1, 2, 3, or 5 epochs
    per_device_train_batch_size=16,      # TODO: Try 8, 16, or 32 (larger batches need more GPU memory)
    save_steps=10_000,                   # Save checkpoint every 10,000 steps
    save_total_limit=2,                  # Keep only 2 most recent checkpoints
    prediction_loss_only=True,           # Only compute loss during evaluation
    logging_steps=100,                   # Log training progress every 100 steps
)
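It helps to work out how many optimizer steps this configuration produces, since `logging_steps` and `save_steps` are both measured in steps. The sketch below is plain arithmetic under stated assumptions: a single device, no gradient accumulation, and the 948-example training split from earlier. The constant names are illustrative.

```python
import math

NUM_TRAIN_EXAMPLES = 948  # size of the training split
BATCH_SIZE = 16           # per_device_train_batch_size
EPOCHS = 3                # num_train_epochs
LOGGING_STEPS = 100
SAVE_STEPS = 10_000

# The last batch of an epoch may be smaller, so round up
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

# Steps at which a periodic log line or checkpoint would fall
logged_at = list(range(LOGGING_STEPS, total_steps + 1, LOGGING_STEPS))
checkpoints_at = list(range(SAVE_STEPS, total_steps + 1, SAVE_STEPS))

print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
print(f"loss logged at steps: {logged_at}")
print(f"periodic checkpoints at steps: {checkpoints_at}")
```

With these values the run is short enough that only one intermediate loss is logged and no periodic checkpoint is written. Lowering `logging_steps` (for example to 20) gives a more detailed loss curve.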

In [30]:
# Set up the trainer - this handles the training loop
from transformers import Trainer, DataCollatorForLanguageModeling

trainer = Trainer(
    model=model,                    # The GPT-2 model to fine-tune
    args=training_args,             # Training configuration from above
    
    # Data collator prepares batches of data for training
    data_collator=DataCollatorForLanguageModeling(
        tokenizer=tokenizer,        # Tokenizer for processing text
        mlm=False                  # No masked language modeling (GPT-2 uses causal LM)
    ),
    
    train_dataset=tokenized_dataset["train"],      # Training data
    eval_dataset=tokenized_dataset["validation"],    # Validation data for monitoring progress
)

### **Model Training**

**⚠️ Note**: This step will take time depending on your hardware. On a GPU, expect 10-30 minutes.

In [31]:
# Start the fine-tuning process
# This will train the model on your recipe data
print("Starting training...")
trainer.train()
print("Training completed!")

Starting training...


`loss_type=None` was set in the config but it is unrecognized. Using the default loss: `ForCausalLMLoss`.


Step,Training Loss
100,2.774771


Writing model shards: 100%|██████████| 1/1 [00:01<00:00,  1.72s/it]


Training completed!


In [32]:
# Save the fine-tuned model and tokenizer
trainer.save_model("gpt2-indian-food")
tokenizer.save_pretrained("gpt2-indian-food")

print("Model saved successfully!")

Writing model shards: 100%|██████████| 1/1 [00:01<00:00,  1.95s/it]


Model saved successfully!


In [33]:
# Evaluate the model on the test set
print("Evaluating model on test data...")
results = trainer.evaluate(tokenized_dataset["test"])
print("Test Results:")
print(results)

Evaluating model on test data...




Test Results:
{'eval_loss': 2.5109846591949463, 'eval_runtime': 25.4128, 'eval_samples_per_second': 4.643, 'eval_steps_per_second': 0.59, 'epoch': 3.0}
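The raw `eval_loss` is easier to interpret as perplexity, which is the exponential of the average per-token cross-entropy. The sketch below assumes the reported loss is a mean cross-entropy in nats, which is the usual convention for causal language modeling losses. For the value above it comes out at about 12.3.

```python
import math

eval_loss = 2.5109846591949463  # copied from the test results above

# Perplexity = exp(mean cross-entropy); lower is better
perplexity = math.exp(eval_loss)
print(f"Test perplexity: {perplexity:.2f}")
```

Loosely speaking, the model is about as uncertain as if it were choosing uniformly among roughly 12 tokens at each position.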


In [34]:
# Optional: Download model files from Google Colab to local machine
# Uncomment these lines if you want to download the trained model

# from google.colab import files
# !zip -r gpt2-indian-food.zip gpt2-indian-food
# files.download('gpt2-indian-food.zip')

# **Model Evaluation: BLEU & ROUGE Metrics**

**Key Concept**: 
- **BLEU**: Measures overlap of n-grams between generated and reference text
- **ROUGE**: Measures recall-oriented overlap, useful for summarization and generation tasks
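To make the idea of n-gram overlap concrete, here is a simplified unigram version of the computation behind ROUGE-1. It is a sketch under stated assumptions: it splits on whitespace and lowercases, whereas the `rouge_score` package applies its own tokenization and optional stemming, so its numbers can differ. The function name `unigram_overlap_scores` and the sample sentences are illustrative.

```python
from collections import Counter


def unigram_overlap_scores(reference, candidate):
    """Return (precision, recall, f1) based on shared word counts."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Counter intersection keeps the minimum count of each shared word
    overlap = sum((ref_counts & cand_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = unigram_overlap_scores(
    "first heat oil in a pan",        # reference
    "heat oil in a heavy pan first",  # candidate
)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Recall asks how much of the reference was recovered, precision asks how much of the candidate was relevant, and F1 balances the two. Note that word order is ignored here. BLEU and ROUGE-2 add longer n-grams to reward correct ordering.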

In [35]:
# Import evaluation libraries
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

# Download required NLTK data
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')

[nltk_data] Downloading package punkt to /home/arshi/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /home/arshi/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
[nltk_data] Downloading package wordnet to /home/arshi/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

### **Text Generation Function**

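**Key Concept**: Text generation parameters:
- **max_length**: Maximum total length in tokens, counting the prompt as well as the generated continuation
- **temperature**: Controls randomness when sampling (lower = more deterministic)
- **num_beams**: Number of candidate sequences that beam search keeps at each step
- **no_repeat_ngram_size**: Blocks any n-gram of this size from appearing twice, which prevents repetitive phrases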
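The effect of temperature is easiest to see on a small example. The sketch below divides a set of made-up logits by the temperature before applying softmax, which is the standard way temperature is defined. The logits and the function name `softmax_with_temperature` are illustrative and not taken from the model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1/temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures flatten the distribution so that less likely tokens are sampled more often.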

#### Why Use Beam Search?

* Advantages:
    - Better Quality: Explores multiple paths, often finding better overall sequences
    - Reduced Repetition: Less likely to get stuck in repetitive loops
    - More Coherent: Considers context across multiple words, not just the immediate next word

* Trade-offs:
    - Computational Cost: Roughly `num_beams` times the work of greedy decoding (about 5x with `num_beams=5`)
    - Slower Generation: Takes longer to produce text
    - Memory Usage: Stores multiple candidate sequences
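The difference between greedy decoding and beam search can be shown on a toy example. The sketch below uses a hand-written probability table in place of a real model (`TOY_MODEL` and `beam_search` are hypothetical names). Greedy decoding is simply beam search with a single beam.

```python
import math

# Hand-written next-token probabilities: prefix -> {token: probability}
TOY_MODEL = {
    (): {"A": 0.5, "B": 0.4, "C": 0.1},
    ("A",): {"x": 0.4, "y": 0.3, "z": 0.3},
    ("B",): {"x": 0.9, "y": 0.05, "z": 0.05},
    ("C",): {"x": 1.0},
}


def beam_search(model, num_beams, steps):
    """Return the best (sequence, log_probability) found with num_beams beams."""
    beams = [((), 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for token, prob in model[seq].items():
                candidates.append((seq + (token,), score + math.log(prob)))
        # Keep only the num_beams highest-scoring partial sequences
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0]


greedy_seq, greedy_lp = beam_search(TOY_MODEL, num_beams=1, steps=2)
beam_seq, beam_lp = beam_search(TOY_MODEL, num_beams=2, steps=2)

print("greedy:", greedy_seq, round(math.exp(greedy_lp), 2))
print("beam  :", beam_seq, round(math.exp(beam_lp), 2))
```

Greedy commits to `A` because it is the most likely first token, then finds only weak continuations. With two beams, `B` stays alive long enough for its strong continuation to win. This sketch is deterministic; the notebook's generation call also sets `do_sample=True`, which adds sampling on top of the beams.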


In [36]:
# Load the fine-tuned model and tokenizer
model = GPT2LMHeadModel.from_pretrained("gpt2-indian-food")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

def generate_text(prompt, max_length=100, temperature=1.0):
    """
    Generate text using the fine-tuned model.
    
    Args:
        prompt (str): Starting text for generation
        max_length (int): Maximum total length in tokens (prompt plus generated text)
        temperature (float): Controls randomness (0.1 = very focused, 2.0 = very random)
    
    Returns:
        str: Generated text including the prompt
    """
    # Convert text to tokens
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    
    # Generate text using beam search for better quality
    output = model.generate(
        input_ids, 
        max_length=max_length, 
        temperature=temperature,
        num_beams=5,                    # Beam search for quality
        no_repeat_ngram_size=2,         # Prevent repetition
        do_sample=True,                 # Enable sampling
        top_k=50,                       # Top-k sampling
        top_p=0.95,                     # Nucleus sampling
    )
    
    # Convert tokens back to text
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    return generated_text

# Test the generation function
sample_output = generate_text("To make dal,")
print("Sample generation:")
print(sample_output)

Loading weights: 100%|██████████| 148/148 [00:00<00:00, 246.16it/s, Materializing param=transformer.wte.weight]
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


Sample generation:
To make dal, first heat oil in a heavy bottomed pan. Add onions and saute for a couple of minutes until they turn translucent.Add ginger, garlic paste, coriander leaves, red chilli powder, cumin seeds, turmeric powder and let it crackle.Once done, add tomatoes and cook till the tomatoes turn golden brown in colour. Turn off the heat and allow it to cool down.In a mixer grinder, combine all the ingredients mentioned in the recipe


In [37]:
# Try with shorter generation
short_output = generate_text("To make dal,", max_length=20)
print("Short generation:")
print(short_output)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Short generation:
To make dal, firstly heat oil in a heavy bottomed pan. Add the mustard seeds


### **TODO 6**: Create Your Own Test Cases
<details>
<summary>üí° Hint: Creating good test prompts and references</summary>

Good test cases should:
- Start with common recipe beginnings ("To make...", "First...", "Heat oil...")
- Have realistic reference completions
- Cover different types of cooking steps (prep, cooking, seasoning)
</details>

In [38]:
# Example prompts and references for evaluation
prompts = [
    "To make Masala Karela,",
    "to make Tomato Puliogere is prepared by",
    "To make Ragi Vermicelli Recipe starts with",
    "To make Gongura Chicken Curry Recipe first prep all the ingredients",
    "To make Andhra Style Alam Pachadi is made by"
]

references = [
    "To make the Masala Karela Recipe, de-seed the karela and slice.",
    "To make tomato puliogere, first cut the tomatoes.",
    "To begin making the Ragi Vermicelli Recipe, first steam the ragi vermicelli.",
    "To begin making Gongura Chicken Curry Recipe, first prep all the ingredients.",
    "To make Andhra Style Alam Pachadi, first heat oil in a pan."
]

# TODO: Add your own test cases
# prompts.append("____")
# references.append("____")

In [39]:
# Generate text and calculate BLEU and ROUGE scores
smooth = SmoothingFunction().method1  # Smoothing function for BLEU score
bleu_scores = []
rouge_scores = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)

print("=== Model Evaluation Results ===")
print()

for i, (prompt, reference) in enumerate(zip(prompts, references), 1):
    # Generate text using our fine-tuned model
    generated_text = generate_text(prompt, max_length=50)
    
    # Calculate BLEU score (measures n-gram overlap)
    bleu_score = sentence_bleu([reference.split()], generated_text.split(), smoothing_function=smooth)
    bleu_scores.append(bleu_score)
    
    # Calculate ROUGE scores; RougeScorer.score expects (target, prediction) in that order
    rouge_score = rouge_scores.score(reference, generated_text)
    
    print(f"Test Case {i}:")
    print(f"Prompt: {prompt}")
    print(f"Generated: {generated_text}")
    print(f"Reference: {reference}")
    print(f"BLEU Score: {bleu_score:.4f}")
    print(f"ROUGE-1 F1: {rouge_score['rouge1'].fmeasure:.4f}")
    print(f"ROUGE-L F1: {rouge_score['rougeL'].fmeasure:.4f}")
    print("-" * 80)

# Calculate and display average scores
average_bleu = sum(bleu_scores) / len(bleu_scores)
print(f"\nüìä Average BLEU Score: {average_bleu:.4f}")
print("\nüí° Score Interpretation:")
print("   BLEU 0.0-0.3: Poor quality")
print("   BLEU 0.3-0.5: Reasonable quality")
print("   BLEU 0.5+: Good quality")

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


=== Model Evaluation Results ===




Test Case 1:
Prompt: To make Masala Karela,
Generated: To make Masala Karela, first heat oil in a heavy bottomed pan. Add mustard seeds and saute for a few seconds.Add turmeric powder, coriander powder and red chilli powder. Saute until it becomes soft.
Reference: To make the Masala Karela Recipe, de-seed the karela and slice.
BLEU Score: 0.0138
ROUGE-1 F1: 0.2553
ROUGE-L F1: 0.2553
--------------------------------------------------------------------------------



Test Case 2:
Prompt: to make Tomato Puliogere is prepared by
Generated: to make Tomato Puliogere is prepared by soaking tomatoes in water for 2 to 3 hours.In a large mixing bowl, combine all the ingredients mentioned in the recipe and mix well. Add salt and pepper to taste. Mix well and keep aside
Reference: To make tomato puliogere, first cut the tomatoes.
BLEU Score: 0.0052
ROUGE-1 F1: 0.2353
ROUGE-L F1: 0.1961
--------------------------------------------------------------------------------



Test Case 3:
Prompt: To make Ragi Vermicelli Recipe starts with
Generated: To make Ragi Vermicelli Recipe starts with making the dough.In a mixing bowl, combine all the ingredients mentioned in the previous step and knead for about 3-4 minutes. Keep aside.Heat oil in a heavy bottomed pan
Reference: To begin making the Ragi Vermicelli Recipe, first steam the ragi vermicelli.
BLEU Score: 0.0171
ROUGE-1 F1: 0.2745
ROUGE-L F1: 0.2353
--------------------------------------------------------------------------------



Test Case 4:
Prompt: To make Gongura Chicken Curry Recipe first prep all the ingredients
Generated: To make Gongura Chicken Curry Recipe first prep all the ingredients and keep them ready.Heat oil in a heavy bottomed pan on medium flame. Add garlic, ginger, green chillies, coriander leaves and saute for a few seconds.
Reference: To begin making Gongura Chicken Curry Recipe, first prep all the ingredients.
BLEU Score: 0.0933
ROUGE-1 F1: 0.4400
ROUGE-L F1: 0.4400
--------------------------------------------------------------------------------
Test Case 5:
Prompt: To make Andhra Style Alam Pachadi is made by
Generated: To make Andhra Style Alam Pachadi is made by mixing all the ingredients in a mixer grinder and grind it into a smooth paste.Add chopped onions, green chillies, red chilli powder, cumin seeds, coriander powder
Reference: To make Andhra Style Alam Pachadi, first heat oil in a pan.
BLEU Score: 0.1137
ROUGE-1 F1: 0.3333
ROUGE-L F1: 0.3333
---------------------------------------
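
The ROUGE-1 F1 values printed above are based on unigram overlap between the generated text and the reference. The block below is a simplified sketch of that idea, not the implementation in the `rouge-score` package: the function name `rouge1_f1` and the regex tokenizer are assumptions made for illustration, and the package can additionally apply stemming, so its numbers may differ from this sketch.

```python
import re
from collections import Counter

def rouge1_f1(reference, candidate):
    """Unigram-overlap F1 between a reference and a candidate string."""
    # Simplified tokenizer (an assumption): lowercase alphanumeric runs
    ref_counts = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    cand_counts = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    # Clipped overlap: a word counts at most as often as it appears in both texts
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Toy strings, not taken from the dataset
print(rouge1_f1("first heat oil in a pan", "heat oil in a heavy pan first"))
```

Because precision divides by the length of the generated text, a long generation compared against a one-sentence reference tends to score low even when it covers most of the reference words, which may partly explain the modest scores in the test cases above.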

# **Interactive Model Testing with Gradio**

**Key Concept**: Gradio allows us to create user-friendly interfaces for testing our model interactively.

In [40]:
import gradio as gr
from transformers import GPT2LMHeadModel, GPT2Tokenizer

In [41]:
# Load the fine-tuned model and tokenizer for interactive testing
# The base GPT-2 tokenizer is reused here, assuming fine-tuning did not change the vocabulary
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model_finetuned = GPT2LMHeadModel.from_pretrained("gpt2-indian-food")

print("Fine-tuned model loaded successfully!")

Loading weights: 100%|███████████████████| 148/148 [00:00<00:00, 243.58it/s, Materializing param=transformer.wte.weight]


Fine-tuned model loaded successfully!


In [42]:
# Display model information
print(f"Model type: {type(model_finetuned)}")
print(f"Number of parameters: {model_finetuned.num_parameters():,}")

Model type: <class 'transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel'>
Number of parameters: 124,439,808


### **Understanding Temperature in Text Generation**

**Key Concept**: Temperature rescales the model's next-token probabilities before sampling, which controls how random the generated text is:
- **Low (0.1-0.7)**: More focused, more predictable output
- **Medium (0.8-1.2)**: Balanced creativity and coherence
- **High (1.3-2.0)**: More creative but potentially less coherent
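
To make the effect concrete, here is a minimal sketch of temperature scaling applied to a softmax. The function name `softmax_with_temperature` and the three toy scores are assumptions for illustration; the library applies the same idea to logits tensors over the whole vocabulary.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens
toy_logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"T={t}: " + ", ".join(f"{p:.3f}" for p in probs))
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution so that less likely tokens are sampled more often.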

In [43]:
# Enhanced text generation function with better control
def generate_text_enhanced(prompt, max_length=100, temperature=1.0):
    """
    Generate text with enhanced parameters for better control.
    """
    # Tokenize the prompt; this returns both input_ids and attention_mask
    inputs = tokenizer(prompt, return_tensors="pt")
    
    # Combine beam search with sampling and a repetition guard
    output = model_finetuned.generate(
        **inputs,                       # Pass attention_mask along with input_ids
        max_length=int(max_length),     # UI sliders may pass a float
        temperature=float(temperature),
        num_beams=5,                    # Beam search for quality
        no_repeat_ngram_size=2,         # Prevent repetition
        do_sample=True,                 # Enable sampling
        top_k=50,                       # Top-k sampling
        top_p=0.95,                     # Nucleus sampling
        pad_token_id=tokenizer.eos_token_id
    )
    
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    return generated_text

# Test different temperatures
test_prompt = "To prepare chicken curry,"
temperatures = [0.5, 1.0, 1.5]

print("🌡️ Temperature Comparison:")
for temp in temperatures:
    output = generate_text_enhanced(test_prompt, max_length=60, temperature=temp)
    print(f"\nTemperature {temp}: {output}")

🌡️ Temperature Comparison:

Temperature 0.5: To prepare chicken curry, firstly heat oil in a heavy bottomed pan. Add mustard seeds, turmeric powder, coriander powder and saute for a minute or two.Add chicken and cook till the chicken is cooked through. Once done, drain the water and keep aside.In a

Temperature 1.0: To prepare chicken curry, firstly heat oil in a heavy bottomed pan. Add onions and saute till they turn translucent.Add cumin seeds, turmeric powder, red chilli powder and salt to taste. Saute for a couple of minutes till the onions become soft and mushy.

Temperature 1.5: To prepare chicken curry, we will first make the masala.Heat oil in a heavy bottomed pan. Add mustard seeds, green chillies, coriander leaves and let it splutter.Add chopped onions and saute till they turn golden brown in colour.Saut and cook till the
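
The `top_k` and `top_p` arguments used in `generate_text_enhanced` restrict which tokens can be sampled at each step. The block below is a simplified sketch of that filtering on a toy probability table; the function name `top_k_top_p_filter` and the example tokens are assumptions for illustration, and the library itself works on logits tensors rather than Python dictionaries.

```python
def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Keep the top_k most likely tokens, then the smallest set reaching top_p."""
    # Rank candidates from most to least likely and apply the top-k cut
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    # Renormalize so the surviving probabilities sum to 1
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]
    # Nucleus cut: stop once the cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize again over the final candidate set
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

# Hypothetical next-token probabilities after "First, heat the"
toy_probs = {"oil": 0.5, "ghee": 0.3, "water": 0.15, "sand": 0.05}
print(top_k_top_p_filter(toy_probs, top_k=3, top_p=0.9))
print(top_k_top_p_filter(toy_probs, top_k=50, top_p=0.7))
```

Both filters remove the unlikely tail of the distribution, which is one reason the three temperature outputs above stay coherent even at 1.5. Note also that the function combines sampling with `num_beams=5`, so the results are less varied than pure sampling would be.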


### **Simple Gradio Interface**

In [44]:
# Create a simple Gradio interface for text generation
iface = gr.Interface(
    fn=generate_text_enhanced,        # Function to call for text generation
    inputs=[
        gr.Textbox(label="Recipe Prompt", placeholder="Enter recipe start (e.g., 'To make dal,')"),
        gr.Slider(minimum=50, maximum=200, value=100, step=1, label="Max Length"),  # step=1 keeps the length a whole number
        gr.Slider(minimum=0.1, maximum=2.0, value=1.0, label="Temperature")
    ],
    outputs=gr.Textbox(label="Generated Recipe Step"),
    title="🍛 Indian Recipe Generator",
    description="Generate Indian recipe instructions using fine-tuned GPT-2",
    examples=[
        ["To make dal,", 100, 1.0],
        ["To prepare chicken curry,", 120, 0.8],
        ["First, heat oil in a pan", 80, 1.2]
    ]
)

In [45]:
# Launch the Gradio interface
iface.launch(share=True)  # share=True creates a public link

* Running on local URL:  http://127.0.0.1:7860
* Running on public URL: https://26646c62fbefcc0503.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




### **Advanced: Model Comparison Interface**

**🔬 Advanced Section** - Compare original GPT-2 vs fine-tuned model

In [46]:
# Load original GPT-2 model for comparison
model_original = GPT2LMHeadModel.from_pretrained("gpt2")
print("Original GPT-2 model loaded for comparison")

Loading weights: 100%|███████████████████| 148/148 [00:00<00:00, 172.22it/s, Materializing param=transformer.wte.weight]
GPT2LMHeadModel LOAD REPORT from: gpt2
Key                  | Status     |  | 
---------------------+------------+--+-
h.{0...11}.attn.bias | UNEXPECTED |  | 

Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.


Original GPT-2 model loaded for comparison


In [47]:
# Generalized generation function that works with any model
def generate_text_using(model, prompt, temperature=1.0, max_length=120):
    """
    Generate text using a specific model.
    
    Args:
        model: The language model to use
        prompt: Input text prompt
        temperature: Generation temperature
        max_length: Maximum generation length
    
    Returns:
        Generated text string
    """
    # Tokenize the prompt; this returns both input_ids and attention_mask
    inputs = tokenizer(prompt, return_tensors="pt")
    
    output = model.generate(
        **inputs,                       # Pass attention_mask along with input_ids
        max_length=int(max_length), 
        num_beams=5, 
        no_repeat_ngram_size=2, 
        top_k=50, 
        top_p=0.95, 
        temperature=float(temperature),
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    return generated_text

In [48]:
# Function to compare both models side by side
def generate_texts_compare(prompt, temperature=1.0):
    """
    Generate text using both original and fine-tuned models for comparison.
    
    Returns:
        List containing [original_output, finetuned_output]
    """
    original_output = generate_text_using(model_original, prompt, temperature=temperature)
    finetuned_output = generate_text_using(model_finetuned, prompt, temperature=temperature)
    
    return [original_output, finetuned_output]

In [49]:
# Test the comparison function
test_prompt = "To make Masala Karela"
original_output, finetuned_output = generate_texts_compare(test_prompt)

print("🔄 Model Comparison Test:")
print(f"Prompt: {test_prompt}")
print(f"\n📄 Original GPT-2: {original_output}")
print(f"\n🍛 Fine-tuned GPT-2: {finetuned_output}")

🔄 Model Comparison Test:
Prompt: To make Masala Karela

📄 Original GPT-2: To make Masala Karela's case even more compelling, it's important to understand that he's not the only one who's been targeted in the past.

In fact, many of the people who have been singled out for this kind of harassment are also the ones who've been victims of sexual harassment or assault. This is not to say that all of these people aren't victims, but rather that they should be treated with the same respect and dignity as the rest of us. It's also important for us to recognize that these kinds of attacks are not isolated incidents, and that we need

🍛 Fine-tuned GPT-2: To make Masala Karela Recipe, first heat oil in a heavy bottomed pan. Add cumin seeds and let it crackle.Add garlic and saute till the garlic turns translucent. Now add turmeric powder, red chilli powder and salt. Saute for a few seconds.Now add chopped onions and cook till they turn translucent and turn golden brown in colour. Turn of

### Analyze Model Differences
<details>
<summary>What to look for in model comparisons</summary>

Compare the outputs for:
- **Domain relevance**: Does the fine-tuned model stay on topic?
- **Cooking terminology**: Does it use appropriate cooking terms?
- **Step structure**: Are the instructions well-structured?
- **Cultural context**: Does it understand Indian cooking methods?
</details>
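
One rough way to put a number on the "cooking terminology" check is to measure how many words in an output belong to a cooking vocabulary. The block below is only a sketch: the `COOKING_TERMS` set and the function `cooking_term_ratio` are hypothetical, and the two strings are shortened excerpts of the outputs shown above.

```python
import re

# Hypothetical, hand-picked vocabulary; extend it for real use
COOKING_TERMS = {"heat", "oil", "pan", "saute", "add", "cumin",
                 "turmeric", "seeds", "salt", "cook"}

def cooking_term_ratio(text, vocabulary=COOKING_TERMS):
    """Fraction of words in `text` that appear in the cooking vocabulary."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(word in vocabulary for word in words) / len(words)

original = "To make Masala Karela's case even more compelling, it's important to understand"
finetuned = "To make Masala Karela Recipe, first heat oil in a heavy bottomed pan. Add cumin seeds"

print(f"original:   {cooking_term_ratio(original):.2f}")
print(f"fine-tuned: {cooking_term_ratio(finetuned):.2f}")
```

A keyword ratio like this says nothing about step structure or correctness, so it is best treated as a quick sanity check alongside reading the outputs.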

In [50]:
# Advanced Gradio interface for model comparison
comparison_interface = gr.Interface(
    fn=generate_texts_compare,
    inputs=[
        gr.Textbox(
            label="Recipe Prompt", 
            placeholder="Enter recipe start (e.g., 'To make dal,')",
            lines=2
        ),
        gr.Slider(
            minimum=0.1, 
            maximum=2.0, 
            value=1.0, 
            label="Temperature",
            info="Controls randomness: lower = more focused, higher = more creative"
        )
    ],
    outputs=[
        gr.Textbox(label="📄 Original GPT-2 Output", lines=4),
        gr.Textbox(label="🍛 Fine-tuned GPT-2 Output", lines=4)
    ],
    title="🔬 Model Comparison: Original vs Fine-tuned GPT-2",
    description="Compare how the original GPT-2 and fine-tuned model respond to recipe prompts",
    examples=[
        ["To make dal,", 0.8],
        ["To prepare chicken curry,", 1.0],
        ["First, heat oil in a pan", 1.2]
    ],
    article="""
    ### 📊 How to Interpret Results:
    - **Domain Relevance**: Fine-tuned model should stay focused on cooking
    - **Terminology**: Look for cooking-specific vocabulary
    - **Structure**: Well-formed recipe instructions
    - **Cultural Context**: Indian cooking methods and ingredients
    """
)

In [51]:
# Launch the comparison interface
comparison_interface.launch(share=True)

* Running on local URL:  http://127.0.0.1:7861
* Running on public URL: https://738dc4cbe742a3b334.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


