# Stage 2: Recipe Fine-Tuning

This notebook loads the Stage 1 conversational model from Google Drive and fine-tunes it on recipe data.

**Prerequisites:**
- Stage 1 model saved in Google Drive at: `LLM_Models/cooking-assistant-project/models/gpt2-conversational-v1/final`
- Recipe datasets in Google Drive
- GPU runtime enabled in Colab

## 1. Setup

In [1]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

print("✓ Google Drive mounted!")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
✓ Google Drive mounted!


In [2]:
# Install required packages
%pip install -q transformers datasets peft accelerate bitsandbytes wandb pyyaml

In [3]:
import os
os.environ["TRANSFORMERS_NO_TF"] = "1"

# Clone your repo into /content if not already present
REPO_URL = "https://github.com/Gani332/DeepLearningLLM.git"
REPO_PATH = "/content/DeepLearningLLM"

# Clone repo only if it doesn't exist already
if not os.path.exists(REPO_PATH):
    !git clone {REPO_URL} {REPO_PATH}
else:
    print("Repo already exists — pulling latest changes...")
    %cd {REPO_PATH}
    !git pull

# Change working directory to your repo
os.chdir(REPO_PATH)

# Show current working directory and its contents
print("Working directory:", os.getcwd())
print("Contents:", os.listdir("."))

Repo already exists — pulling latest changes...
/content/DeepLearningLLM
Already up to date.
Working directory: /content/DeepLearningLLM
Contents: ['.git', 'llm_data_preprocessing.ipynb', 'prepare_recipe_data_for_gpt2.py', 'wandb', 'finetune_llm', '.gitignore', 'lstm_model_training.ipynb', 'supportDocs']


In [4]:
# Import libraries
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    TrainingArguments,
    Trainer,
    DataCollatorForLanguageModeling,
    set_seed
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training, PeftModel
from datasets import load_from_disk
import yaml
from pathlib import Path

set_seed(42)

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

PyTorch version: 2.8.0+cu126
CUDA available: True
GPU: NVIDIA L4


## 2. Verify Stage 1 Model

In [5]:
# Verify Stage 1 model exists in Google Drive
STAGE1_MODEL_PATH = "/content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-conversational-v1/final"

print("Checking for Stage 1 model in Google Drive...")

if os.path.exists(STAGE1_MODEL_PATH):
    print(f"✓ Found Stage 1 model at: {STAGE1_MODEL_PATH}")

    # Verify required files exist (LoRA adapter files)
    required_files = ['adapter_config.json', 'adapter_model.safetensors']
    files = os.listdir(STAGE1_MODEL_PATH)

    all_present = True
    for req_file in required_files:
        if req_file in files:
            print(f"  ✓ {req_file}")
        else:
            print(f"  ❌ Missing: {req_file}")
            all_present = False

    if all_present:
        print("\n✓ Stage 1 LoRA adapter ready to load!")
    else:
        raise FileNotFoundError("Missing required LoRA files")
else:
    print(f"❌ ERROR: Stage 1 model not found!")
    raise FileNotFoundError(f"Stage 1 model not found at {STAGE1_MODEL_PATH}")

Checking for Stage 1 model in Google Drive...
✓ Found Stage 1 model at: /content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-conversational-v1/final
  ✓ adapter_config.json
  ✓ adapter_model.safetensors

✓ Stage 1 LoRA adapter ready to load!


## 3. Load Configuration

In [6]:
# Load Stage 2 config
config_path_s2 = Path('/content/DeepLearningLLM/finetune_llm/config/training_config_stage2.yaml')

with open(config_path_s2, 'r') as file:
    config_stage2 = yaml.safe_load(file)

print("Stage 2 Configuration loaded:")
print(f"  Base model: {config_stage2['model']['base_model']}")
print(f"  Train data: {config_stage2['data']['train_data']}")
print(f"  Output dir: {config_stage2['training']['output_dir']}")
print(f"  Epochs: {config_stage2['training']['num_train_epochs']}")

Stage 2 Configuration loaded:
  Base model: ./models/gpt2-conversational-v1
  Train data: /content/datasets/datasets/Cleaned/recipe_gpt2/train
  Output dir: ./models/gpt2-recipe-final
  Epochs: 1


In [7]:
# Override config to use Google Drive paths
config_stage2['model']['base_model'] = STAGE1_MODEL_PATH
config_stage2['training']['output_dir'] = "/content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-recipe-final"
config_stage2['training']['num_train_epochs'] = 2  # 2 epochs for better quality

print("\n" + "=" * 70)
print("UPDATED STAGE 2 CONFIGURATION")
print("=" * 70)
print(f"✓ Will load Stage 1 from:")
print(f"  {config_stage2['model']['base_model']}")
print(f"\n✓ Will save Stage 2 to:")
print(f"  {config_stage2['training']['output_dir']}")
print(f"\n✓ Training epochs: {config_stage2['training']['num_train_epochs']}")
print(f"✓ Batch size: {config_stage2['training']['per_device_train_batch_size']}")
print(f"✓ Grad accumulation: {config_stage2['training']['gradient_accumulation_steps']}")
print("=" * 70)


UPDATED STAGE 2 CONFIGURATION
✓ Will load Stage 1 from:
  /content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-conversational-v1/final

✓ Will save Stage 2 to:
  /content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-recipe-final

✓ Training epochs: 2
✓ Batch size: 4
✓ Grad accumulation: 4
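
The cells in this notebook read a fixed set of keys from `config_stage2`. A minimal sketch of a check that fails early when one is missing, assuming the nested-dict layout that `yaml.safe_load` returns. The helper `missing_config_keys` and the `REQUIRED_KEYS` table are illustrative and not part of the repo; the key names are collected from the cells of this notebook. The `wandb_project` and `wandb_run_name` keys are left out because they are only read when `use_wandb` is true.

```python
# Keys this notebook reads from the Stage 2 config, grouped by section.
REQUIRED_KEYS = {
    "model": ["base_model", "max_length"],
    "data": ["train_data", "val_data"],
    "lora": ["lora_r", "lora_alpha", "target_modules", "lora_dropout"],
    "training": [
        "output_dir", "num_train_epochs", "per_device_train_batch_size",
        "per_device_eval_batch_size", "gradient_accumulation_steps",
        "learning_rate", "warmup_steps", "weight_decay", "fp16", "optim",
        "lr_scheduler_type", "logging_steps", "save_steps", "eval_steps",
        "save_total_limit", "use_wandb",
    ],
}


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return 'section.key' strings for every required key absent from config."""
    missing = []
    for section, keys in required.items():
        section_dict = config.get(section) or {}
        for key in keys:
            if key not in section_dict:
                missing.append(f"{section}.{key}")
    return missing


# Deliberately incomplete config; the values are placeholders.
example = {
    "model": {"base_model": "some/path"},
    "data": {"train_data": "a", "val_data": "b"},
}
print(missing_config_keys(example)[:3])
```

In the notebook this could run right after the YAML is loaded, for example `assert not missing_config_keys(config_stage2)`.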


## 4. Load Recipe Datasets

In [8]:
# Download datasets from Google Drive
import gdown
import os

# Your folder ID
folder_id = "1HiAxfpV-auZECGKufjhgZryg9BhA0RjM"

print("Downloading datasets from Google Drive...")
print("This may take 5-10 minutes...")

# Download the folder
gdown.download_folder(
    id=folder_id,
    output="/content/datasets",
    quiet=False,
    use_cookies=False
)

print("\n✓ Download complete!")
print("\nContents of /content/datasets/datasets/:")
print(os.listdir("/content/datasets/datasets"))

Downloading datasets from Google Drive...
This may take 5-10 minutes...


Retrieving folder contents


Retrieving folder 1h8GsTW808nI_S8SmSrhgFGy8suxTyfLS datasets
Retrieving folder 1bJmlVK4J3ZQCmz6YjBTyRb-Psnt7oRZ4 Cleaned
Retrieving folder 17VIYS11Pbm3-zq9dplQv59CKVbys4DTj recipe_gpt2
Retrieving folder 1EalPfEQQj31U5fJ2Qu3RJHZQt13pt71g train
Processing file 1OImBIMA7mgSKpxT_SzDU2_EXdJO5OzEO data-00000-of-00001.arrow
Processing file 1qHuK4ZYCYsdFtorpGmBAWoCHRTwKCYvM dataset_info.json
Processing file 1Hw6sxs8FITYlXbA4scx88RJHqksuF0to state.json
Retrieving folder 1V3T0WA2WXWWL1lVHlkowE3iYcsExvAYM val
Processing file 1G6sBazdCXrGtbX7PQTV7GVWP35yLoXon data-00000-of-00001.arrow
Processing file 1qJiX001YEllBcb-JNQlBx3KIVYP9lCeh dataset_info.json
Processing file 1ba0EEX68NjwOyKdm4VfJRh6JDA022UBh state.json
Processing file 1Lr09y7M5JDxPUVUXtniqFpjw0_HCge6P conversational_training_data.csv
Processing file 1eHWT3cP6K71d5Faau9ijXj_UKVKC0yB5 nutrition_lookup.csv
Retrieving folder 1xpk7NXDXrCnsV2gCYkGWuogA1WMcLUhL OASST1
Retrieving folder 1q6PwRfgYQI_avsZh9Qcx0jVSW6mxsBMh processed
Retrieving folde

Retrieving folder contents completed
Building directory structure
Building directory structure completed
Downloading...
From: https://drive.google.com/uc?id=1OImBIMA7mgSKpxT_SzDU2_EXdJO5OzEO
To: /content/datasets/datasets/Cleaned/recipe_gpt2/train/data-00000-of-00001.arrow
100%|██████████| 69.8M/69.8M [00:00<00:00, 117MB/s]
Downloading...
From: https://drive.google.com/uc?id=1qHuK4ZYCYsdFtorpGmBAWoCHRTwKCYvM
To: /content/datasets/datasets/Cleaned/recipe_gpt2/train/dataset_info.json
100%|██████████| 165/165 [00:00<00:00, 532kB/s]
Downloading...
From: https://drive.google.com/uc?id=1Hw6sxs8FITYlXbA4scx88RJHqksuF0to
To: /content/datasets/datasets/Cleaned/recipe_gpt2/train/state.json
100%|██████████| 247/247 [00:00<00:00, 484kB/s]
Downloading...
From: https://drive.google.com/uc?id=1G6sBazdCXrGtbX7PQTV7GVWP35yLoXon
To: /content/datasets/datasets/Cleaned/recipe_gpt2/val/data-00000-of-00001.arrow
100%|██████████| 17.5M/17.5M [00:00<00:00, 61.3MB/s]
Downloading...
From: https://drive.google.c


✓ Download complete!

Contents of /content/datasets/datasets/:
['OASST1', 'Cleaned']



Download completed


In [9]:
# Load recipe datasets
train_path_s2 = Path(config_stage2['data']['train_data'])
val_path_s2 = Path(config_stage2['data']['val_data'])

print(f"Loading recipe datasets...")
print(f"  Train: {train_path_s2}")
print(f"  Val: {val_path_s2}")

train_dataset_s2 = load_from_disk(str(train_path_s2))
val_dataset_s2 = load_from_disk(str(val_path_s2))

print(f"\n✓ Datasets loaded")
print(f"  Train: {len(train_dataset_s2):,} examples")
print(f"  Val: {len(val_dataset_s2):,} examples")

Loading recipe datasets...
  Train: /content/datasets/datasets/Cleaned/recipe_gpt2/train
  Val: /content/datasets/datasets/Cleaned/recipe_gpt2/val

✓ Datasets loaded
  Train: 89,157 examples
  Val: 22,290 examples


In [10]:
# Subsample the datasets to shorten training (a rough time estimate is printed below)
print(f"\nOriginal dataset size:")
print(f"  Train: {len(train_dataset_s2):,}")
print(f"  Val: {len(val_dataset_s2):,}")

# Use 30K training and 7.5K validation examples for good quality with reasonable training time
TRAIN_LIMIT = 30000
VAL_LIMIT = 7500

if len(train_dataset_s2) > TRAIN_LIMIT:
    train_dataset_s2 = train_dataset_s2.select(range(TRAIN_LIMIT))
    print(f"\n✓ Reduced train to {TRAIN_LIMIT:,} examples")

if len(val_dataset_s2) > VAL_LIMIT:
    val_dataset_s2 = val_dataset_s2.select(range(VAL_LIMIT))
    print(f"✓ Reduced val to {VAL_LIMIT:,} examples")

print(f"\nReduced dataset size:")
print(f"  Train: {len(train_dataset_s2):,}")
print(f"  Val: {len(val_dataset_s2):,}")

# Estimate training time, assuming roughly 2 seconds per optimizer step
effective_batch = config_stage2['training']['per_device_train_batch_size'] * config_stage2['training']['gradient_accumulation_steps']
steps_per_epoch = len(train_dataset_s2) // effective_batch
total_steps = steps_per_epoch * config_stage2['training']['num_train_epochs']

print(f"\nExpected training:")
print(f"  Steps per epoch: {steps_per_epoch:,}")
print(f"  Total epochs: {config_stage2['training']['num_train_epochs']}")
print(f"  Total steps: {total_steps:,}")
print(f"  Estimated time: ~{total_steps * 2 / 60:.0f} minutes")


Original dataset size:
  Train: 89,157
  Val: 22,290

✓ Reduced train to 30,000 examples
✓ Reduced val to 7,500 examples

Reduced dataset size:
  Train: 30,000
  Val: 7,500

Expected training:
  Steps per epoch: 1,875
  Total epochs: 2
  Total steps: 3,750
  Estimated time: ~125 minutes
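
The step arithmetic above can be reproduced in isolation. This is a small sketch using the numbers printed by the cell (30,000 examples, batch size 4, gradient accumulation 4, 2 epochs). The function names are illustrative, and the 2 seconds per step is the notebook's own assumption, not a measurement.

```python
def training_steps(num_examples, per_device_batch, grad_accum, epochs):
    """Return (steps_per_epoch, total_steps) using the notebook's floor division."""
    effective_batch = per_device_batch * grad_accum
    steps_per_epoch = num_examples // effective_batch
    return steps_per_epoch, steps_per_epoch * epochs


def estimated_minutes(total_steps, seconds_per_step):
    """Convert a step count into minutes for an assumed per-step cost."""
    return total_steps * seconds_per_step / 60


steps_per_epoch, total_steps = training_steps(30_000, 4, 4, 2)
print(steps_per_epoch, total_steps)          # 1875 3750
print(estimated_minutes(total_steps, 2.0))   # 125.0
```

Keeping `seconds_per_step` as a parameter makes it easy to plug in a measured value after a short trial run.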


In [11]:
# Fix corrupted fractions (1/4 → 14, etc.)
print("\n" + "=" * 70)
print("FIXING CORRUPTED FRACTIONS")
print("=" * 70)

def fix_fractions(example):
    """Fix fractions that lost their / during preprocessing"""
    text = example['text']

    # Fix cups
    text = text.replace('14 cup', '1/4 cup')
    text = text.replace('12 cup', '1/2 cup')
    text = text.replace('13 cup', '1/3 cup')
    text = text.replace('34 cup', '3/4 cup')
    text = text.replace('23 cup', '2/3 cup')
    text = text.replace('18 cup', '1/8 cup')

    # Fix tablespoons
    text = text.replace('14 tablespoon', '1/4 tablespoon')
    text = text.replace('12 tablespoon', '1/2 tablespoon')
    text = text.replace('13 tablespoon', '1/3 tablespoon')

    # Fix teaspoons
    text = text.replace('14 teaspoon', '1/4 teaspoon')
    text = text.replace('12 teaspoon', '1/2 teaspoon')
    text = text.replace('13 teaspoon', '1/3 teaspoon')
    text = text.replace('34 teaspoon', '3/4 teaspoon')

    # Fix pounds
    text = text.replace('14 lb', '1/4 lb')
    text = text.replace('12 lb', '1/2 lb')
    text = text.replace('34 lb', '3/4 lb')

    example['text'] = text
    return example

print("Applying fixes to train dataset...")
train_dataset_s2 = train_dataset_s2.map(fix_fractions)

print("Applying fixes to validation dataset...")
val_dataset_s2 = val_dataset_s2.map(fix_fractions)

print("\n✓ Fractions fixed!")


FIXING CORRUPTED FRACTIONS
Applying fixes to train dataset...


Map:   0%|          | 0/30000 [00:00<?, ? examples/s]

Applying fixes to validation dataset...


Map:   0%|          | 0/7500 [00:00<?, ? examples/s]


✓ Fractions fixed!
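
Plain `str.replace` also matches inside longer numbers, so `114 cup` would become `11/4 cup`. A hedged alternative sketch uses a regular expression with a lookbehind that refuses to match when a digit comes immediately before. It differs from the cell above in that it applies every listed fraction to every listed unit, and like the original it cannot tell a genuine `12 cups` from a collapsed `1/2 cups`. The names `FRACTIONS`, `UNITS` and `fix_fractions_text` are illustrative.

```python
import re

# Two-digit strings treated as collapsed fractions, as in the cell above.
FRACTIONS = {"12": "1/2", "13": "1/3", "14": "1/4",
             "18": "1/8", "23": "2/3", "34": "3/4"}
UNITS = ("cup", "tablespoon", "teaspoon", "lb")

# (?<!\d)  : the two digits must not be the tail of a longer number
# (?= unit): a space and a unit name must follow, but are not consumed
_PATTERN = re.compile(
    r"(?<!\d)(" + "|".join(FRACTIONS) + r")(?= (?:" + "|".join(UNITS) + r"))"
)


def fix_fractions_text(text):
    """Restore the slash in two-digit fractions that precede a unit."""
    return _PATTERN.sub(lambda m: FRACTIONS[m.group(1)], text)


print(fix_fractions_text("add 14 cup sugar and 1 12 cups flour"))
# add 1/4 cup sugar and 1 1/2 cups flour
print(fix_fractions_text("114 cup milk"))
# 114 cup milk
```

To use it with `datasets.map`, wrap it the same way as `fix_fractions`: read `example['text']`, transform it, and write it back.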


## 5. Load Stage 1 Model

In [12]:
# Load Stage 1 model from Google Drive
stage1_model_path = config_stage2['model']['base_model']

print(f"\nLoading Stage 1 model from Google Drive...")
print(f"  Path: {stage1_model_path}")

# Load base GPT-2 model
print("\n  Step 1: Loading base GPT-2 medium...")
base_model = AutoModelForCausalLM.from_pretrained("gpt2-medium")

# Load LoRA adapter on top
print("  Step 2: Loading Stage 1 LoRA adapter...")
model_s2 = PeftModel.from_pretrained(base_model, stage1_model_path)

# Merge LoRA weights into base model
print("  Step 3: Merging LoRA weights...")
model_s2 = model_s2.merge_and_unload()

model_size = sum(p.numel() for p in model_s2.parameters())
print(f"\n✓ Stage 1 model loaded from Google Drive")
print(f"  Total parameters: {model_size:,} ({model_size/1e6:.1f}M)")


Loading Stage 1 model from Google Drive...
  Path: /content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-conversational-v1/final

  Step 1: Loading base GPT-2 medium...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


  Step 2: Loading Stage 1 LoRA adapter...
  Step 3: Merging LoRA weights...

✓ Stage 1 model loaded from Google Drive
  Total parameters: 354,823,168 (354.8M)


## 6. Configure LoRA for Stage 2

In [13]:
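# Prepare model for k-bit training. Note: the model here was loaded in full precision
# (no quantization config), so this helper mainly freezes the base weights; it is
# intended for models loaded in 8-bit or 4-bit.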
model_s2 = prepare_model_for_kbit_training(model_s2)

# Configure LoRA
lora_config_s2 = LoraConfig(
    r=config_stage2['lora']['lora_r'],
    lora_alpha=config_stage2['lora']['lora_alpha'],
    target_modules=config_stage2['lora']['target_modules'],
    lora_dropout=config_stage2['lora']['lora_dropout'],
    bias="none",
    task_type="CAUSAL_LM"
)

# Apply LoRA
model_s2 = get_peft_model(model_s2, lora_config_s2)

# Print trainable parameters
trainable_params = sum(p.numel() for p in model_s2.parameters() if p.requires_grad)
total_params = sum(p.numel() for p in model_s2.parameters())

print(f"\n✓ LoRA configured for Stage 2")
print(f"  Trainable parameters: {trainable_params:,} ({trainable_params/1e6:.2f}M)")
print(f"  Total parameters: {total_params:,} ({total_params/1e6:.1f}M)")
print(f"  Trainable %: {100 * trainable_params / total_params:.2f}%")


✓ LoRA configured for Stage 2
  Trainable parameters: 4,325,376 (4.33M)
  Total parameters: 359,148,544 (359.1M)
  Trainable %: 1.20%
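
The trainable-parameter count follows from how LoRA works: each adapted linear layer with shape `in x out` gains two low-rank matrices holding `r * (in + out)` parameters in total. The rank and target modules come from the YAML file, which is not shown here, so the setup below is purely hypothetical (rank 16 on GPT-2 medium's fused QKV projection only) and is not expected to reproduce the 4,325,376 figure. The percentage line uses the numbers printed above.

```python
def lora_param_count(layer_shapes, r):
    """Parameters added by LoRA: A is (r x in) and B is (out x r) per layer."""
    return sum(r * (fan_in + fan_out) for fan_in, fan_out in layer_shapes)


# Hypothetical setup: rank 16 on the 1024 -> 3072 attention projection
# of each of GPT-2 medium's 24 transformer blocks.
print(lora_param_count([(1024, 3072)] * 24, r=16))   # 1572864

# Trainable share as reported by the notebook output.
trainable, total = 4_325_376, 359_148_544
print(f"{100 * trainable / total:.2f}%")             # 1.20%
```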




## 7. Tokenize Datasets

In [19]:
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(stage1_model_path)

# Set pad token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

print("✓ Tokenizer loaded")
print(f"  Vocab size: {len(tokenizer):,}")
print(f"  Pad token: {tokenizer.pad_token}")

print("\nTokenizing datasets...")

def tokenize_function(examples):
    """
    Tokenize the text and prepare for causal language modeling.
    """
    # Tokenize
    tokenized = tokenizer(
        examples['text'],
        truncation=True,
        max_length=config_stage2['model']['max_length'],
        padding="max_length",  # pad every example to max_length so batches have a uniform shape
        return_tensors=None,
    )

    # For causal LM, labels are a copy of input_ids, with padding positions
    # (attention_mask == 0) set to -100 so the loss ignores them.
    tokenized["labels"] = [
        [-100 if m == 0 else tok for tok, m in zip(ids, mask)]
        for ids, mask in zip(tokenized["input_ids"], tokenized["attention_mask"])
    ]

    return tokenized

# Tokenize datasets
print("  Tokenizing train dataset...")
tokenized_train_s2 = train_dataset_s2.map(
    tokenize_function,
    batched=True,
    remove_columns=train_dataset_s2.column_names,
    desc="Tokenizing train"
)

print("  Tokenizing validation dataset...")
tokenized_val_s2 = val_dataset_s2.map(
    tokenize_function,
    batched=True,
    remove_columns=val_dataset_s2.column_names,
    desc="Tokenizing val"
)

print(f"✓ Tokenization complete")
print(f"  Train: {len(tokenized_train_s2):,} examples")
print(f"  Val: {len(tokenized_val_s2):,} examples")


✓ Tokenizer loaded
  Vocab size: 50,257
  Pad token: <|endoftext|>

Tokenizing datasets...
  Tokenizing train dataset...


Tokenizing train:   0%|          | 0/30000 [00:00<?, ? examples/s]

  Tokenizing validation dataset...


Tokenizing val:   0%|          | 0/7500 [00:00<?, ? examples/s]

✓ Tokenization complete
  Train: 30,000 examples
  Val: 7,500 examples
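
The label-masking step is easier to see on a toy sequence. The sketch below uses plain Python lists; the token ids other than 50256 (GPT-2's `<|endoftext|>`, reused here as the pad token) are arbitrary, and the helper name is illustrative. As far as I know, `DataCollatorForLanguageModeling(mlm=False)` also rebuilds labels from `input_ids` and masks pad-token positions itself, so the manual masking may be redundant; that is worth checking against the installed `transformers` version.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_padding_labels(input_ids, attention_mask, ignore_index=IGNORE_INDEX):
    """Copy input_ids into labels, replacing padded positions with ignore_index."""
    return [
        ignore_index if m == 0 else tok
        for tok, m in zip(input_ids, attention_mask)
    ]


# Toy sequence padded to length 6.
input_ids = [318, 257, 1332, 50256, 50256, 50256]
attention_mask = [1, 1, 1, 0, 0, 0]
print(mask_padding_labels(input_ids, attention_mask))
# [318, 257, 1332, -100, -100, -100]
```

Masking by `attention_mask` rather than by token id keeps a genuine end-of-text token in the labels even though it shares an id with the pad token.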


## 8. Setup Training

In [20]:
# Create output directory
output_dir = Path(config_stage2['training']['output_dir'])
output_dir.mkdir(parents=True, exist_ok=True)

# Training arguments
training_args_s2 = TrainingArguments(
    output_dir=str(output_dir),
    num_train_epochs=config_stage2['training']['num_train_epochs'],
    per_device_train_batch_size=config_stage2['training']['per_device_train_batch_size'],
    per_device_eval_batch_size=config_stage2['training']['per_device_eval_batch_size'],
    gradient_accumulation_steps=config_stage2['training']['gradient_accumulation_steps'],
    learning_rate=config_stage2['training']['learning_rate'],
    warmup_steps=config_stage2['training']['warmup_steps'],
    weight_decay=config_stage2['training']['weight_decay'],
    fp16=config_stage2['training']['fp16'],
    optim=config_stage2['training']['optim'],
    lr_scheduler_type=config_stage2['training']['lr_scheduler_type'],
    logging_steps=config_stage2['training']['logging_steps'],
    save_steps=config_stage2['training']['save_steps'],
    eval_steps=config_stage2['training']['eval_steps'],
    save_total_limit=config_stage2['training']['save_total_limit'],
    eval_strategy="steps",  # named `evaluation_strategy` in older transformers releases
    load_best_model_at_end=True,
    report_to="wandb" if config_stage2['training']['use_wandb'] else "none",
)

print("✓ Training arguments configured")


✓ Training arguments configured


In [21]:
# Data collator
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False,
)

# Initialize trainer
trainer = Trainer(
    model=model_s2,
    args=training_args_s2,
    train_dataset=tokenized_train_s2,
    eval_dataset=tokenized_val_s2,
    data_collator=data_collator,
)

print("✓ Trainer initialized")

✓ Trainer initialized


## 9. Train!

In [22]:
# Optional: Login to W&B
if config_stage2['training']['use_wandb']:
    import wandb
    wandb.login()
    wandb.init(
        project=config_stage2['training']['wandb_project'],
        name=config_stage2['training']['wandb_run_name']
    )

print("\n" + "=" * 70)
print("STARTING STAGE 2 TRAINING")
print("=" * 70)
print(f"Training on: Recipe data")
print(f"Output: {output_dir}")
print("=" * 70)

# Train!
trainer.train()




STARTING STAGE 2 TRAINING
Training on: Recipe data
Output: /content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-recipe-final


`loss_type=None` was set in the config but it is unrecognized. Using the default loss: `ForCausalLMLoss`.


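| Step | Training Loss | Validation Loss |
|-----:|--------------:|----------------:|
| 100 | 2.1762 | 1.825304 |
| 200 | 1.6815 | 1.538985 |
| 300 | 1.608 | 1.49446 |
| 400 | 1.5815 | 1.470754 |
| 500 | 1.576 | 1.458774 |
| 600 | 1.5418 | 1.443982 |
| 700 | 1.5276 | 1.435878 |
| 800 | 1.528 | 1.427864 |
| 900 | 1.4864 | 1.421755 |
| 1000 | 1.4545 | 1.415498 |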


TrainOutput(global_step=3750, training_loss=1.5491782155354819, metrics={'train_runtime': 13461.7208, 'train_samples_per_second': 4.457, 'train_steps_per_second': 0.279, 'total_flos': 5.6519294976e+16, 'train_loss': 1.5491782155354819, 'epoch': 2.0})
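
The `TrainOutput` metrics are enough to derive the actual cost per step. A short check using only the values printed above; the variable names are illustrative.

```python
train_runtime_s = 13461.7208   # 'train_runtime' from TrainOutput
global_step = 3750             # optimizer steps completed
samples_seen = 30_000 * 2      # 30,000 training examples x 2 epochs

print(f"{train_runtime_s / 60:.0f} minutes total")                  # 224 minutes total
print(f"{train_runtime_s / global_step:.2f} s per optimizer step")  # 3.59 s per optimizer step
print(f"{samples_seen / train_runtime_s:.3f} samples/s")            # 4.457 samples/s
```

The run took about 224 minutes on the L4, at roughly 3.6 seconds per step instead of the 2 seconds assumed in the earlier estimate.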

## 10. Evaluate

In [23]:
# Evaluate
print("\nEvaluating model on validation set...")
eval_results = trainer.evaluate()

print(f"\nValidation Results:")
print(f"  Loss: {eval_results['eval_loss']:.4f}")
print(f"  Perplexity: {torch.exp(torch.tensor(eval_results['eval_loss'])):.2f}")


Evaluating model on validation set...



Validation Results:
  Loss: 1.3619
  Perplexity: 3.90
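
Perplexity here is the exponential of the mean per-token cross-entropy loss (natural log). The same number can be reproduced without `torch`; the function name is illustrative.

```python
import math


def perplexity(mean_nll):
    """Perplexity from a mean negative log-likelihood measured in nats."""
    return math.exp(mean_nll)


print(f"{perplexity(1.3619):.2f}")   # 3.90
```

Since the loss is averaged over GPT-2 BPE tokens, this value is only comparable with models that use the same tokenizer.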


## 11. Save Final Model

In [24]:
# Save final model
print("\nSaving final model...")

final_model_path = output_dir / "final"
final_model_path.mkdir(exist_ok=True)

trainer.save_model(str(final_model_path))
tokenizer.save_pretrained(str(final_model_path))

print(f"✓ Model saved to: {final_model_path}")

# Save config
import json
config_path = final_model_path / "training_config.json"
with open(config_path, 'w') as f:
    json.dump(config_stage2, f, indent=2)
print(f"✓ Config saved to: {config_path}")


Saving final model...
✓ Model saved to: /content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-recipe-final/final
✓ Config saved to: /content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-recipe-final/final/training_config.json
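
Because the Stage 1 adapter was merged into the base weights before Stage 2 training, the adapter saved here presumably contains only the Stage 2 update. Under that assumption, reloading for inference would need to rebuild the merged Stage 1 model first. The sketch below reuses the same `PeftModel.from_pretrained` and `merge_and_unload` calls as Section 5; the function name and its arguments are illustrative, and it has not been run against the saved files.

```python
def load_stage2_model(stage1_adapter_path, stage2_adapter_path,
                      base_name="gpt2-medium"):
    """Rebuild base + merged Stage 1 weights, then attach the Stage 2 adapter."""
    # Imported lazily so the function can be defined without the libraries loaded.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(base_name)
    # Recreate the model that Stage 2 was trained on top of.
    stage1 = PeftModel.from_pretrained(base, stage1_adapter_path).merge_and_unload()
    # Attach the Stage 2 LoRA weights saved by this notebook.
    return PeftModel.from_pretrained(stage1, stage2_adapter_path)
```

In this notebook the two arguments would be `STAGE1_MODEL_PATH` and `str(final_model_path)`. Loading the Stage 2 adapter directly onto plain `gpt2-medium` would likely skip the Stage 1 weights without raising an error.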


## 12. Test the Model

In [25]:
# Test the final model
print("\n" + "=" * 70)
print("TESTING FINAL MODEL")
print("=" * 70)

test_prompts = [
    "<|user|> [INGREDIENTS] eggs, milk, cheese <|assistant|>",
    "<|user|> how do i make pasta carbonara? <|assistant|>",
    "<|user|> what can i make with chicken and rice? <|assistant|>",
]

for prompt in test_prompts:
    print(f"\nPrompt: {prompt}")

    inputs = tokenizer(prompt, return_tensors="pt").to(model_s2.device)
    outputs = model_s2.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

    response = tokenizer.decode(outputs[0], skip_special_tokens=False)
    print(f"Response: {response}")
    print("-" * 70)


TESTING FINAL MODEL

Prompt: <|user|> [INGREDIENTS] eggs, milk, cheese <|assistant|>
Response: <|user|> [INGREDIENTS] eggs, milk, cheese <|assistant|> i see youve got eggs, milk, cheese. i would suggest making eggplant and cream cheese lasagna. would you like to know how to make it? user: yes system: okay heres how you can make it: 1. preheat oven to 425f 2. place eggplant in a baking dish, and combine the milk, cheese and thyme 3. mix well 4. place eggplant in a casserole dish, and top with cheese mixture 5. bake uncovered in
----------------------------------------------------------------------

Prompt: <|user|> how do i make pasta carbonara? <|assistant|>
Response: <|user|> how do i make pasta carbonara? <|assistant|> to make pasta carbonara, youll need: 1 cup whole wheat pasta, 1/4 cup butter, melted, 1 cup ricotta cheese, 1 cup parmesan cheese, shredded, 8 ounces (1 1/4 oz) pasta, 4 ounces (1 1/2 oz) grated parmesan cheese, 1 cup shredded parmesan, 12 ounces (1 1/2 oz) shredded m
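
The decoded output repeats the prompt because the whole generated sequence is decoded. A minimal sketch for pulling out just the reply, assuming the decoded text starts with the prompt (as it does in the outputs above). The helper name and the default stop markers are illustrative choices.

```python
def extract_reply(decoded, prompt, stop_markers=("<|user|>", "<|endoftext|>")):
    """Drop the echoed prompt and cut the reply at the first stop marker, if any."""
    reply = decoded[len(prompt):] if decoded.startswith(prompt) else decoded
    for marker in stop_markers:
        idx = reply.find(marker)
        if idx != -1:
            reply = reply[:idx]
    return reply.strip()


prompt = "<|user|> how do i make pasta carbonara? <|assistant|>"
decoded = prompt + " to make pasta carbonara, youll need: 1 cup whole wheat pasta"
print(extract_reply(decoded, prompt))
# to make pasta carbonara, youll need: 1 cup whole wheat pasta
```

An alternative that avoids string matching is to slice the generated ids before decoding, for example `outputs[0][inputs["input_ids"].shape[1]:]`.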

## Summary

✅ Stage 2 training complete!

**Model saved to:** `/content/drive/MyDrive/LLM_Models/cooking-assistant-project/models/gpt2-recipe-final/final`

**Next steps:**
1. Test the model with various ingredient combinations
2. Integrate with your CNN/YOLO ingredient detection
3. Build the full cooking assistant pipeline