# Assignment 2: Production Model Training for Recipe Recommendation System

## Project Overview
This notebook demonstrates the development and training of a machine learning model for the recipe recommendation system in my mobile app, PantryPal. 
PantryPal is a real-world cooking app where users enter the ingredients they have and receive recipe suggestions. In addition to ingredient-count-based recommendations,
I wanted to add a general recommendation system that ranks all of our roughly 2,000 recipes by how likely each user is to engage with them.

As the owner of this recipe app, I have collected user interaction data (views, favorites, cooking attempts) and built a personalized recommendation system to improve user engagement and help users discover recipes they'll love.

## Business Problem
PantryPal serves hundreds of users who interact with recipes in various ways. The challenge is to predict which recipes a user is most likely to engage with, based on their historical behavior and on recipe characteristics. This is a hybrid recommendation problem: collaborative signals from user interactions combined with content-based recipe features.

## Outline
This notebook demonstrates the complete ML pipeline:
- Environment setup (Colab-compatible)
- Data preparation using `TrainingDataBuilder`
- Model training, evaluation, and saving via `RecipeRanker` 
- Model artifacts for production inference
- Discussion of learning objectives, loss functions, and evaluation metrics



## A2 Criteria: Definition of the Learning Task (Training)
- Input: user–recipe interactions, recipe features, engagement-weighted labels with negatives
- Output: train/val/test datasets, feature columns, metadata, trained model artifacts


In [1]:
# Environment Setup
# This cell configures the environment for both local development and Google Colab
import sys, subprocess, os, pathlib

IN_COLAB = "google.colab" in sys.modules
repo_root = pathlib.Path.cwd()

# Install required packages for Colab environments
if IN_COLAB:
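    # With check=False a failed pip install does not raise; it only sets a
    # non-zero return code, so inspect that instead of catching an exception
    result = subprocess.run([sys.executable, "-m", "pip", "install", "-q",
                             "lightgbm", "pandas", "numpy", "scikit-learn", "matplotlib", "seaborn"],
                            check=False)
    if result.returncode != 0:
        print(f"pip install warning: exit code {result.returncode}")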

    # Clone the PantryPal ML repository if not already present
    if not (repo_root / "recipe_recommender").exists():
        subprocess.run(["git", "clone", "-q", "https://github.com/marcel-qayoom-taylor/PantryPalML.git"], check=True)
        os.chdir("PantryPalML")
        repo_root = pathlib.Path.cwd()

print(f"Environment ready. Project root: {repo_root}")


Environment ready. Project root: /Users/marcelqayoomtaylor/Documents/GitHub/PantryPalML/notebooks


In [2]:
# Configuration Management
# Import the centralized configuration system for the ML pipeline
from recipe_recommender.config import get_ml_config

# The MLConfig object contains all hyperparameters, file paths, and model settings
# This approach ensures reproducibility and makes hyperparameter tuning systematic
config = get_ml_config()

print("ML Pipeline Configuration:")
print(" - output_dir:", config.output_dir)
print(" - input_dir:", config.input_dir)
print(" - model_dir:", config.model_dir)


ML Pipeline Configuration:
 - output_dir: /Users/marcelqayoomtaylor/Documents/GitHub/PantryPalML/recipe_recommender/output
 - input_dir: /Users/marcelqayoomtaylor/Documents/GitHub/PantryPalML/recipe_recommender/input
 - model_dir: /Users/marcelqayoomtaylor/Documents/GitHub/PantryPalML/recipe_recommender/output/hybrid_models


## Data Preparation Phase

### Objective
Transform raw user interaction events and recipe metadata into a structured ML dataset suitable for training a ranking model. This involves feature engineering, negative sampling, and creating train/validation/test splits.

### Data Sources
- **User interaction events**: Real app analytics data showing user engagement with recipes 
- **Recipe database**: Complete recipe metadata including ingredients, cooking times, ratings, etc.
- **Recipe-ingredient relationships**: Detailed ingredient lists for content-based filtering

The `TrainingDataBuilder` class orchestrates this entire data preparation pipeline.
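
The internals of `TrainingDataBuilder` are not shown in this notebook, so the following is only a minimal sketch of what negative sampling and splitting can look like. The function names, the 4:1 negative ratio, the uniform sampling strategy, and the choice to split by whole users are all assumptions made for illustration, not the builder's actual behavior.

```python
import random

def sample_negatives(positives, all_recipe_ids, n_neg_per_pos=4, seed=42):
    """Return (user_id, recipe_id, label) rows: positives plus sampled negatives.

    positives: dict mapping user_id -> set of recipe ids the user engaged with.
    Negatives are drawn uniformly from recipes the user has NOT engaged with
    (an assumed strategy; the real pipeline may sample differently).
    """
    rng = random.Random(seed)
    rows = []
    for user_id, seen in sorted(positives.items()):
        candidates = sorted(set(all_recipe_ids) - seen)
        n_neg = min(len(candidates), n_neg_per_pos * len(seen))
        negatives = rng.sample(candidates, n_neg)
        rows.extend((user_id, r, 1) for r in sorted(seen))
        rows.extend((user_id, r, 0) for r in negatives)
    return rows

def split_by_user(rows, val_frac=0.2, test_frac=0.2, seed=42):
    """Assign whole users to train/val/test so each ranking list stays intact."""
    users = sorted({user_id for user_id, _, _ in rows})
    random.Random(seed).shuffle(users)
    n_test = int(len(users) * test_frac)
    n_val = int(len(users) * val_frac)
    test_users = set(users[:n_test])
    val_users = set(users[n_test:n_test + n_val])
    split = {"train": [], "val": [], "test": []}
    for row in rows:
        if row[0] in test_users:
            split["test"].append(row)
        elif row[0] in val_users:
            split["val"].append(row)
        else:
            split["train"].append(row)
    return split
```

Splitting by user keeps every user's candidate list in a single split, which matters for a ranking objective that is evaluated per user. Whether the real pipeline splits this way is not stated here.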


## A2 Criteria: Knowledge of Model and Algorithm
- LightGBM LambdaRank: an NDCG-based objective over user-grouped lists
- Theory → code: objective/metric/ndcg_eval_at, group arrays, early stopping


In [3]:
# Import the training data builder
from recipe_recommender.models.training_data_builder import TrainingDataBuilder

# Initialize the data builder with our configuration
builder = TrainingDataBuilder(config)

# Step 1: Load recipe metadata from the production database
# This includes recipe details, ingredients, nutritional info, etc.
ok_recipes = builder.load_real_recipe_data()

# Step 2: Extract user interaction history from event logs
# This processes real app analytics to understand user preferences
ok_events = builder.extract_user_interactions_from_events()

# Validate that all required data sources are available
if not (ok_recipes and ok_events):
    raise RuntimeError("Missing required data files. Ensure recipe and event outputs exist in recipe_recommender/output.")

# Step 3: Create user profile features 
# Aggregates per-user statistics: average ratings, activity levels, platform preferences
user_profiles = builder.create_user_profiles()

# Step 4: Generate training pairs with labels
# Creates positive (user engaged with recipe) and negative (user likely not interested) pairs
training_pairs = builder.create_user_recipe_pairs()

# Step 5: Build final feature matrix and data splits
# Combines user profiles, recipe features, and interaction patterns into ML-ready format
train_df, val_df, test_df = builder.prepare_training_data()

print("Dataset shapes - Train:", train_df.shape, "Validation:", val_df.shape, "Test:", test_df.shape)


2025-10-03 19:36:40,398 - recipe_recommender.models.training_data_builder - INFO - Initialized Training Data Builder
2025-10-03 19:36:40,399 - recipe_recommender.models.training_data_builder - INFO - Loading real recipe database
2025-10-03 19:36:40,411 - recipe_recommender.models.training_data_builder - INFO - Loaded 1967 recipes with enhanced features
2025-10-03 19:36:40,415 - recipe_recommender.models.training_data_builder - INFO - Loaded 21439 recipe-ingredient relationships
2025-10-03 19:36:40,418 - recipe_recommender.models.training_data_builder - INFO - Loaded 2092 ingredients
2025-10-03 19:36:40,418 - recipe_recommender.models.training_data_builder - INFO - Extracting user interactions from events
2025-10-03 19:36:40,419 - recipe_recommender.models.training_data_builder - INFO -    Processing v1_events_20250827.json...
2025-10-03 19:36:40,731 - recipe_recommender.models.training_data_builder - INFO -    Processing v2_events_20250920.json...
2025-10-03 19:36:40,873 - recipe_recom

Dataset shapes - Train: (11040, 52) Validation: (3680, 52) Test: (3680, 52)


## A2 Criteria: Evaluation and Improvement
- Behavior: outputs relevance scores for ranking
- Loss vs objective: LambdaRank optimizes a proxy for NDCG; the business cares about top-N quality
- Metrics: NDCG@k, Recall@k, Spearman
- Improvements: tune sampling, features, and ranking cutoffs
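
For reference, the two ranking metrics listed above can be computed for a single user's candidate list roughly as follows. This is a sketch using the common `2^rel - 1` gain and `log2` position discount. It is not necessarily how `RecipeRanker.evaluate_model()` computes its numbers, and the function names are my own.

```python
import numpy as np

def dcg_at_k(relevance, k):
    """Discounted cumulative gain of the first k items, using gain 2^rel - 1."""
    rel = np.asarray(relevance, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum((2.0 ** rel - 1.0) / discounts))

def ndcg_at_k(labels, scores, k):
    """NDCG@k for one user's list; returns 0.0 if the list has no positives."""
    labels = np.asarray(labels, dtype=float)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    ideal = dcg_at_k(np.sort(labels)[::-1], k)
    return dcg_at_k(labels[order], k) / ideal if ideal > 0 else 0.0

def recall_at_k(labels, scores, k):
    """Fraction of a user's positive items that appear in the top k."""
    labels = np.asarray(labels)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    n_pos = int(np.sum(labels > 0))
    return float(np.sum(labels[order][:k] > 0)) / n_pos if n_pos else 0.0
```

A dataset-level figure is then usually the mean of these per-user values, which is why users with no positives need an explicit convention (here they score 0.0).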


## Model Training and Evaluation

### Objective
Train a Learning-to-Rank model using LightGBM's Lambdarank objective to optimize for recommendation quality. The model learns to score user-recipe pairs such that recipes the user is more likely to engage with receive higher scores.

### Model Architecture
- **Algorithm**: LightGBM Gradient Boosting with Lambdarank objective. 
- **Loss Function**: LambdaRank does not minimize an explicit loss. It computes gradients ("lambdas") directly from pairwise comparisons within each user's list, weighting each pair by how much swapping the two items would change the ranking metric (NDCG in our case). 
- **Task**: Learning-to-Rank for personalized recipe recommendations  
- **Optimization Target**: Normalized Discounted Cumulative Gain (NDCG@k)
- **Features**: 34 features (22 computed, 12 natural) combining user behavior, recipe content, and compatibility signals

The `RecipeRanker` class handles the complete training, evaluation, and persistence workflow.


In [4]:

# Import the recipe ranking model
from recipe_recommender.models.recipe_ranker import RecipeRanker

# Initialize the ranker with our configuration
ranker = RecipeRanker(config)

# Step 1: Load the prepared training datasets
# This loads the train/validation/test splits created by the TrainingDataBuilder
ranker.load_training_data()

# Step 2: Load recipe feature metadata
# Ensures recipe-level features are available for scoring and evaluation
ranker.load_recipe_features()

# Step 3: Train the LightGBM model
# Uses Lambdarank objective to optimize directly for ranking quality (NDCG)
# Early stopping prevents overfitting based on validation NDCG performance
ranker.train_model()

# Step 4: Evaluate model performance
# Reports ranking metrics (NDCG@k, Recall@k) and Spearman correlation, as shown in the log output
ranker.evaluate_model()

# Step 5: Analyze feature importance
# LightGBM provides feature importance scores based on information gain
importance = ranker.get_feature_importance()
print("Top 10 Most Important Features:")
print(importance.head(10))

# Step 6: Save model artifacts for production inference
# Saves the trained booster, feature metadata, and configuration
ranker.save_model()

print("Training complete! Model artifacts saved to:", config.model_dir)


2025-10-03 19:36:53,738 - recipe_recommender.models.recipe_ranker - INFO - Initialized Recipe Ranker with lightgbm
2025-10-03 19:36:53,739 - recipe_recommender.models.recipe_ranker - INFO - Loading training data
2025-10-03 19:36:53,922 - recipe_recommender.models.recipe_ranker - INFO - Successfully loaded training data:
2025-10-03 19:36:53,923 - recipe_recommender.models.recipe_ranker - INFO -    Train: 11,040 samples
2025-10-03 19:36:53,924 - recipe_recommender.models.recipe_ranker - INFO -    Validation: 3,680 samples
2025-10-03 19:36:53,924 - recipe_recommender.models.recipe_ranker - INFO -    Test: 3,680 samples
2025-10-03 19:36:53,924 - recipe_recommender.models.recipe_ranker - INFO - Loaded 29 feature columns
2025-10-03 19:36:53,925 - recipe_recommender.models.recipe_ranker - INFO - Loaded training metadata
2025-10-03 19:36:53,938 - recipe_recommender.models.recipe_ranker - INFO - Loaded raw recipe features from enhanced_recipe_features_from_db.csv
2025-10-03 19:36:53,939 - recip

Training until validation scores don't improve for 50 rounds
Early stopping, best iteration is:
[11]	train's ndcg@5: 0.99962	train's ndcg@10: 0.999601	train's ndcg@20: 0.999672	validation's ndcg@5: 1	validation's ndcg@10: 1	validation's ndcg@20: 1


2025-10-03 19:36:54,390 - recipe_recommender.models.recipe_ranker - INFO - Model performance:
2025-10-03 19:36:54,391 - recipe_recommender.models.recipe_ranker - INFO -    NDCG@5: 0.6545
2025-10-03 19:36:54,391 - recipe_recommender.models.recipe_ranker - INFO -    NDCG@10: 0.6545
2025-10-03 19:36:54,391 - recipe_recommender.models.recipe_ranker - INFO -    Recall@5: 0.9555
2025-10-03 19:36:54,391 - recipe_recommender.models.recipe_ranker - INFO -    Recall@10: 0.9894
2025-10-03 19:36:54,391 - recipe_recommender.models.recipe_ranker - INFO -    Spearman Correlation: 0.9958
2025-10-03 19:36:54,395 - recipe_recommender.models.recipe_ranker - INFO - Saving trained model
2025-10-03 19:36:54,397 - recipe_recommender.models.recipe_ranker - INFO - Model saved to: hybrid_lightgbm_model.txt
2025-10-03 19:36:54,397 - recipe_recommender.models.recipe_ranker - INFO - Metadata saved to: hybrid_lightgbm_metadata.json


Top 10 Most Important Features:
                          feature   importance
20  user_recipe_interaction_count  7023.643156
21         user_recipe_max_rating  4744.439364
22         user_recipe_avg_rating  2948.425036
3                      rating_std    47.143476
1                      avg_rating    45.242657
4                  unique_recipes    24.522861
0              total_interactions    21.124030
7                engagement_score    14.515322
6            interactions_per_day     9.407445
2                    total_rating     8.828276
Training complete! Model artifacts saved to: /Users/marcelqayoomtaylor/Documents/GitHub/PantryPalML/recipe_recommender/output/hybrid_models


## Learning Task Analysis

### Problem Formulation
This is a **Learning-to-Rank** problem rather than traditional binary classification:

- **Input**: User-recipe feature vectors with binary engagement labels (1 = user engaged, 0 = no engagement)
- **Output**: Relevance scores for ranking recipes per user
- **Objective**: Learn a scoring function that ranks recipes users will engage with higher than those they won't

### Model Architecture Decisions
- **Algorithm Choice**: LightGBM with Lambdarank objective
- **Why Ranking**: We care about the order of recommendations, not calibrated probabilities
- **Optimization Target**: NDCG@k directly aligns with business goals (top-N recommendation quality)
- **Evaluation**: Primary metrics are NDCG@k and Recall@k; Spearman correlation is also reported.

### Business Alignment
The LambdaRank objective optimizes a proxy for what we care about in production: presenting users with the most relevant recipes at the top of their recommendation list.


## Lambdarank Algorithm Analysis

### Theoretical Foundation

**LambdaRank** is a learning-to-rank algorithm that trains on pairwise comparisons, weighting each pair by its effect on a listwise ranking metric such as NDCG.

### Why Lambdarank for Recipe Recommendations?

1. **Metric Alignment**: Traditional classification losses (cross-entropy) don't directly optimize for ranking quality. LambdaRank approximates NDCG gradients, so the training objective closely tracks our evaluation criteria.

2. **Handles Recommendation Challenges**:
   - **Class imbalance**: Most user-recipe pairs are negative (sparse interactions)
   - **Variable list lengths**: Different users have different numbers of candidate recipes
   - **Position sensitivity**: Top recommendations matter more than bottom ones

3. **Pairwise Learning**: For each user group, the algorithm considers pairs of recipes and learns to score the engaging recipe higher than the non-engaging one.

### Technical Implementation

- **Objective Function**: `objective = "lambdarank"` in LightGBM
- **Metric**: `metric = "ndcg"` with evaluation at ranks 5, 10, and 20
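- **Loss Function**: Pairwise logistic loss, with each pair weighted by its effect on NDCG:
  ```
  L = Σ_(i,j) |ΔNDCG_ij| * log(1 + exp(-(s_i - s_j)))
  ```
  where the sum runs over pairs in the same user's list in which recipe i is more relevant than recipe j, s_i and s_j are the model scores, and |ΔNDCG_ij| is the absolute change in NDCG if recipes i and j were swapped. The lambdas are the gradients of this expression with respect to the scores.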
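
The pairwise weighting described above can be sketched in plain NumPy for a single user's list. This is an illustrative version of the textbook LambdaRank update and not LightGBM's exact implementation, which adds its own normalization and optimizations. The function name and the `sigma` parameter are assumptions, and the sign convention here is "positive means the score should go up".

```python
import numpy as np

def lambdarank_lambdas(labels, scores, sigma=1.0):
    """Per-item lambdas for one user's list.

    For every pair (i, j) with labels[i] > labels[j], the pairwise logistic
    gradient is scaled by |delta NDCG| from swapping i and j in the current
    ranking. A positive lambda means the item's score should increase.
    """
    labels = np.asarray(labels, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = labels.size
    # 0-based rank position of each item under the current scores.
    order = np.argsort(-scores, kind="stable")
    rank = np.empty(n, dtype=int)
    rank[order] = np.arange(n)
    gain = 2.0 ** labels - 1.0
    discount = 1.0 / np.log2(rank + 2.0)
    ideal_dcg = np.sum(np.sort(gain)[::-1] / np.log2(np.arange(2, n + 2)))
    lambdas = np.zeros(n)
    if ideal_dcg == 0:
        return lambdas  # no relevant items, nothing to learn from this list
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue
            # Change in NDCG if items i and j traded positions.
            delta_ndcg = abs((gain[i] - gain[j]) * (discount[i] - discount[j])) / ideal_dcg
            # Derivative magnitude of log(1 + exp(-sigma * (s_i - s_j))).
            rho = 1.0 / (1.0 + np.exp(sigma * (scores[i] - scores[j])))
            lambdas[i] += sigma * delta_ndcg * rho  # push the more relevant item up
            lambdas[j] -= sigma * delta_ndcg * rho  # push the less relevant item down
    return lambdas
```

For a misordered pair such as `labels=[1, 0]` with `scores=[0.0, 1.0]`, the first item receives a positive lambda and the second an equal negative one, so the lambdas within a list sum to zero.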

### Production Implications

- **Scoring**: Model outputs continuous scores; higher scores indicate better recommendations
- **Ranking**: No threshold needed—simply rank recipes by score for top-N recommendations
- **Optimization**: Early stopping on validation NDCG limits overfitting to the training data
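
As a sketch of the ranking step, here is how top-N selection from model scores might look at inference time. The function name `top_n_recipes` and the `exclude` argument (for example, recipes the user has already cooked) are hypothetical; in production the scores would come from the trained booster's `predict` call on the user's candidate feature rows.

```python
import numpy as np

def top_n_recipes(recipe_ids, scores, n=10, exclude=()):
    """Return the n highest-scoring recipe ids, best first, skipping excluded ids."""
    recipe_ids = np.asarray(recipe_ids)
    scores = np.asarray(scores, dtype=float)
    excluded = set(exclude)
    # Boolean mask of candidates that are still eligible.
    keep = np.array([rid not in excluded for rid in recipe_ids.tolist()], dtype=bool)
    ids, kept_scores = recipe_ids[keep], scores[keep]
    # Stable sort keeps the original candidate order among tied scores.
    order = np.argsort(-kept_scores, kind="stable")[:n]
    return ids[order].tolist()
```

Because only the order matters, the raw scores do not need to be calibrated or thresholded before this step.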



## Model Artifact Verification

### Objective
Verify that all necessary model artifacts have been correctly saved for production deployment. This ensures the training pipeline completed successfully and the model is ready for inference.


In [5]:
model_file = config.model_dir / "hybrid_lightgbm_model.txt"
meta_file = config.model_dir / "hybrid_lightgbm_metadata.json"

print("Model exists:", model_file.exists(), model_file)
print("Metadata exists:", meta_file.exists(), meta_file)


Model exists: True /Users/marcelqayoomtaylor/Documents/GitHub/PantryPalML/recipe_recommender/output/hybrid_models/hybrid_lightgbm_model.txt
Metadata exists: True /Users/marcelqayoomtaylor/Documents/GitHub/PantryPalML/recipe_recommender/output/hybrid_models/hybrid_lightgbm_metadata.json
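
Existence alone does not show that the files are usable. A slightly stricter check, sketched below, also confirms that both files are non-empty and that the metadata parses as JSON. The helper name `check_artifacts` is hypothetical and the metadata schema is not assumed; a fuller check would also load the model file with LightGBM and score a sample row.

```python
import json
from pathlib import Path

def check_artifacts(model_path, metadata_path):
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    for label, path in (("model", Path(model_path)), ("metadata", Path(metadata_path))):
        if not path.is_file():
            problems.append(f"{label} file missing: {path}")
        elif path.stat().st_size == 0:
            problems.append(f"{label} file is empty: {path}")
    if not problems:
        # Only attempt to parse once both files are known to exist and be non-empty.
        try:
            json.loads(Path(metadata_path).read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"metadata is not valid JSON: {exc}")
    return problems
```

In this notebook it could be called as `check_artifacts(model_file, meta_file)`, treating any returned message as a failed training run.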
