# Tutorial 19: Case Study - Movie Recommendation System

## End-to-End ML System Design Using the 7-Step Framework

---

## Learning Objectives

By the end of this tutorial, you will be able to:

1. **Apply the complete 7-step ML system design framework** to build a movie recommendation system
2. **Design candidate generation and ranking stages** using the two-tower architecture
3. **Implement collaborative filtering, content-based filtering, and hybrid approaches**
4. **Handle cold start problems** for new users and new items
5. **Design offline and online evaluation strategies** for recommendation quality
6. **Create a production-ready deployment architecture** with monitoring

## Setup and Imports

In [None]:
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder, MinMaxScaler
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

np.random.seed(42)
print('All imports successful!')

---

# 1. Problem Statement and Requirements

## 1.1 Business Context

**Scenario**: Design a movie recommendation system for a streaming platform that:
1. Increases user engagement (watch time)
2. Helps users discover new content
3. Reduces churn by keeping users satisfied

In [None]:
class RequirementsDocument:
    def __init__(self):
        self.business = {
            'primary_objective': 'Maximize user engagement (watch time)',
            'secondary': ['Content discovery', 'User retention', 'Catalog coverage']
        }
        self.scale = {
            'users': '100 million',
            'items': '50,000 movies',
            'qps': '100,000 at peak'
        }
        self.latency = {'p50': '50ms', 'p99': '200ms'}
        
    def display(self):
        print('REQUIREMENTS SUMMARY')
        print('=' * 50)
        print(f"Objective: {self.business['primary_objective']}")
        print(f"Scale: {self.scale['users']} users, {self.scale['items']}")
        print(f"Latency: p50={self.latency['p50']}, p99={self.latency['p99']}")

req = RequirementsDocument()
req.display()
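
The scale figures above can be turned into rough capacity numbers. The sketch below is a back-of-envelope estimate only: the 32-dimensional float32 embeddings and the 500-candidate fan-out are assumptions borrowed from later sections of this tutorial, not measured values.

```python
# Back-of-envelope sizing from the stated requirements (illustrative assumptions only)
N_USERS = 100_000_000      # from the requirements
N_ITEMS = 50_000           # from the requirements
PEAK_QPS = 100_000         # from the requirements
EMB_DIM = 32               # assumption: embedding size used later in this tutorial
BYTES_PER_FLOAT = 4        # assumption: float32 storage
CANDIDATES = 500           # assumption: candidates passed to the ranker per request

user_table_gb = N_USERS * EMB_DIM * BYTES_PER_FLOAT / 1e9
item_table_mb = N_ITEMS * EMB_DIM * BYTES_PER_FLOAT / 1e6
ranker_scores_per_sec = PEAK_QPS * CANDIDATES

print(f'User embedding table: {user_table_gb:.1f} GB')
print(f'Item embedding table: {item_table_mb:.1f} MB')
print(f'Ranker scorings per second at peak: {ranker_scores_per_sec:,}')
```

Under these assumptions the item table is small enough to hold in memory on every serving node, while the user table is large enough that it would likely live in a sharded key-value store.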

## 1.2 Two-Stage Architecture

```
User Request -> Candidate Generation (50K -> 500) -> Ranking (500 -> 20) -> Results
```

In [None]:
fig, ax = plt.subplots(figsize=(12, 4))
ax.set_xlim(0, 12)
ax.set_ylim(0, 4)
ax.axis('off')

boxes = [
    (0.5, 1, 2, 2, 'User\nRequest', '#3498db'),
    (3.5, 0.5, 2.5, 3, 'Candidate\nGeneration\n50K->500', '#2ecc71'),
    (7, 0.5, 2.5, 3, 'Ranking\nModel\n500->20', '#e74c3c'),
    (10.5, 1, 1.5, 2, 'Results', '#9b59b6')
]

for x, y, w, h, label, color in boxes:
    rect = plt.Rectangle((x, y), w, h, facecolor=color, alpha=0.3, edgecolor=color, lw=2)
    ax.add_patch(rect)
    ax.text(x + w/2, y + h/2, label, ha='center', va='center', fontsize=10, fontweight='bold')

for x1, x2 in [(2.5, 3.5), (6, 7), (9.5, 10.5)]:
    ax.annotate('', xy=(x2, 2), xytext=(x1, 2), arrowprops=dict(arrowstyle='->', lw=2))

ax.set_title('Two-Stage Recommendation Architecture', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

---

# 2. Data Generation and Preparation

In [None]:
class MovieDataGenerator:
    def __init__(self, n_users=5000, n_movies=2000, n_interactions=100000):
        self.n_users = n_users
        self.n_movies = n_movies
        self.n_interactions = n_interactions
        self.genres = ['Action', 'Comedy', 'Drama', 'Horror', 'Sci-Fi', 'Romance', 'Thriller']
        
    def generate_users(self):
        np.random.seed(42)
        users = pd.DataFrame({
            'user_id': [f'user_{i}' for i in range(self.n_users)],
            'age_group': np.random.choice(['18-24', '25-34', '35-44', '45+'], self.n_users),
            'country': np.random.choice(['US', 'UK', 'CA', 'DE', 'FR'], self.n_users)
        })
        users['preferred_genres'] = [list(np.random.choice(self.genres, 2, replace=False)) for _ in range(self.n_users)]
        return users
    
    def generate_movies(self):
        np.random.seed(43)
        movies = pd.DataFrame({
            'movie_id': [f'movie_{i}' for i in range(self.n_movies)],
            'title': [f'Movie {i}' for i in range(self.n_movies)],
            'year': np.random.randint(1990, 2024, self.n_movies),
            'duration': np.random.normal(110, 25, self.n_movies).astype(int).clip(75, 200),
            'avg_rating': np.random.beta(6, 4, self.n_movies) * 4 + 1
        })
        movies['genres'] = [list(np.random.choice(self.genres, np.random.randint(1, 3), replace=False)) for _ in range(self.n_movies)]
        return movies
    
    def generate_interactions(self, users, movies):
        np.random.seed(44)
        interactions = []
        user_prefs = dict(zip(users['user_id'], users['preferred_genres']))
        movie_genres = dict(zip(movies['movie_id'], movies['genres']))
        
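        user_ids = users['user_id'].to_numpy()
        movie_ids = movies['movie_id'].to_numpy()
        matching_cache = {}  # user_id -> movies sharing at least one preferred genre
        
        for _ in range(self.n_interactions):
            user_id = np.random.choice(user_ids)
            
            # With probability 0.7, pick a movie that matches the user's preferred genres
            if np.random.random() < 0.7:
                if user_id not in matching_cache:
                    prefs = set(user_prefs[user_id])
                    matching_cache[user_id] = [m for m, g in movie_genres.items() if prefs & set(g)]
                matching = matching_cache[user_id]
                movie_id = np.random.choice(matching) if matching else np.random.choice(movie_ids)
            else:
                movie_id = np.random.choice(movie_ids)
            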
            watch_pct = np.clip(np.random.beta(5, 3), 0, 1)
            rating = round(np.clip(np.random.normal(3.5 + watch_pct, 0.8), 1, 5), 1) if np.random.random() < 0.3 else None
            
            interactions.append({
                'user_id': user_id,
                'movie_id': movie_id,
                'timestamp': pd.Timestamp('2023-01-01') + pd.Timedelta(days=np.random.randint(0, 365)),
                'watch_pct': round(watch_pct * 100, 1),
                'rating': rating,
                'completed': watch_pct >= 0.9
            })
        return pd.DataFrame(interactions)
    
    def generate_all(self):
        users = self.generate_users()
        movies = self.generate_movies()
        interactions = self.generate_interactions(users, movies)
        print(f'Generated: {len(users)} users, {len(movies)} movies, {len(interactions)} interactions')
        return users, movies, interactions

gen = MovieDataGenerator()
users_df, movies_df, interactions_df = gen.generate_all()

In [None]:
print('Users:')
print(users_df.head())
print('\nMovies:')
print(movies_df[['movie_id', 'title', 'year', 'genres']].head())
print('\nInteractions:')
print(interactions_df.head())

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Ratings
ratings = interactions_df[interactions_df['rating'].notna()]['rating']
axes[0, 0].hist(ratings, bins=20, edgecolor='black', alpha=0.7)
axes[0, 0].set_title('Rating Distribution')

# Watch percentage
axes[0, 1].hist(interactions_df['watch_pct'], bins=30, edgecolor='black', alpha=0.7, color='green')
axes[0, 1].set_title('Watch Percentage Distribution')

# User activity
user_counts = interactions_df['user_id'].value_counts()
axes[1, 0].hist(user_counts, bins=50, edgecolor='black', alpha=0.7, color='orange')
axes[1, 0].set_title('Interactions per User')
axes[1, 0].set_yscale('log')

# Movie popularity
movie_counts = interactions_df['movie_id'].value_counts()
axes[1, 1].hist(movie_counts, bins=50, edgecolor='black', alpha=0.7, color='red')
axes[1, 1].set_title('Movie Popularity (Long Tail)')
axes[1, 1].set_yscale('log')

plt.tight_layout()
plt.show()

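# Sparsity over distinct (user, movie) pairs; repeat views of the same movie count once
n_pairs = interactions_df[['user_id', 'movie_id']].drop_duplicates().shape[0]
sparsity = 1 - n_pairs / (len(users_df) * len(movies_df))
print(f'Matrix sparsity: {sparsity:.2%}')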
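
The popularity histogram can be summarized by one concentration number: the share of interactions captured by the most popular items. The helper below is a minimal sketch; `top_share` and the toy counts are hypothetical and not part of the original notebook.

```python
import numpy as np

def top_share(counts, top_frac=0.1):
    """Fraction of all interactions that go to the most popular `top_frac` of items."""
    counts = np.sort(np.asarray(counts, dtype=float))[::-1]
    n_top = max(1, int(np.ceil(top_frac * len(counts))))
    return counts[:n_top].sum() / counts.sum()

# Toy view counts for 10 items (hypothetical numbers)
toy_counts = [500, 200, 100, 50, 50, 40, 30, 20, 5, 5]
print(f'Top 10% of items receive {top_share(toy_counts):.0%} of interactions')
```

In the notebook this could be called as `top_share(movie_counts.values)`. Because the synthetic generator picks movies almost uniformly, the resulting tail is likely to be much flatter than in real viewing data.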

## 2.1 Feature Engineering

In [None]:
# User features
user_stats = interactions_df.groupby('user_id').agg({
    'movie_id': 'count',
    'rating': 'mean',
    'watch_pct': 'mean',
    'completed': 'mean'
}).reset_index()
user_stats.columns = ['user_id', 'total_watches', 'avg_rating', 'avg_watch_pct', 'completion_rate']
user_features = users_df.merge(user_stats, on='user_id', how='left').fillna(0)

# Movie features
movie_stats = interactions_df.groupby('movie_id').agg({
    'user_id': 'count',
    'rating': 'mean',
    'watch_pct': 'mean',
    'completed': 'mean'
}).reset_index()
movie_stats.columns = ['movie_id', 'total_views', 'actual_rating', 'avg_watch_pct', 'completion_rate']
movie_stats['popularity'] = np.log1p(movie_stats['total_views'])
movie_features = movies_df.merge(movie_stats, on='movie_id', how='left').fillna(0)

# Genre matrix
all_genres = sorted(set(g for genres in movies_df['genres'] for g in genres))
genre_matrix = np.zeros((len(movies_df), len(all_genres)))
for i, genres in enumerate(movies_df['genres']):
    for g in genres:
        genre_matrix[i, all_genres.index(g)] = 1
genre_df = pd.DataFrame(genre_matrix, columns=[f'genre_{g}' for g in all_genres])

print(f'User features: {user_features.shape}')
print(f'Movie features: {movie_features.shape}')
print(f'Genre matrix: {genre_df.shape}')

## 2.2 Train/Test Split

In [None]:
# Time-based split
interactions_sorted = interactions_df.sort_values('timestamp')
split_idx = int(len(interactions_sorted) * 0.8)
train_df = interactions_sorted.iloc[:split_idx]
test_df = interactions_sorted.iloc[split_idx:]

print(f'Train: {len(train_df):,} ({train_df["timestamp"].min().date()} to {train_df["timestamp"].max().date()})')
print(f'Test: {len(test_df):,} ({test_df["timestamp"].min().date()} to {test_df["timestamp"].max().date()})')
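
Because the timestamps have day-level resolution, a positional 80/20 cut can place interactions from the same day on both sides of the boundary. One alternative, sketched below with a hypothetical helper `split_by_cutoff`, is to pick a cutoff timestamp and split strictly around it.

```python
import pandas as pd

def split_by_cutoff(df, train_frac=0.8, time_col='timestamp'):
    """Temporal split at a timestamp cutoff so equal timestamps never straddle the boundary."""
    ordered = df[time_col].sort_values()
    idx = max(int(len(ordered) * train_frac) - 1, 0)
    cutoff = ordered.iloc[idx]
    train = df[df[time_col] <= cutoff]
    test = df[df[time_col] > cutoff]
    return train, test, cutoff

# Toy log with day-level timestamps (hypothetical data)
toy = pd.DataFrame({
    'user_id': list('aabbccddee'),
    'timestamp': pd.to_datetime(['2023-01-01'] * 3 + ['2023-01-02'] * 5 + ['2023-01-03'] * 2),
})
train, test, cutoff = split_by_cutoff(toy)
print(cutoff.date(), len(train), len(test))
```

The trade-off is that the realized train fraction can drift away from the requested one when many rows share the cutoff timestamp.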

---

# 3. Model Development

## 3.1 Collaborative Filtering (Matrix Factorization)

In [None]:
class CollaborativeFiltering:
    def __init__(self, n_factors=50):
        self.n_factors = n_factors
        self.user_factors = None
        self.item_factors = None
        self.user_to_idx = {}
        self.movie_to_idx = {}
        self.idx_to_movie = {}
        
    def fit(self, interactions):
        print('Training Collaborative Filtering...')
        users = interactions['user_id'].unique()
        movies = interactions['movie_id'].unique()
        
        self.user_to_idx = {u: i for i, u in enumerate(users)}
        self.movie_to_idx = {m: i for i, m in enumerate(movies)}
        self.idx_to_movie = {i: m for m, i in self.movie_to_idx.items()}
        
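        # Collapse repeat views to one entry per (user, movie); csr_matrix would
        # otherwise sum duplicates and push values above 1.0
        pairs = interactions.groupby(['user_id', 'movie_id'], as_index=False)['watch_pct'].max()
        rows = pairs['user_id'].map(self.user_to_idx).to_numpy()
        cols = pairs['movie_id'].map(self.movie_to_idx).to_numpy()
        data = pairs['watch_pct'].to_numpy(dtype=float) / 100
        
        matrix = csr_matrix((data, (rows, cols)), shape=(len(users), len(movies)))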
        n_factors = min(self.n_factors, min(matrix.shape) - 1)
        U, sigma, Vt = svds(matrix, k=n_factors)
        
        self.user_factors = U * np.sqrt(sigma)
        self.item_factors = Vt.T * np.sqrt(sigma)
        print(f'  Trained: {self.user_factors.shape[0]} users, {self.item_factors.shape[0]} movies')
        
    def recommend(self, user_id, n=10, exclude=None):
        if user_id not in self.user_to_idx:
            return []
        user_idx = self.user_to_idx[user_id]
        scores = self.user_factors[user_idx] @ self.item_factors.T
        
        if exclude:
            for m in exclude:
                if m in self.movie_to_idx:
                    scores[self.movie_to_idx[m]] = -np.inf
        
        top_idx = np.argsort(scores)[::-1][:n]
        return [(self.idx_to_movie[i], scores[i]) for i in top_idx]
    
    def similar_items(self, movie_id, n=10):
        if movie_id not in self.movie_to_idx:
            return []
        idx = self.movie_to_idx[movie_id]
        sims = cosine_similarity([self.item_factors[idx]], self.item_factors)[0]
        top_idx = np.argsort(sims)[::-1][1:n+1]
        return [(self.idx_to_movie[i], sims[i]) for i in top_idx]

cf_model = CollaborativeFiltering(n_factors=50)
cf_model.fit(train_df)

In [None]:
# Test CF model
test_user = train_df['user_id'].iloc[0]
watched = set(train_df[train_df['user_id'] == test_user]['movie_id'])

print(f'Recommendations for {test_user}:')
for movie_id, score in cf_model.recommend(test_user, n=5, exclude=watched):
    title = movies_df[movies_df['movie_id'] == movie_id]['title'].values[0]
    print(f'  {title}: {score:.3f}')
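
The `fit` method splits the singular values evenly between the user and item factors. The small check below, on a hypothetical toy matrix, illustrates why the dot product of those factors reproduces the truncated SVD reconstruction.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

# Toy 4-user x 5-movie matrix of watch fractions (hypothetical values)
R = csr_matrix(np.array([
    [0.9, 0.8, 0.0, 0.0, 0.1],
    [0.8, 0.9, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.9, 0.8, 0.0],
    [0.1, 0.0, 0.8, 0.9, 0.0],
]))

U, sigma, Vt = svds(R, k=2)
user_factors = U * np.sqrt(sigma)      # scales column j of U by sqrt(sigma[j])
item_factors = Vt.T * np.sqrt(sigma)   # same scaling on the item side

# The dot product of the two factor sets equals the rank-2 SVD reconstruction
scores = user_factors @ item_factors.T
assert np.allclose(scores, U @ np.diag(sigma) @ Vt)
print(np.round(scores, 2))
```

Each entry of `scores` is the predicted affinity of one user for one movie, which is what `recommend` ranks.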

## 3.2 Content-Based Filtering

In [None]:
class ContentBasedFiltering:
    def __init__(self):
        self.movie_vectors = None
        self.movie_ids = None
        self.movie_to_idx = {}
        
    def fit(self, movies, genre_matrix):
        print('Building content-based model...')
        self.movie_ids = movies['movie_id'].values
        self.movie_to_idx = {m: i for i, m in enumerate(self.movie_ids)}
        
        year_norm = (movies['year'] - movies['year'].min()) / (movies['year'].max() - movies['year'].min())
        dur_norm = (movies['duration'] - movies['duration'].min()) / (movies['duration'].max() - movies['duration'].min())
        
        self.movie_vectors = np.hstack([genre_matrix.values, year_norm.values.reshape(-1, 1), dur_norm.values.reshape(-1, 1)])
        print(f'  Content vectors: {self.movie_vectors.shape}')
    
    def similar_items(self, movie_id, n=10):
        if movie_id not in self.movie_to_idx:
            return []
        idx = self.movie_to_idx[movie_id]
        sims = cosine_similarity([self.movie_vectors[idx]], self.movie_vectors)[0]
        top_idx = np.argsort(sims)[::-1][1:n+1]
        return [(self.movie_ids[i], sims[i]) for i in top_idx]
    
    def recommend_for_user(self, history, n=10):
        # Keep only movies the model has vectors for; an empty list would make np.mean return NaN
        known = [self.movie_to_idx[m] for m in history if m in self.movie_to_idx]
        if not known:
            return []
        user_vec = self.movie_vectors[known].mean(axis=0)
        sims = cosine_similarity([user_vec], self.movie_vectors)[0]
        sims[known] = -1  # never recommend something already watched
        top_idx = np.argsort(sims)[::-1][:n]
        return [(self.movie_ids[i], sims[i]) for i in top_idx]

cb_model = ContentBasedFiltering()
cb_model.fit(movies_df, genre_df)

In [None]:
# Test CB model
print('Similar movies to movie_0:')
for movie_id, sim in cb_model.similar_items('movie_0', n=5):
    info = movies_df[movies_df['movie_id'] == movie_id].iloc[0]
    print(f'  {info["title"]} ({info["year"]}) - {info["genres"]}: {sim:.3f}')
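
Content vectors also give one way to handle the new-item cold start: a movie with no views can be placed next to existing titles using metadata alone. The sketch below assumes a three-genre toy vocabulary and a hypothetical helper `nearest_catalog_items`; it is an illustration, not the notebook's own method.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

GENRES = ['Action', 'Comedy', 'Drama']  # toy genre vocabulary (assumption)

def genre_vector(genres):
    """One-hot encode a list of genres over GENRES."""
    return np.array([1.0 if g in genres else 0.0 for g in GENRES])

# Catalog movies that already have interaction data (hypothetical)
catalog = {
    'movie_a': genre_vector(['Action']),
    'movie_b': genre_vector(['Comedy', 'Drama']),
    'movie_c': genre_vector(['Action', 'Drama']),
}

def nearest_catalog_items(new_vec, catalog, n=2):
    """Rank existing movies by content similarity to a brand-new movie."""
    ids = list(catalog)
    sims = cosine_similarity([new_vec], np.vstack([catalog[i] for i in ids]))[0]
    order = np.argsort(sims)[::-1][:n]
    return [(ids[i], float(sims[i])) for i in order]

new_movie = genre_vector(['Action'])  # no views yet, only metadata
neighbors = nearest_catalog_items(new_movie, catalog)
print(neighbors)
```

A new title could then be shown to users who watched its nearest neighbors until it has enough interactions of its own for collaborative filtering.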

## 3.3 Hybrid Recommender

In [None]:
class HybridRecommender:
    def __init__(self, cf, cb, cf_weight=0.7):
        self.cf = cf
        self.cb = cb
        self.cf_weight = cf_weight
        
    def recommend(self, user_id, history, n=10):
        cf_recs = dict(self.cf.recommend(user_id, n=n*2, exclude=history))
        cb_recs = dict(self.cb.recommend_for_user(history, n=n*2))
        
        # Normalize
        if cf_recs:
            cf_min, cf_max = min(cf_recs.values()), max(cf_recs.values())
            cf_range = cf_max - cf_min if cf_max != cf_min else 1
            cf_recs = {k: (v - cf_min) / cf_range for k, v in cf_recs.items()}
        if cb_recs:
            cb_min, cb_max = min(cb_recs.values()), max(cb_recs.values())
            cb_range = cb_max - cb_min if cb_max != cb_min else 1
            cb_recs = {k: (v - cb_min) / cb_range for k, v in cb_recs.items()}
        
        # Combine
        combined = {}
        for m in set(cf_recs) | set(cb_recs):
            combined[m] = self.cf_weight * cf_recs.get(m, 0) + (1 - self.cf_weight) * cb_recs.get(m, 0)
        
        return sorted(combined.items(), key=lambda x: x[1], reverse=True)[:n]

hybrid = HybridRecommender(cf_model, cb_model)

history = list(train_df[train_df['user_id'] == test_user]['movie_id'].unique()[:10])
print(f'Hybrid recommendations for {test_user}:')
for movie_id, score in hybrid.recommend(test_user, history, n=5):
    title = movies_df[movies_df['movie_id'] == movie_id]['title'].values[0]
    print(f'  {title}: {score:.3f}')

## 3.4 Two-Tower Model (Neural Retrieval)

In [None]:
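class TwoTowerModel:
    """Simplified stand-in for a two-tower retriever: each tower is a plain ID-embedding
    table trained with SGD, with no feature inputs or hidden layers."""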
    def __init__(self, embedding_dim=32):
        self.dim = embedding_dim
        self.user_emb = None
        self.item_emb = None
        self.user_to_idx = {}
        self.item_to_idx = {}
        self.idx_to_item = {}
        
    def fit(self, interactions, epochs=10, lr=0.01):
        print('Training Two-Tower Model...')
        users = interactions['user_id'].unique()
        items = interactions['movie_id'].unique()
        
        self.user_to_idx = {u: i for i, u in enumerate(users)}
        self.item_to_idx = {m: i for i, m in enumerate(items)}
        self.idx_to_item = {i: m for m, i in self.item_to_idx.items()}
        
        np.random.seed(42)
        self.user_emb = np.random.randn(len(users), self.dim) * 0.1
        self.item_emb = np.random.randn(len(items), self.dim) * 0.1
        
        for epoch in range(epochs):
            loss = 0
            for _, row in interactions.sample(min(10000, len(interactions))).iterrows():
                u_idx = self.user_to_idx[row['user_id']]
                i_idx = self.item_to_idx[row['movie_id']]
                label = row['watch_pct'] / 100
                
                u_vec = self.user_emb[u_idx].copy()  # snapshot so both updates use pre-step values
                i_vec = self.item_emb[i_idx].copy()
                pred = 1 / (1 + np.exp(-np.dot(u_vec, i_vec)))
                error = pred - label  # gradient of the cross-entropy loss w.r.t. the logit
                loss += error ** 2    # squared error is reported for monitoring only
                
                self.user_emb[u_idx] -= lr * error * i_vec
                self.item_emb[i_idx] -= lr * error * u_vec
            
            if (epoch + 1) % 5 == 0:
                print(f'  Epoch {epoch + 1}, Loss: {loss:.2f}')
        
    def get_candidates(self, user_id, n=100):
        if user_id not in self.user_to_idx:
            return []
        scores = np.dot(self.item_emb, self.user_emb[self.user_to_idx[user_id]])
        top_idx = np.argsort(scores)[::-1][:n]
        return [(self.idx_to_item[i], scores[i]) for i in top_idx]

two_tower = TwoTowerModel(embedding_dim=32)
two_tower.fit(train_df, epochs=10)
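
`get_candidates` sorts the full score vector to find the top items. At the 50K-item catalog size from the requirements, a partial selection avoids the full sort. The sketch below uses random embeddings and a hypothetical helper `top_k_items`; for much larger catalogs an approximate nearest-neighbor index is the usual choice, which is not shown here.

```python
import numpy as np

def top_k_items(user_vec, item_emb, k):
    """Return indices of the k highest dot-product scores, best first."""
    scores = item_emb @ user_vec
    k = min(k, len(scores))
    part = np.argpartition(scores, -k)[-k:]        # unordered top-k in O(n)
    return part[np.argsort(scores[part])[::-1]]    # sort only those k

rng = np.random.default_rng(0)
item_emb = rng.normal(size=(50_000, 32))   # catalog size from the requirements
user_vec = rng.normal(size=32)

top = top_k_items(user_vec, item_emb, k=500)
print(top[:5])
```

The result matches a full `argsort` followed by slicing, but only the selected 500 scores are sorted.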

---

# 4. Evaluation Strategy

## 4.1 Offline Metrics

In [None]:
class RecommendationEvaluator:
    @staticmethod
    def precision_at_k(recommended, relevant, k):
        return len(set(recommended[:k]) & set(relevant)) / k if k > 0 else 0
    
    @staticmethod
    def recall_at_k(recommended, relevant, k):
        rel_set = set(relevant)  # de-duplicate so repeat watches do not inflate the denominator
        return len(set(recommended[:k]) & rel_set) / len(rel_set) if rel_set else 0
    
    @staticmethod
    def ndcg_at_k(recommended, relevant, k):
        rel_set = set(relevant)
        dcg = sum(1 / np.log2(i + 2) if recommended[i] in rel_set else 0 for i in range(min(k, len(recommended))))
        idcg = sum(1 / np.log2(i + 2) for i in range(min(k, len(rel_set))))
        return dcg / idcg if idcg > 0 else 0
    
    @staticmethod
    def mrr(recommended, relevant):
        rel_set = set(relevant)
        for i, item in enumerate(recommended):
            if item in rel_set:
                return 1 / (i + 1)
        return 0

evaluator = RecommendationEvaluator()
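
A small worked example makes the four metrics concrete. The ranked list and relevant set below are hypothetical; the formulas mirror the ones in `RecommendationEvaluator`, with binary relevance.

```python
import numpy as np

recommended = ['m1', 'm2', 'm3', 'm4', 'm5']   # ranked list (hypothetical)
relevant = {'m2', 'm5', 'm9'}                  # held-out items the user liked
k = 5

hits = [1 if m in relevant else 0 for m in recommended[:k]]   # [0, 1, 0, 0, 1]
precision = sum(hits) / k                                     # 2 / 5
recall = sum(hits) / len(relevant)                            # 2 / 3

# DCG discounts each hit by log2(rank + 1), with ranks starting at 1
dcg = sum(h / np.log2(rank + 1) for rank, h in enumerate(hits, start=1))
# Ideal DCG places min(k, |relevant|) hits at the top of the list
idcg = sum(1 / np.log2(rank + 1) for rank in range(1, min(k, len(relevant)) + 1))
ndcg = dcg / idcg

mrr = next((1 / rank for rank, h in enumerate(hits, start=1) if h), 0)   # first hit at rank 2

print(f'P@5={precision:.3f} R@5={recall:.3f} NDCG@5={ndcg:.3f} MRR={mrr:.3f}')
```

Precision and recall ignore where the hits fall in the list, while NDCG and MRR reward placing them near the top.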

In [None]:
def evaluate_model(model, test_df, train_df, k_values=(5, 10, 20)):
    results = {k: {'prec': [], 'recall': [], 'ndcg': []} for k in k_values}
    test_users = test_df.groupby('user_id').filter(lambda x: len(x) >= 3)['user_id'].unique()
    sample_users = np.random.choice(test_users, min(300, len(test_users)), replace=False)
    
    for user_id in sample_users:
        train_history = set(train_df[train_df['user_id'] == user_id]['movie_id'])
        relevant = test_df[(test_df['user_id'] == user_id) & (test_df['watch_pct'] >= 70)]['movie_id'].tolist()
        
        if not relevant:
            continue
            
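        max_k = max(k_values)
        if hasattr(model, 'recommend'):
            recs = model.recommend(user_id, n=max_k, exclude=train_history)
        else:
            # get_candidates cannot exclude items, so over-fetch and drop training history here
            recs = model.get_candidates(user_id, max_k + len(train_history))
            recs = [r for r in recs if r[0] not in train_history][:max_k]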
        recommended = [r[0] for r in recs]
        
        for k in k_values:
            results[k]['prec'].append(evaluator.precision_at_k(recommended, relevant, k))
            results[k]['recall'].append(evaluator.recall_at_k(recommended, relevant, k))
            results[k]['ndcg'].append(evaluator.ndcg_at_k(recommended, relevant, k))
    
    return {k: {m: np.mean(v) for m, v in metrics.items()} for k, metrics in results.items()}

print('Evaluating CF Model...')
cf_results = evaluate_model(cf_model, test_df, train_df)
print('\nResults:')
for k, metrics in cf_results.items():
    print(f'  @{k}: Prec={metrics["prec"]:.4f}, Recall={metrics["recall"]:.4f}, NDCG={metrics["ndcg"]:.4f}')

In [None]:
# Compare models
models = {
    'Collaborative Filtering': cf_model,
    'Two-Tower': two_tower
}

all_results = {}
for name, model in models.items():
    print(f'Evaluating {name}...')
    all_results[name] = evaluate_model(model, test_df, train_df)

# Visualization
fig, axes = plt.subplots(1, 3, figsize=(14, 4))
k_values = [5, 10, 20]
metrics = ['prec', 'recall', 'ndcg']
titles = ['Precision@K', 'Recall@K', 'NDCG@K']

for ax, metric, title in zip(axes, metrics, titles):
    for name, results in all_results.items():
        values = [results[k][metric] for k in k_values]
        ax.plot(k_values, values, marker='o', label=name)
    ax.set_xlabel('K')
    ax.set_ylabel(title)
    ax.set_title(title)
    ax.legend()
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 4.2 Online Evaluation (A/B Testing)

In [None]:
class ABTestSimulator:
    def __init__(self, control_ctr=0.05, treatment_effect=0.1):
        self.control_ctr = control_ctr
        self.treatment_ctr = control_ctr * (1 + treatment_effect)
        
    def simulate(self, n_users=10000, n_days=14):
        np.random.seed(42)
        results = []
        
        for day in range(n_days):
            for _ in range(n_users // n_days):
                group = np.random.choice(['control', 'treatment'])
                ctr = self.control_ctr if group == 'control' else self.treatment_ctr
                clicked = np.random.random() < ctr
                results.append({'day': day, 'group': group, 'clicked': clicked})
        
        return pd.DataFrame(results)
    
    def analyze(self, results):
        control = results[results['group'] == 'control']['clicked']
        treatment = results[results['group'] == 'treatment']['clicked']
        
        control_ctr = control.mean()
        treatment_ctr = treatment.mean()
        lift = (treatment_ctr - control_ctr) / control_ctr * 100
        
        # Z-test
        n_c, n_t = len(control), len(treatment)
        p_pooled = (control.sum() + treatment.sum()) / (n_c + n_t)
        se = np.sqrt(p_pooled * (1 - p_pooled) * (1/n_c + 1/n_t))
        z = (treatment_ctr - control_ctr) / se if se > 0 else 0
        
        return {
            'control_ctr': control_ctr,
            'treatment_ctr': treatment_ctr,
            'lift': lift,
            'z_score': z,
            'significant': abs(z) > 1.96
        }

ab_sim = ABTestSimulator(control_ctr=0.05, treatment_effect=0.10)
ab_results = ab_sim.simulate()
analysis = ab_sim.analyze(ab_results)

print('A/B Test Results:')
print(f'  Control CTR: {analysis["control_ctr"]:.4f}')
print(f'  Treatment CTR: {analysis["treatment_ctr"]:.4f}')
print(f'  Lift: {analysis["lift"]:.2f}%')
print(f'  Z-score: {analysis["z_score"]:.2f}')
print(f'  Significant (p<0.05): {analysis["significant"]}')
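
Before running an A/B test it helps to estimate how many users are needed to detect the expected lift. The sketch below uses the standard normal-approximation formula for a two-sided two-proportion z-test; the 80% power target and the helper name `users_per_group` are assumptions, not values from the notebook.

```python
from scipy.stats import norm

def users_per_group(p_control, relative_lift, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided two-proportion z-test."""
    p_treat = p_control * (1 + relative_lift)
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return (z_alpha + z_beta) ** 2 * variance / (p_treat - p_control) ** 2

n = users_per_group(p_control=0.05, relative_lift=0.10)
print(f'Users needed per group: {n:,.0f}')
```

For a 5% baseline CTR and a 10% relative lift this comes to roughly 31,000 users per group. The simulation above assigns about 5,000 users to each group, so a non-significant result there would not be surprising.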

---

# 5. Deployment Architecture

In [None]:
class RecommendationService:
    """Production recommendation service."""
    
    def __init__(self, cf_model, cb_model, two_tower, popular_items):
        self.cf = cf_model
        self.cb = cb_model
        self.retrieval = two_tower
        self.popular = popular_items
        
    def get_recommendations(self, user_id, history, context=None, n=20):
        import time
        start = time.perf_counter()
        seen = set(history)
        
        # Cold start: users unknown to either model fall back to popular items
        if user_id not in self.cf.user_to_idx or user_id not in self.retrieval.user_to_idx:
            recs = [(m, 0.0) for m in self.popular if m not in seen][:n]
            source = 'popularity'
        else:
            # Stage 1: Candidate generation
            candidates = self.retrieval.get_candidates(user_id, n=100)
            candidate_ids = [c[0] for c in candidates if c[0] not in seen]
            
            # Stage 2: Rerank the candidates by their CF score (the content model is not used here)
            user_vec = self.cf.user_factors[self.cf.user_to_idx[user_id]]
            final_scores = []
            for cid in candidate_ids[:50]:
                idx = self.cf.movie_to_idx.get(cid)
                score = float(user_vec @ self.cf.item_factors[idx]) if idx is not None else 0.0
                final_scores.append((cid, score))
            
            recs = sorted(final_scores, key=lambda x: x[1], reverse=True)[:n]
            source = 'two_stage'
        
        latency = (time.perf_counter() - start) * 1000
        
        return {
            'user_id': user_id,
            'recommendations': [{'item_id': item_id, 'score': float(score)} for item_id, score in recs],
            'source': source,
            'latency_ms': round(latency, 2)
        }

# Create service
popular_items = movie_features.nlargest(100, 'popularity')['movie_id'].tolist()
service = RecommendationService(cf_model, cb_model, two_tower, popular_items)

# Test service
response = service.get_recommendations(test_user, history)
print(f'Response for {response["user_id"]}:')
print(f'  Source: {response["source"]}')
print(f'  Latency: {response["latency_ms"]}ms')
print(f'  Top 5 recommendations:')
for rec in response['recommendations'][:5]:
    title = movies_df[movies_df['movie_id'] == rec['item_id']]['title'].values[0]
    print(f'    {title}: {rec["score"]:.3f}')

In [None]:
# System architecture diagram
print("""
Production Recommendation System Architecture
============================================

                    ┌─────────────────┐
                    │   API Gateway   │
                    │ (Load Balancer) │
                    └────────┬────────┘
                             │
                    ┌────────▼────────┐
                    │  Recommendation │
                    │     Service     │
                    └────────┬────────┘
                             │
        ┌────────────────────┼────────────────────┐
        │                    │                    │
┌───────▼───────┐   ┌───────▼───────┐   ┌───────▼───────┐
│   Candidate   │   │    Ranking    │   │   Feature     │
│   Generation  │   │    Service    │   │    Store      │
│  (Two-Tower)  │   │   (XGBoost)   │   │   (Redis)     │
└───────────────┘   └───────────────┘   └───────────────┘
        │                    │                    │
        └────────────────────┼────────────────────┘
                             │
                    ┌────────▼────────┐
                    │   Model Store   │
                    │   (S3/GCS)      │
                    └─────────────────┘

Data Pipeline:
User Events → Kafka → Spark → Feature Store → Model Training → Model Store
""")

---

# 6. Monitoring and Maintenance

In [None]:
class RecommendationMonitor:
    def __init__(self):
        self.metrics = []
        
    def log_request(self, user_id, latency_ms, source, n_recs):
        self.metrics.append({
            'timestamp': datetime.now(),
            'user_id': user_id,
            'latency_ms': latency_ms,
            'source': source,
            'n_recs': n_recs
        })
    
    def get_summary(self):
        if not self.metrics:
            return {}
        df = pd.DataFrame(self.metrics)
        return {
            'total_requests': len(df),
            'latency_p50': df['latency_ms'].quantile(0.5),
            'latency_p99': df['latency_ms'].quantile(0.99),
            'source_distribution': df['source'].value_counts().to_dict()
        }

# Simulate monitoring
monitor = RecommendationMonitor()

# Simulate requests
sample_users = np.random.choice(users_df['user_id'], 100)
for user_id in sample_users:
    hist = list(train_df[train_df['user_id'] == user_id]['movie_id'].unique()[:5])
    response = service.get_recommendations(user_id, hist)
    monitor.log_request(user_id, response['latency_ms'], response['source'], len(response['recommendations']))

summary = monitor.get_summary()
print('Monitoring Summary:')
print(f'  Total Requests: {summary["total_requests"]}')
print(f'  Latency P50: {summary["latency_p50"]:.2f}ms')
print(f'  Latency P99: {summary["latency_p99"]:.2f}ms')
print(f'  Source Distribution: {summary["source_distribution"]}')
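
The monitoring summary becomes actionable once it is compared against the latency targets from the requirements (p50 of 50ms, p99 of 200ms). The check below is a minimal sketch; `check_latency_slo` and the sample latencies are hypothetical.

```python
import numpy as np

SLO_MS = {'p50': 50, 'p99': 200}   # targets from the requirements section

def check_latency_slo(latencies_ms, slo=SLO_MS):
    """Compare observed latency percentiles with their targets."""
    report = {}
    for name, target in slo.items():
        q = float(name[1:])                       # 'p99' -> 99.0
        observed = float(np.percentile(latencies_ms, q))
        report[name] = {'observed_ms': observed, 'target_ms': target, 'ok': observed <= target}
    return report

# Hypothetical latency samples: mostly fast with a slow tail
samples = [20] * 95 + [150] * 3 + [400] * 2
report = check_latency_slo(samples)
for name, row in report.items():
    print(name, row)
```

In this toy sample the median is well within target while the slow tail breaks the p99 target, which is the kind of regression an average latency would hide.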

In [None]:
# Coverage and diversity metrics
def calculate_coverage(recommendations_list, total_items):
    unique_items = set()
    for recs in recommendations_list:
        unique_items.update([r['item_id'] for r in recs])
    return len(unique_items) / total_items

# Collect recommendations for analysis
all_recs = []
for user_id in np.random.choice(users_df['user_id'], 500):
    hist = list(train_df[train_df['user_id'] == user_id]['movie_id'].unique()[:5])
    response = service.get_recommendations(user_id, hist)
    all_recs.append(response['recommendations'])

coverage = calculate_coverage(all_recs, len(movies_df))
print(f'Catalog Coverage: {coverage:.2%}')
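
Coverage measures breadth across users; diversity measures variety within one user's list. One common definition, sketched below, is the mean pairwise cosine distance between the items in a list. The helper `intra_list_diversity` and the toy genre vectors are hypothetical.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def intra_list_diversity(item_vectors):
    """Mean pairwise cosine distance between the items in one recommendation list."""
    X = np.asarray(item_vectors, dtype=float)
    if len(X) < 2:
        return 0.0
    sims = cosine_similarity(X)
    upper = np.triu_indices(len(X), k=1)   # each unordered pair once
    return float(1 - sims[upper].mean())

# Toy genre vectors over [Action, Comedy, Drama] (hypothetical lists)
same_genre = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
mixed = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(intra_list_diversity(same_genre), intra_list_diversity(mixed))
```

In the notebook the item vectors could come from the rows of `genre_df` that correspond to each recommended movie.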

---

# 7. Summary

## Key Takeaways

1. **Requirements**: Clearly define business objectives, scale, and latency requirements
2. **Two-Stage Architecture**: Candidate generation (fast, broad) + Ranking (precise, personalized)
3. **Multiple Approaches**: Collaborative filtering, content-based, and hybrid methods
4. **Cold Start**: Handle new users with popularity-based recommendations; new items can be surfaced through content-based similarity, which needs no interaction history
5. **Evaluation**: Use ranking metrics (Precision@K, NDCG) offline, CTR/engagement online
6. **Monitoring**: Track latency, coverage, diversity, and model freshness

In [None]:
print("""
7-Step Framework Applied to Movie Recommendations
=================================================

Step 1: Requirements
  - Primary: Maximize watch time
  - Scale: 100M users, 50K items, 100K QPS
  - Latency: p50 < 50ms, p99 < 200ms

Step 2: Problem Framing
  - Stage 1: Retrieval (representation learning)
  - Stage 2: Ranking (pointwise classification)
  - Labels: Implicit (watch %) + Explicit (ratings)

Step 3: Data Preparation
  - User features: demographics, behavior stats
  - Item features: metadata, popularity, genres
  - Interaction features: timestamp, watch %, rating, completion flag

Step 4: Model Development
  - Collaborative Filtering (SVD)
  - Content-Based (cosine similarity over genre, year, and duration vectors)
  - Two-Tower Neural Network
  - Hybrid ensemble

Step 5: Evaluation
  - Offline: Precision@K, Recall@K, NDCG
  - Online: A/B testing, CTR, watch time

Step 6: Deployment
  - Microservices architecture
  - Feature store for real-time features
  - Model versioning and rollback

Step 7: Monitoring
  - Latency tracking
  - Coverage and diversity
  - Model freshness alerts
""")

print('Tutorial 19 Complete!')