# NCF Movie Recommender - Model Inference on Colab

This notebook demonstrates how to use trained NCF/NeuMF+ models for movie recommendations.

**Features:**
- Load trained models from Google Drive
- Generate top-K movie recommendations for users
- Predict scores for user-item pairs
- Handle cold-start scenarios (new movies)
- Display movie titles and genres

## 1. Setup - Mount Google Drive

This notebook expects your trained models and data to be in Google Drive.

**Required structure in Google Drive:**
```
MyDrive/
└── NCF-Movie-Recommender/
    ├── data/                    # Processed data files
    │   ├── mappings.pkl         # User/item mappings
    │   ├── item_synopsis_embeddings.npy
    │   └── ...
    ├── datasets/                # Raw datasets
    │   └── movies_metadata.csv  # Movie titles and info
    └── experiments/
        └── trained_models/      # Trained model checkpoints
            ├── NeuMFPlus_genre_synopsis_best.pt
            └── ...
```

In [None]:
# @title Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

print("✅ Google Drive mounted successfully!")

## 2. Configure Paths

Update these paths to match your Google Drive structure.
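
A quick check of the expected layout before running the rest of the notebook can be sketched as follows. The helper name `missing_paths` and the `REQUIRED` list are illustrative assumptions, not part of the project; the relative paths are taken from the structure shown in section 1.

```python
from pathlib import Path
from typing import Iterable, List

# Relative paths taken from the Drive layout described in section 1.
REQUIRED = [
    "data/mappings.pkl",
    "datasets/movies_metadata.csv",
    "experiments/trained_models",
]

def missing_paths(base: str, relative_paths: Iterable[str]) -> List[str]:
    """Return the entries of `relative_paths` that do not exist under `base`."""
    root = Path(base)
    return [rel for rel in relative_paths if not (root / rel).exists()]

# Hypothetical usage: report anything missing under the configured base path.
for rel in missing_paths("/content/drive/MyDrive/NCF-Movie-Recommender", REQUIRED):
    print(f"Missing: {rel}")
```

An empty result means every listed path was found; anything printed points to a file or folder to fix before loading data.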

In [None]:
# @title Configure paths
import os

# @markdown **Base path for NCF-Movie-Recommender project:**
GDRIVE_BASE = "/content/drive/MyDrive/NCF-Movie-Recommender"  # @param {type:"string"}

# Paths relative to your Google Drive base
DATA_DIR = os.path.join(GDRIVE_BASE, "data")
DATASETS_DIR = os.path.join(GDRIVE_BASE, "datasets")
MODELS_DIR = os.path.join(GDRIVE_BASE, "experiments", "trained_models")

print(f"📁 Base directory: {GDRIVE_BASE}")
print(f"📁 Data directory: {DATA_DIR}")
print(f"📁 Datasets directory: {DATASETS_DIR}")
print(f"📁 Models directory: {MODELS_DIR}")

# Verify directories exist
if os.path.exists(DATA_DIR):
    data_files = os.listdir(DATA_DIR)
    print(f"\n✅ Data directory found! Files: {len(data_files)}")
else:
    print(f"\n❌ Data directory not found: {DATA_DIR}")

if os.path.exists(DATASETS_DIR):
    datasets_files = os.listdir(DATASETS_DIR)
    print(f"✅ Datasets directory found! Files: {len(datasets_files)}")
else:
    print(f"❌ Datasets directory not found: {DATASETS_DIR}")

if os.path.exists(MODELS_DIR):
    model_files = [f for f in os.listdir(MODELS_DIR) if f.endswith('.pt')]
    print(f"✅ Models directory found! Checkpoints: {len(model_files)}")
    if model_files:
        print("\nAvailable models:")
        for f in sorted(model_files):
            print(f"  • {f}")
else:
    print(f"\n❌ Models directory not found: {MODELS_DIR}")

## 3. Install Dependencies

Install required Python packages.

In [None]:
# @title Install dependencies
!pip install -q torch numpy pandas sentence-transformers tqdm

import torch
import numpy as np
import pandas as pd
import pickle
from typing import Dict, List, Optional

print("✅ Dependencies installed!")
print(f"   PyTorch: {torch.__version__}")
print(f"   CUDA available: {torch.cuda.is_available()}")

# Set device
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f"   Using device: {DEVICE}")

## 4. Define Model Architecture

This section defines the NeuMF+ model architecture to match your trained checkpoints.
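
The `GatedFusion` module in this section blends the collaborative-filtering vector and the content vector with one gate value per example: `fused = gate * cf + (1 - gate) * content`. The sketch below reproduces only that arithmetic in plain Python with a fixed gate. The function name `gated_mix` is hypothetical; in the model the gate is produced by a small network ending in a sigmoid.

```python
from typing import List

def gated_mix(cf: List[float], content: List[float], gate: float) -> List[float]:
    """Blend two equal-length vectors: gate * cf + (1 - gate) * content."""
    if len(cf) != len(content):
        raise ValueError("cf and content must have the same dimension")
    if not 0.0 <= gate <= 1.0:
        raise ValueError("gate must lie in [0, 1]")
    return [gate * a + (1.0 - gate) * b for a, b in zip(cf, content)]

# gate = 1.0 keeps only the CF vector, gate = 0.0 keeps only the content vector.
print(gated_mix([1.0, 2.0], [3.0, 4.0], gate=0.25))
```

The dimension check matters because the blend is element-wise, which is why the module projects one input whenever the two embedding sizes differ.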

In [None]:
# @title Define NeuMF+ Model
import torch.nn as nn

class ContentEncoder(nn.Module):
    """Encode content features (genre + synopsis) into embeddings."""
    
    def __init__(self, num_genres: int, genre_embed_dim: int = 64,
                 synopsis_embed_dim: int = 384, content_embed_dim: int = 256,
                 dropout: float = 0.1):
        super().__init__()
        
        self.genre_encoder = nn.Sequential(
            nn.Linear(num_genres, genre_embed_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
        )
        
        self.synopsis_projection = nn.Sequential(
            nn.Linear(synopsis_embed_dim, synopsis_embed_dim // 2),
            nn.ReLU(),
            nn.Dropout(dropout),
        )
        
        combined_dim = genre_embed_dim + synopsis_embed_dim // 2
        self.content_encoder = nn.Sequential(
            nn.Linear(combined_dim, content_embed_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
        )
    
    def forward(self, genre_features, synopsis_embeddings):
        genre_embed = self.genre_encoder(genre_features)
        synopsis_embed = self.synopsis_projection(synopsis_embeddings)
        combined = torch.cat([genre_embed, synopsis_embed], dim=-1)
        return self.content_encoder(combined)


class GatedFusion(nn.Module):
    """Gated fusion for CF and content embeddings."""
    
    def __init__(self, cf_dim: int, content_dim: int, hidden_dim: int = 64, dropout: float = 0.1):
        super().__init__()
        
        self.gate_network = nn.Sequential(
            nn.Linear(cf_dim + content_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 1),
            nn.Sigmoid(),
        )
    
    def forward(self, cf_embed, content_embed):
        combined = torch.cat([cf_embed, content_embed], dim=-1)
        gate = self.gate_network(combined)
        
        if cf_embed.shape[-1] != content_embed.shape[-1]:
            if cf_embed.shape[-1] > content_embed.shape[-1]:
                target_dim = cf_embed.shape[-1]
                if not hasattr(self, '_content_proj'):
                    self._content_proj = nn.Linear(content_embed.shape[-1], target_dim).to(cf_embed.device)
                content_embed = self._content_proj(content_embed)
            else:
                target_dim = content_embed.shape[-1]
                if not hasattr(self, '_cf_proj'):
                    self._cf_proj = nn.Linear(cf_embed.shape[-1], target_dim).to(cf_embed.device)
                cf_embed = self._cf_proj(cf_embed)
        
        fused = gate * cf_embed + (1 - gate) * content_embed
        return fused, gate


class NeuMFPlus(nn.Module):
    """Neural Matrix Factorization with content features."""
    
    def __init__(self, num_users: int, num_items: int,
                 embed_dim: int = 32, hidden_dims: List[int] = [128, 64],
                 num_genres: int = 19, use_genre: bool = True,
                 use_synopsis: bool = True, use_gated_fusion: bool = True,
                 synopsis_embed_dim: int = 384, content_embed_dim: int = 256,
                 dropout: float = 0.1):
        super().__init__()
        
        self.num_users = num_users
        self.num_items = num_items
        self.embed_dim = embed_dim
        self.num_genres = num_genres
        self.use_genre = use_genre
        self.use_synopsis = use_synopsis
        self.use_gated_fusion = use_gated_fusion
        self.synopsis_embed_dim = synopsis_embed_dim
        self.content_embed_dim = content_embed_dim
        
        # GMF part
        self.user_gmf_embed = nn.Embedding(num_users, embed_dim)
        self.item_gmf_embed = nn.Embedding(num_items, embed_dim)
        
        # MLP part
        self.user_mlp_embed = nn.Embedding(num_users, embed_dim)
        self.item_mlp_embed = nn.Embedding(num_items, embed_dim)
        
        mlp_input_dim = embed_dim * 2
        mlp_layers = []
        for dim in hidden_dims:
            mlp_layers.extend([
                nn.Linear(mlp_input_dim, dim),
                nn.ReLU(),
                nn.Dropout(dropout),
            ])
            mlp_input_dim = dim
        self.mlp = nn.Sequential(*mlp_layers)
        
        # Content encoder
        self.use_content = use_genre or use_synopsis
        if self.use_content:
            self.content_encoder = ContentEncoder(
                num_genres=num_genres,
                genre_embed_dim=64,
                synopsis_embed_dim=synopsis_embed_dim,
                content_embed_dim=content_embed_dim,
                dropout=dropout,
            )
        
        # Fusion layer
        fusion_input_dim = embed_dim + hidden_dims[-1]
        if self.use_content:
            if use_gated_fusion:
                self.gated_fusion = GatedFusion(
                    cf_dim=embed_dim + hidden_dims[-1],
                    content_dim=content_embed_dim,
                    dropout=dropout,
                )
                fusion_input_dim = embed_dim + hidden_dims[-1]
            else:
                fusion_input_dim += content_embed_dim
        
        self.output_layer = nn.Linear(fusion_input_dim, 1)
        
        self._init_weights()
    
    def _init_weights(self):
        for module in self.modules():
            if isinstance(module, (nn.Embedding, nn.Linear)):
                nn.init.xavier_uniform_(module.weight)
                if isinstance(module, nn.Linear) and module.bias is not None:
                    nn.init.zeros_(module.bias)
    
    def forward(self, user_ids, item_ids, genre_features=None, synopsis_embeddings=None):
        # GMF embeddings
        user_gmf = self.user_gmf_embed(user_ids)
        item_gmf = self.item_gmf_embed(item_ids)
        gmf_output = user_gmf * item_gmf
        
        # MLP embeddings
        user_mlp = self.user_mlp_embed(user_ids)
        item_mlp = self.item_mlp_embed(item_ids)
        mlp_input = torch.cat([user_mlp, item_mlp], dim=-1)
        mlp_output = self.mlp(mlp_input)
        
        # Combine GMF + MLP
        cf_output = torch.cat([gmf_output, mlp_output], dim=-1)
        
        # Add content features if available
        if self.use_content:
            content_features = []
            if self.use_genre and genre_features is not None:
                content_features.append(genre_features)
            if self.use_synopsis and synopsis_embeddings is not None:
                content_features.append(synopsis_embeddings)
            
            # Use zeros if features not provided
            batch_size = user_ids.shape[0]
            if not content_features:
                if self.use_genre:
                    content_features.append(torch.zeros(batch_size, self.num_genres).to(user_ids.device))
                if self.use_synopsis:
                    content_features.append(torch.zeros(batch_size, self.synopsis_embed_dim).to(user_ids.device))
            
            # For genre-only or synopsis-only, create dummy for the other
            if len(content_features) == 1:
                if self.use_genre:
                    content_features.insert(0, torch.zeros(batch_size, self.synopsis_embed_dim).to(user_ids.device))
                else:
                    content_features.append(torch.zeros(batch_size, self.num_genres).to(user_ids.device))
            
            content_input = torch.cat(content_features, dim=-1)
            content_embed = self.content_encoder(
                content_input[:, :self.num_genres],
                content_input[:, self.num_genres:]
            )
            
            if self.use_gated_fusion:
                cf_output, gate = self.gated_fusion(cf_output, content_embed)
            else:
                cf_output = torch.cat([cf_output, content_embed], dim=-1)
        
        output = self.output_layer(cf_output)
        return output
    
    @classmethod
    def load(cls, checkpoint_path: str, device: str = 'cpu'):
        checkpoint = torch.load(checkpoint_path, map_location=device, weights_only=False)
        config = checkpoint['model_config']
        
        model = cls(
            num_users=config['num_users'],
            num_items=config['num_items'],
            embed_dim=config.get('embed_dim', 32),
            hidden_dims=config.get('hidden_dims', [128, 64]),
            num_genres=config.get('num_genres', 19),
            use_genre=config.get('use_genre', True),
            use_synopsis=config.get('use_synopsis', True),
            use_gated_fusion=config.get('use_gated_fusion', True),
            synopsis_embed_dim=config.get('synopsis_embed_dim', 384),
            content_embed_dim=config.get('content_embed_dim', 256),
            dropout=config.get('dropout', 0.1),
        )
        model.load_state_dict(checkpoint['model_state_dict'])
        model.to(device)
        return model, checkpoint

print("✅ Model architecture defined!")

## 5. Load Data and Mappings

Load the processed data files including mappings, genre features, and movie metadata.
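
The feature arrays are indexed by item ID later in the notebook, so each one needs at least one row per item. A minimal sketch of that sanity check, written against shape tuples so it does not depend on the actual files; `check_item_features` is an assumed helper name.

```python
from typing import Dict, List, Optional, Tuple

def check_item_features(num_items: int,
                        shapes: Dict[str, Optional[Tuple[int, ...]]]) -> List[str]:
    """Return a message for every loaded 2-D array with fewer rows than `num_items`."""
    problems = []
    for name, shape in shapes.items():
        if shape is None:  # array was not loaded, nothing to check
            continue
        if len(shape) != 2 or shape[0] < num_items:
            problems.append(f"{name}: shape {shape} does not cover {num_items} items")
    return problems

# Hypothetical shapes; in the notebook these would come from GENRE_FEATURES.shape etc.
print(check_item_features(3, {"genre": (3, 19), "synopsis": (2, 384), "extra": None}))
```

In the notebook, the same function could be called with `NUM_ITEMS` and the `.shape` of each loaded array, passing `None` for arrays that were not found.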

In [None]:
# @title Load mappings and features

# Load mappings
mappings_path = os.path.join(DATA_DIR, "mappings.pkl")
with open(mappings_path, 'rb') as f:
    mappings = pickle.load(f)

NUM_USERS = mappings['num_users']
NUM_ITEMS = mappings['num_items']
NUM_GENRES = mappings['num_genres']
GENRE_NAMES = mappings.get('genre_names', [])

print(f"✅ Mappings loaded!")
print(f"   Users: {NUM_USERS:,}")
print(f"   Items: {NUM_ITEMS:,}")
print(f"   Genres: {NUM_GENRES}")
if GENRE_NAMES:
    print(f"   Genre names: {GENRE_NAMES}")

# Load genre features (if available)
genre_path = os.path.join(DATA_DIR, "item_genre_features.npy")
if os.path.exists(genre_path):
    GENRE_FEATURES = np.load(genre_path)
    print(f"\n✅ Genre features loaded: {GENRE_FEATURES.shape}")
else:
    GENRE_FEATURES = None
    print(f"\n⚠️  Genre features not found: {genre_path}")

# Load synopsis embeddings
# Note: file uses plural "embeddings"
synopsis_path = os.path.join(DATA_DIR, "item_synopsis_embeddings.npy")
if os.path.exists(synopsis_path):
    SYNOPSIS_EMBEDDINGS = np.load(synopsis_path)
    print(f"✅ Synopsis embeddings loaded: {SYNOPSIS_EMBEDDINGS.shape}")
else:
    SYNOPSIS_EMBEDDINGS = None
    print(f"⚠️  Synopsis embeddings not found: {synopsis_path}")

# Load movie metadata for display (from datasets directory)
metadata_path = os.path.join(DATASETS_DIR, "movies_metadata.csv")
if os.path.exists(metadata_path):
    movies_df = pd.read_csv(metadata_path)
    # Filter for valid IDs
    movies_df['id'] = pd.to_numeric(movies_df['id'], errors='coerce')
    movies_df = movies_df[movies_df['id'].notna()]
    movies_df['id'] = movies_df['id'].astype(int)
    movies_df = movies_df.set_index('id')
    print(f"\n✅ Movie metadata loaded: {len(movies_df):,} movies")
else:
    movies_df = None
    print(f"\n⚠️  Movie metadata not found: {metadata_path}")

## 6. Load Trained Model

Select and load one of your trained models.

**Available Models:**
| Model | Description | Features |
|-------|-------------|----------|
| `NeuMF_best.pt` | Baseline | Collaborative Filtering only |
| `NeuMFPlus_genre_best.pt` | Genre-enhanced | CF + Genre features |
| `NeuMFPlus_genre_synopsis_best.pt` | Full model | CF + Genre + Synopsis |

In [None]:
# @title Load trained model
# @markdown Select the model checkpoint to load:

import ipywidgets as widgets
from IPython.display import display, HTML

# Get available models
available_models = [f for f in os.listdir(MODELS_DIR) if f.endswith('.pt')]

# Model descriptions
MODEL_INFO = {
    'NeuMF_best.pt': {
        'name': 'NeuMF (Baseline)',
        'description': 'Collaborative Filtering only - no content features',
        'features': 'User-Item interactions only'
    },
    'NeuMFPlus_genre_best.pt': {
        'name': 'NeuMF+ (Genre)',
        'description': 'CF + Genre features',
        'features': 'User-Item + Movie genres'
    },
    'NeuMFPlus_genre_synopsis_best.pt': {
        'name': 'NeuMF+ (Genre + Synopsis)',
        'description': 'CF + Genre + Synopsis features (Full Model)',
        'features': 'User-Item + Genres + Movie synopsis'
    }
}

if not available_models:
    print(f"❌ No models found in {MODELS_DIR}")
else:
    # Create dropdown with model descriptions
    model_options = [(f"{m}  ({MODEL_INFO.get(m, {}).get('name', m)})", m) for m in sorted(available_models)]
    
    model_dropdown = widgets.Dropdown(
        options=model_options,
        description='Select model:',
        style={'description_width': 'initial'},
    )
    display(model_dropdown)
    
    # Model info display
    info_out = widgets.Output()
    display(info_out)
    
    def show_model_info(change):
        with info_out:
            info_out.clear_output()
            model_name = change['new']
            info = MODEL_INFO.get(model_name, {})
            if info:
                print(f"📋 {info.get('name', model_name)}")
                print(f"   {info.get('description', '')}")
                print(f"   Features: {info.get('features', 'N/A')}")
    
    model_dropdown.observe(show_model_info, names='value')
    # Show initial info
    show_model_info({'new': model_dropdown.value})
    
    # Load button
    load_btn = widgets.Button(description='Load Model', button_style='primary')
    display(load_btn)
    
    # Output area
    out = widgets.Output()
    display(out)
    
    def load_model(b):
        with out:
            out.clear_output()
            model_name = model_dropdown.value
            checkpoint_path = os.path.join(MODELS_DIR, model_name)
            
            print(f"Loading model from: {model_name}")
            print(f"Path: {checkpoint_path}")
            
            global model, checkpoint, model_config
            model, checkpoint = NeuMFPlus.load(checkpoint_path, device=DEVICE)
            model.eval()
            model_config = checkpoint['model_config']
            
            print("\n" + "="*70)
            print("MODEL CONFIGURATION")
            print("="*70)
            print(f"use_genre: {model_config.get('use_genre')}")
            print(f"use_synopsis: {model_config.get('use_synopsis')}")
            print(f"use_gated_fusion: {model_config.get('use_gated_fusion')}")
            print(f"\nParameters: {sum(p.numel() for p in model.parameters()):,}")
            
            if 'metrics' in checkpoint:
                print("\nValidation Metrics:")
                for k, v in checkpoint['metrics'].items():
                    if isinstance(v, (int, float)):
                        print(f"  {k}: {v:.4f}")
            
            # Show what features are needed
            print("\n" + "-"*70)
            print("REQUIRED FEATURES FOR INFERENCE:")
            if model_config.get('use_genre'):
                print("  ✅ Genre features (item_genre_features.npy)")
            else:
                print("  ❌ Genre features NOT needed")
            if model_config.get('use_synopsis'):
                print("  ✅ Synopsis embeddings (item_synopsis_embeddings.npy)")
            else:
                print("  ❌ Synopsis embeddings NOT needed")
            print("-"*70)
            
            print("\n✅ Model loaded successfully!")
    
    load_btn.on_click(load_model)

## 7. Helper Functions

Define helper functions for prediction and recommendation.
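
Top-K selection itself is independent of the model: given one score per candidate item, drop the items the user has already seen and keep the K highest scores. The sketch below shows this with the standard library's `heapq.nlargest` rather than the full sort used in `recommend`; `top_k` and its arguments are illustrative names.

```python
import heapq
from typing import Iterable, List, Optional, Sequence, Tuple

def top_k(scores: Sequence[float], k: int,
          seen: Optional[Iterable[int]] = None) -> List[Tuple[int, float]]:
    """Return up to k (item_id, score) pairs with the highest scores, skipping seen items."""
    seen_set = set(seen) if seen is not None else set()
    candidates = ((i, s) for i, s in enumerate(scores) if i not in seen_set)
    return heapq.nlargest(k, candidates, key=lambda pair: pair[1])

print(top_k([0.1, 0.9, 0.5, 0.7], k=2, seen=[1]))
```

Converting `seen` to a set keeps the filtering step linear in the number of items, which matters when a user has a long history.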

In [None]:
# @title Define helper functions

def get_movie_title(item_id: int, movies_df: Optional[pd.DataFrame] = None) -> str:
    """Get movie title for item ID."""
    if movies_df is None:
        return f"Movie {item_id}"
    
    # Try to get title from movies_df
    if item_id in movies_df.index:
        title = movies_df.loc[item_id, 'title']
        return title if pd.notna(title) else f"Movie {item_id}"
    
    return f"Movie {item_id}"


def parse_genres(genres_str: str) -> list:
    """Parse genre names from a JSON or Python-literal list of {'name': ...} dicts."""
    import json
    import ast
    
    if not isinstance(genres_str, str) or genres_str == "":
        return []
    
    # The field may be valid JSON or a Python literal, so try both parsers in turn.
    for parser in (json.loads, ast.literal_eval):
        try:
            return [g['name'] for g in parser(genres_str)]
        except (ValueError, SyntaxError, TypeError, KeyError):
            continue
    return []


def get_movie_genres(item_id: int, movies_df: Optional[pd.DataFrame] = None) -> str:
    """Get genres for item ID, preferring metadata and falling back to genre features."""
    if movies_df is not None and item_id in movies_df.index:
        genres_str = movies_df.loc[item_id, 'genres']
        genres = parse_genres(genres_str)
        return ', '.join(genres) if genres else "Unknown"
    
    # Fall back to the multi-hot genre features if available
    if GENRE_FEATURES is not None and 0 <= item_id < len(GENRE_FEATURES):
        genre_indices = [i for i, g in enumerate(GENRE_FEATURES[item_id]) if g == 1]
        if genre_indices and GENRE_NAMES:
            return ', '.join([GENRE_NAMES[i] for i in genre_indices if i < len(GENRE_NAMES)])
    
    return "Unknown"


def predict_score(model, user_id: int, item_id: int, 
                genre_vector: Optional[np.ndarray] = None,
                synopsis_embedding: Optional[np.ndarray] = None,
                device: str = DEVICE) -> float:
    """Predict score for a user-item pair."""
    model.eval()
    
    user_tensor = torch.LongTensor([user_id]).to(device)
    item_tensor = torch.LongTensor([item_id]).to(device)
    
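    kwargs = {}
    if genre_vector is not None:
        # Convert from a single ndarray; building a tensor from a list of arrays is slow
        kwargs['genre_features'] = torch.as_tensor(
            np.asarray(genre_vector), dtype=torch.float32).unsqueeze(0).to(device)
    if synopsis_embedding is not None:
        kwargs['synopsis_embeddings'] = torch.as_tensor(
            np.asarray(synopsis_embedding), dtype=torch.float32).unsqueeze(0).to(device)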
    
    with torch.no_grad():
        logits = model(user_tensor, item_tensor, **kwargs)
        score = torch.sigmoid(logits).squeeze(-1).item()
    
    return score


def recommend(model, user_id: int, k: int = 10,
             item_genre_features: Optional[np.ndarray] = None,
             item_synopsis_embeddings: Optional[np.ndarray] = None,
             seen_items: Optional[List[int]] = None,
             device: str = DEVICE) -> List[Dict]:
    """Recommend top-K items for a user."""
    model.eval()
    num_items = model.num_items
    
    candidate_items = list(range(num_items))
    if seen_items is not None:
        seen = set(seen_items)  # set membership keeps the filter linear in num_items
        candidate_items = [item for item in candidate_items if item not in seen]
    
    user_tensor = torch.LongTensor([user_id] * len(candidate_items)).to(device)
    item_tensor = torch.LongTensor(candidate_items).to(device)
    
    kwargs = {}
    if item_genre_features is not None:
        kwargs['genre_features'] = torch.FloatTensor(item_genre_features[candidate_items]).to(device)
    if item_synopsis_embeddings is not None:
        kwargs['synopsis_embeddings'] = torch.FloatTensor(item_synopsis_embeddings[candidate_items]).to(device)
    
    with torch.no_grad():
        logits = model(user_tensor, item_tensor, **kwargs)
        scores = torch.sigmoid(logits).squeeze(-1).cpu().numpy()
    
    top_indices = np.argsort(scores)[::-1][:k]
    
    recommendations = []
    for idx in top_indices:
        item_id = int(candidate_items[idx])
        recommendations.append({
            'item_id': item_id,
            'score': float(scores[idx]),
            'rank': len(recommendations) + 1,
            'title': get_movie_title(item_id, movies_df),
            'genres': get_movie_genres(item_id, movies_df),
        })
    
    return recommendations


def load_multiple_models(model_paths: Dict[str, str], device: str = DEVICE) -> Dict[str, tuple]:
    """Load multiple models for comparison."""
    loaded = {}
    for name, path in model_paths.items():
        try:
            model_obj, checkpoint = NeuMFPlus.load(path, device=device)
            model_obj.eval()
            loaded[name] = (model_obj, checkpoint)
            print(f"✅ Loaded: {name}")
        except Exception as e:
            print(f"❌ Failed to load {name}: {e}")
    return loaded

print("✅ Helper functions defined!")

## 8. Model Comparison

Compare predictions from different models to see how content features affect recommendations.

**Models available:**
- **NeuMF (Baseline)**: Uses only user-item interaction patterns
- **NeuMF+ (Genre)**: Adds genre information for better content-aware recommendations  
- **NeuMF+ (Genre + Synopsis)**: Full model with both genre and synopsis features
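
When reading the comparison, keep in mind that a score difference between two models is only a change in the predicted score for that pair, not evidence that one model is more accurate. A small sketch of the difference computation with a guard for a zero baseline; `score_change` is a hypothetical helper, not part of the notebook.

```python
from typing import Optional, Tuple

def score_change(baseline: float, other: float) -> Tuple[float, Optional[float]]:
    """Return (absolute difference, percent change); percent is None if baseline is 0."""
    diff = other - baseline
    percent = (diff / baseline) * 100.0 if baseline != 0 else None
    return diff, percent

diff, percent = score_change(0.5, 0.75)
print(f"{diff:+.4f}", "n/a" if percent is None else f"{percent:+.1f}%")
```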

In [None]:
# @title Compare predictions from different models

# Define model paths for comparison
MODEL_PATHS = {
    'Baseline (NeuMF)': os.path.join(MODELS_DIR, 'NeuMF_best.pt'),
    'Genre Only': os.path.join(MODELS_DIR, 'NeuMFPlus_genre_best.pt'),
    'Genre + Synopsis': os.path.join(MODELS_DIR, 'NeuMFPlus_genre_synopsis_best.pt'),
}

# @markdown Enter user ID and item ID to compare predictions:
compare_user_id = 100  # @param {type:"integer"}
compare_item_id = 500  # @param {type:"integer"}

if compare_user_id >= NUM_USERS:
    print(f"❌ Invalid user ID. Must be less than {NUM_USERS}.")
elif compare_item_id >= NUM_ITEMS:
    print(f"❌ Invalid item ID. Must be less than {NUM_ITEMS}.")
else:
    print("Loading models for comparison...")
    print("-"*70)
    
    loaded_models = load_multiple_models(MODEL_PATHS, device=DEVICE)
    
    if loaded_models:
        print("\n" + "="*70)
        print(f"MODEL COMPARISON: User {compare_user_id} → Item {compare_item_id}")
        print("="*70)
        print(f"\nMovie: {get_movie_title(compare_item_id, movies_df)}")
        print(f"Genres: {get_movie_genres(compare_item_id, movies_df)}")
        
        print(f"\n{'Model':<25} {'Score':<10} {'Recommendation'}")
        print("-"*70)
        
        # Prepare features
        genre_vec = GENRE_FEATURES[compare_item_id] if GENRE_FEATURES is not None else None
        synopsis_emb = SYNOPSIS_EMBEDDINGS[compare_item_id] if SYNOPSIS_EMBEDDINGS is not None else None
        
        for model_name, (model_obj, _) in loaded_models.items():
            # Determine what features to pass based on model config
            
            model_genre = genre_vec if model_obj.use_genre else None
            model_synopsis = synopsis_emb if model_obj.use_synopsis else None
            
            score = predict_score(model_obj, compare_user_id, compare_item_id, 
                                model_genre, model_synopsis)
            
            if score > 0.8:
                recommendation = "Strongly recommend"
            elif score > 0.6:
                recommendation = "Recommend"
            elif score > 0.4:
                recommendation = "Maybe"
            else:
                recommendation = "Not recommended"
            
            print(f"{model_name:<25} {score:.4f}     {recommendation}")
        
        # Calculate differences
        baseline_name = 'Baseline (NeuMF)'
        full_name = 'Genre + Synopsis'
        if baseline_name in loaded_models and full_name in loaded_models:
            baseline_score = predict_score(loaded_models[baseline_name][0], compare_user_id, compare_item_id)
            full_score = predict_score(loaded_models[full_name][0], compare_user_id, compare_item_id, 
                                      genre_vec, synopsis_emb)
            diff = full_score - baseline_score
            
            print("\n" + "-"*70)
            if baseline_score > 0:
                print(f"SCORE CHANGE vs. baseline: {diff:+.4f} ({(diff/baseline_score)*100:+.1f}%)")
            else:
                print(f"SCORE CHANGE vs. baseline: {diff:+.4f}")
            if diff > 0:
                print("✨ Content features improved the prediction!")
            elif diff < 0:
                print("📉 Baseline performed better for this item")
            else:
                print("➖ No significant difference")
    else:
        print("\n⚠️  No models could be loaded. Please check the paths.")

## 9. Example: Top-K Recommendations

Get personalized movie recommendations for a user.

In [None]:
# @title Get top-K recommendations
# @markdown Enter user ID and number of recommendations:

user_id_rec = 100  # @param {type:"integer"}
k_recommendations = 10  # @param {type:"integer", min:1, max:50}

if user_id_rec >= NUM_USERS:
    print(f"❌ Invalid user ID. Must be less than {NUM_USERS}.")
else:
    recommendations = recommend(
        model, user_id_rec, k=k_recommendations,
        item_genre_features=GENRE_FEATURES,
        item_synopsis_embeddings=SYNOPSIS_EMBEDDINGS,
    )
    
    print("="*70)
    print(f"TOP-{k_recommendations} RECOMMENDATIONS FOR USER {user_id_rec}")
    print("="*70)
    
    print(f"\n{'Rank':<6} {'Score':<10} {'Title':<50} {'Genres'}")
    print("-" * 100)
    
    for rec in recommendations:
        title = rec['title'][:47] + '...' if len(rec['title']) > 47 else rec['title']
        print(f"{rec['rank']:<6} {rec['score']:.4f}     {title:<50} {rec['genres']}")

## 10. Example: Compare Multiple Users

See how different users would rate the same movie.
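
The user IDs arrive as one comma-separated string, so it helps to separate usable IDs from bad entries before scoring. A minimal sketch, assuming a hypothetical helper `parse_user_ids` that is not part of the notebook:

```python
from typing import List, Tuple

def parse_user_ids(text: str, num_users: int) -> Tuple[List[int], List[str]]:
    """Split a comma-separated string into valid user IDs and rejected entries."""
    valid, rejected = [], []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # ignore empty entries such as a trailing comma
        try:
            user_id = int(token)
        except ValueError:
            rejected.append(token)
            continue
        if 0 <= user_id < num_users:
            valid.append(user_id)
        else:
            rejected.append(token)
    return valid, rejected

print(parse_user_ids("0, 50, abc, 100,", num_users=60))
```

Unlike a single `try` around the whole list, this keeps the valid IDs when one entry is malformed and reports exactly which entries were dropped.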

In [None]:
# @title Compare predictions for multiple users
# @markdown Enter item ID and list of user IDs to compare:

item_id_compare = 500  # @param {type:"integer"}
user_ids_compare = "0, 50, 100, 500, 1000"  # @param {type:"string"}

try:
    user_list = [int(u.strip()) for u in user_ids_compare.split(',')]
except ValueError:
    # Fall back to the default IDs if any entry is not an integer
    user_list = [0, 50, 100, 500, 1000]

if item_id_compare >= NUM_ITEMS:
    print(f"❌ Invalid item ID. Must be less than {NUM_ITEMS}.")
else:
    print("="*70)
    print(f"USER COMPARISON FOR ITEM: {get_movie_title(item_id_compare, movies_df)}")
    print(f"Genres: {get_movie_genres(item_id_compare, movies_df)}")
    print("="*70)
    
    print(f"\n{'User ID':<12} {'Score':<10} {'Prediction'}")
    print("-" * 40)
    
    genre_vec = GENRE_FEATURES[item_id_compare] if GENRE_FEATURES is not None else None
    synopsis_emb = SYNOPSIS_EMBEDDINGS[item_id_compare] if SYNOPSIS_EMBEDDINGS is not None else None
    
    for user_id in user_list:
        if user_id >= NUM_USERS:
            print(f"{user_id:<12} (invalid user)")
            continue
        
        score = predict_score(model, user_id, item_id_compare, genre_vec, synopsis_emb)
        
        if score > 0.8:
            prediction = "Will love it!"
        elif score > 0.6:
            prediction = "Will probably like it"
        elif score > 0.4:
            prediction = "Maybe"
        else:
            prediction = "Probably not interested"
        
        print(f"{user_id:<12} {score:.4f}     {prediction}")

## 11. Advanced: Cold-Start Prediction for New Movies

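Predict how a user would rate a new movie from its content features (genres and synopsis). Because the model still requires an item ID, the code passes the last item as a placeholder, so that item's collaborative embeddings also contribute to the score.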
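
Genres that are missing from the training vocabulary are silently skipped by the encoding step in the cell that follows, which can leave the genre vector partly or entirely zero. The sketch below builds the same kind of multi-hot vector but also reports unmatched names. The helper `encode_genres`, the case-insensitive matching, and the three-genre vocabulary are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

def encode_genres(genres: Sequence[str],
                  genre_names: Sequence[str]) -> Tuple[List[float], List[str]]:
    """Return a multi-hot vector over `genre_names` plus the genres that were not found."""
    index = {name.lower(): i for i, name in enumerate(genre_names)}
    vector = [0.0] * len(genre_names)
    unknown = []
    for genre in genres:
        pos = index.get(genre.strip().lower())
        if pos is None:
            unknown.append(genre)
        else:
            vector[pos] = 1.0
    return vector, unknown

# Hypothetical vocabulary; the real one is GENRE_NAMES from mappings.pkl.
names = ["Action", "Comedy", "Science Fiction"]
print(encode_genres(["Action", "Sci-Fi"], names))
```

Checking the list of unmatched names before predicting makes it obvious when an input genre needs to be renamed to match the vocabulary.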

In [None]:
# @title Cold-start prediction for a new movie
# @markdown Enter movie details for prediction:

new_user_id = 100  # @param {type:"integer"}
new_movie_genres = "Action,Sci-Fi"  # @param {type:"string"}
new_movie_synopsis = "A group of astronauts discover a mysterious artifact on Mars that changes their understanding of humanity's place in the universe."  # @param {type:"string"}

if new_user_id >= NUM_USERS:
    print(f"❌ Invalid user ID. Must be less than {NUM_USERS}.")
else:
    from sentence_transformers import SentenceTransformer
    
    # Load SBERT model for synopsis encoding
    print("Loading Sentence-BERT model...")
    sbert = SentenceTransformer('all-MiniLM-L6-v2')
    
    # Encode genres
    genre_list = [g.strip() for g in new_movie_genres.split(',')]
    genre_vector = np.zeros(NUM_GENRES, dtype=np.float32)
    
    if GENRE_NAMES:
        for genre in genre_list:
            if genre in GENRE_NAMES:
                idx = GENRE_NAMES.index(genre)
                genre_vector[idx] = 1.0
    
    # Encode synopsis
    synopsis_embedding = sbert.encode(new_movie_synopsis, show_progress_bar=False)
    synopsis_embedding = np.array(synopsis_embedding, dtype=np.float32)
    
    # Use a placeholder item ID (last item as reference)
    placeholder_item_id = NUM_ITEMS - 1
    
    # Predict
    score = predict_score(
        model, new_user_id, placeholder_item_id,
        genre_vector=genre_vector,
        synopsis_embedding=synopsis_embedding
    )
    
    print("\n" + "="*70)
    print("COLD-START PREDICTION")
    print("="*70)
    print(f"\nUser ID: {new_user_id}")
    print(f"\nNew Movie:")
    print(f"  Genres: {new_movie_genres}")
    print(f"  Synopsis: {new_movie_synopsis[:100]}...")
    print(f"\n✅ Predicted score: {score:.4f}")
    
    if score > 0.7:
        print("\n🎬 This user would likely enjoy this movie!")
    elif score > 0.5:
        print("\n🎬 This user might be interested in this movie.")
    else:
        print("\n🎬 This movie may not be a good fit for this user.")

## 12. Interactive Recommendation Widget

Use this interactive widget to explore recommendations for different users.

In [None]:
# @title Interactive recommendation widget

user_widget = widgets.IntSlider(
    value=100,
    min=0,
    max=NUM_USERS-1,
    step=1,
    description='User ID:',
    continuous_update=False,
)

k_widget = widgets.IntSlider(
    value=10,
    min=1,
    max=50,
    step=1,
    description='Top K:',
    continuous_update=False,
)

rec_out = widgets.Output()

def update_recommendations(user_id, k):
    with rec_out:
        rec_out.clear_output()
        
        recommendations = recommend(
            model, user_id, k=k,
            item_genre_features=GENRE_FEATURES,
            item_synopsis_embeddings=SYNOPSIS_EMBEDDINGS,
        )
        
        print("="*70)
        print(f"TOP-{k} RECOMMENDATIONS FOR USER {user_id}")
        print("="*70)
        
        print(f"\n{'Rank':<6} {'Score':<10} {'Title':<50} {'Genres'}")
        print("-" * 100)
        
        for rec in recommendations:
            title = rec['title'][:47] + '...' if len(rec['title']) > 47 else rec['title']
            print(f"{rec['rank']:<6} {rec['score']:.4f}     {title:<50} {rec['genres']}")

widgets.interactive(update_recommendations, user_id=user_widget, k=k_widget)

display(widgets.VBox([user_widget, k_widget, rec_out]))

## Summary

This notebook provides a complete interface for:

1. **Loading trained models** from Google Drive
2. **Single model selection** - Choose between:
   - `NeuMF_best.pt` - Baseline (CF only)
   - `NeuMFPlus_genre_best.pt` - Genre-enhanced
   - `NeuMFPlus_genre_synopsis_best.pt` - Full model (Genre + Synopsis)
3. **Model comparison** - Compare predictions from different models side-by-side
4. **Predicting scores** for user-item pairs
5. **Generating recommendations** for users
6. **Cold-start predictions** for new movies
7. **Interactive exploration** of the recommendation system

**Tips:**
- Use the GPU runtime in Colab for faster inference
- Adjust `GDRIVE_BASE` if your files are in a different location
- Ensure all data files (mappings.pkl, features, metadata) are in the correct directories
- Use the Model Comparison section to understand how content features affect predictions

**Model Selection Guide:**
- Use **Baseline** for pure collaborative filtering (fastest)
- Use **Genre Only** when you want genre-aware recommendations
- Use **Genre + Synopsis** when accuracy matters most (slowest of the three)