## Problem Statement

**Current Situation:**
- Streamly uses random title selection for recommendations
- Users report poor satisfaction with suggestions
- No consideration for user preferences or content appropriateness

**Objectives (implemented in this notebook and backend):**

The recommendation algorithm:
- Respects content appropriateness for kids profiles (filters to kids content when requested)
- Considers user preferences (language, genres)
- Accounts for user demographics (age band) where available
- Produces a preference-based ranking (by preference match score)
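
As a rough illustration of these objectives, the sketch below applies the kids filter, the language filter, and a binary genre match score to a small in-memory table. The sample titles and the helper name `rank_for_profile` are hypothetical. The column names follow the ones used later in this notebook.

```python
import pandas as pd

# Hypothetical sample catalogue; column names mirror those used in this notebook.
titles = pd.DataFrame(
    {
        "title": ["Space Pups", "Night Heist", "Ocean Life", "Laugh Track"],
        "category": ["Animation", "Thriller", "Documentary", "Comedy"],
        "language": ["en", "en", None, "fr"],
        "is_kids_content": [1, 0, 1, 0],
    }
)


def rank_for_profile(titles, kids_profile, preferred_language, preferences):
    """Filter and rank titles for one profile (illustrative helper, not the backend API)."""
    result = titles.copy()
    if kids_profile:
        result = result[result["is_kids_content"] == 1]
    if preferred_language is not None:
        # Keep titles in the preferred language, plus titles with no language recorded.
        result = result[
            (result["language"] == preferred_language) | result["language"].isna()
        ]
    genres = {g.strip() for g in (preferences or "").split(",") if g.strip()}
    # Preference match score: 1 if the title's category is a preferred genre, else 0.
    result = result.assign(score=result["category"].isin(genres).astype(int))
    # mergesort is stable, so tied titles keep their original relative order.
    return result.sort_values("score", ascending=False, kind="mergesort")


ranked = rank_for_profile(
    titles, kids_profile=True, preferred_language="en", preferences="Documentary, Comedy"
)
print(ranked[["title", "score"]])
```

For this kids profile, only the two kids titles survive the filters, and the documentary ranks first because it matches a preferred genre.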

In [1]:
import pandas as pd
import numpy as np

In [2]:
import sqlite3

## Algorithm Implementation

In [3]:
def recommend_titles(profile_id, limit=10):
    """
    Content-based recommendation algorithm for Streamly.
    
    Connects to SQLite database and recommends titles based on user profile.
    
    Algorithm Steps:
    1. Load user profile from database
    2. Load all titles from database
    3. Filter titles by age-appropriateness (kids flag)
    4. Filter titles by language preference
    5. Score titles by genre preference match
    6. Sort by score (descending)
    7. Return top N recommendations
    
    Args:
        profile_id (int): User profile identifier
        limit (int): Number of recommendations to return (default: 10)
        
    Returns:
        list: List of recommended title dictionaries with metadata
    """
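    conn = sqlite3.connect("streamly.db")

    # Step 1: Load profile from database (parameterized to avoid SQL injection)
    profiles = pd.read_sql(
        "SELECT * FROM profiles WHERE profile_id = ?", conn, params=(profile_id,)
    )
    if profiles.empty:
        # Fail with a clear message instead of an IndexError from .iloc[0]
        conn.close()
        raise ValueError(f"No profile found for profile_id={profile_id}")
    profile = profiles.iloc[0]

    # Step 2: Load all titles from database
    titles = pd.read_sql("SELECT * FROM titles", conn)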

    # Step 3: Filter by kids content
    if profile["kids_profile"] == 1:
        titles = titles[titles["is_kids_content"] == 1]

    # Step 4: Filter by language preference
    if pd.notna(profile["preferred_language"]):
        titles = titles[
            (titles["language"] == profile["preferred_language"]) |
            (titles["language"].isna())
        ]

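    # Step 5: Score by genre preferences
    # Work on a copy so adding a column to a filtered frame does not
    # trigger pandas' SettingWithCopyWarning.
    titles = titles.copy()
    if pd.notna(profile["preferences"]):
        preferred_genres = {g.strip() for g in profile["preferences"].split(",")}
        # 1 if the title's category is a preferred genre, else 0
        titles["score"] = titles["category"].isin(preferred_genres).astype(int)
    else:
        titles["score"] = 0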

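    # Step 6: Sort by score (descending); a stable sort keeps tied titles
    # in their original order, so results are reproducible
    titles = titles.sort_values("score", ascending=False, kind="mergesort")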

    # Step 7: Return top recommendations
    conn.close()
    return titles.head(limit).to_dict(orient="records")
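
To try these steps without the real `streamly.db`, one option is an in-memory SQLite database. The sketch below assumes a minimal schema containing only the columns the function reads, plus hypothetical `title_id` and `name` columns. It uses a hypothetical variant, `recommend_from_conn`, that takes an open connection instead of a hard-coded file path. The sample rows are invented for illustration.

```python
import sqlite3

import pandas as pd


def build_demo_db():
    """Create an in-memory database with an assumed, minimal Streamly-like schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE profiles (
            profile_id INTEGER PRIMARY KEY,
            kids_profile INTEGER,
            preferred_language TEXT,
            preferences TEXT
        );
        CREATE TABLE titles (
            title_id INTEGER PRIMARY KEY,
            name TEXT,
            category TEXT,
            language TEXT,
            is_kids_content INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO profiles VALUES (?, ?, ?, ?)",
        [(1, 0, "en", "Thriller, Comedy"), (2, 1, None, None)],
    )
    conn.executemany(
        "INSERT INTO titles VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Space Pups", "Animation", "en", 1),
            (2, "Night Heist", "Thriller", "en", 0),
            (3, "Ocean Life", "Documentary", None, 1),
            (4, "Laugh Track", "Comedy", "fr", 0),
        ],
    )
    return conn


def recommend_from_conn(conn, profile_id, limit=10):
    """Same steps as recommend_titles, but against a caller-supplied connection."""
    profiles = pd.read_sql(
        "SELECT * FROM profiles WHERE profile_id = ?", conn, params=(profile_id,)
    )
    if profiles.empty:
        raise ValueError(f"No profile found for profile_id={profile_id}")
    profile = profiles.iloc[0]
    titles = pd.read_sql("SELECT * FROM titles", conn)

    if profile["kids_profile"] == 1:
        titles = titles[titles["is_kids_content"] == 1]
    if pd.notna(profile["preferred_language"]):
        titles = titles[
            (titles["language"] == profile["preferred_language"])
            | titles["language"].isna()
        ]
    prefs = profile["preferences"]
    genres = {g.strip() for g in prefs.split(",")} if pd.notna(prefs) else set()
    titles = titles.assign(score=titles["category"].isin(genres).astype(int))
    titles = titles.sort_values("score", ascending=False, kind="mergesort")
    return titles.head(limit).to_dict(orient="records")


conn = build_demo_db()
adult_recs = recommend_from_conn(conn, profile_id=1)
kids_recs = recommend_from_conn(conn, profile_id=2)
conn.close()
print([r["name"] for r in adult_recs])
print([r["name"] for r in kids_recs])
```

Passing the connection in, rather than opening a file inside the function, is one way to make the logic testable against throwaway data. Whether the backend is structured this way is not stated in this notebook.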