# Recommendations

This notebook loads the saved models and data, then shows movie recommendations for any chosen user.

It uses:

- the ALS collaborative model

- the hybrid LightGBM ranking model

The goals are:

- inspect a user’s watch history

- show top-N ALS recommendations

- show top-N hybrid recommendations

- compare both lists for the same user

In [1]:
import pandas as pd
import numpy as np
from pathlib import Path
import pickle
import scipy.sparse as sp

# Loading processed data and models

This section loads:

- processed CSV files

- the ALS model and sparse matrix

- the hybrid LightGBM model

- the hybrid feature table

All later functions reuse these objects.

In [2]:
PROJECT_ROOT = Path("..").resolve()
PROCESSED_DIR = PROJECT_ROOT / "data" / "processed"
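
Since every later cell depends on these files, a quick existence check before loading can give a clearer error than a failed read. The sketch below is only an illustration: the helper name `find_missing_files` is hypothetical, and the file names are the ones used in the loading cells of this notebook.

```python
from pathlib import Path

# File names taken from the loading cells in this notebook.
EXPECTED_FILES = [
    "merged.csv",
    "movie_map.csv",
    "user_map.csv",
    "hybrid_train_pairs.parquet",
    "als_model.pkl",
    "item_user_train.npz",
    "hybrid_lgbm_model.pkl",
]


def find_missing_files(processed_dir, names=EXPECTED_FILES):
    """Return the expected file names that are not present in processed_dir (hypothetical helper)."""
    processed_dir = Path(processed_dir)
    return [name for name in names if not (processed_dir / name).is_file()]
```

Calling `find_missing_files(PROCESSED_DIR)` before the loading cells should return an empty list when everything is in place.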

In [3]:
# Core data
merged = pd.read_csv(PROCESSED_DIR / "merged.csv")
movie_map = pd.read_csv(PROCESSED_DIR / "movie_map.csv")
user_map = pd.read_csv(PROCESSED_DIR / "user_map.csv")

In [4]:
# Hybrid training pairs (features + label)
pairs = pd.read_parquet(PROCESSED_DIR / "hybrid_train_pairs.parquet")

In [5]:
# Load ALS model
with open(PROCESSED_DIR / "als_model.pkl", "rb") as f:
    als_model = pickle.load(f)

In [6]:
# Load the item-user training matrix and transpose it to user-item (rows = users)
item_user = sp.load_npz(PROCESSED_DIR / "item_user_train.npz")
user_item = item_user.T.tocsr()

In [7]:
# Load hybrid LightGBM model
with open(PROCESSED_DIR / "hybrid_lgbm_model.pkl", "rb") as f:
    hybrid_model = pickle.load(f)


In [8]:
print("Data and models loaded.")
print("Merged shape:", merged.shape)
print("Pairs shape:", pairs.shape)

Data and models loaded.
Merged shape: (1000209, 8)
Pairs shape: (923353, 41)


# Helper: pick a user and inspect history

This section defines helpers to:

- sample a few users by internal index (**u_index**)

- see some of the movies they have already interacted with

Comparing this history with the recommendations helps check that the models suggest relevant movies the user has not already seen.

In [9]:
def list_random_users(n=5):
    """Return a small sample of user indices."""
    users = merged["u_index"].unique()
    return np.random.choice(users, size=min(n, len(users)), replace=False)
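
Because `np.random.choice` is unseeded here, a rerun will usually pick different users than the ones shown in the outputs. If repeatable demos matter, a seeded generator is one option. The function name and the `seed` parameter below are illustrative additions, not part of the project code.

```python
import numpy as np


def list_random_users_seeded(user_indices, n=5, seed=0):
    """Return a reproducible sample of unique user indices (hypothetical variant)."""
    users = np.unique(np.asarray(user_indices))
    rng = np.random.default_rng(seed)
    return rng.choice(users, size=min(n, len(users)), replace=False)


# Toy indices; the same seed yields the same sample on every call.
first = list_random_users_seeded([0, 1, 1, 2, 3, 4, 5], n=3, seed=42)
second = list_random_users_seeded([0, 1, 1, 2, 3, 4, 5], n=3, seed=42)
print((first == second).all())  # True
```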

In [10]:
def get_user_history(u_index, max_movies=15):
    """Show some movies the user has interacted with."""
    user_rows = merged[merged["u_index"] == u_index].copy()
    user_rows = user_rows.sort_values("timestamp", ascending=False)
    hist = (
        user_rows[["movie_id", "title", "genres", "rating", "timestamp"]]
        .head(max_movies)
        .reset_index(drop=True)
    )
    return hist
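
The helpers above take the internal `u_index`. When only a raw user ID is known, `user_map` can presumably translate it. The sketch below assumes `user_map` has one row per user with columns named `user_id` and `u_index`; those column names and the helper `raw_id_to_index` are assumptions, since the notebook does not show the table's schema. A small stand-in table is used so the example runs on its own.

```python
import pandas as pd

# Stand-in for user_map; the column names "user_id" and "u_index" are assumed.
toy_user_map = pd.DataFrame({"user_id": [10, 42, 77], "u_index": [0, 1, 2]})


def raw_id_to_index(raw_id, user_map, id_col="user_id", index_col="u_index"):
    """Return the internal index for a raw user ID, or None if the ID is unknown."""
    match = user_map.loc[user_map[id_col] == raw_id, index_col]
    if match.empty:
        return None
    return int(match.iloc[0])


print(raw_id_to_index(42, toy_user_map))   # 1
print(raw_id_to_index(999, toy_user_map))  # None
```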

In [11]:
sample_users = list_random_users(5)
sample_users

array([3334, 5538, 5746, 4631, 2560])

In [12]:
# Inspect watch history for the first sample user
example_user = int(sample_users[0])
print("Example user u_index:", example_user)

Example user u_index: 3334


In [13]:
get_user_history(example_user, max_movies=10)

Unnamed: 0,movie_id,title,genres,rating,timestamp
0,3361,Bull Durham (1988),Comedy,4.0,2000-09-01 20:22:07
1,1060,Swingers (1996),Comedy|Drama,4.0,2000-09-01 20:22:07
2,2396,Shakespeare in Love (1998),Comedy|Romance,4.0,2000-09-01 20:21:40
3,2795,Vacation (1983),Comedy,4.0,2000-09-01 20:21:19
4,104,Happy Gilmore (1996),Comedy,4.0,2000-09-01 20:20:23
5,39,Clueless (1995),Comedy|Romance,4.0,2000-09-01 20:20:23
6,2424,You've Got Mail (1998),Comedy|Romance,4.0,2000-09-01 20:20:23
7,3671,Blazing Saddles (1974),Comedy|Western,2.0,2000-09-01 20:19:34
8,3039,Trading Places (1983),Comedy,5.0,2000-09-01 20:19:04
9,3072,Moonstruck (1987),Comedy,3.0,2000-09-01 20:19:04


# ALS recommendations

This section defines a function that:

- takes a user index

- uses the ALS model to recommend top-N movies

- returns titles and genres

The user's row of the training matrix is passed to `recommend` so that ALS can exclude movies the user has already interacted with.

In [16]:
def als_recommend_for_user(u_index, als_model, user_item_matrix, movie_map, top_n=10):
    """
    Return top-N ALS recommendations for a given user index.
    """
    # check bounds
    if u_index < 0 or u_index >= als_model.user_factors.shape[0]:
        print("User index out of range for ALS model.")
        return None
    user_row = user_item_matrix[u_index]  # 1 x n_items CSR
    rec_ids, scores = als_model.recommend(
        userid=u_index,
        user_items=user_row,
        N=top_n,
    )
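    # look up titles; isin() silently drops any recommended m_index that is
    # missing from movie_map, so fewer than top_n rows (with gaps in rank) can come back
    rec_df = movie_map[movie_map["m_index"].isin(rec_ids)].copy()
    # keep ALS order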
    order = {m: i for i, m in enumerate(rec_ids)}
    rec_df["rank"] = rec_df["m_index"].map(order)
    rec_df["als_score"] = rec_df["m_index"].map(
        {m: float(s) for m, s in zip(rec_ids, scores)}
    )
    rec_df = rec_df.sort_values("rank")
    return rec_df[["rank", "m_index", "title", "genres", "als_score"]].reset_index(drop=True)
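
The `isin` lookup above returns only the recommended IDs that have a row in `movie_map`, so the result can hold fewer than `top_n` rows and the `rank` column can skip values. One possible alternative, sketched here with toy data, builds the ranked frame first and left-joins the movie details so that every recommendation keeps its rank. The helper name `attach_movie_info` and the toy values are illustrative assumptions, not part of the project code.

```python
import numpy as np
import pandas as pd


def attach_movie_info(rec_ids, scores, movie_map):
    """Build a ranked table from ALS output and left-join movie details (hypothetical helper)."""
    ranked = pd.DataFrame(
        {
            "rank": np.arange(len(rec_ids)),
            "m_index": np.asarray(rec_ids),
            "als_score": np.asarray(scores, dtype=float),
        }
    )
    # A left join keeps every recommended ID; unmatched IDs get NaN title/genres.
    out = ranked.merge(movie_map[["m_index", "title", "genres"]], on="m_index", how="left")
    return out[["rank", "m_index", "title", "genres", "als_score"]]


# Toy data: m_index 99 has no row in the movie table.
toy_movie_map = pd.DataFrame(
    {"m_index": [1, 2, 3], "title": ["A", "B", "C"], "genres": ["Comedy", "Drama", "Sci-Fi"]}
)
toy_recs = attach_movie_info([3, 99, 1], [0.9, 0.5, 0.2], toy_movie_map)
print(toy_recs)
```

With this approach, rows with a missing title make unmatched IDs visible instead of hiding them.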


In [17]:
als_recs = als_recommend_for_user(example_user, als_model, user_item, movie_map, top_n=10)

In [18]:
als_recs

Unnamed: 0,rank,m_index,title,genres,als_score
0,0,1241,Brazil (1985),Sci-Fi,0.203794
1,1,3366,Head On (1998),Drama,0.193231
2,3,1149,Poltergeist (1982),Horror|Thriller,0.154567
3,4,172,Miller's Crossing (1990),Drama,0.143401
4,5,1469,Betrayed (1988),Drama|Thriller,0.142527
5,6,888,Striptease (1996),Comedy|Crime,0.141053
6,8,1273,Carlito's Way (1993),Crime|Drama,0.12946
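
One way to check that the ALS list contains only unseen movies is to intersect the recommended IDs with the IDs in the user's training row. The sketch below uses plain arrays so it runs on its own; with the objects in this notebook, the seen IDs would presumably come from `user_item[u_index].indices` (an assumption about how the matrix is indexed). The helper name `already_seen` is hypothetical.

```python
import numpy as np


def already_seen(rec_ids, seen_ids):
    """Return the sorted recommended IDs that also appear in the user's history."""
    return np.intersect1d(np.asarray(rec_ids), np.asarray(seen_ids)).tolist()


# Toy data: item 7 is both recommended and already in the history.
print(already_seen([4, 7, 9], [1, 7, 8]))  # [7]
print(already_seen([4, 9], [1, 7, 8]))     # []
```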


# Hybrid model recommendations

The hybrid model uses extra features beyond ALS:

- genre features

- user taste profile

- user–movie genre similarity

- ALS score

This section ranks user–movie pairs using the hybrid model and shows the top results.

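To keep things simple, no new candidates are generated: the existing training pairs table is filtered to the chosen user's rows, and only those rows are ranked. Because that table was built for training, the ranked list can include movies the user has already rated (for the example user, Blazing Saddles appears in both the history and the hybrid list).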

In [19]:
# Feature columns for the hybrid model
hybrid_feature_cols = [c for c in pairs.columns if c not in ["label", "u_index", "m_index"]]

In [20]:
len(hybrid_feature_cols)

38

In [21]:
hybrid_feature_cols[:10]

['genre_Action_user',
 'genre_Adventure_user',
 'genre_Animation_user',
 "genre_Children's_user",
 'genre_Comedy_user',
 'genre_Crime_user',
 'genre_Documentary_user',
 'genre_Drama_user',
 'genre_Fantasy_user',
 'genre_Film-Noir_user']

In [22]:
def hybrid_recommend_for_user(u_index, pairs_df, hybrid_model, feature_cols, movie_map, top_n=10):
    """
    Rank movies for a given user using the hybrid LightGBM model.
    Uses the precomputed feature rows in pairs_df for that user.
    """
    user_rows = pairs_df[pairs_df["u_index"] == u_index].copy()
    if user_rows.empty:
        print("No feature rows found for user:", u_index)
        return None
    X_user = user_rows[feature_cols]  # keep as DataFrame for feature names
    user_rows["score"] = hybrid_model.predict(X_user)
    # sort by descending score
    user_rows = user_rows.sort_values("score", ascending=False)
    # take top-N and join movie info
    top = user_rows.head(top_n).merge(movie_map, on="m_index", how="left")
    return top[["m_index", "title", "genres", "label", "score"]].reset_index(drop=True)


In [23]:
hybrid_recs = hybrid_recommend_for_user(
    example_user, pairs, hybrid_model, hybrid_feature_cols, movie_map, top_n=10
)

In [24]:
hybrid_recs

Unnamed: 0,m_index,title,genres,label,score
0,171,Forrest Gump (1994),Comedy|Romance|War,1,3.357812
1,64,Star Wars: Episode VI - Return of the Jedi (1983),Action|Adventure|Romance|Sci-Fi|War,1,3.238572
2,5,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1,2.431225
3,44,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1,1.831969
4,203,Blazing Saddles (1974),Comedy|Western,1,1.446891
5,647,Batman Returns (1992),Action|Adventure|Comedy|Crime,0,1.391715
6,19,Big (1988),Comedy|Fantasy,1,1.064028
7,711,Lethal Weapon (1987),Action|Comedy|Crime|Drama,1,1.029469
8,163,Lethal Weapon 3 (1992),Action|Comedy|Crime|Drama,1,1.016858
9,22,Back to the Future (1985),Comedy|Sci-Fi,1,0.384793
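
Most rows in the list above carry `label` 1. If that label marks pairs the user already interacted with (an assumption; the notebook does not define it), then one way to surface only new movies is to drop those rows before taking the top N. The helper name `top_unseen` and the toy scores below are illustrative.

```python
import pandas as pd


def top_unseen(scored_rows, top_n=10, label_col="label", score_col="score"):
    """Keep rows assumed to be unseen (label == 0) and return the top_n by score."""
    unseen = scored_rows[scored_rows[label_col] == 0]
    return unseen.sort_values(score_col, ascending=False).head(top_n).reset_index(drop=True)


# Toy scored pairs for one user.
toy_scored = pd.DataFrame(
    {"m_index": [5, 6, 7, 8], "label": [1, 0, 0, 1], "score": [3.1, 0.4, 1.2, 2.0]}
)
print(top_unseen(toy_scored, top_n=2)["m_index"].tolist())  # [7, 6]
```

This still ranks only the movies that have precomputed feature rows for that user, not the full catalogue.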


# Comparing ALS and hybrid recommendations

Both lists for the same user are displayed one after the other to show how the hybrid model changes the ranking compared to ALS alone.

In [25]:
print("Example user:", example_user)
print("\nMovies in recent history:")
display(get_user_history(example_user, max_movies=5))

Example user: 3334

Movies in recent history:


Unnamed: 0,movie_id,title,genres,rating,timestamp
0,3361,Bull Durham (1988),Comedy,4.0,2000-09-01 20:22:07
1,1060,Swingers (1996),Comedy|Drama,4.0,2000-09-01 20:22:07
2,2396,Shakespeare in Love (1998),Comedy|Romance,4.0,2000-09-01 20:21:40
3,2795,Vacation (1983),Comedy,4.0,2000-09-01 20:21:19
4,104,Happy Gilmore (1996),Comedy,4.0,2000-09-01 20:20:23


In [26]:
print("Top 10 ALS recommendations:")
display(als_recs)

Top 10 ALS recommendations:


Unnamed: 0,rank,m_index,title,genres,als_score
0,0,1241,Brazil (1985),Sci-Fi,0.203794
1,1,3366,Head On (1998),Drama,0.193231
2,3,1149,Poltergeist (1982),Horror|Thriller,0.154567
3,4,172,Miller's Crossing (1990),Drama,0.143401
4,5,1469,Betrayed (1988),Drama|Thriller,0.142527
5,6,888,Striptease (1996),Comedy|Crime,0.141053
6,8,1273,Carlito's Way (1993),Crime|Drama,0.12946


In [27]:
print("Top 10 Hybrid recommendations:")
display(hybrid_recs)

Top 10 Hybrid recommendations:


Unnamed: 0,m_index,title,genres,label,score
0,171,Forrest Gump (1994),Comedy|Romance|War,1,3.357812
1,64,Star Wars: Episode VI - Return of the Jedi (1983),Action|Adventure|Romance|Sci-Fi|War,1,3.238572
2,5,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1,2.431225
3,44,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1,1.831969
4,203,Blazing Saddles (1974),Comedy|Western,1,1.446891
5,647,Batman Returns (1992),Action|Adventure|Comedy|Crime,0,1.391715
6,19,Big (1988),Comedy|Fantasy,1,1.064028
7,711,Lethal Weapon (1987),Action|Comedy|Crime|Drama,1,1.029469
8,163,Lethal Weapon 3 (1992),Action|Comedy|Crime|Drama,1,1.016858
9,22,Back to the Future (1985),Comedy|Sci-Fi,1,0.384793
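
A simple numeric companion to the visual comparison is the overlap between the two lists. The sketch below computes the shared IDs and the Jaccard similarity of two lists of `m_index` values; the helper name `list_overlap` is hypothetical. In this notebook, the inputs would be `als_recs["m_index"]` and `hybrid_recs["m_index"]`.

```python
def list_overlap(ids_a, ids_b):
    """Return (shared IDs, Jaccard similarity) for two recommendation lists."""
    set_a, set_b = set(ids_a), set(ids_b)
    shared = set_a & set_b
    union = set_a | set_b
    jaccard = len(shared) / len(union) if union else 0.0
    return sorted(shared), jaccard


# Toy lists of m_index values.
shared, jaccard = list_overlap([1, 2, 3, 4], [3, 4, 5, 6])
print(shared, round(jaccard, 3))  # [3, 4] 0.333
```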


# Wrapper to demo a user

This helper lets someone type a user index and see:

- recent watch history

- ALS recommendations

- hybrid recommendations

This is useful when explaining the project in a demo or portfolio.

In [28]:
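def show_user_demo(u_index, top_n=10):
    """Show recent history plus ALS and hybrid top-N lists for one user, using the objects loaded above."""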
    print("=" * 60)
    print(f"User index: {u_index}")
    print("=" * 60)
    print("\nRecent history:")
    display(get_user_history(u_index, max_movies=10))
    als_list = als_recommend_for_user(u_index, als_model, user_item, movie_map, top_n=top_n)
    hybrid_list = hybrid_recommend_for_user(u_index, pairs, hybrid_model, hybrid_feature_cols, movie_map, top_n=top_n)
    print("\nTop ALS recommendations:")
    display(als_list)
    print("\nTop Hybrid recommendations:")
    display(hybrid_list)


In [29]:
# Demo for sample user
show_user_demo(example_user, top_n=10)

User index: 3334

Recent history:


Unnamed: 0,movie_id,title,genres,rating,timestamp
0,3361,Bull Durham (1988),Comedy,4.0,2000-09-01 20:22:07
1,1060,Swingers (1996),Comedy|Drama,4.0,2000-09-01 20:22:07
2,2396,Shakespeare in Love (1998),Comedy|Romance,4.0,2000-09-01 20:21:40
3,2795,Vacation (1983),Comedy,4.0,2000-09-01 20:21:19
4,104,Happy Gilmore (1996),Comedy,4.0,2000-09-01 20:20:23
5,39,Clueless (1995),Comedy|Romance,4.0,2000-09-01 20:20:23
6,2424,You've Got Mail (1998),Comedy|Romance,4.0,2000-09-01 20:20:23
7,3671,Blazing Saddles (1974),Comedy|Western,2.0,2000-09-01 20:19:34
8,3039,Trading Places (1983),Comedy,5.0,2000-09-01 20:19:04
9,3072,Moonstruck (1987),Comedy,3.0,2000-09-01 20:19:04



Top ALS recommendations:


Unnamed: 0,rank,m_index,title,genres,als_score
0,0,1241,Brazil (1985),Sci-Fi,0.203794
1,1,3366,Head On (1998),Drama,0.193231
2,3,1149,Poltergeist (1982),Horror|Thriller,0.154567
3,4,172,Miller's Crossing (1990),Drama,0.143401
4,5,1469,Betrayed (1988),Drama|Thriller,0.142527
5,6,888,Striptease (1996),Comedy|Crime,0.141053
6,8,1273,Carlito's Way (1993),Crime|Drama,0.12946



Top Hybrid recommendations:


Unnamed: 0,m_index,title,genres,label,score
0,171,Forrest Gump (1994),Comedy|Romance|War,1,3.357812
1,64,Star Wars: Episode VI - Return of the Jedi (1983),Action|Adventure|Romance|Sci-Fi|War,1,3.238572
2,5,"Princess Bride, The (1987)",Action|Adventure|Comedy|Romance,1,2.431225
3,44,Star Wars: Episode IV - A New Hope (1977),Action|Adventure|Fantasy|Sci-Fi,1,1.831969
4,203,Blazing Saddles (1974),Comedy|Western,1,1.446891
5,647,Batman Returns (1992),Action|Adventure|Comedy|Crime,0,1.391715
6,19,Big (1988),Comedy|Fantasy,1,1.064028
7,711,Lethal Weapon (1987),Action|Comedy|Crime|Drama,1,1.029469
8,163,Lethal Weapon 3 (1992),Action|Comedy|Crime|Drama,1,1.016858
9,22,Back to the Future (1985),Comedy|Sci-Fi,1,0.384793


# Summary

This notebook shows how the trained models can be used to produce recommendations.

It:

- loads the saved ALS and hybrid models

- explores a user’s past interactions

- generates top-N ALS recommendations

- generates top-N hybrid recommendations

- provides a simple function to demo any user’s results

These pieces are enough to power a basic front end, such as a Streamlit app, where someone can select a user ID and see both the ALS and hybrid suggestions.