# Notebook 04: Recommendation Generation & Qualitative Evaluation

## Objective

This notebook uses the trained Matrix Factorization model to:
- Generate personalized Top-N movie recommendations
- Map movie IDs to human-readable titles
- Perform qualitative inspection of recommendation outputs
- Save prediction artifacts for downstream deployment

This notebook complements RMSE-based evaluation by validating
that recommendations are interpretable and plausible for real users.



In [45]:
import numpy as np
import pandas as pd
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics import mean_squared_error

In [47]:
# Load ratings (for filtering seen movies / qualitative checks)
ratings = pd.read_csv(
    "ratings.dat",
    sep="::",
    engine="python",
    names=["user_id", "movie_id", "rating", "timestamp"]
)

# Load movies metadata
movies = pd.read_csv(
    "movies.dat",
    sep="::",
    engine="python",
    names=["movie_id", "title", "genres"],
    encoding="latin-1"
)


ratings.head(), movies.head()


(   user_id  movie_id  rating  timestamp
 0        1      1193       5  978300760
 1        1       661       3  978302109
 2        1       914       3  978301968
 3        1      3408       4  978300275
 4        1      2355       5  978824291,
    movie_id                               title                        genres
 0         1                    Toy Story (1995)   Animation|Children's|Comedy
 1         2                      Jumanji (1995)  Adventure|Children's|Fantasy
 2         3             Grumpier Old Men (1995)                Comedy|Romance
 3         4            Waiting to Exhale (1995)                  Comedy|Drama
 4         5  Father of the Bride Part II (1995)                        Comedy)

In [48]:
movies.head()


   movie_id                               title                        genres
0         1                    Toy Story (1995)   Animation|Children's|Comedy
1         2                      Jumanji (1995)  Adventure|Children's|Fantasy
2         3             Grumpier Old Men (1995)                Comedy|Romance
3         4            Waiting to Exhale (1995)                  Comedy|Drama
4         5  Father of the Bride Part II (1995)                        Comedy


In [49]:
pred_df = pd.read_parquet("pred_df.parquet")
pred_df.shape

(5400, 3662)

In [50]:
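def get_top_n_recommendations(user_id, n=10):
    """
    Generate Top-N movie recommendations for a given user,
    excluding movies already rated by the user.
    """
    if user_id not in pred_df.index:
        raise KeyError(f"user_id {user_id} not found in pred_df")

    # Movies already rated by the user
    seen_movies = ratings.loc[ratings["user_id"] == user_id, "movie_id"]

    # Predicted scores for the user (a Series indexed by movie label)
    user_scores = pred_df.loc[user_id]

    # Column labels can come back as strings after a Parquet round trip;
    # cast them to int so they match ratings["movie_id"] and movies["movie_id"].
    # Without this, drop(..., errors="ignore") could silently drop nothing.
    user_scores = user_scores.set_axis(user_scores.index.astype(int))

    # Exclude seen movies
    user_scores = user_scores.drop(index=seen_movies, errors="ignore")

    # Top-N recommendations; rename_axis names the index explicitly so the
    # output column is "movie_id" regardless of how pred_df's columns are named
    top_n = (
        user_scores
        .sort_values(ascending=False)
        .head(n)
        .rename_axis("movie_id")
        .reset_index(name="predicted_rating")
    )

    return top_n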

In [51]:
sample_user_id = pred_df.index[0]

top_recs = get_top_n_recommendations(sample_user_id, n=10)

top_recs = top_recs.merge(movies, on="movie_id", how="left")
top_recs


   movie_id  predicted_rating                                   title                          genres
0       912          1.029738                       Casablanca (1942)               Drama|Romance|War
1      1193          0.895314  One Flew Over the Cuckoo's Nest (1975)                           Drama
2      1148          0.771576              Wrong Trousers, The (1993)                Animation|Comedy
3      3481          0.721225                    High Fidelity (2000)                          Comedy
4      1580          0.688855                     Men in Black (1997)  Action|Adventure|Comedy|Sci-Fi
5       923          0.686570                     Citizen Kane (1941)                           Drama
6      3578          0.648183                        Gladiator (2000)                    Action|Drama
7       745          0.620201                   Close Shave, A (1995)       Animation|Comedy|Thriller
8      3753          0.614813                     Patriot, The (2000)                Action|Drama|War
9      3897          0.607832                    Almost Famous (2000)                    Comedy|Drama
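
The list above includes movie 1193, which user 1 rated in the `ratings.head()` output. The notebook does not show which user `pred_df.index[0]` is, so this may or may not be a leak of an already-rated movie. A quick check along the following lines could settle it. This is a minimal sketch on toy data; `find_already_rated` and the `*_toy` frames are hypothetical names, not part of the notebook.

```python
import pandas as pd

# Toy stand-ins for the notebook's `ratings` and `top_recs` frames (hypothetical data)
ratings_toy = pd.DataFrame({
    "user_id": [1, 1, 2],
    "movie_id": [1193, 661, 912],
    "rating": [5, 3, 4],
})
top_recs_toy = pd.DataFrame({
    "movie_id": [912, 1193, 1148],
    "predicted_rating": [1.03, 0.90, 0.77],
})


def find_already_rated(recs, ratings, user_id):
    """Return the rows of `recs` whose movie_id the user has already rated."""
    seen = set(ratings.loc[ratings["user_id"] == user_id, "movie_id"])
    return recs[recs["movie_id"].isin(seen)]


# Any rows returned here are recommendations that should have been filtered out
leaks = find_already_rated(top_recs_toy, ratings_toy, user_id=1)
print(leaks)
```

In the notebook itself, the equivalent call would be `find_already_rated(top_recs, ratings, sample_user_id)`; an empty result means the seen-movie filter worked for that user.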


## Qualitative Evaluation of Recommendations

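The recommendations for the sample user look coherent and plausible: the
list is led by well-known titles such as Casablanca (1942), One Flew Over
the Cuckoo's Nest (1975), and Citizen Kane (1941). This is consistent with
the model capturing user preference patterns from historical interactions,
although one user's list of popular titles is weak evidence of
personalization on its own.

The `predicted_rating` values (roughly 0.6 to 1.03) fall below the 1-5
rating scale, so they are best read as relative ranking scores rather than
predicted star ratings.

This qualitative check complements RMSE-based evaluation by checking
that the ranked outputs are interpretable and realistic for real users.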
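
One way to make the inspection slightly less subjective is to compare the genre mix of a user's liked movies with the genre mix of the recommendations. The sketch below uses hypothetical toy frames in the same layout as `movies` (a pipe-delimited `genres` column); `genre_counts` is an illustrative helper, not part of the notebook.

```python
import pandas as pd


def genre_counts(frame):
    """Count genre occurrences in a frame with a pipe-delimited `genres` column."""
    return frame["genres"].str.split("|").explode().value_counts()


# Hypothetical toy data: movies the user rated highly vs. movies recommended
liked = pd.DataFrame({"genres": ["Drama", "Drama|Romance", "Comedy|Drama"]})
recs = pd.DataFrame({"genres": ["Drama|Romance|War", "Drama", "Animation|Comedy"]})

# Side-by-side genre counts; genres missing on one side are filled with 0
comparison = (
    pd.concat(
        {"liked": genre_counts(liked), "recommended": genre_counts(recs)},
        axis=1,
    )
    .fillna(0)
    .astype(int)
)
print(comparison)
```

In the notebook, `liked` could be built by merging the user's high ratings with `movies`, and `recs` would be `top_recs`. A large mismatch between the two columns would suggest the list reflects overall popularity more than the user's own tastes.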
