## The Project: AI Agents for Diverse Movie Recommendations using LangGraph

This project builds an AI agent system with the **LangGraph** framework to provide personalized movie recommendations on the **MovieLens 100K** dataset. The system uses an **SVD++** collaborative-filtering model, trained earlier in the notebook, and wraps it in three agents:

1. **RecommenderAgent**: Predicts the top-20 movies for each user with the trained SVD++ model.
2. **DiversityRerankerAgent**: Greedily selects a diverse top-5 subset of those recommendations by trading predicted rating against genre variety.
3. **CritiqueAgent**: Uses a language model (e.g., Granite) to critique the recommendation list for thematic richness, emotional depth, and genre coverage.

Each agent operates within a **LangGraph pipeline**, enabling structured reasoning and modular evaluation. The full system prints user history, generates diverse recommendations, and explains potential weaknesses in the results.

    Graph:
      A[RecommenderAgent] --> B[DiversityRerankerAgent]
      B --> C[CritiqueAgent]
      C --> D[END]
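
The re-ranking step described in item 2 can be illustrated with a small, dependency-free sketch. It assumes a greedy selection that scores each candidate by a convex mix of predicted rating and the number of genres not yet covered. The function name `greedy_rerank` and the toy candidates are hypothetical and only serve the illustration.

```python
def greedy_rerank(candidates, top_k=2, diversity_weight=0.5):
    """Greedily pick top_k items by a mix of rating and newly covered genres.

    candidates: list of (title, genres, predicted_rating), genres as "A, B".
    """
    selected, covered = [], set()
    remaining = list(candidates)
    while remaining and len(selected) < top_k:
        def score(item):
            _, genres, rating = item
            new_genres = len(set(genres.split(", ")) - covered)
            return (1 - diversity_weight) * rating + diversity_weight * new_genres

        best = max(remaining, key=score)
        selected.append(best)
        covered.update(best[1].split(", "))
        remaining.remove(best)
    return [title for title, _, _ in selected]


# Hypothetical candidates, not taken from MovieLens
toy = [
    ("A", "Drama", 4.5),
    ("B", "Drama, War", 4.4),
    ("C", "Comedy", 4.0),
    ("D", "Comedy, Romance", 3.9),
]
print(greedy_rerank(toy, diversity_weight=0.0))  # rating only
print(greedy_rerank(toy, diversity_weight=0.5))  # rating and genre coverage
```

With these toy values, a weight of 0.0 keeps the two highest-rated titles (A and B), while 0.5 selects B and D because D adds two uncovered genres. Under this convex form, a weight above 1 makes the rating coefficient `1 - diversity_weight` negative, so higher predicted ratings lower the score.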

## The Dataset: MovieLens 100K

The experiments are based on the **MovieLens 100K dataset**, a widely used benchmark in recommender systems research.

- **Size:** 100,000 ratings
- **Users:** 943
- **Movies:** 1,682
- **Format:** plain-text files; `u.data` is tab-delimited and `u.item` is pipe-delimited

### Main Columns
- **userId** – unique identifier of each user (anonymized, 1–943).  
- **movieId** – unique identifier of each movie (1–1682).  
- **rating** – explicit rating from 1 to 5, where higher values indicate stronger preference.  
- **timestamp** – UNIX time indicating when the rating was made.  
- **title** (from `u.item`) – the name of the movie.  
- **genres** (from `u.item`) – stored as 19 binary flag columns, one per genre (e.g., Action, Comedy); a movie can have several flags set.  

This dataset is small enough to allow fast experimentation, yet rich enough to demonstrate the strengths and weaknesses of different recommendation algorithms.
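
To make the `u.item` layout concrete, here is a minimal sketch that decodes one pipe-delimited record into a movie ID, title, and genre list. The genre order follows the column names used later in this notebook. The helper `parse_item_line` and the sample record (including its placeholder URL) are illustrative assumptions, not part of the dataset tooling.

```python
GENRES = [
    "unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
    "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
    "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western",
]


def parse_item_line(line):
    """Split one pipe-delimited u.item record into (movie_id, title, genres)."""
    fields = line.rstrip("\n").split("|")
    movie_id, title = int(fields[0]), fields[1]
    # Fields 0-4 are id, title, release date, video release date, URL;
    # the 19 genre flags follow.
    flags = fields[5:5 + len(GENRES)]
    genres = [name for name, flag in zip(GENRES, flags) if flag == "1"]
    return movie_id, title, genres


# Build an illustrative record in the u.item layout (URL is a placeholder)
sample_genres = {"Animation", "Children's", "Comedy"}
sample_flags = ["1" if g in sample_genres else "0" for g in GENRES]
sample = "|".join(
    ["1", "Toy Story (1995)", "01-Jan-1995", "", "http://example.invalid/item"]
    + sample_flags
)
print(parse_item_line(sample))
```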


In [None]:
!pip uninstall -y scikit-surprise
!pip install numpy==1.26.4 --force-reinstall

In [None]:
!pip install scikit-surprise --no-binary scikit-surprise

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [4]:
import zipfile
import pandas as pd

# Path to the ZIP file in Google Drive
zip_path = "/content/drive/MyDrive/Portfolio datasets/Recommender engine/ml-100k.zip"
extract_path = "/content/ml-100k"

# Extract the dataset
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extract_path)

# Load the ratings (u.data)
data_path = f"{extract_path}/ml-100k/u.data"
df = pd.read_csv(
    data_path,
    sep="\t",
    names=["userId", "movieId", "rating", "timestamp"]
)

# Load the movie titles (u.item)
item_path = f"{extract_path}/ml-100k/u.item"
movies = pd.read_csv(
    item_path,
    sep="|",
    encoding="latin-1",
    header=None,
    usecols=[0, 1],
    names=["movieId", "title"]
)

# Merge ratings with movie titles
df_merged = pd.merge(df, movies, on="movieId")

print("Data shape:", df_merged.shape)
print(df_merged.head())


Data shape: (100000, 5)
   userId  movieId  rating  timestamp                       title
0     196      242       3  881250949                Kolya (1996)
1     186      302       3  891717742    L.A. Confidential (1997)
2      22      377       1  878887116         Heavyweights (1994)
3     244       51       2  880606923  Legends of the Fall (1994)
4     166      346       1  886397596         Jackie Brown (1997)
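
As a quick sanity check on the numbers above, the density of the user-movie rating matrix can be worked out from the dataset sizes alone. This is a small arithmetic sketch; the variable names are chosen here for illustration.

```python
# Dataset sizes as stated in the dataset description
n_ratings, n_users, n_movies = 100_000, 943, 1_682

possible_pairs = n_users * n_movies   # every user-movie combination
density = n_ratings / possible_pairs  # fraction of pairs with a rating

print(f"user-movie pairs: {possible_pairs:,}")
print(f"density: {density:.2%}")
print(f"average ratings per user: {n_ratings / n_users:.1f}")
```

Only about 6% of all possible user-movie pairs carry a rating, which is why a collaborative-filtering model has to generalize from sparse observations.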


In [None]:
import numpy as np
import pandas as pd
from surprise import Dataset, Reader, SVDpp, accuracy
from surprise.model_selection import KFold
from collections import defaultdict

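# === Helper: compute ranking metrics for multiple K values ===
def metrics_at_k(predictions, ks=(5, 10, 20)):
    """Return per-K user-averaged Precision, Recall, F1, NDCG and HitRate (relevant = true rating >= 4)."""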
    user_ratings = defaultdict(list)
    for uid, iid, true_r, est, _ in predictions:
        user_ratings[uid].append((iid, est, true_r))

    results = {k: {"Precision": [], "Recall": [], "F1": [], "NDCG": [], "HitRate": []} for k in ks}

    for uid, ratings in user_ratings.items():
        ratings.sort(key=lambda x: x[1], reverse=True)

        # Relevant = rating >= 4
        rel = [r for (_, _, r) in ratings if r >= 4]
        n_rel = len(rel)

        for k in ks:
            top_k = ratings[:k]
            rec = [iid for (iid, _, r) in top_k if r >= 4]
            n_rel_and_rec_k = len(rec)

            precision = n_rel_and_rec_k / k if k > 0 else 0
            recall = n_rel_and_rec_k / n_rel if n_rel > 0 else 0
            f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) > 0 else 0

            dcg = sum([1 / np.log2(idx+2) for idx, (iid, _, r) in enumerate(top_k) if r >= 4])
            idcg = sum([1 / np.log2(idx+2) for idx in range(min(n_rel, k))])
            ndcg = dcg / idcg if idcg > 0 else 0

            hit = 1 if n_rel_and_rec_k > 0 else 0

            results[k]["Precision"].append(precision)
            results[k]["Recall"].append(recall)
            results[k]["F1"].append(f1)
            results[k]["NDCG"].append(ndcg)
            results[k]["HitRate"].append(hit)

    # Average across users
    return {
        k: {m: np.mean(vals) for m, vals in metrics.items()}
        for k, metrics in results.items()
    }

# === Prepare data ===
ratings_df = df_merged[["userId", "movieId", "rating"]]
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(ratings_df, reader)

# === Train only SVD++ ===
model = SVDpp()
print("\n=== Training SVD++ ===")
kf = KFold(n_splits=5, random_state=42, shuffle=True)

rmses = []
metrics_all = {5: [], 10: [], 20: []}

for fold, (trainset, testset) in enumerate(kf.split(data), 1):
    print(f" Fold {fold} ...")
    # fit() retrains from scratch on each fold, so after the loop `model` holds the fold-5 fit
    model.fit(trainset)
    predictions = model.test(testset)

    # RMSE
    rmse = accuracy.rmse(predictions, verbose=False)
    print(f"   RMSE: {rmse:.4f}")
    rmses.append(rmse)

    # Top-K metrics
    metrics = metrics_at_k(predictions, ks=[5, 10, 20])
    for k in metrics:
        metrics_all[k].append(metrics[k])

# === Aggregate results ===
row = {"Model": "SVD++", "RMSE (mean)": np.mean(rmses)}
for k in [5, 10, 20]:
    avg_metrics = {m+f"@{k}": np.mean([fold[m] for fold in metrics_all[k]]) for m in metrics_all[k][0]}
    row.update(avg_metrics)

results_df = pd.DataFrame([row])
print("\n=== 5-Fold Cross-Validation Results (SVD++ Only) ===")
print(results_df)



=== Training SVD++ ===
 Fold 1 ...
   RMSE: 0.9207
 Fold 2 ...
   RMSE: 0.9157
 Fold 3 ...
   RMSE: 0.9242
 Fold 4 ...
   RMSE: 0.9206
 Fold 5 ...


In [None]:
!pip install bitsandbytes accelerate transformers langgraph

In [None]:
import os
from getpass import getpass
# Prompt for the token instead of hard-coding a secret in the notebook
os.environ["HF_TOKEN"] = getpass("Hugging Face token: ")

In [None]:
# === Open Source Granite 8B model LLM ===
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    llm_int8_enable_fp32_cpu_offload=True
)
try:
    # Granite 8B model name on Hugging Face
    GRANITE_MODEL = "ibm-granite/granite-3.3-8b-instruct"

    # Load the model + tokenizer
    granite_model = AutoModelForCausalLM.from_pretrained(
        GRANITE_MODEL,
        device_map="auto",
        quantization_config=bnb_config  # Optional
    )

    granite_tokenizer = AutoTokenizer.from_pretrained(GRANITE_MODEL)

    # Create pipeline
    granite_pipe = pipeline(
        "text-generation",
        model=granite_model,
        tokenizer=granite_tokenizer,
        pad_token_id=granite_tokenizer.eos_token_id,
        return_full_text=False
    )
except Exception as e:
    print("[Warning] Failed to load Granite model:", e)
    # Fallback stub so downstream cells still run without the model
    granite_pipe = lambda prompt, **kwargs: [{"generated_text": "[Granite model unavailable]"}]



In [11]:
import pandas as pd
from sklearn.model_selection import train_test_split

# Use the ratings dataframe (already loaded from MovieLens or another dataset)
ratings_df = df_merged[["userId", "movieId", "rating"]].copy()

# Train/Test split
train_df, test_df = train_test_split(ratings_df, test_size=0.2, random_state=42)

# Build ratings_history from train_df
ratings_history = (
    train_df.groupby("userId")[["movieId", "rating"]]
    .apply(lambda g: list(zip(g["movieId"], g["rating"])))
    .to_dict()
)

# Build item_pool from all unique movies
item_pool = ratings_df["movieId"].unique().tolist()

print("Data prepared:")
print(f"Train size: {len(train_df)}, Test size: {len(test_df)}")
print(f"Number of users in train: {train_df['userId'].nunique()}")
print(f"Number of items in pool: {len(item_pool)}")


Data prepared:
Train size: 80000, Test size: 20000
Number of users in train: 943
Number of items in pool: 1682
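
One possible use of `ratings_history` is to drop already-rated movies from the candidate pool before scoring. The agent cell later in this notebook scores the full `item_pool`, so titles from a user's history can reappear in the recommendations. The helper below is a hypothetical sketch of such a filter on miniature data with the same shape as the notebook's structures; it is not part of the pipeline as written.

```python
def unseen_candidates(user_id, ratings_history, item_pool):
    """Return the items in item_pool that user_id has not rated in ratings_history."""
    seen = {movie_id for movie_id, _ in ratings_history.get(user_id, [])}
    return [movie_id for movie_id in item_pool if movie_id not in seen]


# Hypothetical miniature data: {userId: [(movieId, rating), ...]}
toy_history = {1: [(10, 5), (20, 3)], 2: [(30, 4)]}
toy_pool = [10, 20, 30, 40]

print(unseen_candidates(1, toy_history, toy_pool))
print(unseen_candidates(99, toy_history, toy_pool))  # unknown user: nothing filtered
```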


In [10]:
# === AI Agents with LangGraph, using pre-trained SVD++ and MovieLens titles/genres ===
import pandas as pd
from langgraph.graph import StateGraph, END
from typing import TypedDict
import re
from collections import Counter
from itertools import combinations


# 1) Load movies with titles + genres
movies_df = pd.read_csv(
    "/content/ml-100k/ml-100k/u.item",
    sep="|",
    encoding="latin-1",
    header=None,
    names=["movieId", "title", "release_date", "video_release_date", "IMDb_URL",
           "unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
           "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
           "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western"]
)

def clean_title(title: str) -> str:
    return re.sub(r"\s*\(\d{4}\)$", "", title)

genre_cols = movies_df.columns[5:]
movies_df["genres"] = movies_df[genre_cols].apply(
    lambda row: ", ".join([g for g, v in row.items() if v == 1]), axis=1
)

# 2) Merge ratings with movie metadata
ratings_full = df_merged[["userId", "movieId", "rating"]].merge(
    movies_df[["movieId", "title", "genres"]],
    on="movieId", how="left"
)

item_pool = ratings_full["movieId"].unique().tolist()
ratings_history = (
    ratings_full.groupby("userId")[["movieId", "rating"]]
    .apply(lambda x: list(zip(x["movieId"], x["rating"])) )
    .to_dict()
)

# Look up titles and genres directly from the movie table (one row per movie)
movie_titles = movies_df.set_index("movieId")["title"].to_dict()
movie_genres = movies_df.set_index("movieId")["genres"].to_dict()

def title_of(iid: int) -> str:
    return movie_titles.get(iid, "Unknown title")

def genres_of(iid: int) -> str:
    g = movie_genres.get(iid, "")
    return g if isinstance(g, str) and g else "Unknown genre"

# 3) State definition
class State(TypedDict):
    user_id: int
    request: str
    recommendations: list
    explanation: str
    critique: str

# 4) RecommenderAgent
class RecommenderAgent:
    def __init__(self, model, movies_df):
        self.model = model
        self.movies_df = movies_df

    def recommend(self, user_id, item_pool, top_k=20):
        preds = [(iid, self.model.predict(user_id, iid).est) for iid in item_pool]
        preds = sorted(preds, key=lambda x: x[1], reverse=True)[:top_k]
        recs_df = pd.DataFrame(preds, columns=["movieId", "predicted_rating"])
        recs_df = recs_df.merge(self.movies_df, on="movieId", how="left")
        return recs_df[["title", "genres", "predicted_rating"]]

    def run(self, state, item_pool, top_k=20):
        user_id = state["user_id"]
        recs = self.recommend(user_id, item_pool, top_k)
        state["recommendations"] = recs
        print(f"--- Top-{top_k} recommendations before re-ranking ---")
        for i, row in enumerate(state["recommendations"].itertuples(), 1):
            print(f"{i}. {row.title} ({row.genres}) → pred {row.predicted_rating:.2f}")
        return state

# 5) CritiqueAgent
def wrap_text(text, words_per_line=10):
    words = text.split()
    lines = [" ".join(words[i:i+words_per_line]) for i in range(0, len(words), words_per_line)]
    return "\n".join(lines)

class CritiqueAgent:
    def __init__(self, pipe, words_per_line=10):
        self.pipe = pipe
        self.words_per_line = words_per_line

    def run(self, state):
        recs = state["recommendations"]
        recs_str = "; ".join(
            f"{row['title']} ({row['genres']}) pred {row['predicted_rating']:.2f}"
            for _, row in recs.iterrows()
        )
        prompt = (
            "You are a movie recommendation critic. Based on the following recommended list, "
            "write a brief critique of possible weaknesses in the recommendation strategy. "
            "Focus on thematic richness, emotional depth, genre variety (e.g. missing musical, children's, animation), "
            "and suggest one possible improvement in no more than 3 lines.\n\n"
            f"Recommendations: {recs_str}\n\n"
            "Critique:"
        )
        out = self.pipe(prompt, max_new_tokens=100, temperature=0.6)[0]["generated_text"]
        if "Critique:" in out:
            out = out.split("Critique:")[-1].strip()
        return {"critique": wrap_text(out, self.words_per_line)}

# 6) DiversityRerankerAgent
class DiversityRerankerAgent:
    def __init__(self, top_k=5, diversity_weight=1.0):
        self.top_k = top_k
        self.diversity_weight = diversity_weight

    def compute_genre_bonus(self, selected_genres, candidate_genres):
        """Calculate how many new genres this candidate adds."""
        candidate_set = set(candidate_genres.split(", "))
        return len(candidate_set - selected_genres)

    def run(self, state):
        recs = state["recommendations"].copy()
        if len(recs) <= self.top_k:
            return state

        # Select top_k movies incrementally to maximize genre diversity + rating
        selected = []
        selected_genres = set()
        remaining = recs.copy()

        while len(selected) < self.top_k and not remaining.empty:
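            def score(row):
                genre_bonus = self.compute_genre_bonus(selected_genres, row["genres"])
                # Convex mix of rating and new-genre count; meaningful for diversity_weight in [0, 1].
                # Above 1, the rating coefficient (1 - diversity_weight) turns negative.
                return (1 - self.diversity_weight) * row["predicted_rating"] + self.diversity_weight * genre_bonus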

            remaining["score"] = remaining.apply(score, axis=1)
            best = remaining.sort_values(by="score", ascending=False).iloc[0]
            selected.append(best)
            selected_genres.update(best["genres"].split(", "))
            remaining = remaining.drop(best.name)

        state["recommendations"] = pd.DataFrame(selected)[["title", "genres", "predicted_rating"]].reset_index(drop=True)
        return state


# 7) Graph builder
def build_graph(recommender, diversity_reranker, critique_agent, item_pool):
    graph = StateGraph(State)
    graph.add_node("Recommender", lambda s: recommender.run(s, item_pool, top_k=20))
    graph.add_node("DiversityReranker", lambda s: diversity_reranker.run(s))
    graph.add_node("Critique", lambda s: critique_agent.run(s))

    graph.add_edge("Recommender", "DiversityReranker")
    graph.add_edge("DiversityReranker", "Critique")
    graph.add_edge("Critique", END)

    graph.set_entry_point("Recommender")
    return graph

# 8) History printer
def print_user_history(user_id: int, ratings_df: pd.DataFrame, max_items=10):
    print("\n--- User History ---")
    df = ratings_df[ratings_df["userId"] == user_id][["title", "genres", "rating"]].copy()
    df["title"] = df["title"].apply(clean_title)
    df = df.sort_values(by=["rating", "title"], ascending=[False, True]).head(max_items)
    print(df.to_string(index=False))

# 9) Initialize agents
recommender = RecommenderAgent(model, movies_df)
diversity_reranker = DiversityRerankerAgent(top_k=5, diversity_weight=1.5)
critique_agent = CritiqueAgent(granite_pipe, words_per_line=10)

graph = build_graph(recommender, diversity_reranker, critique_agent, item_pool)
app = graph.compile()

# 10) Run example
sample_users = ratings_full["userId"].drop_duplicates().sample(n=10, random_state=42)
for uid in sample_users:
    print_user_history(uid, ratings_full)
    result = app.invoke({"user_id": int(uid), "request": "Recommend me movies"})

    print("\n=== Dialogue Agent Output ===")
    print(f"User: {uid}")
    print("\nCritique:")
    print(result["critique"])
    print("\nDiversityReranker:")
    print(result["recommendations"])



--- User History ---
                                 title                         genres  rating
                Breakfast at Tiffany's                 Drama, Romance       5
                          Little Women                          Drama       5
Romy and Michele's High School Reunion                         Comedy       5
                               Sabrina                Comedy, Romance       5
                      Schindler's List                     Drama, War       5
             Shawshank Redemption, The                          Drama       5
                     Strictly Ballroom                Comedy, Romance       5
                               Amadeus                 Drama, Mystery       4
                                  Bean                         Comedy       4
                  Beauty and the Beast Animation, Children's, Musical       4
--- Top-20 recommendations before re-ranking ---
1. Schindler's List (1993) (Drama, War) → pred 4.25
2. Strictly Ballroo