## The Project: AI Agents for Diverse Movie Recommendations using LangGraph

This project builds an AI agent system using the **LangGraph** framework to provide personalized movie recommendations based on the **MovieLens 100K** dataset. The system uses a pre-trained **NeuMF** model ([He et al., 2017](https://dl.acm.org/doi/10.1145/3038912.3052569)) for collaborative filtering and is enhanced with multiple agents:

1. **RecommenderAgent**: Predicts top-20 movies for each user based on the trained NeuMF model.
2. **DiversityRerankerAgent**: Selects the most diverse top-5 subset from the recommendations by balancing rating quality and genre variety.
3. **CritiqueAgent**: Uses a language model (e.g., Granite) to evaluate the recommendation list and provide critique about thematic richness, emotional depth, and genre coverage.

Each agent operates within a **LangGraph pipeline**, enabling structured reasoning and modular evaluation. The full system prints user history, generates diverse recommendations, and explains potential weaknesses in the results.


`Graph:
  A[RecommenderAgent] --> B[DiversityRerankerAgent]
  B --> C[CritiqueAgent]
  C --> D[END]
`

# The Dataset: MovieLens 100K

The experiments are based on the **MovieLens 100K dataset**, a widely used benchmark in recommender systems research.  

- **Size:** 100,000 ratings  
- **Users:** 943  
- **Movies:** 1,682  
- **Format:** tab-delimited files (CSV-like)

## Main Columns
- **userId** – unique identifier of each user (anonymized, 1–943).  
- **movieId** – unique identifier of each movie (1–1682).  
- **rating** – explicit rating from 1 to 5, where higher values indicate stronger preference.  
- **timestamp** – UNIX time indicating when the rating was made.  
- **title** (from `u.item`) – the name of the movie.  
- **genres** (from `u.item`) – one or more genres assigned to each movie (e.g., Action, Comedy).  

This dataset is small enough to allow fast experimentation, yet rich enough to demonstrate the strengths and weaknesses of different recommendation algorithms.


In [None]:
!pip uninstall -y scikit-surprise
!pip install numpy==1.26.4 --force-reinstall

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [2]:
import zipfile
import pandas as pd

# Path to the ZIP file in Google Drive
zip_path = "/content/drive/MyDrive/Portfolio datasets/Recommender engine/ml-100k.zip"
extract_path = "/content/ml-100k"

# Extract the dataset
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extract_path)

# Load the ratings (u.data)
data_path = f"{extract_path}/ml-100k/u.data"
df = pd.read_csv(
    data_path,
    sep="\t",
    names=["userId", "movieId", "rating", "timestamp"]
)

# Load the movie titles (u.item)
item_path = f"{extract_path}/ml-100k/u.item"
movies = pd.read_csv(
    item_path,
    sep="|",
    encoding="latin-1",
    header=None,
    usecols=[0, 1],
    names=["movieId", "title"]
)

# Merge ratings with movie titles
df_merged = pd.merge(df, movies, on="movieId")

print("Data shape:", df_merged.shape)
print(df_merged.head())


Data shape: (100000, 5)
   userId  movieId  rating  timestamp                       title
0     196      242       3  881250949                Kolya (1996)
1     186      302       3  891717742    L.A. Confidential (1997)
2      22      377       1  878887116         Heavyweights (1994)
3     244       51       2  880606923  Legends of the Fall (1994)
4     166      346       1  886397596         Jackie Brown (1997)


# NeuMF 5-Fold Cross-Validation: Code Overview

## What the code does
This script trains and evaluates a **NeuMF (Neural Matrix Factorization)** recommender on MovieLens 100K using **5-fold cross-validation**, reporting both **RMSE** and **Top-K ranking metrics**.

## Data handling
- Expects a pre-built `df_merged` with `userId`, `movieId`, `rating`.
- Converts `userId`/`movieId` to zero-based indices.
- Wraps samples in a `RatingsDataset` and uses PyTorch `DataLoader` for batching.

## Model architecture (NeuMF)
- **GMF branch:** user/item embeddings with element-wise product to capture linear interactions.
- **MLP branch:** separate user/item embeddings concatenated and passed through fully-connected layers (default hidden sizes: `[64, 32, 16]`) with ReLU.
- **Fusion:** concatenation of GMF output and MLP output, followed by a final linear layer to predict a rating.
- Default embedding sizes: `emb_size_gmf=32`, `emb_size_mlp=32`.

## Training setup
- Optimizer: Adam (`lr=0.001`).
- Loss: Mean Squared Error (predicting explicit ratings 1–5).
- Epochs: `5` per fold.
- Batch size: `512`.
- For each fold, a **fresh NeuMF model is initialized**, trained, and evaluated.

## Evaluation
- **RMSE** on the test split of each fold.
- **Top-K metrics** computed by `metrics_at_k` for K ∈ {5, 10, 20}:
  - Precision@K, Recall@K, F1@K
  - NDCG@K
  - HitRate@K
- An item is considered **relevant** if `true_rating ≥ 4`.
- The script prints per-fold results and then an aggregated mean across folds.


In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold
from collections import defaultdict

# === Dataset Wrapper ===
class RatingsDataset(Dataset):
    def __init__(self, df):
        self.users = torch.tensor(df["userId"].values, dtype=torch.long)
        self.items = torch.tensor(df["movieId"].values, dtype=torch.long)
        self.ratings = torch.tensor(df["rating"].values, dtype=torch.float32)

    def __len__(self):
        return len(self.ratings)

    def __getitem__(self, idx):
        return self.users[idx], self.items[idx], self.ratings[idx]

# === NeuMF Model ===
class NeuMF(nn.Module):
    def __init__(self, n_users, n_items, emb_size_gmf=32, emb_size_mlp=32, hidden=[64,32,16]):
        super(NeuMF, self).__init__()

        # GMF embeddings
        self.user_emb_gmf = nn.Embedding(n_users, emb_size_gmf)
        self.item_emb_gmf = nn.Embedding(n_items, emb_size_gmf)

        # MLP embeddings
        self.user_emb_mlp = nn.Embedding(n_users, emb_size_mlp)
        self.item_emb_mlp = nn.Embedding(n_items, emb_size_mlp)

        # MLP layers
        mlp_layers = []
        input_size = emb_size_mlp * 2
        for h in hidden:
            mlp_layers.append(nn.Linear(input_size, h))
            mlp_layers.append(nn.ReLU())
            input_size = h
        self.mlp = nn.Sequential(*mlp_layers)

        # Final prediction layer
        self.output = nn.Linear(emb_size_gmf + hidden[-1], 1)

    def forward(self, users, items):
        gmf_u = self.user_emb_gmf(users)
        gmf_i = self.item_emb_gmf(items)
        gmf = gmf_u * gmf_i

        mlp_u = self.user_emb_mlp(users)
        mlp_i = self.item_emb_mlp(items)
        mlp = self.mlp(torch.cat([mlp_u, mlp_i], dim=1))

        x = torch.cat([gmf, mlp], dim=1)
        return self.output(x).squeeze()

    def predict(self, users, items):
        self.eval()
        with torch.no_grad():
            return self.forward(users, items)


# === Helper: Top-K metrics ===
def metrics_at_k(users, items, ratings, preds, ks=[5,10,20]):
    user_ratings = defaultdict(list)
    for u, i, r, p in zip(users, items, ratings, preds):
        user_ratings[int(u)].append((i, p, r))

    results = {k: {"Precision": [], "Recall": [], "F1": [], "NDCG": [], "HitRate": []} for k in ks}

    for uid, ratings in user_ratings.items():
        ratings.sort(key=lambda x: x[1], reverse=True)
        rel = [r for (_, _, r) in ratings if r >= 4]
        n_rel = len(rel)

        for k in ks:
            top_k = ratings[:k]
            rec = [iid for (iid, _, r) in top_k if r >= 4]
            n_rel_and_rec_k = len(rec)

            precision = n_rel_and_rec_k / k if k > 0 else 0
            recall = n_rel_and_rec_k / n_rel if n_rel > 0 else 0
            f1 = (2*precision*recall / (precision+recall)) if (precision+recall)>0 else 0

            dcg = sum([1/np.log2(idx+2) for idx,(iid,_,r) in enumerate(top_k) if r>=4])
            idcg = sum([1/np.log2(idx+2) for idx in range(min(n_rel, k))])
            ndcg = dcg/idcg if idcg>0 else 0

            hit = 1 if n_rel_and_rec_k>0 else 0

            results[k]["Precision"].append(precision)
            results[k]["Recall"].append(recall)
            results[k]["F1"].append(f1)
            results[k]["NDCG"].append(ndcg)
            results[k]["HitRate"].append(hit)

    return {k: {m: np.mean(vals) for m, vals in metrics.items()} for k, metrics in results.items()}

# === Load ratings data (from df_merged) ===
ratings_df = df_merged[["userId", "movieId", "rating"]].copy()
ratings_df["userId"] -= 1
ratings_df["movieId"] -= 1

n_users = ratings_df["userId"].nunique()
n_items = ratings_df["movieId"].nunique()

# === 5-Fold Cross Validation ===
kf = KFold(n_splits=5, shuffle=True, random_state=42)
results = []

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

for fold, (train_idx, test_idx) in enumerate(kf.split(ratings_df), 1):
    print(f"\n=== Fold {fold} ===")
    train_df = ratings_df.iloc[train_idx]
    test_df = ratings_df.iloc[test_idx]

    train_loader = DataLoader(RatingsDataset(train_df), batch_size=512, shuffle=True)
    test_loader = DataLoader(RatingsDataset(test_df), batch_size=512, shuffle=False)

    # Init new NeuMF each fold
    model = NeuMF(n_users, n_items).to(device)
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    # Training
    epochs = 5
    for epoch in range(epochs):
        model.train()
        train_loss = 0
        for users, items, ratings in train_loader:
            users, items, ratings = users.to(device), items.to(device), ratings.to(device)
            optimizer.zero_grad()
            preds = model(users, items)
            loss = criterion(preds, ratings)
            loss.backward()
            optimizer.step()
            train_loss += loss.item() * len(ratings)
        print(f"  Epoch {epoch+1}/{epochs}, Train RMSE: {np.sqrt(train_loss/len(train_df)):.4f}")

    # Evaluation
    model.eval()
    test_preds, test_truth, test_users, test_items = [], [], [], []
    with torch.no_grad():
        for users, items, ratings in test_loader:
            users, items = users.to(device), items.to(device)
            preds = model(users, items).cpu().numpy()
            test_preds.extend(preds)
            test_truth.extend(ratings.numpy())
            test_users.extend(users.cpu().numpy())
            test_items.extend(items.cpu().numpy())

    rmse = np.sqrt(np.mean((np.array(test_preds) - np.array(test_truth))**2))
    print(f"  Fold {fold} RMSE: {rmse:.4f}")

    # Top-K
    metrics = metrics_at_k(test_users, test_items, test_truth, test_preds, ks=[5,10,20])

    row = {"Fold": fold, "RMSE": rmse}
    for k in [5,10,20]:
        for m,v in metrics[k].items():
            row[m+f"@{k}"] = v
    results.append(row)

# === Aggregate Results ===
results_df = pd.DataFrame(results)
print("\n=== NeuMF 5-Fold CV Results (Per Fold) ===")
print(results_df)

avg_results = results_df.mean(numeric_only=True)
print("\n=== NeuMF 5-Fold CV Results (Averaged) ===")
print(avg_results)



=== Fold 1 ===
  Epoch 1/5, Train RMSE: 2.0333
  Epoch 2/5, Train RMSE: 1.0867
  Epoch 3/5, Train RMSE: 1.0354
  Epoch 4/5, Train RMSE: 1.0034
  Epoch 5/5, Train RMSE: 0.9794
  Fold 1 RMSE: 0.9935

=== Fold 2 ===
  Epoch 1/5, Train RMSE: 2.1229
  Epoch 2/5, Train RMSE: 1.0801
  Epoch 3/5, Train RMSE: 1.0291
  Epoch 4/5, Train RMSE: 0.9982
  Epoch 5/5, Train RMSE: 0.9763
  Fold 2 RMSE: 0.9891

=== Fold 3 ===
  Epoch 1/5, Train RMSE: 2.0012
  Epoch 2/5, Train RMSE: 1.0866
  Epoch 3/5, Train RMSE: 1.0352
  Epoch 4/5, Train RMSE: 1.0029
  Epoch 5/5, Train RMSE: 0.9792
  Fold 3 RMSE: 1.0066

=== Fold 4 ===
  Epoch 1/5, Train RMSE: 2.0889
  Epoch 2/5, Train RMSE: 1.0906
  Epoch 3/5, Train RMSE: 1.0362
  Epoch 4/5, Train RMSE: 1.0032
  Epoch 5/5, Train RMSE: 0.9795
  Fold 4 RMSE: 0.9961

=== Fold 5 ===
  Epoch 1/5, Train RMSE: 2.0578
  Epoch 2/5, Train RMSE: 1.0911
  Epoch 3/5, Train RMSE: 1.0407
  Epoch 4/5, Train RMSE: 1.0084
  Epoch 5/5, Train RMSE: 0.9846
  Fold 5 RMSE: 1.0026

=== NeuMF

In [None]:
!pip install bitsandbytes accelerate transformers langgraph

In [5]:
import os
os.environ["HF_TOKEN"] = "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"

In [None]:
# === Open Source Granite 8B model LLM ===
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig, pipeline
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    llm_int8_enable_fp32_cpu_offload=True
)
try:
    # Granite 8B model name on Hugging Face
    GRANITE_MODEL = "ibm-granite/granite-3.3-8b-instruct"

    # Load the model + tokenizer
    granite_model = AutoModelForCausalLM.from_pretrained(
        GRANITE_MODEL,
        device_map="auto",
        quantization_config=bnb_config  # Optional
    )

    granite_tokenizer = AutoTokenizer.from_pretrained(GRANITE_MODEL)

    # Create pipeline
    granite_pipe = pipeline(
        "text-generation",
        model=granite_model,
        tokenizer=granite_tokenizer,
        pad_token_id=granite_tokenizer.eos_token_id,
        return_full_text=False
    )
except Exception as e:
    print("[Warning] Failed to load LLaMA model:", e)
    granite_pipe = lambda prompt, **kwargs: [{"generated_text": "[granit model unavailable]"}]
granite_pipe = pipeline("text-generation", model=granite_model, tokenizer=granite_tokenizer)



In [8]:
import pandas as pd
from sklearn.model_selection import train_test_split

# Use the ratings dataframe (already loaded from MovieLens or another dataset)
ratings_df = df_merged[["userId", "movieId", "rating"]].copy()

# Train/Test split
train_df, test_df = train_test_split(ratings_df, test_size=0.2, random_state=42)

# Build ratings_history from train_df
ratings_history = (
    train_df.groupby("userId")[["movieId", "rating"]]
    .apply(lambda g: list(zip(g["movieId"], g["rating"])))
    .to_dict()
)

# Build item_pool from all unique movies
item_pool = ratings_df["movieId"].unique().tolist()

print("Data prepared:")
print(f"Train size: {len(train_df)}, Test size: {len(test_df)}")
print(f"Number of users in train: {train_df['userId'].nunique()}")
print(f"Number of items in pool: {len(item_pool)}")


Data prepared:
Train size: 80000, Test size: 20000
Number of users in train: 943
Number of items in pool: 1682


In [9]:
# === AI Agents with LangGraph, using pre-trained SVD++ and MovieLens titles/genres ===
import pandas as pd
from langgraph.graph import StateGraph, END
from typing import TypedDict
import re
from collections import Counter
from itertools import combinations

from transformers.utils import logging
logging.set_verbosity_error()


# 1) Load movies with titles + genres
movies_df = pd.read_csv(
    "/content/ml-100k/ml-100k/u.item",
    sep="|",
    encoding="latin-1",
    header=None,
    names=["movieId", "title", "release_date", "video_release_date", "IMDb_URL",
           "unknown", "Action", "Adventure", "Animation", "Children's", "Comedy",
           "Crime", "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror",
           "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western"]
)

def clean_title(title: str) -> str:
    return re.sub(r"\s*\(\d{4}\)$", "", title)

genre_cols = movies_df.columns[5:]
movies_df["genres"] = movies_df[genre_cols].apply(
    lambda row: ", ".join([g for g, v in row.items() if v == 1]), axis=1
)

# 2) Merge ratings with movie metadata
ratings_full = df_merged[["userId", "movieId", "rating"]].merge(
    movies_df[["movieId", "title", "genres"]],
    on="movieId", how="left"
)

item_pool = ratings_full["movieId"].unique().tolist()
ratings_history = (
    ratings_full.groupby("userId")[["movieId", "rating"]]
    .apply(lambda x: list(zip(x["movieId"], x["rating"])) )
    .to_dict()
)

movie_titles = ratings_full.set_index("movieId")["title"].to_dict()
movie_genres = ratings_full.set_index("movieId")["genres"].to_dict()

def title_of(iid: int) -> str:
    return movie_titles.get(iid, "Unknown title")

def genres_of(iid: int) -> str:
    g = movie_genres.get(iid, "")
    return g if isinstance(g, str) and g else "Unknown genre"

# 3) State definition
class State(TypedDict):
    user_id: int
    request: str
    recommendations: list
    explanation: str
    critique: str

# === Build mapping from original IDs to continuous indices ===
unique_movie_ids = sorted(df_merged["movieId"].unique())
movie2idx = {mid: idx for idx, mid in enumerate(unique_movie_ids)}
idx2movie = {idx: mid for mid, idx in movie2idx.items()}
df_merged["movieId"] = df_merged["movieId"].map(movie2idx)
movies_df["movieId"] = movies_df["movieId"].map(movie2idx)

unique_user_ids = sorted(df_merged["userId"].unique())
user2idx = {uid: idx for idx, uid in enumerate(unique_user_ids)}
idx2user = {idx: uid for uid, idx in user2idx.items()}
df_merged["userId"] = df_merged["userId"].map(user2idx)

# Drop rows with missing values (if any)
df_merged.dropna(subset=["userId", "movieId"], inplace=True)
df_merged["userId"] = df_merged["userId"].astype(int)
df_merged["movieId"] = df_merged["movieId"].astype(int)
movies_df.dropna(subset=["movieId"], inplace=True)
movies_df["movieId"] = movies_df["movieId"].astype(int)

# 4) RecommenderAgent for PyTorch-based NeuMF
class RecommenderAgent:
    def __init__(self, model, movies_df, device, idx2movie=None):
        """
        Initialize the recommendation agent.

        Args:
            model (nn.Module): Trained PyTorch NeuMF model.
            movies_df (pd.DataFrame): Movie metadata with 'movieId', 'title', 'genres'.
            device (torch.device): Device to run the model on ('cuda' or 'cpu').
            idx2movie (dict, optional): Mapping from internal movie indices back to original movieIds.
        """
        self.model = model
        self.movies_df = movies_df
        self.device = device
        self.idx2movie = idx2movie

    def recommend(self, user_id, item_pool, top_k=20):
        """
        Generate top-K movie recommendations for a specific user.

        Args:
            user_id (int): Internal user ID (after mapping to 0-based index).
            item_pool (list[int]): List of candidate internal movie IDs to consider.
            top_k (int): Number of top recommendations to return.

        Returns:
            pd.DataFrame: DataFrame with columns: title, genres, predicted_rating.
        """
        # Validate user and item indices
        max_user_id = self.model.user_emb_gmf.num_embeddings
        max_item_id = self.model.item_emb_gmf.num_embeddings
        assert 0 <= user_id < max_user_id, f"Invalid user_id: {user_id} (max allowed: {max_user_id - 1})"
        assert all(0 <= i < max_item_id for i in item_pool), "item_pool contains invalid movieId values"

        # Convert user and items to tensors
        user_tensor = torch.tensor([user_id] * len(item_pool), dtype=torch.long).to(self.device)
        item_tensor = torch.tensor(item_pool, dtype=torch.long).to(self.device)

        # Generate predictions
        self.model.eval()
        with torch.no_grad():
            preds = self.model(user_tensor, item_tensor).cpu().numpy()

        # Pair items with predictions and sort
        preds = list(zip(item_pool, preds))
        preds = sorted(preds, key=lambda x: x[1], reverse=True)[:top_k]

        # Build result DataFrame
        recs_df = pd.DataFrame(preds, columns=["movieId", "predicted_rating"])
        recs_df = recs_df.merge(self.movies_df, on="movieId", how="left")

        # Optional: map back to original movieId
        if self.idx2movie is not None:
            recs_df["movieId_original"] = recs_df["movieId"].map(self.idx2movie)

        return recs_df[["title", "genres", "predicted_rating"]]

    def run(self, state, item_pool, top_k=20):
        """
        Run recommendation for a given user and update the state dictionary.

        Args:
            state (dict): Dictionary containing at least the key 'user_id'.
            item_pool (list[int]): List of internal movie IDs to recommend from.
            top_k (int): Number of top recommendations to generate.

        Returns:
            dict: Updated state with a 'recommendations' DataFrame.
        """
        user_id = state["user_id"]
        recs = self.recommend(user_id, item_pool, top_k)
        state["recommendations"] = recs

        print("--- Top-{} recommendations before re-ranking ---".format(top_k))
        for i, row in enumerate(recs.itertuples(), 1):
            print(f"{i}. {row.title} ({row.genres}) → pred {row.predicted_rating:.2f}")

        return state



# 5) CritiqueAgent
def wrap_text(text, words_per_line=10):
    words = text.split()
    lines = [" ".join(words[i:i+words_per_line]) for i in range(0, len(words), words_per_line)]
    return "\n".join(lines)

class CritiqueAgent:
    def __init__(self, pipe, words_per_line=10):
        self.pipe = pipe
        self.words_per_line = words_per_line

    def run(self, state):
        recs = state["recommendations"]
        recs_str = "; ".join(
            f"{row['title']} ({row['genres']}) pred {row['predicted_rating']:.2f}"
            for _, row in recs.iterrows()
        )
        prompt = (
            "You are a movie recommendation critic. Based on the following recommended list, "
            "write a brief critique of possible weaknesses in the recommendation strategy. "
            "Focus on thematic richness, emotional depth, genre variety (e.g. missing musical, children's, animation), "
            "and suggest one possible improvement in no more than 3 lines.\n\n"
            f"Recommendations: {recs_str}\n\n"
            "Critique:"
        )
        out = self.pipe(prompt, max_new_tokens=100, temperature=0.6)[0]["generated_text"]
        if "Critique:" in out:
            out = out.split("Critique:")[-1].strip()
        return {"critique": wrap_text(out, self.words_per_line)}

# 6) DiversityRerankerAgent
class DiversityRerankerAgent:
    def __init__(self, top_k=5, diversity_weight=1.0):
        self.top_k = top_k
        self.diversity_weight = diversity_weight

    def compute_genre_bonus(self, selected_genres, candidate_genres):
        """Calculate how many new genres this candidate adds."""
        candidate_set = set(candidate_genres.split(", "))
        return len(candidate_set - selected_genres)

    def run(self, state):
        recs = state["recommendations"].copy()
        if len(recs) <= self.top_k:
            return state

        # Select top_k movies incrementally to maximize genre diversity + rating
        selected = []
        selected_genres = set()
        remaining = recs.copy()

        while len(selected) < self.top_k and not remaining.empty:
            def score(row):
                genre_bonus = self.compute_genre_bonus(selected_genres, row["genres"])
                return (1 - self.diversity_weight) * row["predicted_rating"] + self.diversity_weight * genre_bonus

            remaining["score"] = remaining.apply(score, axis=1)
            best = remaining.sort_values(by="score", ascending=False).iloc[0]
            selected.append(best)
            selected_genres.update(best["genres"].split(", "))
            remaining = remaining.drop(best.name)

        state["recommendations"] = pd.DataFrame(selected)[["title", "genres", "predicted_rating"]].reset_index(drop=True)
        return state


# 7) Graph builder
def build_graph(recommender, diversity_reranker, critique_agent, item_pool):
    graph = StateGraph(State)
    graph.add_node("Recommender", lambda s: recommender.run(s, item_pool, top_k=20))
    graph.add_node("DiversityReranker", lambda s: diversity_reranker.run(s))
    graph.add_node("Critique", lambda s: critique_agent.run(s))

    graph.add_edge("Recommender", "DiversityReranker")
    graph.add_edge("DiversityReranker", "Critique")
    graph.add_edge("Critique", END)

    graph.set_entry_point("Recommender")
    return graph

# 8) History printer
def print_user_history(user_id: int, ratings_df: pd.DataFrame, max_items=10):
    print("\n--- User History ---")
    df = ratings_df[ratings_df["userId"] == user_id][["title", "genres", "rating"]].copy()
    df["title"] = df["title"].fillna("Unknown title").apply(clean_title)
    df = df.sort_values(by=["rating", "title"], ascending=[False, True]).head(max_items)
    print(df.to_string(index=False))

# 9) Initialize agents
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
#recommender = RecommenderAgent(model, movies_df, device)
recommender = RecommenderAgent(model, movies_df, device, idx2movie=idx2movie)


diversity_reranker = DiversityRerankerAgent(top_k=5, diversity_weight=1.5)
critique_agent = CritiqueAgent(granite_pipe, words_per_line=10)

graph = build_graph(recommender, diversity_reranker, critique_agent, item_pool)
app = graph.compile()

# 10) Run example
sample_users = ratings_full["userId"].drop_duplicates().sample(n=10, random_state=42)
for uid in sample_users:
    print_user_history(uid, ratings_full)
    result = app.invoke({"user_id": int(uid), "request": "Recommend me movies"})

    print("\n=== Dialogue Agent Output ===")
    print(f"User: {uid}")
    print("\nCritique:")
    print(result["critique"])
    print("\nDiversityReranker:")
    print(result["recommendations"])



--- User History ---
                         title                      genres  rating
                      Anaconda Action, Adventure, Thriller       5
         Browning Version, The                       Drama       5
     In the Name of the Father                       Drama       5
                   Kansas City                       Crime       5
                  My Fair Lady            Musical, Romance       5
             Santa Clause, The          Children's, Comedy       5
      Sex, Lies, and Videotape                       Drama       5
                  12 Angry Men                       Drama       4
                Absolute Power           Mystery, Thriller       4
Ace Ventura: When Nature Calls                      Comedy       4
--- Top-20 recommendations before re-ranking ---
1. Quartier Mozart (1992) (Comedy) → pred 4.55
2. Tin Cup (1996) (Comedy, Romance) → pred 4.32
3. Monty Python and the Holy Grail (1974) (Comedy) → pred 4.30
4. Spy Hard (1996) (Comedy) → pred