### 12 · Inference: Recommend Top-N Items for a User  
To demonstrate inference, generate top-10 item recommendations for a randomly selected user.

First, reload the original `ratings.csv` and rebuild the user/item ID mappings used during training. Then, load the latest model checkpoint and restore the trained embedding weights. If you trained the model with DDP, strip the `'module.'` prefix from checkpoint keys.
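
The prefix-stripping step can be sketched on its own. This is a minimal illustration, not the tutorial's exact code: `strip_ddp_prefix` is a hypothetical helper, and the string values stand in for real weight tensors. It removes `module.` only when it appears at the start of a key, so a key that happens to contain `module.` elsewhere is left untouched.

```python
from collections import OrderedDict


def strip_ddp_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading `prefix` removed from each key.

    Hypothetical helper for illustration; keys without the prefix are unchanged.
    """
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


# Placeholder strings stand in for the real embedding tensors.
ddp_state = OrderedDict([
    ("module.user_embedding.weight", "user-weights"),
    ("module.item_embedding.weight", "item-weights"),
])

clean_state = strip_ddp_prefix(ddp_state)
print(list(clean_state))
```

Because the helper only looks at key names, it works equally well on a checkpoint saved without DDP, where it returns the keys unchanged.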

Next, select a user, compute their embedding, and take the dot product against all item embeddings to produce predicted scores. Finally, extract the top-N items with the highest scores and print their IDs and associated scores.
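
To make the tensor shapes concrete, here is a toy version of the scoring step that you can check by hand. The embedding values are made up for illustration (one user, four items, two latent dimensions) and are not taken from the trained model.

```python
import torch

# Hypothetical toy embeddings: 1 user and 4 items in a 2-dimensional latent space.
user_vector = torch.tensor([[1.0, 0.0]])                        # [1, D]
item_vectors = torch.tensor([
    [0.5, 0.1],
    [2.0, -1.0],
    [-1.0, 3.0],
    [1.5, 0.2],
])                                                              # [num_items, D]

# One dot product per item: [1, D] @ [D, num_items] -> [1, num_items] -> [num_items]
scores = torch.matmul(user_vector, item_vectors.T).squeeze(0)

# topk returns the k largest scores and their positions, sorted in descending order.
topk = torch.topk(scores, k=2)

print(scores.tolist())         # [0.5, 2.0, -1.0, 1.5]
print(topk.indices.tolist())   # [1, 3]
```

The indices returned by `torch.topk` are positions in the embedding table, which is why the full cell maps them back to original item IDs through `idx2item`.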

In [None]:
# 12. Inference: Recommend Top-N Items for a User

# ---------------------------------------------
# Step 1: Reload original ratings CSV + mappings
# ---------------------------------------------
df = pd.read_csv("/mnt/cluster_storage/rec_sys_tutorial/raw/ratings.csv")

# Recompute ID mappings (same as during preprocessing)
unique_users = sorted(df["user_id"].unique())
unique_items = sorted(df["item_id"].unique())

user2idx = {uid: j for j, uid in enumerate(unique_users)}
item2idx = {iid: j for j, iid in enumerate(unique_items)}
idx2item = {v: k for k, v in item2idx.items()}

# ---------------------------------------------
# Step 2: Load model from checkpoint
# ---------------------------------------------
model = MatrixFactorizationModel(
    num_users=len(user2idx),
    num_items=len(item2idx),
    embedding_dim=train_config["embedding_dim"]
)

with result.checkpoint.as_directory() as ckpt_dir:
    state_dict = torch.load(os.path.join(ckpt_dir, "model.pt"), map_location="cpu")

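    # Strip the leading 'module.' prefix that DDP adds to every key.
    # Only the start of the key is touched, so other occurrences stay intact.
    if any(k.startswith("module.") for k in state_dict):
        state_dict = {
            (k[len("module."):] if k.startswith("module.") else k): v
            for k, v in state_dict.items()
        }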

    model.load_state_dict(state_dict)

model.eval()

# ---------------------------------------------
# Step 3: Select a user and generate recommendations
# ---------------------------------------------
# Pick the user of a randomly sampled rating row
# (users with more ratings are more likely to be chosen)
original_user_id = df["user_id"].sample(1).iloc[0]
user_idx = user2idx[original_user_id]

print(f"Generating recommendations for user_id={original_user_id} (internal idx={user_idx})")

# Compute scores for all items for this user
with torch.no_grad():
    user_vector = model.user_embedding(torch.tensor([user_idx]))           # [1, D]
    item_vectors = model.item_embedding.weight                             # [num_items, D]
    scores = torch.matmul(user_vector, item_vectors.T).squeeze(0)          # [num_items]

    topk = torch.topk(scores, k=10)
    top_item_ids = [idx2item[j.item()] for j in topk.indices]
    top_scores = topk.values.tolist()

# ---------------------------------------------
# Step 4: Print Top-N Recommendations
# ---------------------------------------------
print("\nTop 10 Recommended Item IDs:")
for i, (item_id, score) in enumerate(zip(top_item_ids, top_scores), 1):
    print(f"{i:2d}. Item ID: {item_id} | Score: {score:.2f}")

### 13 · Join Top-N Item IDs with Movie Titles  
To make your recommendations more interpretable, join the top-10 recommended `item_id`s with movie titles from the original `u.item` metadata file.

Load only the relevant columns—`item_id` and `title`—from `u.item`, then merge them with the top-N predictions you computed in the previous step. The result is a user-friendly list of movie titles with associated predicted scores, rather than raw item IDs.

This small addition makes the model outputs easier to understand and more useful for downstream applications.
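
The join can be tried on toy data first. The sketch below uses made-up IDs and placeholder titles rather than real `u.item` rows, and the `<unknown title>` fallback is an assumption added for illustration. A left merge keeps the rows of the left frame in their original order, so the ranking survives the join, and any item without metadata gets a missing title instead of being dropped.

```python
import pandas as pd

# Hypothetical stand-ins for the real top-N predictions and the u.item metadata.
top_items_df = pd.DataFrame({
    "item_id": [50, 7, 999],
    "score": [4.8, 4.5, 4.1],
})
item_metadata = pd.DataFrame({
    "item_id": [7, 50],
    "title": ["Movie B", "Movie A"],
})

# how="left" keeps every recommended item, in ranking order, even without a match.
merged = top_items_df.merge(item_metadata, on="item_id", how="left")

# Item 999 has no metadata row, so its title is NaN; replace it with a placeholder.
merged["title"] = merged["title"].fillna("<unknown title>")

print(merged)
```

Checking for missing titles after the merge is a cheap way to catch a mismatch between the ID mappings and the metadata file.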

In [None]:
# 13. Join Top-N Item IDs with Movie Titles from u.item

item_metadata = pd.read_csv(
    "/mnt/cluster_storage/rec_sys_tutorial/ml-100k/u.item",
    sep="|",
    encoding="latin-1",
    header=None,
    usecols=[0, 1],  # Only item_id and title
    names=["item_id", "title"]
)

# Join with top-N items
top_items_df = pd.DataFrame({
    "item_id": top_item_ids,
    "score": top_scores
})

merged = top_items_df.merge(item_metadata, on="item_id", how="left")

print("\nTop 10 Recommended Movies:")
for j, row in merged.iterrows():
    print(f"{j+1:2d}. {row['title']} | Score: {row['score']:.2f}")

### 14 · Cleanup Shared Storage  
Reclaim cluster disk space by deleting the entire tutorial directory, which holds the raw `ratings.csv` and `ml-100k` files as well as the training outputs.  
Run this only when you’re **sure** you don’t need the data, checkpoints, or metrics anymore.
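
Before deleting, it can help to see how much space the directory occupies. The following is a minimal sketch, assuming a hypothetical helper named `directory_size_bytes`; it runs against a throwaway temporary directory so that it is safe to execute anywhere, and you would point it at the tutorial path yourself.

```python
import os
import shutil
import tempfile


def directory_size_bytes(path):
    """Sum the sizes of all regular files under `path` (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


# Demo on a temporary directory containing one fake 1 KiB checkpoint file.
demo_dir = tempfile.mkdtemp(prefix="rec_sys_cleanup_demo_")
with open(os.path.join(demo_dir, "model.pt"), "wb") as f:
    f.write(b"\0" * 1024)

size_before = directory_size_bytes(demo_dir)
print(f"About to reclaim {size_before} bytes from {demo_dir}")

shutil.rmtree(demo_dir)
print(f"Still exists: {os.path.exists(demo_dir)}")
```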

In [None]:
# 14. Cleanup -- delete the tutorial directory (raw data, checkpoints, and metrics)

TARGET_PATH = "/mnt/cluster_storage/rec_sys_tutorial"

if os.path.exists(TARGET_PATH):
    shutil.rmtree(TARGET_PATH)
    print(f"✅ Deleted everything under {TARGET_PATH}")
else:
    print(f"⚠️ Path does not exist: {TARGET_PATH}")

### 🎉 Wrapping Up & Next Steps

Awesome work making it to the end. In this tutorial, you used **Ray Train and Ray Data on Anyscale** to scale a full matrix factorization recommendation system, end-to-end, from a raw CSV to multi-GPU distributed training and personalized top-N item recommendations.

You should now feel confident:

* Using **Ray Data** to preprocess, encode, and shard large tabular datasets  
* Streaming data into PyTorch with `iter_torch_batches()` for efficient training  
* Scaling matrix factorization across multiple GPUs with **Ray Train’s `TorchTrainer`**  
* Saving and resuming training with distributed **Ray Checkpoints**  
* Running multi-node, fault-tolerant jobs without touching orchestration code  
* Performing post-training inference using Ray-restored model checkpoints and learned user/item embeddings

---

### 🚀 Where can you take this next?

Here are a few directions you can explore to extend or adapt this workload:

1. **Ranking Metrics & Evaluation**  
   * Add **Root Mean Squared Error (RMSE)** to measure rating-prediction error, and ranking metrics like **Normalized Discounted Cumulative Gain (NDCG)** or **Hit@K** to evaluate recommendation quality.  
   * Filter out already-rated items during inference so the top-N list contains only items the user has not seen.

2. **Two-Tower and Deep Models**  
   * Replace the dot product with a **two-tower neural model** or a **deep MLP**.  
   * Add side features (for example, timestamp, genre) into each tower for better personalization.

3. **Recommendation Personalization**  
   * Store and cache user embeddings after training.  
   * Run lightweight inference tasks to generate recommendations in real-time.

4. **Content-Based or Hybrid Models**  
   * Join movie metadata (genres, tags) and build a hybrid collaborative–content model.  
   * Embed titles or genres using pre-trained language models.

5. **Hyperparameter Optimization**  
   * Use **Ray Tune** to sweep embedding sizes, learning rates, or regularization.  
   * Track performance over epochs and checkpoint the best models automatically.

6. **Data Scaling**  
   * Switch from MovieLens 100K to 1M or 10M; the same Ray Data pipeline should carry over with little change.  
   * Save and load from cloud object storage (S3, GCS) for real-world deployments.

7. **Production Inference**  
   * Wrap the recommendation system into a **Ray Serve** endpoint for serving top-N results.  
   * Build a simple demo that recommends movies to live users.

8. **End-to-End MLOps**  
   * Register the best model with MLflow or Weights & Biases.  
   * Package the training job as a Ray Job and schedule it with Anyscale.

9. **Multi-Tenant Recommendation Systems**  
   * Extend this to support **multiple audiences** or contexts (for example, multi-country, A/B groups).  
   * Train and serve context-aware models in parallel using Ray.
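
As a starting point for the first direction above, here is a minimal sketch of two ranking metrics in plain Python. It assumes binary relevance (an item is either in the user's held-out set or not), and the function names and toy IDs are illustrative rather than part of the tutorial's code.

```python
import math


def hit_at_k(ranked_items, relevant_items, k):
    """Return 1.0 if any of the top-k ranked items is relevant, else 0.0."""
    return float(any(item in relevant_items for item in ranked_items[:k]))


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, item in enumerate(ranked_items[:k], start=1)
        if item in relevant_items
    )
    # The ideal ranking places every relevant item at the top of the list.
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Toy example: model ranking for one user and that user's held-out items.
ranked = [10, 42, 7, 3, 99]
held_out = {7, 99}

print(hit_at_k(ranked, held_out, k=3))   # 1.0, because item 7 is at rank 3
print(ndcg_at_k(ranked, held_out, k=5))
```

Averaging these per-user values over a held-out split gives a dataset-level score. The ranked list would come from the top-N step in section 12, ideally after removing items the user already rated.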

This pattern gives you a solid foundation for scaling recommendation workloads across real datasets and real infrastructure—without rewriting your model or managing your cluster. Let Ray + Anyscale do the heavy lifting.