# Production Training (Temporal Split)

This notebook implements a complete "Production" pipeline, simulating a real-world deployment scenario for a two-stage recommender system.

**Pipeline Steps:**
1. **Data Processing:** Load MovieLens data and split temporally (train on past, test on future).
2. **Retriever Training:** Train a Two-Tower model to retrieve candidates.
3. **Hard Negative Mining:** Use the trained retriever to find "hard" negatives (items the retriever scores highly but that are not true positives) for the ranker.
4. **Ranker Training:** Train a Cross-Encoder model to re-rank the candidates.
5. **Evaluation:** Evaluate the full pipeline on the held-out test set (last 5 interactions per user).

**Key Characteristics:**
* **Rating Threshold:** Only interactions with ratings >= 3.5 are considered positive.
* **Real-world Simulation:** Strict temporal split avoids data leakage.
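
As a rough illustration of the two characteristics above, the sketch below keeps only positive interactions and holds out each user's most recent ones. It is a standalone toy in plain Python, not the `nanoRecSys` implementation: the helper name `temporal_split`, the tuple layout, and the choice to filter before splitting are all assumptions made for illustration.

```python
from collections import defaultdict

POSITIVE_THRESHOLD = 3.5  # ratings >= 3.5 count as positive (from the text above)
HOLDOUT_SIZE = 5          # last 5 interactions per user form the test set


def temporal_split(interactions, threshold=POSITIVE_THRESHOLD, holdout=HOLDOUT_SIZE):
    """Split (user, item, rating, timestamp) tuples into train and test pairs.

    Hypothetical helper: filters to positives first, then holds out each
    user's most recent `holdout` interactions. The real pipeline may order
    these steps differently.
    """
    per_user = defaultdict(list)
    for user, item, rating, ts in interactions:
        if rating >= threshold:
            per_user[user].append((ts, item))

    train, test = [], []
    for user, events in per_user.items():
        events.sort()  # oldest first
        cut = max(len(events) - holdout, 0)
        train += [(user, item) for _, item in events[:cut]]
        test += [(user, item) for _, item in events[cut:]]
    return train, test


# Toy data: one user with 7 positive ratings and 1 low rating.
toy = [("u1", f"m{i}", 4.0, i) for i in range(7)] + [("u1", "m_low", 2.0, 99)]
train, test = temporal_split(toy)
```

Because every test interaction for a user is later than all of that user's training interactions, the model never sees a user's future behaviour during training. In this toy version, users with five or fewer positives end up with no training data, which a real pipeline would need to handle.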


## Setup (Colab)
Run the following cell to install the package and its dependencies when running in Google Colab.
If running locally, make sure the package is installed via `pip install -e .` instead.

In [None]:
!git clone https://github.com/zheliu17/nanoRecSys.git
%pip install -q -e ./nanoRecSys

import psutil  # noqa: F401

# psutil is not actually needed here; force-reinstalling an already-imported
# package is a trick to make Colab prompt for a runtime restart.
%pip install --force-reinstall psutil=={psutil.__version__}
print("Installation complete. Please restart runtime...")

In [None]:
import wandb
import nanoRecSys.data.build_dataset
import nanoRecSys.data.splits

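# Authenticate with Weights & Biases (prompts for an API key if needed).
wandb.login()

# Step 1: process the MovieLens data, then create the per-user temporal split.
nanoRecSys.data.build_dataset.process_data()
nanoRecSys.data.splits.create_user_time_split()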

In [None]:
import nanoRecSys.train


# Retriever Training Arguments
class Args:
    mode = "retriever"
    epochs = 5
    batch_size = 4096
    lr = 4e-2
    num_workers = 2
    build_embeddings = True  # Build embeddings for fast retrieval after training


nanoRecSys.train.main(Args)

## Negative Mining & Ranker Training
We use the trained retriever to mine hard negatives: items that the retriever scores highly but that are not ground-truth interactions. The Ranker is then trained to distinguish the positive item from these hard negatives.
The Ranker is a Cross-Encoder that takes both user and item features as input.
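
The mining idea described above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the code behind `nanoRecSys.training.mine_negatives`: the function `mine_hard_negatives`, the dot-product scoring, and in particular the interpretation of `skip_top` (dropping that many top-ranked items before collecting negatives) are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mine_hard_negatives(user_vec, item_vecs, positives, top_k=15, skip_top=0):
    """Return up to `top_k` high-scoring items that are not known positives.

    Assumed semantics: rank all items by retriever score, drop the first
    `skip_top` ranked items, then keep the best remaining non-positives.
    """
    scores = {item: dot(user_vec, vec) for item, vec in item_vecs.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    candidates = ranked[skip_top:]
    return [item for item in candidates if item not in positives][:top_k]


# Toy embeddings: item "a" is the user's true positive.
item_vecs = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.5, 0.5],
    "d": [0.0, 1.0],
}
user_vec = [1.0, 0.0]
hard_negatives = mine_hard_negatives(user_vec, item_vecs, positives={"a"}, top_k=2)
```

Here the mined negatives are the items closest to the user in embedding space that the user did not interact with, which is what makes them harder for the ranker than randomly sampled items.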

In [None]:
import nanoRecSys.training.mine_negatives

# Mine hard negatives from the trained retriever's top-scoring items.
nanoRecSys.training.mine_negatives.main(batch_size=1024, top_k=15, skip_top=0)


# Ranker Training Arguments (this redefines the `Args` class used for the retriever)
class Args:
    mode = "ranker"
    epochs = 1
    limit_train_batches = 0.5
    batch_size = 2048
    explicit_neg_weight = 4
    random_neg_ratio = 0.01

    lr = 1e-3
    item_lr = 0
    num_workers = 2


nanoRecSys.train.main(Args)

## Evaluation
We evaluate the full pipeline (Retriever + Ranker) on the held-out test set (last 5 interactions per user).
First, we look at the **Popularity Baseline** to establish a simple reference point that the learned models should beat.
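
For intuition, a popularity baseline can be sketched as "recommend the most frequently interacted items to everyone". The snippet below is a simplified standalone illustration and is not what `eval_popularity()` necessarily does internally; the helper names `popularity_top_k` and `recall_at_k` are hypothetical, and the sketch does not filter out items a user has already seen.

```python
from collections import Counter


def popularity_top_k(train_pairs, k):
    """Rank items by how many training interactions they received."""
    counts = Counter(item for _, item in train_pairs)
    return [item for item, _ in counts.most_common(k)]


def recall_at_k(recommended, held_out):
    """Fraction of a user's held-out items that appear in the recommendations."""
    if not held_out:
        return 0.0
    hits = len(set(recommended) & set(held_out))
    return hits / len(held_out)


# Toy training interactions as (user, item) pairs.
train_pairs = [
    ("u1", "a"), ("u2", "a"), ("u3", "a"),
    ("u1", "b"), ("u2", "b"),
    ("u3", "c"),
]
top_items = popularity_top_k(train_pairs, k=2)
score = recall_at_k(top_items, held_out=["b", "d"])
```

Since the same list is served to every user, any personalised model that cannot beat this score is adding little value.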

In [None]:
from nanoRecSys.eval.offline_eval import OfflineEvaluator

evaluator = OfflineEvaluator(1024)
results = evaluator.eval_popularity()

df = evaluator.formatted_results(results)
df

Now we evaluate the full **Ranker** model. Re-ranking is expected to improve ranking metrics (NDCG, Recall) over pure retrieval and over the popularity baseline.
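
To see why re-ranking can help even when the candidate set is unchanged, consider a minimal binary-relevance NDCG@k. This is a generic textbook-style sketch, not the metric code used by `OfflineEvaluator`; the function name `ndcg_at_k` and the toy item lists are assumptions.

```python
import math


def ndcg_at_k(ranked_items, relevant, k):
    """Binary-relevance NDCG@k for a single user."""
    # Discounted gain: a hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal ordering places all relevant items (up to k) at the top.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


relevant = {"x"}
before = ndcg_at_k(["a", "b", "x"], relevant, k=3)  # relevant item ranked third
after = ndcg_at_k(["x", "a", "b"], relevant, k=3)   # relevant item moved to the top
```

Both lists contain the same three candidates, so recall over the full list is identical, but NDCG rewards the ordering that places the relevant item first.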

In [None]:
results = evaluator.eval_ranker()

df = evaluator.formatted_results(results)
df