# Baselines Notebook

This notebook mirrors `src/baselines/run_baselines.py`. It runs the **content-based** (untrained SBERT) and **collaborative filtering** (item-item) baselines on the same eval set and reports the same IR metrics (Accuracy@1, Accuracy@10, Recall@10, MRR@10, NDCG@10, MAP@100), so the results can be compared directly with the trained two-tower SBERT model.

- **Content-based:** Uses the same base model as training (e.g. `all-MiniLM-L6-v2`) with no fine-tuning; ranks products by cosine similarity between query and product embeddings.
- **CF (item-item):** Counts how often products co-occur in the same order in `order_products__prior.csv`, then scores each candidate by its summed co-occurrence with the items in the user's prior basket. Progress bars track loading, the co-occurrence build, eval history, and ranking.
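
The item-item scoring described above can be illustrated with a minimal sketch. It is not the code in `src/baselines/collaborative_filtering.py`: the function names `build_cooccurrence` and `rank_candidates`, the toy orders, and the tie-breaking rule are assumptions made for illustration, and the real baseline may normalize counts or filter candidates differently.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(orders):
    """Count how often each pair of products appears in the same order."""
    cooc = defaultdict(Counter)
    for products in orders:
        # Deduplicate within an order so a repeated product counts once.
        for a, b in combinations(sorted(set(products)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def rank_candidates(cooc, basket, candidates):
    """Rank candidates by summed co-occurrence with the items in the basket."""
    scores = {c: sum(cooc[item][c] for item in basket) for c in candidates}
    # Highest score first; ties broken by candidate id (an assumption).
    return sorted(candidates, key=lambda c: (-scores[c], c))


# Toy orders with hypothetical product ids, not taken from the dataset.
orders = [[1, 2, 3], [1, 2], [2, 4], [1, 3]]
cooc = build_cooccurrence(orders)
cf_toy_ranking = rank_candidates(cooc, basket=[1, 2], candidates=[4, 3])
print(cf_toy_ranking)
```

Here product 3 co-occurs twice with product 1 and once with product 2 (score 3), while product 4 co-occurs only once with product 2 (score 1), so product 3 ranks first.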

Adjust `processed_dir` and `data_dir` below if needed. Set `run_content_based` / `run_cf` to `False` to skip a baseline.

## Setup: paths and options

In [None]:
from pathlib import Path

from src.constants import DEFAULT_DATA_DIR, DEFAULT_PROCESSED_DIR
from src.utils import resolve_processed_dir

# Resolve to param subdir (e.g. processed/p5_mp20_ef0.1) when needed
processed_dir, msg = resolve_processed_dir(DEFAULT_PROCESSED_DIR, DEFAULT_PROCESSED_DIR)
data_dir = DEFAULT_DATA_DIR

# Which baselines to run (set False to skip)
run_content_based = True
run_cf = True

# Content-based: same base model as training, untrained
model_name = "sentence-transformers/all-MiniLM-L6-v2"

if msg:
    print(msg)
print("Processed dir:", processed_dir)
print("Data dir:", data_dir)

## Load eval data

In [None]:
from src.baselines.collaborative_filtering import load_eval_data

eval_queries, eval_corpus, eval_relevant_docs = load_eval_data(processed_dir)
print(f"Eval queries: {len(eval_queries)}, corpus size: {len(eval_corpus)}")

## Content-based baseline (untrained SBERT)
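
As a rough illustration of ranking by cosine similarity, the sketch below uses plain Python and toy 2-d vectors in place of real SBERT embeddings. The helper names `cosine` and `rank_by_cosine` are hypothetical and are not part of `ContentBasedBaseline`, whose internals may differ (for example, it may batch and normalize embeddings).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Guard against zero vectors, which have no defined direction.
    return dot / norm if norm else 0.0


def rank_by_cosine(query_vec, doc_vecs):
    """Return doc ids sorted from most to least similar to the query."""
    return sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )


# Toy vectors standing in for embeddings (hypothetical values).
query_vec = [1.0, 0.0]
doc_vecs = {"p1": [0.0, 1.0], "p2": [2.0, 0.1], "p3": [1.0, 1.0]}
cosine_ranking = rank_by_cosine(query_vec, doc_vecs)
print(cosine_ranking)
```

Because cosine similarity ignores vector length, `p2` ranks first even though its norm differs from the query's; only the angle between the vectors matters.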

In [None]:
if run_content_based:
    from src.baselines.content_based import ContentBasedBaseline
    from src.baselines.metrics import compute_ir_metrics

    print("Building content-based (untrained SBERT) baseline...")
    cb = ContentBasedBaseline(eval_queries, eval_corpus, model_name=model_name)
    cb_rankings = cb.rank_all()
    cb_metrics = compute_ir_metrics(cb_rankings, eval_relevant_docs)

    print("\n--- Content-based (untrained SBERT) ---")
    print(f"  Accuracy@1:   {cb_metrics['accuracy_at_1']:.4f}")
    print(f"  Accuracy@10:  {cb_metrics['accuracy_at_10']:.4f}")
    print(f"  Recall@10:    {cb_metrics['recall_at_10']:.4f}")
    print(f"  MRR@10:       {cb_metrics['mrr_at_10']:.4f}")
    print(f"  NDCG@10:      {cb_metrics['ndcg_at_10']:.4f}")
    print(f"  MAP@100:      {cb_metrics['map_at_100']:.4f}")
else:
    print("Skipped content-based baseline (run_content_based=False).")

## Collaborative filtering baseline (item-item)

In [None]:
if run_cf:
    from src.baselines.collaborative_filtering import ItemItemCFBaseline
    from src.baselines.metrics import compute_ir_metrics

    print("Building collaborative filtering (item-item) baseline...")
    cf = ItemItemCFBaseline(data_dir, processed_dir)
    cf_rankings = cf.rank_all(eval_query_ids=list(eval_queries.keys()))
    cf_metrics = compute_ir_metrics(cf_rankings, eval_relevant_docs)

    print("\n--- Collaborative filtering (item-item) ---")
    print(f"  Accuracy@1:   {cf_metrics['accuracy_at_1']:.4f}")
    print(f"  Accuracy@10:  {cf_metrics['accuracy_at_10']:.4f}")
    print(f"  Recall@10:    {cf_metrics['recall_at_10']:.4f}")
    print(f"  MRR@10:       {cf_metrics['mrr_at_10']:.4f}")
    print(f"  NDCG@10:      {cf_metrics['ndcg_at_10']:.4f}")
    print(f"  MAP@100:      {cf_metrics['map_at_100']:.4f}")
else:
    print("Skipped CF baseline (run_cf=False).")

## Compare with SBERT

After 4–5 epochs, the trained two-tower SBERT typically reaches **Accuracy@10 ~0.54**, **Recall@10 ~0.13**, **NDCG@10 ~0.15**. See the README for the full metrics table.
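
When reading these numbers, it may help to recall how the per-query metrics are usually defined. The sketch below follows the standard IR definitions; it is an assumption that `compute_ir_metrics` uses the same ones, and the function names and toy ids here are hypothetical.

```python
def accuracy_at_k(ranking, relevant, k):
    """1.0 if any relevant doc appears in the top k, else 0.0."""
    return float(any(doc in relevant for doc in ranking[:k]))


def recall_at_k(ranking, relevant, k):
    """Fraction of the relevant docs that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(doc in relevant for doc in ranking[:k]) / len(relevant)


def mrr_at_k(ranking, relevant, k):
    """Reciprocal rank of the first relevant doc in the top k (0.0 if none)."""
    for rank, doc in enumerate(ranking[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


# Toy query with hypothetical ids: four relevant products, two retrieved.
toy_ranking = ["p7", "p2", "p9", "p4"]
toy_relevant = {"p2", "p4", "p5", "p6"}
print(accuracy_at_k(toy_ranking, toy_relevant, 1))
print(accuracy_at_k(toy_ranking, toy_relevant, 10))
print(recall_at_k(toy_ranking, toy_relevant, 10))
print(mrr_at_k(toy_ranking, toy_relevant, 10))
```

Under these definitions, Accuracy@10 needs only one hit per query, whereas Recall@10 divides the hits by the number of relevant products, so Recall@10 can sit well below Accuracy@10 when queries have many relevant items.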