# Project Overview

This notebook documents the high-level plan for the project:

- Pretrain a session-based recommender on large public datasets:
  - YOOCHOOSE (RecSys 2015)
  - Amazon Books (sequential interactions)

- Transfer / fine-tune the pretrained model on:
  - MARS MOOC dataset

- Target:
  - Cold-start and data-scarce session-based recommendation in MOOCs.


In [1]:
import sys
import platform

print("Python version:", sys.version)
print("Platform:", platform.platform())


Python version: 3.11.14 | packaged by Anaconda, Inc. | (main, Oct 21 2025, 18:30:03) [MSC v.1929 64 bit (AMD64)]
Platform: Windows-10-10.0.22621-SP0


In [2]:
import pandas as pd
import numpy as np
import torch

print("pandas:", pd.__version__)
print("numpy:", np.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())


pandas: 2.3.3
numpy: 2.3.5
torch: 2.9.1+cpu
CUDA available: False


## Next Steps

1. **Dataset EDA & Schema Understanding**
   - `01_eda_yoochoose.ipynb`: Load and inspect YOOCHOOSE.
   - `02_eda_amazon_books.ipynb`: Load and inspect Amazon Books.
   - `03_eda_mars.ipynb`: Load and inspect MARS.

2. **Sessionization & Sequence Building**
   - Convert raw interactions to session-based sequences:
     - `(user_id, session_id, [item_1, item_2, ..., item_T], timestamps)`

3. **Pretraining**
   - Implement a simple SASRec-like model.
   - Pretrain on YOOCHOOSE + Amazon Books.

4. **Fine-tuning on MARS**
   - Adapt item vocabulary & embeddings.
   - Fine-tune the model on MARS sessions.
   - Evaluate next-item prediction with Hit@K and NDCG@K.

5. **(Later) Meta-Learning Extension**
   - Treat datasets / categories as meta-tasks.
   - MAML-like adaptation to the MOOC domain.
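
The sessionization step in item 2 could look roughly like the sketch below. It assumes a flat interaction log with hypothetical columns `user_id`, `item_id`, and `timestamp`, and it splits sessions on a 30-minute inactivity gap. Both the column names and the gap threshold are assumptions for illustration, not choices fixed by this plan; datasets that already ship session identifiers (YOOCHOOSE does) would not need the gap heuristic.

```python
import pandas as pd


def sessionize(interactions: pd.DataFrame, gap_minutes: int = 30) -> pd.DataFrame:
    """Group per-user interactions into sessions split by inactivity gaps.

    Expects columns `user_id`, `item_id`, `timestamp` (assumed names).
    """
    df = interactions.sort_values(["user_id", "timestamp"]).reset_index(drop=True)

    # Time since the same user's previous event (NaT for each user's first event).
    delta = df.groupby("user_id")["timestamp"].diff()

    # A new session starts at a user's first event or after a long gap.
    new_session = delta.isna() | (delta > pd.Timedelta(minutes=gap_minutes))

    # Counting session starts cumulatively yields a globally unique session id.
    df["session_id"] = new_session.cumsum() - 1

    # One row per session: ordered item list plus the matching timestamps.
    return (
        df.groupby(["user_id", "session_id"], sort=True)
        .agg(items=("item_id", list), timestamps=("timestamp", list))
        .reset_index()
    )


# Toy log: u1 has a two-hour pause, so u1 ends up with two sessions.
toy = pd.DataFrame(
    {
        "user_id": ["u1", "u1", "u1", "u2", "u2"],
        "item_id": ["a", "b", "c", "a", "d"],
        "timestamp": pd.to_datetime(
            [
                "2024-01-01 10:00",
                "2024-01-01 10:05",
                "2024-01-01 12:00",
                "2024-01-02 09:00",
                "2024-01-02 09:10",
            ]
        ),
    }
)
sessions = sessionize(toy)
print(sessions[["user_id", "session_id", "items"]])
```

Each output row matches the `(user_id, session_id, [item_1, ..., item_T], timestamps)` layout from item 2. The gap threshold is worth revisiting per dataset during EDA, since browsing behavior in e-commerce logs and MOOC logs may differ.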

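The Hit@K and NDCG@K metrics named in item 4 can be sketched as follows for the common next-item setup with exactly one ground-truth item per session. The function name `hit_and_ndcg_at_k` and the dense score-matrix input are assumptions for illustration; the actual evaluation protocol (full ranking versus sampled negatives, for example) is not decided in this plan.

```python
import numpy as np


def hit_and_ndcg_at_k(
    scores: np.ndarray, targets: np.ndarray, k: int = 10
) -> tuple[float, float]:
    """Mean Hit@K and NDCG@K with one relevant item per session.

    scores:  (n_sessions, n_items) model scores, higher is better.
    targets: (n_sessions,) integer index of the true next item.
    """
    # Score the model gave to each session's true next item.
    target_scores = scores[np.arange(len(targets)), targets]

    # 0-based rank = number of items scored strictly higher than the target.
    # Ties are resolved in the target's favor, which is slightly optimistic.
    ranks = (scores > target_scores[:, None]).sum(axis=1)

    hits = ranks < k

    # With a single relevant item the ideal DCG is 1,
    # so NDCG reduces to 1 / log2(rank + 2) inside the top K and 0 outside.
    ndcg = np.where(hits, 1.0 / np.log2(ranks + 2.0), 0.0)

    return float(hits.mean()), float(ndcg.mean())


scores = np.array(
    [
        [0.1, 0.9, 0.3, 0.2],  # target 1 is ranked first
        [0.8, 0.1, 0.5, 0.3],  # target 2 is ranked second
        [0.7, 0.6, 0.5, 0.1],  # target 3 is ranked fourth, outside the top 2
    ]
)
targets = np.array([1, 2, 3])
hit, ndcg = hit_and_ndcg_at_k(scores, targets, k=2)
print(f"Hit@2 = {hit:.4f}, NDCG@2 = {ndcg:.4f}")
```

Hit@K only records whether the true item appears in the top K, while NDCG@K also rewards placing it nearer the top, so reporting both gives a fuller picture when comparing pretrained and fine-tuned models.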