# LinUCB Alpha Sweep Tutorial
This notebook walks through preparing the MovieLens-100K-based simulator in EasyRL4Rec, running LinUCB with several exploration coefficients (`alpha`), and visualizing how the resulting metrics (`ctr` and `click_loss`) change.
It mirrors the automation we scripted, but keeps everything reproducible for other users.

## Prerequisites
1. Download MovieLens-100K (e.g., `curl -LO https://files.grouplens.org/datasets/movielens/ml-100k.zip`).
2. Extract into `data/MovieLens/data_raw/`:
   ```bash
   mkdir -p data/MovieLens/data_raw
   unzip ml-100k.zip -d data/MovieLens/data_raw/
   ```
3. Install repo dependencies (`conda create`, `sh install.sh`, clone `tianshou`).
4. Activate the conda env (`conda activate easyrl4rec`).
5. Run the cells below to convert the raw files, split train/test, and generate the MF simulator.

A GPU is optional; a CPU is sufficient for this tutorial.
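
Before running the conversion cells, it can help to confirm that the archive was extracted where you expect. The sketch below is only a convenience check, not part of EasyRL4Rec: `find_raw_files` is a hypothetical helper, and it searches recursively because the exact subfolder the conversion script reads from is not assumed here.

```python
from pathlib import Path

# File names taken from the ML-100K archive (u.data, u.user, u.item).
REQUIRED = ("u.data", "u.user", "u.item")


def find_raw_files(raw_dir, required=REQUIRED):
    """Map each required file name to its first match under raw_dir, or None."""
    raw_dir = Path(raw_dir)
    found = {}
    for name in required:
        # rglob searches all subfolders; skip it if the directory is missing.
        matches = sorted(raw_dir.rglob(name)) if raw_dir.is_dir() else []
        found[name] = matches[0] if matches else None
    return found


if __name__ == "__main__":
    for name, path in find_raw_files("data/MovieLens/data_raw").items():
        print(f"{name}: {path if path else 'MISSING'}")
```

Any `MISSING` entry suggests the unzip step in the prerequisites did not land in `data/MovieLens/data_raw/`.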

In [None]:
import json
import subprocess
import sys
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt

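ROOT = Path('..').resolve()  # notebook sits in examples/, so the repo root is one level up
print('Project root:', ROOT)


def run_cmd(args):
    """Echo a command, then run it from the repo root; raise if it exits non-zero."""
    print('\n$', ' '.join(str(a) for a in args))
    subprocess.run(args, check=True, cwd=ROOT)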

## Step 1 – Prepare the MovieLens simulator data
These cells convert the ML-100K raw files into the format EasyRL4Rec expects, split train/test logs, and generate the matrix-factorization rating matrix the simulator uses.

In [None]:
run_cmd([sys.executable, 'script/convert_100k.py'])

In [None]:
run_cmd([sys.executable, 'data/MovieLens/split_train_test.py'])
run_cmd([sys.executable, 'data/MovieLens/provide_MF_results.py'])

### What do these preprocessing scripts do?
- `script/convert_100k.py`: converts the original ML-100K raw files (`u.data`, `u.user`, `u.item`) into the format expected inside EasyRL4Rec (double-colon separated `ratings.dat`, `users.dat`, `movies.dat`). See lines 5–43 for the transformations, including mapping occupations to ints and embedding release years into the title.
- `data/MovieLens/split_train_test.py`: splits `ratings.dat` into `movielens-1m-train.csv`/`movielens-1m-test.csv` (90/10 chronological split); this is what `MovieLensData.get_train_data()` reads (lines 1–60).
- `data/MovieLens/provide_MF_results.py`: trains a simple MF model (lines 16–132) to predict missing entries in the user–item matrix and saves the dense `rating_matrix.csv` that powers the simulator; see the `MatrixFactorization` definition (lines 40–108).
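
To make the first two transformations concrete, here is a minimal sketch of the ideas involved: rewriting a tab-separated `u.data` row with `::` separators, and a 90/10 chronological split. It is not the code from the scripts above. The function names are hypothetical, the sample rows are made up for illustration, and the split assumes a single global sort by timestamp, which may differ from what `split_train_test.py` actually does.

```python
def tsv_to_double_colon(line):
    """Rewrite one tab-separated rating row using '::' as the separator."""
    return "::".join(line.rstrip("\n").split("\t"))


def chronological_split(rows, train_frac=0.9):
    """Sort (user, item, rating, timestamp) rows by timestamp, then cut.

    The earliest `train_frac` of the interactions become the training set
    and the most recent ones become the test set.
    """
    ordered = sorted(rows, key=lambda r: r[3])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    # Illustrative row in the u.data layout: user, item, rating, timestamp.
    print(tsv_to_double_colon("196\t242\t3\t881250949\n"))

    # Ten made-up interactions with increasing timestamps.
    rows = [(1, item, 4, 1000 + item) for item in range(10)]
    train, test = chronological_split(rows)
    print(len(train), len(test))
```

Splitting by time rather than at random keeps later interactions out of the training set, which matters when the test log is meant to stand in for future behavior.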

## Step 2 – Sweep LinUCB alpha values
We reuse `script/sweep_linucb_alpha.py` to run LinUCB with several `alpha` values.
Use `--extra-args` to forward flags to `run_LinUCB.py`; here we force `--num_workers 0` because shared memory is blocked on macOS, so worker processes cannot be used there.

In [None]:
alphas = ['0.1','0.5','1.0','2.0','5.0']
cmd = [
    sys.executable, 'script/sweep_linucb_alpha.py',
    '--message-prefix', 'notebook',
    '--plot-path', 'visual_results/linucb_alpha_comparison.png',
    '--results-json', 'visual_results/linucb_alpha_results.json',
    '--alphas', *alphas,
    '--extra-args', '--num_workers', '0'
]
run_cmd(cmd)
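
For intuition about what `alpha` controls, the sketch below computes the textbook linear UCB score: a ridge-regression reward estimate plus `alpha` times an uncertainty width. It is an illustration only and is not taken from EasyRL4Rec, whose LinUCB is built on its own user-model ensemble. The names `A`, `b`, and `contexts` and the toy numbers are assumptions.

```python
import numpy as np


def linucb_scores(A, b, contexts, alpha):
    """Score each arm as estimated reward + alpha * confidence width.

    A        : (d, d) regularized design matrix, identity plus sum of x x^T
    b        : (d,)   sum of reward-weighted contexts
    contexts : (n, d) one feature vector per candidate arm
    alpha    : exploration coefficient; 0 means purely greedy
    """
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b                      # ridge-regression estimate
    mean = contexts @ theta                # predicted reward per arm
    # sqrt(x^T A^{-1} x) for every row x of contexts
    width = np.sqrt(np.einsum("ij,jk,ik->i", contexts, A_inv, contexts))
    return mean + alpha * width


if __name__ == "__main__":
    # Toy state: the first feature has been observed more often than the second.
    A = np.diag([4.0, 1.0])
    b = np.array([1.0, 0.0])
    arms = np.eye(2)
    for alpha in (0.0, 1.0):
        scores = linucb_scores(A, b, arms, alpha)
        print(alpha, scores, int(np.argmax(scores)))
```

In this toy state the greedy choice (`alpha = 0`) is arm 0, while `alpha = 1` switches to the less-explored arm 1. The sweep above probes the same trade-off at a larger scale.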

## Step 3 – Inspect the aggregated metrics
The sweep script saves metrics and a plot under `visual_results/`. Let's load and visualize them inline.

In [None]:
results_path = ROOT / 'visual_results' / 'linucb_alpha_results.json'
with open(results_path) as f:
    metrics = json.load(f)
df = pd.DataFrame(metrics).T.astype(float).sort_index(key=lambda s: s.astype(float))
df

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
# df is already sorted numerically by alpha; calling sort_index() again would
# re-sort the string index lexicographically (e.g. '10.0' before '2.0').
df.plot(y='ctr', marker='o', ax=axes[0])
axes[0].set_title('CTR vs alpha')
axes[0].set_xlabel('alpha')
axes[0].set_ylabel('Average reward (ctr)')
df.plot(y='click_loss', marker='o', color='C3', ax=axes[1])
axes[1].set_title('Click loss vs alpha')
axes[1].set_xlabel('alpha')
axes[1].set_ylabel('Avg |prediction - reward|')
plt.tight_layout()
plt.show()
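
Beyond the plots, you may want to read off the best-performing `alpha` programmatically. The helper below is a small sketch: `best_alpha` is a hypothetical name, and the toy frame uses made-up numbers purely to show the call. It assumes a frame shaped like `df` above, with one row per alpha and `ctr` and `click_loss` columns.

```python
import pandas as pd


def best_alpha(df, metric="ctr", higher_is_better=True):
    """Return the index label (alpha) with the best value of `metric`."""
    col = df[metric]
    return col.idxmax() if higher_is_better else col.idxmin()


if __name__ == "__main__":
    # Made-up values for illustration only; use the real `df` in the notebook.
    toy = pd.DataFrame(
        {"ctr": [0.10, 0.30, 0.20], "click_loss": [0.5, 0.4, 0.3]},
        index=["0.1", "1.0", "5.0"],
    )
    print(best_alpha(toy))                                        # highest ctr
    print(best_alpha(toy, "click_loss", higher_is_better=False))  # lowest loss
```

In the notebook, `best_alpha(df)` would give the alpha with the highest `ctr` from the sweep.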

## How the training script works
`script/sweep_linucb_alpha.py` shells out to `examples/usermodel/run_LinUCB.py`. That runner:
1. Parses base hyperparameters via `examples/usermodel/usermodel_utils.py` (lines 20–120).
2. Calls `run_Egreedy.prepare_dataset` (lines 91–118), which uses `MovieLensData` to load and encode features.
3. Builds the LinUCB user model through `run_Egreedy.setup_user_model`, which constructs `EnsembleModel` → `UserModel_Pairwise_Variance` (source: `src/core/userModel/user_model_ensemble.py`, `src/core/userModel/user_model_pairwise_variance.py`).
4. Kicks off training and evaluation; online metrics are computed in `src/core/evaluation/evaluator_static.py` (see `test_static_model_in_RL_env`).

These references are handy if you want to dive into the exact loss, state tracker, or evaluator logic.

## Step 4 – Next steps
- Adjust the alpha grid or pass `--extra-args --epoch 5` to train longer.
- Use the saved logs (`saved_models/MovieLensEnv-v0/LinUCB/logs/[message]_*`) for deeper analysis.
- Swap in other environments by changing `--env` inside the sweep script.
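
As a starting point for the first suggestion, the sketch below assembles a sweep command for a custom alpha grid. It only uses flags that appear earlier in this notebook (`--alphas`, `--extra-args`, `--epoch`, `--num_workers`). `build_sweep_cmd` is a hypothetical helper, and it assumes that several forwarded flags can simply be listed after `--extra-args`, which you should verify against the sweep script.

```python
import sys


def build_sweep_cmd(alphas, extra_args=(), python=sys.executable):
    """Build the argument list for script/sweep_linucb_alpha.py."""
    cmd = [python, "script/sweep_linucb_alpha.py", "--alphas", *map(str, alphas)]
    if extra_args:
        # Assumption: everything after --extra-args is forwarded to run_LinUCB.py.
        cmd += ["--extra-args", *map(str, extra_args)]
    return cmd


if __name__ == "__main__":
    # A geometric grid starting at 0.1 and doubling each step.
    alphas = [format(0.1 * 2 ** k, "g") for k in range(5)]
    print(build_sweep_cmd(alphas, ["--num_workers", 0, "--epoch", 5]))
```

The returned list can be passed to the `run_cmd` helper defined at the top of the notebook.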