# Experiment 02: Rotation vs Masking Auxiliary Task

**Proposal**: "Explore whether finance-specific self-supervised tasks can better align auxiliary and main objectives than generic rotation-based tasks."

This notebook trains a model with **rotation** as the auxiliary (aux) task and compares its metrics with those of the **mask**-trained model from Experiment 01.

**Prerequisites**: Run Experiment 01 first to obtain `checkpoints/joint/best.pt` (mask aux).

## 1. Setup

In [None]:
import os
import sys

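# If the notebook is launched from experiments/, the project root is its parent.
cwd = os.getcwd()
PROJECT_ROOT = os.path.dirname(cwd) if os.path.basename(cwd) == "experiments" else cwd
# Make `src` importable and resolve relative paths (data/, checkpoints/) from the root.
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)
os.chdir(PROJECT_ROOT)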
print(f"Working directory: {os.getcwd()}")
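
Because the comparison depends on the mask checkpoint from Experiment 01, it can help to confirm the file exists before training. The following is a minimal sketch; the helper name `checkpoint_ready` is hypothetical, and the path is the one given under Prerequisites, resolved relative to the project root set above.

```python
from pathlib import Path

# Path taken from the Prerequisites note; relative to the project root.
MASK_CHECKPOINT = Path("checkpoints/joint/best.pt")


def checkpoint_ready(path: Path) -> bool:
    """Hypothetical helper: True if `path` is an existing, non-empty file."""
    return path.is_file() and path.stat().st_size > 0


if not checkpoint_ready(MASK_CHECKPOINT):
    print(f"Missing {MASK_CHECKPOINT}: run Experiment 01 before comparing results.")
```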

## 2. Train with Rotation Aux Task

In [None]:
!python -m src.train --parquet data/raw/btcusdt_1h.parquet \
  --train_end 2022-12-31 --val_end 2023-12-31 \
  --epochs 30 --aux_task rotation --lambda_aux 1.0 \
  --checkpoint_dir checkpoints/rotation
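
For orientation, a flag such as `--lambda_aux` conventionally weights the auxiliary loss in a joint objective. The sketch below assumes a plain weighted sum; the function `joint_loss` is hypothetical, and how `src.train` actually combines the two losses is not shown in this notebook.

```python
def joint_loss(main_loss: float, aux_loss: float, lambda_aux: float = 1.0) -> float:
    """Assumed joint objective: main loss plus the weighted auxiliary loss."""
    return main_loss + lambda_aux * aux_loss


# With lambda_aux = 1.0 (the value passed above) both terms count equally.
print(joint_loss(main_loss=0.5, aux_loss=0.25, lambda_aux=1.0))
# With lambda_aux = 0.0 the auxiliary task is ignored entirely.
print(joint_loss(main_loss=0.5, aux_loss=0.25, lambda_aux=0.0))
```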

## 3. Evaluation

In [None]:
!python -m src.eval --checkpoint checkpoints/rotation/best.pt \
  --ttt_steps 10 --ttt_lr 0.05 --ttt_optimizer adam \
  --entropy_adaptive --entropy_gate_threshold 0.3 --threshold 0.35
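
One plausible reading of the entropy gate is that test-time training (TTT) updates run only when the prediction is uncertain enough. The sketch below assumes a binary classifier, entropy measured in nats, and a "adapt when entropy exceeds the threshold" rule; the functions `binary_entropy` and `should_adapt` are hypothetical, and the actual semantics of `--entropy_gate_threshold` in `src.eval` may differ.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy (in nats) of a Bernoulli(p) prediction."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident prediction carries no uncertainty
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def should_adapt(p: float, gate_threshold: float = 0.3) -> bool:
    """Assumed gate: adapt only when predictive entropy exceeds the threshold."""
    return binary_entropy(p) > gate_threshold


for p in (0.02, 0.35, 0.5):
    print(f"p={p:.2f}  H={binary_entropy(p):.3f}  adapt={should_adapt(p)}")
```

Under these assumptions the maximum entropy is ln 2 (about 0.693 nats) at p = 0.5, so a threshold of 0.3 would skip adaptation only for fairly confident predictions.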

## 4. Comparison: Mask vs Rotation

| Aux Task | Mode | Accuracy | F1 | ECE | Brier | IC |
|----------|------|----------|-----|-----|-------|------|
| **Mask** | Baseline | 0.7639 | 0.0804 | 0.0586 | 0.1725 | 0.0917 |
| Mask | TTT (standard) | 0.4671 | 0.3196 | 0.1488 | 0.1928 | 0.0580 |
| Mask | TTT (online) | 0.3445 | 0.3537 | 0.1986 | 0.2129 | -0.0292 |
| **Rotation** | Baseline | 0.5032 | 0.3678 | 0.1532 | 0.1891 | 0.1895 |
| Rotation | TTT (standard) | 0.4594 | 0.3623 | 0.1628 | 0.1983 | 0.1385 |
| Rotation | TTT (online) | 0.7729 | 0.0222 | 0.1549 | 0.1947 | -0.0007 |

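**Summary**: The rotation baseline beats the mask baseline on F1 (0.3678 vs 0.0804) and IC (0.1895 vs 0.0917), but has lower accuracy (0.5032 vs 0.7639) and worse calibration (ECE 0.1532 vs 0.0586). For mask, online TTT raises F1 to 0.3537 with more balanced predictions, at the cost of accuracy (0.3445) and IC (-0.0292). For rotation, online TTT collapses: accuracy rises to 0.7729 while F1 falls to 0.0222, which is consistent with the model predicting the majority class almost exclusively.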
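
The collapse pattern in the table (high accuracy, near-zero F1) is typical of imbalanced labels. The sketch below uses synthetic labels, not project data, and a hypothetical helper `accuracy_and_f1` to show how a classifier that almost never predicts the positive class can still score high accuracy.

```python
def accuracy_and_f1(labels, preds):
    """Return (accuracy, F1 of the positive class) for binary 0/1 lists."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    accuracy = correct / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Synthetic, illustrative labels: 23 positives out of 100 (not project data).
labels = [1] * 23 + [0] * 77

# A collapsed classifier that flags only a single sample as positive.
collapsed = [1] + [0] * 99

# A more balanced classifier: catches all positives but raises 40 false alarms.
balanced = [1] * 23 + [1] * 40 + [0] * 37

for name, preds in (("collapsed", collapsed), ("balanced", balanced)):
    acc, f1 = accuracy_and_f1(labels, preds)
    print(f"{name:>9}: accuracy={acc:.4f}  F1={f1:.4f}")
```

In this toy setup the collapsed classifier reaches 0.78 accuracy with an F1 of about 0.08, while the balanced one has lower accuracy (0.60) but a much higher F1 (about 0.53), which is why accuracy alone is a poor basis for comparing the rows above.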

## 5. Regime-Stratified Evaluation (Experiment 03)

See Experiment 03 for performance stability across volatility percentiles and regime-specific analysis.