# DLinear + TimesNet + TimeMixer Training & Backtesting

This notebook demonstrates how to:
1. Load per-asset OHLCV candle data from the `tensorlink-dev/open-synth-training-data` HF dataset
2. Engineer 16 micro-structure features per 1-hour bar using `OHLCVEngineer`
3. Build a hybrid model using **DLinearBlock**, **TimesNetBlock**, and **TimeMixerBlock**
4. Choose between **HorizonHead** (per-step mu/sigma via cross-attention) or **NeuralBridgeHead** (macro return + micro texture with bridge constraints)
5. Train with CRPS loss and backtest with multi-interval scoring

## 1. Imports & Setup

In [None]:
# 1. Clone the repository and enter it
!git clone https://github.com/tensorlink-dev/open-synth-miner
%cd open-synth-miner
!uv pip install torchsde

# 2. Install dependencies using uv
# --system: Installs into the Colab runtime (no venv needed)
# -e .: Installs the package in editable mode
!uv pip install --system -e .

# 3. (Optional) Verify installation
!python main.py --help

In [None]:
import sys
import os

# Ensure the project root is on the path
PROJECT_ROOT = os.path.abspath(os.path.join(os.getcwd(), ".."))
if PROJECT_ROOT not in sys.path:
    sys.path.insert(0, PROJECT_ROOT)
os.chdir(PROJECT_ROOT)

import numpy as np
import pandas as pd
import torch
import torch.optim as optim
import matplotlib.pyplot as plt

from src.models.registry import discover_components, registry
from src.models.factory import HybridBackbone, SynthModel
from src.models.heads import HorizonHead, NeuralBridgeHead
from src.data.market_data_loader import (
    HFOHLCVSource,
    MockDataSource,
    OHLCVEngineer,
    OHLCV_FEATURE_NAMES,
    MarketDataLoader,
)
from src.research.trainer import Trainer, DataToModelAdapter
from src.research.metrics import (
    crps_ensemble,
    CRPSMultiIntervalScorer,
    SCORING_INTERVALS,
)

# Auto-discover all registered blocks
discover_components("src/models/components")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
print(f"Registered blocks: {list(registry.blocks.keys())}")
print(f"OHLCV feature count: {len(OHLCV_FEATURE_NAMES)}")
print(f"Features: {OHLCV_FEATURE_NAMES}")

## 2. Configuration

In [None]:
# ----- Data config -----
REPO_ID = "tensorlink-dev/open-synth-training-data"
TIMEFRAME = "5m"  # "5m" or "1m"
ASSET_FILES = {
    "BTC_USD": "data/BTC_USD/{timeframe}.parquet",
    "ETH_USD": "data/ETH_USD/{timeframe}.parquet",
    "SOL_USD": "data/SOL_USD/{timeframe}.parquet",
}
ASSETS = list(ASSET_FILES.keys())
USE_HF = True
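
Later cells reference constants that this cell does not define, such as `D_MODEL`, `HEAD_TYPE`, `INPUT_LEN`, and `EPOCHS`. The sketch below collects them in one place with placeholder values. Every value is an illustrative assumption rather than the notebook's actual setting, except `FEATURE_DIM`, which follows from the 16 features that `OHLCVEngineer` produces. The consistency checks at the end are assumptions as well.

```python
# Placeholder values for constants used by later cells.
# All values are assumptions for illustration; tune them for your own run.

# ----- Model config -----
FEATURE_DIM = 16            # 16 engineered features per 1-hour bar
D_MODEL = 64                # assumed hidden width shared by all blocks
DLINEAR_KERNEL = 25         # assumed moving-average kernel size
TIMESNET_TOP_K = 3          # assumed number of FFT periods to keep
TIMEMIXER_DOWN_WINDOW = 2   # assumed down-sampling factor per layer
TIMEMIXER_DOWN_LAYERS = 2   # assumed number of down-sampling layers

# ----- Head config -----
HEAD_TYPE = "horizon"       # "horizon" or "neural_bridge"
HORIZON_MAX = 48            # assumed maximum forecast length
HORIZON_NHEAD = 4           # assumed attention head count
HORIZON_LAYERS = 2          # assumed decoder depth
MICRO_STEPS = 12            # assumed sub-hour steps (e.g. twelve 5-minute steps)
BRIDGE_HIDDEN = 64          # assumed hidden width of the texture network

# ----- Windowing / training config -----
INPUT_LEN = 96              # assumed history length in 1-hour bars
PRED_LEN = 12               # assumed forecast horizon in steps
TIME_INCREMENT = 3600       # assumed seconds per step for 1-hour bars
N_PATHS = 100               # assumed number of simulated paths
EPOCHS = 15                 # assumed number of training epochs
LR = 1e-3                   # assumed Adam learning rate

# Sanity checks under the assumptions above
assert HEAD_TYPE in {"horizon", "neural_bridge"}
assert D_MODEL % HORIZON_NHEAD == 0, "head count is assumed to divide d_model"
assert PRED_LEN <= HORIZON_MAX
```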

## 3. Data Loading

`HFOHLCVSource` downloads per-asset parquet files from the HF dataset and returns `AssetData`
with full OHLCV columns. `OHLCVEngineer` then resamples the raw candles (1m/5m) to 1-hour bars
and computes 16 micro-structure features:

| Feature | Description |
|---------|-------------|
| `open, high, low, close, volume` | Standard 1h OHLCV |
| `realized_vol` | Intra-hour log-return std |
| `skew, kurtosis` | Higher moments of intra-hour returns |
| `parkinson_vol` | Range-based volatility estimator |
| `efficiency` | Fractal efficiency (net move / total path) |
| `vwap_dev` | Close deviation from VWAP |
| `signed_vol_sum` | Net buying pressure |
| `up_wick, down_wick` | Upper/lower wick ratios |
| `body_size` | Candle body as fraction of range |
| `clv` | Close Location Value (-1 to +1) |

Set `USE_HF = False` in the config cell to fall back to `MockDataSource` for offline testing.
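
To make a few rows of the feature table concrete, here is a minimal sketch of how such features can be computed from the sub-hour candles of a single hour. It uses the textbook definitions (Parkinson range estimator, Kaufman-style efficiency ratio, Close Location Value). The function name `hourly_features` is hypothetical, and `OHLCVEngineer` may differ in its details.

```python
import numpy as np
import pandas as pd

def hourly_features(candles: pd.DataFrame) -> pd.Series:
    """Compute a few per-bar features from the sub-hour candles of one hour."""
    o = candles["open"].iloc[0]
    h = candles["high"].max()
    l = candles["low"].min()
    c = candles["close"].iloc[-1]
    rng = h - l

    # Intra-hour log returns and total absolute path length of the close
    log_ret = np.log(candles["close"]).diff().dropna()
    path = candles["close"].diff().abs().sum()
    net = abs(candles["close"].iloc[-1] - candles["close"].iloc[0])

    return pd.Series({
        "realized_vol": log_ret.std(ddof=0),
        # Parkinson estimator for a single bar: ln(H/L)^2 / (4 ln 2)
        "parkinson_vol": np.sqrt(np.log(h / l) ** 2 / (4.0 * np.log(2.0))),
        # Net move divided by total path travelled
        "efficiency": net / path if path > 0 else 0.0,
        "body_size": abs(c - o) / rng if rng > 0 else 0.0,
        # +1 when the close is at the high, -1 when it is at the low
        "clv": ((c - l) - (h - c)) / rng if rng > 0 else 0.0,
    })

# Four made-up sub-hour candles standing in for one hour of data
candles = pd.DataFrame({
    "open":  [100.0, 101.0, 102.0, 101.0],
    "high":  [101.0, 102.0, 103.0, 104.0],
    "low":   [ 99.0, 100.0, 101.0, 100.0],
    "close": [101.0, 102.0, 101.0, 104.0],
})
features = hourly_features(candles)
print(features)
```

In this toy bar the close sits at the hourly high, so `clv` is 1.0, and the close travels a path of 5 for a net move of 3, giving an efficiency of 0.6.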

In [None]:
engineer = OHLCVEngineer(resample_rule="1h")

if USE_HF:
    source = HFOHLCVSource(
        repo_id=REPO_ID,
        asset_files=ASSET_FILES,
        repo_type="dataset",
        timeframe=TIMEFRAME,
    )
else:
    # Offline fallback: MockDataSource generates synthetic random-walk prices.
    # OHLCVEngineer gracefully handles the single-price case (O=H=L=C=price).
    source = MockDataSource(length=8000, freq="5min", seed=42, base_price=100.0)

### Train / Validation / Test Split

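The data is split chronologically with a static holdout: fractional cutoffs assign the earliest bars to training, the next segment to validation, and the most recent bars to testing, so no future information leaks into an earlier split.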
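
The idea of a fractional-cutoff temporal split can be sketched independently of the project's loader. The helper `temporal_split` and its `gap` parameter are hypothetical and are not part of `MarketDataLoader`; the gap illustrates one common way to keep windows that straddle a cutoff from leaking across splits.

```python
import numpy as np

def temporal_split(n_samples, train_frac=0.7, val_frac=0.15, gap=0):
    """Return chronological index arrays (train, val, test).

    `gap` drops that many samples after each cutoff so that windows
    overlapping a boundary cannot appear in two splits.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")

    train_end = round(n_samples * train_frac)
    val_end = round(n_samples * (train_frac + val_frac))

    idx = np.arange(n_samples)
    train = idx[:train_end]
    val = idx[train_end + gap:val_end]
    test = idx[val_end + gap:]
    return train, val, test

train_idx, val_idx, test_idx = temporal_split(1000, gap=10)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because indices are never shuffled, every training sample precedes every validation sample, which in turn precedes every test sample.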

## 4. Build the Model

### Backbone
The backbone stacks four blocks with complementary inductive biases:

1. **DLinearBlock** — trend-seasonal decomposition with separate linear projections
2. **TimesNetBlock** — FFT period discovery + 2D Inception convolutions
3. **TimeMixerBlock** — multi-scale past-decomposable mixing (bottom-up seasonal, top-down trend)
4. **DLinearBlock** — final decomposition refinement
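
Two of the ideas in this list are easy to illustrate in isolation: the moving-average trend/seasonal decomposition behind DLinear and the FFT-based period discovery behind TimesNet. The following is a NumPy sketch of the generic techniques only; the function names are hypothetical and the registered blocks may implement them differently.

```python
import numpy as np

def decompose(x, kernel_size=5):
    """Split a 1-D series into a moving-average trend and a seasonal remainder."""
    pad = (kernel_size - 1) // 2
    # Replicate the edge values so the trend has the same length as x
    padded = np.pad(x, (pad, kernel_size - 1 - pad), mode="edge")
    trend = np.convolve(padded, np.ones(kernel_size) / kernel_size, mode="valid")
    return trend, x - trend

def dominant_periods(x, top_k=2):
    """Return the periods of the top_k FFT frequencies by amplitude."""
    amp = np.abs(np.fft.rfft(x))
    amp[0] = 0.0  # ignore the DC component
    freqs = np.argsort(amp)[::-1][:top_k]
    return [len(x) // int(f) for f in freqs if f > 0]

t = np.arange(64)
series = np.sin(2 * np.pi * t / 8) + 0.5 * np.sin(2 * np.pi * t / 16)

trend, seasonal = decompose(series, kernel_size=5)
periods = dominant_periods(series, top_k=2)
print(periods)
```

The decomposition is exact by construction (`trend + seasonal` reproduces the input), and the two sinusoids in the toy series are recovered as periods 8 and 16.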

### Head Options

Set `HEAD_TYPE` in the config cell to switch between:

#### `"horizon"` — HorizonHead (per-step mu/sigma via cross-attention)
Generates per-step drift and volatility trajectories via learned queries that
cross-attend to the full backbone sequence. Outputs `(mu_1…mu_H, sigma_1…sigma_H)`
for time-varying GBM path simulation.
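
As a rough illustration of what "time-varying GBM path simulation" means, the sketch below turns per-step `(mu_t, sigma_t)` sequences into price paths using the standard log-Euler GBM update. It is written in NumPy under assumed shapes; the function `simulate_gbm_paths` and the `dt` parameter are hypothetical and not taken from the project's simulator.

```python
import numpy as np

def simulate_gbm_paths(initial_price, mu_seq, sigma_seq, n_paths, dt=1.0, seed=0):
    """Simulate GBM paths with per-step drift and volatility.

    initial_price: (batch,)
    mu_seq, sigma_seq: (batch, horizon)
    returns: (batch, n_paths, horizon)
    """
    rng = np.random.default_rng(seed)
    batch, horizon = mu_seq.shape
    z = rng.standard_normal((batch, n_paths, horizon))

    # Log-return per step: (mu_t - sigma_t^2 / 2) dt + sigma_t sqrt(dt) Z
    drift = (mu_seq - 0.5 * sigma_seq ** 2) * dt
    diffusion = sigma_seq * np.sqrt(dt)
    log_increments = drift[:, None, :] + diffusion[:, None, :] * z

    # Accumulate log-returns along the horizon and scale by the start price
    log_paths = np.cumsum(log_increments, axis=-1)
    return initial_price[:, None, None] * np.exp(log_paths)

price0 = np.array([100.0, 50.0])
mu = np.full((2, 12), 0.001)
sigma = np.full((2, 12), 0.02)
sim = simulate_gbm_paths(price0, mu, sigma, n_paths=100)
print(sim.shape)
```

With `sigma_seq` set to zero the paths collapse to the deterministic curve `initial_price * exp(cumsum(mu_seq * dt))`, which is a convenient check on the drift term.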

#### `"neural_bridge"` — NeuralBridgeHead (macro return + micro texture)
Hierarchical head that predicts:
- **Macro destination** — the 1H log-return (where does the price end up?)
- **Micro texture** — sub-hour path shape (how does it get there?)

Uses a *bridge constraint* to force the generated path to start at 0 and end at the
predicted return. Unlike HorizonHead, this outputs the path tensor **directly** —
no external simulation loop needed.

```
h_t (batch, d_model)  ← last-step backbone embedding
      │
 ┌────┴────┐
 │         │
 ▼         ▼
macro_proj  texture_net
(→ 1H ret)  (→ micro_steps deviations)
 │         │
 │     Bridge constraint
 │     (zero endpoints)
 │         │
 ▼         ▼
linear_path + bridge → micro_returns (batch, micro_steps)
```
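
The bridge constraint in the diagram can be sketched as follows. This is one possible construction, not necessarily the one `NeuralBridgeHead` uses: it removes the straight line through the texture's own endpoints so the deviation is zero at both ends, then adds a linear ramp from 0 to the macro return. The function name `bridge_path` is hypothetical, and the output here is a cumulative log-return path.

```python
import numpy as np

def bridge_path(macro_return, raw_texture):
    """Combine a macro return with micro texture under a bridge constraint.

    macro_return: (batch,) predicted end-of-hour log-return
    raw_texture:  (batch, steps) unconstrained deviations
    returns:      (batch, steps) path that starts at 0 and ends at macro_return
    """
    steps = raw_texture.shape[-1]
    t = np.linspace(0.0, 1.0, steps)  # normalized time in [0, 1]

    # Subtract the line joining the first and last texture values,
    # which pins the deviation to zero at both endpoints
    endpoint_line = raw_texture[:, :1] * (1.0 - t) + raw_texture[:, -1:] * t
    bridge = raw_texture - endpoint_line

    # Straight-line path from 0 to the predicted macro return
    linear_path = macro_return[:, None] * t
    return linear_path + bridge

rng = np.random.default_rng(0)
macro = np.array([0.01, -0.02])
texture = rng.normal(scale=0.005, size=(2, 13))
micro_path = bridge_path(macro, texture)
print(micro_path[:, 0], micro_path[:, -1])
```

Whatever the texture network produces, the endpoints are fixed by construction, so the head only has to learn the shape of the path between them.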

In [None]:
# Retrieve block classes from the registry
DLinearBlock = registry.get_block("dlinearblock")
TimesNetBlock = registry.get_block("timesnetblock")
TimeMixerBlock = registry.get_block("timemixerblock")

# Instantiate the blocks
blocks = [
    DLinearBlock(d_model=D_MODEL, kernel_size=DLINEAR_KERNEL),
    TimesNetBlock(d_model=D_MODEL, top_k=TIMESNET_TOP_K, dropout=0.1),
    TimeMixerBlock(
        d_model=D_MODEL,
        down_sampling_window=TIMEMIXER_DOWN_WINDOW,
        down_sampling_layers=TIMEMIXER_DOWN_LAYERS,
        moving_avg_kernel=DLINEAR_KERNEL,
        dropout=0.1,
    ),
    DLinearBlock(d_model=D_MODEL, kernel_size=DLINEAR_KERNEL),
]

backbone = HybridBackbone(
    input_size=FEATURE_DIM,
    d_model=D_MODEL,
    blocks=blocks,
    validate_shapes=True,
)

# Build the head based on HEAD_TYPE
if HEAD_TYPE == "neural_bridge":
    head = NeuralBridgeHead(
        latent_size=backbone.output_dim,
        micro_steps=MICRO_STEPS,
        hidden_dim=BRIDGE_HIDDEN,
    )
    head_label = f"NeuralBridgeHead (micro_steps={MICRO_STEPS})"
else:
    head = HorizonHead(
        latent_size=backbone.output_dim,
        horizon_max=HORIZON_MAX,
        nhead=HORIZON_NHEAD,
        n_layers=HORIZON_LAYERS,
        dropout=0.1,
    )
    head_label = f"HorizonHead (horizon_max={HORIZON_MAX})"

model = SynthModel(backbone=backbone, head=head).to(device)

total_params = sum(p.numel() for p in model.parameters())
backbone_params = sum(p.numel() for p in backbone.parameters())
head_params = sum(p.numel() for p in head.parameters())
print(f"SynthModel with {head_label} built successfully")
print(f"  Backbone output dim: {backbone.output_dim}")
print(f"  Backbone params:     {backbone_params:,}")
print(f"  Head params:         {head_params:,}")
print(f"  Total parameters:    {total_params:,}")

### Sanity Check: Forward Pass

In [None]:
# Quick smoke test with a dummy batch
dummy_x = torch.randn(2, INPUT_LEN, FEATURE_DIM, device=device)
dummy_price = torch.tensor([100.0, 50.0], device=device)

with torch.no_grad():
    paths, param_a, param_b = model(dummy_x, initial_price=dummy_price, horizon=PRED_LEN, n_paths=100)

if HEAD_TYPE == "neural_bridge":
    print(f"Micro path shape:  {paths.shape}      (batch, micro_steps) — direct path output")
    print(f"Macro return shape: {param_a.shape}    (batch, 1) — predicted 1H log-return")
    print(f"\nSample macro returns: {param_a.detach().cpu().numpy().round(4)}")
    print(f"Sample micro path:   {param_b[0].detach().cpu().numpy().round(4)}")
else:
    print(f"Paths shape:     {paths.shape}      (batch, n_paths, horizon)")
    print(f"Mu_seq shape:    {param_a.shape}    (batch, horizon) — per-step drift")
    print(f"Sigma_seq shape: {param_b.shape}  (batch, horizon) — per-step volatility")
    print(f"\nSample mu trajectory:    {param_a[0].detach().cpu().numpy().round(4)}")
    print(f"Sample sigma trajectory: {param_b[0].detach().cpu().numpy().round(4)}")

## 5. Training Loop

Training uses the `Trainer` class, which handles:
- `DataToModelAdapter` to bridge `MarketDataLoader` batch format to `SynthModel` inputs
- CRPS loss for probabilistic calibration
- Sharpness and log-likelihood tracking
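
For reference, the sample-based CRPS that this kind of training minimizes can be written as `E|X - y| - 0.5 * E|X - X'|`, where `X` and `X'` are independent draws from the forecast ensemble and `y` is the observation. The NumPy sketch below implements that estimator. It is a generic illustration with a hypothetical function name, not the `crps_ensemble` from `src.research.metrics`.

```python
import numpy as np

def crps_from_samples(samples, observation):
    """Sample-based CRPS estimator.

    samples:     (..., n_paths) ensemble members
    observation: (...) realized values
    returns:     (...) CRPS per forecast, lower is better
    """
    samples = np.asarray(samples, dtype=float)
    observation = np.asarray(observation, dtype=float)

    # Mean absolute error between each member and the observation
    term_obs = np.abs(samples - observation[..., None]).mean(axis=-1)

    # Mean absolute difference over all pairs of members
    pairwise = np.abs(samples[..., :, None] - samples[..., None, :])
    term_spread = pairwise.mean(axis=(-2, -1))

    return term_obs - 0.5 * term_spread

score = crps_from_samples([0.0, 2.0], 1.0)
print(score)
```

The first term rewards ensembles that sit close to the outcome and the second credits spread, so an ensemble cannot lower its score simply by collapsing to a point unless that point is correct.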

In [None]:
optimizer = optim.Adam(model.parameters(), lr=LR)
adapter = DataToModelAdapter(device=device, target_is_log_return=True)

trainer = Trainer(
    model=model,
    optimizer=optimizer,
    n_paths=N_PATHS,
    device=device,
    adapter=adapter,
)

# Metric tracking
history = {
    "train_loss": [],
    "train_crps": [],
    "train_sharpness": [],
    "val_crps": [],
}

In [None]:
for epoch in range(1, EPOCHS + 1):
    # --- Training ---
    epoch_losses = []
    epoch_crps = []
    epoch_sharp = []

    for batch in train_dl:
        metrics = trainer.train_step(batch)
        epoch_losses.append(metrics["loss"])
        epoch_crps.append(metrics["crps"])
        epoch_sharp.append(metrics["sharpness"])

    avg_loss = np.mean(epoch_losses)
    avg_crps = np.mean(epoch_crps)
    avg_sharp = np.mean(epoch_sharp)

    history["train_loss"].append(avg_loss)
    history["train_crps"].append(avg_crps)
    history["train_sharpness"].append(avg_sharp)

    # --- Validation ---
    val_metrics = trainer.validate(val_dl)
    history["val_crps"].append(val_metrics["val_crps"])

    if epoch % 3 == 0 or epoch == 1:
        print(
            f"Epoch {epoch:3d}/{EPOCHS}  "
            f"train_loss={avg_loss:.5f}  "
            f"train_crps={avg_crps:.5f}  "
            f"val_crps={val_metrics['val_crps']:.5f}  "
            f"sharpness={avg_sharp:.5f}"
        )

### Training Curves

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

axes[0].plot(history["train_loss"], label="Train Loss (CRPS)")
axes[0].set_xlabel("Epoch")
axes[0].set_ylabel("Loss")
axes[0].set_title("Training Loss")
axes[0].legend()

axes[1].plot(history["train_crps"], label="Train CRPS")
axes[1].plot(history["val_crps"], label="Val CRPS")
axes[1].set_xlabel("Epoch")
axes[1].set_ylabel("CRPS")
axes[1].set_title("CRPS: Train vs Validation")
axes[1].legend()

axes[2].plot(history["train_sharpness"], label="Sharpness", color="tab:green")
axes[2].set_xlabel("Epoch")
axes[2].set_ylabel("Std(paths)")
axes[2].set_title("Forecast Sharpness")
axes[2].legend()

plt.tight_layout()
plt.show()

## 6. Backtesting on Test Set

We evaluate the trained model on the held-out test split using `CRPSMultiIntervalScorer`,
which computes CRPS at the standard scoring intervals (5min, 30min, 3hour, 24hour).

In [None]:
model.eval()
scorer = CRPSMultiIntervalScorer(time_increment=TIME_INCREMENT)

interval_scores = {name: [] for name in SCORING_INTERVALS}
overall_scores = []
all_test_crps = []

for batch in test_dl:
    adapted = adapter(batch)
    history_t = adapted["history"]
    initial_price = adapted["initial_price"]
    target_factors = adapted["target_factors"]
    horizon = target_factors.shape[-1]

    with torch.no_grad():
        paths, mu, sigma = model(
            history_t,
            initial_price=initial_price,
            horizon=horizon,
            n_paths=N_PATHS,
        )

    # Ensemble CRPS on the target factors
    sim_paths = paths.transpose(1, 2)  # (batch, horizon, n_paths)
    crps_vals = crps_ensemble(sim_paths, target_factors)
    all_test_crps.append(crps_vals.mean().item())

    # Multi-interval CRPS per sample
    for sample_idx in range(paths.shape[0]):
        # Score this sample's simulated paths against its realized target path
        total_crps, detailed = scorer(paths[sample_idx], target_factors[sample_idx])
        overall_scores.append(total_crps)
        for row in detailed:
            interval_name = row["Interval"]
            if interval_name in interval_scores and row["Increment"] == "Total":
                interval_scores[interval_name].append(float(row["CRPS"]))

avg_test_crps = np.mean(all_test_crps)
print(f"\n{'='*50}")
print(f"BACKTEST RESULTS")
print(f"{'='*50}")
print(f"Average Test CRPS: {avg_test_crps:.6f}")
print(f"\nMulti-Interval CRPS Breakdown:")
for name, scores in interval_scores.items():
    if scores:
        print(f"  {name:>12s}: {np.mean(scores):.6f} (n={len(scores)})")
    else:
        print(f"  {name:>12s}: N/A (horizon too short for this interval)")

## 7. Fan Chart Visualization

Visualize the probabilistic forecasts as fan charts with P5/P50/P95 percentile bands.

In [None]:
# Grab a few test samples for visualization
model.eval()
test_batch = next(iter(test_dl))
adapted = adapter(test_batch)

with torch.no_grad():
    paths, mu, sigma = model(
        adapted["history"],
        initial_price=adapted["initial_price"],
        horizon=adapted["target_factors"].shape[-1],
        n_paths=N_PATHS,
    )

paths_np = paths.cpu().numpy()           # (batch, n_paths, horizon)
targets_np = adapted["target_factors"].cpu().numpy()  # (batch, horizon)

n_show = min(4, paths_np.shape[0])
fig, axes = plt.subplots(1, n_show, figsize=(5 * n_show, 4), squeeze=False)

for i in range(n_show):
    ax = axes[0, i]
    sample_paths = paths_np[i]  # (n_paths, horizon)
    t = np.arange(sample_paths.shape[1])

    p5 = np.percentile(sample_paths, 5, axis=0)
    p25 = np.percentile(sample_paths, 25, axis=0)
    p50 = np.percentile(sample_paths, 50, axis=0)
    p75 = np.percentile(sample_paths, 75, axis=0)
    p95 = np.percentile(sample_paths, 95, axis=0)

    ax.fill_between(t, p5, p95, alpha=0.15, color="tab:blue", label="P5-P95")
    ax.fill_between(t, p25, p75, alpha=0.3, color="tab:blue", label="P25-P75")
    ax.plot(t, p50, color="tab:blue", linewidth=2, label="Median")
    ax.plot(t, targets_np[i], color="tab:red", linewidth=2, linestyle="--", label="Actual")

    ax.set_title(f"Sample {i}")
    ax.set_xlabel("Horizon Step")
    ax.set_ylabel("Price Factor")
    if i == 0:
        ax.legend(fontsize=8)

plt.suptitle("DLinear + TimesNet + TimeMixer Fan Charts (Test Set)", fontsize=14, y=1.02)
plt.tight_layout()
plt.show()

## 8. Path Distribution Analysis

In [None]:
# Terminal price distribution for the first test sample
fig, axes = plt.subplots(1, 3, figsize=(16, 4))

terminal_prices = paths_np[0, :, -1]
axes[0].hist(terminal_prices, bins=50, alpha=0.7, color="tab:blue", edgecolor="white")
axes[0].axvline(targets_np[0, -1], color="tab:red", linewidth=2, linestyle="--", label="Actual")
axes[0].set_title("Terminal Price Distribution")
axes[0].set_xlabel("Price Factor")
axes[0].set_ylabel("Count")
axes[0].legend()

# Per-step mu and sigma trajectories for a few test samples
model.eval()
test_batch_viz = next(iter(test_dl))
adapted_viz = adapter(test_batch_viz)
with torch.no_grad():
    _, mu_viz, sigma_viz = model(
        adapted_viz["history"],
        initial_price=adapted_viz["initial_price"],
        horizon=adapted_viz["target_factors"].shape[-1],
        n_paths=10,
    )
mu_np = mu_viz.cpu().numpy()
sigma_np = sigma_viz.cpu().numpy()
t_steps = np.arange(mu_np.shape[-1]) if mu_np.ndim > 1 else np.array([0])

for i in range(min(4, mu_np.shape[0])):
    if mu_np.ndim > 1:
        axes[1].plot(t_steps, mu_np[i], alpha=0.6, label=f"Sample {i}" if i < 3 else None)
        axes[2].plot(t_steps, sigma_np[i], alpha=0.6, label=f"Sample {i}" if i < 3 else None)
    else:
        axes[1].axhline(mu_np[i], alpha=0.6)
        axes[2].axhline(sigma_np[i], alpha=0.6)

axes[1].set_xlabel("Horizon Step")
axes[1].set_ylabel("Drift (mu_t)")
axes[1].set_title("Learned Per-Step Drift")
axes[1].legend(fontsize=8)

axes[2].set_xlabel("Horizon Step")
axes[2].set_ylabel("Volatility (sigma_t)")
axes[2].set_title("Learned Per-Step Volatility")
axes[2].legend(fontsize=8)

plt.tight_layout()
plt.show()

## 9. Summary

This notebook demonstrated the full training and backtesting workflow within the open-synth-miner framework:

- **Data**: `HFOHLCVSource` loads per-asset OHLCV parquets from `tensorlink-dev/open-synth-training-data`. `OHLCVEngineer` resamples raw candles to 1-hour bars and computes 16 micro-structure features
- **Backbone**: `HybridBackbone` with `DLinearBlock` + `TimesNetBlock` + `TimeMixerBlock` + `DLinearBlock`
- **Head**: Selectable via `HEAD_TYPE` config:
  - `"horizon"` — `HorizonHead` cross-attention decoder generating per-step `(mu_t, sigma_t)` for time-varying GBM simulation
  - `"neural_bridge"` — `NeuralBridgeHead` hierarchical head predicting macro 1H return + micro sub-hour texture with bridge constraints (outputs path directly)
- **Training**: CRPS-optimized via the `Trainer` class
- **Backtesting**: Multi-interval CRPS evaluation (5min, 30min, 3hour, 24hour)

### Head comparison

| | GBMHead | HorizonHead | NeuralBridgeHead |
|---|---|---|---|
| Input | Last step `h[:,-1]` | Full sequence | Last step `h[:,-1]` |
| Output | `(mu, sigma)` scalars | Per-step `(mu_t, sigma_t)` | `(macro_ret, micro_path)` |
| Simulation | External GBM | External time-varying GBM | **None** — path is direct |
| Expressiveness | Constant dynamics | Time-varying drift/vol | Learned texture + bridge |
| Resolution | N/A | 1 per horizon step | `micro_steps` per hour |