# ClyptQ E2E Backtest Test

An end-to-end backtest test covering:
- DataSpec-based: `OHLCVSpec(exchange="gateio", market_type="spot")` → automatic path determination
- Dynamic Universe: Filter-based instead of hardcoded symbols (e.g., `CryptoTop30`)
- Strategy: Pure operator-based (no direct pandas/numpy usage)
- Various Alpha, Factor, Transform combinations
- Engine-based backtest execution

In [None]:
# Auto-reload modules (development only)
%load_ext autoreload
%autoreload 2

# Standard imports (OK in notebooks only)
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

# ClyptQ imports
from clyptq import operator
from clyptq.data.provider import DataProvider
from clyptq.data.spec import OHLCVSpec
from clyptq.universe import (
    Universe,
    DynamicUniverse,
    CryptoLiquid,
    CryptoTop20,
    CryptoTop30,
    LiquidityFilter,
    DataAvailabilityFilter,
)
from clyptq.strategy import Strategy
from clyptq.trading.engine import Engine

## 1. Define DataSpec & Dynamic Universe

- OHLCVSpec: Specify only exchange, market_type, timeframe → automatic path determination
- Dynamic Universe: Filter-based Top N selection without hardcoded symbols

In [None]:
# Define DataSpec
ohlcv_spec = OHLCVSpec(
    exchange="gateio",
    market_type="spot",
    timeframe="1d",
)

# Define Dynamic Universe - No hardcoded symbols!
# Top 30 by volume + liquidity/data availability filters
universe = CryptoLiquid(
    top_n=30,
    min_dollar_volume=100_000,  # Min daily dollar volume $100k
)

# Time period (data range: 2025-10-11 ~ 2026-01-08)
START = datetime(2025, 10, 15)
END = datetime(2026, 1, 5)

print(f"OHLCVSpec: {ohlcv_spec.exchange}/{ohlcv_spec.market_type}/{ohlcv_spec.timeframe}")
print(f"Universe: {universe.name}, top_n={universe.n}")
print(f"Period: {START} ~ {END}")

## 2. Research Style: DataProvider.load() → Access all data

By specifying only OHLCVSpec, DataProvider automatically determines the path.
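
The notebook does not show how the path is derived from the spec. As a minimal sketch of the idea, assuming a hypothetical directory layout `<root>/<exchange>/<market_type>/<timeframe>` (the `SpecSketch` class, the `resolve_path` helper, and the `data` root are illustrative names, not part of ClyptQ):

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SpecSketch:
    """Hypothetical stand-in for OHLCVSpec; not the ClyptQ class."""
    exchange: str
    market_type: str
    timeframe: str


def resolve_path(spec: SpecSketch, root: str = "data") -> PurePosixPath:
    # Assumed layout: <root>/<exchange>/<market_type>/<timeframe>
    return PurePosixPath(root) / spec.exchange / spec.market_type / spec.timeframe


spec = SpecSketch(exchange="gateio", market_type="spot", timeframe="1d")
print(resolve_path(spec))  # data/gateio/spot/1d
```

Because the three spec fields fully determine the location under this assumption, the caller never has to pass a path or a symbol list explicitly.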

In [None]:
# Create DataProvider (for research)
# Specify only OHLCVSpec - automatic path and symbol determination
p = DataProvider(
    universe=universe,
    specs={"ohlcv": ohlcv_spec},  # Pass OHLCVSpec → automatic path determination
    rebalance_freq="1d",
    mode="research",  # research mode: load all symbols + apply universe filter
)

# Load data for entire period (automatic symbol discovery + universe filter)
p.load(start=START, end=END)

print("=== Data Loaded ===")
print(f"Total symbols (N): {len(p.symbols)}")
print(f"Universe symbols (n): {p.n_universe}")
print(f"in_universe_mask shape: {p.in_universe_mask.shape if p.in_universe_mask is not None else 'N/A'}")

In [None]:
# Research: Access all data (direct pandas usage OK - notebooks only)
close = p["close"]
volume = p["volume"]

print("=== Raw Data (All N symbols) ===")
print(f"Close shape: {close.shape} (T x N)")
print(f"Volume shape: {volume.shape}")
print(f"Date range: {close.index[0]} ~ {close.index[-1]}")

# Check universe-filtered symbols
print(f"\n=== Universe Filtered (n={p.n_universe}) ===")
print("Universe symbols:")
for i, sym in enumerate(p.universe_symbols[:10], 1):
    print(f"  {i:2d}. {sym}")
if len(p.universe_symbols) > 10:
    print(f"  ... and {len(p.universe_symbols) - 10} more")

# Restrict columns to the universe symbols (column selection; in_universe_mask is not used here)
close_universe = close[p.universe_symbols]
print(f"\nFiltered close shape: {close_universe.shape} (T x n)")

In [None]:
# Research: operator usage examples
returns = operator.ts_returns(close)
ma_20 = operator.ts_mean(close, 20)
vol_20 = operator.ts_std(returns, 20)

# Calculate dollar volume (liquidity basis)
dollar_volume = operator.mul(close, volume)
avg_dv = operator.ts_mean(dollar_volume, 20)

print(f"Returns shape: {returns.shape}")
print(f"MA20 shape: {ma_20.shape}")
print(f"Avg Dollar Volume shape: {avg_dv.shape}")

In [None]:
# Check Top 30 in universe by dollar volume (last day)
# Filter to universe symbols only using in_universe_mask
mask = p.in_universe_mask
close_masked = close.where(mask)
volume_masked = volume.where(mask)

dollar_volume = operator.mul(close_masked, volume_masked)
avg_dv = operator.ts_mean(dollar_volume, 20)

last_dv = avg_dv.iloc[-1].dropna().sort_values(ascending=False)
print("=== Top 30 in Universe (by avg dollar volume) ===")
print(f"Universe symbols at last timestamp: {mask.iloc[-1].sum()}")
print()
for i, (symbol, dv) in enumerate(last_dv.head(30).items(), 1):
    print(f"  {i:2d}. {symbol:20s} ${dv:>15,.0f}")

## 3. Define Strategy (Pure operators only + DataSpec + Dynamic Universe)

In [None]:
class MultiAlphaStrategy(Strategy):
    """Multi-Alpha Strategy using only operators (Spot Long-Only).
    
    DataSpec:
    - exchange: gateio
    - market_type: spot
    - timeframe: 1d
    
    Universe:
    - Dynamic: CryptoLiquid top 30 by dollar volume
    
    Alphas:
    1. Momentum: 20-day returns ranked
    2. Mean Reversion: inverse Z-score ranked
    3. Volume: Dollar volume ratio ranked
    
    Transforms (Spot Long-Only):
    - rank → l1_norm (no demean!)
    """
    
    name = "MultiAlpha_GateIO_Spot"
    
    # DataSpec - Engine automatically determines path
    data = {
        "ohlcv": OHLCVSpec(
            exchange="gateio",
            market_type="spot",
            timeframe="1d",
        ),
    }
    
    # Dynamic Universe - no hardcoded symbols
    universe = CryptoLiquid(
        top_n=30,
        min_dollar_volume=100_000,
        lookback=20,
    )
    
    rebalance_freq = "1d"
    
    def warmup_periods(self) -> int:
        return 30
    
    def compute_signal(self):
        close = self.provider["close"]
        volume = self.provider["volume"]
        
        # Alpha 1: Momentum (20-day return ranked)
        returns_20d = operator.ts_returns(close, period=20)
        alpha_momentum = operator.rank(returns_20d)  # [0, 1]
        
        # Alpha 2: Mean Reversion (inverse Z-score ranked)
        ma_20 = operator.ts_mean(close, 20)
        std_20 = operator.ts_std(close, 20)
        zscore = operator.div(
            operator.sub(close, ma_20),
            operator.add(std_20, 1e-8)
        )
        alpha_mr = operator.rank(operator.neg(zscore))  # [0, 1]
        
        # Alpha 3: Volume Ratio
        dollar_volume = operator.mul(close, volume)
        dv_ma = operator.ts_mean(dollar_volume, 20)
        dv_ratio = operator.div(dollar_volume, operator.add(dv_ma, 1e-8))
        alpha_volume = operator.rank(dv_ratio)  # [0, 1]
        
        # Combine Alphas (all in [0, 1])
        combined = operator.ca_reduce_avg(
            alpha_momentum,
            alpha_mr,
            alpha_volume
        )
        
        # Transform: rank → l1_norm (Spot long-only, no demean!)
        signal = operator.rank(combined)
        signal = operator.l1_norm(signal)  # sum=1, positive only
        
        return signal

## 4. Run Engine-based Backtest

- `data_path` is omitted: the path is determined automatically from the `OHLCVSpec`
- The universe is also defined on the Strategy class

In [None]:
# Create strategy
strategy = MultiAlphaStrategy()

print(f"Strategy: {strategy.name}")
print(f"DataSpec: {strategy.data}")
print(f"Universe: {strategy.universe}")
print(f"Rebalance: {strategy.rebalance_freq}")

In [None]:
# Create Engine and run backtest
engine = Engine()

result = engine.run(
    strategy=strategy,
    mode="backtest",
    # data_path omitted! Automatic from OHLCVSpec
    start=START,
    end=END,
    initial_capital=10000.0,
    market_type="spot",  # Spot: long-only; negative weights raise an error
    verbose=True,
)

print("\nBacktest completed!")

In [None]:
# Analyze results
if result.metrics:
    print("Performance Metrics:")
    print(f"  Total Return: {result.metrics.total_return:.2%}")
    print(f"  Annualized Return: {result.metrics.annualized_return:.2%}")
    print(f"  Volatility: {result.metrics.volatility:.2%}")
    print(f"  Sharpe Ratio: {result.metrics.sharpe_ratio:.2f}")
    print(f"  Max Drawdown: {result.metrics.max_drawdown:.2%}")
    print(f"  Num Trades: {result.metrics.num_trades}")
    print(f"  Win Rate: {result.metrics.win_rate:.2%}")

In [None]:
# Visualize equity curve
if result.snapshots:
    import matplotlib.pyplot as plt
    
    equity = [s.equity for s in result.snapshots]
    timestamps = [s.timestamp for s in result.snapshots]
    
    fig, axes = plt.subplots(2, 1, figsize=(12, 8))
    
    # Equity curve
    axes[0].plot(timestamps, equity)
    axes[0].set_title('Equity Curve')
    axes[0].set_ylabel('Equity ($)')
    axes[0].grid(True)
    
    # Drawdown
    equity_series = pd.Series(equity, index=timestamps)
    rolling_max = equity_series.expanding().max()
    drawdown = (equity_series - rolling_max) / rolling_max * 100  # percent, to match the axis label
    
    axes[1].fill_between(timestamps, drawdown, 0, alpha=0.5, color='red')
    axes[1].set_title('Drawdown')
    axes[1].set_ylabel('Drawdown (%)')
    axes[1].grid(True)
    
    plt.tight_layout()
    plt.show()

## 5. Multi-Timeframe Strategy (1d, 3d, 5d Alphas)

Resample 1d data to 3d and 5d to combine alphas from different timeframes.

**Timeline behavior:**
- 1 tick = 1 day (system clock)
- 1d Alpha: calculated every tick
- 3d Alpha: calculated when 3d bar closes
- 5d Alpha: calculated when 5d bar closes

```
Tick | Date       | 1d Alpha | 3d Alpha | 5d Alpha
-----+------------+----------+----------+----------
   1 | 2025-10-15 |   CALC   |   CALC   |   CALC   
   2 | 2025-10-16 |   CALC   |   hold   |   hold   
   3 | 2025-10-17 |   CALC   |   hold   |   hold   
   4 | 2025-10-18 |   CALC   |   CALC   |   hold   ← 3d bar closed
   5 | 2025-10-19 |   CALC   |   hold   |   hold   
   6 | 2025-10-20 |   CALC   |   hold   |   CALC   ← 5d bar closed
```
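
The hold behavior in the table can be reproduced in plain pandas. This is a sketch of the general resample-then-forward-fill technique, not ClyptQ's implementation; the prices are made up, and the choice of `label="right"` is an assumption made so that a 3d bar becomes visible on the tick after it closes, as in the table.

```python
import pandas as pd

# Six daily closes for one symbol, matching ticks 1-6 in the table above.
days = pd.date_range("2025-10-15", periods=6, freq="D")
close_1d = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=days)

# Build 3-day bars. label="right" stamps each bar with the first day
# after it ends, i.e. the first tick on which its close is known.
close_3d = close_1d.resample("3D", label="right", closed="left").last()

# Align back to the 1d clock: hold the last completed bar between closes.
close_3d_on_1d = close_3d.reindex(days, method="ffill")
print(close_3d_on_1d)
```

In this sketch ticks 1 to 3 are NaN because no 3d bar has closed yet, and ticks 4 to 6 hold the close of the first bar (3.0). In an actual run, warmup history before the start date would presumably supply values for the first ticks.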

In [None]:
# Test multi-timeframe data access
# Resample 1d data to 3d and 5d
close_1d = p["close"]
close_3d = p["close", "3d"]  # 3-day resample
close_5d = p["close", "5d"]  # 5-day resample

print("=== Multi-Timeframe Data ===")
print(f"1d close shape: {close_1d.shape} ({len(close_1d)} bars)")
print(f"3d close shape: {close_3d.shape} ({len(close_3d)} bars)")
print(f"5d close shape: {close_5d.shape} ({len(close_5d)} bars)")

print("\n1d timestamps (first 10):")
for ts in close_1d.index[:10]:
    print(f"  {ts}")

print("\n3d timestamps (first 5):")
for ts in close_3d.index[:5]:
    print(f"  {ts}")

print("\n5d timestamps (first 5):")
for ts in close_5d.index[:5]:
    print(f"  {ts}")

In [None]:
class MultiTimeframeStrategy(Strategy):
    """Multi-Timeframe Momentum Strategy.
    
    Resample 1d data to 3d and 5d for alpha calculation.
    DataProvider automatically applies ffill to 1d index.
    
    Update frequency per timeframe:
    - 1d Alpha: calculated daily (short-term momentum)
    - 3d Alpha: new value every 3 days, hold previous otherwise
    - 5d Alpha: new value every 5 days, hold previous otherwise
    
    Final signal = weighted_avg(alpha_1d, alpha_3d, alpha_5d)
    """
    
    name = "MultiTimeframe_Momentum"
    
    data = {
        "ohlcv": OHLCVSpec(
            exchange="gateio",
            market_type="spot",
            timeframe="1d",  # Base timeframe
        ),
    }
    
    universe = CryptoLiquid(top_n=30, min_dollar_volume=100_000)
    rebalance_freq = "1d"
    
    def warmup_periods(self) -> int:
        return 30  # Sufficient warmup for 5d lookback
    
    def compute_signal(self):
        # 1d data (system clock)
        close_1d = self.provider["close"]
        
        # 3d, 5d resample (automatically ffilled to 1d index)
        close_3d = self.provider["close", "3d"]
        close_5d = self.provider["close", "5d"]
        
        # Alpha 1: 1d momentum (5-day return on daily bars)
        returns_1d = operator.ts_returns(close_1d, period=5)
        alpha_1d = operator.rank(returns_1d)  # [0, 1]
        
        # Alpha 2: 3d Momentum (already aligned to 1d index)
        returns_3d = operator.ts_returns(close_3d, period=3)
        alpha_3d = operator.rank(returns_3d)  # [0, 1]
        
        # Alpha 3: 5d Momentum (already aligned to 1d index)
        returns_5d = operator.ts_returns(close_5d, period=5)
        alpha_5d = operator.rank(returns_5d)  # [0, 1]
        
        # Combine: weighted average
        # 1d: 40%, 3d: 35%, 5d: 25%
        combined = operator.add(
            operator.add(
                operator.mul(alpha_1d, 0.4),
                operator.mul(alpha_3d, 0.35)
            ),
            operator.mul(alpha_5d, 0.25)
        )
        
        # Transform: rank → l1_norm (Spot long-only)
        signal = operator.rank(combined)
        signal = operator.l1_norm(signal)
        
        return signal

In [None]:
# Multi-Timeframe Strategy backtest
mtf_strategy = MultiTimeframeStrategy()

mtf_result = engine.run(
    strategy=mtf_strategy,
    mode="backtest",
    start=START,
    end=END,
    initial_capital=10000.0,
    market_type="spot",
    verbose=True,
)

print("\n=== Multi-Timeframe Strategy Result ===")
if mtf_result.metrics:
    print(f"Total Return: {mtf_result.metrics.total_return:.2%}")
    print(f"Annualized Return: {mtf_result.metrics.annualized_return:.2%}")
    print(f"Sharpe Ratio: {mtf_result.metrics.sharpe_ratio:.2f}")
    print(f"Max Drawdown: {mtf_result.metrics.max_drawdown:.2%}")

## 6. Futures Mode Example (Long/Short + Leverage)

In futures mode short positions are allowed, so the ranked signal can be demeaned to produce long/short weights.

In [None]:
class FuturesStrategy(Strategy):
    """Futures strategy with long/short and leverage."""
    
    name = "Momentum_Futures"
    
    data = {
        "ohlcv": OHLCVSpec(
            exchange="gateio",
            market_type="spot",  # Currently using spot data
            timeframe="1d",
        ),
    }
    
    universe = CryptoLiquid(top_n=30, min_dollar_volume=100_000)
    rebalance_freq = "1d"
    
    def warmup_periods(self) -> int:
        return 30
    
    def compute_signal(self):
        close = self.provider["close"]
        
        # Momentum strategy (long/short)
        returns_20d = operator.ts_returns(close, period=20)
        signal = operator.rank(returns_20d)
        signal = operator.demean(signal)  # Futures: long/short OK!
        signal = operator.l1_norm(signal)
        
        return signal

# Futures backtest
futures_strategy = FuturesStrategy()

futures_result = engine.run(
    strategy=futures_strategy,
    mode="backtest",
    start=START,
    end=END,
    initial_capital=10000.0,
    market_type="futures",  # Futures: short OK
    leverage=2.0,
    verbose=True,
)

print("\nFutures Result:")
if futures_result.metrics:
    print(f"  Total Return: {futures_result.metrics.total_return:.2%}")
    print(f"  Sharpe Ratio: {futures_result.metrics.sharpe_ratio:.2f}")
    print(f"  Max Drawdown: {futures_result.metrics.max_drawdown:.2%}")

# Visualize Futures equity curve
if futures_result.snapshots:
    import matplotlib.pyplot as plt
    
    equity = [s.equity for s in futures_result.snapshots]
    timestamps = [s.timestamp for s in futures_result.snapshots]
    
    fig, axes = plt.subplots(2, 1, figsize=(12, 8))
    
    # Equity curve
    axes[0].plot(timestamps, equity, color='blue')
    axes[0].set_title('Futures Strategy - Equity Curve (2x Leverage)')
    axes[0].set_ylabel('Equity ($)')
    axes[0].axhline(y=10000, color='gray', linestyle='--', alpha=0.5)
    axes[0].grid(True)
    
    # Drawdown
    equity_series = pd.Series(equity, index=timestamps)
    rolling_max = equity_series.expanding().max()
    drawdown = (equity_series - rolling_max) / rolling_max * 100  # percent, to match the axis label
    
    axes[1].fill_between(timestamps, drawdown, 0, alpha=0.5, color='red')
    axes[1].set_title('Drawdown')
    axes[1].set_ylabel('Drawdown (%)')
    axes[1].grid(True)
    
    plt.tight_layout()
    plt.show()

## 7. Summary

### Key Points

1. **DataSpec-based**: `OHLCVSpec(exchange, market_type, timeframe)` → automatic path determination
2. **Dynamic Universe**: `CryptoLiquid(top_n=30)` → `in_universe_mask` (T x N) is computed automatically in vectorized form
3. **Research mode**: `provider.load()` → load N symbols + apply universe filter
4. **Strategy**: Use pure operators only (no pandas/numpy)
5. **Multi-Timeframe**: Access resampled data via `provider["close", "3d"]` format

### Dynamic Universe Structure

```
All Symbols (N=2430)
    ↓ LiquidityFilter (min $100k)
    ↓ PriceFilter (min $0.01)
    ↓ DataAvailabilityFilter (min 20 bars)
Tradeable (M)
    ↓ Scoring (dollar_volume)
    ↓ TopN Selection (n=30)
In Universe (n=30)

# Research mode results
p.symbols           # N=2430 (all)
p.universe_symbols  # n=30 (filtered)
p.in_universe_mask  # (T x N) boolean DataFrame
```
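
The filter, score, and Top N steps above can be sketched in plain pandas. This is an illustration of the vectorized-mask idea under stated assumptions, not the ClyptQ implementation: the data is synthetic, the symbol names are made up, the price filter is omitted, and the rolling-window NaNs stand in for the data-availability filter.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
days = pd.date_range("2025-10-15", periods=40, freq="D")
symbols = [f"SYM{i}" for i in range(8)]  # made-up symbols

close = pd.DataFrame(rng.uniform(10, 100, (40, 8)), index=days, columns=symbols)
volume = pd.DataFrame(rng.uniform(1e4, 1e5, (40, 8)), index=days, columns=symbols)
volume["SYM0"] = 1.0  # one deliberately illiquid symbol

lookback, min_dv, top_n = 20, 100_000, 3

# Score: rolling mean dollar volume. It is NaN until `lookback` bars exist,
# which doubles as a data-availability filter in this sketch.
avg_dv = (close * volume).rolling(lookback).mean()

# Liquidity filter, then rank the survivors per row and keep the top N.
tradeable = avg_dv >= min_dv
rank = avg_dv.where(tradeable).rank(axis=1, ascending=False, method="first")
in_universe_mask = rank <= top_n  # (T x N) boolean DataFrame

print(in_universe_mask.sum(axis=1).tail())
```

Each row of the mask is computed independently, so membership can change from one timestamp to the next without any per-day Python loop.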

### Multi-Timeframe Timeline

```
System Clock = 1d (1 tick = 1 day)

Tick | 1d Alpha | 3d Alpha | 5d Alpha
-----+----------+----------+----------
  1  |   CALC   |   CALC   |   CALC   
  2  |   CALC   |   hold   |   hold   
  3  |   CALC   |   hold   |   hold   
  4  |   CALC   |   CALC   |   hold   ← 3d bar
  5  |   CALC   |   hold   |   hold   
  6  |   CALC   |   hold   |   CALC   ← 5d bar
```

### Spot vs Futures

| | Spot (Long-Only) | Futures (Long/Short) |
|---|---|---|
| Weight range | [0, 1] positive only | [-1, 1] negative allowed |
| Transform | `rank → l1_norm` | `rank → demean → l1_norm` |
| Short | Error | Allowed |
| Leverage | 1x fixed | 1x ~ 125x |
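
The two transform chains in the table can be compared on a toy input. The helpers below are plain-Python stand-ins written for this illustration; they are assumptions about what `operator.rank`, `operator.demean`, and `operator.l1_norm` do (based on the `[0, 1]` and `sum=1` comments in the strategies above), and the return values are made up.

```python
def rank01(values):
    """Map values to evenly spaced ranks in [0, 1] (ties not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for pos, i in enumerate(order):
        ranks[i] = pos / (len(values) - 1)
    return ranks


def demean(values):
    """Subtract the cross-sectional mean so the values sum to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def l1_norm(values):
    """Scale so that the absolute values sum to one."""
    total = sum(abs(v) for v in values)
    return [v / total for v in values]


raw = [0.05, -0.02, 0.10, 0.01]  # made-up 20-day returns for four symbols

spot = l1_norm(rank01(raw))             # long-only: weights >= 0, sum to 1
futures = l1_norm(demean(rank01(raw)))  # long/short: net 0, gross 1

print(spot)
print(futures)
```

Without the demean step every weight stays non-negative, which is what spot mode requires. With it, the lower-ranked half of the universe receives negative weights and the book is dollar-neutral before leverage is applied.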

In [None]:
print("E2E Backtest Test Complete!")