Utkal059/orderflow-backtester
▲ OrderFlow Backtester v3.0

Institutional-grade order flow backtesting, ML pipeline, and portfolio management platform.

Built for quant research interviews. Not a toy — a system with real execution modeling, walk-forward validation, and portfolio-level risk analysis.


What This Actually Does

Most backtesters are glorified spreadsheets. This one simulates what happens when you trade:

  • Slippage: half-spread + volatility impact + size impact (square-root market impact model)
  • Latency: 1-bar signal delay — your signal fires, but execution happens on the next bar
  • Partial fills: 85% probability of a full fill per bar; the rest fill only partially (50-95%)
  • Fees: configurable in basis points, tracked per-trade
  • Position sizing: fixed fraction, volatility-targeted (15% ann. vol), or quarter-Kelly

The ML pipeline uses walk-forward validation (not k-fold — that leaks future data in time series) and runs feature leakage detection on every training run.
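The expanding-window scheme can be sketched in a few lines. This is illustrative, not the repo's `validation.py` API; the fold count and minimum training fraction are assumptions:

```python
import numpy as np

def walk_forward_folds(n_bars, n_folds=5, min_train_frac=0.4):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Each fold trains on every bar strictly before its test segment,
    so no fold ever sees future data.
    """
    edges = np.linspace(int(n_bars * min_train_frac), n_bars,
                        n_folds + 1, dtype=int)
    for i in range(n_folds):
        yield np.arange(edges[i]), np.arange(edges[i], edges[i + 1])

folds = list(walk_forward_folds(1000))
# training windows grow fold by fold; test windows never overlap them
```

Contrast with k-fold: here no test index ever precedes a training index within the same fold, which is exactly the temporal guarantee k-fold cannot give.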


Architecture

Frontend (React + Vite)                    Backend (Python + FastAPI)
┌─────────────────────┐                   ┌──────────────────────────┐
│ Dashboard            │                   │ /backtest                │
│  • Single / Portfolio│◄──── api.js ─────►│ /portfolio/backtest      │
│  • Data source select│  (retry+timeout)  │ /ml/train                │
│  • Position sizing   │                   │ /ml/insights             │
│                      │                   │ /ws/logs (WebSocket)     │
│ Results              │                   ├──────────────────────────┤
│  • Equity + Drawdown │                   │ Engine                   │
│  • Trade log (paged) │                   │  ├── strategies (5)      │
│  • Portfolio corr.   │                   │  ├── execution model     │
│  • Cost analysis     │                   │  ├── position sizing     │
│                      │                   │  ├── metrics (20+)       │
│ ML / Alpha           │                   │  └── portfolio combiner  │
│  • Walk-forward table│                   ├──────────────────────────┤
│  • Leakage check     │                   │ ML Pipeline              │
│  • Model comparison  │                   │  ├── XGBoost + LR        │
│  • SHAP + importance │                   │  ├── Walk-forward (5w)   │
│  • Signal quality    │                   │  ├── Leakage detection   │
└─────────────────────┘                   │  └── SHAP (OOS only)     │
                                           ├──────────────────────────┤
                                           │ Data Sources             │
                                           │  ├── Synthetic (GARCH)   │
                                           │  └── CSV (Yahoo, custom) │
                                           └──────────────────────────┘

No Lookahead Bias — Here's How

| Component | Prevention method |
|---|---|
| Signals | Strategies process bars sequentially; each bar only sees past data |
| Execution | 1-bar latency delay between signal and fill |
| ML labels | Future returns are used as labels, but the train/test split is strictly temporal |
| ML split | 70% train → 10% val → 20% test, chronological order, no shuffling |
| Walk-forward | Expanding window: each fold trains on all prior data only |
| SHAP | Computed on the out-of-sample test set only |
| Features | All derived from rolling windows of past data |
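The chronological 70/10/20 split reduces to simple index arithmetic. A minimal sketch (the helper name is mine, not the repo's):

```python
def temporal_split(n_bars, train=0.70, val=0.10):
    """Chronological 70/10/20 split: no shuffling, so every validation
    and test bar comes strictly after every training bar."""
    i, j = round(n_bars * train), round(n_bars * (train + val))
    return range(i), range(i, j), range(j, n_bars)

tr, va, te = temporal_split(1000)
```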

Quick Start

# Backend
cd orderflow-backtester-v3
pip install -r backend/requirements.txt
uvicorn backend.main:app --reload --port 8000

# Frontend (new terminal)
cd frontend
npm install
npm run dev

Open http://localhost:5173


Strategies

| Strategy | Signal logic | Exit logic |
|---|---|---|
| `order_flow_imbalance` | Z-score of rolling OFI > ±1.5σ | Mean reversion to ±0.3σ |
| `queue_exhaustion` | Book-side depletion + intensity spike | Flow reversal or 12-bar timeout |
| `momentum_burst` | 5-bar momentum + volume spike (1.8×) | Trailing stop (vol-adjusted) |
| `mean_reversion` | Price z-score > ±2σ + tight spread | Z-score crosses ±0.5σ |
| `composite_alpha` | Weighted ensemble vote of all four | Combined signal threshold ±0.4 |
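The `order_flow_imbalance` entry/exit logic can be sketched as a small state machine. This is illustrative only, not the repo's implementation; the function name is mine and the thresholds come from the table:

```python
import numpy as np
import pandas as pd

def ofi_zscore_positions(ofi: pd.Series, window=50, entry=1.5, exit_z=0.3):
    """Fade OFI extremes: go short when z > +1.5σ, long when z < -1.5σ,
    and flatten once the z-score mean-reverts inside ±0.3σ."""
    z = (ofi - ofi.rolling(window).mean()) / ofi.rolling(window).std()
    state, out = 0.0, []
    for zt in z:
        if np.isnan(zt):                       # warm-up period: stay flat
            state = 0.0
        elif state == 0.0 and abs(zt) > entry:
            state = -float(np.sign(zt))        # fade the extreme
        elif state != 0.0 and abs(zt) < exit_z:
            state = 0.0                        # mean-reversion target hit
        out.append(state)
    return pd.Series(out, index=ofi.index)
```

Because the loop walks bars in order and uses only the rolling window, each position decision depends solely on past data, matching the no-lookahead guarantee above.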

Execution Model

Fill Price = Mid Price
           + (Spread / 2)                    ← always pay the spread
           + (Volatility × Price × 0.1)      ← vol-proportional impact
           + (Price × 0.0001 × √size_ratio)  ← square-root market impact

Each order has an 85% probability of a full fill; the remaining 15% receive partial fills of 50-95% of the requested size.
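The formula above, as a runnable sketch. The constants come straight from the model; the `side` argument (+1 buy, -1 sell) and function name are my additions:

```python
import math

def fill_price(mid, spread, volatility, size_ratio, side=+1):
    """Adverse fill price: all three cost terms push the fill
    against the trader (above mid for buys, below for sells)."""
    half_spread = spread / 2                        # always pay the spread
    vol_impact = volatility * mid * 0.1             # vol-proportional impact
    sqrt_impact = mid * 0.0001 * math.sqrt(size_ratio)  # √ market impact
    return mid + side * (half_spread + vol_impact + sqrt_impact)

# Buying at mid 100.0 with a 2-cent spread, 10 bps vol, 2x average size:
px = fill_price(mid=100.0, spread=0.02, volatility=0.001, size_ratio=2.0)
```

The square-root term captures the empirical regularity that market impact grows sublinearly in order size.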

ML Pipeline

  • Training: XGBoost (200 trees, depth 5, learning rate 0.05, regularized)
  • Validation: walk-forward with 4-5 expanding windows
  • Comparison: XGBoost vs. a logistic regression baseline
  • Features: 20 features (12 raw order flow + 8 derived: z-scores, rolling stats, composites)
  • Leakage check: flags features with |corr| > 0.5 against the target
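The leakage check is simple to sketch: any feature that correlates too strongly with the label is suspect. The function name is mine; the 0.5 threshold is from the text:

```python
import numpy as np
import pandas as pd

def flag_leaky_features(X: pd.DataFrame, y: pd.Series, threshold=0.5):
    """Return feature names whose absolute correlation with the target
    exceeds the threshold: a crude but effective leakage tripwire."""
    corr = X.corrwith(y).abs()
    return corr[corr > threshold].index.tolist()
```

A legitimate predictive feature rarely reaches |corr| > 0.5 against noisy forward returns, so a hit usually means the feature was computed with future information.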

Portfolio System

  • Equal weight: simple 1/N allocation
  • Risk parity: inverse-volatility weighting
  • Correlation matrix: computed from equity curve returns
  • Diversification ratio: weighted avg vol / portfolio vol
  • Warnings: auto-flagged when |corr| > 0.7 between assets
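The risk-parity weighting and diversification ratio above can be sketched directly from their definitions (illustrative, not the repo's `portfolio.py`):

```python
import numpy as np

def risk_parity_weights(returns: np.ndarray) -> np.ndarray:
    """Inverse-volatility weights over a (T, N) matrix of asset returns."""
    inv_vol = 1.0 / returns.std(axis=0)
    return inv_vol / inv_vol.sum()

def diversification_ratio(returns: np.ndarray, w: np.ndarray) -> float:
    """Weighted average of individual vols divided by realized portfolio
    vol; always >= 1, and higher means more diversification benefit."""
    avg_vol = w @ returns.std(axis=0)
    port_vol = (returns @ w).std()
    return avg_vol / port_vol
```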

API Reference

Endpoint Method Description
GET /health Health check
GET /strategies List strategies
GET /symbols?source=synthetic List symbols by source
GET /data-sources List data sources
POST /backtest Single-asset backtest
POST /portfolio/backtest Multi-asset portfolio
POST /ml/train Train + evaluate + walk-forward
POST /ml/insights Feature importance + SHAP
WS /ws/logs Live log streaming

Metrics (Computed, Not Mocked)

  • Performance: Total Return, Annualized Return, Sharpe, Sortino, Calmar
  • Risk: Max Drawdown, Drawdown Duration, VaR 95%, CVaR 95%, Volatility, Skewness, Kurtosis
  • Trade: Win Rate, Avg Win/Loss, Profit Factor, Hold Duration, Max Win/Loss Streaks
  • Cost: Total Fees, Total Slippage (tracked per-trade)
  • ML: OOS Accuracy, AUC-ROC, Precision, Recall, IC, ICIR, Turnover, Signal Decay
  • Portfolio: Diversification Ratio, Correlation Matrix, Per-Asset Breakdown
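A few of the core performance metrics, sketched from their standard definitions (this is illustrative, not the repo's `metrics.py`):

```python
import numpy as np

def core_metrics(daily_returns: np.ndarray, periods=252):
    """Annualized Sharpe/Sortino and max drawdown from daily returns."""
    mu, sd = daily_returns.mean(), daily_returns.std(ddof=1)
    downside = daily_returns[daily_returns < 0].std(ddof=1)
    equity = np.cumprod(1 + daily_returns)            # compounded equity curve
    drawdown = equity / np.maximum.accumulate(equity) - 1
    return {
        "sharpe": np.sqrt(periods) * mu / sd,
        "sortino": np.sqrt(periods) * mu / downside,  # penalize downside only
        "max_drawdown": drawdown.min(),
    }
```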


Project Structure

orderflow-backtester-v3/
├── backend/
│   ├── config.py                 ← centralized config
│   ├── main.py                   ← FastAPI app + WebSocket
│   ├── requirements.txt
│   ├── data/
│   │   ├── base.py               ← abstract DataSource interface
│   │   ├── generator.py          ← synthetic GARCH + order flow
│   │   └── csv_loader.py         ← CSV ingestion (Yahoo, custom)
│   ├── engine/
│   │   ├── backtest.py           ← event-driven engine
│   │   ├── execution.py          ← slippage, latency, partial fills
│   │   ├── metrics.py            ← 20+ performance metrics
│   │   ├── portfolio.py          ← multi-asset portfolio engine
│   │   ├── position.py           ← sizing: fixed, vol-target, Kelly
│   │   └── strategies.py         ← 5 strategies incl. ensemble
│   ├── ml/
│   │   ├── features.py           ← feature engineering
│   │   ├── pipeline.py           ← XGBoost + SHAP + comparison
│   │   └── validation.py         ← walk-forward + leakage detection
│   └── routes/
│       ├── backtest.py           ← /backtest + /portfolio/backtest
│       └── ml.py                 ← /ml/train + /ml/insights
└── frontend/
    ├── index.html
    ├── package.json
    ├── vite.config.js
    └── src/
        ├── main.jsx
        ├── App.jsx               ← tabs + keyboard shortcuts
        ├── api.js                ← API service (retry, timeout)
        ├── Navbar.jsx            ← status + clock + tabs
        ├── Dashboard.jsx         ← config + portfolio mode
        ├── ResultsView.jsx       ← metrics + trade log + correlation
        ├── MLInsights.jsx        ← walk-forward + leakage + SHAP
        └── hooks/
            └── useBackend.js     ← connection + clock hooks

Design Decisions

Why synthetic data? — Exchange tick data costs $10K+/year. The GARCH(1,1) generator produces realistic vol clustering and order flow correlations. CSV loader supports real data when available.
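A minimal GARCH(1,1) return generator of the kind described (the parameter values here are illustrative defaults, not the repo's):

```python
import numpy as np

def garch_returns(n, omega=1e-6, alpha=0.1, beta=0.85, seed=0):
    """GARCH(1,1): sigma2_t = omega + alpha*r_{t-1}^2 + beta*sigma2_{t-1}.
    alpha + beta < 1 keeps the process stationary; a high beta produces
    the volatility clustering seen in real markets."""
    rng = np.random.default_rng(seed)
    sigma2 = omega / (1 - alpha - beta)   # start at the unconditional variance
    out = np.empty(n)
    for t in range(n):
        out[t] = np.sqrt(sigma2) * rng.standard_normal()
        sigma2 = omega + alpha * out[t] ** 2 + beta * sigma2
    return out
```

The clustering shows up as positive autocorrelation in absolute returns: a big move today makes a big move tomorrow more likely, even though the returns themselves are serially uncorrelated.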

Why event-driven? — Vectorized backtests make it easy to apply an operation across the whole series at once and silently leak future data. Bar-by-bar processing with explicit state makes that class of bug impossible.

Why XGBoost over LSTM? — For tabular order flow features with 20 columns, gradient-boosted trees consistently outperform sequence models. SHAP provides the interpretability trading desks require. The model comparison proves this empirically on each run.

Why walk-forward over k-fold? — k-fold shuffles time series data, placing future bars in the training set. Walk-forward expanding windows guarantee the model never trains on future data. The overfit ratio per window quantifies model stability.

Why quarter-Kelly? — Full Kelly is theoretically optimal but practically catastrophic — it assumes perfect edge estimation. Quarter-Kelly provides geometric growth with ~75% lower variance of outcome.
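For the binary-outcome case, the Kelly sizing logic is a one-liner (a sketch; the function name is mine, and the repo may estimate the inputs differently):

```python
def kelly_fraction(win_rate, win_loss_ratio, scale=0.25):
    """Kelly: f* = p - (1 - p) / b, for win probability p and average
    win/loss payoff ratio b. scale=0.25 gives quarter-Kelly, trading
    some growth for a much smaller variance of outcomes."""
    f_star = win_rate - (1 - win_rate) / win_loss_ratio
    return max(0.0, scale * f_star)    # never bet on a negative edge

# 55% win rate at 1.2:1 payoff: full Kelly = 17.5%, quarter-Kelly = 4.375%
```

Clamping at zero matters in practice: a noisy edge estimate can flip f* negative, and betting on that would mean taking the other side of your own signal.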


What Makes This Different From Tutorial Projects

| Tutorial project | This project |
|---|---|
| `if price > MA: buy` | Z-score of rolling OFI with adaptive exits |
| `returns.mean() / returns.std()` | Properly annualized Sharpe from daily returns |
| `random_state=42`, `train_test_split()` | Walk-forward validation, no shuffling |
| Mock data in the frontend | All data from the backend API, error states everywhere |
| Single asset only | Portfolio with correlation + risk parity |
| No execution costs | Slippage + fees + latency + partial fills, tracked per-trade |

Technologies

  • Backend: Python 3.12, FastAPI, NumPy, pandas, XGBoost, SHAP, scikit-learn
  • Frontend: React 18, Vite, Canvas charts
  • Protocol: REST + WebSocket


Built as a quantitative research platform for prop trading interviews. Every metric is computed from actual PnL. No mock values. No silent fallbacks.
