
# Order Book Simulator — Interactive Report

**Author:** Sohan Shingade (UC San Diego — Data Science & Finance)

## Executive Overview (Non-technical)
- I built a **matching engine** that simulates how exchanges match buy/sell orders using **price–time priority**: best price wins; ties go to the earliest order.
- The engine supports **limit & market orders**, **partial fills**, **cancels**, **replaces**, and **IOC/FOK** time-in-force rules.
- I generate a synthetic, deterministic order flow (fixed seed) to analyze **spread, midprice, depth, and imbalance**, and I measure **latency** for core operations.
- This demonstrates three qualities that matter in trading systems:
  1. **Correctness** (invariants: no crossed book, FIFO),
  2. **Maintainability** (typed modules, unit + property tests, docs & CLI),
  3. **Efficiency** (heaps + deques give O(1) best-price lookup and FIFO queue ops, with O(log n) heap inserts/removals; latency benchmarks).

Use the controls below to explore the simulated market state and performance.



## 0) Setup (Install optional interactive deps if missing)

> The core project needs only `numpy`, `pandas`, `matplotlib`.  
> For interactive charts, I optionally use **plotly** and **ipywidgets**.

Run this once if needed:
```bash
pip install plotly ipywidgets
# For Jupyter Lab:
jupyter labextension list  # recent Jupyter doesn't require manual widget extensions
```


In [1]:

# Path setup: locate the repo root whether the kernel starts in the repo or in notebooks/
import sys, pathlib
CWD = pathlib.Path.cwd()
ROOT = CWD if (CWD / "orderbook").exists() else CWD.parent
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))   # lets `import orderbook` work without installing the package

RESULTS_DIR = ROOT / "results"

import numpy as np
import pandas as pd

# Interactive libs
import plotly.express as px
import plotly.graph_objects as go
from ipywidgets import interact, FloatSlider, IntSlider, VBox, HBox, Dropdown, Layout

from orderbook.sim import SimConfig, Simulator, save_artifacts
from orderbook.metrics import l1_metrics_from_snapshots, summarize_latency_ns

(RESULTS_DIR / "figures").mkdir(parents=True, exist_ok=True)
print("Writing artifacts to:", RESULTS_DIR)


Writing artifacts to: /Users/sohan/Documents/quantprep/orderbook-simulator/results


In [2]:
# --- Plotly & display setup (paste near the top) ---
from IPython.display import display
import os
import plotly, plotly.io as pio

# Renderer auto-detection is kept below for reference (commented out);
# this notebook pins the renderer explicitly after it.
# if "VSCODE_PID" in os.environ:
#     pio.renderers.default = "vscode"              # VS Code
# else:
#     try:
#         # Works in JupyterLab/Notebook with a live kernel
#         pio.renderers.default = "notebook_connected"
#     except Exception:
#         pio.renderers.default = "browser"         # Fallback: opens charts in your web browser
pio.renderers.default = "plotly_mimetype"
print("Plotly", plotly.__version__, "| renderer:", pio.renderers.default)


Plotly 6.3.0 | renderer: plotly_mimetype


## 1) Interactive parameters

In [3]:

# Sliders for quick exploration
seed_slider   = IntSlider(value=30, min=0, max=10_000, step=1, description='seed')
events_slider = IntSlider(value=150_000, min=10_000, max=400_000, step=10_000, description='events')
p_limit       = FloatSlider(value=0.65, min=0.0, max=1.0, step=0.01, description='p_limit')
p_market      = FloatSlider(value=0.20, min=0.0, max=1.0, step=0.01, description='p_market')
p_cancel      = FloatSlider(value=0.10, min=0.0, max=1.0, step=0.01, description='p_cancel')
p_replace     = FloatSlider(value=0.05, min=0.0, max=1.0, step=0.01, description='p_replace')
sigma_ticks   = FloatSlider(value=1.5, min=0.2, max=5.0, step=0.1, description='sigma_ticks')
size_mean     = FloatSlider(value=100.0, min=10.0, max=1000.0, step=10.0, description='size_mean')

VBox([seed_slider, events_slider, HBox([p_limit, p_market]), HBox([p_cancel, p_replace]), HBox([sigma_ticks, size_mean])])


VBox(children=(IntSlider(value=30, description='seed', max=10000), IntSlider(value=150000, description='events…

## 2) Run simulation (reproducible) and build interactive figures

In [4]:

def run_sim(seed, events, p_lim, p_mkt, p_can, p_rep, sig_ticks, s_mean):
    # Rescale the event-type mix so the four probabilities sum to 1
    total = p_lim + p_mkt + p_can + p_rep
    if total == 0:
        # All sliders at zero: fall back to a pure limit-order flow
        p_lim, p_mkt, p_can, p_rep = 1.0, 0.0, 0.0, 0.0
    else:
        p_lim, p_mkt, p_can, p_rep = (x / total for x in (p_lim, p_mkt, p_can, p_rep))

    cfg = SimConfig(
        seed=int(seed),
        n_events=int(events),
        tick_size=0.01,
        p_limit=p_lim,
        p_market=p_mkt,
        p_cancel=p_can,
        p_replace=p_rep,
        mid0=100.0,
        sigma_ticks=float(sig_ticks),
        drift_per_1k=0.0,
        size_mean=float(s_mean),
        size_min=10,
        p_ioc=0.05,
        p_fok=0.02,
        snapshot_every=250,
    )
    sim = Simulator(cfg)
    art = sim.run()
    paths = save_artifacts(art, str(RESULTS_DIR))
    return art, paths

def interactive_run(seed, events, p_lim, p_mkt, p_can, p_rep, sig_ticks, s_mean):
    art, paths = run_sim(seed, events, p_lim, p_mkt, p_can, p_rep, sig_ticks, s_mean)
    snaps = art.snapshots.copy()
    snaps['event'] = snaps['event'].astype(int)

    # L1 metrics
    m = l1_metrics_from_snapshots(snaps)
    df_metrics = pd.DataFrame({
        'event': snaps['event'],
        'spread': m.spread.values,
        'mid': m.mid.values,
        'bid_depth': m.bid_depth.values,
        'ask_depth': m.ask_depth.values,
        'imbalance': m.imbalance.values,
    })

    # Interactive line chart with range slider
    fig_mid = px.line(df_metrics, x='event', y='mid', title='Midprice')
    fig_mid.update_layout(xaxis_rangeslider_visible=True)

    fig_spread = px.line(df_metrics, x='event', y='spread', title='Spread (L1)')
    fig_spread.update_layout(xaxis_rangeslider_visible=True)

    fig_depths = go.Figure()
    fig_depths.add_trace(go.Scatter(x=df_metrics['event'], y=df_metrics['bid_depth'], mode='lines', name='bid_depth'))
    fig_depths.add_trace(go.Scatter(x=df_metrics['event'], y=df_metrics['ask_depth'], mode='lines', name='ask_depth'))
    fig_depths.update_layout(title='L1 Depths', xaxis_title='event', yaxis_title='shares', xaxis_rangeslider_visible=True)

    fig_imb = px.line(df_metrics, x='event', y='imbalance', title='Order Book Imbalance')
    fig_imb.update_layout(xaxis_rangeslider_visible=True)

    # Latency histogram (μs)
    us = art.latencies_ns / 1_000.0
    fig_lat = px.histogram(x=us, nbins=60, title='Operation Latency (μs)')
    fig_lat.update_layout(xaxis_title='latency (μs)', yaxis_title='count')

    # Latency summary box
    summ = summarize_latency_ns(art.latencies_ns)
    summary_md = f"""**Latency summary (ns)**  
- p50: {summ['p50_ns']:.0f}  
- p90: {summ['p90_ns']:.0f}  
- p99: {summ['p99_ns']:.0f}  
- ops/sec: {summ['ops_per_sec']:.0f}
"""

    display(fig_mid)
    display(fig_spread)
    display(fig_depths)
    display(fig_imb)
    display(fig_lat)

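    # Render the summary as Markdown so the bold text and bullets display properly
    from IPython.display import Markdown
    display(Markdown(summary_md))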
    print("Artifacts saved:", paths)

interact(
    interactive_run,
    seed=seed_slider, events=events_slider,
    p_lim=p_limit, p_mkt=p_market, p_can=p_cancel, p_rep=p_replace,
    sig_ticks=sigma_ticks, s_mean=size_mean
);


interactive(children=(IntSlider(value=30, description='seed', max=10000), IntSlider(value=150000, description=…

### Chart: Midprice
The **midprice** is the average of best bid and best ask.  
It is a common proxy for the asset's short-term "fair" value given current supply and demand.  
Its path shows how the simulated price wanders under the random order flow; the run above sets `drift_per_1k=0.0`, so no trend is built in.

### Chart: Spread
The **spread** is the difference between best ask and best bid.  
It is the cost of an immediate round trip (buy at the ask, sell at the bid) and a measure of liquidity tightness.  
Narrow spreads indicate high liquidity, while wide spreads signal thin liquidity or a one-sided book.
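
As a minimal sketch of the two definitions above, the helper below computes midprice and spread from a single top-of-book quote. The function name `l1_mid_and_spread` and the example prices are illustrative assumptions, not part of the `orderbook` package.

```python
def l1_mid_and_spread(best_bid: float, best_ask: float) -> tuple[float, float]:
    """Return (midprice, spread) for a non-crossed top of book."""
    if best_bid >= best_ask:
        # A matching engine should never leave the book crossed or locked
        raise ValueError("best bid must be strictly below best ask")
    mid = (best_bid + best_ask) / 2.0
    spread = best_ask - best_bid
    return mid, spread


TICK_SIZE = 0.01  # same tick size as the SimConfig above

mid, spread = l1_mid_and_spread(99.99, 100.01)
spread_in_ticks = round(spread / TICK_SIZE)  # integer ticks avoid float noise
```

Expressing the spread in ticks is often easier to read than the raw price difference, since floating-point subtraction rarely lands on an exact multiple of the tick size.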

### Chart: L1 Depths
The **bid depth** and **ask depth** at the top of book show how many shares are waiting at best bid and ask.  
Together they reflect immediate liquidity available for trading.  
In real markets, asymmetries here are often used as a short-term price signal.

### Chart: Order Book Imbalance
The **imbalance** compares bid vs ask depth at L1.  
Positive imbalance (>0) means more demand than supply at the top, potentially upward pressure.  
Negative imbalance (<0) indicates more supply, downward pressure.
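
One common way to express this is the normalized form `(bid_depth - ask_depth) / (bid_depth + ask_depth)`, which lies in [-1, 1]. The sketch below assumes that definition; `l1_metrics_from_snapshots` may compute imbalance differently, and `l1_imbalance` is a hypothetical helper name.

```python
def l1_imbalance(bid_depth: float, ask_depth: float) -> float:
    """Normalized L1 imbalance in [-1, 1]; 0.0 when both sides are empty."""
    total = bid_depth + ask_depth
    if total == 0:
        return 0.0  # assumption: treat an empty top of book as balanced
    return (bid_depth - ask_depth) / total


more_bids = l1_imbalance(300, 100)   # bid-heavy top of book
more_asks = l1_imbalance(100, 300)   # ask-heavy top of book
```

Normalizing by total depth keeps the metric comparable across runs with different `size_mean` settings.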

### Chart: Latency Histogram
This shows the distribution of matching engine operation latencies (in microseconds).  
Most ops are fast; the tail comes from cancels/replaces scanning within price levels.  
Efficient engines have tight distributions with low tails.
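
For readers who want to reproduce this kind of measurement, here is a rough sketch using only the standard library. It is not the project's `summarize_latency_ns`; the helper names, the inclusive-quantile method, and the definition of ops/sec as `1e9 / mean latency` are all assumptions made for illustration.

```python
import statistics
import time


def time_op_ns(fn, *args):
    """Time one call with a monotonic nanosecond clock."""
    start = time.perf_counter_ns()
    fn(*args)
    return time.perf_counter_ns() - start


def summarize_ns(latencies_ns):
    """Return p50/p90/p99 (ns) and an ops/sec estimate from per-op latencies."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile
    cuts = statistics.quantiles(latencies_ns, n=100, method="inclusive")
    mean_ns = statistics.fmean(latencies_ns)
    return {
        "p50_ns": cuts[49],
        "p90_ns": cuts[89],
        "p99_ns": cuts[98],
        # assumption: throughput estimated as the inverse of mean latency
        "ops_per_sec": 1e9 / mean_ns if mean_ns > 0 else float("inf"),
    }


# Measure a stand-in operation (sorting a tiny list) many times
samples = [time_op_ns(sorted, [3, 1, 2]) for _ in range(1_000)]
measured = summarize_ns(samples)

# Deterministic example: latencies of 100, 101, ..., 200 ns
fixed = summarize_ns([100 + i for i in range(101)])
```

Reporting p99 alongside p50 matters because the mean alone hides the tail that the histogram makes visible.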


## Notes for Interviewers (1 minute)
- Matching is strictly **price–time priority** with **FIFO** at each price level.
- The engine includes **IOC** and **FOK**, **cancel**, and **replace** (price change resets queue priority).
- Hot paths use **`dict[price] -> deque`** and **heaps** for best-price discovery with lazy cleanup.
- I validate correctness via **unit tests** and (optionally) **property-based tests**.
- The simulator + plots let me reason about **spread, depth, imbalance**, and **latency** under configurable flow.
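
To make the data-structure bullet concrete, here is a small sketch of price-time priority using `dict[price] -> deque` plus one heap per side with lazy cleanup. It is an illustration under stated assumptions, not the engine's actual code: the class `MiniBook`, integer tick prices, and the tombstone-style cancel (mark dead, remove later) are all hypothetical choices, and the real engine's cancel path may scan the price level instead.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass
class Order:
    oid: int
    side: str        # "B" (buy) or "S" (sell)
    price: int       # price in integer ticks
    qty: int
    live: bool = True


class MiniBook:
    """Illustrative book: per-price FIFO deques plus a heap of prices per side."""

    def __init__(self):
        self.levels = {"B": {}, "S": {}}   # side -> {price: deque[Order]}
        self.heaps = {"B": [], "S": []}    # bids store -price so heap[0] is the best bid
        self.orders = {}                   # oid -> Order, for O(1) cancel lookup

    def _best(self, side):
        """Return the best price holding a live order, lazily dropping stale entries."""
        heap, levels = self.heaps[side], self.levels[side]
        while heap:
            price = -heap[0] if side == "B" else heap[0]
            queue = levels.get(price)
            while queue and not queue[0].live:
                queue.popleft()            # discard cancelled/filled orders at the front
            if queue:
                return price
            levels.pop(price, None)        # level is empty: drop it and its heap entry
            heapq.heappop(heap)
        return None

    def add_limit(self, oid, side, price, qty):
        """Match against the opposite side, then rest any remainder. Returns fills."""
        fills = []
        opp = "S" if side == "B" else "B"
        while qty > 0:
            best = self._best(opp)
            crosses = best is not None and (price >= best if side == "B" else price <= best)
            if not crosses:
                break
            maker = self.levels[opp][best][0]   # oldest live order at the best price
            traded = min(qty, maker.qty)
            fills.append((maker.oid, best, traded))
            maker.qty -= traded
            qty -= traded
            if maker.qty == 0:
                maker.live = False              # removed from the deque lazily by _best
                del self.orders[maker.oid]
        if qty > 0:
            order = Order(oid, side, price, qty)
            if price not in self.levels[side]:
                self.levels[side][price] = deque()
                heapq.heappush(self.heaps[side], -price if side == "B" else price)
            self.levels[side][price].append(order)
            self.orders[oid] = order
        return fills

    def cancel(self, oid):
        """Mark an order dead in O(1); its deque slot is cleaned up lazily."""
        order = self.orders.pop(oid, None)
        if order is None:
            return False
        order.live = False
        return True


book = MiniBook()
book.add_limit(1, "S", 10001, 50)
book.add_limit(2, "S", 10001, 50)   # same price, queued behind order 1
book.add_limit(3, "S", 10002, 50)
book.cancel(1)                      # order 2 is now first in line at 10001
fills = book.add_limit(4, "B", 10002, 80)
# fills -> [(2, 10001, 50), (3, 10002, 30)]
```

The buy sweeps the better-priced level first and, within a level, the earliest live order, which is exactly the price-time rule. Lazy cleanup keeps cancels cheap at the cost of occasionally skipping dead entries when the best price is read.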
