Deterministic Multi-Language HFT Replay & Adversarial Microstructure Bench

Rust core + Zig 0.13 ITCH parser + C++20 hot kernels + Go control plane |
193 tests (163 Rust + 20 Zig + 10 Go, incl. seeded ITCH parser fuzz harness) | 40 B canonical Event ABI |
MoldUDP64 + WAL + SPSC mmap ring | Rust ≡ Zig ≡ C++ canonical L2 hash bit-identical |
6-guard fail-closed risk gate |
React/uPlot CHAOS desk with 5 live storm injectors and a 3-file run
recorder (run.yaml + events.jsonl + ticks.jsonl) — see
dashboard/, api/ and engine/.
Single-origin desk: Rust engine on :9090 → Go bridge on :8080 → React/uPlot dashboard. Left rail = CHAOS DECK + SCOREBOARD + TARGET (ZEUS-HFT). Bottom-right = pipeline stage latency (parse / apply / analytics / risk / wire) p50 + p99.
Important

Latency documents:
- docs/latency-alpha.md — α optimisation log: intrinsic share flip 25 % → 73 % (cache-bound → compute-bound), rejected alternatives kept with diagnosis
- docs/latency-cross-hw.md — HotOrderBook<256> TOTAL p50 = 22 ns reproduced 5/5 across two CPU generations (AMD Ryzen + Intel i7), zero variance on p50

Design documents:
- docs/feed-design.md — why ITCH replay alone is not enough for chaos, why the synthetic feed lives in Go, storm parameter discipline, rejected alternatives
- docs/stack-rationale.md — why 4 + 1 languages, what each layer is forbidden from doing, FFI contract, cost of the multi-language stack (honest)
- docs/reference-vs-hot.md — three independent order book implementations (BTreeMap oracle + array/slab hot path + C++ template), one canonical L2 hash, gated in CI fail-fast
| Signal | Value |
|---|---|
| HotOrderBook::apply TOTAL p50 (5/5 cross-CPU) | 22 ns |
| Cross-impl L2 hash agreement (Rust ≡ Zig ≡ C++) | 0xf54ce1b763823e87 |
| Snapshot resume | replay-from-checkpoint ≡ replay-from-scratch (e2e proven) |
| Tests | 193 (163 Rust + 20 Zig + 10 Go, incl. seeded ITCH parser fuzz harness — 4 invariants × 200k–1M iters) |
| Canonical Event ABI | 40 B, frozen, #[repr(C)] |
| Risk gate guards | 6, fail-closed, latching |
| Live storm injectors | 5 (Phantom · Cancel · Ignition · Crash · Lat-Arb) |
Multi-language pipeline for market-data replay, microstructure analytics, HFT aggression detection, and adversarial bot stress-testing under controlled storm conditions. Same input bytes produce identical state, byte-for-byte, across runs and platforms.
Warning
Deterministic substrate + adversarial harness for live trading bots.
Modeled: deterministic order flow replay, microstructure analytics, chaos pattern detection, live adversarial storms against a connected trading bot (TARGET endpoint), real-time bot PnL telemetry, audit-grade run recording.
Not modeled inside Flowlab: internal matching engine, queue position, fill probability. Connected bots own venue connectivity and their own execution path (ZEUS-HFT runs as a TARGET against this harness in production); Flowlab is the substrate they plug into, not a replacement for them.
flowlab is the deterministic data + analytics substrate an HFT research and execution stack sits on top of. Real trading bots (ZEUS-HFT today) connect as TARGET, receive the live event stream, run their own decision + venue routing, and stream back PnL telemetry that Flowlab records, stresses, and replays.
Four languages, one source of truth: Rust owns the state machine, Zig owns the parser, C++ owns the hot kernels, Go owns I/O. The cross-implementation invariants β canonical L2 hash bit-identical, 40 B Event ABI, replay-stable analytics, deterministic WAL halt on gaps β are validated in CI, not assumed.
The scope is a deliberate choice. A serious strategy stack needs a reproducible substrate before matching engine, fill model and queue tracking. Those layers are intentionally out of scope: they depend on venue-specific assumptions (NASDAQ ITCH vs CME MDP3 vs CBOE PITCH) and on proprietary information (queue position, fill probability, market impact) that must not contaminate the base. flowlab provides the primitives a coherent matching layer can be built on; it does not pretend to be that layer.
- Determinism first. Sequence-driven execution. Wall-clock time is informational; it is never an input to ordering.
- Event sourcing. All state derives from an immutable, append-only event log. Snapshots are cached projections, never sources of truth.
- Separation of I/O and computation. Go handles the world; Rust owns the truth; C++ owns speed; Zig owns specialization.
- Zero-copy across the boundary. No serialization between ingest and core. Mmap ring → Zig parser → Rust normalizer.
- Performance discipline. Pre-allocated buffers, #[repr(C)] layouts, no hidden allocation in the hot path. Correctness is proved by cross-impl hash agreement before any micro-opt lands.
The project is Rust-core: ~80% of the code (and the entire deterministic state machine, replay, WAL, risk gate, analytics) is Rust. The other languages are scoped specializations.
| Concern | Language | Scope | LOC |
|---|---|---|---|
| Truth | Rust | Event ABI, state machine, replay, WAL, analytics | ~16,200 |
| Specialization | Zig 0.13 | comptime ITCH 5.0 parser, zero-copy | ~490 |
| Speed (opt-in) | C++20 | L2 book + Welford stats behind --features native | ~530 |
| I/O + control | Go | mmap ring writer, WS ingest, control plane + CHAOS | ~2,800 |
| UI | TS/TSX | React + Vite + uPlot CHAOS desk (dashboard/) | ~1,200 |
C++ count is hand-written code only; the vendored single-header
hotpath/include/xxhash.h (~6.6 kLOC, XXH3 reference impl v0.8.3)
is excluded.
Go never participates in replay. C++ and Zig never touch the
network. The deterministic core has no runtime dependency on either
GC or syscalls beyond read / mmap.
┌──────────────────────────────────────────────────────────────────┐
│                        NON-DETERMINISTIC                         │
│  Go ingest ── WS / HTTP / file ── reconnect ── backpressure      │
└──────────────────┬───────────────────────────────────────────────┘
                   │ mmap ring (SPSC, lock-free, atomic indices)
                   ▼
┌──────────────────────────────────────────────────────────────────┐
│                        DETERMINISTIC CORE                        │
│                                                                  │
│  Zig feed-parser ── ITCH 5.0 / FIX / OUCH ── zero-copy           │
│        │                                                         │
│        │ extern "C" (40 B Event, #[repr(C)], align 8)            │
│        ▼                                                         │
│  Rust normalizer ── canonical event log ── WAL ── snapshots      │
│        │                                                         │
│        ├── replay engine (flowlab-replay)                        │
│        ├── microstructure analytics + risk gate (flowlab-flow)   │
│        ├── HFT aggression detection (flowlab-chaos)              │
│        └── state verifier (flowlab-verify)                       │
│                                                                  │
│  C++ hot path (hotpath/) ← FFI ── book, hasher, stats            │
└──────────────────────────────────────────────────────────────────┘
offset size field
0 8 ts u64 LE, nanoseconds (informational)
8 8 price u64 LE, integer ticks
16 8 qty u64 LE
24 8 order_id u64 LE
32 4 instrument_id u32 LE
36 1 event_type u8
37 1 side u8
38 2 _pad [u8; 2]
────────
40 B, align(8), #[repr(C)]
Properties:
- Little-endian on all supported targets.
- bytemuck::Pod + Zeroable on the Rust side; POD on the C++ side.
- Bit-identical across Rust / C++ / Zig; no padding ambiguity.
- Any change bumps the canonical L2 hash in flowlab-verify.
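As a concrete illustration of the layout table above, here is a sketch of what a 40 B, align(8), `#[repr(C)]` struct with those fields looks like — field names follow the offset table, but this is an illustration, not the crate's actual definition:

```rust
// Sketch of the canonical 40 B Event layout described above.
// Offsets in comments match the table; explicit trailing padding
// removes any layout ambiguity across Rust / C++ / Zig.
#[repr(C, align(8))]
#[derive(Clone, Copy, Debug)]
struct Event {
    ts: u64,            // offset 0  - nanoseconds, informational only
    price: u64,         // offset 8  - integer ticks
    qty: u64,           // offset 16
    order_id: u64,      // offset 24
    instrument_id: u32, // offset 32
    event_type: u8,     // offset 36
    side: u8,           // offset 37
    _pad: [u8; 2],      // offset 38 - explicit padding
}

fn main() {
    // The two layout invariants the FFI relies on.
    assert_eq!(std::mem::size_of::<Event>(), 40);
    assert_eq!(std::mem::align_of::<Event>(), 8);
    println!("Event: 40 B, align 8, #[repr(C)]");
}
```

Because the padding is spelled out as a field, the compiler cannot insert hidden bytes, and any field reordering or widening shows up immediately as a size-assertion failure.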
The deterministic core can be driven by a runtime binary that exposes a versioned telemetry stream over TCP. A Go bridge consumes that stream and re-broadcasts it as JSON over WebSocket to a React/uPlot dashboard.
flowlab-engine (Rust, :9090)
├── Source       : synthetic | itch
├── Pipeline     : HotOrderBook + VPIN + spread + imbalance
├── Risk gate    : CircuitBreaker (probed for latency)
├── Backpressure : bounded mpsc(8192), drop counter exposed
└── Wire         : [u32 len][u16 ver=1][bincode|json payload]
        │
        │ TCP
        ▼
api/ (Go, :8080)
  EngineClient + WS /stream
        │
        ▼
dashboard/ (React + Vite + uPlot)
  6 streaming panels + ±2σ bands + KPI sidebar
Frame schema lives in engine/src/wire.rs
(TelemetryFrame::{Header, Tick, Book, Risk, Lat, Heartbeat}). Every
Tick carries two clocks kept deliberately separate:
- event_time_ns — source-provided wall clock (ITCH ts48, Binance E)
- process_time_ns — engine CLOCK_MONOTONIC at apply
- latency_ns = process - event is computed only where the two clocks are comparable (replay yes, crypto WAN no).
# 1. start the Rust runtime
cargo run -p flowlab-engine --release -- \
--source synthetic --wire json --listen 127.0.0.1:9090 --tick-hz 50
# 2. start the Go bridge
cd api && go run ./cmd/api -addr :8080 -feed engine -engine 127.0.0.1:9090
# 3. start the dashboard
cd ../dashboard && npm install && npm run dev   # http://localhost:5173

-feed=synthetic keeps the legacy in-process Go feed for offline
dashboard hacking. The dashboard contract is unchanged across modes.
The dashboard ships in two modes, and they are not the same thing. For demos and stress runs use single-origin; for UI work use Vite dev.
| Mode | Command | Origins | When to use |
|---|---|---|---|
| Single-origin | .\run-desk.ps1 | :8080 only — Go serves dashboard/dist/ + WS + control plane | Default. Demos, recorded runs, day-to-day driving the desk. No CORS, no WS proxy fragility, Firefox does not throttle. |
| UI dev (HMR) | .\run-desk.ps1 -Dev (or npm run dev) | :5173 (Vite UI) — proxies /stream /storm /run /bot /health /status /reset to :8080 | Only when modifying React components. Hot-module reload, source maps, React DevTools. |
The Go process always owns :8080 (WS + REST + recorder). Vite never
talks to the Rust engine directly. In production there is no Node
runtime at all — the React bundle is just static files served by Go.
The dashboard is also a stress-testing console against a real external trading bot (the target). Five chaos kinds can be fired into the synthetic feed live, with full auditability: every run produces a deterministic 3-file artefact set on disk.
┌─────────────────────────────────┐
│     CHAOS DECK (left aside)     │
│  PHANTOM · CANCEL · IGNITION    │
│       CRASH · LAT-ARB           │
│                                 │
│  severity ∈ [0,1]               │
│  duration ∈ [5s, 120s]          │
└────────────┬────────────────────┘
             │ POST /storm/start
             ▼
┌─────────────────────────────────────────────────┐
│ StormController (api/server/storm.go)           │
│ + per-kind injectors in feed.go                 │
│  • PhantomLiquidity  → ±40% depth oscillation   │
│  • CancellationStorm → 4× EPS, VPIN ↑, vel ↓    │
│  • MomentumIgnition  → directional drift on imb │
│  • FlashCrash        → mid slide + spread blowout │
│  • LatencyArbProxy   → p99 tail explosion       │
└────────────┬────────────────────────────────────┘
             │ corrupted Ticks broadcast on /stream
             ▼
┌─────────────────────────────────────────────────┐
│ TARGET (external HFT bot, e.g. ZEUS-HFT)        │
│ reads its own venue feed (cTrader demo)         │
│ exposes /api/state with equity / pnl / signals  │
└────────────┬────────────────────────────────────┘
             │ polled by Recorder + BotPanel
             ▼
┌─────────────────────────────────────────────────┐
│ Recorder (api/server/recorder.go)               │
│ data/runs/<UTC-id>/                             │
│  ├─ run.yaml     desk-grade summary             │
│  ├─ events.jsonl storm + signal events          │
│  └─ ticks.jsonl  1 Hz sampled microstructure    │
└─────────────────────────────────────────────────┘
| Button | Kind | Effect on the synthetic tick |
|---|---|---|
| PHANTOM | PhantomLiquidity | Depth oscillates ±40% at ~1 Hz (visible book churn) |
| CANCEL | CancellationStorm | EPS up to 4×, VPIN climbs, trade velocity collapses |
| IGNITION | MomentumIgnition | Directional mid drift (sign = current imbalance) |
| CRASH | FlashCrash | Linear mid slide + spread blowout |
| LAT-ARB | LatencyArbProxy | P99 tail latency explodes; P50 stays cold |
Severity (0..1) and duration (5s..120s) are operator-controlled. The
storm ends deterministically; the breaker latches if regime escalates
to Crisis and gaps accumulate.
run.yaml is the desk's scoreboard. Self-contained, hand-rolled YAML
(no dep), with verdict computed from target.delta:
run_id: 2026-04-22T17-50-18Z
started_at: 2026-04-22T17:50:18Z
ended_at: 2026-04-22T17:50:31Z
duration_s: 13
tick_samples: 14
storms_fired:
- kind: FlashCrash
started_at_ms: 1776876627753
severity: 0.850
duration_ms: 3000
target_pnl_delta_eur: -2.40
target:
bot: zeus-hft
currency: EUR
start_equity: 99712.73
end_equity: 99710.33
delta: -2.40
total_trades: 4
wins: 1
losses: 3
win_rate: 0.250
verdict: TARGET_DAMAGED  # TARGET_INTACT | TARGET_DAMAGED (<-10) | TARGET_KILLED (<-100)

The example above is illustrative — delta: -2.40 would actually classify as TARGET_INTACT. Verdict thresholds (-10, -100) are calibrated against the bot's own equity scale; recorded runs on data/runs/ include all three outcomes.
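The threshold logic above can be re-derived in a few lines. This is a hypothetical re-statement of the verdict rule from target.delta (the recorder's actual implementation lives in Go and may differ in edge handling):

```rust
// Hypothetical verdict classifier using the thresholds stated above:
// delta < -100 => TARGET_KILLED, delta < -10 => TARGET_DAMAGED,
// otherwise TARGET_INTACT.
fn verdict(delta: f64) -> &'static str {
    if delta < -100.0 {
        "TARGET_KILLED"
    } else if delta < -10.0 {
        "TARGET_DAMAGED"
    } else {
        "TARGET_INTACT"
    }
}

fn main() {
    // The example run's delta of -2.40 classifies as intact.
    assert_eq!(verdict(-2.40), "TARGET_INTACT");
    assert_eq!(verdict(-50.0), "TARGET_DAMAGED");
    assert_eq!(verdict(-250.0), "TARGET_KILLED");
}
```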
events.jsonl is the audit trail (run_start, storm_start,
storm_stop, target_signal, run_stop). ticks.jsonl is a 1 Hz
sample of the full microstructure tick (mid, depth, VPIN, latency,
regime, storm_active, storm_kind) for offline analysis.
| Method | Path | Purpose |
|---|---|---|
| POST | /storm/start | Fire one of 5 storm kinds |
| POST | /storm/stop | Cancel active storm |
| GET | /storm/status | {mode, kind?, severity, expires_at_ms?} |
| POST | /run/start | Open a new recorded run directory |
| POST | /run/stop | Finalize run.yaml, close JSONL files |
| GET | /run/status | Currently active run (if any) |
| GET | /run/list | All runs on disk |
| GET | /run/{id}/yaml | Stream a finished run.yaml |
| GET | /bot/state | Reverse-proxy to TARGET's /api/state |
| GET | /bot/health | Reverse-proxy to TARGET's /api/health |
The dashboard's CHAOS DECK + SCOREBOARD + TARGET panels are thin clients over these endpoints. Every storm fired and every tick sampled during an active run lands on disk before the WebSocket frame leaves the bridge.
The bench is bot-agnostic. Any external trading bot becomes the
TARGET by exposing two HTTP endpoints on 127.0.0.1:3001 (default
β configurable via botHealthURL in api/server/server.go):
| Method | Path | Required response |
|---|---|---|
| GET | /api/health | 200 OK with any body. Used for liveness only. |
| GET | /api/state | JSON object with at least: equity (number, base currency), currency (3-letter ISO), total_trades, wins, losses. Optional: realized_pnl, unrealized_pnl, signals (array). |
Minimal /api/state body the recorder understands:
{
"bot": "my-hft-bot",
"currency": "EUR",
"equity": 99710.33,
"realized_pnl": -1.60,
"unrealized_pnl": 0.00,
"total_trades": 4,
"wins": 1,
"losses": 3
}

The connection model is pull, not push:
- The bot keeps its own venue connection (e.g. ZEUS-HFT → cTrader FIX on demo). flowlab does not route orders or feed data into the bot.
- flowlab polls /api/state every ≈1 s for the BotPanel and twice per recorded run (start + stop) for run.yaml's target.delta.
- The CHAOS storms reshape the synthetic feed on /stream, not the bot's venue feed. The bot is stressed indirectly: if it subscribes to flowlab's /stream, it sees corrupted ticks; if it trades on its own venue, the bench measures whether the bot's internal regime detection / risk gate noticed the external storm.
This pull model is intentional: it lets flowlab benchmark any
bot — Python, C++, Go, closed-source binaries — as long as it can
expose two GET endpoints. Polling is best-effort and non-blocking; if
the bot is down, the recorder still produces a valid run.yaml with
zeroed target fields and a bot_unreachable event in events.jsonl.
No serialization layer exists between ingest and core.
Go (ingest) ──[ mmap ring, SPSC, atomic indices ]──► Zig (parser)
                                                         │
                               extern "C" (40 B Event)   │
                                                         ▼
                                                  Rust (normalizer)
0 "FLOWRING" magic (8 B)
8 capacity u64 LE (power of two)
64 writeIdx u64 atomic
128 readIdx u64 atomic
192 payload capacity bytes
Single producer / single consumer. Release-store on writeIdx after
payload; acquire-load on the reader. Producer blocks on full; reader
never observes a partial batch.
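The index discipline above can be sketched in-process. This is a simplified, single-process illustration of the release/acquire protocol — the real ring is an mmap'd file shared across the Go/Rust boundary, with indices at the fixed offsets listed above:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Minimal sketch of the ring's index protocol: the producer writes the
// payload first, then publishes with a release-store of writeIdx; the
// consumer acquire-loads writeIdx before reading, so it can never
// observe a partial write.
struct SpscRing {
    buf: Vec<u8>,          // payload region, capacity is a power of two
    write_idx: AtomicU64,
    read_idx: AtomicU64,
}

impl SpscRing {
    fn new(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        Self {
            buf: vec![0; capacity],
            write_idx: AtomicU64::new(0),
            read_idx: AtomicU64::new(0),
        }
    }

    // Producer side. Returns false when full: the real producer blocks
    // here (backpressure, never silent drop).
    fn push(&mut self, byte: u8) -> bool {
        let w = self.write_idx.load(Ordering::Relaxed);
        let r = self.read_idx.load(Ordering::Acquire);
        let cap = self.buf.len();
        if (w - r) as usize == cap {
            return false; // full
        }
        self.buf[(w as usize) & (cap - 1)] = byte; // payload first...
        self.write_idx.store(w + 1, Ordering::Release); // ...then publish
        true
    }

    // Consumer side: acquire-load writeIdx, then read the payload.
    fn pop(&mut self) -> Option<u8> {
        let r = self.read_idx.load(Ordering::Relaxed);
        let w = self.write_idx.load(Ordering::Acquire);
        if r == w {
            return None; // empty
        }
        let cap = self.buf.len();
        let b = self.buf[(r as usize) & (cap - 1)];
        self.read_idx.store(r + 1, Ordering::Release);
        Some(b)
    }
}

fn main() {
    let mut ring = SpscRing::new(4);
    for b in [1u8, 2, 3, 4] { assert!(ring.push(b)); }
    assert!(!ring.push(5));            // full -> producer would block
    assert_eq!(ring.pop(), Some(1));   // FIFO order preserved
    assert!(ring.push(5));             // space freed, push succeeds
}
```

The power-of-two capacity lets the index wrap be a cheap mask instead of a modulo, matching the header's "capacity (power of two)" field.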
- MoldUDP64 (session[10] | seq BE | count BE | [u16 BE len | msg] × count)
- UDP multicast ingress with GapTracker and bounded forward buffer. count = 0 → heartbeat; count = 0xFFFF → end-of-session.
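A header decode for the framing above fits in a few lines. This is a hedged sketch of the MoldUDP64 header layout as described (10 B session, big-endian u64 sequence, big-endian u16 count); the names are illustrative and the real parser lives in the replay crate:

```rust
// Sketch of the MoldUDP64 header decode: session[10] | seq BE u64 |
// count BE u16, followed by count [u16 BE len | msg] blocks.
// count == 0 is a heartbeat, count == 0xFFFF is end-of-session.
#[derive(Debug, PartialEq)]
enum Frame<'a> {
    Heartbeat { seq: u64 },
    EndOfSession { seq: u64 },
    Messages { seq: u64, count: u16, body: &'a [u8] },
}

fn parse_mold_header(pkt: &[u8]) -> Option<Frame<'_>> {
    if pkt.len() < 20 {
        return None; // 10 session + 8 seq + 2 count
    }
    let seq = u64::from_be_bytes(pkt[10..18].try_into().ok()?);
    let count = u16::from_be_bytes(pkt[18..20].try_into().ok()?);
    Some(match count {
        0 => Frame::Heartbeat { seq },
        0xFFFF => Frame::EndOfSession { seq },
        n => Frame::Messages { seq, count: n, body: &pkt[20..] },
    })
}

fn main() {
    let mut pkt = [0u8; 20];
    pkt[..10].copy_from_slice(b"SESSION001");
    pkt[10..18].copy_from_slice(&42u64.to_be_bytes());
    // count = 0 -> heartbeat
    assert_eq!(parse_mold_header(&pkt), Some(Frame::Heartbeat { seq: 42 }));
    // count = 0xFFFF -> end-of-session
    pkt[18..20].copy_from_slice(&0xFFFFu16.to_be_bytes());
    assert_eq!(parse_mold_header(&pkt), Some(Frame::EndOfSession { seq: 42 }));
}
```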
| Component | Guarantee |
|---|---|
| WAL (replay::wal) | 64 MiB segments, length-prefixed records with CRC-32, torn-tail recovery |
| Event log | Append-only, length-prefixed, CRC'd |
| Snapshots | Content-addressed; replay resumes from nearest seq |
| Ring IPC | Backpressure, never silent drop |
| MoldUDP64 gap handling | Bounded forward buffer; deterministic gap halt |
Bit-exact replay is a test invariant: the WAL reproduces the canonical
L2 hash 0xf54ce1b763823e87 over 5000 events.
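The length-prefixed + CRC'd record discipline can be sketched as follows. This is an illustrative framing in the spirit of the guarantees above, not the actual on-disk layout of replay::wal (which may order or size fields differently); the CRC-32/IEEE is computed bitwise here for self-containment:

```rust
// Illustrative WAL-style record: [u32 LE len][u32 LE crc32][payload].
// A torn tail (truncated or corrupted record) fails either the length
// or the CRC check and is rejected, never replayed.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &b in data {
        crc ^= b as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    !crc
}

fn encode_record(payload: &[u8]) -> Vec<u8> {
    let mut rec = Vec::with_capacity(8 + payload.len());
    rec.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    rec.extend_from_slice(&crc32(payload).to_le_bytes());
    rec.extend_from_slice(payload);
    rec
}

// Returns the payload only if both length and CRC check out.
fn decode_record(rec: &[u8]) -> Option<&[u8]> {
    if rec.len() < 8 {
        return None;
    }
    let len = u32::from_le_bytes(rec[0..4].try_into().ok()?) as usize;
    let crc = u32::from_le_bytes(rec[4..8].try_into().ok()?);
    let payload = rec.get(8..8 + len)?;
    (crc32(payload) == crc).then_some(payload)
}

fn main() {
    let rec = encode_record(b"event-bytes");
    assert_eq!(decode_record(&rec), Some(&b"event-bytes"[..]));
    // Torn tail: drop the last byte -> record rejected.
    assert_eq!(decode_record(&rec[..rec.len() - 1]), None);
    // Flip a payload bit -> CRC mismatch -> rejected.
    let mut bad = rec.clone();
    let last = bad.len() - 1;
    bad[last] ^= 0x01;
    assert_eq!(decode_record(&bad), None);
}
```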
flow/src/circuit_breaker.rs is the last
line of defence before any outbound order. Every submission path MUST
call CircuitBreaker::check(&Intent) and respect the Decision.
| Guard | Trip condition |
|---|---|
| Rate limit | Token bucket (orders / sec) |
| Position cap | abs(net_pos + order_qty) > max_position |
| Daily-loss floor | cash_flow_ticks < -max_daily_loss_ticks |
| OTR ceiling | orders / max(1, trades) > max_otr (post-warmup) |
| Feed gap | gaps_within(window) >= gap_threshold |
| Manual kill | Operator latch |
Tripping latches. Recovery is explicit (reset, start_of_day).
Fail-closed, always.
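The latching discipline can be shown with a reduced guard set. This sketch keeps only two of the six guards (position cap + manual kill) to illustrate the fail-closed latch; the real CircuitBreaker's guard logic and types differ:

```rust
// Minimal sketch of the latching, fail-closed discipline: once any
// guard trips, every subsequent check is rejected until an explicit
// reset - there is no automatic recovery.
struct CircuitBreaker {
    max_position: i64,
    tripped: Option<&'static str>, // Some(reason) => latched
}

impl CircuitBreaker {
    fn check(&mut self, net_pos: i64, order_qty: i64) -> Result<(), &'static str> {
        if let Some(reason) = self.tripped {
            return Err(reason); // latched: fail-closed until reset
        }
        // Position cap guard: abs(net_pos + order_qty) > max_position.
        if (net_pos + order_qty).abs() > self.max_position {
            self.tripped = Some("position_cap");
            return Err("position_cap");
        }
        Ok(())
    }

    fn manual_kill(&mut self) {
        self.tripped = Some("manual_kill"); // operator latch
    }

    // Recovery is explicit (reset / start_of_day), never automatic.
    fn reset(&mut self) {
        self.tripped = None;
    }
}

fn main() {
    let mut cb = CircuitBreaker { max_position: 100, tripped: None };
    assert!(cb.check(90, 5).is_ok());
    assert_eq!(cb.check(90, 20), Err("position_cap")); // trips and latches
    assert_eq!(cb.check(0, 1), Err("position_cap"));   // even a safe order is rejected
    cb.reset();
    assert!(cb.check(0, 1).is_ok());
    cb.manual_kill();
    assert!(cb.check(0, 1).is_err());
}
```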
- Monotonic global sequence ID per event log.
- Compound key (channel_id, seq) for multi-feed scenarios.
- Timestamps are informational only; ordering never depends on them.
- Sequence gaps halt the engine deterministically — no silent skip.
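The gap-halt rule above reduces to a one-field tracker. This is illustrative only — the real tracker also handles the compound (channel_id, seq) key:

```rust
// Sketch of the deterministic gap halt: any jump in the global
// sequence ID halts processing with an (expected, got) report
// instead of silently skipping ahead.
struct SeqTracker {
    next_expected: u64,
}

impl SeqTracker {
    fn advance(&mut self, seq: u64) -> Result<(), (u64, u64)> {
        if seq != self.next_expected {
            return Err((self.next_expected, seq)); // deterministic halt + gap report
        }
        self.next_expected += 1;
        Ok(())
    }
}

fn main() {
    let mut t = SeqTracker { next_expected: 0 };
    assert!(t.advance(0).is_ok());
    assert!(t.advance(1).is_ok());
    // seq 3 arrives while 2 is expected -> halt, never skip
    assert_eq!(t.advance(3), Err((2, 3)));
}
```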
The engine carries two clocks and never conflates them.
| Clock | Source | Used for |
|---|---|---|
event_time_ns |
feed payload (ITCH ts48, Binance E) |
display, microstructure analytics |
process_time_ns |
engine CLOCK_MONOTONIC at apply() |
latency probes, replay diagnostics |
Neither clock affects ordering. Replay is driven by sequence ID alone,
so process_time_ns is reproducible across runs only in pure replay
mode; in live ingest it diverges by design. latency_ns = process - event is reported only where the two clocks are actually comparable
(replay yes, crypto WAN no).
Every boundary that crosses runtime domains has an explicit back-pressure rule. Silent drop is forbidden.
| Boundary | Rule |
|---|---|
| Go ingest β Rust core (mmap ring) | SPSC, producer blocks on full; reader never sees partial batch |
| MoldUDP64 ingress (forward buf) | Bounded; overflow β deterministic gap halt |
| Engine β telemetry bridge (mpsc) | Bounded mpsc(8192); drop counter exposed in TelemetryFrame::Risk |
| Bridge β WebSocket clients | Per-client buffer; slow client is dropped, not the engine |
The deterministic core is never starved by a slow consumer and never silently loses data; failures surface as halts or counters.
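The "counted, never silent" rule on the telemetry boundary can be sketched with a bounded channel. The capacity here is 2 purely to make the overflow visible (the engine uses a bounded mpsc(8192)), and the counter name is illustrative:

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

// Sketch of the bridge-side backpressure rule: try_send on a bounded
// channel, with every overflow counted instead of blocking the core
// or dropping silently.
fn count_drops(frames: u64, cap: usize) -> u64 {
    let (tx, _rx) = sync_channel::<u64>(cap);
    let mut dropped: u64 = 0; // surfaced as a drop counter in telemetry
    for f in 0..frames {
        if let Err(TrySendError::Full(_)) = tx.try_send(f) {
            dropped += 1; // loss is visible, core is never stalled
        }
    }
    dropped
}

fn main() {
    // 5 frames into capacity 2: frames 0 and 1 queue, 3 are counted as dropped.
    assert_eq!(count_drops(5, 2), 3);
    assert_eq!(count_drops(2, 2), 0);
}
```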
Every stage emits an xxh3_64 digest with domain tag FLOWLAB-L2-v1.
Rust digest ──┐
Zig digest  ──┤  MUST BE IDENTICAL (validated, CI-gated)
C++ digest  ──┘
- Rust ↔ Zig: validated. The Zig parser asserts at comptime that @sizeOf(Event) == 40 and exports flowlab_event_size(), which Rust checks on every parse_itch call. Same input bytes produce the same canonical events.
- Rust ↔ C++: validated. The canonical L2 hash uses XXH3_64bits on both sides (single-header xxhash.h v0.8.3 in C++, xxhash-rust in Rust) with the same scheme: domain seed XXH3_64bits("FLOWLAB-L2-v1", 13), 16 B per level (price_le, total_qty_le), '|' side separator, XOR-fold per level. Cross-FFI bench in bench/benches/pipeline.rs asserts byte-for-byte digest equality over 50 000 events: Rust HotOrderBook and C++ OrderBook both produce 0xf54ce1b763823e87.
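The shape of that scheme — domain tag, 16 B per level, distinct sides, XOR-fold — can be sketched with std's hasher standing in for XXH3_64bits. Because of that stand-in (and because the side handling here is simplified to a per-level tag byte rather than the '|' separator), the value produced is NOT the canonical 0xf54ce1b763823e87; only the structure of the scheme is shown:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Structure-only sketch of the L2 digest: per level, hash the domain
// tag, a side tag, price (8 B LE) and total qty (8 B LE), then
// XOR-fold the per-level hashes into the side's digest.
fn fold_side(side_tag: u8, levels: &[(u64, u64)]) -> u64 {
    levels.iter().fold(0u64, |acc, &(price, total_qty)| {
        let mut h = DefaultHasher::new();   // stand-in for seeded XXH3
        h.write(b"FLOWLAB-L2-v1");          // domain tag
        h.write(&[side_tag]);               // keep bid/ask folds distinct
        h.write(&price.to_le_bytes());      // 8 B, little-endian
        h.write(&total_qty.to_le_bytes());  // 8 B, little-endian
        acc ^ h.finish()                    // XOR-fold per level
    })
}

fn l2_digest(bids: &[(u64, u64)], asks: &[(u64, u64)]) -> u64 {
    fold_side(b'B', bids) ^ fold_side(b'A', asks)
}

fn main() {
    let bids = [(10_000u64, 5u64), (9_990, 7)];
    let asks = [(10_010u64, 3u64)];
    // Deterministic: same book state, same digest, every run.
    assert_eq!(l2_digest(&bids, &asks), l2_digest(&bids, &asks));
    // Sensitive: a qty change at any level moves the digest.
    assert_ne!(l2_digest(&bids, &asks), l2_digest(&bids, &[(10_010, 4)]));
    // XOR-fold makes the digest order-independent within a side.
    assert_eq!(fold_side(b'B', &[(1, 1), (2, 2)]), fold_side(b'B', &[(2, 2), (1, 1)]));
}
```

The XOR-fold is what lets all three implementations walk their books in whatever level order is natural for their data structure and still agree on the digest.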
Any digest mismatch within a verified pair is a hard failure: replay aborts.
| Failure | Behaviour |
|---|---|
| Corrupted event (bad magic / CRC) | Rejected at normalization, never logged |
| Sequence gap | Deterministic halt + gap report |
| Parser error | -1 return; caller handles; no partial log |
| Cross-language hash mismatch | Replay aborted; state diff emitted |
| Ring full | Producer blocks (backpressure, no drop) |
No partial state mutation is ever allowed. Atomic all-or-nothing.
Rules enforced in code review and CI.
- Pre-allocate everything in the hot path. HashMap::with_capacity, Vec::with_capacity for book levels, order lookup, event buffers, replay buffers. Zero reallocations during steady state.
- #[repr(C)] for every struct crossing the FFI. repr(align(n)) where required. Layout is part of the ABI; changing it bumps the canonical hash.
- No unaligned raw-pointer access. Parser readers use read_unaligned only where the wire format demands it, and never for shared-memory structs.
- Swap hashers only with numbers. The stdlib hasher is DoS-safe and the default. A faster hasher ships only with a bench proving the win on our workload.
- Correctness before speed, always. A change that breaks the canonical L2 hash does not land, period.
These rules are aligned with the FFI and ABI contracts above: stable layout plus pre-allocated capacity is what eliminates both layout mismatches and memory churn.
flowlab/
├── core/           Rust: event, state machine, FFI to C++
├── replay/         Rust: engine, WAL, ring, ITCH, MoldUDP64, UDP
├── flow/           Rust: microstructure analytics + risk gate
├── chaos/          Rust (+C++): HFT aggression detection
├── verify/         Rust: cross-language state hashing
├── engine/         Rust: live runtime + TCP telemetry wire (:9090)
├── hotpath/        C++20: book, matching sim, rolling stats
├── feed-parser/    Zig 0.13: comptime ITCH parsers, zero-copy
├── ingest/         Go: WS / HTTP / file ingest, mmap ring producer
├── api/            Go: control plane, CHAOS injector, recorder (:8080)
├── dashboard/      React + Vite + uPlot: CHAOS desk UI
├── bench/          Rust: criterion benchmarks
├── bin/            Built Go binaries (api server)
├── data/           Binary event logs + run artefacts (data/runs/)
├── docs/           latency-alpha.md (α optimisation log) +
│                   latency-cross-hw.md (cross-CPU reproduction)
├── run-desk.ps1    One-shot orchestrator (engine + api + dashboard)
├── .github/workflows/  CI matrix (Linux + Windows, ±native)
├── Cargo.toml      Rust workspace
├── go.work         Go workspace
├── Makefile        Unified build orchestration
└── README.md
Every top-level folder has its own README.md with the detailed
contract.
# full build (portable Rust + Zig + C++ + Go)
make all
# Rust workspace, portable only
cargo build --release
# Rust + native FFI (C++ hot path + Zig static lib)
cargo build -p flowlab-core --features native
# Zig feed parser
cd feed-parser && zig build -Doptimize=ReleaseFast
# Go services
cd ingest && go build ./...
cd api && go build ./...

Prerequisites for --features native:
- Rust ≥ 1.83 (edition 2024)
- Zig 0.13.0 on PATH
- A C++20 toolchain: MSVC 2022 Build Tools on Windows, clang++ ≥ 16 or g++ ≥ 12 elsewhere
cargo test --workspace                          # pure Rust
cargo test --workspace --features native        # + FFI
cd feed-parser && zig build test --summary all  # Zig unit tests
cd ingest && go test -race -count=1 ./...       # Go

Passing counts (verified by cargo test --workspace --features native + zig build test + go test ./...): 193 total = 163 Rust + 20 Zig + 10 Go. The Zig count includes the seeded ITCH parser fuzz harness (feed-parser/src/fuzz.zig, also gated separately in CI via zig build fuzz). Breakdown:
| Surface | Tests |
|---|---|
| flowlab-chaos (5 storm injectors + legacy + clustering) | 71 |
| flowlab-replay (unit + ring_ipc) | 38 + 2 |
| flowlab-e2e (chaos drift / e2e / fuzz / cpp_hasher_agreement / snapshot_resume) | 22 |
| flowlab-core (hot_book + snapshot + event + state) | 15 |
| flowlab-flow (circuit breaker, analytics, regime) | 8 |
| flowlab-bench (cross-impl hash gate + latency bins) | 5 |
| flowlab-engine (lib + ich_real) | 1 |
| Doctest | 1 |
| Zig feed-parser (itch.zig + main.zig) | 12 |
| Go ingest/ (mmap ring + WS feed) | 6 |
| Go api/ (regime parity vs Rust) | 4 |
FLOWLAB follows the conventional three-tier test pyramid in each language of the stack. Tests live next to what they prove, by design:
| Tier | Where | What it proves |
|---|---|---|
| Unit | #[cfg(test)] mod tests at the bottom of each Rust module (chaos/src/*.rs, flow/src/*.rs, core/src/*.rs, β¦); *_test.go next to each Go file; test "..." blocks in each .zig source |
Module-internal invariants. Access to pub(crate) and unexported items β must stay co-located. |
| Integration | <crate>/tests/*.rs (e.g. engine/tests/ich_real.rs, replay/tests/ring_ipc.rs) |
The crate's public API contract. Compiled as a separate binary; can only see pub items. |
| System | tests/ (flowlab-e2e crate β e2e/, fuzz/, chaos/) |
Cross-crate invariants: bit-exact replay, hash agreement, ring SPSC ordering, parser robustness, 10 M-event drift. The signal layer. |
cargo test --workspace collects every Rust tier in one command; tier
separation is structural, not procedural. Putting unit tests anywhere
but next to the module they cover would force pub(crate) items to
leak into the public API β the layout above is the language
convention, not loose organization.
cargo bench -p flowlab-bench                    # pipeline, replay
cargo bench -p flowlab-bench --features native  # with C++ + Zig

All benches use pre-allocated buffers and seeded synthetic streams. No I/O inside measured regions.
Median over 5 consecutive runs, two CPU generations (AMD Ryzen +
Intel i7), Windows, no kernel tuning, no isolcpus, no HUGE pages.
TOTAL p50 = 22 ns was recorded on 5/5 runs on both boxes — the
number is a property of the code, not of the host.
| Metric (STEADY mix + prefetch, 500k events) | Median | Best |
|---|---|---|
| TOTAL p50 | 22 ns | 22 ns |
| TOTAL p99 | 88 ns | 80 ns |
| TRADE p50 | 28-30 ns | 28 ns |
| TRADE p99 | 128–144 ns | 96 ns |
| Wall apply-only (500 000 events) | 17.4 ms | 16.6 ms |
The α optimisation log moved the workload from cache-bound to
compute-bound: TRADE intrinsic share of the mix p99 went from 25 % to
73 %. Two changes (trade() hot/cold split + prefetch_event on
slab + level grid), zero new unsafe, fully backwards-compatible
semantics. A third change (lane-batched apply_lanes) was tried,
proved correct via canonical L2 hash equivalence, and rejected
(+8.8 % wall vs interleaved) — diagnosis kept in the doc.
- Methodology, per-phase histograms, attribution and rejected alternatives → docs/latency-alpha.md
- Cross-hardware reproduction (5/5 runs, two CPUs, zero variance on p50) → docs/latency-cross-hw.md
| Bench | Time | Throughput |
|---|---|---|
| itch_parse/10000 | ~120–130 µs | ~2.3–2.6 GiB/s |
| itch_parse/100000 | ~1.22–1.35 ms | ~2.3 GiB/s |
| full_hot/10000 | — | ~600–650 MiB/s |
Methodology (required for any number above to be replicable):
- Reported as best observed stable run under idle system, not the mean across noisy runs. Multi-run variance is reported separately for diagnostics.
- Hardware: consumer x86-64 laptop, hybrid P/E-core CPU, AVX2.
- Power scheme: Windows Balanced, processor boost mode AGGRESSIVE (PERFBOOSTMODE = 2). No CPU pinning, no priority boost (the OS scheduler outperformed manual pinning on this rig).
- Thermal: passive idle, no sustained load before measurement.
- Harness: Criterion 0.5.1, 100 samples, 3 s warm-up + 5 s measurement.
- Compilers: rustc --release with target-cpu=native, MSVC 14.44 /O2 /Oi /arch:AVX2, Zig 0.13 ReleaseFast.
Numbers are not portable across machines or thermal states. Reproduce them on your own hardware before quoting.
.github/workflows/ci.yml runs on every push and pull request:
| Job | OS matrix | What it validates |
|---|---|---|
rust |
Ubuntu + Windows | fmt, clippy -D warnings, tests |
rust-native |
Ubuntu + Windows | Zig + C++ FFI build, native tests |
go |
Ubuntu + Windows | go vet, build, go test -race |
CI is the source of truth for cross-platform determinism.
FLOWLAB documents both. Reviewers are expected to read source, not banners.
Implemented and tested:
- Canonical 40 B Event ABI, frozen, #[repr(C)] + Zig extern struct + C++ POD layout
- BTreeMap reference orderbook + HotOrderBook<256> with slab-backed order index (Vec dense + aHash sparse, fixed seed)
- WAL: 64 MiB segments, CRC-32 per record, torn-tail recovery, bit-exact replay over 5000 events
- MoldUDP64 frame parser + GapTracker with bounded forward buffer
- SPSC lock-free mmap ring (Go writer → Rust reader) with explicit Acquire/Release fences
- ITCH 5.0 parser in both Rust and Zig with cross-impl event agreement
- Microstructure analytics: imbalance, rolling spread, VPIN, threshold-based regime classifier (Calm/Volatile/Aggressive/Crisis)
- Circuit breaker: 6 guards, fail-closed latch, 7 unit tests
- Chaos infrastructure: 5 live storm injectors (PhantomLiquidity, CancellationStorm, MomentumIgnition, FlashCrash, LatencyArbProxy), legacy quote-stuff / spoof detectors, clustering, stress-window extractor
- Three chaos integration tests: 10 M-event drift, corruption injection, multi-thread burst desync
- C++ OrderBook<MaxLevels> (flat-array L2) and Welford RollingStats header, callable through the FFI. AVX2 batch update was prototyped and rejected at engine tick cadence (~50 Hz, batch size 1, SIMD prologue dominates over scalar Welford); rationale preserved in hotpath/src/stats.cpp
- Snapshot binary format: FLSN magic, versioned, little-endian layout, hand-rolled (no schema crate dep). replay-from-checkpoint ≡ replay-from-scratch proven by tests/e2e/snapshot_resume.rs
- Windows mmap ring writer via CreateFileMapping + MapViewOfFile, byte-identical to the POSIX layout
- Chaos passive detectors wired into the engine hot loop. All 5 detectors live behind ChaosChain::default_itch() in engine/src/engine.rs; every applied event runs through process_into with a reused buffer (zero per-event alloc), emissions are broadcast as ChaosFrame on the telemetry wire
- CI cross-impl L2 hash gate enforced on every push. Job "Cross-language L2 hash agreement gate" in .github/workflows/ci.yml runs cargo test -p flowlab-bench --features native -- cross_impl_l2_hash_agreement; any Rust ↔ C++ digest drift fails the build
Partial β landed but not yet at full scope:
| Item | What's done / what's missing | Reference |
|---|---|---|
| Lab matching engine | Removed deliberately in favor of adapting external trading bots through the /bot/state adapter; see Adversarial Desk above |
api/server/bot_proxy.go |
| Control API | /health, /status, /stream, /storm/*, /run/*, /bot/* implemented; /metrics (Prometheus) and /ingest/* are not yet wired |
api/server/server.go |
These live in the connected bot, not in the Flowlab core:
- Internal matching engine (deliberately removed in favour of the TARGET adapter β connected bots route to real venues themselves)
- Direct exchange connectivity (owned by the bot; ZEUS-HFT runs FIX 4.4 dual-connection against live venues as TARGET today)
- Strategy logic and position sizing (owned by the bot)
Flowlab provides the deterministic substrate, the adversarial
harness, and the telemetry/recording layer. Live trading happens
through connected bots; the screenshot in docs/dashboard.png shows
ZEUS-HFT live with real EUR PnL streamed via /bot/state.
Proprietary β see LICENSE. All rights reserved.
For commercial licensing or usage permissions, contact
Faraone-Dev@users.noreply.github.com.