Add DataFusion + Parquet query engine design doc for one_d4 #1068

Merged
aaylward merged 7 commits into main from claude/datafusion-parquet-plan-tN00g
Feb 26, 2026
Conversation

@aaylward
Collaborator


cloudflare-workers-and-pages bot commented Feb 25, 2026

Deploying with Cloudflare Workers

✅ Deployment successful: 1d4-web at commit 7920e3b, updated Feb 26 2026, 05:05 AM (UTC)

Revise the DataFusion/Parquet design doc based on the planned chariot
integration (issue #1049). Key changes:

- Motif detection stays in Java — no Rust port of detectors
- Lichess bulk ingest is a Java CLI jar reusing one_d4 detectors
- motif_query (Rust) narrows to query engine + Parquet writer only
- Parquet schema includes 7 new Phase 9 motifs
- Architecture diagram updated to show Java-centric data flow
- New open questions on Phase 9 ordering and Java Parquet writer

https://claude.ai/code/session_011dyxSbaXZV93zBZL5SMbun
Replace the direct ChessQL→SQL compilation model with a Substrait-based
pipeline: ChessQL→Substrait Plan→{SQL, DataFusion}. Key design changes:

- SubstraitCompiler produces Substrait protobuf plans from ChessQL AST
- QueryRouter dispatches plans to SQL (via substrait-java) or DataFusion
  (via datafusion-substrait) based on feature flag or cost routing
- motif_query Rust service accepts Substrait plan bytes, not SQL strings
- Shadow mode runs both backends in parallel for migration validation
- Optional cost-based routing (boolean filters→DataFusion, sequence→SQL)
- Architecture diagram shows dual-backend query flow
- Implementation phases reordered: Substrait compiler first, then crate
- New open questions on sequence() coverage and version pinning
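
The cost-based routing rule above (boolean filters→DataFusion, sequence→SQL) can be sketched as a plan walk. `ChessQlExpr`, `Backend`, and `route` are hypothetical illustrative types, not the actual compiler API, and the real QueryRouter operates on Substrait plan bytes rather than the AST:

```rust
// Hypothetical AST fragment; the real ChessQL AST is richer.
#[derive(Debug)]
enum ChessQlExpr {
    Motif(String),                           // boolean motif filter, e.g. fork = true
    And(Box<ChessQlExpr>, Box<ChessQlExpr>), // conjunction of filters
    Sequence(Vec<ChessQlExpr>),              // ordered-move pattern
}

#[derive(Debug, PartialEq)]
enum Backend {
    DataFusion, // vectorized scans over Parquet
    Sql,        // mature sequence() support
}

/// Heuristic from the design: a sequence() anywhere in the plan
/// forces SQL; pure boolean filters go to DataFusion.
fn route(expr: &ChessQlExpr) -> Backend {
    fn has_sequence(e: &ChessQlExpr) -> bool {
        match e {
            ChessQlExpr::Sequence(_) => true,
            ChessQlExpr::And(l, r) => has_sequence(l) || has_sequence(r),
            ChessQlExpr::Motif(_) => false,
        }
    }
    if has_sequence(expr) { Backend::Sql } else { Backend::DataFusion }
}

fn main() {
    let boolean_only = ChessQlExpr::And(
        Box::new(ChessQlExpr::Motif("fork".into())),
        Box::new(ChessQlExpr::Motif("pin".into())),
    );
    assert_eq!(route(&boolean_only), Backend::DataFusion);

    let with_seq = ChessQlExpr::Sequence(vec![ChessQlExpr::Motif("sacrifice".into())]);
    assert_eq!(route(&with_seq), Backend::Sql);
}
```

The feature flag mentioned above would simply bypass `route` and pin one backend; shadow mode would ignore its answer and run both.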

https://claude.ai/code/session_011dyxSbaXZV93zBZL5SMbun
Detail how the indexer transitions from per-game SQL INSERTs to Parquet:

- Current IndexWorker writes 1 game at a time; Parquet files are immutable
- Buffered writer in motif_query: accumulate rows, flush at threshold
  (5000 rows) or interval (60s), with explicit /v1/flush endpoint
- Java IndexWorker batches per month (10-100 games) before POST
- Compaction merges small files: timer-based, file-count triggered,
  lock-file coordination with writer, target 5-25 MB steady-state
- Dual-write during migration: SQL first (authoritative), Parquet second
- File size analysis: small batches → buffer flush → compaction targets
- Tradeoff analysis: why not append, why not Delta Lake/Iceberg (yet)
- Updated open questions: compaction concurrency, read-your-writes
- Removed answered questions (buffering strategy, compaction design)
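
The two flush triggers (5,000 rows or 60 seconds) plus the explicit /v1/flush endpoint reduce to a small state holder. `RowBuffer` and its fields are illustrative, not the actual motif_query types; the Arrow row accumulation and Parquet file write are elided:

```rust
use std::time::{Duration, Instant};

// Thresholds from the design above (later revised upward).
const ROW_THRESHOLD: usize = 5_000;
const FLUSH_INTERVAL: Duration = Duration::from_secs(60);

/// Decides when a Parquet flush is due for one partition's buffer.
struct RowBuffer {
    rows: usize,
    last_flush: Instant,
}

impl RowBuffer {
    fn new() -> Self {
        Self { rows: 0, last_flush: Instant::now() }
    }

    fn push(&mut self, batch: usize) {
        self.rows += batch;
    }

    /// True when either trigger fires: row count or elapsed time.
    fn should_flush(&self) -> bool {
        self.rows >= ROW_THRESHOLD || self.last_flush.elapsed() >= FLUSH_INTERVAL
    }

    /// Also invoked directly by the explicit /v1/flush endpoint.
    fn flush(&mut self) -> usize {
        let flushed = self.rows;
        self.rows = 0;
        self.last_flush = Instant::now();
        flushed
    }
}

fn main() {
    let mut buf = RowBuffer::new();
    buf.push(4_999);
    assert!(!buf.should_flush());
    buf.push(1); // hits the 5,000-row threshold
    assert!(buf.should_flush());
    assert_eq!(buf.flush(), 5_000);
    assert!(!buf.should_flush());
}
```

Note this whole mechanism is removed in a later commit in this PR, once the design moves to periodic SQL-to-Parquet export.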

https://claude.ai/code/session_011dyxSbaXZV93zBZL5SMbun
Rework the Parquet write section with concrete throughput assumptions:

- 10K-100K games/month total (~300-3,300/day), 10-100 per player-month
- Buffer threshold raised to 10K rows (most partitions flush 1-5 times)
- Time-based flush at 5 min (safety net, not primary trigger)
- Compaction downgraded to low priority: 30-min interval, only merges
  tail files < 100 KB, skip partitions with recent writes
- Compaction scaling table: when to consider Delta Lake (500K+/month)
- File size targets: 500 KB - 2.5 MB for Chess.com, 20-80 MB for Lichess
- Storage estimates updated for 50K games/month Chess.com baseline
- Write amplification analysis for single-file-per-partition approach
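
The file-size targets are a direct product of monthly volume and compressed bytes per game row. The bytes-per-row figure below is not stated in the doc; 10-50 compressed bytes per row is the value implied by the 500 KB - 2.5 MB target at the 50K games/month Chess.com baseline:

```rust
/// Back-of-envelope check that the stated file-size targets follow
/// from the monthly volumes. One Parquet file per partition per month.
fn monthly_file_bytes(games_per_month: u64, bytes_per_row: u64) -> u64 {
    games_per_month * bytes_per_row
}

fn main() {
    // Chess.com baseline: 50K games/month at 10-50 bytes/row
    // brackets the 500 KB - 2.5 MB target.
    assert_eq!(monthly_file_bytes(50_000, 10), 500_000);   // 500 KB
    assert_eq!(monthly_file_bytes(50_000, 50), 2_500_000); // 2.5 MB
}
```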

https://claude.ai/code/session_011dyxSbaXZV93zBZL5SMbun
Major simplification of the write strategy. Instead of real-time Parquet
writes with buffering/compaction/dual-write, the indexer writes to SQL
as it does today (unchanged), and a periodic batch job exports SQL data
to Parquet for analytical queries.

Architecture changes:
- IndexWorker: no changes at all — continues writing to SQL
- ParquetExportJob: weekly/monthly cron, SELECT → write one Parquet file
  per partition — no buffering, no compaction, no small-file problem
- Lichess ingest: direct to Parquet (batch job, never touches SQL)
- game_storage_backends metadata table: tracks which backend has each
  partition's data ('sql', 'parquet', 'both')
- StorageAwareQueryRouter: checks metadata, dispatches to SQL or
  DataFusion, with time-based shortcut (current month → SQL)

Eliminated: in-memory buffer, flush thresholds, compaction.rs, /v1/flush
endpoint, dual-write logic, crash recovery, read-your-writes problem.

motif_query Rust service simplified to query engine + batch file writer.
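
The StorageAwareQueryRouter's dispatch rule can be sketched as follows. The names are hypothetical (the real router would read game_storage_backends from SQL; here the metadata lookup is a closure), and the partition key is reduced to (year, month) for brevity:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Backend { Sql, DataFusion }

/// Which store holds a partition's data, per game_storage_backends.
#[derive(Debug, Clone, Copy)]
enum Storage { Sql, Parquet, Both }

/// (year, month) partition key; the real key also includes player.
type Partition = (u32, u32);

fn route(
    partition: Partition,
    current_month: Partition,
    lookup: impl Fn(Partition) -> Option<Storage>,
) -> Backend {
    // Time-based shortcut: the current month is never exported yet.
    if partition == current_month {
        return Backend::Sql;
    }
    match lookup(partition) {
        Some(Storage::Parquet | Storage::Both) => Backend::DataFusion,
        // No metadata row, or SQL-only: use the authoritative store.
        _ => Backend::Sql,
    }
}

fn main() {
    let meta = |p: Partition| match p {
        (2025, 12) => Some(Storage::Both),
        (2026, 1) => Some(Storage::Sql),
        _ => None,
    };
    let now = (2026, 2);
    assert_eq!(route((2026, 2), now, &meta), Backend::Sql); // current month
    assert_eq!(route((2025, 12), now, &meta), Backend::DataFusion); // exported
    assert_eq!(route((2026, 1), now, &meta), Backend::Sql); // not yet exported
}
```

Defaulting unknown partitions to SQL keeps the router safe while the metadata table is being backfilled.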

https://claude.ai/code/session_011dyxSbaXZV93zBZL5SMbun
Detail how adding new motifs / fixing detectors works with the SQL-first
+ periodic Parquet export architecture:

- Schema evolution: ALTER TABLE in SQL, schema-on-read in Parquet (old
  files return NULL for new columns, treated as FALSE)
- Re-analysis pipeline: read PGN from SQL, re-run detectors, UPDATE
  motif columns in place, mark partition as parquet_stale
- Re-export: export job detects stale partitions, overwrites Parquet
- Lichess re-analysis: three options (re-ingest from dump, store PGN in
  SQL, or separate PGN Parquet table) — recommend re-ingest for now
- game_storage_backends gains parquet_stale and last_reanalyzed_at
  columns for staleness tracking
- Query router falls back to SQL for stale partitions during the window
  between re-analysis and re-export
- Timeline diagram showing the re-analysis → re-export flow
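
The schema-on-read rule (NULL for a new motif column treated as FALSE) is the one piece of this that lives on the query side. A minimal sketch, using a `HashMap` as a toy stand-in for an Arrow record-batch row; `motif_value` is an illustrative name, not an actual motif_query function:

```rust
use std::collections::HashMap;

/// Schema-on-read for motif columns: Parquet files written before a
/// motif existed lack the column entirely, so both NULL values and
/// absent columns read as false.
fn motif_value(row: &HashMap<&str, Option<bool>>, motif: &str) -> bool {
    // None covers NULL values and columns missing from old files.
    row.get(motif).copied().flatten().unwrap_or(false)
}

fn main() {
    let mut row: HashMap<&str, Option<bool>> = HashMap::new();
    row.insert("fork", Some(true));
    row.insert("greek_gift", None); // NULL read from a pre-Phase-9 file

    assert!(motif_value(&row, "fork"));
    assert!(!motif_value(&row, "greek_gift")); // NULL -> false
    assert!(!motif_value(&row, "windmill"));   // column absent -> false
}
```

Re-analysis then tightens this over time: once a partition is re-exported, its files carry real values and the NULL path stops firing for that motif.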

https://claude.ai/code/session_011dyxSbaXZV93zBZL5SMbun
@aaylward aaylward merged commit 3949b97 into main Feb 26, 2026
12 checks passed
@aaylward aaylward deleted the claude/datafusion-parquet-plan-tN00g branch February 26, 2026 18:14