feat: add parquet receipt store with DuckDB range queries #2861
Codecov Report ❌
@@ Coverage Diff @@
## main #2861 +/- ##
===========================================
+ Coverage 48.36% 57.32% +8.95%
===========================================
Files 671 2097 +1426
Lines 50621 172767 +122146
===========================================
+ Hits 24485 99032 +74547
- Misses 23988 64850 +40862
- Partials 2148 8885 +6737
for blockNum, logs := range chunk.logs {
	if blockNum < fromBlock || blockNum > toBlock {
		continue
	}
	for _, lg := range logs {
		if matchLog(lg, crit) {
			logCopy := *lg
			result = append(result, &logCopy)
		}
	}
}
Check warning
Code scanning / CodeQL
Iteration over map Warning
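The warning above concerns Go's unspecified map iteration order: ranging over `chunk.logs` directly can return matching logs in a different block order on each call. One possible way to address it, shown here as a minimal sketch rather than the PR's actual fix, is to collect the in-range block numbers, sort them, and then visit the blocks in ascending order. The `Log` type and the `collectInRange` helper are hypothetical stand-ins, and the `matchLog` filter from the snippet is omitted for brevity.

```go
package main

import (
	"fmt"
	"sort"
)

// Log is a hypothetical stand-in for the log type stored per block.
type Log struct {
	BlockNumber uint64
	Index       uint
}

// collectInRange returns copies of the logs in [fromBlock, toBlock],
// visiting blocks in ascending order so the result is deterministic.
func collectInRange(logsByBlock map[uint64][]*Log, fromBlock, toBlock uint64) []*Log {
	// Gather only the block numbers that fall inside the requested range.
	blockNums := make([]uint64, 0, len(logsByBlock))
	for blockNum := range logsByBlock {
		if blockNum >= fromBlock && blockNum <= toBlock {
			blockNums = append(blockNums, blockNum)
		}
	}
	sort.Slice(blockNums, func(i, j int) bool { return blockNums[i] < blockNums[j] })

	var result []*Log
	for _, blockNum := range blockNums {
		for _, lg := range logsByBlock[blockNum] {
			logCopy := *lg // copy so callers cannot mutate the stored log
			result = append(result, &logCopy)
		}
	}
	return result
}

func main() {
	logs := map[uint64][]*Log{
		12: {{BlockNumber: 12, Index: 0}},
		10: {{BlockNumber: 10, Index: 0}, {BlockNumber: 10, Index: 1}},
		11: {{BlockNumber: 11, Index: 0}},
		99: {{BlockNumber: 99, Index: 0}},
	}
	for _, lg := range collectInRange(logs, 10, 12) {
		fmt.Println(lg.BlockNumber, lg.Index)
	}
}
```

Sorting costs O(n log n) in the number of blocks per chunk, which is likely small next to the filtering itself; sorting the final result by block number and log index would be an alternative.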
"database/sql"
"fmt"
"path/filepath"
"runtime"
Check notice
Code scanning / CodeQL
Sensitive package import Note
go func() {
	for {
		latestVersion := s.latestVersion.Load()
		pruneBeforeBlock := latestVersion - s.config.KeepRecent
		if pruneBeforeBlock > 0 {
			pruned := s.pruneOldFiles(uint64(pruneBeforeBlock))
			if pruned > 0 && s.log != nil {
				s.log.Info(fmt.Sprintf("Pruned %d parquet file pairs older than block %d", pruned, pruneBeforeBlock))
			}
		}

		// Add jitter to avoid thundering herd
		jitter := time.Duration(float64(pruneIntervalSeconds)*0.5) * time.Second
		sleepDuration := time.Duration(pruneIntervalSeconds)*time.Second + jitter

		select {
		case <-s.pruneStop:
			return
		case <-time.After(sleepDuration):
			// Continue to next iteration
		}
	}
}()
Check notice
Code scanning / CodeQL
Spawning a Go routine Note
		}

		// Add random jitter (up to 50% of base interval) to avoid thundering herd
		jitter := time.Duration(rand.Float64()*float64(pruneIntervalSeconds)*0.5) * time.Second
Check notice
Code scanning / CodeQL
Floating point arithmetic Note
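In the flagged line, the float product is converted to `time.Duration` before it is multiplied by `time.Second`, so the jitter is truncated to a whole number of seconds. A minimal sketch of an alternative that stays in integer nanoseconds, and so also avoids the floating point arithmetic CodeQL notes, is given below. The `jitteredInterval` helper and the 60-second interval are assumptions for illustration, not code from the PR.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// jitteredInterval returns base plus a random extra delay in [0, base/2).
// Working directly in time.Duration (integer nanoseconds) keeps sub-second
// precision and needs no floating point arithmetic.
func jitteredInterval(base time.Duration) time.Duration {
	half := int64(base / 2)
	if half <= 0 {
		// rand.Int63n panics for n <= 0, so tiny or non-positive bases get no jitter.
		return base
	}
	return base + time.Duration(rand.Int63n(half))
}

func main() {
	const pruneIntervalSeconds = 60 // assumed value; the PR's constant is not shown
	base := time.Duration(pruneIntervalSeconds) * time.Second
	fmt.Println(jitteredInterval(base))
}
```

The result can be passed to `time.After` in the prune loop in place of `sleepDuration`.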
## Summary

This PR adds a parquet-based receipt storage backend with DuckDB for efficient range queries on logs, enabling fast `eth_getLogs` queries across block ranges.

- Add parquet backend option (`Backend: "parquet"` in config)
- Parquet files rotate every 500 blocks
- DuckDB queries across closed parquet files for efficient log filtering
- WAL for crash recovery of in-progress parquet files
- Pruning of old parquet files based on `KeepRecent` config
- Build tag support: use `-tags duckdb` to enable parquet backend

The parquet backend supports the new `FilterLogs` range query API introduced in #2788, enabling efficient cross-block log queries without falling back to per-receipt fetching.

## Dependencies

- Depends on #2788 (ledger cache layer)

## Test plan

- Receipt store unit tests pass (without duckdb tag)
- Parquet store tests pass with `-tags duckdb`
- Integration testing with full node using parquet backend
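To make the 500-block rotation and `KeepRecent` pruning concrete, the arithmetic could look roughly like the sketch below. It assumes files are aligned to multiples of 500 blocks and identified by their first block, which the description does not state; `fileStartBlock`, `filesForRange`, and `prunable` are hypothetical names, not functions from the PR.

```go
package main

import "fmt"

// blocksPerFile mirrors the 500-block rotation described in the summary.
const blocksPerFile = 500

// fileStartBlock returns the first block of the file that would hold block,
// assuming files are aligned to multiples of blocksPerFile (an assumption).
func fileStartBlock(block uint64) uint64 {
	return block / blocksPerFile * blocksPerFile
}

// filesForRange returns the start block of every file that could contain
// blocks in [from, to]; a range query only needs to scan these files.
func filesForRange(from, to uint64) []uint64 {
	if from > to {
		return nil
	}
	var starts []uint64
	for s := fileStartBlock(from); s <= to; s += blocksPerFile {
		starts = append(starts, s)
	}
	return starts
}

// prunable reports whether the file starting at start lies entirely
// before pruneBefore and can therefore be deleted.
func prunable(start, pruneBefore uint64) bool {
	return start+blocksPerFile <= pruneBefore
}

func main() {
	fmt.Println(filesForRange(990, 1510)) // [500 1000 1500]
	fmt.Println(prunable(500, 1000))      // true
	fmt.Println(prunable(1000, 1200))     // false
}
```

Under these assumptions, a file is pruned only once its last block is older than the retention boundary, so a partially retained file is kept whole.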