Various bugfixes by dwerner · Pull Request #7 · edgeandnode/phaser-bridge

dwerner · 2025-10-13T18:51:08Z

No description provided.

Previously, empty parquet files (0 rows but valid metadata) were treated as valid data coverage. This caused the scanner to report ranges as complete when they actually contained no data.

Changes:
- Check total row count in read_block_range_from_parquet
- Return None for parquet files with 0 rows
- Add support for .empty marker files in find_missing_ranges
- Add support for .empty marker files in has_completed_segment
- Simplify filename parsing to only support new format

This allows distinguishing between "checked but empty" ranges (using 0-byte .empty files) and actual data files.
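The coverage rule described above can be sketched roughly as follows. This is not the code from the PR: the `Coverage` and `Segment` types and the `missing_ranges` function are hypothetical stand-ins, and the sketch assumes the scanner has already collected each file's inclusive block range and row count.

```rust
/// What the scanner found on disk for a block range (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Coverage {
    /// A parquet file with the given total row count.
    Parquet { rows: u64 },
    /// A 0-byte `.empty` marker: the range was checked and holds no data.
    EmptyMarker,
}

/// An inclusive block range together with what covers it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Segment {
    start: u64,
    end: u64,
    coverage: Coverage,
}

/// A 0-row parquet file no longer counts as coverage; an `.empty` marker does.
fn counts_as_covered(segment: &Segment) -> bool {
    match segment.coverage {
        Coverage::Parquet { rows } => rows > 0,
        Coverage::EmptyMarker => true,
    }
}

/// Returns the inclusive sub-ranges of `from..=to` that no valid segment covers.
fn missing_ranges(from: u64, to: u64, segments: &[Segment]) -> Vec<(u64, u64)> {
    if from > to {
        return Vec::new();
    }
    let mut covered: Vec<(u64, u64)> = segments
        .iter()
        .filter(|s| counts_as_covered(s))
        .map(|s| (s.start, s.end))
        .collect();
    covered.sort();

    let mut missing = Vec::new();
    let mut next = from; // first block not yet known to be covered
    for (start, end) in covered {
        if end < next {
            continue; // entirely before the point we have reached
        }
        if start > to {
            break; // entirely after the requested range
        }
        if start > next {
            missing.push((next, start - 1)); // gap before this segment
        }
        match end.checked_add(1) {
            Some(n) if n <= to => next = n,
            _ => return missing, // coverage reaches the end of the range
        }
    }
    missing.push((next, to));
    missing
}
```

With segments 0-99 (10 rows), 100-199 (0 rows) and 200-299 (`.empty` marker), only 100-199 is reported as missing within 0-299, because the 0-row parquet file is ignored while the marker counts as checked.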

When a sync range contains no data, write a 0-byte .empty marker file instead of a 3.2KB empty parquet file. This makes it clearer that the range was checked but contained no data.

Changes:
- Add write_empty_marker() function to create .empty files
- Replace all write_empty_range() calls with write_empty_marker()
- Update all three sync functions (blocks, transactions, logs)

The .empty files are recognized by the data scanner and prevent re-syncing of empty ranges.
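A minimal sketch of writing and recognizing such a marker, using only the standard library. The function names, the signature, and in particular the marker filename layout are assumptions made for illustration; the PR text does not state the actual naming scheme.

```rust
use std::fs::File;
use std::io;
use std::path::{Path, PathBuf};

/// Creates a 0-byte marker for a checked-but-empty range.
/// The `{type}_from_{start}_to_{end}.empty` name is a hypothetical layout.
fn write_empty_marker_sketch(
    dir: &Path,
    data_type: &str,
    start: u64,
    end: u64,
) -> io::Result<PathBuf> {
    let path = dir.join(format!("{}_from_{}_to_{}.empty", data_type, start, end));
    // File::create makes the file (or truncates an existing one) and writes nothing.
    File::create(&path)?;
    Ok(path)
}

/// True if the path carries the `.empty` extension a scanner could look for.
fn is_empty_marker(path: &Path) -> bool {
    path.extension().and_then(|ext| ext.to_str()) == Some("empty")
}
```

Because the marker has no contents, a scanner can classify it from the filename alone without opening it or parsing parquet metadata.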

Add detailed logging to track data flow from Erigon's BlockDataBackend through the bridge to phaser-query.

Changes:
- Log each batch received from BlockDataBackend with count
- Log stream completion with total batch count
- Change log level from Debug to Info for visibility
- Add batch counting for blocks, transactions, and logs streams

This helps diagnose issues where streams complete without sending data.
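The batch counting could look roughly like the sketch below. It is a simplification under stated assumptions: a plain iterator of `Vec<u64>` stands in for the real stream from BlockDataBackend, `eprintln!` stands in for whatever logging framework the bridge uses, and `drain_with_counts` is a hypothetical name.

```rust
/// Consumes every batch, logging each one, and returns
/// (number of batches, total number of items).
fn drain_with_counts<I>(stream_name: &str, batches: I) -> (u64, u64)
where
    I: IntoIterator<Item = Vec<u64>>,
{
    let mut batch_count: u64 = 0;
    let mut item_count: u64 = 0;
    for batch in batches {
        batch_count += 1;
        item_count += batch.len() as u64;
        eprintln!(
            "INFO {}: received batch #{} with {} items",
            stream_name,
            batch_count,
            batch.len()
        );
    }
    if batch_count == 0 {
        // The case the added logging is meant to surface.
        eprintln!("INFO {}: stream completed without sending data", stream_name);
    } else {
        eprintln!(
            "INFO {}: stream completed after {} batches ({} items)",
            stream_name, batch_count, item_count
        );
    }
    (batch_count, item_count)
}
```

Returning the counts as well as logging them lets a caller decide how to react when a stream finishes with zero batches.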

Distinguish between live streaming and historical sync with separate filename patterns and an is_live flag.

Changes:
- Add is_live flag to ParquetWriter
- Add with_config_and_mode constructor
- Use 'live_{type}_from_{start}_{timestamp}.tmp' pattern for live files
- Use '{type}_from_{start}_{timestamp}.parquet.tmp' for historical files
- Add write_empty_range method for marking empty ranges

This allows different handling of live vs historical data and makes it easier to identify the source of parquet files.
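The two temporary filename patterns can be illustrated with a small helper. The patterns themselves are taken from the commit message; the helper name, its signature, and the representation of the timestamp as a plain integer are assumptions, not the ParquetWriter API.

```rust
/// Builds a temporary filename for a live or historical writer.
/// `timestamp` is assumed to be an integer; the real format is not stated.
fn temp_file_name(is_live: bool, data_type: &str, start: u64, timestamp: u64) -> String {
    if is_live {
        // live_{type}_from_{start}_{timestamp}.tmp
        format!("live_{}_from_{}_{}.tmp", data_type, start, timestamp)
    } else {
        // {type}_from_{start}_{timestamp}.parquet.tmp
        format!("{}_from_{}_{}.parquet.tmp", data_type, start, timestamp)
    }
}
```

The `live_` prefix makes the origin of a file visible from its name alone, which is the stated goal of the change.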

Previously, progress tracking relied on in-memory worker state which could become stale or inaccurate. Now we scan the actual parquet files on disk to determine what's truly completed.

Changes:
- Use DataScanner to analyze sync range progress
- Calculate blocks_synced from complete segments on disk
- Find max_completed_block from actual complete segments
- Remove in-memory aggregation of worker progress

This provides accurate progress even after restarts and handles cases where workers report completion but files aren't written.
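One way to derive these two values from on-disk segments is sketched below. It assumes the scan yields non-overlapping, inclusive `(start, end)` block ranges for complete segments, and it interprets max_completed_block as the highest end block among them; both points, along with the `Progress` type and function name, are assumptions rather than DataScanner's actual behavior.

```rust
/// Progress for one sync range, derived only from segments found on disk.
#[derive(Debug, PartialEq, Eq)]
struct Progress {
    blocks_synced: u64,
    /// None when no complete segment overlaps the range.
    max_completed_block: Option<u64>,
}

/// `complete` holds inclusive (start, end) ranges of complete segments,
/// assumed not to overlap one another.
fn progress_from_segments(from: u64, to: u64, complete: &[(u64, u64)]) -> Progress {
    let mut blocks_synced: u64 = 0;
    let mut max_completed_block: Option<u64> = None;
    for &(start, end) in complete {
        // Clip each segment to the sync range before counting it.
        let s = start.max(from);
        let e = end.min(to);
        if s > e {
            continue; // segment lies outside the range
        }
        blocks_synced += e - s + 1;
        max_completed_block = Some(max_completed_block.map_or(e, |m| m.max(e)));
    }
    Progress { blocks_synced, max_completed_block }
}
```

Since the inputs come from files rather than worker state, the result is the same before and after a restart, and a worker that reported completion without writing a file contributes nothing.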

dwerner added 7 commits October 9, 2025 13:25

remove commit script

7236641

Format whitespace in jsonrpc-bridge capabilities

4f09f1f

dwerner merged commit 6b44e93 into main Oct 13, 2025

dwerner deleted the various-bugfixes branch October 13, 2025 18:51

