Conversation
Codecov Report ❌
Patch coverage is … Additional details and impacted files:
@@ Coverage Diff @@
## main #3222 +/- ##
==========================================
+ Coverage 59.00% 59.03% +0.02%
==========================================
Files 2065 2066 +1
Lines 169362 169617 +255
==========================================
+ Hits 99931 100128 +197
- Misses 60671 60703 +32
- Partials 8760 8786 +26
func (r *Reader) fileForBlock(blockNumber uint64) string {
	r.mu.RLock()
	defer r.mu.RUnlock()
fileForBlock() acquires r.mu.RLock() to read closedReceiptFiles, but it does not hold pruneMu. getReceiptByTxHashFromFiles acquires pruneMu.RLock() separately.
Between these two locks, a concurrent prune could delete the file that fileForBlock just returned, and the query would then fail on a missing file. The existing GetReceiptByTxHash avoids this by acquiring pruneMu first and then snapshotting the files; GetReceiptByTxHashInBlock should follow the same pattern.
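A minimal sketch of that ordering, assuming fileForBlock and getReceiptByTxHashFromFilesLocked keep their current signatures (the empty-string miss check is an assumption about fileForBlock's contract):

```go
// Sketch only: hold pruneMu across the whole lookup so a concurrent prune
// cannot delete a file between resolving it and reading it.
func (r *Reader) GetReceiptByTxHashInBlock(ctx context.Context, txHash common.Hash, blockNumber uint64) (*ReceiptResult, error) {
	r.pruneMu.RLock()
	defer r.pruneMu.RUnlock()

	// fileForBlock still takes r.mu internally; assumed to return "" on a miss.
	if file := r.fileForBlock(blockNumber); file != "" {
		if res, err := r.getReceiptByTxHashFromFilesLocked(ctx, txHash, []string{file}); err == nil && res != nil {
			return res, nil
		}
	}
	// Fall back to a full scan over all closed files, still under pruneMu.
	return r.getReceiptByTxHashFromFilesLocked(ctx, txHash, nil)
}
```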
	return err
}

if s.txHashIndex != nil {
The index update happens after WriteReceipts here. If a crash happens in between, the parquet data could exist while the index entry doesn't, and WAL replay doesn't seem to reindex these receipts either.
Was this discussed on a call? I'm also curious about how we handle a crash here.
+1 Can we make the index update part of the same durable/recoverable flow, or otherwise rebuild missing index entries on startup from closed parquet files?
We do WAL crash recovery here
// should contain blockNumber, falling back to a full scan on miss.
func (r *Reader) GetReceiptByTxHashInBlock(ctx context.Context, txHash common.Hash, blockNumber uint64) (*ReceiptResult, error) {
"falling back to a full scan on miss."
Could this be a potential DoS attack vector? If somebody is sending requests for transactions that don't exist, will that cause us to do lots of full scans?
yeah, I think we should add limits at the RPC layer, similar to what we do for eth_getLogs today
func (r *Reader) getReceiptByTxHashFromFiles(ctx context.Context, txHash common.Hash, files []string) (*ReceiptResult, error) {
	r.pruneMu.RLock()
	defer r.pruneMu.RUnlock()
	return r.getReceiptByTxHashFromFilesLocked(ctx, txHash, files)
}
Every call site I can find passes nil for the list of files. Is the files parameter actually needed?
cody-littley left a comment:
LLM review left the following comments:
Bug Report: Branch vs Main
Branch commits: 5 commits (tx hash index for parquet receipt store + pruning race fixes)
Scope: 21 changed files in sei-db/
Clean areas: parquet/reader.go, parquet/store.go, config changes, and test changes — no bugs found.
Bug 1 (High): Iterator leak in PruneBefore
File: sei-db/ledger_db/receipt/tx_hash_index.go, lines 146–190
The Pebble iterator opened at line 146 has no defer close. The batch gets a deferred Close() (line 155), but the iterator is only closed explicitly at line 188 on the happy path. There are four early-return error paths (lines 169, 173, 178, 185) that all leak it.
A leaked Pebble iterator pins its memtable/sstable snapshot, blocking compaction and growing memory. Since the pruner retries on a timer, repeated transient I/O errors would accumulate leaked iterators.
Fix: Add defer iter.Close() right after the NewIter nil-error check at line 152, mirroring the batch pattern.
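A sketch of the suggested fix, assuming the pruner's iterator and batch are opened roughly as described above (bounds and field names are illustrative):

```go
// Open the iterator over the reverse-index range to prune.
iter, err := p.db.NewIter(&pebble.IterOptions{LowerBound: lower, UpperBound: upper})
if err != nil {
	return err
}
// Close on every return path, not only the happy path, so transient errors
// retried by the pruner timer don't accumulate pinned snapshots.
defer iter.Close()

batch := p.db.NewBatch()
defer batch.Close()
```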
Bug 2 (High): Resource leak when replayWAL() fails during construction
File: sei-db/ledger_db/receipt/parquet_store.go, lines 70–72
If the tx-hash index backend is Pebble, the constructor opens the Pebble DB (line 51) and starts the pruner goroutine (line 63) before calling replayWAL() at line 70. If replayWAL() fails, the error return leaks all three resources:
- store (parquet.Store) — open DuckDB connection, parquet writers, its own prune goroutine
- idx (PebbleTxHashIndex) — open Pebble database with file locks
- pruner goroutine — running in the background, holding a reference to the index
Note the contrast with the error paths inside the switch (lines 53, 66) which correctly call store.Close(). The Pebble file lock prevents re-opening the index on retry.
Fix: Call wrapper.Close() before returning the error (it already handles all three teardowns).
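A sketch of the suggested error path, assuming the constructor has already assembled the wrapper and its Close() tears down the store, the Pebble index, and the pruner (names are illustrative):

```go
// If WAL replay fails after the store, index, and pruner are already live,
// reuse the wrapper's own teardown instead of leaking all three.
if err := wrapper.replayWAL(); err != nil {
	_ = wrapper.Close()
	return nil, fmt.Errorf("replay receipt WAL: %w", err)
}
return wrapper, nil
```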
Bug 3 (Medium): IndexBlock overwrite leaves a stale reverse-index entry
File: sei-db/ledger_db/receipt/tx_hash_index.go, lines 119–139
When the same tx hash is indexed at a new block number, IndexBlock overwrites the primary key (h + txHash → newBlock) and writes a new reverse key (b + newBlock + txHash), but never deletes the old reverse key (b + oldBlock + txHash).
Scenario:
1. IndexBlock(100, [A]) → writes h+A → 100 and b+100+A → []
2. IndexBlock(200, [A]) → writes h+A → 200 and b+200+A → [] — but b+100+A remains
3. PruneBefore(150) → scans [b+0, b+150), finds b+100+A, deletes h+A
4. GetBlockNumber(A) → returns (0, false, nil) even though the receipt lives at block 200
The receipt is still in parquet, so the query falls back to a full DuckDB scan — no data loss, but the index silently degrades to full-scan performance for every overwritten-then-pruned hash.
Fix: In IndexBlock, read the existing value of h + txHash before overwriting. If it exists and differs, delete the old b + oldBlock + txHash entry in the same batch.
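A sketch of that change inside IndexBlock, assuming 8-byte big-endian block values and hypothetical primaryKey/reverseKey helpers for the h+txHash and b+block+txHash layouts described above:

```go
for _, txHash := range txHashes {
	pk := primaryKey(txHash) // hypothetical helper: "h" || txHash
	// Look up any existing mapping so its reverse entry can be dropped.
	old, closer, err := i.db.Get(pk)
	switch {
	case err == nil:
		oldBlock := binary.BigEndian.Uint64(old)
		_ = closer.Close()
		if oldBlock != blockNumber {
			// Delete the stale reverse key in the same batch, so a later
			// PruneBefore can no longer remove the fresh h+txHash mapping.
			if err := batch.Delete(reverseKey(oldBlock, txHash), nil); err != nil {
				return err
			}
		}
	case errors.Is(err, pebble.ErrNotFound):
		// First time this hash is indexed; nothing to clean up.
	default:
		return err
	}

	var val [8]byte
	binary.BigEndian.PutUint64(val[:], blockNumber)
	if err := batch.Set(pk, val[:], nil); err != nil {
		return err
	}
	if err := batch.Set(reverseKey(blockNumber, txHash), nil, nil); err != nil {
		return err
	}
}
```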
Bug 4 (Medium): txHashIndexPruner.Stop() panics on double call
File: sei-db/ledger_db/receipt/tx_hash_index.go, lines 252–254
Stop() calls close(p.stopCh) with no sync.Once guard. A second call panics with "close of closed channel". This is reachable because parquetReceiptStore.Close() checks s.indexPruner != nil but never nils it out after stopping. A deferred cleanup + explicit Close() (a normal Go pattern) would crash the node.
Contrast with PebbleTxHashIndex.Close() at line 199, which correctly uses closeOnce.
Fix: Add a sync.Once to Stop(), consistent with the rest of the file.
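A sketch of the guard, assuming the pruner struct can grow a sync.Once field alongside its existing stopCh:

```go
type txHashIndexPruner struct {
	stopCh   chan struct{}
	stopOnce sync.Once
	// ...other fields unchanged
}

// Stop is safe to call more than once; only the first call closes stopCh.
func (p *txHashIndexPruner) Stop() {
	p.stopOnce.Do(func() {
		close(p.stopCh)
	})
}
```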
Bug 5 (Low): Pruner starts before WAL replay, creating a race window
File: sei-db/ledger_db/receipt/parquet_store.go, lines 63 and 70
The pruner goroutine is started at line 63 and immediately executes a prune cycle (the sleep comes after the prune in the loop body). WAL replay doesn't start until line 70. If the WAL replays index entries for blocks that the pruner concurrently decides are pruneable, the two goroutines race on the same Pebble keys.
In practice the window is narrow (the pruner targets old blocks while the WAL replays recent ones), but it's an unnecessary race that's trivially avoided.
Fix: Move pruner.Start() to after replayWAL() returns successfully.
Describe your changes and provide context
Create a tx index to store (tx hash -> block number) to allow parquet to quickly serve getReceiptByHash queries.
Testing performed to validate your change
unit tests