feat(sync): decode ERC-20/ERC-721 transfers + drop orphan logs #45
satyakwok wants to merge 6 commits
…ursor

Backfill went 38 → 268 b/s (7x) on the live mainnet catch-up by combining three changes:

1. Multi-endpoint RPC pool. ChainProvider + RestClient now accept comma-separated URLs and round-robin requests via an atomic counter. Lets the indexer hit fullnodes directly (bypassing the public Caddy edge + its per-IP rate limit) while spreading load across N nodes.
2. REST_URL env split. Direct fullnode access needs `/rpc` for JSON-RPC but root for native REST. The Caddy edge papered over this with a path rewrite; the public-facing single URL doesn't work once we bypass it. REST_URL falls back to RPC_URL when unset (no behaviour change for single-endpoint deployments).
3. Batched writes. Backfill now buffers up to INDEXER_BACKFILL_BATCH (100) bundles and flushes them as one transaction with multi-row INSERTs (`insert_batch` helpers added to blocks/transactions/logs). One commit per batch instead of one per block.
4. Monotonic cursor. The tail loop's `ingest_one` path advances the cursor per-block; running in parallel with the batched backfill, its slower commits clobbered the batched writer's higher cursor values, causing visible regression of 100k+ blocks in seconds. write_cursor now uses `GREATEST(_meta.value::int8, EXCLUDED.value::int8)` at the SQL level so the on-disk cursor is monotonic regardless of which writer commits last.

Also: INDEXER_BACKFILL_BATCH env (1..1000, default 100) for tuning.
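The round-robin pool and the REST_URL fallback from items 1 and 2 could look roughly like the sketch below. This is not the PR's code: `EndpointPool`, `from_csv`, `next_url`, and `resolve_rest_url` are hypothetical names, and the real `ChainProvider` / `RestClient` may be structured differently. It only illustrates the described technique, a comma-separated URL list indexed by an atomic counter.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for the endpoint pool; not the PR's actual type.
pub struct EndpointPool {
    urls: Vec<String>,
    next: AtomicUsize,
}

impl EndpointPool {
    /// Parse a comma-separated URL list, trimming whitespace and
    /// skipping empty entries. Returns None if no URL survives.
    pub fn from_csv(csv: &str) -> Option<Self> {
        let urls: Vec<String> = csv
            .split(',')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
            .collect();
        if urls.is_empty() {
            None
        } else {
            Some(Self { urls, next: AtomicUsize::new(0) })
        }
    }

    /// Return the next endpoint in round-robin order. The atomic counter
    /// lets many concurrent callers share the pool without a lock.
    pub fn next_url(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed);
        &self.urls[i % self.urls.len()]
    }
}

/// REST_URL falls back to RPC_URL when unset (assumed to be a plain
/// string fallback; the PR does not show the exact logic).
pub fn resolve_rest_url(rest_url: Option<&str>, rpc_url: &str) -> String {
    rest_url.unwrap_or(rpc_url).to_string()
}
```

With a single URL the pool always returns that URL, which matches the stated "no behaviour change for single-endpoint deployments". `Ordering::Relaxed` is enough here because the counter only spreads load and does not guard any other memory.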
Two related fixes that surfaced together during testnet recovery on 2026-05-20.

## ERC-20 / ERC-721 Transfer decoder

The `token_transfers` table existed but was never populated: the `indexer-handlers` crate is still a Phase-0 placeholder, so raw logs landed in `logs` and went nowhere. Scan UIs that asked for an address's token balances saw an empty list even though the underlying Transfer events were present.

Inline decoder in `sync::token_decode`:

- topic0 `0xddf252ad…` matches the canonical Transfer signature.
- ERC-20: topics = [sig, from, to], 32-byte data carries `amount`.
- ERC-721: topics = [sig, from, to, token_id], no data, amount = 1.
- ERC-1155 is out of scope here (different topic0, richer encoding).

Decoded transfers flow through `BlockBundle.token_transfers` and the existing block-writer transaction, so they commit atomically with the block, txs, and logs the chain already wrote.

False-positive risk: any contract emitting a Transfer-shaped event with three indexed args will decode as ERC-721 here. Acceptable for the visibility goal; precise registry-driven classification is the declarative-handler workstream's job.

## Orphan-log filter

Sentrix's native `/chain/blocks/<n>` and `eth_getLogs` can disagree: a tx whose effects reverted gets stripped from the block tx vec, but its log envelopes still come back from `eth_getLogs`. The `logs.tx_hash → transactions.hash` FK then blew the whole batch transaction and the backfill loop stalled at the first such block.

Drop logs whose tx_hash isn't backed by a tx row in the same bundle (both `fetch_one` + `ingest_one` paths). Such logs are orphans on this chain by definition; preserving them would only buy us repeated FK-violation rollbacks.

## Schema note

`token_transfers` has no unique constraint on (tx_hash, log_index), so the batch insert skips ON CONFLICT. The writer's atomic cursor advance prevents re-processing the same block in the steady state; reorg recovery deletes downstream rows before re-insert. Adding the unique index is a follow-up migration.
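The two fixes can be sketched together as a small pipeline: first drop logs with no matching transaction, then try to decode what is left. The sketch below is an illustration under assumptions, not the code in `sync::token_decode`. The `Log`, `Standard`, and `TokenTransfer` types, the hex-string field encoding, and both function names are hypothetical. The topic0 constant is the standard keccak256 hash of `Transfer(address,address,uint256)`.

```rust
use std::collections::HashSet;

/// keccak256("Transfer(address,address,uint256)"), shared by ERC-20 and ERC-721.
pub const TRANSFER_TOPIC0: &str =
    "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";

/// Hypothetical log row; fields are 0x-prefixed hex strings.
#[derive(Debug, Clone, PartialEq)]
pub struct Log {
    pub tx_hash: String,
    pub address: String,
    pub topics: Vec<String>,
    pub data: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Standard { Erc20, Erc721 }

#[derive(Debug, Clone, PartialEq)]
pub struct TokenTransfer {
    pub standard: Standard,
    pub token: String,
    pub from: String,
    pub to: String,
    pub amount: String, // hex word for ERC-20, "0x1" for ERC-721
    pub token_id: Option<String>,
}

/// Keep only logs whose tx_hash has a transaction in the same bundle,
/// so the logs -> transactions foreign key cannot fail on insert.
pub fn drop_orphan_logs(tx_hashes: &[String], logs: Vec<Log>) -> Vec<Log> {
    let known: HashSet<&str> = tx_hashes.iter().map(String::as_str).collect();
    logs.into_iter()
        .filter(|l| known.contains(l.tx_hash.as_str()))
        .collect()
}

/// Indexed addresses are left-padded to 32 bytes; keep the low 20 bytes.
fn topic_to_address(topic: &str) -> Option<String> {
    let hex = topic.strip_prefix("0x")?;
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(format!("0x{}", &hex[24..]))
}

/// Classify by shape: 3 topics + one 32-byte data word is ERC-20,
/// 4 topics + empty data is ERC-721. Anything else is ignored.
pub fn decode_transfer(log: &Log) -> Option<TokenTransfer> {
    if !log.topics.first()?.eq_ignore_ascii_case(TRANSFER_TOPIC0) {
        return None;
    }
    let data = log.data.strip_prefix("0x").unwrap_or(&log.data);
    let (standard, amount, token_id) = match (log.topics.len(), data.len()) {
        (3, 64) => (Standard::Erc20, format!("0x{}", data), None),
        (4, 0) => (Standard::Erc721, "0x1".to_string(), Some(log.topics[3].clone())),
        _ => return None,
    };
    Some(TokenTransfer {
        standard,
        token: log.address.clone(),
        from: topic_to_address(&log.topics[1])?,
        to: topic_to_address(&log.topics[2])?,
        amount,
        token_id,
    })
}
```

Because classification is purely shape-based, this sketch has the same false-positive behaviour the description calls out: any event with the Transfer topic0 and three indexed arguments is reported as ERC-721.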
Closing in favour of a clean branch off main without the unrelated WIP commits; see follow-up PR.
Caution: Review failed. Pull request was closed or merged during review.

Walkthrough

This PR implements multi-endpoint load-balancing for RPC and REST clients, adds inline ERC-20/721 Transfer event decoding, introduces batch insert helpers for atomic database writes, and refactors the backfill pipeline to buffer and flush blocks in batches. Configuration now supports an optional

Sequence Diagram(s)

sequenceDiagram
participant BackfillStream
participant Accumulator
participant batch_write_blocks
participant PgPool
BackfillStream->>Accumulator: BlockBundle `#1` (fetch_one)
BackfillStream->>Accumulator: BlockBundle `#2`
Accumulator->>Accumulator: buffer_len = 2
BackfillStream->>Accumulator: BlockBundle `#3` (batch_size=100)
Accumulator-->>Accumulator: continue...
BackfillStream->>Accumulator: BlockBundle `#100` (reaches threshold)
Accumulator->>batch_write_blocks: flush bundles [1..100]
batch_write_blocks->>PgPool: BEGIN
batch_write_blocks->>PgPool: insert_batch blocks (chunked)
batch_write_blocks->>PgPool: insert_batch transactions (chunked)
batch_write_blocks->>PgPool: insert_batch logs (chunked)
batch_write_blocks->>PgPool: insert_batch token_transfers (chunked)
batch_write_blocks->>PgPool: write_cursor GREATEST(existing, max_height)
batch_write_blocks->>PgPool: COMMIT
batch_write_blocks-->>Accumulator: reset buffer
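
The buffer-then-flush flow in the diagram, together with the monotonic cursor rule, can be sketched in memory as follows. This is an assumption-laden illustration and not the PR's `batch_write_blocks`: `Accumulator`, `flush`, and `advance_cursor` are hypothetical names, `BlockBundle` is reduced to a height, and the database transaction is replaced by a pure function. Clamping the batch size to 1..1000 is also an assumption based on the documented range for INDEXER_BACKFILL_BATCH.

```rust
/// Hypothetical, trimmed-down stand-in for the PR's `BlockBundle`.
#[derive(Debug, Clone)]
pub struct BlockBundle {
    pub height: i64,
}

/// Mirrors `GREATEST(existing, new)`: the stored cursor never moves back,
/// no matter which writer commits last.
pub fn advance_cursor(stored: i64, candidate: i64) -> i64 {
    stored.max(candidate)
}

pub struct Accumulator {
    batch_size: usize,
    buffer: Vec<BlockBundle>,
}

impl Accumulator {
    pub fn new(batch_size: usize) -> Self {
        // Assumed clamp to the documented 1..1000 range.
        let batch_size = batch_size.clamp(1, 1000);
        Self { batch_size, buffer: Vec::with_capacity(batch_size) }
    }

    /// Buffer one bundle; hand back a full batch once the threshold is hit,
    /// leaving an empty buffer behind.
    pub fn push(&mut self, bundle: BlockBundle) -> Option<Vec<BlockBundle>> {
        self.buffer.push(bundle);
        if self.buffer.len() >= self.batch_size {
            Some(std::mem::take(&mut self.buffer))
        } else {
            None
        }
    }

    /// Drain whatever is left when the backfill stream ends.
    pub fn finish(self) -> Vec<BlockBundle> {
        self.buffer
    }
}

/// Stand-in for one database transaction. Real code would issue BEGIN,
/// the multi-row INSERTs, the cursor upsert, and COMMIT here.
/// Returns the cursor value after the commit.
pub fn flush(batch: &[BlockBundle], stored_cursor: i64) -> i64 {
    let max_height = batch
        .iter()
        .map(|b| b.height)
        .max()
        .unwrap_or(stored_cursor);
    advance_cursor(stored_cursor, max_height)
}
```

A slower per-block writer that commits a lower height after a batch has landed leaves the cursor unchanged, which is the regression the `GREATEST` upsert is meant to prevent.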
Estimated code review effort: 4 (Complex) | ~60 minutes