feat(sync): decode ERC-20/ERC-721 transfers + drop orphan logs #45

Closed
satyakwok wants to merge 6 commits into main from fix/erc20-transfer-handler

@satyakwok commented May 20, 2026

Summary

Two related fixes that surfaced together during testnet recovery on 2026-05-20.

ERC-20 / ERC-721 Transfer decoder

The `token_transfers` table existed but was never populated — the `indexer-handlers` crate is still a Phase-0 placeholder, so raw logs landed in `logs` and went nowhere. Scan UIs asking for an address's token balances saw an empty list even though the underlying Transfer events were present.

Inline decoder in `sync::token_decode`:

  • topic0 `0xddf252ad…` matches the canonical Transfer signature.
  • ERC-20: topics = [sig, from, to], 32-byte data carries `amount`.
  • ERC-721: topics = [sig, from, to, token_id], no data, amount = 1.
  • ERC-1155 is out of scope here (different topic0, richer encoding).

Decoded transfers flow through `BlockBundle.token_transfers` and commit atomically with the block / txs / logs the chain already wrote.
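
The classification rules above can be sketched roughly as follows. This is an illustration, not the code in `sync::token_decode`: the `TokenTransfer` and `TokenKind` types, the `decode_transfer` signature, and the choice to reject address topics with non-zero padding are all assumptions. The caller passes in the full canonical Transfer topic0 (abbreviated as `0xddf252ad…` above) rather than the sketch hard-coding it.

```rust
/// Token standard inferred from the shape of a Transfer log (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum TokenKind {
    Erc20,
    Erc721,
}

/// One decoded transfer row (hypothetical shape, not the real schema).
#[derive(Debug, PartialEq, Eq)]
pub struct TokenTransfer {
    pub kind: TokenKind,
    pub from: [u8; 20],
    pub to: [u8; 20],
    /// Big-endian uint256. ERC-20: copied from log data. ERC-721: always 1.
    pub amount: [u8; 32],
    /// Present only for ERC-721.
    pub token_id: Option<[u8; 32]>,
}

/// An indexed address is left-padded to 32 bytes. Assumption: a topic with
/// non-zero padding is treated as malformed and skipped.
fn topic_to_address(topic: &[u8; 32]) -> Option<[u8; 20]> {
    if topic[..12].iter().any(|b| *b != 0) {
        return None;
    }
    let mut addr = [0u8; 20];
    addr.copy_from_slice(&topic[12..]);
    Some(addr)
}

/// Returns `None` for anything that is not an ERC-20 or ERC-721 shaped Transfer.
pub fn decode_transfer(
    transfer_sig: &[u8; 32],
    topics: &[[u8; 32]],
    data: &[u8],
) -> Option<TokenTransfer> {
    // topic0 must equal the Transfer signature hash supplied by the caller.
    if topics.first()? != transfer_sig {
        return None;
    }
    match (topics.len(), data.len()) {
        // ERC-20: [sig, from, to] and a 32-byte amount in data.
        (3, 32) => {
            let mut amount = [0u8; 32];
            amount.copy_from_slice(data);
            Some(TokenTransfer {
                kind: TokenKind::Erc20,
                from: topic_to_address(&topics[1])?,
                to: topic_to_address(&topics[2])?,
                amount,
                token_id: None,
            })
        }
        // ERC-721: [sig, from, to, token_id], empty data, amount fixed at 1.
        (4, 0) => {
            let mut one = [0u8; 32];
            one[31] = 1;
            Some(TokenTransfer {
                kind: TokenKind::Erc721,
                from: topic_to_address(&topics[1])?,
                to: topic_to_address(&topics[2])?,
                amount: one,
                token_id: Some(topics[3]),
            })
        }
        _ => None,
    }
}
```

Matching on `(topics.len(), data.len())` keeps the two shapes mutually exclusive, which is also why any other contract emitting a three-indexed-arg Transfer lands in the ERC-721 arm.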

Orphan-log filter

Sentrix's native `/chain/blocks/<n>` endpoint and `eth_getLogs` can disagree: a tx whose effects reverted is stripped from the block's tx vec, but its log envelopes still come back from `eth_getLogs`. The resulting `logs.tx_hash → transactions.hash` FK violation then aborted the whole batch transaction, and the backfill loop stalled at the first such block.

Drop logs whose tx_hash isn't backed by a tx row in the same bundle (both `fetch_one` + `ingest_one` paths). Such logs are orphans by definition; preserving them only buys us repeated FK-violation rollbacks.
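
A minimal sketch of that filter, assuming simplified `Tx` and `Log` structs and a hypothetical `drop_orphan_logs` helper (the real bundle types in `fetch_one` / `ingest_one` are not shown in this PR description):

```rust
use std::collections::HashSet;

/// Simplified stand-ins for the bundle's row types (hypothetical).
pub struct Tx {
    pub hash: [u8; 32],
}

pub struct Log {
    pub tx_hash: [u8; 32],
    pub log_index: u32,
}

/// Keeps only logs whose `tx_hash` matches a tx in the same bundle and
/// returns how many logs were dropped, so the caller can log or count them.
pub fn drop_orphan_logs(txs: &[Tx], logs: &mut Vec<Log>) -> usize {
    // Collect the hashes that will actually get a row in `transactions`.
    let known: HashSet<[u8; 32]> = txs.iter().map(|t| t.hash).collect();
    let before = logs.len();
    // Any log pointing at an unknown hash would violate the FK on insert.
    logs.retain(|l| known.contains(&l.tx_hash));
    before - logs.len()
}
```

Returning the dropped count is a design choice of this sketch; it makes it cheap to track how often the two endpoints disagree.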

Test plan

  • `cargo test -p indexer-sync token_decode` (4/4 passing — ERC-20, ERC-721, non-Transfer skip, malformed topic skip)
  • Deployed on the testnet host; backfilled 22k blocks past the stuck point; sUSDC bridge transfers correctly populate `token_transfers` (5 mints + 1 burn captured)
  • Steady-state tail loop processes new transfers within ~15-block safe-lag

Follow-ups

  • `token_transfers` has no unique constraint on `(tx_hash, log_index)`; batch insert skips ON CONFLICT. Reorg + retry safety is currently provided by the atomic cursor advance. Worth a migration when the handler framework formalises.
  • False-positive risk: any contract emitting a Transfer-shaped event with three indexed args decodes as ERC-721. Precise registry-driven classification stays the declarative-handler workstream's job.

Summary by CodeRabbit

Release Notes

  • New Features

    • Added multi-URL load-balancing support for RPC and REST providers with round-robin request distribution
    • Implemented token transfer event decoding and tracking
    • Configurable REST endpoint base URL via environment variable
  • Improvements

    • Enhanced database write performance through batched insert operations
    • Metrics endpoint now segregated to isolated listener for security
  • Tests

    • Added comprehensive test coverage for token transfer decoding and multi-URL provider handling

satyakwok added 6 commits May 14, 2026 20:51
…ursor

Backfill went 38 → 268 b/s (7x) on the live mainnet catch-up by combining
four changes:

1. Multi-endpoint RPC pool. ChainProvider + RestClient now accept comma-
   separated URLs and round-robin requests via an atomic counter. Lets the
   indexer hit fullnodes directly (bypassing the public Caddy edge + its
   per-IP rate limit) while spreading load across N nodes.

2. REST_URL env split. Direct fullnode access needs `/rpc` for JSON-RPC
   but root for native REST. The Caddy edge papered over this with a path
   rewrite; the public-facing single URL doesn't work once we bypass it.
   REST_URL falls back to RPC_URL when unset (no behaviour change for
   single-endpoint deployments).

3. Batched writes. Backfill now buffers up to INDEXER_BACKFILL_BATCH (100)
   bundles and flushes them as one transaction with multi-row INSERTs
   (`insert_batch` helpers added to blocks/transactions/logs). One commit
   per batch instead of one per block.

4. Monotonic cursor. The tail loop's `ingest_one` path advances the cursor
   per-block; running in parallel with the batched backfill, its slower
   commits clobbered the batched writer's higher cursor values, causing
   visible regression of 100k+ blocks in seconds. write_cursor now uses
   `GREATEST(_meta.value::int8, EXCLUDED.value::int8)` at the SQL level so
   the on-disk cursor is monotonic regardless of which writer commits last.
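
For illustration, the round-robin selection in item 1 could look something like the sketch below. `EndpointPool`, `parse`, and `pick` are hypothetical names; the actual ChainProvider and RestClient internals are not shown in this commit message.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// A pool of endpoint URLs with an atomic round-robin counter (hypothetical).
pub struct EndpointPool {
    urls: Vec<String>,
    next: AtomicUsize,
}

impl EndpointPool {
    /// Parses a comma-separated list, ignoring whitespace and empty entries.
    /// Returns `None` when no usable URL remains.
    pub fn parse(raw: &str) -> Option<Self> {
        let urls: Vec<String> = raw
            .split(',')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
            .collect();
        if urls.is_empty() {
            return None;
        }
        Some(Self { urls, next: AtomicUsize::new(0) })
    }

    /// Picks the next URL. `fetch_add` wraps on overflow, so the modulo
    /// stays in bounds and concurrent callers never need a lock.
    pub fn pick(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.urls.len();
        &self.urls[i]
    }
}
```

`Ordering::Relaxed` is enough here because the counter only spreads load; no other memory depends on its value.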

Also: INDEXER_BACKFILL_BATCH env (1..1000, default 100) for tuning.
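
The buffering and cursor rules in items 3 and 4 can be modelled in memory as below. This is a sketch under assumptions: `parse_batch_size`, `Accumulator`, and `merge_cursor` are hypothetical names, out-of-range batch sizes are clamped (the real code may reject them instead), and `merge_cursor` only mirrors the effect of the SQL `GREATEST` rather than issuing any query.

```rust
/// Reads a batch size from an optional env value.
/// Assumption: values outside 1..=1000 are clamped and unparsable values
/// fall back to the default of 100.
pub fn parse_batch_size(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .map(|n| n.clamp(1, 1000))
        .unwrap_or(100)
}

/// Buffers items and hands back a full batch once the threshold is reached.
pub struct Accumulator<T> {
    buf: Vec<T>,
    batch_size: usize,
}

impl<T> Accumulator<T> {
    pub fn new(batch_size: usize) -> Self {
        Self { buf: Vec::with_capacity(batch_size), batch_size }
    }

    /// Returns `Some(batch)` when the buffer reaches `batch_size`; the
    /// caller would then write that batch in one transaction.
    pub fn push(&mut self, item: T) -> Option<Vec<T>> {
        self.buf.push(item);
        if self.buf.len() >= self.batch_size {
            let fresh = Vec::with_capacity(self.batch_size);
            Some(std::mem::replace(&mut self.buf, fresh))
        } else {
            None
        }
    }

    /// Takes whatever is left, e.g. at the end of the backfill range.
    pub fn drain(&mut self) -> Vec<T> {
        std::mem::take(&mut self.buf)
    }
}

/// In-memory equivalent of GREATEST(stored, proposed): the cursor never
/// moves backwards, whichever writer commits last.
pub fn merge_cursor(stored: i64, proposed: i64) -> i64 {
    stored.max(proposed)
}
```

With this rule, a slow per-block writer proposing height 120 after the batched writer has stored 500 leaves the cursor at 500.
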
Two related fixes that surfaced together during testnet recovery on
2026-05-20.

## ERC-20 / ERC-721 Transfer decoder

The `token_transfers` table existed but was never populated — the
`indexer-handlers` crate is still a Phase-0 placeholder, so raw logs
landed in `logs` and went nowhere. Scan UIs that asked for an
address's token balances saw an empty list even though the underlying
Transfer events were present.

Inline decoder in `sync::token_decode`:

- topic0 `0xddf252ad…` matches the canonical Transfer signature.
- ERC-20: topics = [sig, from, to], 32-byte data carries `amount`.
- ERC-721: topics = [sig, from, to, token_id], no data, amount = 1.
- ERC-1155 is out of scope here (different topic0, richer encoding).

Decoded transfers flow through `BlockBundle.token_transfers` and the
existing block-writer transaction — they commit atomically with the
block, txs, and logs the chain already wrote.

False-positive risk: any contract emitting a Transfer-shaped event
with three indexed args will decode as ERC-721 here. Acceptable for
the visibility goal; precise registry-driven classification is the
declarative-handler workstream's job.

## Orphan-log filter

Sentrix's native `/chain/blocks/<n>` and `eth_getLogs` can disagree —
a tx whose effects reverted gets stripped from the block tx vec but
its log envelopes still come back from `eth_getLogs`. The
`logs.tx_hash → transactions.hash` FK then blew the whole batch
transaction and the backfill loop stalled at the first such block.

Drop logs whose tx_hash isn't backed by a tx row in the same bundle
(both fetch_one + ingest_one paths). Such logs are orphans on this
chain by definition; preserving them would only buy us repeated
FK-violation rollbacks.

## Schema note

`token_transfers` has no unique constraint on (tx_hash, log_index),
so the batch insert skips ON CONFLICT. The writer's atomic cursor
advance prevents re-processing the same block in the steady state;
reorg recovery deletes downstream rows before re-insert. Adding the
unique index is a follow-up migration.

@satyakwok (Member, Author)

Closing in favour of a clean branch off main without the unrelated WIP commits; see the follow-up PR.

coderabbitai Bot commented May 20, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

This PR implements multi-endpoint load-balancing for RPC and REST clients, adds inline ERC-20/721 Transfer event decoding, introduces batch insert helpers for atomic database writes, and refactors the backfill pipeline to buffer and flush blocks in batches. Configuration now supports an optional rest_url parameter. ChainProvider and RestClient parse comma-separated endpoint lists and distribute requests via a round-robin cursor.

BlockBundle now carries decoded token_transfers, which are persisted alongside blocks, transactions, and logs in a single database transaction; the write_cursor update in that transaction uses GREATEST so the stored height only ever advances. The backfill logic filters orphan logs (those without corresponding transactions), computes token_transfers inline, and buffers BlockBundles for batched flushing. In smoke tests, the metrics endpoint is bound to loopback only.

Sequence Diagram(s)

sequenceDiagram
  participant BackfillStream
  participant Accumulator
  participant batch_write_blocks
  participant PgPool
  BackfillStream->>Accumulator: BlockBundle `#1` (fetch_one)
  BackfillStream->>Accumulator: BlockBundle `#2`
  Accumulator->>Accumulator: buffer_len = 2
  BackfillStream->>Accumulator: BlockBundle `#3` (batch_size=100)
  Accumulator-->>Accumulator: continue...
  BackfillStream->>Accumulator: BlockBundle `#100` (reaches threshold)
  Accumulator->>batch_write_blocks: flush bundles [1..100]
  batch_write_blocks->>PgPool: BEGIN
  batch_write_blocks->>PgPool: insert_batch blocks (chunked)
  batch_write_blocks->>PgPool: insert_batch transactions (chunked)
  batch_write_blocks->>PgPool: insert_batch logs (chunked)
  batch_write_blocks->>PgPool: insert_batch token_transfers (chunked)
  batch_write_blocks->>PgPool: write_cursor GREATEST(existing, max_height)
  batch_write_blocks->>PgPool: COMMIT
  batch_write_blocks-->>Accumulator: reset buffer

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Sentriscloud/indexer-rs#37: Applies the same code-level refactors—multi-endpoint round-robin for ChainProvider/RestClient, batched insert_batch-based backfill writes, and monotonic write_cursor update.
  • Sentriscloud/indexer-rs#36: Makes overlapping substantial changes to the backfill and block writing pipeline around batched ingestion and transactional behavior.
  • Sentriscloud/indexer-rs#20: Modifies native RestClient initialization and multi-endpoint URL support for native REST block backfill.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | The PR description provides comprehensive context but does not follow the repository's template structure with required scope checkboxes and deploy impact assessment. | Complete the description by filling out all template sections: scope checkboxes, checks (forge build, test, fmt, slither), and deploy impact assessment. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main changes: ERC-20/ERC-721 transfer decoding and orphan log filtering in the sync module. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
