feat(contracts): detect contracts via addresses registry + eth_getCode #63

Merged
satyakwok merged 1 commit into main from feat/contracts-addresses-detector
Jun 8, 2026

@satyakwok satyakwok commented Jun 8, 2026

What

Replaces the to_addr IS NULL contract-creation heuristic with an addresses registry classified by a lazy eth_getCode detector — mirroring the legacy indexer. /contracts/* now serves addresses WHERE is_contract = true ORDER BY first_seen_block.
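The read path described above can be sketched in memory. This is a minimal illustration, not the PR's code: the `AddressRow` struct, the nullable `is_contract` field (with `None` meaning "not yet probed"), and the function body are assumptions; only the filter and ordering come from the description.

```rust
/// Hypothetical in-memory mirror of one `addresses` row (not the real schema).
#[derive(Debug, Clone, PartialEq)]
pub struct AddressRow {
    pub address: String,
    /// Block in which the address first appeared in a transaction.
    pub first_seen_block: u64,
    /// Assumption: `None` = not yet probed, `Some(false)` = EOA,
    /// `Some(true)` = contract.
    pub is_contract: Option<bool>,
}

/// Models `WHERE is_contract = true ORDER BY first_seen_block`.
/// Unclassified and EOA rows are both excluded.
pub fn list_contracts(rows: &[AddressRow]) -> Vec<&AddressRow> {
    let mut out: Vec<&AddressRow> = rows
        .iter()
        .filter(|r| r.is_contract == Some(true))
        .collect();
    out.sort_by_key(|r| r.first_seen_block);
    out
}
```

Under this model an address only appears in /contracts/* once the detector has classified it, which matches the gradual fill described in the notes.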

Why

The shipped to_addr-NULL detection populated nothing on this chain: contract-creation txs are recorded with to_addr set to the created contract address, not NULL, so SELECT count(*) FROM transactions WHERE to_addr IS NULL returns 0. The correct model is an address registry with an is_contract flag set by probing eth_getCode.
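For reference, eth_getCode returns the hex string "0x" for an account with no code and non-empty hex bytecode for a deployed contract. A minimal sketch of turning that result into a flag follows; the function name and the handling of malformed input are assumptions, and the PR's detector may decide differently.

```rust
/// Interprets an `eth_getCode` result string.
/// Returns `Some(true)` for non-empty bytecode, `Some(false)` for the empty
/// result "0x", and `None` when the string is not 0x-prefixed, even-length hex.
/// Hypothetical helper, not taken from the PR.
pub fn classify_code_hex(code_hex: &str) -> Option<bool> {
    let body = code_hex.strip_prefix("0x")?;
    if body.len() % 2 != 0 || !body.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    Some(!body.is_empty())
}
```

An address probed before its contract is deployed also returns "0x", so a probe-based flag reflects the chain state at probe time.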

Changes

  • chain: ChainProvider::get_code (eth_getCode)
  • db: migration 0005_addresses + addresses module (upsert_batch / unclassified_batch / classify / list_contracts / backfill_from_transactions)
  • sync: block writer registers every from/to address; contract_detect worker classifies unclassified rows, rate-limited (INDEXER_CONTRACT_DETECT_INTERVAL_SECS=4, INDEXER_CONTRACT_DETECT_BATCH=10)
  • indexer: spawns the detector + a one-time address-history backfill from existing transactions
  • api: /contracts/* reads addresses; removes the now-dead contracts db module (migration 0004 table left in place — migrations are append-only)
  • smoke: fixture now seeds one contract + one EOA; asserts /contracts/* returns only the contract (is_contract filter)
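The block writer's registration step can be modelled with a map. This is a sketch under assumptions: the `Tx` and `AddressRow` shapes are hypothetical, the min/max update on conflict is one plausible way to keep first/last seen blocks correct when a backfill replays older blocks, and the real upsert_batch runs against the database.

```rust
use std::collections::HashMap;

/// Hypothetical transaction shape; only the two address fields matter here.
pub struct Tx {
    pub from_addr: String,
    pub to_addr: Option<String>,
}

/// Hypothetical registry row; `is_contract: None` means "unclassified".
#[derive(Debug, Clone, PartialEq)]
pub struct AddressRow {
    pub first_seen_block: u64,
    pub last_seen_block: u64,
    pub is_contract: Option<bool>,
}

/// Collects the distinct from/to addresses of one block, in first-seen order.
pub fn addresses_in_block(txs: &[Tx]) -> Vec<String> {
    let mut seen: Vec<String> = Vec::new();
    for tx in txs {
        for addr in std::iter::once(&tx.from_addr).chain(tx.to_addr.iter()) {
            if !seen.contains(addr) {
                seen.push(addr.clone());
            }
        }
    }
    seen
}

/// Inserts new addresses as unclassified; for known addresses only widens the
/// seen-block range and leaves any existing classification untouched.
pub fn upsert_batch(reg: &mut HashMap<String, AddressRow>, addrs: &[String], block: u64) {
    for addr in addrs {
        reg.entry(addr.clone())
            .and_modify(|row| {
                row.first_seen_block = row.first_seen_block.min(block);
                row.last_seen_block = row.last_seen_block.max(block);
            })
            .or_insert(AddressRow {
                first_seen_block: block,
                last_seen_block: block,
                is_contract: None,
            });
    }
}
```

Leaving the classification alone on conflict means re-seeing an address does not force a second eth_getCode probe.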

Notes

  • Classification is gradual (rate-limited getCode) so /contracts/* fills over time after deploy — bump the rate envs for a faster initial catch-up.
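To get a feel for how the two rate envs affect catch-up, here is a rough estimate. It assumes the worker handles exactly one full batch per interval and that RPC latency fits inside the interval; the struct, the function names, and the fallback to defaults on missing or non-positive values are all hypothetical.

```rust
use std::time::Duration;

/// Hypothetical holder for the two detector settings.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DetectConfig {
    pub interval: Duration,
    pub batch: u64,
}

/// Parses a positive integer, falling back to `default` when the value is
/// missing, unparsable, or zero (assumed behaviour, not confirmed by the PR).
fn parse_positive(raw: Option<&str>, default: u64) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|v| *v > 0)
        .unwrap_or(default)
}

/// Builds the config from raw env values, using the defaults quoted in the
/// PR description (interval 4 s, batch 10).
pub fn detect_config(interval_secs: Option<&str>, batch: Option<&str>) -> DetectConfig {
    DetectConfig {
        interval: Duration::from_secs(parse_positive(interval_secs, 4)),
        batch: parse_positive(batch, 10),
    }
}

/// Rough catch-up time: ceil(unclassified / batch) ticks, one tick per interval.
pub fn catch_up_estimate(cfg: DetectConfig, unclassified: u64) -> Duration {
    let batch = cfg.batch.max(1);
    let ticks = (unclassified + batch - 1) / batch;
    Duration::from_secs(cfg.interval.as_secs().saturating_mul(ticks))
}
```

With the defaults, 100,000 unclassified addresses come out at 40,000 seconds, roughly 11 hours; an interval of 1 and a batch of 50 would bring the same backlog down to 2,000 seconds. The raw values could be read with `std::env::var("INDEXER_CONTRACT_DETECT_INTERVAL_SECS").ok().as_deref()` and the matching call for `INDEXER_CONTRACT_DETECT_BATCH`.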

Summary by CodeRabbit

  • New Features

    • Introduced continuous background contract detection that classifies addresses over time.
    • Added bytecode retrieval capability to the provider.
  • Refactor

    • Migrated contract storage and classification from a batch-based pipeline to a lazy detection pipeline.
    • Updated contract leaderboards to use new address registry.

Replaces the to_addr-NULL creation heuristic (wrong for Sentrix, which records to_addr = the created contract address, so zero rows ever matched) with an addresses table classified by a lazy eth_getCode detector, mirroring the legacy indexer. /contracts/* now serves addresses WHERE is_contract = true ORDER BY first_seen_block.

- chain: ChainProvider::get_code (eth_getCode)
- db: migration 0005_addresses + addresses module (upsert_batch/unclassified_batch/classify/list_contracts/backfill)
- sync: block writer registers from/to addresses; contract_detect worker classifies unclassified rows rate-limited
- indexer: spawn detector + one-time address-history backfill; INDEXER_CONTRACT_DETECT_{INTERVAL_SECS,BATCH}
- api: /contracts/* reads the addresses table; removes the dead contracts module (0004 table left in place, append-only)
@satyakwok satyakwok merged commit 928f4b4 into main Jun 8, 2026
7 of 8 checks passed
@satyakwok satyakwok deleted the feat/contracts-addresses-detector branch June 8, 2026 06:32

coderabbitai Bot commented Jun 8, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

This PR introduces a lazy contract detection architecture to replace eager contract identification via transaction analysis. A new addresses table stores all observed addresses with first/last seen blocks and deferred contract classification (flag and code hash). A periodic detector worker scans unclassified rows, probes them via eth_getCode, and classifies each as contract or EOA. The block writer now seeds addresses from transaction from/to fields instead of eagerly inserting contracts. API routes migrate from querying the old contracts table to the new addresses table. Daemon configuration adds detector tuning parameters, one-time addresses backfill spawns on startup if needed, and the detector task runs concurrently with other sync work.

Sequence Diagram

sequenceDiagram
    participant Sync as Block Sync
    participant BW as Block Writer
    participant DB as Addresses Registry
    participant Detector as Contract Detector
    participant Provider as ChainProvider
    Sync->>BW: write_block(transactions)
    BW->>BW: Extract from/to addresses
    BW->>DB: upsert_batch(addresses, block)
    loop On detector interval
        Detector->>DB: unclassified_batch(limit)
        DB-->>Detector: [address1, address2, ...]
        loop For each unclassified address
            Detector->>Provider: get_code(address)
            Provider-->>Detector: code_bytes
            Detector->>Detector: Compute keccak256 hash
            Detector->>DB: classify(address, is_contract, code_hash)
        end
    end
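One tick of the inner loop in the diagram can be sketched as follows. This is an in-memory model under assumptions: the `CodeProvider` trait stands in for `ChainProvider::get_code` with a guessed signature, `StaticProvider` is a test double, the registry is a `BTreeMap` instead of a database table, and the keccak256 code hash step is omitted because the standard library has no Keccak implementation.

```rust
use std::collections::BTreeMap;

/// Stand-in for the PR's `ChainProvider::get_code`; the signature is hypothetical.
pub trait CodeProvider {
    fn get_code(&self, address: &str) -> Result<Vec<u8>, String>;
}

/// Test double: returns the stored bytes, or an error for unknown addresses
/// to simulate a failed RPC call.
pub struct StaticProvider(pub BTreeMap<String, Vec<u8>>);

impl CodeProvider for StaticProvider {
    fn get_code(&self, address: &str) -> Result<Vec<u8>, String> {
        self.0
            .get(address)
            .cloned()
            .ok_or_else(|| format!("rpc error for {address}"))
    }
}

/// Hypothetical registry row; `None` means "unclassified".
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Row {
    pub is_contract: Option<bool>,
}

/// One detector tick: take up to `limit` unclassified addresses, probe each,
/// and record contract (non-empty code) or EOA (empty code).
/// Rows whose probe fails stay unclassified so a later tick can retry them.
/// Returns the number of rows classified in this tick.
pub fn detect_tick<P: CodeProvider>(
    registry: &mut BTreeMap<String, Row>,
    provider: &P,
    limit: usize,
) -> usize {
    let batch: Vec<String> = registry
        .iter()
        .filter(|(_, row)| row.is_contract.is_none())
        .map(|(addr, _)| addr.clone())
        .take(limit)
        .collect();
    let mut classified = 0;
    for addr in batch {
        if let Ok(code) = provider.get_code(&addr) {
            if let Some(row) = registry.get_mut(&addr) {
                row.is_contract = Some(!code.is_empty());
                classified += 1;
            }
        }
    }
    classified
}
```

In this sketch a persistently failing address is picked up again on every tick; a real worker would likely want some ordering or backoff so such rows do not crowd out the rest of the batch.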

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Sentriscloud/indexer-rs#35: Both PRs modify crates/sync/src/block_writer.rs to change how block writes handle address/contract data—one introduces bulk backfill writes, the other refactors block/batch writes to seed addresses for the new detector pipeline.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Description check: ⚠️ Warning
    Explanation: The description covers What, Why, and Changes sections, explaining the problem, solution, and implementation details. However, required template sections like Scope checklist, Checks, Deploy impact, and Linked issue are missing or incomplete.
    Resolution: Complete the PR description template by filling in Scope (contract/test/deploy type), Checks (forge build/test/fmt/slither), Deploy impact assessment, and Linked issue (#), then verify the changes are properly documented.

✅ Passed checks (4 passed)

  • Title check: ✅ Passed
    Explanation: The title clearly and accurately summarizes the main change: introducing contract detection via an addresses registry and eth_getCode instead of the previous to_addr heuristic.
  • Docstring Coverage: ✅ Passed
    Explanation: Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
  • Linked Issues check: ✅ Passed
    Explanation: Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: ✅ Passed
    Explanation: Check skipped because no linked issues were found for this pull request.

@satyakwok satyakwok self-assigned this Jun 8, 2026