feat(indexer): is_contract auto-detection + faster testnet backfill #9
Merged
Image `wget`s 127.0.0.1:8081/health (api) and 127.0.0.1:8082/health
(worker) by default — that's the mainnet stack's port layout. Testnet
relocates both via API_PORT=8083 + INDEXER_HEALTH_PORT=8084 to share
the host with the mainnet stack, but the bake-time healthcheck still
hit the old ports → exit 1 → docker reported unhealthy even though
both services were happily serving 200s on testnet-api.sentrixchain.com.
Add explicit `healthcheck:` blocks on each compose service that point
at the relocated ports. Same interval / timeout / retries as the
Dockerfile defaults so behaviour matches mainnet otherwise.
Verified: post-recreate, `docker inspect -f '{{.State.Health.Status}}'`
returns healthy on both `sentrix-indexer-testnet-{api,worker}` within
seconds of start_period elapsing.
indexBlock was writing blocks/transactions/logs/token_transfers but never upserting into the addresses table, so addresses sat empty even after 50K+ indexed txs. Any UI/API that lists "addresses we've seen" (e.g. /contracts/stats, scan recent-deployments feed) returned nothing.

Adds per-tx upsert of from + to (when non-null), tracking first_seen_block / last_seen_block. Coinbase sentinel skipped on the from side so the all-zero address doesn't claim a row from validator rewards. is_contract stays false at insert time; a separate eth_getCode pass marks it true for addresses with non-empty code (cheap, lazy, out of the hot write path).

Surfaced by PR #8266 reviewer asking why a deployed contract didn't appear in any list: the contract is on-chain and readable via eth_getCode, but our indexer's address-derived endpoints had no row to return.
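The per-tx upsert described above can be sketched as a pure function over an in-memory map. This is an illustration only: the `TxLike` and `AddressRow` shapes, the `upsertAddresses` name, and the use of a `Map` in place of the real addresses table are assumptions, not the code in `sync.ts`.

```typescript
// Hypothetical shapes for illustration; the real sync.ts types may differ.
interface TxLike {
  from: string;
  to: string | null; // null for contract-creation transactions
}

interface AddressRow {
  address: string;
  firstSeenBlock: number;
  lastSeenBlock: number;
  isContract: boolean; // stays false here; a separate pass flips it later
}

// All-zero address, used as the coinbase sentinel on the `from` side.
const ZERO_ADDRESS = "0x" + "0".repeat(40);

// Fold one block's transactions into the address map, widening the
// first/last-seen range for rows that already exist.
function upsertAddresses(
  rows: Map<string, AddressRow>,
  blockNumber: number,
  txs: TxLike[],
): void {
  for (const tx of txs) {
    const candidates: string[] = [];
    const from = tx.from.toLowerCase();
    if (from !== ZERO_ADDRESS) candidates.push(from); // skip coinbase sentinel
    if (tx.to !== null) candidates.push(tx.to.toLowerCase());

    for (const address of candidates) {
      const existing = rows.get(address);
      if (existing === undefined) {
        rows.set(address, {
          address,
          firstSeenBlock: blockNumber,
          lastSeenBlock: blockNumber,
          isContract: false,
        });
      } else {
        existing.firstSeenBlock = Math.min(existing.firstSeenBlock, blockNumber);
        existing.lastSeenBlock = Math.max(existing.lastSeenBlock, blockNumber);
      }
    }
  }
}
```

In SQL terms the same idea would presumably be an insert with an on-conflict update of the two block columns, but the exact statement is not shown in this PR.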
Two follow-ups to the addresses-table fix (PR #8):

1. **Contract detection worker** (`apps/indexer/src/contract-detect.ts`). The hot tx-insertion path in sync.ts upserts addresses with is_contract=false + code_hash=NULL because doing eth_getCode mid-batch would dominate runtime. This worker runs in the background, picks up addresses with code_hash IS NULL, and flips the flag based on whether the chain reports any deployed code. Slow cadence (10 addrs / 4s) so a fresh boot doesn't fire 1000+ getCode calls in one second. Uses a "0x" sentinel for code_hash on EOAs so we never re-probe them.
2. **Backfill batch size 50 -> 500** in `docker-compose.testnet.yml`. Testnet sits 2.5M blocks ahead of the indexer's main cursor; at 50/batch the catch-up ETA was ~70h. 500/batch trims that to ~7h with no observed RPC pressure increase (chain has retry429 handling).
3. `SentrixClient.getCode(address)`: thin wrapper around viem's `getBytecode`. Returns "0x" when the address is an EOA so the detector worker can use a string sentinel rather than special-case undefined.

Surfaced by the PR #8266 audit: even after the addresses-table fix (commit 037662d), `/contracts/stats` returned empty because every new row had is_contract=false by default and only one address (manually upserted) was flipped. With the auto-detector running, every contract deployed across the chain will surface in addresses-table queries within seconds of being indexed.
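A minimal sketch of the classification step in items 1 and 3, assuming the bytecode has already been fetched. The names `normalizeCode`, `classify`, and `nextBatch` are hypothetical, and the hash function is injected as a parameter so the sketch carries no dependencies; the PR description states that the worker stores `keccak256(code)` for contracts.

```typescript
// Sentinel stored in code_hash for EOAs so they are never probed again.
const EOA_SENTINEL = "0x";

interface Detection {
  isContract: boolean;
  codeHash: string;
}

// Treat a missing result the same as empty bytecode (EOA-as-"0x").
function normalizeCode(code: string | undefined): string {
  return code === undefined || code === "" ? EOA_SENTINEL : code;
}

// `hashCode` is a hypothetical injected hasher standing in for keccak256.
function classify(
  code: string | undefined,
  hashCode: (code: string) => string,
): Detection {
  const normalized = normalizeCode(code);
  if (normalized === EOA_SENTINEL) {
    return { isContract: false, codeHash: EOA_SENTINEL };
  }
  return { isContract: true, codeHash: hashCode(normalized) };
}

// Pick the next slice of unprobed rows (code_hash IS NULL), e.g. 10 per tick.
function nextBatch<T extends { codeHash: string | null }>(
  rows: T[],
  size: number,
): T[] {
  return rows.filter((row) => row.codeHash === null).slice(0, size);
}
```

Because EOAs end up with a non-null `codeHash` of `"0x"`, they drop out of `nextBatch` after the first probe, which is the point of the sentinel.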
satyakwok added a commit that referenced this pull request on May 5, 2026
Pairs with the contract-detect worker (PR #9): once an address gets flipped to is_contract=true, this endpoint surfaces it ordered by first_seen_block DESC. Doesn't depend on the transactions table, unlike /contracts/stats which INNER JOINs on indexed call history and lags the addresses table by hours during backfill catch-up.

Surfaced by PR #8266 reviewer feedback: a deployed contract should appear in the explorer's contracts list immediately, not just via direct address lookup.

Schema returns rank, address, first_seen_block, last_seen_block, code_hash. limit clamped to MAX_PAGE (100).
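The limit clamp and ordering can be sketched as below. Only `MAX_PAGE = 100` and the response field names come from the commit message; `DEFAULT_LIMIT`, `clampLimit`, `recentContracts`, and the in-memory array standing in for the addresses table are assumptions for illustration.

```typescript
const MAX_PAGE = 100; // upper bound named in the commit message
const DEFAULT_LIMIT = 25; // hypothetical default, not stated in the PR

interface AddressRecord {
  address: string;
  firstSeenBlock: number;
  lastSeenBlock: number;
  isContract: boolean;
  codeHash: string | null;
}

interface RecentContract {
  rank: number;
  address: string;
  first_seen_block: number;
  last_seen_block: number;
  code_hash: string | null;
}

// Parse the raw query value and clamp it into [1, MAX_PAGE].
function clampLimit(raw: string | undefined): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(parsed) || parsed < 1) return DEFAULT_LIMIT;
  return Math.min(parsed, MAX_PAGE);
}

// Contracts only, newest first_seen_block first, ranked from 1.
function recentContracts(rows: AddressRecord[], limit: number): RecentContract[] {
  return rows
    .filter((row) => row.isContract) // filter() copies, so sort() below is safe
    .sort((a, b) => b.firstSeenBlock - a.firstSeenBlock)
    .slice(0, limit)
    .map((row, index) => ({
      rank: index + 1,
      address: row.address,
      first_seen_block: row.firstSeenBlock,
      last_seen_block: row.lastSeenBlock,
      code_hash: row.codeHash,
    }));
}
```

Nothing here touches transaction history, which mirrors why this endpoint does not lag the way `/contracts/stats` does during backfill.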
satyakwok added a commit that referenced this pull request on May 5, 2026
* ops(testnet): override Dockerfile healthcheck for relocated ports
Image `wget`s 127.0.0.1:8081/health (api) and 127.0.0.1:8082/health
(worker) by default — that's the mainnet stack's port layout. Testnet
relocates both via API_PORT=8083 + INDEXER_HEALTH_PORT=8084 to share
the host with the mainnet stack, but the bake-time healthcheck still
hit the old ports → exit 1 → docker reported unhealthy even though
both services were happily serving 200s on testnet-api.sentrixchain.com.
Add explicit `healthcheck:` blocks on each compose service that point
at the relocated ports. Same interval / timeout / retries as the
Dockerfile defaults so behaviour matches mainnet otherwise.
Verified: post-recreate, `docker inspect -f '{{.State.Health.Status}}'`
returns healthy on both `sentrix-indexer-testnet-{api,worker}` within
seconds of start_period elapsing.
* fix(indexer): populate addresses table from each tx
indexBlock was writing blocks/transactions/logs/token_transfers but never
upserting into the addresses table — so addresses sat empty even after
50K+ indexed txs. Any UI/API that lists "addresses we've seen" (eg
/contracts/stats, scan recent-deployments feed) returned nothing.
Adds per-tx upsert of from + to (when non-null), tracking
first_seen_block / last_seen_block. Coinbase sentinel skipped on the from
side so the all-zero address doesn't claim a row from validator rewards.
is_contract stays false at insert time; a separate eth_getCode pass marks
it true for addresses with non-empty code (cheap, lazy, out of the hot
write path).
Surfaced by PR #8266 reviewer asking why a deployed contract didn't
appear in any list — the contract is on-chain and readable via
eth_getCode, but our indexer's address-derived endpoints had no row to
return.
* feat(indexer): is_contract auto-detection + faster backfill
Two follow-ups to the addresses-table fix (PR #8):
1. **Contract detection worker** (`apps/indexer/src/contract-detect.ts`).
The hot tx-insertion path in sync.ts upserts addresses with
is_contract=false + code_hash=NULL because doing eth_getCode mid-batch
would dominate runtime. This worker runs in the background, picks up
addresses with code_hash IS NULL, and flips the flag based on whether
the chain reports any deployed code. Slow cadence (10 addrs / 4s) so
a fresh boot doesn't fire 1000+ getCode calls in one second.
Uses a "0x" sentinel for code_hash on EOAs so we never re-probe them.
2. **Backfill batch size 50 -> 500** in `docker-compose.testnet.yml`.
Testnet sits 2.5M blocks ahead of the indexer's main cursor; at
50/batch the catch-up ETA was ~70h. 500/batch trims that to ~7h with
no observed RPC pressure increase (chain has retry429 handling).
3. `SentrixClient.getCode(address)` thin wrapper around viem's
`getBytecode`. Returns "0x" when the address is an EOA so the
detector worker can use a string sentinel rather than special-case
undefined.
Surfaced by the PR #8266 audit: even after the addresses-table fix
(commit 037662d), `/contracts/stats` returned empty because every
new row had is_contract=false by default and only one address (manually
upserted) was flipped. With the auto-detector running, every contract
deployed across the chain will surface in addresses-table queries
within seconds of being indexed.
* feat(api): /contracts/recent endpoint — addresses by deployment height
Pairs with the contract-detect worker (PR #9): once an address gets
flipped to is_contract=true, this endpoint surfaces it ordered by
first_seen_block DESC. Doesn't depend on the transactions table —
unlike /contracts/stats which INNER JOINs on indexed call history and
lags the addresses table by hours during backfill catch-up.
Surfaced by PR #8266 reviewer feedback: a deployed contract should
appear in the explorer's contracts list immediately, not just via
direct address lookup.
Schema returns rank, address, first_seen_block, last_seen_block,
code_hash. limit clamped to MAX_PAGE (100).
---------
Co-authored-by: satyakwok <satyakwok@users.noreply.github.com>
satyakwok added a commit that referenced this pull request on May 7, 2026
…n insert (#11)
* feat(api): /coinblast/whales - buys+sells above an SRX threshold
The CoinBlast /live frontend wants a "Whale Activity" strip that
surfaces large single trades alongside the regular feed. Pre-fix the
client would have to fetch /coinblast/trades?limit=200 and filter by
srx_amount in JS: wasteful (most rows aren't whale-sized) and racy
(any window narrower than the page misses the threshold tail).
New endpoint:
GET /coinblast/whales?threshold=<srx>&limit=<N>
threshold is decimal SRX (default 100). We multiply ×1e18 in pg-side
numeric to compare against cb_trades.srx_amount (numeric(78,0)) without
ever round-tripping through JS Number; wallet-sized whales sit
comfortably above 2^53.
Graduations are excluded server-side (one-shot supply migrations, not
user trades).
Order: srx_amount desc tie-broken by block_number desc, so the panel
leads with the biggest single trade in the window and falls back to
recency for equal-size whales.
* feat(indexer): cb_tokens metadata fields + sig-gated POST endpoint
Adds image_url, description, twitter_url, telegram_url, website_url,
metadata_updated_at columns to cb_tokens (all nullable; NULL until the
owner posts metadata).
New endpoint POST /coinblast/metadata accepts
{ curve_address, stamp_ms, signature, image_url, description,
twitter_url, telegram_url, website_url }
and updates the row only when the EIP-191 signature recovers to the
indexed owner_address. Replay window 5 min on stamp_ms.
Closes the gap that recordLocalLaunch only stored metadata in the
launching browser's localStorage; multi-browser visibility now works
via the indexer.
* fix(indexer): normalize log address + topics + tx hash to lowercase on insert
txs.fromAddr / txs.toAddr already store lowercase (sync.ts:112-114).
Downstream consumers (scan, faucet, indexer endpoints) query with
lowercase WHERE clauses + JOINs.
But logs.address and tokenTransfers.contract were inserted as-is from
viem's getLogsRange output, which depending on the RPC implementation
can be EIP-55 checksum (mixed-case). Result: address-history queries,
token-event filters, and tokenTransfers JOINs would silently miss
events for any contract whose address came back checksummed.
Bug class: data-correctness, not crash. Hard to detect without running
queries against real production data.
Defense-in-depth fix: explicitly .toLowerCase() the log address, all
four topics (selector-prefix LIKE patterns work either way, but this
keeps them consistent), tx hash, and the tokenTransfers.contract
column. logAddr is computed once per log and reused at all three
tokenTransfers insert sites (erc20, erc721, erc1155).
tsc clean. No schema migration needed: column type stays varchar(42)
with no case-sensitive constraint.
---------
Co-authored-by: satyakwok <satyakwok@users.noreply.github.com>
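The whales commit does its ×1e18 scaling in Postgres numeric. For a client that needs the same threshold in base units, one way to stay clear of JS Number is plain string handling plus `bigint`, sketched here. The names `srxToBaseUnits`, `Trade`, and `compareWhales` are hypothetical; only the 18-decimal scaling and the sort order come from the commit message.

```typescript
const DECIMALS = 18; // the ×1e18 scaling from the commit message

// Convert a decimal SRX string such as "100" or "0.5" into base units
// without passing through Number, which is only exact up to 2^53 - 1.
function srxToBaseUnits(amount: string): bigint {
  const match = /^(\d+)(?:\.(\d+))?$/.exec(amount.trim());
  if (match === null) throw new Error(`invalid SRX amount: ${amount}`);
  const whole = match[1];
  const fraction = match[2] ?? "";
  if (fraction.length > DECIMALS) throw new Error("too many decimal places");
  // Right-pad the fraction to 18 digits, then read the digits as one integer.
  return BigInt(whole + fraction.padEnd(DECIMALS, "0"));
}

interface Trade {
  srxAmount: bigint;
  blockNumber: number;
}

// srx_amount desc, tie-broken by block_number desc.
function compareWhales(a: Trade, b: Trade): number {
  if (a.srxAmount !== b.srxAmount) return a.srxAmount > b.srxAmount ? -1 : 1;
  return b.blockNumber - a.blockNumber;
}
```

Even 1 SRX is 10^18 base units, which already exceeds `Number.MAX_SAFE_INTEGER` (about 9.007 × 10^15), so comparisons on these values as Numbers would not be exact.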
Two follow-ups to the addresses-table fix from #8, surfaced by deeper review of why `/contracts/stats` still returned empty even after addresses started populating.

1. Contract-detection worker (`apps/indexer/src/contract-detect.ts`)

The hot tx-insertion path in `sync.ts` upserts addresses with `is_contract=false` + `code_hash=NULL` to keep tx insertion fast. This worker runs in the background and flips the flag for addresses with non-empty bytecode.

* `eth_getCode` per address; non-empty → `is_contract=true` + `code_hash=keccak256(code)`; empty → `code_hash="0x"` sentinel so we don't re-probe EOAs

Without this, addresses that came in via the `sync.ts` upsert sat with `is_contract=false` forever, and `/contracts/stats` (which `INNER JOIN`s on `is_contract=true`) was permanently empty regardless of how full `addresses` got.

2. Backfill batch size 50 → 500 (testnet)

Testnet currently sits 2.5M blocks ahead of the main cursor; at `INDEXER_BATCH_SIZE=50` the catch-up ETA was ~70h. Bump to 500 in `docker-compose.testnet.yml` only; mainnet stays at default. Each block fetch is independent and `retry429()` handles transient 429/502s, so no observed RPC pressure increase from larger batches.

3. `SentrixClient.getCode(address)`

Thin wrapper around viem's `getBytecode` with EOA-as-`"0x"` normalisation so the detector worker can use a string sentinel instead of special-casing `undefined`.
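As a sanity check on the catch-up ETA in item 2, the quoted figures are consistent with a roughly constant wall-clock time per batch. That constancy is an assumption on my part (plausible if the blocks in a batch are fetched independently), not something measured in this PR, and the function names below are hypothetical.

```typescript
// Hours to clear the backlog at a given batch size and per-batch time.
function etaHours(
  blocksBehind: number,
  batchSize: number,
  secondsPerBatch: number,
): number {
  const batches = Math.ceil(blocksBehind / batchSize);
  return (batches * secondsPerBatch) / 3600;
}

// Per-batch time implied by an observed ETA at a known batch size.
function impliedSecondsPerBatch(
  blocksBehind: number,
  batchSize: number,
  observedHours: number,
): number {
  return (observedHours * 3600) / Math.ceil(blocksBehind / batchSize);
}

// Figures quoted in the description: 2.5M blocks behind, ~70h at 50/batch.
const BLOCKS_BEHIND = 2500000;
const perBatchSeconds = impliedSecondsPerBatch(BLOCKS_BEHIND, 50, 70); // about 5 s
const projectedHours = etaHours(BLOCKS_BEHIND, 500, perBatchSeconds); // about 7 h
```

At 50 per batch the backlog is 50,000 batches; at 500 it is 5,000. If each batch still takes about five seconds, the tenfold drop in batch count accounts for the ~70h to ~7h change.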