fix(indexer): normalize log address + topics + tx hash to lowercase on insert #11

Merged
satyakwok merged 7 commits into main from fix/normalize-log-address-case on May 7, 2026

@satyakwok
Member

Bug

logs.address and tokenTransfers.contract were inserted as-is from viem's getLogsRange output. Some RPCs return EIP-55 checksummed (mixed-case) log addresses, while other tables (txs.fromAddr/txs.toAddr) store lowercase. The mismatch silently breaks address-history queries and JOINs.

Fix

Compute logAddr = l.address.toLowerCase() once per log, reuse for the logs insert + all three tokenTransfers branches (erc20/erc721/erc1155). Also lowercase the four topics + tx hash for consistency.
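
A minimal sketch of the normalization described above, assuming a simplified log shape. `RawLog`, `NormalizedLog`, and `normalizeLog` are hypothetical names for illustration; the actual insert code in sync.ts is not reproduced here.

```typescript
// Hypothetical, simplified log shape; viem's real Log type carries more fields.
interface RawLog {
  address: string;
  topics: readonly string[];
  transactionHash: string;
}

// Hypothetical row shape with four nullable topic columns.
interface NormalizedLog {
  address: string;
  topics: [string | null, string | null, string | null, string | null];
  txHash: string;
}

function normalizeLog(l: RawLog): NormalizedLog {
  // Computed once, then reused for the logs row and any token-transfer rows.
  const logAddr = l.address.toLowerCase();
  const topic = (i: number): string | null => {
    const t: string | undefined = l.topics[i];
    return t === undefined ? null : t.toLowerCase();
  };
  return {
    address: logAddr,
    topics: [topic(0), topic(1), topic(2), topic(3)],
    txHash: l.transactionHash.toLowerCase(),
  };
}
```

Computing `logAddr` once keeps the logs row and every token-transfer row derived from the same log consistent with each other.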

Verified

tsc --noEmit clean. No schema migration needed.

satyakwok added 7 commits May 5, 2026 16:41
Image `wget`s 127.0.0.1:8081/health (api) and 127.0.0.1:8082/health
(worker) by default — that's the mainnet stack's port layout. Testnet
relocates both via API_PORT=8083 + INDEXER_HEALTH_PORT=8084 to share
the host with the mainnet stack, but the bake-time healthcheck still
hit the old ports → exit 1 → docker reported unhealthy even though
both services were happily serving 200s on testnet-api.sentrixchain.com.

Add explicit `healthcheck:` blocks on each compose service that point
at the relocated ports. Same interval / timeout / retries as the
Dockerfile defaults so behaviour matches mainnet otherwise.

Verified: post-recreate, `docker inspect -f '{{.State.Health.Status}}'`
returns healthy on both `sentrix-indexer-testnet-{api,worker}` within
seconds of start_period elapsing.
indexBlock was writing blocks/transactions/logs/token_transfers but never
upserting into the addresses table — so addresses sat empty even after
50K+ indexed txs. Any UI/API that lists "addresses we've seen" (e.g.
/contracts/stats, scan recent-deployments feed) returned nothing.

Adds per-tx upsert of from + to (when non-null), tracking
first_seen_block / last_seen_block. Coinbase sentinel skipped on the from
side so the all-zero address doesn't claim a row from validator rewards.
is_contract stays false at insert time; a separate eth_getCode pass marks
it true for addresses with non-empty code (cheap, lazy, out of the hot
write path).
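
The per-tx upsert can be sketched roughly as follows, using an in-memory map in place of the addresses table. The row shape, the `upsertAddresses` name, and the lowercasing of keys are assumptions for illustration; the real code issues a database upsert.

```typescript
// Hypothetical row shape mirroring the columns named above.
interface AddressRow {
  address: string;
  firstSeenBlock: number;
  lastSeenBlock: number;
  isContract: boolean;
}

// All-zero coinbase sentinel, skipped on the from side.
const ZERO_ADDRESS = "0x" + "0".repeat(40);

function upsertAddresses(
  table: Map<string, AddressRow>,
  tx: { from: string; to: string | null },
  blockNumber: number,
): void {
  const touch = (addr: string): void => {
    const key = addr.toLowerCase(); // assumption: addresses are keyed lowercase
    const row = table.get(key);
    if (row === undefined) {
      // is_contract stays false at insert time; a later pass may flip it.
      table.set(key, {
        address: key,
        firstSeenBlock: blockNumber,
        lastSeenBlock: blockNumber,
        isContract: false,
      });
    } else {
      row.firstSeenBlock = Math.min(row.firstSeenBlock, blockNumber);
      row.lastSeenBlock = Math.max(row.lastSeenBlock, blockNumber);
    }
  };
  if (tx.from.toLowerCase() !== ZERO_ADDRESS) touch(tx.from);
  if (tx.to !== null) touch(tx.to);
}
```

Taking the min and max rather than overwriting keeps the row correct even if blocks are indexed out of order, for example during a backfill.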

Surfaced by PR #8266 reviewer asking why a deployed contract didn't
appear in any list — the contract is on-chain and readable via
eth_getCode, but our indexer's address-derived endpoints had no row to
return.
Three follow-ups to the addresses-table fix (PR #8):

1. **Contract detection worker** (`apps/indexer/src/contract-detect.ts`).
   The hot tx-insertion path in sync.ts upserts addresses with
   is_contract=false + code_hash=NULL because doing eth_getCode mid-batch
   would dominate runtime. This worker runs in the background, picks up
   addresses with code_hash IS NULL, and flips the flag based on whether
   the chain reports any deployed code. Slow cadence (10 addrs / 4s) so
   a fresh boot doesn't fire 1000+ getCode calls in one second.
   Uses a "0x" sentinel for code_hash on EOAs so we never re-probe them.

2. **Backfill batch size 50 -> 500** in `docker-compose.testnet.yml`.
   Testnet sits 2.5M blocks ahead of the indexer's main cursor; at
   50/batch the catch-up ETA was ~70h. 500/batch trims that to ~7h with
   no observed RPC pressure increase (chain has retry429 handling).

3. `SentrixClient.getCode(address)` thin wrapper around viem's
   `getBytecode`. Returns "0x" when the address is an EOA so the
   detector worker can use a string sentinel rather than special-case
   undefined.
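
One pass of the detection worker might look like the sketch below. `PendingAddress`, `detectBatch`, and the injected `getCode` / `hashCode` functions are hypothetical stand-ins; the real worker reads rows with code_hash IS NULL from the database and calls `SentrixClient.getCode`.

```typescript
// Hypothetical in-memory stand-in for an addresses-table row.
interface PendingAddress {
  address: string;
  codeHash: string | null;
  isContract: boolean;
}

// Injected so the sketch stays self-contained; returns "0x" for an EOA.
type GetCode = (address: string) => Promise<string>;

const BATCH_SIZE = 10; // 10 addresses per pass, per the cadence above
const EOA_SENTINEL = "0x";

async function detectBatch(
  rows: PendingAddress[],
  getCode: GetCode,
  hashCode: (code: string) => string, // hypothetical hashing helper
): Promise<number> {
  // Only rows that were never probed (code_hash IS NULL).
  const pending = rows.filter((r) => r.codeHash === null).slice(0, BATCH_SIZE);
  for (const row of pending) {
    const code = await getCode(row.address);
    if (code === EOA_SENTINEL) {
      // Store the sentinel so this EOA is never re-probed.
      row.codeHash = EOA_SENTINEL;
      row.isContract = false;
    } else {
      row.codeHash = hashCode(code);
      row.isContract = true;
    }
  }
  return pending.length;
}
```

A scheduler would call `detectBatch` roughly every 4 seconds to match the stated cadence; that timer is left out so the function stays easy to test.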

Surfaced by the PR #8266 audit: even after the addresses-table fix
(commit 037662d), `/contracts/stats` returned empty because every
new row had is_contract=false by default and only one address (manually
upserted) was flipped. With the auto-detector running, every contract
deployed across the chain will surface in addresses-table queries
within seconds of being indexed.
Pairs with the contract-detect worker (PR #9): once an address gets
flipped to is_contract=true, this endpoint surfaces it ordered by
first_seen_block DESC. Doesn't depend on the transactions table —
unlike /contracts/stats which INNER JOINs on indexed call history and
lags the addresses table by hours during backfill catch-up.

Surfaced by PR #8266 reviewer feedback: a deployed contract should
appear in the explorer's contracts list immediately, not just via
direct address lookup.

Schema returns rank, address, first_seen_block, last_seen_block,
code_hash. limit clamped to MAX_PAGE (100).
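
The limit clamp can be illustrated with a small helper. `clampLimit` and `DEFAULT_LIMIT` are hypothetical; only the MAX_PAGE value of 100 comes from the text above.

```typescript
const MAX_PAGE = 100;
const DEFAULT_LIMIT = 25; // assumption: the real default is not stated

// Parses a raw query-string value and clamps it into [1, MAX_PAGE].
function clampLimit(raw: string | undefined): number {
  const n = Number.parseInt(raw ?? "", 10);
  if (!Number.isFinite(n) || n < 1) return DEFAULT_LIMIT;
  return Math.min(n, MAX_PAGE);
}
```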
The CoinBlast /live frontend wants a "Whale Activity" strip that
surfaces large single trades alongside the regular feed. Before this
change the client would have to fetch /coinblast/trades?limit=200 and
filter by srx_amount in JS, which is wasteful (most rows aren't
whale-sized) and racy (any window narrower than the page misses the
threshold tail).

New endpoint:

  GET /coinblast/whales?threshold=<srx>&limit=<N>

threshold is decimal SRX (default 100). We multiply ×1e18 in pg-side
numeric to compare against cb_trades.srx_amount (numeric(78,0))
without ever round-tripping through JS Number — wallet-sized whales
sit comfortably above 2^53. Graduations are excluded server-side
(one-shot supply migrations, not user trades).
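
To illustrate why the comparison avoids JS Number, here is a sketch that converts a decimal SRX string to an 18-decimal integer using BigInt. The endpoint itself does this multiplication in Postgres numeric; `srxToWei` is a hypothetical helper that only demonstrates the precision concern.

```typescript
const SRX_DECIMALS = 18;

// Converts a decimal SRX string such as "100" or "0.5" to base units
// without passing through a floating-point Number.
function srxToWei(decimal: string): bigint {
  if (!/^\d+(\.\d+)?$/.test(decimal)) {
    throw new Error(`invalid decimal amount: ${decimal}`);
  }
  const [whole, frac = ""] = decimal.split(".");
  if (frac.length > SRX_DECIMALS) {
    throw new Error("more than 18 fractional digits");
  }
  // Pad the fractional part to 18 digits, then read the digits as one integer.
  return BigInt(whole + frac.padEnd(SRX_DECIMALS, "0"));
}
```

The default threshold of 100 SRX is 10^20 base units, well above `Number.MAX_SAFE_INTEGER` (2^53 - 1), so a Number-based comparison could lose precision.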

Order: srx_amount desc tie-broken by block_number desc, so the panel
leads with the biggest single trade in the window and falls back to
recency for equal-size whales.
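
The ordering is applied in SQL, but an in-memory equivalent may make the tie-break easier to see. `Trade` and `byWhaleOrder` are hypothetical names for illustration.

```typescript
// Hypothetical trade shape; srxAmount is kept as bigint to avoid precision loss.
interface Trade {
  srxAmount: bigint;
  blockNumber: number;
}

// Largest amount first; equal amounts fall back to the most recent block.
function byWhaleOrder(a: Trade, b: Trade): number {
  if (a.srxAmount !== b.srxAmount) {
    return a.srxAmount > b.srxAmount ? -1 : 1;
  }
  return b.blockNumber - a.blockNumber;
}
```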
Adds image_url, description, twitter_url, telegram_url, website_url,
metadata_updated_at columns to cb_tokens (all nullable; NULL until the
owner posts metadata).

New endpoint POST /coinblast/metadata accepts { curve_address, stamp_ms,
signature, image_url, description, twitter_url, telegram_url,
website_url } and updates the row only when the EIP-191 signature
recovers to the indexed owner_address. Replay window 5 min on stamp_ms.
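
A rough sketch of the authorization check follows. The signed-message format, the symmetric replay window, and the injected `recover` function are all assumptions; the commit message only states that an EIP-191 signature must recover to the indexed owner_address within 5 minutes of stamp_ms.

```typescript
// Hypothetical subset of the request body; metadata fields omitted for brevity.
interface MetadataRequest {
  curve_address: string;
  stamp_ms: number;
  signature: string;
}

// Hypothetical: recovers the signer address of an EIP-191 personal message.
// Injected so the sketch does not depend on a specific crypto library.
type RecoverSigner = (message: string, signature: string) => string;

const REPLAY_WINDOW_MS = 5 * 60 * 1000;

// Assumption: the window is symmetric around the server clock.
function isFreshStamp(stampMs: number, nowMs: number): boolean {
  return Number.isFinite(stampMs) && Math.abs(nowMs - stampMs) <= REPLAY_WINDOW_MS;
}

// Assumption: the message binds the curve address and timestamp.
// The real message format is not given in the source.
function buildMessage(req: MetadataRequest): string {
  return `coinblast-metadata:${req.curve_address.toLowerCase()}:${req.stamp_ms}`;
}

function mayUpdateMetadata(
  req: MetadataRequest,
  indexedOwner: string,
  nowMs: number,
  recover: RecoverSigner,
): boolean {
  if (!isFreshStamp(req.stamp_ms, nowMs)) return false;
  const signer = recover(buildMessage(req), req.signature);
  // Compare case-insensitively: the recovered address may be checksummed.
  return signer.toLowerCase() === indexedOwner.toLowerCase();
}
```

Including the timestamp in the signed message is what makes the replay window meaningful: a captured request cannot be resubmitted with a fresh stamp_ms without invalidating the signature.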

Closes the gap that recordLocalLaunch only stored metadata in the
launching browser's localStorage — multi-browser visibility now works
via indexer.
…n insert

txs.fromAddr / txs.toAddr already store lowercase (sync.ts:112-114).
Downstream consumers (scan, faucet, indexer endpoints) query with
lowercase WHERE clauses + JOINs. But logs.address and
tokenTransfers.contract were inserted as-is from viem's getLogsRange
output, which depending on the RPC implementation can be EIP-55
checksum (mixed-case).

Result: address-history queries, token-event filters, and
tokenTransfers JOINs would silently miss events for any contract whose
address came back checksummed. Bug class: data-correctness, not crash.
Hard to detect without running queries against real production data.

Defense-in-depth fix: explicitly .toLowerCase() the log address, all
four topics (selector-prefix LIKE patterns work either way, but
lowercasing keeps storage consistent), the tx hash, and the
tokenTransfers.contract column. logAddr is computed once per log and
reused at all three tokenTransfers insert sites (erc20, erc721, erc1155).

tsc clean. No schema migration needed — column type stays varchar(42)
with no case-sensitive constraint.
@satyakwok satyakwok merged commit 225bc5b into main May 7, 2026
@satyakwok satyakwok deleted the fix/normalize-log-address-case branch May 7, 2026 05:53
satyakwok added a commit that referenced this pull request May 7, 2026
Two related sync correctness fixes.

1) getNativeTransaction silently dropped txs on transient failure
   The chain client did one fetch with no retry; on network blip /
   5xx / 429 it returned null. sync.ts treats null as 'skip and
   continue' but also commits the surrounding block + advances
   last_synced_height — so the missed tx is never re-fetched. Add
   a 4-attempt backoff (250 ms → 2 s, ~3.75 s total) for non-404
   non-2xx responses + network errors. 404 still terminal.
   Also bump the sync.ts skip site from silent continue to
   log.warn so an operator can grep journalctl after the fact.

2) Latent case-normalisation gaps
   PR #11 lowercased logs.address + logs.tx_hash on insert but
   missed: blocks.{hash, parent_hash, validator, state_root} and
   token_transfers.tx_hash. Today's chain RPC returns lowercase
   so all of these columns are clean (verified live on testnet +
   mainnet — zero mixed-case rows), but the contract isn't
   guaranteed across viem versions / future RPC changes, and
   downstream queries all assume lowercase storage. Add
   defensive .toLowerCase() at the insert site so a future RPC
   regression can't silently break JOINs and address-history
   filters.

Co-authored-by: satyakwok <satyakwok@users.noreply.github.com>
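
The retry behaviour in item 1 could be sketched as below. The delay schedule (250, 500, 1000, 2000 ms, summing to the 3.75 s quoted) is an assumption, as is treating those four delays as retries after an initial try; `fetchWithBackoff` and the result shape are hypothetical and not the chain client's actual API.

```typescript
// Hypothetical response and outcome shapes for illustration.
type FetchResult = { status: number; body?: unknown };
type Outcome =
  | { kind: "ok"; body: unknown }
  | { kind: "not_found" }
  | { kind: "gave_up"; error: unknown };

// Assumed schedule: doubles from 250 ms to 2 s (3750 ms in total).
const BACKOFF_MS: readonly number[] = [250, 500, 1000, 2000];

async function fetchWithBackoff(
  doFetch: () => Promise<FetchResult>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<Outcome> {
  let lastError: unknown = null;
  for (let attempt = 0; attempt <= BACKOFF_MS.length; attempt++) {
    if (attempt > 0) await sleep(BACKOFF_MS[attempt - 1]);
    try {
      const res = await doFetch();
      if (res.status >= 200 && res.status < 300) {
        return { kind: "ok", body: res.body };
      }
      if (res.status === 404) return { kind: "not_found" }; // terminal, no retry
      lastError = new Error(`HTTP ${res.status}`); // 5xx, 429, etc.: retry
    } catch (err) {
      lastError = err; // network error: retry
    }
  }
  // Caller can log a warning here instead of silently skipping the tx.
  return { kind: "gave_up", error: lastError };
}
```

Returning a distinct `gave_up` outcome, rather than null, lets the caller tell a missing transaction apart from an exhausted retry budget.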