fix(audit): UTXO build race + cross-chain broadcast resilience + read failover #60
Merged
BitHighlander merged 2 commits into develop on Apr 30, 2026
… failover

Addresses the eight remaining audit findings from PR #59's review.

P1: UTXO build-then-approval race (#1) and broadcast resilience (#2)
- bitcoin/dogecoin/litecoin/bitcoincash/dash all started buildTx() fire-and-forget and created the approval event afterwards. Depending on timing, the race produced phantom approvals (response.unsignedTx undefined) or a null storedEvent. buildTx is now awaited BEFORE addEvent, with unsignedTx attached to the event from the start. The same pattern is applied to thorchain/cosmos/maya/osmosis/ripple, which already awaited the build but used raw fetch() without timeout/retry.
- Both build and broadcast now go through fetchJsonWithTimeout with a 15s budget + 1 retry on 5xx.

P2: cross-cutting resilience
- New chrome-extension/src/background/fetchUtils.ts with fetchJsonWithTimeout + fetchWithTimeout. Applied to all six Pioneer endpoints in index.ts (portfolio, insight, nodes, tokens/metadata, tokens/custom GET/POST/DELETE, tokens/balances) and the UTXO/Tendermint chain handlers above.
- New chrome-extension/src/background/chains/rpcFailover.ts with withRpcFailoverByNetworkId. Iterates user-override → Pioneer → last-resort with transient/definitive classification and a per-attempt transport timeout. Applied to GET_ASSET_BALANCE, GET_EVM_BALANCE, VALIDATE_ERC20_TOKEN.
- Drop check binds to the URL that accepted the broadcast: querying a different RPC could produce false-positive drop warnings, especially after a last-resort fallback succeeded. dropCheckUrlByHash maps hash → success URL; performDropCheck queries that exact URL with a 4s timeout, falling back to getProvider only on service-worker restart.
- Solana broadcastTransaction iterates SOLANA_RPC_URLS on transient failures (rate limit, 5xx, timeout). Definitive errors (insufficient funds, blockhash expired, signature verification, account-in-use) skip the loop. The cached health-checked URL is invalidated on failure so the next caller re-tests.
- Tron tronGridPost (injected bundle, can't import fetchUtils) gets inline timeout + one retry on 5xx. Broadcast paths get a longer 12s budget; reads stay at 8s.
- Health poll (checkKeepKey) gets AbortSignal.timeout(3000) + a singleflight guard so a stalled localhost:1646 can't stack overlapping probes every 5s.

Out of scope: fetchBalances per-chain RPC enrichment (line 500) still uses a single makeStaticProvider per chain since it's already inside a chain-iteration loop; further failover there would be over-engineering. GET_GAS_ESTIMATE (line 909) still uses the active provider directly and could be migrated to withRpcFailover later.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
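For readers without the diff open, a helper along the lines of fetchJsonWithTimeout could look roughly like the sketch below. It assumes the 15s budget applies per attempt and that only 5xx responses are retried (network errors and timeouts are thrown as-is). The option names (timeoutMs, retries, fetchImpl) and the Sketch suffix are hypothetical; the real fetchUtils.ts is not reproduced in this commit message and may differ.

```typescript
// Hypothetical option names; the real fetchUtils.ts signature may differ.
interface FetchJsonOptions {
  timeoutMs?: number;       // per-attempt budget (assumed; could also be a total budget)
  retries?: number;         // extra attempts allowed after a 5xx response
  fetchImpl?: typeof fetch; // injectable so the helper can be tested offline
}

async function fetchJsonWithTimeoutSketch<T>(
  url: string,
  init: RequestInit = {},
  opts: FetchJsonOptions = {},
): Promise<T> {
  const { timeoutMs = 15_000, retries = 1, fetchImpl = fetch } = opts;
  for (let attempt = 0; ; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      const res = await fetchImpl(url, { ...init, signal: controller.signal });
      // Retry only server-side failures, and only while attempts remain.
      if (res.status >= 500 && attempt < retries) continue;
      // Check ok before parsing so an error page is not reported as bad JSON.
      if (!res.ok) throw new Error(`HTTP ${res.status} from ${url}`);
      return (await res.json()) as T;
    } finally {
      clearTimeout(timer); // runs on success, on throw, and before a retry
    }
  }
}
```

A 4xx response is surfaced immediately, while a single 503 followed by a 200 returns the parsed body from the second attempt.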
Two findings from PR #60's review.

P1: Solana broadcast classifier was over-eager about definitive errors.
- 'already processed' is treated as success: the tx is in mempool / processed somewhere, and the dApp still needs the signature. Recover it from the signed bytes (the first 64-byte sig in the signed-tx layout IS the canonical Solana transaction signature). Inline base58 encoder added; it reuses the existing BASE58_ALPHABET const that base58Decode already declared.
- 'blockhash not found' demoted from definitive to transient: it can be RPC freshness for dApp-supplied transactions, so trying the next URL is worth it. 'block height exceeded' stays definitive (the blockhash window has truly closed).
- 'account in use' demoted to transient: usually a parallel-write race that can resolve on the next URL.

P2: bitcoinHandler's broadcast catch logged + sent transaction_error but didn't re-throw, so the case 'transfer' fell through to default and returned 'Method transfer not supported' to the dApp instead of the real broadcast error. Other UTXO/Tendermint handlers don't have this catch, only bitcoin.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
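The reclassification and the signature recovery described in this commit can be sketched as follows. This is an illustration, not the code from the diff: the function names and the substring matching on error text are assumptions, and treating unknown errors as transient is a choice made here for the example. The signed-tx layout relied on is Solana's wire format (a compact-u16 signature count, then 64-byte signatures, then the message).

```typescript
const BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Plain base-256 to base-58 conversion; leading zero bytes become leading '1's.
function base58Encode(bytes: Uint8Array): string {
  let zeros = 0;
  while (zeros < bytes.length && bytes[zeros] === 0) zeros++;
  const digits: number[] = []; // base-58 digits, least significant first
  for (let i = zeros; i < bytes.length; i++) {
    let carry = bytes[i];
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }
  let out = "1".repeat(zeros);
  for (let k = digits.length - 1; k >= 0; k--) out += BASE58_ALPHABET[digits[k]];
  return out;
}

// Skips the compact-u16 signature count, then encodes the first 64-byte signature.
function firstSignatureBase58(signedTx: Uint8Array): string {
  let offset = 0;
  let count = 0;
  let shift = 0;
  for (;;) {
    if (offset >= signedTx.length) throw new Error("truncated transaction");
    const b = signedTx[offset++];
    count |= (b & 0x7f) << shift;
    if ((b & 0x80) === 0) break; // high bit clear: last length byte
    shift += 7;
  }
  if (count < 1 || signedTx.length < offset + 64) throw new Error("no signature present");
  return base58Encode(signedTx.subarray(offset, offset + 64));
}

type BroadcastVerdict = "success" | "transient" | "definitive";

// Hypothetical classifier keyed on error text, following the bullets above.
function classifySolanaBroadcastError(message: string): BroadcastVerdict {
  const m = message.toLowerCase();
  if (m.includes("already processed")) return "success";
  if (m.includes("block height exceeded")) return "definitive";
  if (m.includes("insufficient funds")) return "definitive";
  if (m.includes("signature verification")) return "definitive";
  // 'blockhash not found', 'account in use', rate limits, 5xx, timeouts:
  // try the next URL. Unknown errors also land here (an assumption).
  return "transient";
}
```

On a "success" verdict the caller would return firstSignatureBase58(signedBytes) to the dApp instead of an error.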
Summary
Closes the eight remaining audit findings from #59's review.
P1 — release blockers
#1 UTXO build-then-approval race (5 chains)
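The ordering fix for this finding, in miniature: build first, then create the approval event with the transaction already attached. Only buildTx, addEvent and unsignedTx are names from the PR; the event shape, the createApprovalEvent wrapper and its parameters are hypothetical and exist purely to make the ordering testable.

```typescript
// Hypothetical event shape for illustration only.
interface ApprovalEvent {
  id: string;
  chain: string;
  unsignedTx: unknown;
}

async function createApprovalEvent(
  id: string,
  chain: string,
  buildTx: () => Promise<unknown>,
  addEvent: (event: ApprovalEvent) => Promise<void>,
): Promise<ApprovalEvent> {
  // 1. Await the build. If it fails, no approval prompt is ever created.
  const unsignedTx = await buildTx();
  if (unsignedTx == null) throw new Error(`${chain}: build returned no unsignedTx`);
  // 2. Create the event with unsignedTx attached, so there is no later
  //    update step that could race with the user's approval.
  const event: ApprovalEvent = { id, chain, unsignedTx };
  await addEvent(event);
  return event;
}
```

Because the event never exists without its transaction, neither the null-lookup nor the undefined-unsignedTx outcome can occur in this arrangement.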
bitcoin/dogecoin/litecoin/bitcoincash/dash all started buildTx() fire-and-forget, then created the approval event. Two race outcomes:
- addEvent → getEventById returned null and silently dropped the unsignedTx update.
- response.unsignedTx was undefined, so signing crashed or used stale data.
The handlers now await fetchJsonWithTimeout(...) for the build BEFORE creating the event, with unsignedTx attached to the event from creation. Same pattern applied to thorchain/cosmos/maya/osmosis/ripple for consistency (they already awaited but used raw fetch).
#2 UTXO/Tendermint broadcast no timeout/retry/failover
All build and broadcast calls now go through fetchJsonWithTimeout with 15s budget + 1 retry on 5xx. Explicit response.ok check before .json() so a transient Pioneer hiccup surfaces as a clean error instead of a malformed-JSON parse failure.
P2 — cross-cutting resilience
#5 EVM read failover
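A rough sketch of the read failover this section covers: try URLs in priority order, give each attempt its own timeout, move on after transient errors and stop on definitive ones. It takes an explicit URL list rather than a networkId, and the classifier regex, the helper names and the Sketch suffix are all hypothetical; the real rpcFailover.ts also keeps a failedRpcs cooldown that is omitted here.

```typescript
type Verdict = "transient" | "definitive";

// Hypothetical classifier: reverts and bad input will not improve on another RPC.
function classifyRpcError(err: unknown): Verdict {
  const msg = err instanceof Error ? err.message : String(err);
  return /revert|invalid (address|argument)/i.test(msg) ? "definitive" : "transient";
}

// Rejects if `work` has not settled within `ms`; always clears its timer.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timeout after ${ms}ms: ${label}`)), ms);
    work.then(resolve, reject).finally(() => clearTimeout(timer));
  });
}

// user-override → Pioneer URLs → last-resort, without duplicates.
function orderedUrls(userOverride: string | undefined, pioneer: string[], lastResort: string[]): string[] {
  const all = [...(userOverride ? [userOverride] : []), ...pioneer, ...lastResort];
  return Array.from(new Set(all));
}

async function withRpcFailoverSketch<T>(
  urls: string[],
  op: (url: string) => Promise<T>,
  timeoutMs = 5_000,
): Promise<T> {
  let lastErr: unknown = new Error("no RPC URLs available");
  for (const url of urls) {
    try {
      return await withTimeout(op(url), timeoutMs, url);
    } catch (err) {
      lastErr = err;
      // A definitive error is the answer; another RPC would only repeat it.
      if (classifyRpcError(err) === "definitive") throw err;
    }
  }
  throw lastErr; // every URL failed transiently
}
```

A balance read would then be wrapped as withRpcFailoverSketch(orderedUrls(override, pioneerUrls, fallbackUrls), (url) => readBalance(url)), where readBalance stands in for whatever provider call the handler makes.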
New chrome-extension/src/background/chains/rpcFailover.ts with withRpcFailoverByNetworkId(networkId, op, options). Iterates user-override → Pioneer URLs → last-resort with transient/definitive classification and a 5s transport timeout per attempt. Applied to GET_ASSET_BALANCE, GET_EVM_BALANCE, VALIDATE_ERC20_TOKEN. Distinct failedRpcs cooldown from ethereumHandler's active-provider path so cross-chain rate limits stay independent.
#6 Drop check binds to success URL
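The binding in this section boils down to a small in-memory map from transaction hash to the URL that accepted the broadcast. The sketch below shows only that bookkeeping; timers and the actual RPC query are left out. dropCheckUrlByHash is the name used in the commit message, while the three helper functions are hypothetical.

```typescript
// hash → URL that accepted the broadcast. Lives in memory only, so a
// service-worker restart empties it and callers fall back to getProvider.
const dropCheckUrlByHash = new Map<string, string>();

// Called when a broadcast succeeds; successUrl is optional, as in scheduleDropCheck.
function recordSuccessUrl(hash: string, successUrl?: string): void {
  if (successUrl) dropCheckUrlByHash.set(hash, successUrl);
}

// Prefer the recorded URL; otherwise use whatever the provider picker returns.
function urlForDropCheck(hash: string, getProviderUrl: () => string): string {
  return dropCheckUrlByHash.get(hash) ?? getProviderUrl();
}

// Called once the last scheduled check for this hash has run.
function clearDropCheck(hash: string): void {
  dropCheckUrlByHash.delete(hash);
}
```

Querying the recorded URL matters most after a last-resort fallback, since the preferred RPC may never have seen the transaction.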
scheduleDropCheck(hash, delayMs, successUrl?) now records the URL that accepted the broadcast in a hash→URL map. performDropCheck queries that exact URL with a 4s transport timeout instead of running getProvider() again, which could pick a different RPC that never saw the tx and produce false-positive drop warnings (especially after last-resort fallback succeeded). Map cleared after the latest scheduled check (45s). Service-worker restart loses the URL hint and falls back to getProvider (best effort).
#7 Solana send-time failover
broadcastTransaction iterates SOLANA_RPC_URLS (cached health-pick first, then any others). Transient errors (rate limit, 5xx, timeout) fail over; definitive errors (insufficient funds, blockhash expired, signature verification, account in use, already processed) skip the loop. Cached health pick invalidated on failure so the next caller re-tests.
#8 Tron raw broadcast resilience
tronGridPost (injected bundle, can't share fetchUtils) gets inline timeout + one retry on 5xx. Broadcast paths use 12s budget; reads stay at 8s.
#9 Health poll singleflight
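The singleflight guard for the health poll can be sketched as below. healthPollInflight is the flag name from the PR; the injected probe, the result type and the Sketch suffix are hypothetical. In the extension the probe would presumably be a fetch against localhost:1646 with signal: AbortSignal.timeout(3000), which rejects once the 3s budget is spent.

```typescript
// Module-level flag, mirroring the healthPollInflight guard named in the PR.
let healthPollInflight = false;

type PollResult = "up" | "down" | "skipped";

// `probe` is injected so the guard can be exercised without a device attached.
async function checkKeepKeySketch(probe: () => Promise<boolean>): Promise<PollResult> {
  if (healthPollInflight) return "skipped"; // previous probe still pending
  healthPollInflight = true;
  try {
    return (await probe()) ? "up" : "down";
  } catch {
    return "down"; // timeouts and connection errors both count as down
  } finally {
    healthPollInflight = false; // always release, even when the probe throws
  }
}
```

With a 5s tick and a 3s probe timeout, a tick that finds the flag set simply returns "skipped" rather than queuing another request behind the stalled one.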
checkKeepKey gets AbortSignal.timeout(3000) + a healthPollInflight guard. A stalled localhost:1646 request can't stack overlapping probes every 5s; worst case is one stalled request hung for 3s before the next tick can run.
#10 Pioneer endpoint timeouts
New chrome-extension/src/background/fetchUtils.ts with fetchJsonWithTimeout + fetchWithTimeout. Applied to all Pioneer endpoints (portfolio, insight, nodes, tokens/metadata, tokens/custom GET/POST/DELETE, tokens/balances) plus all UTXO/Tendermint chain handlers.
Out of scope
- fetchBalances per-chain RPC enrichment at index.ts:500: already inside an outer chain-iteration loop; nested failover is over-engineering.
- GET_GAS_ESTIMATE at index.ts:909: still uses the active provider directly. Migrate to withRpcFailover (the active-provider variant in ethereumHandler) in a follow-up if rate-limit complaints surface there.
Files
- chrome-extension/src/background/fetchUtils.ts
- chrome-extension/src/background/chains/rpcFailover.ts
Test plan
- [DROP-CHECK] line shows the right URL was queried
🤖 Generated with Claude Code