fix(audit): UTXO build race + cross-chain broadcast resilience + read failover by BitHighlander · Pull Request #60 · keepkey/keepkey-client

BitHighlander · 2026-04-30T22:24:28Z

Summary

Closes the eight remaining audit findings from #59's review.

P1 — release blockers

#1 UTXO build-then-approval race (5 chains)

bitcoin/dogecoin/litecoin/bitcoincash/dash all started buildTx() fire-and-forget then created the approval event. Two race outcomes:

Build resolved before addEvent → getEventById returned null and silently dropped the unsignedTx update.
User approved before build resolved → response.unsignedTx was undefined, signing crashed or used stale data.

Now await fetchJsonWithTimeout(...) for the build BEFORE creating the event, with unsignedTx attached to the event from creation. Same pattern applied to thorchain/cosmos/maya/osmosis/ripple for consistency (they already awaited but used raw fetch).

#2 UTXO/Tendermint broadcast no timeout/retry/failover

All build and broadcast calls now go through fetchJsonWithTimeout with 15s budget + 1 retry on 5xx. Explicit response.ok check before .json() so a transient Pioneer hiccup surfaces as a clean error instead of malformed-JSON parse failure.

P2 — cross-cutting resilience

#5 EVM read failover

New chrome-extension/src/background/chains/rpcFailover.ts with withRpcFailoverByNetworkId(networkId, op, options). Iterates user-override → Pioneer URLs → last-resort with transient/definitive classification and a 5s transport timeout per attempt. Applied to GET_ASSET_BALANCE, GET_EVM_BALANCE, VALIDATE_ERC20_TOKEN. Distinct failedRpcs cooldown from ethereumHandler's active-provider path so cross-chain rate limits stay independent.

#6 Drop check binds to success URL

scheduleDropCheck(hash, delayMs, successUrl?) now records the URL that accepted the broadcast in a hash→URL map. performDropCheck queries that exact URL with a 4s transport timeout instead of running getProvider() again — which could pick a different RPC that never saw the tx and produce false-positive drop warnings (especially after last-resort fallback succeeded). Map cleared after the latest scheduled check (45s). Service-worker restart loses the URL hint and falls back to getProvider — best effort.

#7 Solana send-time failover

broadcastTransaction iterates SOLANA_RPC_URLS (cached health-pick first, then any others). Transient errors (rate limit, 5xx, timeout) fail over; definitive errors (insufficient funds, blockhash expired, signature verification, account in use, already processed) skip the loop. Cached health pick invalidated on failure so the next caller re-tests.

#8 Tron raw broadcast resilience

tronGridPost (injected bundle — can't share fetchUtils) gets inline timeout + one retry on 5xx. Broadcast paths use 12s budget; reads stay at 8s.

#9 Health poll singleflight

checkKeepKey gets AbortSignal.timeout(3000) + a healthPollInflight guard. A stalled localhost:1646 request can't stack overlapping probes every 5s; worst case is one stalled request hung for 3s before the next tick can run.

#10 Pioneer endpoint timeouts

New chrome-extension/src/background/fetchUtils.ts with fetchJsonWithTimeout + fetchWithTimeout. Applied to all Pioneer endpoints (portfolio, insight, nodes, tokens/metadata, tokens/custom GET/POST/DELETE, tokens/balances) plus all UTXO/Tendermint chain handlers.

Out of scope

fetchBalances per-chain RPC enrichment at index.ts:500 — already inside an outer chain-iteration loop; nested failover is over-engineering.
GET_GAS_ESTIMATE at index.ts:909 — still uses the active provider directly. Migrate to withRpcFailover (the active-provider variant in ethereumHandler) in a follow-up if rate-limit complaints surface there.

Files

New: chrome-extension/src/background/fetchUtils.ts
New: chrome-extension/src/background/chains/rpcFailover.ts
Modified: 5 UTXO handlers, 5 Tendermint handlers, ethereumHandler, solanaHandler, tron-provider, index.ts (Pioneer timeouts + 3 read failover sites + health poll guard), injected.js bundle output

Test plan

BTC/LTC/DOGE/BCH/DASH transfer from side panel — popup opens with a populated unsignedTx (no flicker, no "loading transaction" hang)
Pioneer down test (block api.keepkey.info via /etc/hosts): build/broadcast surface a clean timeout error within ~15s instead of hanging forever
BSC/ETH balance refresh with one of Pioneer's primary RPCs rate-limiting — failover catches it; balance still loads
ERC-20 validation against a token on a Pioneer-rate-limited chain — fails over to the next URL
Solana broadcast with a flaky helius/solana endpoint — second URL accepts the tx
Tron transfer with TronGrid 5xx — retried once, succeeds on second attempt
Disconnect KeepKey for 30+ seconds, watch console for log volume — overlapping probes are gone (one log every 5s, not stacked)
Drop check after a broadcast that succeeded via last-resort RPC — [DROP-CHECK] line shows the right URL was queried
Existing flows from feat: Pioneer-sourced EVM chain registry + Solana sign-message UX #57/feat(ux): risk-tiered approval colors, hex dump + copy, deep-link Chainlist, typed timeouts #58/fix(eth): RPC failover on broadcast + pin networks on JsonRpcProvider #59: BSC switch, signMessage popup, broadcast failover — no regressions

🤖 Generated with Claude Code

… failover Addresses the eight remaining audit findings from PR #59's review. P1 — UTXO build-then-approval race (#1) and broadcast resilience (#2): - bitcoin/dogecoin/litecoin/bitcoincash/dash all awaited buildTx() fire-and-forget then created the approval event afterwards. Race produced phantom approvals (response.unsignedTx undefined) or null storedEvent depending on timing. Now buildTx is awaited BEFORE addEvent, with unsignedTx attached to the event from the start. Also applied to thorchain/cosmos/maya/osmosis/ripple — they already awaited the build but used raw fetch() without timeout/retry. - Both build and broadcast now go through fetchJsonWithTimeout with 15s budget + 1 retry on 5xx. P2 — cross-cutting resilience: - New chrome-extension/src/background/fetchUtils.ts with fetchJsonWithTimeout + fetchWithTimeout. Applied to all six Pioneer endpoints in index.ts (portfolio, insight, nodes, tokens/metadata, tokens/custom GET/POST/DELETE, tokens/balances) and the UTXO/Tendermint chain handlers above. - New chrome-extension/src/background/chains/rpcFailover.ts with withRpcFailoverByNetworkId. Iterates user-override → Pioneer → last-resort with transient/definitive classification and per-attempt transport timeout. Applied to GET_ASSET_BALANCE, GET_EVM_BALANCE, VALIDATE_ERC20_TOKEN. - Drop check binds to the URL that accepted the broadcast — querying a different RPC could produce false-positive drop warnings, especially after a last-resort fallback succeeded. dropCheckUrlByHash maps hash → success URL; performDropCheck queries that exact URL with a 4s timeout, falling back to getProvider only on service-worker restart. - Solana broadcastTransaction iterates SOLANA_RPC_URLS on transient failures (rate limit, 5xx, timeout). Definitive errors (insufficient funds, blockhash expired, signature verification, account-in-use) skip the loop. Cached health-checked URL is invalidated on failure so the next caller re-tests. - Tron tronGridPost (injected bundle, can't import fetchUtils) gets inline timeout + one retry on 5xx. Broadcast paths get a longer 12s budget; reads stay at 8s. - Health poll (checkKeepKey) gets AbortSignal.timeout(3000) + a singleflight guard so a stalled localhost:1646 can't stack overlapping probes every 5s. Out of scope: fetchBalances per-chain RPC enrichment (line 500) still uses a single makeStaticProvider per chain since it's already inside a chain-iteration loop; further failover there would be over-engineering. GET_GAS_ESTIMATE (line 909) still uses the active provider directly — could be migrated to withRpcFailover later. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two findings from PR #60's review. P1: Solana broadcast classifier was over-eager about definitive errors. - 'already processed' is treated as success — the tx is in mempool / processed somewhere; the dApp still needs the signature. Recover it from the signed bytes (first 64-byte sig in the signed-tx layout IS the canonical Solana transaction signature). Inline base58 encoder added; reuses the existing BASE58_ALPHABET const that base58Decode already declared. - 'blockhash not found' demoted from definitive to transient — can be RPC freshness for dApp-supplied transactions; trying the next URL is worth it. 'block height exceeded' stays definitive (the blockhash window has truly closed). - 'account in use' demoted to transient — usually a parallel-write race that can resolve on the next URL. P2: bitcoinHandler's broadcast catch logged + sent transaction_error but didn't re-throw, so the case 'transfer' fell through to default and returned 'Method transfer not supported' to the dApp instead of the real broadcast error. Other UTXO/Tendermint handlers don't have this catch — only bitcoin. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

BitHighlander and others added 2 commits April 30, 2026 17:23

BitHighlander merged commit 1df4d20 into develop Apr 30, 2026
3 of 4 checks passed

BitHighlander deleted the fix/audit-followup branch April 30, 2026 22:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(audit): UTXO build race + cross-chain broadcast resilience + read failover#60

fix(audit): UTXO build race + cross-chain broadcast resilience + read failover#60
BitHighlander merged 2 commits intodevelopfrom
fix/audit-followup

BitHighlander commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

BitHighlander commented Apr 30, 2026

Summary

P1 — release blockers

#1 UTXO build-then-approval race (5 chains)

#2 UTXO/Tendermint broadcast no timeout/retry/failover

P2 — cross-cutting resilience

#5 EVM read failover

#6 Drop check binds to success URL

#7 Solana send-time failover

#8 Tron raw broadcast resilience

#9 Health poll singleflight

#10 Pioneer endpoint timeouts

Out of scope

Files

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant