fix(eth): read handlers participate in failover; method-rejection counted as transient #62

Merged
BitHighlander merged 2 commits into develop from fix/eth-read-handlers-failover
May 1, 2026

@BitHighlander
Collaborator

Symptom (CowSwap)

```
| ethereumHandler | | getProvider | | Trying RPC [1/25]: https://rpc.flashbots.net/
| ethereumHandler | | getProvider | | ✅ RPC working! Block: 24997148
| handleWalletRequest | Error processing method eth_call: server response 403
responseBody: {"jsonrpc":"2.0","error":{"code":-32601,"message":"rpc method is not whitelisted"}}
```

Repeated on every subsequent eth_call — Flashbots gets picked first every time.

Root cause

Flashbots' RPC is narrow-purpose (privacy/MEV protection): it serves `eth_sendRawTransaction`, `eth_chainId`, and `eth_blockNumber`, and rejects everything else with HTTP 403 + JSON-RPC `-32601` (`rpc method is not whitelisted`). Pioneer's catalog ranks it in the top tier (correct for its actual purpose), so:

  1. `getProvider()` picks it first
  2. The pre-flight `getBlockNumber()` test passes (it's one of the supported methods)
  3. `getProvider()` returns the Flashbots provider
  4. The actual handler call (`eth_call` / `eth_getBalance` / `eth_getCode` / `eth_estimateGas` / etc.) hits 403 → throws → handler dies
  5. `failedRpcs` cooldown is never set because the cooldown is only updated when the pre-flight test fails, not when the actual call fails
  6. Next request: same flow, same Flashbots, same 403
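The six steps above can be modelled in a few lines. This is a simplified, synchronous sketch and not the extension's code: `FakeRpc`, `callRpc`, `getProviderModel`, and `failedRpcsModel` are hypothetical names, and the `.example` URLs are placeholders. It only illustrates why a cooldown written solely on pre-flight failure never fires for an endpoint that answers `eth_blockNumber` but rejects `eth_call`.

```typescript
// Hypothetical model of an RPC endpoint that serves only some methods.
interface FakeRpc {
  url: string;
  supportedMethods: string[];
}

// Stand-in for the cooldown map (url -> time of last recorded failure).
const failedRpcsModel = new Map<string, number>();

// Simulates a request: unsupported methods fail like the log in the Symptom section.
function callRpc(rpc: FakeRpc, method: string): string {
  if (rpc.supportedMethods.indexOf(method) === -1) {
    throw new Error('server response 403: rpc method is not whitelisted');
  }
  return `${method} ok from ${rpc.url}`;
}

// Pre-fix selection: the only health check is eth_blockNumber,
// and the cooldown map is written only when that check fails.
function getProviderModel(candidates: FakeRpc[]): FakeRpc {
  for (const rpc of candidates) {
    if (failedRpcsModel.has(rpc.url)) continue;
    try {
      callRpc(rpc, 'eth_blockNumber');
      return rpc;
    } catch {
      failedRpcsModel.set(rpc.url, Date.now());
    }
  }
  throw new Error('no working RPC');
}

// A narrow-purpose endpoint ranked ahead of a general-purpose one.
const narrowRpc: FakeRpc = {
  url: 'https://narrow.example',
  supportedMethods: ['eth_sendRawTransaction', 'eth_chainId', 'eth_blockNumber'],
};
const generalRpc: FakeRpc = {
  url: 'https://general.example',
  supportedMethods: ['eth_blockNumber', 'eth_call', 'eth_getBalance'],
};
```

In this model, every call to `getProviderModel([narrowRpc, generalRpc])` returns `narrowRpc`, every `callRpc(narrowRpc, 'eth_call')` throws, and `failedRpcsModel` stays empty, so the next request repeats the same path.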

Fix — two parts

1. Classify method-rejection as transient

`isTransientRpcError` now also matches:

  • `rpc method is not whitelisted` (Flashbots-specific)
  • `method not found` / `method not supported` / `method does not exist` (generic)
  • `-32601` (canonical JSON-RPC code)
  • `403` (catches ethers' `server response 403` wrapper)

These are URL-level capability problems; trying another URL helps. Treating them as transient means the URL gets a 60s cooldown via `failedRpcs`.
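A minimal sketch of this kind of classification, assuming simple case-insensitive substring matching on the error message. `METHOD_REJECTION_PATTERNS` and `isMethodRejection` are hypothetical names; the real `isTransientRpcError` lives in `ethereumHandler.ts` and may be structured differently. The 403 entries use the narrowed forms from the follow-up review commit rather than a bare `403`.

```typescript
// Hypothetical pattern list, taken from the bullets above.
const METHOD_REJECTION_PATTERNS: readonly string[] = [
  'rpc method is not whitelisted', // Flashbots-specific
  'method not found',
  'method not supported',
  'method does not exist',
  '-32601', // canonical JSON-RPC "method not found" code
  'server response 403', // ethers transport wrapper, as seen in the Symptom log
  'http 403',
];

// Returns true when the error text looks like a URL-level capability
// rejection, i.e. a different endpoint might serve the same request.
function isMethodRejection(err: unknown): boolean {
  const message = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return METHOD_REJECTION_PATTERNS.some((pattern) => message.includes(pattern));
}
```

Matching on `server response 403` instead of `403` alone keeps a revert reason or hex payload that happens to contain those digits from being treated as a capability problem.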

2. All 14 EVM read handlers migrate to `withRpcFailover`

`eth_getBlockByNumber`, `eth_blockNumber`, `eth_getBalance`, `eth_getTransactionReceipt`, `eth_getTransactionByHash`, `web3_clientVersion`, `eth_call`, `eth_max{Priority,}FeePerGas`, `eth_estimateGas`, `eth_gasPrice`, `eth_feeHistory`, `eth_getCode`, `eth_getStorageAt`, `eth_getTransactionCount` — each now runs through the failover loop with classification, transport timeout, and per-attempt URL rotation.

Plus two cross-cutting reads also migrated:

  • The fee-warning preflight bundle (4 reads in `Promise.all` on the sign path) wrapped as a single op, so all four reads are served by the same RPC URL and fail over together.
  • The side-panel transfer build (nonce + estimateGas + getFeeData) — each fails over independently.

`getProvider()` remains only as the drop-check fallback when the SW restart loses the success-URL hint. All other paths share `failedRpcs` through `withRpcFailover` so failover decisions are consistent.
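The failover loop with a shared cooldown can be sketched as follows. This is an illustration under stated assumptions, not the actual `withRpcFailover`: the function name `withRpcFailoverSketch`, its signature, the injected `now` clock, and the reduced `isTransientSketch` check are all hypothetical, and the transport timeout mentioned above is omitted for brevity.

```typescript
type RpcOp<T> = (url: string) => Promise<T>;

const COOLDOWN_MS = 60 * 1000; // 60s cooldown, as described in part 1
const failedRpcs = new Map<string, number>(); // url -> time of last transient failure

// Reduced stand-in for the real classifier.
function isTransientSketch(err: unknown): boolean {
  const msg = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return msg.includes('-32601') || msg.includes('server response 403') || msg.includes('timeout');
}

// Tries each candidate URL in order, skipping any that are cooling down.
// Transient failures mark the URL and move on; other errors propagate at once.
async function withRpcFailoverSketch<T>(
  urls: string[],
  op: RpcOp<T>,
  now: () => number = Date.now,
): Promise<T> {
  let lastError: unknown = new Error('no RPC candidates available');
  for (const url of urls) {
    const failedAt = failedRpcs.get(url);
    if (failedAt !== undefined && now() - failedAt < COOLDOWN_MS) continue;
    try {
      return await op(url);
    } catch (err) {
      // A genuine revert would fail identically on every URL, so do not rotate.
      if (!isTransientSketch(err)) throw err;
      failedRpcs.set(url, now());
      lastError = err;
    }
  }
  throw lastError;
}
```

Because every caller reads and writes the same `failedRpcs` map, a rejection seen by one handler steers all other handlers away from that URL until the cooldown expires.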

Files

  • `chrome-extension/src/background/chains/ethereumHandler.ts`: `isTransientRpcError` expanded; 14 read handlers + 2 cross-cutting read paths (fee-warning preflight bundle, side-panel transfer build) migrated.

Test plan

  • Visit CowSwap on Ethereum mainnet — multicall reads no longer 403; quote loads.
  • Visit Uniswap, do a multi-hop swap quote — `eth_call` reads complete.
  • Trigger Permit2 simulation (Uniswap pre-quote) — `stateOverride` passthrough still works.
  • Watch console: after the first Flashbots 403, subsequent reads should hit a different URL ([rpcFailover] log shows `transient failure, trying next:`).
  • Wait 60s; Flashbots should re-enter the rotation. Browse to a non-call route (just balance) — Flashbots may or may not be picked first depending on candidate ordering.
  • Existing flows (BSC switch, signMessage, broadcast failover) — no regression.

🤖 Generated with Claude Code

BitHighlander and others added 2 commits April 30, 2026 21:41
…nted as transient

CowSwap, Uniswap multicalls, and most modern dApps make heavy use of eth_call / eth_getCode / eth_getStorageAt / eth_estimateGas / eth_feeHistory. These were single-shot against getProvider()'s currently-selected URL with no failover, so a narrow-purpose RPC like Flashbots — which Pioneer's catalog ranks in the top tier and which getProvider's pre-flight getBlockNumber() probe always passes — would get picked first on every read and 403 with -32601 'rpc method is not whitelisted'. Repeated requests preferred Flashbots again because the failedRpcs cooldown only fires when the pre-flight test fails (it doesn't on Flashbots).

Two-part fix:

1. isTransientRpcError now classifies method-rejection as transient: 'rpc method is not whitelisted', 'method not found', 'method not supported', '-32601', '403'. Each failure records the URL in failedRpcs for 60s so the next call goes elsewhere.

2. All 14 EVM read handlers (eth_getBlockByNumber, eth_blockNumber, eth_getBalance, eth_getTransactionReceipt, eth_getTransactionByHash, web3_clientVersion, eth_call, eth_max{Priority,}FeePerGas, eth_estimateGas, eth_gasPrice, eth_feeHistory, eth_getCode, eth_getStorageAt, eth_getTransactionCount) now route through withRpcFailover instead of getProvider(). Two cross-cutting reads (sign-flow fee-warning preflight bundle, transfer-build nonce/gas/fee chain) also migrated.

getProvider remains only as the drop-check fallback when the SW restart loses the success-URL hint. All other paths share failedRpcs through withRpcFailover so failover decisions are consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two findings from PR #62's review.

P1: handleEthGetStorageAt called provider.getStorageAt(...), but ethers v6 renamed it to getStorage. The migration to withRpcFailover preserved the buggy call (it was already broken in the pre-PR code). TypeError at runtime; no failover happens because the failure is a JS-level bug, not an RPC error. Switched to raw send('eth_getStorageAt', params) — byte-exact with the dApp's request and sidesteps v5/v6 method-name divergence.

P2: isTransientRpcError matched bare .includes('403'). A revert reason or hex payload containing "403" would have been classified as transient, causing the same successfully-rejected eth_call / eth_estimateGas to be replayed across every RPC and unnecessarily cooling them. Narrowed to "server response 403" (ethers v6's transport wrapper format from the user's earlier log) and "http 403". The Flashbots case still matches via the existing "rpc method is not whitelisted" / "-32601" patterns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@BitHighlander BitHighlander merged commit 5f79b76 into develop May 1, 2026
3 of 4 checks passed
@BitHighlander BitHighlander deleted the fix/eth-read-handlers-failover branch May 1, 2026 02:48
BitHighlander added a commit that referenced this pull request May 1, 2026
Cumulative since 0.0.28:

  #57 feat: Pioneer-sourced EVM chain registry + Solana sign-message UX
  #58 feat(ux): risk-tiered approval colors, hex dump + copy, deep-link Chainlist, typed timeouts
  #59 fix(eth): RPC failover on broadcast + pin networks on JsonRpcProvider
  #60 fix(audit): UTXO build race + cross-chain broadcast resilience + read failover
  #61 fix: drop orphaned approval events on service-worker startup
  #62 fix(eth): read handlers participate in failover; method-rejection counted as transient
  #63 fix(masking): default MetaMask masking ON; readable contrast on Settings text

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>