Add a timeout to balance override detection#4317
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a configurable timeout for balance override detection to prevent slow token implementations from stalling the quote pipeline. A critical issue was identified regarding the interaction between the timeout and batched RPC requests; specifically, a timeout on a single token could cause the entire batch to be incorrectly marked as not found and cached. To prevent these false negatives, it is recommended to return None on timeout instead of an error, thereby skipping the caching logic for that specific instance.
MartinquaXD
left a comment
There was a problem hiding this comment.
Because all verification calls are batched into a single JSON-RPC request by the ethrpc transport layer, one slow call blocks the entire batch.
This is not entirely correct btw. The current code checks overrides serially in a loop. That prevents those calls from landing in the same batch. (i.e. second call can only be enqueued for batching after the first one already returned a result)
| thiserror::Error, | ||
| }; | ||
|
|
||
| pub const DEFAULT_VERIFICATION_TIMEOUT: Duration = Duration::from_secs(1); |
There was a problem hiding this comment.
This can be private and moved into the test module since the config defines its timeout default separately, right?
There was a problem hiding this comment.
It's used in e2e and in the balances overrides constructor (internally)
I can create a separate Detector constructor that brings this by default but I don't think it's worth it
(Yet, another argument for the "config inside configured struct crate" strategy 😛)
There was a problem hiding this comment.
Ahh, sorry. I checked out the branch and grepped for that identifier but the usage in e2e didn't show up for some reason. 👌
Description
Balance override detection can stall the quote pipeline when probing certain tokens. Reflection tokens (e.g. LuckyBlock) implement
balanceOfby iterating over a storage array in_getReflectionRate(). During strategy verification, we override storage slots with a test value — if that value lands on an array-length slot, the EVM loops until the node's execution timeout.This adds a configurable timeout around the full
detect()call incached_detectionto prevent slow tokens from blocking quote processing. The default is 1 second.Changes
detection_timeoutfield toBalanceOverrideswith a default of 1sdetector.detect()intokio::time::timeoutinsidecached_detection, treating timeouts asDetectionError::NotFound(which gets cached to avoid retrying)detection-timeout-secsconfig field toBalanceOverridesConfigso the timeout is tunable per deploymentHow to test
Staging
Before applying this
After applying this