Fix exponential backoff (10secs, 4xx accepted) #1060
Merged
This patch introduces two key refinements to the agent's exponential
backoff and retry logic, making it more patient and aligned with
standard network practices.
Increased Default Backoff Timings (config/base.rs)
The default values for the exponential backoff have been
significantly increased:
Initial Delay: Changed from 2 seconds to 10 seconds. This makes
the agent wait longer before the first retry, which is more suitable for
services that might be slow to initialize.
Maximum Delay: Changed from 60 seconds to 300 seconds (5
minutes). This allows the delay between retries to grow larger,
accommodating longer-term service disruptions.
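To make the effect of the new defaults concrete, here is a minimal
sketch of the resulting delay schedule. It assumes the delay doubles
on every attempt and is capped at the maximum, with no jitter; the
helper name backoff_delay is hypothetical and is not taken from
config/base.rs, and the agent's actual backoff policy may add jitter
or use a different multiplier.

```rust
use std::time::Duration;

/// Hypothetical helper (not from the patch): delay before retry number
/// `attempt` (0-based), assuming the delay doubles each attempt and is
/// capped at `max`. Jitter is deliberately left out of this sketch.
fn backoff_delay(initial: Duration, max: Duration, attempt: u32) -> Duration {
    // 2^attempt, falling back to u32::MAX if the power overflows.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // Saturating multiply avoids a panic on overflow; `min` applies the cap.
    initial.saturating_mul(factor).min(max)
}
```

Under these assumptions the new defaults give 10, 20, 40, 80, 160
seconds and then stay at 300, whereas the old defaults gave 2, 4, 8,
16, 32 seconds and then stayed at 60.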
Smarter Retry Strategy (resilient_client.rs)
The core logic in the custom StopOnSuccessStrategy now separates
transient failures from permanent ones when deciding whether to retry.
Before: The strategy would retry on any non-success status code,
including 4xx client errors (like 404 Not Found), which are typically
not temporary.
After: The strategy now delegates the decision for non-success
codes to reqwest-retry's built-in default_on_request_success function.
That default classifies responses by status code:
It will retry on 5xx server errors (e.g., 503 Service Unavailable).
It will NOT retry on most 4xx client errors (e.g., 404 Not
Found, 403 Forbidden), as these indicate a problem with the request
itself, not a temporary server issue. The exceptions are 408 Request
Timeout and 429 Too Many Requests, which it treats as transient.
Similarly, the handling of network errors is now delegated to
default_on_request_failure, so they are classified the same way as
in reqwest-retry's default strategy.
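
The before/after behavior for response status codes can be sketched
as a std-only model. This is not the code in resilient_client.rs and
it does not call reqwest-retry: the Retryable enum and both
classify_status_* functions are hypothetical stand-ins, and treating
408 and 429 as transient is an assumption about how
default_on_request_success handles the 4xx codes it does retry.

```rust
/// Hypothetical stand-in for a retry decision; `None` below means
/// "the request succeeded, nothing to retry".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Retryable {
    Transient, // worth retrying after a backoff delay
    Fatal,     // retrying will not help
}

/// Model of the old behavior: every non-success status is retried.
fn classify_status_before(status: u16) -> Option<Retryable> {
    if (200..300).contains(&status) {
        None
    } else {
        Some(Retryable::Transient)
    }
}

/// Model of the new behavior: success stops, 5xx is retried, and 4xx
/// is fatal except 408 and 429 (assumed to be the retried exceptions).
fn classify_status_after(status: u16) -> Option<Retryable> {
    match status {
        200..=299 => None,
        500..=599 => Some(Retryable::Transient),
        408 | 429 => Some(Retryable::Transient),
        _ => Some(Retryable::Fatal),
    }
}
```

In this model a 404 used to be retried until the backoff policy gave
up, and is now reported as a failure immediately, while a 503 is
retried in both versions.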