PLA-693: add upstream brownout mode for sustained 429/capacity incidents #36

Merged
0x666c6f merged 2 commits into morpho-main from feature/pla-693-erpc-brownout-mode-for-sustained-upstream-429capacity
Mar 4, 2026

0x666c6f commented Mar 3, 2026

Summary

  • add upstream brownout state machine in UpstreamsRegistry to contain sustained remote 429/capacity pressure
  • phases: open -> closed -> half_open -> open with cooldown + half-open window
  • add fail-open safety: if brownout would exclude all non-cordoned upstreams, routing keeps upstreams available
  • add dedicated metrics: erpc_upstream_brownout_state and erpc_upstream_brownout_transition_total
  • add monitoring/runbook docs for 429 incidents and brownout observability
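
For reviewers who want a mental model of the phase cycle and the fail-open rule, here is a minimal sketch. It is not the code in `UpstreamsRegistry`: the `Brownout` and `Upstream` types, `Step`, and `Routable` are hypothetical names. It assumes "open" means the upstream is open for routing and "closed" means it is browned out, and that a failed half-open probe sends the upstream back to closed (the summary only lists the forward path).

```go
package main

import (
	"fmt"
	"time"
)

// Phase names come from the PR summary. Assumption: "open" means the upstream
// is open for routing and "closed" means it is browned out.
type Phase string

const (
	PhaseOpen     Phase = "open"
	PhaseClosed   Phase = "closed"
	PhaseHalfOpen Phase = "half_open"
)

// Brownout is a hypothetical per-upstream tracker, not the registry's type.
type Brownout struct {
	Phase    Phase
	Since    time.Time     // when the current phase was entered
	Cooldown time.Duration // time spent closed before probing
	HalfOpen time.Duration // length of the probing window
}

// Step advances the phase. throttled reports whether the upstream currently
// exceeds the throttling thresholds.
func (b *Brownout) Step(now time.Time, throttled bool) Phase {
	elapsed := now.Sub(b.Since)
	switch {
	case b.Phase == PhaseOpen && throttled:
		b.Phase, b.Since = PhaseClosed, now
	case b.Phase == PhaseClosed && elapsed >= b.Cooldown:
		b.Phase, b.Since = PhaseHalfOpen, now
	case b.Phase == PhaseHalfOpen && throttled:
		// Assumption: a failed probe restarts the cooldown.
		b.Phase, b.Since = PhaseClosed, now
	case b.Phase == PhaseHalfOpen && elapsed >= b.HalfOpen:
		b.Phase, b.Since = PhaseOpen, now
	}
	return b.Phase
}

// Upstream is a hypothetical routing candidate.
type Upstream struct {
	ID       string
	Cordoned bool
	Phase    Phase
}

// Routable skips cordoned and closed upstreams. If that would leave nothing,
// it fails open and returns every non-cordoned upstream.
func Routable(all []Upstream) []Upstream {
	var nonCordoned, allowed []Upstream
	for _, u := range all {
		if u.Cordoned {
			continue
		}
		nonCordoned = append(nonCordoned, u)
		if u.Phase != PhaseClosed {
			allowed = append(allowed, u)
		}
	}
	if len(allowed) == 0 {
		return nonCordoned
	}
	return allowed
}

func main() {
	t0 := time.Unix(0, 0)
	b := &Brownout{Phase: PhaseOpen, Since: t0, Cooldown: 2 * time.Minute, HalfOpen: 45 * time.Second}
	fmt.Println(b.Step(t0, true))                       // closed
	fmt.Println(b.Step(t0.Add(2*time.Minute), false))   // half_open
	fmt.Println(b.Step(t0.Add(165*time.Second), false)) // open

	ups := []Upstream{{ID: "a", Phase: PhaseClosed}, {ID: "b", Phase: PhaseClosed}}
	fmt.Println(len(Routable(ups))) // 2: fail-open keeps both
}
```

Whether half-open upstreams receive full or limited traffic is not stated in the summary; the sketch treats them as routable.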

Config (env)

  • ERPC_BROWNOUT_ENABLED (default true)
  • ERPC_BROWNOUT_MIN_REQUESTS (default 20)
  • ERPC_BROWNOUT_MIN_REMOTE_RATE_LIMITED (default 5)
  • ERPC_BROWNOUT_THROTTLED_RATE_THRESHOLD (default 0.25)
  • ERPC_BROWNOUT_COOLDOWN (default 2m)
  • ERPC_BROWNOUT_HALF_OPEN_DURATION (default 45s)
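
As a rough illustration of how these variables might be read and combined, the sketch below loads them with the listed defaults and applies one plausible trip rule. The `Config` struct, `envOr`, `LoadConfig`, and `ShouldTrip` are hypothetical names. The rule itself is an assumption: both minimums must be met and the rate-limited ratio must reach the threshold. The PR's actual parsing and comparison logic may differ.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// Config mirrors the env variables listed in the PR. The struct is hypothetical.
type Config struct {
	Enabled                bool
	MinRequests            int
	MinRemoteRateLimited   int
	ThrottledRateThreshold float64
	Cooldown               time.Duration
	HalfOpenDuration       time.Duration
}

// envOr parses the variable with parse and returns def if it is unset or invalid.
func envOr[T any](key string, def T, parse func(string) (T, error)) T {
	if raw, ok := os.LookupEnv(key); ok {
		if v, err := parse(raw); err == nil {
			return v
		}
	}
	return def
}

func parseFloat(s string) (float64, error) { return strconv.ParseFloat(s, 64) }

// LoadConfig reads the brownout settings, falling back to the documented defaults.
func LoadConfig() Config {
	return Config{
		Enabled:                envOr("ERPC_BROWNOUT_ENABLED", true, strconv.ParseBool),
		MinRequests:            envOr("ERPC_BROWNOUT_MIN_REQUESTS", 20, strconv.Atoi),
		MinRemoteRateLimited:   envOr("ERPC_BROWNOUT_MIN_REMOTE_RATE_LIMITED", 5, strconv.Atoi),
		ThrottledRateThreshold: envOr("ERPC_BROWNOUT_THROTTLED_RATE_THRESHOLD", 0.25, parseFloat),
		Cooldown:               envOr("ERPC_BROWNOUT_COOLDOWN", 2*time.Minute, time.ParseDuration),
		HalfOpenDuration:       envOr("ERPC_BROWNOUT_HALF_OPEN_DURATION", 45*time.Second, time.ParseDuration),
	}
}

// ShouldTrip is an assumed reading of the thresholds: enough traffic, enough
// remote rate-limited responses, and a rate-limited ratio at or above the threshold.
func (c Config) ShouldTrip(total, rateLimited int) bool {
	if !c.Enabled || total <= 0 || total < c.MinRequests || rateLimited < c.MinRemoteRateLimited {
		return false
	}
	return float64(rateLimited)/float64(total) >= c.ThrottledRateThreshold
}

func main() {
	cfg := LoadConfig()
	fmt.Printf("%+v\n", cfg)
	fmt.Println(cfg.ShouldTrip(40, 12)) // 12/40 = 0.30
}
```

Under this reading, the two minimums keep a low-traffic upstream from tripping on a handful of 429s, while the ratio keeps a busy upstream from tripping on a small absolute count.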

Tests

  • go test ./upstream -run 'TestUpstreamsRegistry_(ThrottlingOrdering|BrownoutStateMachine|BrownoutFailOpenWhenAllWouldBeExcluded)' -count=1
  • go test ./upstream -count=1
  • go test ./erpc -run 'TestHttpServer_EvmGetLogs/FailSplitIfOneOfSubRequestsFailsMissingData' -count=1
  • make agent-check
  • make agent-review-load
  • make agent-gate (fails in-suite due to a flaky integration segment; rerunning the failing test listed above passes)

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b69db9251

Comment thread upstream/registry.go Outdated
0x666c6f changed the base branch from feature/pla-692-erpc-optimize-eth_call-fallback-partial-retry-avoid-double to morpho-main on March 4, 2026, 16:49
0x666c6f merged commit 54fe505 into morpho-main on Mar 4, 2026
1 check passed