Revamp NetworkController RPC endpoint events #7166
Conversation
In a future commit we will introduce changes to `network-controller` so that it will keep track of the status of each network as requests are made. These updates to `createServicePolicy` assist with that. See the changelog for a list of changes to the `ServicePolicy` API. Besides the changes listed there, the tests for `createServicePolicy` have been refactored slightly so that it is easier to maintain in the future.
In a future commit we will introduce changes to `network-controller` so
that it will keep track of the status of each network as requests are
made. This commit paves the way for this to happen by redefining the
existing RPC endpoint-related events that NetworkController produces.
Currently, when requests are made through the network clients that
NetworkController exposes, three events are published:
- `NetworkController:rpcEndpointDegraded`
- Published when enough successive retriable errors are encountered
while making a request to an RPC endpoint that the maximum number of
retries is reached.
- `NetworkController:rpcEndpointUnavailable`
- Published when enough successive errors are encountered while making
a request to an RPC endpoint that the underlying circuit breaks.
- `NetworkController:rpcEndpointRequestRetried`
- Published when a request is retried (mainly used for testing).
It's important to note that in the context of the RPC failover feature,
an "RPC endpoint" can actually encompass multiple URLs, so the above
events actually fire for any URL.
While these events are useful for reporting metrics on RPC endpoints, in
order to effectively be able to update the status of a network, we need
events that are less granular and are guaranteed not to fire multiple
times in a row. We also need a new event.
Now the list of events looks like this:
- `NetworkController:rpcEndpointInstanceDegraded`
- The same as `NetworkController:rpcEndpointDegraded` before.
- `NetworkController:rpcEndpointInstanceUnavailable`
- The same as `NetworkController:rpcEndpointUnavailable` before.
- `NetworkController:rpcEndpointInstanceRetried`
- Renamed from `NetworkController:rpcEndpointRequestRetried`.
- `NetworkController:rpcEndpointDegraded`
- Similar to `NetworkController:rpcEndpointInstanceDegraded`, but
won't be published again if the RPC endpoint is already in a
degraded state.
- `NetworkController:rpcEndpointUnavailable`
- Published when all of the circuits underlying all of the URLs for an
RPC endpoint have broken (none of the URLs are available). Won't be
published again if the RPC endpoint is already in an unavailable
state.
- `NetworkController:rpcEndpointAvailable`
- A new event. Published the first time a successful request is made
to one of the URLs for an RPC endpoint, or following a degraded or
unavailable status.
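To make the coarse-grained guarantee concrete, here is a minimal sketch of how a consumer might track network status from these events. This uses a stand-in pub/sub class rather than the real controller messenger, and the payload fields (`chainId`, `networkClientId`) are assumptions based on the surrounding discussion, not the confirmed types:

```typescript
// Stand-in messenger — NOT the real @metamask/base-controller Messenger,
// just enough pub/sub to illustrate the subscription pattern.
type Listener<P> = (payload: P) => void;

class MiniMessenger {
  #listeners = new Map<string, Listener<unknown>[]>();

  subscribe<P>(event: string, listener: Listener<P>): void {
    const list = this.#listeners.get(event) ?? [];
    list.push(listener as Listener<unknown>);
    this.#listeners.set(event, list);
  }

  publish(event: string, payload: unknown): void {
    for (const listener of this.#listeners.get(event) ?? []) {
      listener(payload);
    }
  }
}

// Assumed payload shape for illustration only.
type EndpointEventPayload = { chainId: string; networkClientId: string };

const messenger = new MiniMessenger();
const statusByChain = new Map<string, 'available' | 'unavailable'>();

messenger.subscribe<EndpointEventPayload>(
  'NetworkController:rpcEndpointUnavailable',
  ({ chainId }) => statusByChain.set(chainId, 'unavailable'),
);
messenger.subscribe<EndpointEventPayload>(
  'NetworkController:rpcEndpointAvailable',
  ({ chainId }) => statusByChain.set(chainId, 'available'),
);

// Because the endpoint-level events are guaranteed not to fire twice in a
// row, this map only flips on genuine transitions.
messenger.publish('NetworkController:rpcEndpointUnavailable', {
  chainId: '0x1',
  networkClientId: 'mainnet',
});
```

Since the endpoint-level events fire only on transitions, the consumer needs no deduplication of its own.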
```ts
): Promise<JsonRpcResponse<Result | null>> {
  return this.#services[0].request(jsonRpcRequest, fetchOptions);
}
// Start with the primary (first) service and switch to failovers as the
```
Prior to these changes, each `RpcService` object could have an optional `failoverService` property. This class, `RpcServiceChain`, would then build a chain (really, a linked list) of services. To make a request, `RpcServiceChain` would call `request` on the first service in the chain, and that `RpcService` would decide whether it needed to call the next service in the chain, and so on.
While this model is easy to understand, I needed access to certain data along the way, and it seemed easier to use a loop than a linked list. In any case, I figured it should really be the responsibility of `RpcServiceChain` to manage how requests are sent across the chain.
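A simplified, synchronous sketch of the loop-based approach described here (the real implementation is asynchronous and lives in `RpcServiceChain`; the `MiniService`/`MiniServiceChain` names are illustrative):

```typescript
type MiniService = {
  endpointUrl: string;
  // Returns a result, or throws if the endpoint is unavailable.
  request: (method: string) => string;
};

class MiniServiceChain {
  readonly #services: MiniService[];

  constructor(services: [MiniService, ...MiniService[]]) {
    this.#services = services;
  }

  request(method: string): string {
    let lastError: unknown;
    // Start with the primary (first) service and fall through to failovers,
    // instead of each service delegating to a linked failoverService.
    for (const service of this.#services) {
      try {
        return service.request(method);
      } catch (error) {
        lastError = error;
      }
    }
    // Every service failed; surface the last error.
    throw lastError;
  }
}

const chain = new MiniServiceChain([
  {
    endpointUrl: 'https://primary.example',
    request: () => {
      throw new Error('down');
    },
  },
  { endpointUrl: 'https://failover.example', request: (method) => `ok:${method}` },
]);
const result = chain.request('eth_blockNumber');
```

The loop also makes it easy to observe "certain data along the way" (which service succeeded, how many failed before it) without walking a linked list.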
```ts
onRetry(
  listener: AddToCockatielEventData<
    Parameters<ServicePolicy['onRetry']>[0],
  listener: CockatielEventToEventListenerWithData<
```
This should do the same thing as the previous code. I just thought that `AddToCockatielEventData<Parameters<...>[0]>` was a bit ugly (and I added some more self-descriptive utility types, so this is one of them).
```ts
export function createAutoManagedNetworkClient<
  Configuration extends NetworkClientConfiguration,
>({
  networkClientId,
```
We now expose the network client ID in the `rpcEndpoint*` messenger events, so we need to receive it and pass it down to `createNetworkClient`.
```ts
  });
});

it('publishes the NetworkController:rpcEndpointUnavailable event when the failover occurs', async () => {
```
These tests were moved to `src/create-network-client-tests/rpc-endpoint-events.test.ts`, as I realized they didn't belong here. It's true that this test file is concerned with the RPC failover feature (which is required to test the `rpcEndpoint*` events), but all of the tests in the `tests/network-client` directory really exercise the middleware stack that `createNetworkClient` builds, and in doing so loop over all of the RPC methods that our internal provider handles specially. We don't need to go to all that trouble to test the `rpcEndpoint*` events; we can just use an arbitrary RPC method. (I think I will eventually rename `tests/network-client` to `tests/internal-provider-api` or something like that, but that's a PR for another time.)
```ts
/**
 * Obtains the event data type from a Cockatiel event or event listener type.
 */
export type ExtractCockatielEventData<CockatielEventOrEventListener> =
```
The Cockatiel types are a bit awkward to work with (especially since event listeners whose event payloads are empty are typed as `Event<void>`, which is rather strange). These utilities just make them a bit easier to work with.
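A sketch of the kind of utility being described. The `CockatielEvent<T>` shape here is a simplified model of Cockatiel's `Event<T>` (a function that accepts a listener), and the exact definition of `ExtractCockatielEventData` in the PR may differ:

```typescript
// Simplified model of a Cockatiel event: a function that takes a listener.
type CockatielEvent<T> = (listener: (data: T) => void) => void;

// Recover the payload type from either an event or a bare listener.
type ExtractCockatielEventData<CockatielEventOrEventListener> =
  CockatielEventOrEventListener extends CockatielEvent<infer Data>
    ? Data
    : CockatielEventOrEventListener extends (data: infer Data) => void
      ? Data
      : never;

// Compile-time usage, plus a small runtime demonstration.
type BreakEvent = CockatielEvent<{ isolated: boolean }>;
type BreakData = ExtractCockatielEventData<BreakEvent>; // { isolated: boolean }

const onBreak: BreakEvent = (listener) => listener({ isolated: false });
let received: BreakData | undefined;
onBreak((data) => {
  received = data;
});
```

The benefit is purely ergonomic: call sites can name the payload type directly instead of repeating `Parameters<...>[0]` gymnastics.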
- This ought to be unobservable, but we mark it as breaking out of an abundance of caution.
- **BREAKING:** Split up and update payload data for `NetworkController:rpcEndpoint{Degraded,Unavailable}` ([#7166](https://github.com/MetaMask/core/pull/7166))
  - The existing events are now called `NetworkController:rpcEndpointInstance{Degraded,Unavailable}` and retain their present behavior.
  - `NetworkController:rpcEndpoint{Degraded,Unavailable}` do still exist, but they are now designed to represent the entire RPC endpoint and are guaranteed to not be published multiple times in a row. In particular, `NetworkController:rpcEndpointUnavailable` is published only after trying all of the designated URLs for a particular RPC endpoint and the underlying circuit for the last URL breaks, not as each primary's or failover's circuit breaks.
> In particular, `NetworkController:rpcEndpointUnavailable` is published only after trying all of the designated URLs for a particular RPC endpoint and the underlying circuit for the last URL breaks, not as each primary's or failover's circuit breaks.
This change is, I suppose, a bit controversial, and the first version of this was slightly different before I landed on this approach. But I think it makes sense? Basically, if we can reach the network somehow, don't broadcast that it's unavailable until we're absolutely sure. That does mean that `NetworkController:rpcEndpointUnavailable` may never be published for Infura networks if the failover does its job, but I think that's precisely the intent.
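The "only report unavailable once every URL is down, and only on the transition" rule can be sketched as follows. This is a self-contained approximation, not the actual `RpcServiceChain` logic, and the `ChainStatusTracker` name and event strings are illustrative:

```typescript
type ChainStatus = 'unknown' | 'available' | 'unavailable';

class ChainStatusTracker {
  #status: ChainStatus = 'unknown';
  readonly #published: string[] = [];
  readonly #urlCount: number;
  readonly #brokenUrls = new Set<string>();

  constructor(urlCount: number) {
    this.#urlCount = urlCount;
  }

  onUrlCircuitBroken(url: string): void {
    this.#brokenUrls.add(url);
    // The chain only goes unavailable when the circuit for the *last*
    // remaining URL breaks, and the event fires only on the transition —
    // never twice in a row.
    if (
      this.#brokenUrls.size === this.#urlCount &&
      this.#status !== 'unavailable'
    ) {
      this.#status = 'unavailable';
      this.#published.push('rpcEndpointUnavailable');
    }
  }

  onUrlSuccess(url: string): void {
    this.#brokenUrls.delete(url);
    if (this.#status !== 'available') {
      this.#status = 'available';
      this.#published.push('rpcEndpointAvailable');
    }
  }

  get published(): string[] {
    return [...this.#published];
  }
}

const tracker = new ChainStatusTracker(2);
tracker.onUrlCircuitBroken('https://primary.example'); // failover still up: nothing published
tracker.onUrlCircuitBroken('https://failover.example'); // all URLs down: unavailable
tracker.onUrlCircuitBroken('https://failover.example'); // already unavailable: no duplicate
tracker.onUrlSuccess('https://primary.example'); // recovery: available
```

Under this rule, an endpoint with a working failover never reports unavailable, which matches the Infura observation above.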
## Explanation

In a future commit we will introduce changes to `network-controller` so that it will keep track of the status of each network as requests are made. These updates to `createServicePolicy` assist with that. See the changelog for more. Besides this, the tests for `createServicePolicy` have been refactored slightly so that they are easier to maintain in the future.

## References

Progresses https://consensyssoftware.atlassian.net/browse/WPC-99. You can see how these changes will be used in the next PR: #7166

## Checklist

- [x] I've updated the test suite for new or updated code as appropriate
- [x] I've updated documentation (JSDoc, Markdown, etc.) for new or updated code as appropriate
- [x] I've communicated my changes to consumers by [updating changelogs for packages I've changed](https://github.com/MetaMask/core/tree/main/docs/contributing.md#updating-changelogs), highlighting breaking changes as necessary
- [x] I've prepared draft pull requests for clients and consumer packages to resolve any breaking changes

> [!NOTE]
> Adds `getCircuitState`, `onAvailable`, and `reset` to `ServicePolicy`, exports Cockatiel types, and updates logic/tests to support availability tracking and circuit state introspection.
>
> - **controller-utils**:
>   - **ServicePolicy API**:
>     - Add `getCircuitState()` to expose underlying circuit state.
>     - Add `onAvailable` event for first success and post-recovery success.
>     - Add `reset()` to close the circuit and reset breaker counters.
>   - **Behavior/Internals**:
>     - Track availability status and emit `onAvailable`/`onDegraded` appropriately.
>     - Update `onBreak` to mark unavailable; wire `ConsecutiveBreaker` for reset.
>   - **Exports**:
>     - Export `CockatielEventEmitter` and `CockatielFailureReason`; re-export via `index`.
>   - **Tests**:
>     - Expand/refactor tests to cover `onAvailable`, `getCircuitState`, `reset`, and timing cases; update export snapshot.
>   - **Docs**:
>     - Update `CHANGELOG.md` with new methods and exports.
>
> Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit e597d0b. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).
```ts
});
await ignoreRejection(promise);
expect(onBreakListener).not.toHaveBeenCalled();
testsForNonRetriableErrors({
```
I took the opportunity to do some refactoring, as there were some repeated checks and it was getting a bit difficult to maintain. As a result, the diff after this point is not the best. However, I have done my best to point out the differences.
```ts
  });
});

it('calls the onAvailable callback the first time a successful request occurs', async () => {
```
These next two tests are new.
```ts
it('forwards the request to the failover service in addition to the primary endpoint while the circuit is broken, stopping when the primary endpoint recovers', async () => {
  const clock = getClock();
  const jsonRpcRequest = {
it('does not call onAvailable', async () => {
```
This test is new.
```ts
  );
});

it('calls the onAvailable callback if the endpoint becomes degraded via errors and then recovers', async () => {
```
This test is new.
```ts
  expect(onBreakListener).not.toHaveBeenCalled();
});

describe('if a failover service is provided', () => {
```
All of the failover-related tests have been removed from this file, as `RpcServiceChain` now handles diverting traffic to failovers.
```ts
  });
});

describe('if a failover service is provided', () => {
```
Same as above, this describe block has been removed.
```ts
 *
 * @param payload - The event payload.
 * @param payload.chainId - The chain ID of the target network.
 * @param payload.endpointUrl - The URL of the endpoint among the chain of
```
In the context of a chain becoming unavailable, this is a confusing description. Wouldn't this nearly always be the failover chain? But technically it could be the primary if both circuits break, then the primary recovers then breaks again?
This seems more confusing for subscribers than helpful. I'm not really sure what the purpose of exposing the primary is either.
You are right. For this event, `endpointUrl` will always be the last failover URL, even if the primary recovers and then breaks again. So this description is not specific enough.
As for the payload data, you're also right that realistically, neither `endpointUrl` nor `primaryEndpointUrl` is used. You might have noticed that each of the events has the same payload data. My thinking here was that if we needed to hook this event up to metrics or the network connection banner in the future, we wouldn't need to make any changes to `network-controller`. That said, `endpointUrl` is not very useful for this event due to the behavior described above. So I think it would be safe to remove, if that helps clear up the confusion.
I have removed `endpointUrl` from this event. I have also tried to go through each event's JSDoc and clarify any confusing wording: 73017d1
> My thinking here was that if we needed to hook this event up to metrics or the network connection banner in the future, we wouldn't need to make any changes to `network-controller`.

We could derive that from the network client ID anyway, though, right? It still doesn't seem clear why this would be useful in the event. Maybe I'm missing something, though.
Even if it's just included for convenience reasons, the chain ID seems like the more valuable property (though that's also derivable from the network client ID).
```ts
  primaryEndpointUrl,
  endpointUrl,
  attempt,
});
```
Bug: Credentials leak in RPC event payloads
`createNetworkClient` uses the raw `primaryEndpointUrl` from the configuration (which may contain credentials) in multiple event payloads, instead of the sanitized URL provided by the `RpcServiceChain` event data. This exposes potential credentials (username/password) to event subscribers.
True. Fixed here: 026ce00
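The fix implies stripping embedded credentials from a URL before it goes into an event payload. A sketch of that kind of sanitization (the actual fix is in commit 026ce00; the helper name here is hypothetical), relying on the WHATWG `URL` class's `username`/`password` components:

```typescript
// Remove any embedded credentials (user:pass@) from a URL before exposing
// it to event subscribers. Hypothetical helper — not the committed code.
function sanitizeEndpointUrl(rawUrl: string): string {
  const url = new URL(rawUrl);
  url.username = '';
  url.password = '';
  return url.toString();
}

const sanitized = sanitizeEndpointUrl('https://user:secret@rpc.example.com/v1');
```

Passing the already-sanitized URL from the `RpcServiceChain` event data accomplishes the same thing without a separate helper.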
```ts
  primaryEndpointUrl,
  endpointUrl,
  attempt,
});
```
Bug: Credentials leak in RPC event payloads
`createNetworkClient` uses the raw `primaryEndpointUrl` from the configuration (which may contain credentials) in multiple event payloads (`rpcEndpointChainUnavailable`, `rpcEndpointUnavailable`, etc.), instead of the sanitized URL provided by the `RpcServiceChain` event data. This exposes potential credentials (username/password) to event subscribers.
True. Fixed here: 026ce00
```ts
    primaryEndpointUrl: string;
  },
];
};
```
Bug: Missing endpointUrl in chain unavailable event payload
The `NetworkController:rpcEndpointChainUnavailable` event payload type definition and usage omit the `endpointUrl` property. This is inconsistent with all other similar events (`ChainDegraded`, `ChainAvailable`, `Unavailable`), which include it, and contradicts the PR discussion implying uniform payloads. Including `endpointUrl` would be useful for identifying the specific failover endpoint that caused the chain to break.
This is intentional, as per the discussion above.
```ts
// The only way we can end up here is if there are no services to loop over.
// That is not possible due to the types on the constructor, but TypeScript
// doesn't know this, so we have to appease it.
throw new Error('Nothing to return');
```
Bug: Chain available event not published on first success
The `NetworkController:rpcEndpointChainAvailable` event is documented to be published on the first successful request, but `RpcServiceChain` only emits this event when an underlying `RpcService` emits `onAvailable` (which typically only happens upon circuit recovery). The `request` method lacks logic to update the status to `Available` and emit the event upon a successful request when the status is `Unknown`.
It's a bit confusing; the `onAvailable` event actually occurs at the service policy level and is then propagated to `RpcService` and `RpcServiceChain`. So although there is some logic to track the status within `RpcServiceChain.request`, it only covers setting the status to unavailable.
```ts
      primaryEndpointUrl: this.#primaryService.endpointUrl.toString(),
    });
  }
});
```
Bug: Chain status oscillates between Degraded and Unavailable
The `RpcServiceChain` unconditionally updates its status to `Degraded` when any underlying service emits `onDegraded`, even if the chain is currently `Unavailable`. When the request subsequently fails on all endpoints, the status reverts to `Unavailable`. This oscillation causes duplicate `rpcEndpointChainDegraded` and `rpcEndpointChainUnavailable` events to fire repeatedly during outages, violating the requirement to not publish these events multiple times in a row.
I don't think this is accurate. The chain should not transition from unavailable to degraded. If the circuit breaks, then the circuit break duration passes, then the endpoint fails again, `onBreak` will be emitted, not `onDegraded`. It is possible to transition from degraded to unavailable to available to degraded, but that is by design.
```ts
    ? this.#failoverService.endpointUrl.toString()
    : undefined,
});
if (!('isolated' in data)) {
```
This is for the "Update `AbstractRpcService`/`RpcServiceRequestable` to remove `{ isolated: true }` from the `onBreak` event data type" change, I guess?
Nit: If we're ignoring this because we never call `isolate`, perhaps we should throw an error in case we do at some point in the future, so these events aren't totally suppressed? Or at least log something to the console.
I was going to suggest a comment to explain why we're doing this, but an error or log would probably serve that purpose and highlight this mistake if it does come up.
Hmm, okay, I guess the changelog entry isn't quite accurate. We do call `.isolate` on the circuit breaker policy, but we're sort of abusing it. It's not so that we can hold the circuit open; we call it in `ServicePolicy.reset` so that we can reset the state of the circuit breaker.
The problem is that when we call `.isolate`, it will immediately call `onBreak` with `{ isolated: true }`: https://github.com/connor4312/cockatiel/blob/f87137b9bb55e0cdfb02e7d5c401e47c6f5fcc9f/src/CircuitBreakerPolicy.ts#L187. We don't want that to cause the `NetworkController:rpcEndpointUnavailable` or `NetworkController:rpcEndpointChainUnavailable` events to be published. So that's why we are ignoring the error here.
If that makes sense, I can add a comment and update the changelog entry.
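The guard being discussed can be sketched like this. The `BreakData` union is a simplified stand-in for Cockatiel's `onBreak` payload, and the handler/event names are illustrative:

```typescript
// Calling circuit.isolate() (used here only to reset breaker state) fires
// onBreak with { isolated: true }; that must not surface as an
// "unavailable" event. Simplified payload type:
type BreakData = { isolated: true } | { reason: string };

const publishedEvents: string[] = [];

function handleBreak(data: BreakData): void {
  if ('isolated' in data) {
    // Triggered by our own reset() calling isolate(); not a real outage.
    return;
  }
  publishedEvents.push('NetworkController:rpcEndpointUnavailable');
}

handleBreak({ isolated: true }); // ignored: self-inflicted by reset()
handleBreak({ reason: 'HTTP 503' }); // a genuine circuit break
```

As the reviewer suggests, logging inside the `isolated` branch would make the suppression visible if `isolate` is ever called for a different reason.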
Ah I see, makes sense
Comment added here: 0f55384
```ts
  expect(onAvailableListener).toHaveBeenCalledTimes(1);
});

/* eslint-enable jest/require-top-level-describe */
```
Nit: this is fine, but I wonder going forward if we should disable this rule? Or if we should move test helpers like this to separate files, and add them to the "test helpers" ESLint config section where this rule is disabled.
Either way we can deal with it later, in a separate PR
Yeah, this isn't the first time I've had to do this, and it feels noisy to me. I do think the rule is valuable, so I'm hesitant to disable it. Maybe moving to a separate file is best. I'll think about it.
```ts
import { projectLogger, createModuleLogger } from '../logger';

// const log = createModuleLogger(projectLogger, 'RpcServiceChain');
const log = console.log.bind(console);
```
Bug: Debug logging accidentally committed to production code
The proper logger using `createModuleLogger` is commented out and replaced with `console.log.bind(console)`. This debug code will cause log statements to appear in production, bypassing the application's logging infrastructure. The commented line should be uncommented and the `console.log` line removed.
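To illustrate why this matters: the real `createModuleLogger` (from `@metamask/utils`, backed by `debug`) produces namespaced loggers that are silent unless explicitly enabled, whereas bare `console.log` is always on. A minimal stand-in (the factory name and explicit `enabled` flag are assumptions for the sketch; the real logger reads DEBUG-style configuration):

```typescript
type Logger = (message: string) => void;

const lines: string[] = [];

// Hypothetical mini version of a namespaced module logger.
function createMiniModuleLogger(
  parentNamespace: string,
  childNamespace: string,
  enabled: boolean,
): Logger {
  const namespace = `${parentNamespace}:${childNamespace}`;
  return (message) => {
    // The real logger checks env-based configuration; this sketch takes
    // an explicit flag to keep it self-contained.
    if (enabled) {
      lines.push(`${namespace} ${message}`);
    }
  };
}

const log = createMiniModuleLogger('metamask:network-controller', 'RpcServiceChain', true);
const silentLog = createMiniModuleLogger('metamask:network-controller', 'RpcServiceChain', false);
log('trying primary endpoint');
silentLog('this never appears');
```

With the namespaced logger restored, the debug output stays opt-in instead of leaking into production.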
Fixed here: 72de1d6
```ts
  }

  return response;
}
```
Bug: Primary service not reset when failover succeeds
When a failover service succeeds after the primary fails, only services after the successful failover are reset, but the primary service's circuit is not reset. This means subsequent requests will continue to skip the primary even though a failover is working. The code should reset all services before the successful service index (including the primary at index 0) so that future requests will attempt the primary first again.
That's okay; the primary's circuit does not need to be reset, as it is already in the state we want it to be in.
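A sketch of the reset behavior under discussion: after the service at index `i` succeeds, only the services *after* it are reset, so earlier services (including the primary at index 0) keep their current circuit state. The names here are illustrative, not the actual implementation:

```typescript
type ResettableService = { endpointUrl: string; reset: () => void };

// Reset only the services that come after the one that succeeded.
function resetServicesAfter(
  services: ResettableService[],
  successfulIndex: number,
): string[] {
  const resetUrls: string[] = [];
  for (const service of services.slice(successfulIndex + 1)) {
    service.reset();
    resetUrls.push(service.endpointUrl);
  }
  return resetUrls;
}

const services = [
  'https://primary.example',
  'https://failover1.example',
  'https://failover2.example',
].map((endpointUrl) => ({ endpointUrl, reset: () => {} }));

// Failover 1 (index 1) succeeded: only failover 2 gets reset; the primary's
// broken circuit is left alone so its normal recovery timing applies.
const resetUrls = resetServicesAfter(services, 1);
```

Leaving the primary's circuit alone means the chain naturally retries it once its break duration elapses, rather than hammering a known-bad endpoint.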
```ts
this.#status = STATUSES.Available;
this.#onAvailableEventEmitter.emit({
  ...data,
  primaryEndpointUrl: this.#primaryService.endpointUrl.toString(),
```
Bug: Slow successful requests incorrectly mark chain as available
When a request succeeds but takes longer than the degraded threshold, both `onDegraded` and `onAvailable` fire from the underlying service policy. The `onAvailable` handler unconditionally sets the status to `Available` if it isn't already, overwriting the `Degraded` status that the `onDegraded` handler just set. This causes slow but successful requests to incorrectly mark the chain as `Available` instead of `Degraded`. The `onAvailable` handler should check whether the status is `Degraded` and avoid overwriting it, or the status transition logic needs to account for the fact that a service can be both available and degraded simultaneously.
## Explanation

In a future commit we will introduce changes to `network-controller` so that it will keep track of the status of each network as requests are made. This commit paves the way for this to happen by redefining the existing RPC endpoint-related events that NetworkController produces.

Currently, when requests are made through the network clients that NetworkController exposes, three events are published:

- `NetworkController:rpcEndpointDegraded`
- `NetworkController:rpcEndpointUnavailable`
- `NetworkController:rpcEndpointRequestRetried`

It's important to note that in the context of the RPC failover feature, an "RPC endpoint" can actually encompass multiple URLs, so the above events actually fire for any URL.

While these events are useful for reporting metrics on RPC endpoints, in order to effectively be able to update the status of a network, we need events that are less granular and are guaranteed not to fire multiple times in a row. We also need a new event.

Now the list of events looks like this:

- `NetworkController:rpcEndpointDegraded`
- `NetworkController:rpcEndpointUnavailable`
- `NetworkController:rpcEndpointRetried`
  - Renamed from `NetworkController:rpcEndpointRequestRetried`.
- `NetworkController:rpcEndpointChainDegraded`
  - Similar to `NetworkController:rpcEndpointDegraded`, but won't be published again if the RPC endpoint is already in a degraded state.
- `NetworkController:rpcEndpointChainUnavailable`
  - Published when all of the circuits underlying all of the URLs for an RPC endpoint have broken. Won't be published again if the RPC endpoint is already in an unavailable state.
- `NetworkController:rpcEndpointChainAvailable`
  - A new event. Published the first time a successful request is made to one of the URLs for an RPC endpoint, or following a degraded or unavailable status.

Going a bit deeper, in order to make the changes above, it was necessary to rewrite the core logic responsible for diverting traffic to failovers from `RpcService` to `RpcServiceChain`, which was a more natural fit, anyway. This also meant that we could simplify `RpcService`, as well as its tests.

## References

Progresses https://consensyssoftware.atlassian.net/browse/WPC-99.

## Checklist

> [!NOTE]
> Reworks RPC endpoint events (adds chain-level available/degraded/unavailable, renames retry event, updates payloads) and replaces failover logic with a new `RpcServiceChain`, wiring through NetworkController and tests.
>
> - Add `NetworkController:rpcEndpointChainAvailable`, `...ChainDegraded`, `...ChainUnavailable`; rename `...rpcEndpointRequestRetried` to `...rpcEndpointRetried`.
> - Payloads: `failoverEndpointUrl` → `endpointUrl`, `endpointUrl` → `primaryEndpointUrl`; add `networkClientId` (and `attempt` for retried).
> - Add `RpcServiceChain` to manage primary/failover endpoints and emit chain/service-level events.
> - Simplify `RpcService` (no `failoverService`), add availability/degraded hooks, circuit state helpers.
> - Update `create-network-client` to use `RpcServiceChain` and publish new events via messenger.
> - `create-auto-managed-network-client` now requires `networkClientId` and passes it through.
> - Export `cockatiel` types and update CHANGELOG with breaking/event changes.
>
> Written by Cursor Bugbot for commit 0f55384. This will update automatically on new commits. Configure here.