Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

op-node: RPC Limit client does not respect context #7525

Merged

Conversation

jyellick
Copy link
Contributor

@jyellick jyellick commented Oct 3, 2023

This change addresses two bugs in the limit wrapper for client RPCs.

  1. The existing limit.go implements its own semaphore which ignores the passed in context. This means that the semaphore will block indefinitely, even when the context of the request has already expired.

  2. The existing implementation does not guard against clients which invoke RPC methods after the channel underlying the custom semaphore has been closed. This results in panics where the closed channel is written to during shutdown, and results in test flakiness. This flakiness is most evident in the op-e2e-http-tests suite.

Along with these fixes comes a test which attempts to demonstrate the previous bad behavior. Because these bugs are inherently tied to the interaction of multiple go routines, the test ends up being a bit complex, but is well commented and hopefully remains readable.

This change addresses two bugs.

1. The existing limit.go implements its own semaphore which ignores the
   passed in context.  This means that the semaphore will block
   indefinitely, even when the context of the request has already
   expired.

2. The existing implementation does not guard against clients which
   invoke RPC methods after the channel underlying the custom semaphore
   has been closed.  This results in panics where the closed channel is
   written to during shutdown, and results in test flakiness.  This
   flakiness is most evident in the op-e2e-http-tests suite.

Along with these fixes comes a test which attempts to demonstrate the
previous bad behavior.  Because these bugs are inherently tied to the
interaction of multiple go routines, the test ends up being a bit
complex, but is well commented and hopefully remains readable.
@jyellick jyellick requested a review from a team as a code owner October 3, 2023 19:05
@jyellick jyellick requested a review from tynes October 3, 2023 19:05
@mergify mergify bot added the A-op-node Area: op-node label Oct 3, 2023
Copy link
Contributor

@Inphi Inphi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very welcome change to reduce flakes. tysm!

@mergify
Copy link
Contributor

mergify bot commented Oct 3, 2023

This PR has been added to the merge queue, and will be merged soon.

@mergify mergify bot added the S-on-merge-train Status: This PR is in the merge queue label Oct 3, 2023
@tynes tynes added this pull request to the merge queue Oct 3, 2023
Merged via the queue into ethereum-optimism:develop with commit 9bfb4dd Oct 3, 2023
60 of 69 checks passed
@mergify mergify bot removed the S-on-merge-train Status: This PR is in the merge queue label Oct 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-op-node Area: op-node
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants