New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP2 Proactive GOAWAY on Drain - Preamble #17026
HTTP2 Proactive GOAWAY on Drain - Preamble #17026
Conversation
The initial set of changes (broken up to reduce PR/change size) to enable proactive GOAWAY signals on drain events. These changes (currently a no-op) open the way for a push-based notification model to the Http::ConnectionManagerImpl class (or any other interested party). Aloowing a move _away_ from the pull-based model that currently exists, which causes issues such as envoyproxy#14350. The primary change here is to introduce the idea of a "drain tree" which allows the drain-manager to be structured with parent and child managers. Each manager may drain independent of it's parent and all drain actions cascade downward through the tree. The parent <-> child relationship is thread-safe. In addition, drain events can now be registered via callbacks to be invoked when draining starts. This is a non-thread-safe operation. A new thread-safe callback manager utility has been written to support the parent <-> child drain-manager relationship. Additional test coverage has been added for the drain-manager and thread-safe callback-manager, as well as updates to necessary mock classes. Because the callback registration impacts the DrainDecision interface, necessary no-op implementations have been addedto various classes implementing this interface. The final versions will be wired up with correct behavior in subsequent PRs that take advantage of the new functionality. Signed-off-by: John Murray <murray@stripe.com>
Signed-off-by: John Murray <murray@stripe.com>
This was a super good callout. I had anticipated for double-calling in the order of 1. direct, 2. parent; but not the other way around. I have updated the code to allow for multiple invocations and simply handling multiple on-drain-complete callbacks. I've added additional test cases to validate this behavior. |
/retest |
Retrying Azure Pipelines: |
The MacOS tests keep being cancelled (I'm guessing it's timing out), but are otherwise passing up until that point. This is ready for review. |
Signed-off-by: John Murray <murray@stripe.com>
/retest |
Retrying Azure Pipelines: |
@antoniovicente I have incorporated your feedback and this is ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there.
/wait
Signed-off-by: John Murray <murray@stripe.com>
@antoniovicente I've addressed your feedback. Really liking how things have simplified. 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks great. Thanks for this refactor in preparation for H2 drain propagation.
/assign-from @envoyproxy/senior-maintainers for a final approval.
@envoyproxy/senior-maintainers assignee is @zuercher |
retrying macos timeout /retest |
Retrying Azure Pipelines: |
macos still timing out (seems to be the case most of the time) /retest |
Retrying Azure Pipelines: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I found some small issues, in comments and had one question about a test. None of these are blocking.
Signed-off-by: John Murray <murray@stripe.com>
2de1367
@zuercher Thank you for the review! I have addressed your feedback. |
/retest |
Retrying Azure Pipelines: |
/retest |
Retrying Azure Pipelines: |
/retest |
Retrying Azure Pipelines: |
/retest |
Retrying Azure Pipelines: |
gah, keep getting some failures in the coverage build for /retest |
Retrying Azure Pipelines: |
@zuercher This is ready for your review/merge again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks very cool stuff. A few post-submit drive by comments for your next PR. Thank you @murray-stripe!
Hey @murray-stripe could you take a look at Thanks! |
The initial set of changes (broken up to reduce PR/change size) to enable proactive GOAWAY signals on drain events. These changes (currently a no-op) open the way for a push-based notification model to the Http::ConnectionManagerImpl class (or any other interested party). Aloowing a move _away_ from the pull-based model that currently exists, which causes issues such as envoyproxy#14350. The primary change here is to introduce the idea of a "drain tree" which allows the drain-manager to be structured with parent and child managers. Each manager may drain independent of it's parent and all drain actions cascade downward through the tree. The parent <-> child relationship is thread-safe. In addition, drain events can now be registered via callbacks to be invoked when draining starts. This is a non-thread-safe operation. A new thread-safe callback manager utility has been written to support the parent <-> child drain-manager relationship. Additional test coverage has been added for the drain-manager and thread-safe callback-manager, as well as updates to necessary mock classes. Because the callback registration impacts the DrainDecision interface, necessary no-op implementations have been addedto various classes implementing this interface. The final versions will be wired up with correct behavior in subsequent PRs that take advantage of the new functionality. Risk Level: Low. Changes included should be no-op. Testing: Additional test coverage included. Docs Changes: n/a Release Notes: n/a Signed-off-by: John Murray <murray@stripe.com> Signed-off-by: chris.xin <xinchuantao@qq.com>
The initial set of changes (broken up to reduce PR/change size) to enable proactive GOAWAY signals on drain events. These changes (currently a no-op) open the way for a push-based notification model to the Http::ConnectionManagerImpl class (or any other interested party). Aloowing a move _away_ from the pull-based model that currently exists, which causes issues such as envoyproxy#14350. The primary change here is to introduce the idea of a "drain tree" which allows the drain-manager to be structured with parent and child managers. Each manager may drain independent of it's parent and all drain actions cascade downward through the tree. The parent <-> child relationship is thread-safe. In addition, drain events can now be registered via callbacks to be invoked when draining starts. This is a non-thread-safe operation. A new thread-safe callback manager utility has been written to support the parent <-> child drain-manager relationship. Additional test coverage has been added for the drain-manager and thread-safe callback-manager, as well as updates to necessary mock classes. Because the callback registration impacts the DrainDecision interface, necessary no-op implementations have been addedto various classes implementing this interface. The final versions will be wired up with correct behavior in subsequent PRs that take advantage of the new functionality. Risk Level: Low. Changes included should be no-op. Testing: Additional test coverage included. Docs Changes: n/a Release Notes: n/a Signed-off-by: John Murray <murray@stripe.com>
Commit Message:
The initial set of changes (broken up to reduce PR/change size) to
enable proactive GOAWAY signals on drain events. These changes
(currently a no-op) open the way for a push-based notification model to
the Http::ConnectionManagerImpl class (or any other interested party).
Allowing a move away from the pull-based model that currently exists,
which causes issues such as #14350.
The primary change here is to introduce the idea of a "drain tree" which
allows the drain-manager to be structured with parent and child
managers. Each manager may drain independent of it's parent and all
drain actions cascade downward through the tree. The parent <-> child
relationship is thread-safe.
In addition, drain events can now be registered via callbacks to be
invoked when draining starts. This is a non-thread-safe operation.
A new thread-safe callback manager utility has been written to support
the parent <-> child drain-manager relationship.
Additional test coverage has been added for the drain-manager and
thread-safe callback-manager, as well as updates to necessary mock
classes.
Because the callback registration impacts the DrainDecision interface,
necessary no-op implementations have been addedto various classes
implementing this interface. The final versions will be wired up with
correct behavior in subsequent PRs that take advantage of the new
functionality.
Additional Description: This PR was created at the advisement of @antoniovicente in relation to #16201 as an effort to reduce the amount of change being reviewed in a single PR. The content of this PR is meant to be a no-op with #16201 enacting the change (and will be updated once this is merged to reduce it's scope).
Risk Level: Low. Changes included should be no-op.
Testing: Additional test coverage included.
Docs Changes: None.
Release Notes: None.
Platform Specific Features: N/A
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]
[Optional API Considerations:]