Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a new filter goaway which could send GOAWAY probabilistically to help balance HTTP2 requests #88567

Conversation

answer1991
Copy link
Contributor

@answer1991 answer1991 commented Feb 26, 2020

What type of PR is this?
/kind feature

What this PR does / why we need it:

API Server and client use HTTP2 to connect each other, which means client always hit the same API Server across different request and the requests are NOT balanced even if we use a load-balance to expose API Server.

With this PR merged, a new filter goaway could send GOAWAY to HTTP2 client probabilistically. After client received GOAWAY, the in-flight watch request will NOT be influenced, and the new requests will use a new TCP connection, which means load-balance will work after client receive GOAWAY

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

Nothing yet.

Does this PR introduce a user-facing change?:

Apiserver add a new flag --goaway-chance which is the fraction of requests that will be closed gracefully(GOAWAY) to prevent HTTP/2 clients from getting stuck on a single apiserver. 
After the connection closed(received GOAWAY), the client's other in-flight requests won't be affected, and the client will reconnect. 
The flag min value is 0 (off), max is .02 (1/50 requests); .001 (1/1000) is a recommended starting point.
Clusters with single apiservers, or which don't use a load balancer, should NOT enable this.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 26, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @answer1991. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 26, 2020
@k8s-ci-robot k8s-ci-robot added area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 26, 2020
@dixudx
Copy link
Member

dixudx commented Feb 26, 2020

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 26, 2020
@answer1991
Copy link
Contributor Author

/hold

let me do some more test firstly

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 26, 2020
@answer1991 answer1991 force-pushed the feature/close-connection-when-over-load branch from 039fe87 to 9da54e3 Compare February 27, 2020 04:46
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 27, 2020
@answer1991 answer1991 force-pushed the feature/close-connection-when-over-load branch from 9da54e3 to cf8a84f Compare February 27, 2020 04:52
@answer1991
Copy link
Contributor Author

/hold cancel

The testing result is awesome. After client received GOAWAY, the in-flight watch request will NOT be influenced, and the new requests will use a new TCP connection, which means load-balance will work after client receive GOAWAY. Please see details and result in my test cases.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 27, 2020
@answer1991 answer1991 force-pushed the feature/close-connection-when-over-load branch from cf8a84f to 15c42cc Compare February 27, 2020 05:01
@answer1991 answer1991 changed the title send a GOAWAY and tear down TCP connection when HTTP2 API Server is over load send a GOAWAY and gracefully tear down TCP connection when HTTP2 API Server is over load Feb 27, 2020
@answer1991
Copy link
Contributor Author

cc @lavalamp @wojtek-t @deads2k

@answer1991 answer1991 force-pushed the feature/close-connection-when-over-load branch from 15c42cc to dd074e1 Compare February 27, 2020 05:10
Copy link
Member

@dixudx dixudx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2020
@lavalamp
Copy link
Member

lavalamp commented Mar 3, 2020

Can you edit the release note to speak about the flag rather than the filter? Thanks.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: answer1991, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 3, 2020
@answer1991
Copy link
Contributor Author

updated release note.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 3, 2020
@answer1991
Copy link
Contributor Author

Merge progress seems blocked, let me fix the validating error message nit and push again to trigger the merge progress

@answer1991 answer1991 force-pushed the feature/close-connection-when-over-load branch from c8c86ef to 81f46b6 Compare March 3, 2020 06:18
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2020
@answer1991
Copy link
Contributor Author

@lavalamp kindly ping for a LGTM label. I removed the /hold label seems to trigger an authorize issue.

BTW, fix the validating error message nit.

Can we do an e2e run with this turned on before we merge

Not sure how to config the e2e API Server flags.

@deads2k
Copy link
Contributor

deads2k commented Mar 4, 2020

/lgtm

@pidb
Copy link

pidb commented May 18, 2021

Can I set it up for different resources?

@answer1991
Copy link
Contributor Author

answer1991 commented May 25, 2021

Can I set it up for different resources?

It's a HTTP filter and not care about resources.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants