Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kube-apiserver: use SO_REUSEPORT when creating listener #88893

Merged

Conversation

invidian
Copy link
Member

@invidian invidian commented Mar 6, 2020

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:

/kind api-change
/kind bug
/kind cleanup
/kind deprecation
/kind design
/kind documentation
/kind failing-test

/kind feature

/kind flake

What this PR does / why we need it:

So multiple instances of kube-apiserver can bind on the same address and
port, to provide seamless upgrades.

Which issue(s) this PR fixes:

Fixes #88785

Special notes for your reviewer:

Perhaps some e2e tests could be added for that too, but I don't know how to do that.

Does this PR introduce a user-facing change?:

kube-apiserver, kube-scheduler and kube-controller manager now use SO_REUSEPORT socket option when listening on address defined by --bind-address and --secure-port flags, when running on Unix systems (Windows is NOT supported). This allows to run multiple instances of those processes on a single host with the same configuration, which allows to update/restart them in a graceful way, without causing downtime.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 6, 2020
@k8s-ci-robot
Copy link
Contributor

Welcome @invidian!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 6, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @invidian. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch from b5fa276 to b0c5e3d Compare March 6, 2020 09:13
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 6, 2020
@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch from b0c5e3d to 2dfda9f Compare March 6, 2020 09:14
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Mar 6, 2020
@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch from 2dfda9f to 8c9fb1c Compare March 6, 2020 09:15
@k8s-ci-robot k8s-ci-robot added area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 6, 2020
@sftim
Copy link
Contributor

sftim commented Mar 7, 2020

Does this change have an impact on portability?

/retest

@invidian
Copy link
Member Author

invidian commented Mar 7, 2020

Does this change have an impact on portability?

What kind of portability? OSes? Or CPU architectures? As far as I'm aware, this change will only work for Linux, I wasn't sure if I should target other OSes too (as mentioned in original issue).

It seems all tests fails with the same error. @sftim any hints how to reproduce it locally/why it fails?

@sftim
Copy link
Contributor

sftim commented Mar 7, 2020

@sftim any hints how to reproduce it locally/why it fails?

I'm afraid I don't have any insight into that.

@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch from 8c9fb1c to 62bc158 Compare March 7, 2020 15:38
@invidian
Copy link
Member Author

invidian commented Mar 7, 2020

I just run ./hack/update-bazel.sh, commited generated file and pushed again, maybe that's it? Can we re-test please?

@sftim
Copy link
Contributor

sftim commented Mar 7, 2020

/retest

@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch from 62bc158 to b94d995 Compare March 7, 2020 16:00
@k8s-ci-robot k8s-ci-robot added the area/dependency Issues or PRs related to dependency changes label Mar 7, 2020
@invidian
Copy link
Member Author

invidian commented Mar 7, 2020

I run now hack/update-vendor.sh and also pushed generated changes.

But I see now, that typecheck fails and I wonder if there is a way to disable it to ignore Windows or do I need to change the code to support windows as well 🤔

@invidian
Copy link
Member Author

invidian commented Mar 12, 2020

Ugh, fixed missing import in unit test (running tests locally now too, to make sure it's fine)

@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch from 6ba91b4 to ff02178 Compare March 12, 2020 18:17
@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Mar 12, 2020
@invidian
Copy link
Member Author

invidian commented Mar 12, 2020

Ugh, fixed missing import in unit test (running tests locally now too, to make sure it's fine)

Also had to fix some other unit test types.

@lavalamp can you trigger tests again please?

@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch 2 times, most recently from 18531d0 to 24c1570 Compare March 12, 2020 22:07
@invidian
Copy link
Member Author

Tests are passing now.

So multiple instances of kube-apiserver can bind on the same address and
port, to provide seamless upgrades.

Signed-off-by: Mateusz Gozdek <mateusz@kinvolk.io>
@invidian invidian force-pushed the invidian/kube-apiserver-so-reuseport branch from 24c1570 to dfe1f96 Compare March 13, 2020 23:00
@lavalamp
Copy link
Member

/approve
/lgtm

Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 13, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: invidian, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 13, 2020
@lavalamp
Copy link
Member

(this won't merge until we exit code freeze btw)

@invidian
Copy link
Member Author

Thanks!

Thanks for your guidance of my first K8s PR @lavalamp! I'm happy that it went in so smoothly.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit ed4c2db into kubernetes:master Mar 18, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Mar 18, 2020
@invidian invidian deleted the invidian/kube-apiserver-so-reuseport branch March 18, 2020 07:40
@mfojtik
Copy link
Contributor

mfojtik commented Jun 24, 2020

@invidian @lavalamp don't we also need SO_REUSEADDR to make "more than one instance to bind on the same address and port. [default=false]" true?

Also do we need some orchestration during upgrade to avoid old API server responding if it is not fully down (which can confuse health checks).

@invidian
Copy link
Member Author

Also do we need some orchestration during upgrade to avoid old API server responding if it is not fully down (which can confuse health checks).

It has been discussed here: #88785 (comment). TL;DR it should be fine.

@invidian @lavalamp don't we also need SO_REUSEADDR to make "more than one instance to bind on the same address and port. [default=false]" true?

As far as I understand the difference between the two and from my testing, SO_REUSEPORT is sufficient to provide this functionality. Also as tests states.

@bhks
Copy link
Contributor

bhks commented Mar 4, 2021

Hi @invidian ,

Thank you for your PR.
I have question related to the health check of new API server coming up or the previous comment #88893 (comment).
Given you pointed out the answer to the comment #88785 (comment) but that comment answer mostly about the HA and version skew but not about the health checks.

When you get some time I am interested in learning about what happens if the new API server comes up but fails health due to issues.

My question is do we not need to first perform health check on new API server before start accepting actual traffic . Are we expecting the new binary to just work fine ?

@invidian
Copy link
Member Author

invidian commented Mar 4, 2021

My question is do we not need to first perform health check on new API server before start accepting actual traffic . Are we expecting the new binary to just work fine ?

It's a good question :) I don't know how this can be configured for health checks for individual instances. I usually rely on the fact, that if kube-apiserver is not functional, it restarts.

@bhks
Copy link
Contributor

bhks commented Mar 4, 2021

Thank you for your response!!!

Assuming its same IP and PORT on which API Server is being brought up:

So In cases of malfunction restart , when node have high traffic ~50% of the new connection can route to new API Server after it makes Listen() call without being accepted. Also we might not have mechanism to know if this new one is malfunctioning until we remove old running API Server.

If its different IP and same PORT then if it does not affect the traffic from my understanding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/dependency Issues or PRs related to dependency changes area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
8 participants