Remove kube-proxy autocleanup for inactive modes #76109

Merged
k8s-ci-robot merged 6 commits into kubernetes:master on Apr 5, 2019

Conversation

vllry
Contributor

@vllry vllry commented Apr 3, 2019

What type of PR is this?
/kind bug

What this PR does / why we need it:
kube-proxy attempts to clean up network rules for other modes (e.g. cleaning up iptables rules when running in IPVS mode). This code is complex and prone to bugs, and it can delay kube-proxy's readiness between restarts.

This PR (see KEP) removes the auto-cleanup logic for non-current kube-proxy modes. In other words, kube-proxy will only automatically clean up rules relevant to its current mode. Users should use --cleanup or restart the node when switching between kube-proxy modes.
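
As a rough illustration of the behavior change described above (not the actual kube-proxy code), the sketch below models which modes get their rules cleaned at startup before and after this PR. The type `Mode`, the `allModes` list, and both function names are hypothetical and exist only for this example; only the two modes named in this description are modeled.

```go
package main

import "fmt"

// Mode is a hypothetical stand-in for a kube-proxy proxy mode.
type Mode string

const (
	ModeIPTables Mode = "iptables"
	ModeIPVS     Mode = "ipvs"
)

// allModes lists only the modes mentioned in this PR description (illustrative).
var allModes = []Mode{ModeIPTables, ModeIPVS}

// otherModesCleanedBefore models the old startup behavior: rules left by
// every mode other than the current one are cleaned up automatically.
func otherModesCleanedBefore(current Mode) []Mode {
	var cleaned []Mode
	for _, m := range allModes {
		if m != current {
			cleaned = append(cleaned, m)
		}
	}
	return cleaned
}

// otherModesCleanedAfter models the new startup behavior: no rules from
// other modes are touched. Removing them is left to the operator
// (kube-proxy --cleanup, or a node restart).
func otherModesCleanedAfter(current Mode) []Mode {
	return nil
}

func main() {
	fmt.Println("before:", otherModesCleanedBefore(ModeIPVS))
	fmt.Println("after: ", otherModesCleanedAfter(ModeIPVS))
}
```

The point of the sketch is only that the "other modes" set becomes empty at startup; handling of rules belonging to the current mode is not modeled here.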

Which issue(s) this PR fixes:
Fixes #75408 (tracking issue)
Fixes #75360 (bug caused by auto-cleanup)

Special notes for your reviewer:
Worth discussing if we gate this behavior with a flag (EG --only-clean-current-mode), or outright GA. Outright GA is currently the plan.

Does this PR introduce a user-facing change?:

kube-proxy no longer automatically cleans up network rules created by running kube-proxy in other modes. If you are switching the mode that kube-proxy is running in (e.g. iptables to IPVS), you will need to run `kube-proxy --cleanup`, or restart the worker node (recommended), before restarting kube-proxy.

If you are not switching kube-proxy between different modes, this change should not require any action.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Apr 3, 2019
@vllry
Contributor Author

vllry commented Apr 3, 2019

/priority important-soon
/sig network
/assign @thockin

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 3, 2019
Member

@thockin thockin left a comment

I keep looking for something missing here, but I can't find it.

We should fix the description in cmd/kube-proxy/app/server.go on the cleanup-ipvs flag: "cleanup ipvs rules before running" is no longer true, right?

Otherwise LGTM

@vllry
Contributor Author

vllry commented Apr 3, 2019

We should fix the description in cmd/kube-proxy/app/server.go on the cleanup-ipvs flag: "cleanup ipvs rules before running" is no longer true, right?

Correct, I'll fix that.

We also need docs changes, which I have no idea how to coordinate.

@thockin
Member

thockin commented Apr 4, 2019 via email

@vllry vllry changed the title WIP: Remove kube-proxy autocleanup for inactive modes Remove kube-proxy autocleanup for inactive modes Apr 4, 2019
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 4, 2019
@vllry
Contributor Author

vllry commented Apr 4, 2019

/test pull-kubernetes-e2e-gce

@thockin
Member

thockin commented Apr 4, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 4, 2019
@andrewsykim
Member

andrewsykim commented Apr 4, 2019

Even though it's implied, I think the release note should say something about --cleanup-ipvs being deprecated and no longer having any effect. Should we also mark the release note for this as "action required" since we are being more explicit about users either setting --cleanup or rebooting nodes during a proxy mode switch? cc v1.14 patch release team @aleksandra-malinowska @spiffxp @tpepper since this is going into v1.14.1

@thockin
Member

thockin commented Apr 4, 2019 via email

@andrewsykim
Member

andrewsykim commented Apr 4, 2019

In IPVS mode --cleanup means "do low-impact cleanup and exit" while --cleanup --cleanup-ipvs means "do full cleanup and exit"

Sorry if I'm missing something, but I'm not seeing the changes in this PR reflect this 🤔 The current changes indicate --cleanup-ipvs has no effect and --cleanup will always do full clean up. Fwiw I prefer the current changes where we just deprecate --cleanup-ipvs but would like to make sure we're on the same page
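
To make the two readings concrete, here is a minimal sketch contrasting them. It is not kube-proxy code: the struct `CleanupFlags`, the `Action` values, and both function names are hypothetical, and the sketch assumes IPVS mode throughout.

```go
package main

import "fmt"

// CleanupFlags is a hypothetical holder for the two command-line flags
// discussed in this thread.
type CleanupFlags struct {
	Cleanup     bool // --cleanup
	CleanupIPVS bool // --cleanup-ipvs
}

// Action describes what kube-proxy would do under a given reading.
type Action string

const (
	RunNormally Action = "run normally"
	LowImpact   Action = "low-impact cleanup and exit"
	FullCleanup Action = "full cleanup and exit"
)

// quotedReading follows the quoted proposal: --cleanup alone is low-impact,
// and adding --cleanup-ipvs upgrades it to a full cleanup.
func quotedReading(f CleanupFlags) Action {
	switch {
	case f.Cleanup && f.CleanupIPVS:
		return FullCleanup
	case f.Cleanup:
		return LowImpact
	default:
		return RunNormally
	}
}

// currentChangesReading follows the changes as described in this comment:
// --cleanup-ipvs has no effect and --cleanup always does a full cleanup.
func currentChangesReading(f CleanupFlags) Action {
	if f.Cleanup {
		return FullCleanup
	}
	return RunNormally
}

func main() {
	f := CleanupFlags{Cleanup: true}
	fmt.Println("quoted reading: ", quotedReading(f))
	fmt.Println("current changes:", currentChangesReading(f))
}
```

The two readings differ only when --cleanup is passed without --cleanup-ipvs, which is the case this comment asks about.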

/hold

@vllry
Contributor Author

vllry commented Apr 4, 2019

Gotcha. I'll push up a change shortly, have to run to a meeting.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 5, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin, vllry

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@andrewsykim
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 5, 2019
@andrewsykim
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 5, 2019
@vllry
Contributor Author

vllry commented Apr 5, 2019

@thockin @andrewsykim issues resolved + tests finally passed, we're good to go.

@k8s-ci-robot k8s-ci-robot merged commit 71f4c9a into kubernetes:master Apr 5, 2019
k8s-ci-robot added a commit that referenced this pull request Apr 5, 2019
…upstream-release-1.14

Automated cherry pick of #76109: Removed cleanup for non-current kube-proxy modes in
@ravilr
Contributor

ravilr commented Apr 9, 2019

@vllry @andrewsykim can this be cherry-picked to release-1.13 also please?

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
kind/bug Categorizes issue or PR as related to a bug.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
release-note Denotes a PR that will be considered when it comes time to generate release notes.
sig/network Categorizes an issue or PR as relevant to SIG Network.
size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Remove kube-proxy's automatic clean up logic
Restarting iptables kube-proxier causes connections to fail
5 participants