create new SCTP ipsets for IPVS proxier #77371

Merged
merged 1 commit into kubernetes:master from andrewsykim:77265 on May 28, 2019

Conversation

@andrewsykim
Member

commented May 2, 2019

Signed-off-by: Andrew Sy Kim <kiman@vmware.com>

What type of PR is this?
/kind bug

What this PR does / why we need it:
In #74341 we updated the KUBE-NODE-PORT-SCTP and KUBE-NODE-PORT-LOCAL-SCTP ipsets to use the hash:ip,port type instead of bitmap:port, since the bitmap type does not support SCTP. This turned out to be a breaking change because ipset create <set> -exist still returns an error unless the existing set and the proposed set are identical. With the type changed, ipset returns an error even with -exist (and rightly so).

This PR adds the new ipsets KUBE-NODE-PORT-SCTP-HASH and KUBE-NODE-PORT-LOCAL-SCTP-HASH so that the IPVS proxier does not error when an ipset from a previous version already exists.
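For illustration only, here is a minimal sketch of how the two replacement sets could be declared using the proxier's ipset util types; the variable name and wiring below are assumptions for this example, not the PR's actual diff:

// Sketch (not the PR's diff): declare the replacement SCTP node-port sets with
// the hash:ip,port type, under names that cannot collide with sets created by
// previous kube-proxy versions.
var sctpNodePortSets = []*utilipset.IPSet{
	{Name: "KUBE-NODE-PORT-SCTP-HASH", SetType: utilipset.HashIPPort},
	{Name: "KUBE-NODE-PORT-LOCAL-SCTP-HASH", SetType: utilipset.HashIPPort},
}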

Shout out to @lbernail for finding the bug :)

Which issue(s) this PR fixes:

Fixes #77265

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fix a bug where kube-proxy returns an error because existing ipsets were created with a different set type.
@andrewsykim

Member Author

commented May 2, 2019

/sig network
/priority important-soon

// cleanupIPSetsOnStartup is the list of IP sets that need to be destroyed on start up
// for backwards compatibility reasons. If a property of an IP set is changed and
// kube-proxy needs to recreate it, add the set name to cleanupIPSetsOnStartup.
var cleanupIPSetsOnStartup = sets.NewString(kubeNodePortSetSCTP, kubeNodePortLocalSetSCTP)
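For context, a rough sketch of how such a whitelist might be consumed once at start up; this is illustrative only and assumes the DestroySet method on utilipset.Interface plus the klog import already present in the proxier:

// Illustrative sketch: destroy any legacy sets named in cleanupIPSetsOnStartup
// before the first sync; a set that does not exist yet is not a problem here.
func cleanupLegacyIPSets(ipt utilipset.Interface) {
	for _, set := range cleanupIPSetsOnStartup.List() {
		if err := ipt.DestroySet(set); err != nil {
			klog.Warningf("failed to destroy legacy ipset %s: %v", set, err)
		}
	}
}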

@andrewsykim

andrewsykim May 2, 2019

Author Member

I don't like this pattern of whitelisting ipsets that need a cleanup before starting, but I'm not sure the alternatives are better. We did this for proxy rules and that ended up being EOL'd due to its complexity. I was thinking of adding ReplaceSet to utilipset.Interface, which would do flush/destroy/create, and calling that from ensureIPSet, but that would destroy and recreate the ipset on every proxy sync, which I don't think is any better than doing the destroy/flush/create once at start up.
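For reference, a hypothetical shape of that ReplaceSet idea; sketch only, assuming the existing DestroySet and CreateSet methods on utilipset.Interface:

// Hypothetical ReplaceSet: force `set` to exist with exactly the given definition
// by destroying any set of the same name first, then recreating it.
// Calling this from ensureIPSet would churn the set on every proxy sync,
// which is the drawback mentioned above.
func ReplaceSet(ipt utilipset.Interface, set *utilipset.IPSet) error {
	if err := ipt.DestroySet(set.Name); err != nil {
		// the set may simply not exist yet; log at high verbosity and continue
		klog.V(4).Infof("could not destroy ipset %s: %v", set.Name, err)
	}
	return ipt.CreateSet(set, true)
}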

@andrewsykim

andrewsykim May 2, 2019

Author Member

The alternatives are:

  1. create a new ipset with a different name going forward
  2. revert #74341; I don't think this is an option if we want to support SCTP
  3. update the ipset utils to diff the existing and proposed ipset and act accordingly: recreate if different, do nothing otherwise (roughly sketched after this list). This would require a larger change and added complexity in pkg/util/ipset, though; not sure it's worth it
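A very rough sketch of what option 3 could look like: shell out via os/exec, parse the Type: line from ipset list output, and compare it against the desired type. The helper name and error handling here are made up for illustration, not code from this PR:

// Sketch for option 3 (not in this PR): report whether an existing set's type
// differs from the desired one, in which case the caller would recreate it.
func setNeedsRecreate(setName, wantType string) (bool, error) {
	out, err := exec.Command("ipset", "list", setName).CombinedOutput()
	if err != nil {
		// assume the set does not exist yet; a plain create will handle that case
		return false, nil
	}
	for _, line := range strings.Split(string(out), "\n") {
		if strings.HasPrefix(line, "Type:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "Type:")) != wantType, nil
		}
	}
	return false, fmt.Errorf("no Type line in ipset list output for %s", setName)
}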

burstSyncs := 2
klog.V(3).Infof("minSyncPeriod: %v, syncPeriod: %v, burstSyncs: %d", minSyncPeriod, syncPeriod, burstSyncs)
proxier.syncRunner = async.NewBoundedFrequencyRunner("sync-runner", proxier.syncProxyRules, minSyncPeriod, syncPeriod, burstSyncs)
proxier.gracefuldeleteManager.Run()
return proxier, nil
}

func ipSetExists(ipt utilipset.Interface, set string) (bool, error) {

@andrewsykim

andrewsykim May 2, 2019

Author Member

This method might actually belong in utilipset.Interface
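A sketch of what this helper's body might look like either way, assuming utilipset.Interface exposes a ListSets method; the implementation here is illustrative, not the PR's actual code:

// Sketch: report whether an ipset with the given name already exists,
// based on the list of set names returned by the ipset interface.
func ipSetExists(ipt utilipset.Interface, set string) (bool, error) {
	entries, err := ipt.ListSets()
	if err != nil {
		return false, err
	}
	for _, name := range entries {
		if name == set {
			return true, nil
		}
	}
	return false, nil
}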

@andrewsykim andrewsykim force-pushed the andrewsykim:77265 branch from 7f7b411 to ef74efc May 2, 2019

@andrewsykim andrewsykim force-pushed the andrewsykim:77265 branch 4 times, most recently from 9980bf9 to 59f7ad9 May 2, 2019

@andrewsykim

Member Author

commented May 2, 2019

/retest

@lbernail

Contributor

commented May 3, 2019

Thank you for the PR!

I think the safest approach is alternative 1 (a new name): even if we find a solution for upgrading to 1.15, some people might need to roll back to a previous version, and that rollback would fail because a hash:ip,port ipset would already be configured where the older version expects a bitmap:port one.

It's not perfect because we will leave an unused ipset behind (maybe we can provide cleanup logic in 1.15.x and make a note to remove it in 1.16, for instance?).

@andrewsykim

Member Author

commented May 3, 2019

I think the safest approach is alternative 1 (a new name): even if we find a solution for upgrading to 1.15, some people might need to roll back to a previous version, and that rollback would fail because a hash:ip,port ipset would already be configured where the older version expects a bitmap:port one.

Good point, this would not be rollback safe. I'll add a new ipset for this then.

@andrewsykim andrewsykim force-pushed the andrewsykim:77265 branch from 59f7ad9 to b6319b8 May 3, 2019

@k8s-ci-robot k8s-ci-robot added size/S and removed size/M labels May 3, 2019

@andrewsykim andrewsykim changed the title from "flush/destroy ipset KUBE-NODE-PORT-SCTP and KUBE-NODE-PORT-LOCAL-SCTP on start up" to "create new ipsets for IPVS proxier" May 3, 2019

create new ipset KUBE-NODE-PORT-SCTP-HASH and KUBE-NODE-PORT-LOCAL-SCTP-HASH for ipvs proxier

Signed-off-by: Andrew Sy Kim <kiman@vmware.com>

@andrewsykim andrewsykim force-pushed the andrewsykim:77265 branch from b6319b8 to 43ded7c May 3, 2019

@andrewsykim andrewsykim changed the title from "create new ipsets for IPVS proxier" to "create new SCTP ipsets for IPVS proxier" May 3, 2019

// This ipset is no longer active but still used in previous versions.
// DO NOT create an ipset using this name
legacyKubeNodePortLocalSetSCTPComment = "Kubernetes nodeport SCTP port with externalTrafficPolicy=local"
legacyKubeNodePortLocalSetSCTP = "KUBE-NODE-PORT-LOCAL-SCTP"

@andrewsykim

andrewsykim May 3, 2019

Author Member

Some alternatives we've considered so far:

  • add logic to delete/clean up the ipsets at start up, but as @lbernail mentioned, this is backwards incompatible on rollback since older versions still use the bitmap type
  • revert #74341, but that means no SCTP support for IPVS? I don't think that's an option
// This ipset is no longer active but still used in previous versions.
// DO NOT create an ipset using this name
legacyKubeNodePortSetSCTPComment = "Kubernetes nodeport SCTP port for masquerade purpose"
legacyKubeNodePortSetSCTP = "KUBE-NODE-PORT-SCTP"

@andrewsykim

andrewsykim May 3, 2019

Author Member

Keeping these here to ensure the ipset names aren't re-used in the future.

@lbernail

Contributor

commented May 3, 2019

Thank you!
Do we want to delete legacyKubeNodePortLocalSetSCTP and legacyKubeNodePortSetSCTP on startup if they exist? To clean up on upgrade?

@andrewsykim

Member Author

commented May 3, 2019

Do we want to delete legacyKubeNodePortLocalSetSCTP and legacyKubeNodePortSetSCTP on startup if they exist? To clean up on upgrade?

TBH I think we should just leave it. I don't see much risk in leaving the ipset there, especially with the name KUBE-NODE-PORT-SCTP, and trying to clean it up on start up would add complexity and more potential error cases to kube-proxy. What do you think?

@andrewsykim

Member Author

commented May 3, 2019

/assign @thockin @m1093782566

This is a bit of a sensitive change, so it likely needs your attention. What should the general process be for updating an ipset's type? I don't think we have alternatives other than adding new ones. See #77371 (comment) for some alternatives we already discussed.

@lbernail

Contributor

commented May 3, 2019

TBH I think we should just leave it. I don't see much risk in leaving the ipset there, especially with the name KUBE-NODE-PORT-SCTP, and trying to clean it up on start up would add complexity and more potential error cases to kube-proxy. What do you think?

I really don't have a strong opinion. If we were to do that we would have to remember to remove the additional logic later. I agree it's not a real problem

@andrewsykim

Member Author

commented May 27, 2019

/area ipvs

@lbernail

Contributor

commented May 27, 2019

/lgtm

@m1093782566

Member

commented May 28, 2019

/approve

@k8s-ci-robot

Contributor

commented May 28, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, m1093782566

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fejta-bot


commented May 28, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit aa25195 into kubernetes:master May 28, 2019

20 checks passed

cla/linuxfoundation: andrewsykim authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-conformance-image-test: Skipped.
pull-kubernetes-cross: Skipped.
pull-kubernetes-dependencies: Job succeeded.
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-csi-serial: Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gce-storage-slow: Skipped.
pull-kubernetes-godeps: Skipped.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped.
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
pull-publishing-bot-validate: Skipped.
tide: In merge pool.