Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ServiceSpec.AllocateLoadBalancerNodePorts #92744

Merged
merged 2 commits into from
Nov 13, 2020

Conversation

uablrek
Copy link
Contributor

@uablrek uablrek commented Jul 2, 2020

What type of PR is this?

/kind feature
/sig network

What this PR does / why we need it:

Implements KEP 1864-disable-lb-node-ports

Which issue(s) this PR fixes:

Special notes for your reviewer:

NOTE; This is a Work In Progress and shall not be merged (yet).

Does this PR introduce a user-facing change?:

Automatic allocation of NodePorts for services with type LoadBalancer can now be disabled by setting the (new) parameter
Service.spec.allocateLoadBalancerNodePorts=false. The default is to allocate NodePorts for services with type LoadBalancer which is the existing behavior.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. sig/network Categorizes an issue or PR as relevant to SIG Network. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 2, 2020
@k8s-ci-robot k8s-ci-robot added the sig/apps Categorizes an issue or PR as relevant to SIG Apps. label Jul 2, 2020
@uablrek uablrek changed the title [WIP] Added ServiceSpec.AllocateLoadBalancerNodePorts [WIP] Implement ServiceSpec.AllocateLoadBalancerNodePorts Jul 2, 2020
@uablrek
Copy link
Contributor Author

uablrek commented Jul 2, 2020

/cc @andrewsykim

The AllocateLoadBalancerNodePorts still defaults to "false" but I wanted to have a PR in place asap for discussions and as a place-holder.

@uablrek uablrek marked this pull request as draft July 2, 2020 10:46
@rikatz
Copy link
Contributor

rikatz commented Jul 2, 2020

/cc @rikatz

Wanna take a look later :)

@k8s-ci-robot k8s-ci-robot requested a review from rikatz July 2, 2020 11:30
@uablrek
Copy link
Contributor Author

uablrek commented Jul 2, 2020

Comment obsoleted by the bool -> *bool switch described below!

The allocateLoadBalancerNodePorts field is visible in output from kubectl get svc -o json ... only when the value is "true". If the value is false (the current default which will be changed) the field is hidden;

$ kubectl get -o json service/nodeport-disabled-3-ipv6 | jq .spec
{
  "allocateLoadBalancerNodePorts": true,
  "clusterIP": "fd00:4000::a567",
  "externalTrafficPolicy": "Cluster",
  "ipFamily": "IPv6",
  "ports": [
    {
      "nodePort": 30657,
      "port": 5001,
      "protocol": "TCP",
      "targetPort": 5001
    }
  ],
  "selector": {
    "app": "mconnect-daemonset"
  },
  "sessionAffinity": "None",
  "type": "LoadBalancer"
}

But;

kubectl get -o json service/nodeport-disabled-2-ipv6 | jq .spec
{
  "clusterIP": "fd00:4000::bf4",
  "externalTrafficPolicy": "Cluster",
  "ipFamily": "IPv6",
  "ports": [
    {
      "nodePort": 31734,
      "port": 5001,
      "protocol": "TCP",
      "targetPort": 5001
    }
  ],
  "selector": {
    "app": "mconnect-daemonset"
  },
  "sessionAffinity": "None",
  "type": "LoadBalancer"
}

This service does not have allocateLoadBalancerNodePorts defined.

@andrewsykim Is this something we can affect?

Ideal would be that allocateLoadBalancerNodePorts:true will be hidden once "true" becomes the default. Then users not caring about the function would not even see it.

@uablrek
Copy link
Contributor Author

uablrek commented Jul 2, 2020

A possibility may be to rename the field; "disableLoadBalancerNodePorts" with a default of "false". Nah.

@uablrek
Copy link
Contributor Author

uablrek commented Jul 3, 2020

Came across this, see the "Use pointers" comment;
https://stackoverflow.com/questions/37756236/json-golang-boolean-omitempty/

Seem to give you a 3-state var; true, false, unset.

Would it be feasible to use a pointer in the ServiceSpec struct?

	// allocateLoadBalancerNodePorts defines if NodePorts shall be automatically for services with type LoadBalancer.
	// Default is "true". It may be set to "false" in clusters that don't have an external load-balancer that uses NodePorts.
	// This is often the case in private clouds that for instance uses "metallb".
	AllocateLoadBalancerNodePorts *bool `json:"allocateLoadBalancerNodePorts,omitempty" protobuf:"bytes,17,opt,name=allocateLoadBalancerNodePorts"`

@uablrek
Copy link
Contributor Author

uablrek commented Jul 3, 2020

Tried it, works like a charm. I'll do the switch.

This also allows the admission hook to be more intelligent; if unset, (nil) set a default, if set, leave it.

@uablrek
Copy link
Contributor Author

uablrek commented Jul 3, 2020

/test pull-kubernetes-bazel-build

@uablrek
Copy link
Contributor Author

uablrek commented Jul 3, 2020

/test pull-kubernetes-bazel-test

@uablrek
Copy link
Contributor Author

uablrek commented Jul 5, 2020

I know that actually checking the flag shall not go into alpha according to the KEP but I could not resist commiting a very small update that seem to be sufficient;

$ kubectl get svc
NAME                      TYPE           CLUSTER-IP        EXTERNAL-IP   PORT(S)          AGE
mconnect-daemonset-ipv4   LoadBalancer   12.0.85.138       <pending>     5001:32633/TCP   2m40s
mconnect-daemonset-ipv6   LoadBalancer   fd00:4000::407c   <pending>     5001:32350/TCP   2m40s
nodeport-disabled-ipv4    LoadBalancer   12.0.93.198       <pending>     5001/TCP         2m16s
nodeport-disabled-ipv6    LoadBalancer   fd00:4000::7f51   <pending>     5001/TCP         2m16s
nodeport-enabled-ipv4     LoadBalancer   12.0.88.112       <pending>     5001:30090/TCP   2m16s
nodeport-enabled-ipv6     LoadBalancer   fd00:4000::9f8e   <pending>     5001:32449/TCP   2m16s

In this case the allocateLoadBalancerNodePorts flag is nil (unset) in the "mconnect-daemonset" svcs, expicitly set to false in "nodeport-disabled" and explicitly set to true in "nodeport-enabled".

I let the reviewers take a look and see if I have missed anything, and I can revert the commit if needed.

@uablrek
Copy link
Contributor Author

uablrek commented Jul 5, 2020

/test pull-kubernetes-bazel-test

@uablrek
Copy link
Contributor Author

uablrek commented Jul 5, 2020

Tests keep on failing. When searching for "allocateLoadBalancerNodePorts" in the pull-kubernetes-bazel-test logs, this was repeated 3 times;

    TestCompatibility/core.v1.Service/HEAD: compatibility.go:302: json differs
    TestCompatibility/core.v1.Service/HEAD: compatibility.go:303:   strings.Join({
          	... // 76 identical lines
          	`    "topologyKeys": [`,
          	`      "29"`,
        - 	"    ]",
        + 	"    ],",
        + 	`    "allocateLoadBalancerNodePorts": true`,
          	"  },",
          	`  "status": {`,
          	... // 10 identical lines
          }, "\n")
        
    TestCompatibility/core.v1.Service/HEAD: compatibility.go:308: yaml differs
    TestCompatibility/core.v1.Service/HEAD: compatibility.go:309:   strings.Join({
          	... // 30 identical lines
          	`  uid: "7"`,
          	"spec:",
        + 	"  allocateLoadBalancerNodePorts: true",
          	`  clusterIP: "24"`,
          	"  externalIPs:",
          	... // 31 identical lines
          }, "\n")

I suspect I have missed some file generation or that test-data must be updated.

@andrewsykim Can you please have a look and see what I have missed?

In previous failed test there was printouts liks "run; hack/..." which was very helpful. Thanks a lot to whoever added them.

@andrewsykim
Copy link
Member

TestCompatibility/core.v1.Service/HEAD: compatibility.go:329: if the diff is expected because of a new type or a new field, re-run with UPDATE_COMPATIBILITY_FIXTURE_DATA=true to update the compatibility data

^ saw this in the logs and there's a hack script to update this hack/update-generated-api-compatibility-data.sh.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 6, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 7, 2020
@uablrek
Copy link
Contributor Author

uablrek commented Nov 11, 2020

/retest

1 similar comment
@uablrek
Copy link
Contributor Author

uablrek commented Nov 11, 2020

/retest

@@ -185,6 +185,11 @@ func dropServiceDisabledFields(newSvc *api.Service, oldSvc *api.Service) {
ing.IPMode = nil
}
}

// Clear AllocateLoadBalancerNodePorts if ServiceLBNodePortControl if not enabled
if !utilfeature.DefaultFeatureGate.Enabled(features.ServiceLBNodePortControl) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check used to check for !allocateLoadBalancerNodePortsInUse but it seems like we dropped it, was that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, since the allocateLoadBalancerNodePorts must be nil if the feature-gate is not set. At least that's how I have understood it.

Copy link
Member

@andrewsykim andrewsykim Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allocateLoadBalancerNodePorts must be nil if the feature-gate is not set

This and if an existing resource has the field already set. So this check should still check if oldSvc was using this field, and if it was, don't drop it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this needs to preserve the field if it was already set in oldSvc. The rationale is that a user might upgrade to 1.21, use the field, temporarily roll back to 1.20 (where it is gated) - the field value should be preserved so that when they roll forward it is still present.

If this is the only issue I will approve but we need a followup to fix this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-skim: This will need a followup, still.

@@ -376,6 +389,11 @@ func dropTypeDependentFields(newSvc *api.Service, oldSvc *api.Service) {
newSvc.Spec.HealthCheckNodePort = 0
}

// AllocateLoadBalancerNodePorts may only be set for type LoadBalancer
if newSvc.Spec.Type != api.ServiceTypeLoadBalancer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we also need to check whether allocatedLoadBalancerNodePorts was trying to be set by oldSvc, otherwise this wouldn't fail validation checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So something like

// If a user is switching to a type that doesn't need allocatedLoadBalancerNodePorts AND they did not change
// this field, it is safe to drop it. 
if oldSvc.Spec.Type == api.ServiceTypeLoadBalancer && newSvc.Spec.Type != api.ServiceTypeLoadBalancer {
   if oldSvc.Spec.AllocateLoadBalancerNodePorts == newSvc.Spec.AllocatedLoadBalancerNodePorts {
       newSvc.Spec.AllocateLoadBalancerNodePorts = nil
   }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add a test case for this in TestDropTypeDependentFields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why all the extra checks? If type != LoadBalancer the AllocateLoadBalancerNodePorts must be nil no matter what.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also find the conditions hard to follow. If the type is changed from LoadBalancer but newSvc sets AllocateLoadBalancerNodePorts to true or false it will remain != nil?

Copy link
Member

@andrewsykim andrewsykim Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a very common scenario, but I think this is covering the case where a user changes from Type=LoadBalancer to Type=NodePort (or others) AND tries to toggle allocateLoadBalancerNodePorts in the same request. That should fail validation since allocateLoadBalancerNodePorts is only relevant for Type=LoadBalancer. If we always drop based on type, that request would silently pass with allocateLoadBalancerNodePorts being dropped, which is not what the user intended.

In other words, we should only drop the field if the type changed from LoadBalancer -> NodePorts (or others) AND allocateLoadBalancerNodePorts did not change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we only want to auto-clear fields if we don't need it AND the user is not trying to change it. If they are trying to change it then we should let validation fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above, followup needed

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2 comments I have can be a followup.

/lgtm
/approve

@@ -185,6 +185,11 @@ func dropServiceDisabledFields(newSvc *api.Service, oldSvc *api.Service) {
ing.IPMode = nil
}
}

// Clear AllocateLoadBalancerNodePorts if ServiceLBNodePortControl if not enabled
if !utilfeature.DefaultFeatureGate.Enabled(features.ServiceLBNodePortControl) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this needs to preserve the field if it was already set in oldSvc. The rationale is that a user might upgrade to 1.21, use the field, temporarily roll back to 1.20 (where it is gated) - the field value should be preserved so that when they roll forward it is still present.

If this is the only issue I will approve but we need a followup to fix this

@@ -376,6 +389,11 @@ func dropTypeDependentFields(newSvc *api.Service, oldSvc *api.Service) {
newSvc.Spec.HealthCheckNodePort = 0
}

// AllocateLoadBalancerNodePorts may only be set for type LoadBalancer
if newSvc.Spec.Type != api.ServiceTypeLoadBalancer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we only want to auto-clear fields if we don't need it AND the user is not trying to change it. If they are trying to change it then we should let validation fail.

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 12, 2020
@thockin
Copy link
Member

thockin commented Nov 13, 2020

uggh, needs rebase but is otherwise approved

@kikisdeliveryservice
Copy link
Member

As per: kubernetes/enhancements#1864 (comment)

Please get this rebased and re-LGTM ASAP

Thanks!
Kirsten

@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 13, 2020
@uablrek
Copy link
Contributor Author

uablrek commented Nov 13, 2020

/retest

@uablrek
Copy link
Contributor Author

uablrek commented Nov 13, 2020

@andrewsykim @thockin Thanks for your guidance and patiance in my first API update 👍 I hope this will be the last update in this PR. I will immediate create a new PR for the followup's when it's merged.

I have rebased, but not updated the review items since;

The 2 comments I have can be a followup.

I have (temporary) removed the unused function allocateLoadBalancerNodePortsInUse() from pkg/registry/core/service/strategy.go:210 since it broke the static tests. It will be re-inserted in the followup. Beside that and the rebase conflicts the code is not altered. But if possible please check so I have not messed up the conflicts. I have checked myself of course but...

Copy link
Member

@andrewsykim andrewsykim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Summarizing some of the follow-up items in case they were lost in review:

  1. CreateStrategy shouldn't drop spec.allocateLoadBalancerNodePorts if an exisitng Service was using it
  2. dropTypeDependentFields should check if spec.allocateLoadBalancerNodePorts is changing when the service type also changes
  3. unit tests for dropTypeDependentFields when type changes from LoadBalancer to something else.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 13, 2020
@andrewsykim
Copy link
Member

/retest

Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin, uablrek

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants