KCCM: fix GCP ILB by reintroducing readiness predicate for eTP:Local #121116

Conversation

alexanderConstantinescu
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:

As mentioned in the linked issue: services exposed through GCP ILBs might have their SLOs impacted because different sets of predicates are applied to different classes of services (eTP: Cluster vs. eTP: Local), while all load balancers point to the same InstanceGroup. The fix here is to re-introduce the readiness predicate for eTP:Local services so that the predicates align across all classes of services. We can't do the inverse and remove the readiness predicate for eTP:Cluster, because that change is KEP-3458 itself.
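
A minimal sketch of the mismatch described above, not the actual service controller code. The `Node` struct, the predicate signature, and the `filterNodes` helper are simplified assumptions for illustration; only the predicate names come from this PR's diff.

```go
package main

import "fmt"

// Node is a hypothetical, stripped-down stand-in for a Kubernetes node.
type Node struct {
	Name     string
	Excluded bool // assumed flag: node is excluded from load balancers
	Tainted  bool // assumed flag: node carries a disqualifying taint
	Ready    bool // assumed flag: node reports a Ready condition
}

// NodeConditionPredicate reuses the type name from the diff; the
// signature here is an assumption.
type NodeConditionPredicate func(n Node) bool

func nodeIncludedPredicate(n Node) bool  { return !n.Excluded }
func nodeUnTaintedPredicate(n Node) bool { return !n.Tainted }
func nodeReadyPredicate(n Node) bool     { return n.Ready }

// filterNodes is a hypothetical helper that returns the names of the
// nodes satisfying every predicate in preds.
func filterNodes(nodes []Node, preds ...NodeConditionPredicate) []string {
	var out []string
	for _, n := range nodes {
		keep := true
		for _, p := range preds {
			if !p(n) {
				keep = false
				break
			}
		}
		if keep {
			out = append(out, n.Name)
		}
	}
	return out
}

func main() {
	nodes := []Node{
		{Name: "node-a", Ready: true},
		{Name: "node-b", Ready: false},
	}

	// eTP:Cluster services already filter on readiness.
	cluster := filterNodes(nodes, nodeIncludedPredicate, nodeUnTaintedPredicate, nodeReadyPredicate)
	// eTP:Local services before this PR: no readiness predicate.
	localBefore := filterNodes(nodes, nodeIncludedPredicate, nodeUnTaintedPredicate)
	// eTP:Local services after this PR: same predicates as eTP:Cluster.
	localAfter := filterNodes(nodes, nodeIncludedPredicate, nodeUnTaintedPredicate, nodeReadyPredicate)

	fmt.Println("eTP:Cluster:       ", cluster)
	fmt.Println("eTP:Local (before):", localBefore)
	fmt.Println("eTP:Local (after): ", localAfter)
}
```

With a not-ready node present, the two service classes compute different node sets before the change. Since both feed the same InstanceGroup, its membership would depend on which service was synced last.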

/sig network
/sig cloud-provider
/assign @thockin
/cc @aojea

This PR should have no effect on >= 1.27 since the predicates are already aligned under the feature gate StableLoadBalancerNodeSet, but this should get in so that we guard against the broken behaviour should someone turn that feature gate off.
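
To illustrate why the change is a no-op while the gate is on, here is a rough sketch of how the predicate set might be selected. The function `predicateSetName` and the order of the checks are assumptions for illustration, not the controller's actual code; the three set names are the ones that appear in this PR and its review.

```go
package main

import "fmt"

// predicateSetName is a hypothetical helper returning the name of the
// predicate set that would apply to a service. It assumes the feature
// gate is checked first, so the eTP-specific sets only matter when the
// gate is off.
func predicateSetName(stableLoadBalancerNodeSet bool, etpLocal bool) string {
	if stableLoadBalancerNodeSet {
		// Gate on: one set for every service, predicates already aligned.
		return "stableNodeSetPredicates"
	}
	if etpLocal {
		// Gate off: the set this PR touches.
		return "etpLocalNodePredicates"
	}
	return "allNodePredicates"
}

func main() {
	for _, gate := range []bool{true, false} {
		for _, local := range []bool{true, false} {
			fmt.Printf("gate=%v etpLocal=%v -> %s\n", gate, local, predicateSetName(gate, local))
		}
	}
}
```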

Which issue(s) this PR fixes:

Fixes #121094

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fix #121094 by re-introducing the readiness predicate for externalTrafficPolicy: Local services.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None

@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Oct 10, 2023
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. labels Oct 10, 2023
@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 10, 2023
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/cloudprovider labels Oct 10, 2023
Member

@thockin thockin left a comment

We really don't want readiness to exclude nodes in case of etp=Local, but the current non-determinism is worse.

/approve
/lgtm

@@ -1002,6 +1002,7 @@ var (
 	etpLocalNodePredicates []NodeConditionPredicate = []NodeConditionPredicate{
 		nodeIncludedPredicate,
 		nodeUnTaintedPredicate,
+		nodeReadyPredicate,
Member

Is the plan to backport this?

Member Author

Yes!

@@ -1002,6 +1002,7 @@ var (
etpLocalNodePredicates []NodeConditionPredicate = []NodeConditionPredicate{
Member

allNodePredicates and etpLocalNodePredicates are identical now. If I understand, we could just merge them, but since this is all replaced by stableNodeSetPredicates (which is beta) it's moot, and a smaller delta is preferable, right?

Member Author

Yes, this was the idea.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 11, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 63ae40e8ca669c8d03f80441fd930d1f19e1974c

@thockin
Member

thockin commented Oct 11, 2023

/retest

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alexanderConstantinescu, thockin

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 11, 2023
@k8s-ci-robot k8s-ci-robot merged commit 9cf1910 into kubernetes:master Oct 11, 2023
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Oct 11, 2023
@alexanderConstantinescu
Member Author

/cherry-pick release-1.28

k8s-ci-robot added a commit that referenced this pull request Oct 23, 2023
…rry-pick-of-#121116-upstream-release-1.28

Automated cherry pick of #121116: KCCM: fix GCP ILB by reintroducing readiness predicate for
k8s-ci-robot added a commit that referenced this pull request Oct 23, 2023
…rry-pick-of-#121116-upstream-release-1.27

Automated cherry pick of #121116: KCCM: fix GCP ILB by reintroducing readiness predicate for
k8s-ci-robot added a commit that referenced this pull request Oct 23, 2023
…rry-pick-of-#121116-upstream-release-1.26

Automated cherry pick of #121116: KCCM: fix GCP ILB by reintroducing readiness predicate for
Successfully merging this pull request may close these issues.

[KCCM]: service controller predicates might impact ingress SLO on GCP with InstanceGroup based Load Balancing
3 participants