Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KEP 3458]: 1.29 milestone bump - stable #4257

Merged
merged 1 commit into from
Oct 4, 2023

Conversation

alexanderConstantinescu
Copy link
Member

  • One-line PR description: 1.29 milestone bump for KEP - 3458. Bumping to stable
  • Other comments: none

/sig network
/assign @thockin
/assign @wojtek-t

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Oct 2, 2023
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Oct 2, 2023
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 2, 2023
keps/prod-readiness/sig-network/3458.yaml Show resolved Hide resolved
status: implementable
stage: beta
latest-milestone: "v1.28"
stage: stable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few comment about the KEP itself:

  1. In mitigations section:

For point 3. we could implement the following change in the service controller [...]

has that happened? If not, why we believe we don't need it? can you update the KEP?

Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested

has this happened? Can you please update the KEP with the test description and findings?

A new metric load_balancer_sync_count will be added

Has this happened?

A new metric nodesync_error_rate will be added

Has this happened?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has that happened? If not, why we believe we don't need it? can you update the KEP?

Yes. The service controller still employs the predicate which excludes nodes with the taint ToBeDeletedByClusterAutoscaler, see: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cloud-provider/controllers/service/controller.go#L1009-L1014C26

has this happened? Can you please update the KEP with the test description and findings?

No

Has this happened?

Yes, see: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cloud-provider/controllers/service/metrics.go#L44-L49

Has this happened?

Yes, see: https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cloud-provider/controllers/service/metrics.go#L50-L55

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the "yes" thing - please update the KEP to reflect that (with links).

For the no thing [the test thing], can we try running something? @thockin - FYI

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the "yes" thing - please update the KEP to reflect that (with links).

Done.

For the no thing [the test thing], can we try running something?

I am not sitting on infrastructure which allows me to create cluster from HEAD. That being said: I quite sure the upgrade path works because this has been enabled by default since 1.27 and my company has successfully upgraded clusters from 1.26 -> 1.27, with services existing on the cluster that are impacted by this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help get access to infra to do this manual testing, if that's what we need. That said, I'm a little skeptical that we will find anyhing by just doing it, without a) knowing what we are looking for; and b) tickling the edge cases specifically.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - I buy that.

But then let's update the Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? PRR question to something like

The owners of the KEP missed running the manual test when promoting the feature to Beta. Since then it was implicitly tested by many users that upgraded their clusters to 1.27+ versions without any bug reports, so running additional tests now wouldn't provide additional value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the answer to that question.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 2, 2023
Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

/hold

to discuss manual testing

status: implementable
stage: beta
latest-milestone: "v1.28"
stage: stable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help get access to infra to do this manual testing, if that's what we need. That said, I'm a little skeptical that we will find anyhing by just doing it, without a) knowing what we are looking for; and b) tickling the edge cases specifically.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Oct 2, 2023
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 4, 2023
@thockin thockin removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 4, 2023
@thockin
Copy link
Member

thockin commented Oct 4, 2023

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 4, 2023
@wojtek-t
Copy link
Member

wojtek-t commented Oct 4, 2023

/lgtm
/approve PRR

Thanks!

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alexanderConstantinescu, thockin, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 4, 2023
@k8s-ci-robot k8s-ci-robot merged commit a81c32a into kubernetes:master Oct 4, 2023
4 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/network Categorizes an issue or PR as relevant to SIG Network. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants