Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-3453 minimize iptables-restore to Beta #3577

Merged

Conversation

danwinship
Copy link
Contributor

@danwinship danwinship commented Oct 4, 2022

(For the moment this PR just exists to continue the conversation from #3454.)

/cc @wojtek-t

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/network Categorizes an issue or PR as relevant to SIG Network. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 4, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2023
@wojtek-t
Copy link
Member

wojtek-t commented Jan 2, 2023

Sorry - I dropped it. Will get back to it this week.

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 2, 2023
@danwinship
Copy link
Contributor Author

@wojtek-t We did a KEP review pass in the SIG Network meeting today, and @thockin was in favor of skipping Beta for this and just going straight to GA (perhaps "GA but with the feature gate still flippable, just in case").

Reiterating some discussion from the earlier PR: there was not any particular reason to think we were going to run into bugs with this code, and the examples in the KEP were just me brainstorming theoretical failure modes. Rather, the point of the KEP/feature gate was just that the possible badness was tiny_chance_of_bugs * huge_number_of_people_affected. But since testing in the perf e2e tests has not turned up bugs, we feel more confident about just inflicting it on everyone now.

@wojtek-t
Copy link
Member

wojtek-t commented Jan 5, 2023

@danwinship - I still think that the metric that we were discussing would actually be useful, but I don't want trigger the pain, if I'm the only person who sees that metric as useful (and I looked into the code and agree that adding this metric wouldn't be super trivial task).

So for now, may I ask you to add two-three sentences about that into:
https://github.com/kubernetes/enhancements/pull/3454/files#diff-6efffdb08cfb3aefcb1047676bfdecec745cf9ad84879e07a34f151fc97dd96aR400
[Are there any missing metrics that would be useful to have to improve observability of this feature? section of the PRR]
so leave the clear trace that we spend time discussing that just decided not doing this?

@wojtek-t wojtek-t self-assigned this Jan 5, 2023
@danwinship danwinship force-pushed the kep-3453-minimize-iptables-beta branch from 66d6342 to 047bdbd Compare January 6, 2023 17:01
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 6, 2023
@danwinship
Copy link
Contributor Author

updated... I can't get "make verify" to run locally so let's see if it complains about me trying to skip beta...

/assign @thockin

@danwinship danwinship changed the title WIP KEP-3453 minimize iptables-restore to beta KEP-3453 minimize iptables-restore to GA Jan 6, 2023
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 6, 2023
(a) no new bugs, and (b) improved performance.
- General e2e tests modified to check the "partial sync failures"
metric at some point or points during the test run, TBD.
(We have decided to skip Beta.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The risk here is being able to disable it, if needed (your point from the call). While we CAN make a disableable GA, it's unusual and has little ROI here. So I think we should either:

a) Do a regular locked-on GA (YOLO!)
b) Do a regular beta

There's not much cost to beta - it will be on by default and disableable and it follows the normal flow.

Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly believe we should make a beta.

As Tim pointed out, I don't think there is much of additional overhead.
Given we believe we're code-complete, Beta will effectively only be changing a feature gate.
And it gives us ability to disable that in case of problems (as opposed to GA that would be locked-on).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thockin I'm fine going through beta but don't the arguments in favor of it contradict your new "we should get rid of beta" proposal? 🙂

Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my main request is to actually go through Beta. I also added couple other small comments, but those are more cosmetic.

(a) no new bugs, and (b) improved performance.
- General e2e tests modified to check the "partial sync failures"
metric at some point or points during the test run, TBD.
(We have decided to skip Beta.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I strongly believe we should make a beta.

As Tim pointed out, I don't think there is much of additional overhead.
Given we believe we're code-complete, Beta will effectively only be changing a feature gate.
And it gives us ability to disable that in case of problems (as opposed to GA that would be locked-on).

keps/sig-network/3453-minimize-iptables-restore/kep.yaml Outdated Show resolved Hide resolved
keps/sig-network/3453-minimize-iptables-restore/README.md Outdated Show resolved Hide resolved
@danwinship danwinship force-pushed the kep-3453-minimize-iptables-beta branch from 047bdbd to 89636d4 Compare January 10, 2023 18:51
@danwinship danwinship changed the title KEP-3453 minimize iptables-restore to GA KEP-3453 minimize iptables-restore to Beta Jan 10, 2023
@danwinship danwinship force-pushed the kep-3453-minimize-iptables-beta branch from 89636d4 to c88cd7a Compare January 10, 2023 18:55
@wojtek-t
Copy link
Member

/approve PRR

Thanks!

@aojea
Copy link
Member

aojea commented Jan 10, 2023

/lgtm
/approve

beta sounds good, I have to update some scalability jobs anyway 😄

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 10, 2023
Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, danwinship, thockin, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 12, 2023
@thockin thockin added this to the v1.27 milestone Jan 12, 2023
@k8s-ci-robot k8s-ci-robot merged commit fff1bdb into kubernetes:master Jan 12, 2023
@danwinship danwinship deleted the kep-3453-minimize-iptables-beta branch January 13, 2023 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants