
KEP-1040: Start drafting GA graduation criteria for API Priority and Fairness #3155

Merged (2 commits, Apr 12, 2022)

Conversation

@MikeSpreitzer (Member) commented Jan 17, 2022:

  • One-line PR description:
    Start drafting GA graduation criteria for API Priority and Fairness

Issue #1040

  • Other comments: the beta criteria are a bit musty, but that's water under the bridge now.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jan 17, 2022
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Jan 17, 2022
@MikeSpreitzer (Member Author):

/cc @wojtek-t
/cc @tkashem
@yue9944882


- Satisfaction with LIST and WATCH support
- APF allows us to disable client-side rate limiting (or we know the reason why not)
- Satisfaction that the API is sufficient to support borrowing between priority levels
Member:

I would say maybe not necessarily sufficient, but rather that we are convinced we can "extend it in a backward-compatible way" (i.e., we will not have to change field details, validations, defaulting, etc. for that purpose).

Member Author:

done

@kikisdeliveryservice kikisdeliveryservice changed the title Start drafting GA graduation criteria for API Priority and Fairness KEP-1040: Start drafting GA graduation criteria for API Priority and Fairness Feb 3, 2022
@MikeSpreitzer (Member Author):

The force-push to 669f968 makes a change along the lines @wojtek-t suggested in review.

@wojtek-t (Member) commented Feb 7, 2022:

/lgtm

We can always extend it when we are actually targeting GA.

/assign @deads2k @lavalamp

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 7, 2022
@MikeSpreitzer (Member Author):

FYI, here is the description of how to add a field to an API object without bumping the version: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md#new-field-in-existing-api-version
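For illustration, here is a minimal sketch of the pattern that document describes, following the usual API conventions; the `WidgetSpec` type and `MaxRetries` field are hypothetical, not part of any real Kubernetes API:

```go
// Hypothetical example of adding a field to an existing API version without
// bumping the version. The new field is optional, a pointer so that "unset"
// is distinguishable, and omitted from serialization when nil, so objects
// written before the field existed still round-trip unchanged.
type WidgetSpec struct {
	// Existing field, unchanged.
	Replicas int32 `json:"replicas" protobuf:"varint,1,opt,name=replicas"`

	// MaxRetries is the hypothetical new field; every consumer must either
	// default it or tolerate it being nil.
	// +optional
	MaxRetries *int32 `json:"maxRetries,omitempty" protobuf:"varint,2,opt,name=maxRetries"`
}
```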


GA:

- Satisfaction with LIST and WATCH support
- APF allows us to disable client-side rate limiting (or we know the reason why not)
Contributor:

If we cannot disable client-side rate limiting, why would we consider the feature complete?

Member:

In my opinion there are two aspects:

  • whether, without client-side rate-limiting, the kube-apiserver (and etcd) fall over
  • whether, without client-side rate-limiting, some other parts of the system fall over because they cannot keep up with the load and accumulate a huge backlog

Let's take a specific example. Say the endpoints controller can keep up with a throughput of 100 pods/s in large enough clusters. Now, if we remove client-side rate-limiting completely, then e.g. the scheduler would be able to schedule, say, 500 pods/s, and the endpoints controller would accumulate backlog - thus network programming latency would go extremely high.
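(With those illustrative numbers, the backlog grows at 500 − 100 = 400 pods/s, i.e. roughly 24,000 pods of unprocessed endpoint work after just one minute of sustained load.)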

From that perspective, I think that APF does its job (and we would be able to get rid of client-side rate-limiting - there is nothing more to do there), but I would still be reluctant to actually get rid of it in all components, to avoid hurting other aspects of the system. The improvements needed would actually be in completely different components, though, so they shouldn't block APF graduation.

Contributor:

That's a compelling reason. I'd like for that explanation or something similar to be expressed here. Priority and fairness, with client-side rate limiting disabled, must be sufficient for the kube-apiserver to survive. If other components require additional client-side rate limiting, that will not stop our GA, but the kube-apiserver must survive without it.
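To make "with client-side rate limiting disabled" concrete, here is a minimal sketch for a client-go based component. It assumes client-go's documented rest.Config behavior (a negative QPS with no explicit RateLimiter skips the default token bucket); the package, function name, and kubeconfig path are illustrative:

```go
package clientutil

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/flowcontrol"
)

// newUnthrottledClient builds a clientset with no client-side rate limiting,
// so the only throttling the component sees is server-side APF.
func newUnthrottledClient(kubeconfig string) (*kubernetes.Clientset, error) {
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		return nil, err
	}

	// Negative QPS/Burst disable the default client-side token bucket in
	// recent client-go; setting an always-allow limiter makes the intent
	// explicit and takes precedence over QPS/Burst anyway.
	cfg.QPS = -1
	cfg.Burst = -1
	cfg.RateLimiter = flowcontrol.NewFakeAlwaysRateLimiter()

	return kubernetes.NewForConfig(cfg)
}
```

Core components expose the equivalent knobs through flags or component config (e.g. kube-controller-manager's --kube-api-qps / --kube-api-burst), though the exact mechanism varies per component.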

Member:

@MikeSpreitzer - can you please extend the text to incorporate it?

Member Author:

Interesting distinction here. In short, this harkens back to the point I have been making in other contexts: Kubernetes can be seen as two layers. The lower layer is an extensible API service, and the higher layer is built on that API service and consists of the resource definitions and controllers for managing containerized workload. APF is focused on the lower layer, but Kubernetes will not be safe from overloads until the higher layer also has protections.

Member:

I agree with the theory, but I'm skeptical that we actually have a slow controller like this?

I'm also skeptical that this is how people would choose to solve a problem -- I've heard of tons of latency problems and I can't recall ever hearing someone suggest slowing down everything else in the system so that the slow thing can keep up.

And if you really want to slow down, e.g., the scheduler to keep endpoints from looking bad, APF does let you do that...
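As a rough sketch of that knob (assuming the flowcontrol.apiserver.k8s.io/v1beta2 API current for this KEP; the object names and numbers are purely illustrative, not a recommendation), a dedicated priority level with a small concurrency share plus a FlowSchema routing the scheduler's traffic to it would cap the scheduler's throughput server-side:

```go
package apfexample

import (
	flowcontrolv1beta2 "k8s.io/api/flowcontrol/v1beta2"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// A deliberately small priority level: its concurrency share caps how fast
// the matched client can go, regardless of client-side settings.
var slowLane = flowcontrolv1beta2.PriorityLevelConfiguration{
	ObjectMeta: metav1.ObjectMeta{Name: "throttled-scheduler"},
	Spec: flowcontrolv1beta2.PriorityLevelConfigurationSpec{
		Type: flowcontrolv1beta2.PriorityLevelEnablementLimited,
		Limited: &flowcontrolv1beta2.LimitedPriorityLevelConfiguration{
			AssuredConcurrencyShares: 2, // small share => low throughput
			LimitResponse: flowcontrolv1beta2.LimitResponse{
				Type: flowcontrolv1beta2.LimitResponseTypeQueue,
				Queuing: &flowcontrolv1beta2.QueuingConfiguration{
					Queues:           8,
					HandSize:         4,
					QueueLengthLimit: 50,
				},
			},
		},
	},
}

// Route requests made as the scheduler's identity to that priority level.
var schedulerFlowSchema = flowcontrolv1beta2.FlowSchema{
	ObjectMeta: metav1.ObjectMeta{Name: "throttle-kube-scheduler"},
	Spec: flowcontrolv1beta2.FlowSchemaSpec{
		PriorityLevelConfiguration: flowcontrolv1beta2.PriorityLevelConfigurationReference{
			Name: "throttled-scheduler",
		},
		// Must be numerically lower (i.e. higher precedence) than the
		// built-in schema that would otherwise match the scheduler.
		MatchingPrecedence: 500,
		DistinguisherMethod: &flowcontrolv1beta2.FlowDistinguisherMethod{
			Type: flowcontrolv1beta2.FlowDistinguisherMethodByUserType,
		},
		Rules: []flowcontrolv1beta2.PolicyRulesWithSubjects{{
			Subjects: []flowcontrolv1beta2.Subject{{
				Kind: flowcontrolv1beta2.SubjectKindUser,
				User: &flowcontrolv1beta2.UserSubject{Name: "system:kube-scheduler"},
			}},
			ResourceRules: []flowcontrolv1beta2.ResourcePolicyRule{{
				Verbs:        []string{"*"},
				APIGroups:    []string{"*"},
				Resources:    []string{"*"},
				ClusterScope: true,
				Namespaces:   []string{"*"},
			}},
		}},
	},
}
```

These objects could be created via the FlowcontrolV1beta2 client or applied as YAML; the point is only that APF already provides a per-client throttle without touching client-side limits.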

Member:

Longer-term I agree with that.

Shorter-term I actually disagree.
Slow networking controllers (that completely can't keep up with other controllers) can actually cause an outage for your applications.
e.g. imagine that we can do a rolling upgrade super fast, but networking doesn't keep up, and when the last old pod is actually deleted, the newer ones are not yet added to the LB mechanism. That means a complete outage of your service.

So while I'm definitely all for disabling client-side rate-limits eventually, I'm lacking a lot of confidence here and I would be opposed to doing that before we prove it works. And at the same time, blocking P&F graduation on that doesn't seem desirable.

Member Author:

Wasn't there a pod readiness thing done a while ago to address that load balancer scenario?

Member:

That doesn't necessarily mean everyone is using it. Recommended patterns are not necessarily adopted quickly by many users.


- Satisfaction with LIST and WATCH support
- APF allows us to disable client-side rate limiting (or we know the reason why not)
- Satisfaction that the API can be extended in a backward-compatible way to support borrowing between priority levels
Contributor:

I'm not clear on why we would consider the feature GA without borrowing being implemented.

Member:

We are all using P&F in production now and we've all seen cases where it actually prevented failures. So we have proof that it already provides significant value even without borrowing.
In my personal opinion, borrowing is an extension/feature on top of basic P&F, and (while I fully agree that we should start working on it now-ish) I don't think we should actually block the GA of the feature as a whole on it.
WDYT?

Contributor:

Without borrowing, we have reservations about really compressing the number of concurrent requests to a value small enough to keep clusters near the edge. I'd like to be able to shrink the number of concurrent requests, but that has significant negative impacts without borrowing unused priority.

cc @tkashem

Member:

I fully agree with the above. And I 100% agree we should do that.
My only concern is - why should this be a GA release blocker? All of us are effectively using P&F in production. And I think at this point no one will switch back to max-in-flight, as it's already visibly worse.
So the way I'm looking at it is "P&F is missing an important extension", not "P&F is not GA-quality".

Member Author:

We discussed it in the APF meeting today and decided to update the criteria to say that borrowing is working. One of the leading considerations is the expectation that working borrowing will lead to significant changes to configuration. This would be disruptive enough to deserve happening as part of the transition from beta to GA.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 11, 2022
@wojtek-t (Member):

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 12, 2022
GA:

- Satisfaction with LIST and WATCH support.
- API annotations properly support strategic merge patch.
Member:

What does this mean?

Member Author:

This means we botched the field tags when we first wrote them.

Member:

Oh, you literally mean the tags on fields? OK
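For context, a hedged sketch of what correctly tagged list fields look like for strategic merge patch; the type and field names below are generic illustrations, not the actual flowcontrol types:

```go
// With these markers and struct tags, a strategic merge patch that sends a
// single list element is merged by the "name" key instead of replacing the
// whole list.
type ExampleSpec struct {
	// +listType=map
	// +listMapKey=name
	// +patchMergeKey=name
	// +patchStrategy=merge
	// +optional
	Rules []ExampleRule `json:"rules,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,1,rep,name=rules"`
}

type ExampleRule struct {
	Name string `json:"name" protobuf:"bytes,1,opt,name=name"`
}
```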

Member:

(I don't think this is a GA requirement personally, it's more a tactical need until we get SSA on in integration tests)


@lavalamp (Member):

/approve

@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, MikeSpreitzer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 12, 2022
@k8s-ci-robot k8s-ci-robot merged commit ff71f81 into kubernetes:master Apr 12, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Apr 12, 2022