Skip to content

Commit

Permalink
review comments 1
Browse files Browse the repository at this point in the history
Signed-off-by: Adhityaa Chandrasekar <adtac@google.com>
  • Loading branch information
Adhityaa Chandrasekar committed Oct 2, 2020
1 parent d383448 commit 308e3a5
Showing 1 changed file with 41 additions and 29 deletions.
70 changes: 41 additions & 29 deletions keps/sig-api-machinery/1040-priority-and-fairness/README.md
Expand Up @@ -8,6 +8,7 @@
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Future Goals](#future-goals)
- [Proposal](#proposal)
- [Request Categorization](#request-categorization)
- [Assignment to a Queue](#assignment-to-a-queue)
Expand Down Expand Up @@ -229,6 +230,28 @@ stake in the ground.
priority to avoid priority inversion problems. That will
necessarily be approximate, and we settle for that now.

### Future Goals

To recap, there are some issues that we have decided not to address
yet but we think may be interesting to consider in the future.

- Helping load balancers do a better job, considering each apiserver's
current load state.

- Do something about WATCH and/or CONNECT requests.

- React somehow to etcd overloads.

- Generate information to help something respond to downstream
congestion.

- Auto-tune the resource limit(s) and/or request cost(s).

- Be more useful for events.

- Thread additional information along the paths needed to enable more
precisely targeted avoidance of priority inversions.

## Proposal

In short, this proposal is about generalizing the existing
Expand Down Expand Up @@ -2017,10 +2040,12 @@ you need any help or guidance.
- Components depending on the feature gate:
- kube-apiserver

* **Does enabling the feature change any default behavior?** No.
* **Does enabling the feature change any default behavior?** Yes, requests that
weren't rejected before could get rejected. Performance of kube-apiserver
under heavy load will likely be different too.

* **Can the feature be disabled once it has been enabled (i.e. can we roll back
the enablement)?** Yes, the feature is feature-gated.
the enablement)?** Yes.

* **What happens if we reenable the feature if it was previously rolled back?**
The feature will be restored.
Expand All @@ -2035,8 +2060,9 @@ you need any help or guidance.
have widespread impact such as: (1) locking an administrator out of their
system, (2) rejecting controller requests, thereby bringing a lot of things
to a halt, (3) dropping node heartbeats, which may result in overloading
other nodes, (4) cause kube-proxy requests to apiserver to fail, breaking
existing workloads. <!-- anything else catastrophic? :) -->
other nodes, (4) rejecting kube-proxy requests to apiserver, thereby breaking
existing workloads, (5) dropping leader election requests, resulting in HA
failure, or any combination of the above.

* **What specific metrics should inform a rollback?** An abnormal spike in the
`apiserver_flowcontrol_rejected_requests_total` metric should potentially be
Expand All @@ -2047,11 +2073,11 @@ you need any help or guidance.
rejected requests.

* **Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
<!-- I don't have the knowledge to answer this. -->
No.

* **Is the rollout accompanied by any deprecations and/or removals of features, APIs,
fields of API types, flags, etc.?**
<!-- Are max-requests-inflight, etc. being deprecated in favour of APF? -->
fields of API types, flags, etc.?** Yes, `--max-requests-inflights` will be
deprecated in favor of APF.

### Monitoring Requirements

Expand Down Expand Up @@ -2081,37 +2107,23 @@ of this feature?** No.
<!-- I don't have the knowledge to answer some questions in this section. -->

* **Will enabling / using this feature result in any new API calls?** No.
Describe them, providing:
- API call type (e.g. PATCH pods)
- estimated throughput
- originating component(s) (e.g. Kubelet, Feature-X-controller)
focusing mostly on:
- components listing and/or watching resources they didn't before
- API calls that may be triggered by changes of some Kubernetes resources
(e.g. update of object X triggers new updates of object Y)
- periodic API calls to reconcile state (e.g. periodic fetching state,
heartbeats, leader election, etc.)

* **Will enabling / using this feature result in introducing new API types?**
Describe them, providing:
- API type
- Supported number of objects per cluster
- Supported number of objects per namespace (for namespace-scoped objects)
Yes, a new flowcontrol API group, configuration types, and status types are
introduced. See `staging/src/k8s.io/api/flowcontrol/v1alpha1/types.go` for a
full list.

* **Will enabling / using this feature result in any new calls to the cloud
provider?** No.

* **Will enabling / using this feature result in increasing size or count of
the existing API objects?**
Describe them, providing:
- API type(s):
- Estimated increase in size: (e.g., new annotation of size 32B)
- Estimated amount of new objects: (e.g., new Object X for every existing Pod)
the existing API objects?** No.

* **Will enabling / using this feature result in increasing time taken by any
operations covered by [existing SLIs/SLOs]?**
Think about adding additional work or introducing new steps in between
(e.g. need to do X to start a container), etc. Please describe the details.
operations covered by [existing SLIs/SLOs]?** Yes, a non-negligible latency
is added to API calls to kube-apiserver. While [preliminary tests](https://github.com/tkashem/graceful/blob/master/priority-fairness/filter-latency/readme.md)
shows that the API server latency is still well within the existing SLOs,
more thorough testing needs to be performed.

* **Will enabling / using this feature result in non-negligible increase of
resource usage (CPU, RAM, disk, IO, ...) in any components?** The proposed
Expand Down

0 comments on commit 308e3a5

Please sign in to comment.