Update API Priority and Fairness metrics

Add some overlooked ones. Note maturity level.

Signed-off-by: Mike Spreitzer <mspreitz@us.ibm.com>
kubernetes · Oct 18, 2023 · b415f25
1 parent 06325f9
Showing 1 changed file with 42 additions and 25 deletions.
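
A note in this diff says that, because each FlowSchema assigns requests to a single PriorityLevelConfiguration, the `apiserver_flowcontrol_request_wait_duration_seconds` histograms for all FlowSchemas of one priority level can be added to get that level's effective histogram. The following is a minimal Python sketch of that arithmetic, not anything shipped with Kubernetes. It assumes every histogram uses the same bucket boundaries; the `merge_histograms` helper, the FlowSchema names, and the counts are hypothetical and chosen only for illustration.

```python
from collections import Counter


def merge_histograms(per_flow_schema):
    """Add cumulative bucket counts across FlowSchemas of one priority level.

    per_flow_schema maps a FlowSchema name to a dict of
    {bucket upper bound ("le") -> cumulative count}.
    Assumption: all histograms share the same bucket boundaries.
    """
    total = Counter()
    for buckets in per_flow_schema.values():
        # Counter.update adds counts for matching keys instead of replacing them.
        total.update(buckets)
    return dict(total)


# Hypothetical samples for two FlowSchemas assigned to the same priority level.
samples = {
    "flow-schema-a": {"0.1": 40, "1": 55, "+Inf": 60},
    "flow-schema-b": {"0.1": 10, "1": 10, "+Inf": 12},
}

merged = merge_histograms(samples)
print(merged)
```

A monitoring system would more likely do this aggregation at query time; the sketch only shows why bucket-wise addition is valid when each request is counted under exactly one FlowSchema.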
diff --git a/content/en/docs/concepts/cluster-administration/flow-control.md b/content/en/docs/concepts/cluster-administration/flow-control.md
@@ -488,6 +488,8 @@ exports additional metrics. Monitoring these can help you determine whether your
 configuration is inappropriately throttling important traffic, or find
 poorly-behaved workloads that may be harming system health.
 
+#### Maturity level BETA
+
 * `apiserver_flowcontrol_rejected_requests_total` is a counter vector
   (cumulative since server start) of requests that were rejected,
   broken down by the labels `flow_schema` (indicating the one that
@@ -509,6 +511,37 @@ poorly-behaved workloads that may be harming system health.
   vector (cumulative since server start) of requests that began
   executing, broken down by `flow_schema` and `priority_level`.
 
+* `apiserver_flowcontrol_current_inqueue_requests` is a gauge vector
+  holding the instantaneous number of queued (not executing) requests,
+  broken down by `priority_level` and `flow_schema`.
+
+* `apiserver_flowcontrol_current_executing_requests` is a gauge vector
+  holding the instantaneous number of executing (not waiting in a
+  queue) requests, broken down by `priority_level` and `flow_schema`.
+
+* `apiserver_flowcontrol_current_executing_seats` is a gauge vector
+  holding the instantaneous number of occupied seats, broken down by
+  `priority_level` and `flow_schema`.
+
+* `apiserver_flowcontrol_request_wait_duration_seconds` is a histogram
+  vector of how long requests spent queued, broken down by the labels
+  `flow_schema`, `priority_level`, and `execute`. The `execute` label
+  indicates whether the request has started executing.
+
+  {{< note >}}
+  Since each FlowSchema always assigns requests to a single
+  PriorityLevelConfiguration, you can add the histograms for all the
+  FlowSchemas for one priority level to get the effective histogram for
+  requests assigned to that priority level.
+  {{< /note >}}
+
+* `apiserver_flowcontrol_nominal_limit_seats` is a gauge vector
+  holding each priority level's nominal concurrency limit, computed
+  from the API server's total concurrency limit and the priority
+  level's configured nominal concurrency shares.
+
+#### Maturity level ALPHA
+
 * `apiserver_current_inqueue_requests` is a gauge vector of recent
   high water marks of the number of queued requests, grouped by a
   label named `request_kind` whose value is `mutating` or `readOnly`.
@@ -518,6 +551,10 @@ poorly-behaved workloads that may be harming system health.
   last window's high water mark of number of requests actively being
   served.
 
+* `apiserver_current_inqueue_seats` is a gauge vector of the sum over
+  queued requests of the largest number of seats each will occupy,
+  grouped by labels named `flow_schema` and `priority_level`.
+
 * `apiserver_flowcontrol_read_vs_write_current_requests` is a
   histogram vector of observations, made at the end of every
   nanosecond, of the number of requests broken down by the labels
@@ -528,14 +565,6 @@ poorly-behaved workloads that may be harming system health.
   number of requests (queue volume limit for waiting and concurrency
   limit for executing).
 
-* `apiserver_flowcontrol_current_inqueue_requests` is a gauge vector
-  holding the instantaneous number of queued (not executing) requests,
-  broken down by `priority_level` and `flow_schema`.
-
-* `apiserver_flowcontrol_current_executing_requests` is a gauge vector
-  holding the instantaneous number of executing (not waiting in a
-  queue) requests, broken down by `priority_level` and `flow_schema`.
-
 * `apiserver_flowcontrol_request_concurrency_in_use` is a gauge vector
   holding the instantaneous number of occupied seats, broken down by
   `priority_level` and `flow_schema`.
@@ -584,11 +613,6 @@ poorly-behaved workloads that may be harming system health.
   was always equal to `apiserver_flowcontrol_current_limit_seats`
   (which did not exist as a distinct metric).
 
-* `apiserver_flowcontrol_nominal_limit_seats` is a gauge vector
-  holding each priority level's nominal concurrency limit, computed
-  from the API server's total concurrency limit and the priority
-  level's configured nominal concurrency shares.
-
 * `apiserver_flowcontrol_lower_limit_seats` is a gauge vector holding
   the lower bound on each priority level's dynamic concurrency limit.
 
@@ -631,18 +655,6 @@ poorly-behaved workloads that may be harming system health.
   holding, for each priority level, the dynamic concurrency limit
   derived in the last adjustment.
 
-* `apiserver_flowcontrol_request_wait_duration_seconds` is a histogram
-  vector of how long requests spent queued, broken down by the labels
-  `flow_schema`, `priority_level`, and `execute`. The `execute` label
-  indicates whether the request has started executing.
-
-  {{< note >}}
-  Since each FlowSchema always assigns requests to a single
-  PriorityLevelConfiguration, you can add the histograms for all the
-  FlowSchemas for one priority level to get the effective histogram for
-  requests assigned to that priority level.
-  {{< /note >}}
-
 * `apiserver_flowcontrol_request_execution_seconds` is a histogram
   vector of how long requests took to actually execute, broken down by
   `flow_schema` and `priority_level`.
@@ -661,6 +673,11 @@ poorly-behaved workloads that may be harming system health.
   to a request being dispatched but did not, due to lack of available
   concurrency, broken down by `flow_schema` and `priority_level`.
 
+* `apiserver_flowcontrol_epoch_advance_total` is a counter vector of
+  the number of attempts to jump a priority level's progress meter
+  backward to avoid numeric overflow, grouped by `priority_level` and
+  `success`.
+
 ## Good practices for using API Priority and Fairness
 
 When a given priority level exceeds its permitted concurrency, requests can