Promoting some flowcontrol metrics to Beta #118882
Comments
/sig api-machinery |
/triage accepted |
I think we should split two topics:
Re (1), I don't have a very strong opinion. The first 5 metrics you proposed certainly make sense to me. I'm not entirely sure about the last two, which are really "the current state, not valid anymore 1ms later". That makes them hard to reason about, given that monitoring pipelines usually have much larger latency, but I'm not strongly opposed either. Re (2), we should absolutely start thinking about promoting them, but given that APF is not yet GA, we should not promote metrics to GA. With the recent change in the metrics framework and the introduction of a Beta phase, I think we can definitely consider promoting them to Beta. |
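The concern about point-in-time gauges can be illustrated with a minimal sketch. Everything here is made up for illustration: a hypothetical gauge is observed once per millisecond, it spikes for 50 ms, and an assumed scraper reads it once every 1000 ms. None of the numbers or names come from kube-apiserver.

```python
# Assumed scrape interval for the illustration, not a Kubernetes default.
SCRAPE_INTERVAL_MS = 1000


def gauge_value(t_ms: int) -> int:
    """Return the made-up gauge value at time t_ms.

    The gauge is 0 except for a 50 ms burst starting at t = 1200 ms.
    """
    return 40 if 1200 <= t_ms < 1250 else 0


# The "true" signal: one observation per millisecond over 5 seconds.
timeline = [gauge_value(t) for t in range(5000)]

# A scraper only sees the value at the instants it scrapes.
scraped = [timeline[t] for t in range(0, 5000, SCRAPE_INTERVAL_MS)]

true_peak = max(timeline)
scraped_peak = max(scraped)
print(f"true peak: {true_peak}, peak seen by scraper: {scraped_peak}")
```

In this toy run the burst falls entirely between two scrapes, so the scraped series never shows it. Counters and histograms accumulate observations between scrapes, which is why they are easier to reason about under a slow monitoring pipeline.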
Yes, there are two topics here. Regarding I agree that Beta is the metric maturity level to aim for. |
BTW, see #118960 |
The metric named |
@MikeSpreitzer I don't disagree with using better names for these metrics, but we are effectively making a breaking change for only a name change and no actual changes to the metric schema. While these are alpha metrics, I'm not sure it's worthwhile to break the metric schema for the sake of better naming. I would instead be in favor of graduating existing metrics that are widely adopted to Beta while also introducing the new metric in alpha with more accurate naming which can be adopted in newer versions. |
Certainly we can fix name problems by introducing new names before removing old ones that carry equivalent information. I am not enthused about promoting to Beta stability something that we plan to delete. Could we introduce the new name at Beta stability, on the grounds that its content has already proven worthwhile under the old name? |
I don't agree. Either we agree that the existing names are good enough and just promote those (without introducing any new ones), or we deprecate the existing ones and introduce new ones, which will then be promoted to Beta. |
Selfishly, I am not in favor of breaking any metrics just for the sake of a name, because of the ripple effects on existing monitoring that relies on them. But I also understand that these are alpha metrics, so we have the right to break them :) I am not really convinced that users will care that the metric is named |
(either way, we should still update the metric description for apiserver_flowcontrol_request_concurrency_limit to mention it accounts for total seats and includes adjustments from borrowing) |
Elsewhere you can see users who are confused. That comes from the combination of an outdated name and an outdated HELP string. When I complained about there being no formal maturity level for "deprecated", I was overlooking the distinct field for indicating the release at which a metric is deprecated. So I agree that #118959 should be modified to not delete Now that #118960 has merged we have So I propose that the list of Beta metrics be as follows.
|
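The distinction drawn above, that a metric's stability level and its deprecation release are separate fields, can be sketched with a toy model. This is not the Kubernetes metrics framework API; `MetricSpec`, the metric names, and the version string are hypothetical placeholders chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MetricSpec:
    """Toy description of a metric (hypothetical, for illustration only)."""
    name: str
    stability: str                            # "ALPHA", "BETA", or "STABLE"
    deprecated_version: Optional[str] = None  # release at which deprecation starts


# Hypothetical registry: the old name stays registered but is marked
# deprecated, while the new name carrying the same information is Beta.
registry = [
    MetricSpec("example_old_name", "ALPHA", deprecated_version="1.28.0"),
    MetricSpec("example_new_name", "BETA"),
]


def beta_metrics(specs: List[MetricSpec]) -> List[str]:
    """Names of metrics that are Beta and not marked deprecated."""
    return [s.name for s in specs
            if s.stability == "BETA" and s.deprecated_version is None]


def deprecated_metrics(specs: List[MetricSpec]) -> List[str]:
    """Names of metrics that carry a deprecation release."""
    return [s.name for s in specs if s.deprecated_version is not None]


print(beta_metrics(registry))
print(deprecated_metrics(registry))
```

Because deprecation is tracked independently of stability, the old name can keep being emitted for existing dashboards while only the new name is listed for promotion.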
What would you like to be added?
Some flowcontrol metrics in kube-apiserver have been around for a long time and are very high-value signals for how APF is impacting apiserver latency. With APF graduated to Beta in 1.20 and ongoing work to get to GA, I wanted to open an issue to discuss graduating some of these metrics to Beta (or Stable because we skip Beta?), specifically ones that are high-value signals for apiserver latency with many users depending on them already. Here are metrics that I personally think would benefit from graduation to Stable based on their usage at Google:
For the other metrics, I propose we leave them as alpha for now, since they pertain to internal details of APF that most users will likely not care about and mainly serve maintainers who are looking to optimize implementation details.
Why is this needed?
Promoting widely adopted metrics to Beta/Stable will provide stronger guarantees on their compatibility going forward.