ws-manager: Replace backup/restore success with total metric #11158
5c0a65d to e5391ab
LGTM, thank you!!
/hold
Adding a hold: can you please double-check that we are not using that metric in any of the Grafana dashboards? If we are, those dashboards will break. I don't think we do, but a double check would be appreciated. Feel free to remove the hold.
❤️
I've looked at Workspace success criteria, and couldn't find anything.
Signed-off-by: ArthurSens <arthursens2005@gmail.com>
e5391ab to 6c1b349
LGTM, thank you ❤️
We only use the /unhold
Description
Hey Workspace-team!
We noticed that you want to level up your game towards success-criteria-related metrics, and there is momentum in the company to start adopting SLOs to get the right metrics.
We on the platform team have been a little more proactive about implementing SLOs, and while trying to get the first example up and running, we noticed that your metrics for workspace backups and workspace restores deviate a bit from best practices for SLOs with Prometheus 😬
When implementing success-ratio SLOs, we usually look for metrics that represent the "total number of requests" and the "number of failed requests". The calculation then looks like this:
backup error ratio = number of backups with errors / total number of backups
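As a rough illustration of the ratio above, here is a minimal Go sketch. The function name errorRatio and the sample readings are hypothetical and are not taken from ws-manager; in practice the two inputs would come from a "failed" counter and a "total" counter, and the division would normally be done in a Prometheus query rather than in application code.

```go
package main

import "fmt"

// errorRatio returns failed/total, the quantity a success-ratio SLO is built on.
// ok is false when total is zero, because the ratio is undefined without any requests.
// The name and signature are assumptions made for this example only.
func errorRatio(failed, total float64) (ratio float64, ok bool) {
	if total == 0 {
		return 0, false
	}
	return failed / total, true
}

func main() {
	// Hypothetical counter readings: 100 backups attempted, 25 of them failed.
	if ratio, ok := errorRatio(25, 100); ok {
		fmt.Printf("backup error ratio: %.2f\n", ratio)
	}
}
```

With a "total" metric available, the denominator is read directly instead of being reconstructed by adding separate success and failure series.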
Would it be fine if we change the current metrics to be more SLO-friendly?
PS: I also simplified the nested ifs by using WithLabelValues instead of GetMetricWithLabelValues as well.
Related Issue(s)
Fixes #
How to test
Release Notes