Improve UX for SLO's with more than 2 9's #131

Closed
winmillwill opened this issue Jul 27, 2021 · 4 comments
Labels
bug (Something isn't working), grafana, prometheus

Comments

@winmillwill

Thanks for the awesome code! I had to do the equivalent of this a couple years ago with nothing but sed!

If I have a service with two objectives of 99.9 and another objective of 99.99, I get numbers like these:

$ sloth generate -i slo.yml 2>/dev/null | grep -Eo '0\.[0-9]{4}[0-9]+'
0.9990000000000001
0.9990000000000001
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.9990000000000001
0.9990000000000001
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.0009999999999999432
0.9998999999999999
0.9998999999999999
0.00010000000000005117
0.00010000000000005117
0.00010000000000005117
0.00010000000000005117
0.00010000000000005117
0.00010000000000005117
0.00010000000000005117
0.00010000000000005117

... where what I probably want is

0.9990000000000001 -> 0.999
0.0009999999999999432 -> 0.001
0.9998999999999999 -> 0.9999
0.00010000000000005117 -> 0.0001
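
The values above are consistent with ordinary 64-bit floating-point arithmetic. The following is a minimal sketch in Python, not Sloth's code: the formulas `objective / 100` and `(100 - objective) / 100` are assumptions inferred from the output, and the function names are hypothetical. It reproduces the same digits and shows that rounding to a fixed number of decimals recovers the intended values.

```python
# Assumed formulas (inferred from the output above, not taken from Sloth):
#   objective ratio    = objective / 100
#   error budget ratio = (100 - objective) / 100


def objective_ratio(objective_percent: float) -> float:
    """Objective expressed as a ratio, computed in binary floating point."""
    return objective_percent / 100


def error_budget_ratio(objective_percent: float) -> float:
    """Error budget expressed as a ratio, computed in binary floating point."""
    return (100 - objective_percent) / 100


for objective in (99.9, 99.99):
    ratio = objective_ratio(objective)
    budget = error_budget_ratio(objective)
    # repr() prints the shortest decimal string that round-trips the float.
    print(repr(ratio), "->", round(ratio, 10))
    print(repr(budget), "->", round(budget, 10))

# Prints:
# 0.9990000000000001 -> 0.999
# 0.0009999999999999432 -> 0.001
# 0.9998999999999999 -> 0.9999
# 0.00010000000000005117 -> 0.0001
```

Rounding only changes how the number is written out; neither 99.9 nor 0.999 has an exact binary representation, so the stored value is still an approximation.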

I think the implementation might be something like "don't use floats".
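
One way to read "don't use floats" is to keep the objective as decimal text and do the arithmetic in base 10. This is a hedged sketch using Python's standard `decimal` module; the `ratios` helper is hypothetical and is not how Sloth is implemented.

```python
from decimal import Decimal
from typing import Tuple


def ratios(objective_percent: str) -> Tuple[Decimal, Decimal]:
    """Return (objective ratio, error budget ratio) using decimal arithmetic.

    The objective is passed as text so that it never goes through a binary
    float on its way in.
    """
    objective = Decimal(objective_percent)
    hundred = Decimal(100)
    return objective / hundred, (hundred - objective) / hundred


for text in ("99.9", "99.99"):
    objective_ratio, error_budget = ratios(text)
    print(text, objective_ratio, error_budget)

# Prints:
# 99.9 0.999 0.001
# 99.99 0.9999 0.0001
```

The same idea would carry over to any language with a decimal or rational type, or to plain string manipulation of the objective.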

Related to this issue: the grafana board reports a 99.99 objective as 100%, which is dismaying to people who only see the dashboard and not the actual values in the promql queries.

winmillwill changed the title from "Support for SLO's with more than 2 9's" to "Improve UX for SLO's with more than 2 9's" on Jul 27, 2021
@slok
Owner

slok commented Jul 27, 2021

Hi @winmillwill!

From my side, I don't see a problem with those numbers because they happen under the hood; it is an implementation detail. How they are represented on a dashboard is another matter, and there I'm with you.

I see from your example that this affects how the objective is visualized, right? In what other way do those Prometheus rule numbers affect you?

Is the dashboard you are referring to the default dashboard? What version of Grafana are you using?

@winmillwill
Author

> I see from your example that this affects how the objective is visualized, right? In what other way do those Prometheus rule numbers affect you?

I don't think these numbers (the ones in the PrometheusRule resources) affect the visualization per se ... I think that's down to the dashboard JSON not specifying a precision ... though it could be that it's an issue with Grafana rounding because of all the trailing 9's.

As you say, it probably isn't having any other effect: the computation that prometheus does with these values is most likely the same either way. I guess I see this as an opportunity to remove a source of confusion, because the only computation we ever want to do is 1 - {{objectiveAsDecimal}} ... and I think we can just pass that term into the promql queries and let prometheus do the math. For my use case, I will probably be running the CLI in a CI/CD pipeline or storing the output in a repo. In either case, developers will review the diff of the PrometheusRules yaml and end up going "but that's not what I wanted". Again, I agree that it probably doesn't matter in terms of prometheus and grafana working as expected ... I just don't want to have the same conversation with N different teams as I help them adopt SLO's.
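
The idea of passing `1 - {{objectiveAsDecimal}}` into the query text could look roughly like the sketch below. Everything named here is hypothetical: the helper functions, the metric name `example:sli_error:ratio_rate5m`, and the factor `14.4` are illustrative placeholders and do not describe Sloth's templates or generated rules.

```python
from decimal import Decimal


def error_budget_term(objective_percent: str) -> str:
    """Render the error budget as the PromQL sub-expression '(1 - <ratio>)'.

    Decimal keeps the rendered ratio identical to what the user typed,
    only shifted by two places (for example "99.9" becomes "0.999").
    """
    ratio = Decimal(objective_percent) / Decimal(100)
    return f"(1 - {ratio})"


def burn_rate_condition(metric: str, factor: str, objective_percent: str) -> str:
    """Build a hypothetical alert condition as plain PromQL text."""
    return f"{metric} > ({factor} * {error_budget_term(objective_percent)})"


# Hypothetical metric name and factor, for illustration only.
print(burn_rate_condition("example:sli_error:ratio_rate5m", "14.4", "99.9"))
# Prints:
# example:sli_error:ratio_rate5m > (14.4 * (1 - 0.999))
```

Prometheus would still evaluate the subtraction in floating point, so the result is numerically no more exact; the gain is only that the generated rule shows the objective a reviewer expects to see in a diff.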

> Is the dashboard you are referring to the default dashboard? What version of Grafana are you using?

Yes, the one at grafana.com. I'm using grafana version 7.5.7.

To be clear I'm seeing a panel like this:
[Screenshot: Screen Shot 2021-07-27 at 21 00 49]

... and I'd like it to say 99.99% instead of 100.0%

slok added the bug (Something isn't working) and grafana labels and removed the generator label on Aug 2, 2021
@slok
Owner

slok commented Aug 2, 2021

Yesterday I checked and, as you said, Grafana automatically rounds after 2 decimals. The same happens with percentages and ratios, so the problem is not the 0.99900000000... values (I ran the same tests with 0.9999, 99.99...).

I was thinking about explicitly setting 3 decimals; however, I also don't like having 99 represented as 99.000, so I need to research a bit more how best to solve this. In any case, this doesn't affect Sloth's Prometheus implementation.
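
The trade-off described here can be illustrated with plain decimal formatting. This is a sketch of rounding behaviour in general, not of Grafana's rendering code, and both helper names are hypothetical: a small fixed precision turns 99.99 into 100.0, a larger fixed precision pads 99 to 99.000, and trimming trailing zeros avoids both.

```python
def fixed(value: float, decimals: int) -> str:
    """Format with a fixed number of decimals, as a fixed-precision panel would."""
    return f"{value:.{decimals}f}"


def trimmed(value: float, max_decimals: int = 3) -> str:
    """Format with up to max_decimals decimals, dropping trailing zeros."""
    text = f"{value:.{max_decimals}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


print(fixed(99.99, 1))  # 100.0   (the objective appears to be 100%)
print(fixed(99.99, 2))  # 99.99
print(fixed(99, 3))     # 99.000  (padding on a 2-nines objective)
print(trimmed(99))      # 99
print(trimmed(99.9))    # 99.9
print(trimmed(99.99))   # 99.99
```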

@slok
Owner

slok commented Oct 5, 2021

After merging #172 and updating Sloth's dashboard to its 4th revision, this issue is fixed.

Before:

[Screenshot: Screenshot_20211005_074818]

After:

[Screenshot: Screenshot_20211005_074842]

@slok slok closed this as completed Oct 5, 2021

2 participants