Add Support for Thanos Ruler #527
Are there features that thanos-ruler provides that our self-deployed rule evaluator doesn't? If this request is due to a feature gap, maybe we should address that as well.
Hi @lyanco. Thanks a lot for the quick reply.
Thanos Ruler is already part of our self-managed monitoring infrastructure, and the generated series are pushed to Cloud Storage (in raw Prometheus format) instead of being sent to Cloud Monitoring via the CreateTimeSeries API. The workflow using rule-evaluator is like this:
[workflow diagram not preserved]
But with GMP + Thanos Ruler it will look like this:
[workflow diagram not preserved]
Got it, thanks for the context. And what metrics are you recording over? System metrics or custom (user-defined) metrics? If system metrics, why is exporting the raw metrics via stackdriver-exporter not sufficient?
@lyanco system metrics (HTTP LBs, Pub/Sub, etc.). I have had issues before with stackdriver-exporter yielding wrong results, while the same query on Cloud Monitoring returns correct results: deltas are not easy to translate, and the histograms can start returning weird results. I have even had a support ticket open with GCP for more than a week now. Another reason why I like the GMP frontend is that I don't have to pull high-cardinality metrics (histograms) when I am only interested in the aggregation. So doing the aggregation on the GCP side is more efficient for our needs.
Got it! Thank you for the context on your use case, that makes sense. As you said, the issue with stackdriver-exporter likely boils down to OSS Prometheus having no idea how to scrape DELTAs natively... But yeah, not our code 🙂 I don't see any issue with this PR from a goals/objective perspective but I'll let the devs weigh in on correctness etc.
Just an update from my side, I opened another PR for Thanos to allow disabling sending those parameters altogether: thanos-io/thanos#6560. I believe dropping them from the source would be better and will keep prometheus-engine-frontend cleaner.
Thanks! From the dev side, we are happy to accept this PR; in fact, we could go as far as exposing the Thanos StoreAPI gRPC at some point. However, to merge this one we need some unit tests and perhaps some flag like
Commented on thanos-io/thanos#6560 (review) nevertheless
Thanks @bwplotka for the feedback. I personally think the PR I opened on Thanos repo is a better way of supporting my use case. If it gets merged, then I will close this PR.
sgtm!
closing this in favor of thanos-io/thanos#6560
I would like to use the prometheus-engine frontend component as a datasource for Thanos Ruler. Unfortunately, Thanos Ruler always sends 2 extra params, `dedup` and `partial_response`, which results in a 400 error on the GCP side.
The change in this PR mutates the request body by removing those extra params before forwarding the request to GCP.
@bwplotka your feedback is welcome (maybe you have a different way of doing it).