We have been using Thanos for a few months now. It is deployed using sidecars, and the historical data is stored in S3.
Our setup:
Thanos version 0.32.5
Kubernetes v1.25.3
Prometheus keeps metrics locally for 2 weeks.
The compactor is running and the data is compacted and downsampled:
* --retention.resolution-raw=30d
* --retention.resolution-5m=30d
* --retention.resolution-1h=10y
The problem
I want to see some statistics for the last 30 days, but the dashboard takes very long to load. When I ask for data for the period now-30d to now, it takes 25 seconds for the dashboard to load.
When I change the timeframe to now-45d to now-15d (also 30 days, but the data is available only in S3), the same dashboard takes only 10 seconds to load.
I can see in the logs that all the sidecars are contacted. I'm not sure exactly how the response is aggregated; probably the last 2 weeks of data are taken from the sidecars and the rest from the bucket.
Similarly, if I query for 2 weeks (14 days) of data:
* now-14d to now: takes 25 seconds to load
* now-30d to now-16d: takes 8 seconds to load
It seems that when the timeframe includes the last 2 weeks of data (the period for which the data is still kept locally by Prometheus) the queries take much longer.
How can I speed up the queries? Am I doing something wrong in our setup? It seems like an obvious use case, but it takes very long to load, to the point where the dashboard is unusable.
Thank you for any potential ideas!
Unless you have configured your Thanos Store Gateways' min and max time to exclude the timeframe covered by the sidecars, you're getting a lot of duplicated data that the Thanos Queriers have to work hard to deduplicate. Hopefully you are also configuring labels correctly for deduplication 😬
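To make the overlap concrete, here is a rough sketch of the arithmetic using the numbers from the question. It is not Thanos code: `overlap_days` and its parameters are hypothetical names, and it assumes (as a simplification) that the bucket already holds blocks right up to "now" and that the sidecars expose the full 2 weeks of local Prometheus data.

```python
def overlap_days(query_from, query_to, sidecar_retention=14.0, store_max_age=0.0):
    """Days of a query range served by BOTH sidecars and the store gateway.

    All arguments are ages in days before now (0 = now, 30 = now-30d).
    sidecar_retention: how far back the sidecars can answer (assumed 14d).
    store_max_age: newest data the store gateway serves, as an age in days
                   (0 means it serves everything up to now; hypothetical knob
                   standing in for a max-time limit on the store gateway).
    """
    # Both components cover ages between store_max_age and sidecar_retention.
    oldest_shared = min(query_from, sidecar_retention)
    newest_shared = max(query_to, store_max_age)
    return max(0.0, oldest_shared - newest_shared)


# Ranges from the question, with no time limit on the store gateway:
print(overlap_days(30, 0))    # 14.0 -> slow query (25 s)
print(overlap_days(45, 15))   # 0.0  -> fast query (10 s)
print(overlap_days(14, 0))    # 14.0 -> slow query (25 s)
print(overlap_days(30, 16))   # 0.0  -> fast query (8 s)

# Same slow range once the store gateway stops at the sidecar boundary:
print(overlap_days(30, 0, store_max_age=14.0))  # 0.0
```

Under these assumptions the slow queries are exactly the ones with a 14-day doubly-served window, which is consistent with the timings reported above. In a real setup you would probably leave a small overlap at the boundary rather than an exact cut, so that no data falls into a gap.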