Thanos, Prometheus and Golang version used: 0.35.0
Object Storage Provider: Azure Storage Account
What happened:
After enabling --query.mode=distributed, my querier gets a lot of panics. Removing --query.mode=distributed stops all panics.
What you expected to happen:
No panics
How to reproduce it (as minimally and precisely as possible):
At the moment, I'm unable to provide a minimal reproducible environment. However, according to the ruler logs, all queries of the form absent(up{job="kube-proxy"} == 1) (the job label can have any value) seem to be affected.
We are using stateless rulers and Thanos Receive, no sidecars.
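In case it helps with reproduction: the ruler only sends instant queries to the querier's /api/v1/query HTTP endpoint, so a plain POST like the rough sketch below (hostname and query copied from the ruler logs further down; no ruler involved) should hit the same code path:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

func main() {
	// Instant query taken from the ruler logs; any job label value triggers it.
	form := url.Values{}
	form.Set("query", `absent(up{job="kube-proxy"} == 1)`)

	// Querier address as it appears in the ruler logs; adjust for your environment.
	endpoint := "http://opsstack-thanos-query.opsstack.svc.cluster.local:10902/api/v1/query"

	resp, err := http.Post(endpoint, "application/x-www-form-urlencoded", strings.NewReader(form.Encode()))
	if err != nil {
		// With --query.mode=distributed the querier panics and drops the
		// connection, so this surfaces as an EOF rather than an HTTP error.
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```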
Full logs to relevant components:

Ruler Logs (just a few, they are repeating):
{"caller":"rule.go:968","component":"rules","err":"rpc error: code = Internal desc = runtime error: index out of range [0] with length 0","level":"error","query":"absent(up{job=\"kube-proxy\"} == 1)","ts":"2024-05-02T13:03:09.954559928Z"}
{"caller":"rule.go:938","component":"rules","err":"read query instant response: perform POST request against http://opsstack-thanos-query.opsstack.svc.cluster.local:10902/api/v1/query: Post \"http://opsstack-thanos-query.opsstack.svc.cluster.local:10902/api/v1/query\": EOF","level":"error","query":"absent(up{job=\"apiserver\"} == 1)","ts":"2024-05-02T12:59:26.372712478Z"}
Querier Logs:
https://gist.github.com/jkroepke/9fc58319bf819866138a8dae4f1c8d92

The issue here was that the querier was not configured to point at query APIs but at store APIs (we should probably guard against that better); this leads the promql-engine to distribute the query across zero remote engines, which exposes a bug where we don't guard against that case.
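For anyone debugging a similar panic: the crash comes from indexing into an empty list of remote engines. Below is a minimal sketch of the missing guard, with illustrative type and function names rather than the actual thanos-io/promql-engine code:

```go
package distributed

import "errors"

// RemoteEngine stands in for a remote query engine that a distributed
// querier would fan out to (illustrative, not the real promql-engine type).
type RemoteEngine interface {
	MinT() int64
}

// ErrNoRemoteEngines makes the misconfiguration visible instead of panicking.
var ErrNoRemoteEngines = errors.New(
	"distributed query mode: no remote query engines discovered; " +
		"endpoints likely expose only the Store API, not the Query API")

// pickFirstEngine illustrates the unguarded access: with zero remote engines,
// engines[0] panics with "index out of range [0] with length 0".
func pickFirstEngine(engines []RemoteEngine) (RemoteEngine, error) {
	if len(engines) == 0 {
		return nil, ErrNoRemoteEngines
	}
	return engines[0], nil
}
```

Until such a guard is in place, the practical workaround is to point the distributed querier at query APIs (other queriers) rather than store APIs, or to drop --query.mode=distributed, as noted above.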