I'm having a vague problem, but I will try my best to explain what is happening.
From time to time, I'm getting this error:
rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: NO_ERROR
Basically I'm seeing a bunch of http2 connection resets.
When debugging, I'm using the Thanos Querier directly, and I have been able to reproduce this behaviour under a very specific set of conditions:
This only seems to happen against one of our larger Prometheus instances.
We have Prometheus instances in multiple regions and the Thanos Querier in a fixed region.
When the Querier isn't in the same region as the largest Prometheus instance, it's 100% reproducible. When I moved the Querier to the same region as that Prometheus (while still talking to it via an ingress), it was no longer reproducible.
The range of the query seems to have an impact on it. A 2 hour query works fine. A 6 hour query breaks.
I had been staring blindly at this for a while, until it eventually hit me.
Looking more closely at a failing query, the response actually shows more details. It's a 422 Unprocessable Entity response code.
Which makes sense, as the MaxTime is 9223372036854775807, which isn't a legitimate value. However, as you can see, I make the query with an end: 1714564667.884 argument.
This only seems to happen once the response latency of the query passes a certain threshold. If I only query the last 2 hours instead of 6 hours, it's "fine". Likewise, if I move my Querier closer to the Prometheus in question, I can query the last 6 hours just fine.
Somehow, and I don't know why or how, when the response takes too long, my maxtime ends up as an invalid value?
It must be something other than the maxtime of the range request being messed with. That log line is the string representation of the endpoint ref and corresponds to the time range of the store, see