query results may incorrectly overlap time series #748
Comments
@belm0, thanks for the detailed bug report and the proposed solution! The solution looks good. We'll try implementing it and see how it works.
Thank you. For discrepancies like this, it would be nice for VM to have unit tests against the output of the Prometheus query library.
+1, I've run into the same issue.
The other (related) issue is that range queries do not appear to respect the |
Could you file a separate issue regarding this bug?
@valyala, would you consider prioritizing this? It's the main regression we have remaining vs. Prometheus.
This should prevent double counting of a time series at the moment its label changes. The most common case is in K8S, which changes the pod uid label with each new deployment. Updates #748
@belm0 , could you verify whether the issue is fixed in the following commits:
Note that the response cache must be reset before testing the bugfix in order to remove previously cached incorrect results. See https://victoriametrics.github.io/#backfilling for more info on how to reset the response cache.
The bugfix has been included in VictoriaMetrics v1.44.0. @belm0, @propertone, could you verify whether it works as expected on your workloads?
It seems to be resolved in v1.44, thank you! I added a review comment to the commit.
@belm0, could you share the original query used for building the graph above? It would be great if you could simplify and narrow down the query with more specific label filters, so that it is based on a single source time series and still exposes the issue.
…ailing data points in time series Now trailing data points are additionally dropped for time series with a single raw sample Updates #748
FYI, this introduces regression in queries to |
FYI, VictoriaMetrics v1.45.0 contains additional enhancements for this issue.
The problem near the end of the query range seems to be resolved in 1.45.0.
FYI, VictoriaMetrics gains support for Prometheus staleness markers in the next release, as mentioned at #1526. This should fix the original issue with overlapped time series in a more generic way. The fix works only for queries over newly ingested data. It doesn't work for already existing data.
FYI, VictoriaMetrics and vmagent gained support for Prometheus staleness markers starting from the release v1.64.0.
Describe the bug
If the query step is much greater than the data interval (e.g. > 4x), and two series are adjacent in time but not overlapping, the query output or aggregation may incorrectly overlap the time series.
A typical example is `build_info{version="..."}`. A new version is deployed to instances, which stop updating `build_info{version="1.0"}` and start updating `build_info{version="1.1"}`. Due to this bug, at low zoom resolution there will be a point in time where `count(build_info{instance="..."})` returns 2, even though there is no overlap in the raw data points.
The equivalent query on Prometheus (Thanos) does not exhibit the problem.
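To make the symptom concrete, here is a toy model of the effect in Python. It is only a sketch under stated assumptions: it is not VictoriaMetrics' actual evaluation code, and the 15-second scrape interval, the timestamps, and the helper names `count_series` and `query_range` are all hypothetical. The assumption is that a series counts as present at an evaluation point `t` if it has any raw sample in the window `(t - step, t]`.

```python
# Toy model (an assumption, not VictoriaMetrics' real algorithm): a series is
# counted at evaluation time t if any raw sample falls in (t - step, t].

SCRAPE_INTERVAL = 15  # seconds, hypothetical

# version="1.0" is last written at t=600; version="1.1" is first written at
# t=615, so the raw data points never overlap.
series = {
    'build_info{version="1.0"}': list(range(0, 601, SCRAPE_INTERVAL)),
    'build_info{version="1.1"}': list(range(615, 1201, SCRAPE_INTERVAL)),
}


def count_series(series, t, step):
    """Count series with at least one raw sample in the window (t - step, t]."""
    return sum(
        any(t - step < ts <= t for ts in timestamps)
        for timestamps in series.values()
    )


def query_range(series, start, end, step):
    """Evaluate the count at start, start + step, ... up to and including end."""
    return [count_series(series, t, step) for t in range(start, end + 1, step)]


fine = query_range(series, 240, 1200, 60)     # every point is 1
coarse = query_range(series, 240, 1200, 240)  # [1, 1, 2, 1, 1]
print(fine)
print(coarse)
```

With step 60 every point is 1, while with step 240 the window ending at t=720 contains the tail of `version="1.0"` and the head of `version="1.1"`, so the count is 2 at that point.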
To Reproduce
Raw datapoints:
query, step 60 (no overlap)
query, step 240
Expected behavior
If two series are not overlapping in time by raw data, the query should not treat them as overlapping when evaluating one interval in the output.
An example implementation would be to treat "start" and "end" points of a series differently when quantizing raw data points into time buckets: include the series in the bucket if 1) raw points are continuous in the bucket range, or 2) the series starts in the bucket range. Therefore, if a series ends in a bucket range, it is not included. It's similar to the concept of an open-ended range.
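A minimal Python sketch of the rule proposed above, again under assumptions: the sample data, the helper name `bucket_members`, and the `max_gap` parameter (how long a series may be silent before it is considered ended) are hypothetical and not part of VictoriaMetrics. A series that both starts and ends inside one bucket is excluded here, which is one possible reading of the rule.

```python
# Sketch of the proposed bucketing rule (hypothetical names and data).

SCRAPE_INTERVAL = 15  # seconds, hypothetical

series = {
    'build_info{version="1.0"}': list(range(0, 601, SCRAPE_INTERVAL)),
    'build_info{version="1.1"}': list(range(615, 1201, SCRAPE_INTERVAL)),
}


def bucket_members(series, t, step, max_gap):
    """Return the names of series counted in the bucket (t - step, t]."""
    members = []
    for name, timestamps in series.items():
        in_bucket = [ts for ts in timestamps if t - step < ts <= t]
        if not in_bucket:
            continue  # no raw points in this bucket at all
        # The series "ends in the bucket" if its newest sample in the bucket
        # is more than max_gap older than the bucket end. A series that starts
        # in the bucket, or runs through it, still has a recent sample.
        ends_in_bucket = t - max(in_bucket) > max_gap
        if not ends_in_bucket:
            members.append(name)
    return members


max_gap = 2 * SCRAPE_INTERVAL
counts = [
    len(bucket_members(series, t, 240, max_gap))
    for t in range(240, 1201, 240)
]
print(counts)
```

On this toy data the bucket ending at t=720 contains only `version="1.1"`, because `version="1.0"` went silent at t=600, so the step-240 count stays at 1 throughout.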
Screenshots
Example graph showing artificial spikes in `count(build_info)` when there is a deployment causing the version label to change:
![Screen Shot 2020-09-05 at 12 45 38 AM](https://user-images.githubusercontent.com/1708631/92366547-46fe0c00-f130-11ea-8473-be38c624a502.png)
Version
victoria-metrics-20200815-125320-tags-v1.40.0-0-ged00eb3f3