
Grafana Table doesn't merge query results with VictoriaMetrics datasource #720

Closed
Keiske opened this issue Aug 25, 2020 · 20 comments
Labels: bug (Something isn't working), need more info

Comments

@Keiske

Keiske commented Aug 25, 2020

Describe the bug
Switching the datasource from Prometheus to VictoriaMetrics in Grafana table panels breaks row merging for the same data. We are trying to use VictoriaMetrics as long-term historical metrics storage for our in-cluster Kubernetes Prometheus. We have configured Prometheus to send data to VictoriaMetrics via the remote_write API.
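For reference, a minimal Prometheus remote_write stanza for single-node VictoriaMetrics looks roughly like this (the service name and port are assumptions for a helm-chart install; adjust for your setup):

remote_write:
  # Single-node VictoriaMetrics accepts Prometheus remote_write on /api/v1/write.
  - url: "http://victoria-metrics-single-server:8428/api/v1/write"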

To Reproduce

  1. Install the VictoriaMetrics single-node helm chart.
  2. Add remote_write to VictoriaMetrics in the Prometheus config.
  3. Change the datasource from Prometheus to VictoriaMetrics in Grafana dashboards that have tables with multiple queries. No other Grafana settings are changed.
  4. Query results are not merged anymore. See screenshots.
  5. Change the datasource back to Prometheus - the tables look as they should.

Expected behavior
Results of multiple queries with the same labels should be merged into rows in the Grafana table panel.

Screenshots
Prometheus datasource table result:
[screenshot: prom_grafana_merge2]

VictoriaMetrics datasource table result with same queries and settings:
[screenshot: vm_grafana_merge1]

Version
The line returned when passing the --version command line flag to the binary:

$ ./victoria-metrics-prod --version
victoria-metrics-20200815-125320-tags-v1.40.0-0-ged00eb3f3
@hagen1778
Collaborator

Hi @Keiske! Could you pls check the actual response in both cases and compare? Would be nice to post it here if possible.

@Keiske
Author

Keiske commented Aug 27, 2020

Hi @Keiske! Could you pls check the actual response in both cases and compare? Would be nice to post it here if possible.

Sure, here it is. It looks very similar, but with the order of metrics switched in the results.

First value column query result:
Prometheus:
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"pod":"ag-postgres-portal-0"},"value":[1598532794,"0.25"]},{"metric":{"pod":"ag-postgres-user-0"},"value":[1598532794,"0.25"]}]}}

VictoriaMetrics:
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"pod":"ag-postgres-user-0"},"value":[1598532794,"0.25"]},{"metric":{"pod":"ag-postgres-portal-0"},"value":[1598532794,"0.25"]}]}}

Second value column query results:
Prometheus:
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"pod":"ag-postgres-portal-0"},"value":[1598532794,"0.0908893273381095"]},{"metric":{"pod":"ag-postgres-user-0"},"value":[1598532794,"0.0875724036524064"]}]}}

VictoriaMetrics:
{"status":"success","data":{"resultType":"vector","result":[{"metric":{"pod":"ag-postgres-user-0"},"value":[1598532794,"0.08627032711112406"]},{"metric":{"pod":"ag-postgres-portal-0"},"value":[1598532794,"0.09153162373333341"]}]}}

[screenshot: vmag]

@hagen1778
Collaborator

Thanks! Weird, the content looks identical except for the order. But the order is consistent across both VM requests, so that shouldn't be the cause. Have you tried building any other tables based on the VM datasource? I wonder if it is a Grafana bug...

@valyala
Collaborator

valyala commented Sep 1, 2020

@Keiske, try wrapping the queries into the sort() function. This should guarantee a consistent order of the returned metrics.

@balabalazhoucj

Hi! I have the same problem. My environment uses v1.40.0-cluster.

[screenshot: WX20200909-105454]

But when I change the Grafana time range (from now-5m to now-1m), it is OK; my Prometheus scrape_interval is 1m.

[screenshot: WX20200909-105416]

/ # ./vminsert-prod -version
vminsert-20200815-132700-tags-v1.40.0-cluster-0-gd9f7ea1c6

When I use v1.37.2-cluster, it is OK.

@hagen1778
Collaborator

Hi @balabalazhoucj! Have you tried enabling the instant option for your queries, like in the screenshot from @Keiske?

@Muxa1L

Muxa1L commented Sep 17, 2020

@hagen1778 I think it's because when "instant" is used, the returned timestamp is now - 30s (or the current value of the search.latencyOffset param set for vmselect) instead of the actual timestamp of the metric. And because the queries may complete at different times, these timestamps may differ by 1-2-10-inf ms, and that breaks the table.
Examples.
Not OK (3ms offset for metrics scraped at the same time):
[screenshot]
OK (after 10 refreshes):
[screenshot]
Nothing was changed in the queries or dashboards, only a few refreshes.
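For illustration only, a minimal Go sketch of the effect described above (the 30s offset and the 3ms gap are assumptions, not values taken from vmselect): without an explicit time arg, each query is evaluated at "now - latencyOffset", so the evaluation timestamps inherit the few milliseconds between the two requests and Grafana cannot merge the rows.

package main

import (
	"fmt"
	"time"
)

func main() {
	// Hypothetical offset mirroring the default -search.latencyOffset of 30s.
	latencyOffset := 30 * time.Second

	// Grafana fires both table queries on the same dashboard refresh,
	// but they reach vmselect a few milliseconds apart.
	queryA := time.Now()
	queryB := queryA.Add(3 * time.Millisecond)

	// Without an explicit `time` arg, each query is evaluated at "now - offset",
	// so the result timestamps inherit the millisecond difference.
	tsA := queryA.Add(-latencyOffset).UnixMilli()
	tsB := queryB.Add(-latencyOffset).UnixMilli()

	fmt.Println(tsA, tsB, "equal:", tsA == tsB)
	// Grafana groups table rows by timestamp, so values whose timestamps
	// differ by a few milliseconds land in separate rows instead of merging.
}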

@Keiske
Author

Keiske commented Sep 17, 2020

@valyala Sorting results didn't help.

@hagen1778 Yes, as @Muxa1L said, our issue looks just like this. Refreshing the dashboard in the browser sometimes makes the table merge correctly - about 1 in 10 page refreshes. But how can we make it always merge the results in the table correctly, just like the Prometheus datasource does?

@Muxa1L

Muxa1L commented Sep 17, 2020

By the way, when you disable "instant", thus running a ranged query, the last results of the query are correct.

@hagen1778
Collaborator

@Muxa1L Good catch! Instant queries are sent to the /query handler, and the response contains timestamps with millisecond precision. Regular queries are sent to /query_range, and the response contains timestamps with second precision. Should VM round timestamps up to seconds for instant queries, @valyala?

@Muxa1L

Muxa1L commented Sep 18, 2020

@hagen1778 I think it would be better to return the timestamp of the last metric value.
Example: a single metric and the result of an instant query (to path /query):
{"result":[{"metric":{"__name__":"go_cpu_count","instance":"self","job":"victoria-metrics"},"value":[1600421855.726,"3"]}]}
1600421855.726 is approximately 2020-09-18 12:37:35, i.e. now - 30s,
and the result of a ranged query (for the last minute):
{"result":[{"metric":{"__name__":"go_cpu_count","instance":"self","job":"victoria-metrics"},"values":[[1600421820,"3"],[1600421835,"3"],[1600421850,"3"],[1600421865,"3"],[1600421880,"3"]]}]}
1600421880 is 2020-09-18 12:38:00, which is the correct time of the last scrape.

Also, even if VM rounds timestamps up to seconds, it will still be possible to get different timestamps that break the table (for example, when two queries evaluate on opposite sides of a second boundary).

@starsliao

@hagen1778
I also have this problem. I found that it has existed since v1.39.0, but everything is normal in v1.38.1. Because the Grafana table cannot be used, I can only use v1.38.1 for now. I hope this problem can be solved.

  • v1.39.0
    [screenshot]

  • v1.38.1
    [screenshot]

@Muxa1L

Muxa1L commented Sep 21, 2020

@hagen1778 Never mind my previous comment. Returning timestamps with second precision will be enough. Prometheus also returns second precision, and VictoriaMetrics returned timestamps with second precision before v1.39.0, as @starsliao noticed.

valyala added a commit that referenced this issue Sep 21, 2020
valyala added a commit that referenced this issue Sep 21, 2020
@valyala
Collaborator

valyala commented Sep 21, 2020

The issue must be fixed in the following commits:

  • Single-node VictoriaMetrics - 2eb72e0
  • Cluster VictoriaMetrics - 07c6226

The bugfix rounds the default time value to seconds when the query to /api/v1/query doesn't contain the time query arg. This is a workaround, which reduces the probability of the original issue. The proper fix should be applied on the Grafana side - it must pass the time query arg with each query to /api/v1/query.
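A rough sketch of the idea behind that workaround (not the actual VictoriaMetrics code; the helper name is made up): when the request carries no time arg, the default evaluation timestamp is truncated to whole seconds, so queries fired within the same second evaluate at the same instant.

// defaultEvalTimestamp is a hypothetical helper illustrating the workaround.
// nowMs is the current time in milliseconds; when the request has no `time`
// query arg, the evaluation timestamp is rounded down to whole seconds,
// so queries fired within the same second share one timestamp and Grafana
// can merge their table rows.
func defaultEvalTimestamp(nowMs int64) int64 {
	return nowMs - nowMs%1000
}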

valyala added the "bug" label Sep 21, 2020
@Muxa1L

Muxa1L commented Sep 22, 2020

@valyala Grafana does pass time with the queries, but it does not seem to be taken into account anywhere.
[screenshot]

I think this part overwrites the start value:

if !searchutils.GetBool(r, "nocache") && ct-start < queryOffset {
    // Adjust start time only if `nocache` arg isn't set.
    // See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
    start = ct - queryOffset
}

So setting -search.latencyOffset to something small, like 1ms, helps.
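To spell out the arithmetic of that branch (a simplified sketch, not the real handler; the function name is made up and all values are millisecond timestamps): with the default 30s offset, any start that Grafana passes within the last 30 seconds satisfies ct-start < queryOffset and is replaced by ct - queryOffset, which differs between a panel's queries by a few milliseconds; with a 1ms offset the condition almost never holds, so the timestamp Grafana passed survives.

// clampStart mimics the branch quoted above (simplified; `nocache` ignored).
// start is the timestamp Grafana passed, ct is the current time on vmselect,
// and queryOffset is the value of -search.latencyOffset, all in milliseconds.
func clampStart(start, ct, queryOffset int64) int64 {
	if ct-start < queryOffset {
		// Grafana's timestamp is discarded and replaced with "now - offset",
		// which differs per query by the milliseconds between the requests.
		return ct - queryOffset
	}
	return start
}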

valyala added a commit that referenced this issue Sep 23, 2020
valyala added a commit that referenced this issue Sep 23, 2020
@valyala
Collaborator

valyala commented Sep 23, 2020

@Muxa1L, thanks for spotting this! It must be fixed in the following commits:

  • Single-node VictoriaMetrics: 3ba5070
  • Cluster VictoriaMetrics: 1fce795

Unfortunately these commits weren't included in v1.41.1, but they will be included in the next release.

@Muxa1L

Muxa1L commented Sep 23, 2020

@valyala, great! Now it works correctly, thanks for the fixes!

@boazjohn

Is it advised to roll back to v1.38.1? Is this fix going to be released soon?
We are using v1.41.1 and facing the issue.

@valyala
Collaborator

valyala commented Sep 29, 2020

Is it advised to roll back to v1.38.1?

Unfortunately it is impossible to downgrade from v1.41.* to older releases due to an on-disk data format change. See the release notes for v1.41.0 for details. So it is better to wait for the next release, or to build VictoriaMetrics from sources according to the docs.

Is this fix going to be released soon?

The fix will be included in the upcoming release, which is going to be published in the next couple of days.

@valyala
Collaborator

valyala commented Sep 30, 2020

The bugfix is available starting from v1.42.0. Closing the bug as fixed.
