Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres stacked bar graph panel with $__timeGroup has incorrect graph data #10073

Closed
rezolutiontech opened this issue Dec 4, 2017 · 11 comments

Comments

@rezolutiontech
Copy link

Grafana version: 4.6.2
OS: Linux (Amazon Linux AMI)

I'm trying to replicate a graph I use from a influx datasource regularly using a postgres datasource which is a stacked bar graph with a 10m grouping.

When I view this graph however I get data outside the 10m grouping referenced in bars they shouldn't be in which results in floating values. I'll attach a screenshot in a follow up to this report.

Also if you hover over the graph data points will show values for all for metrics in each group.

Example of my postgres query/data is below:

SELECT (extract(epoch from "purchase_datetime")/extract(epoch from '10m'::interval))::int*extract(epoch from '10m'::interval) as time, company as metric, sum(paid_price) as value FROM purchases WHERE extract(epoch from purchase_datetime) BETWEEN 1512379200 AND 1512381417 GROUP BY 1,2 ORDER BY 1,2 ASC;
    time    | metric | value 
------------+--------+-------
 1512379200 | CO1    |   100
 1512379200 | CO2    |    20
 1512379800 | CO3    |    47
 1512380400 | CO3    |  62.5
 1512381000 | CO1    |    93
 1512381600 | CO3    |  30.5
 1512381600 | CO4    |    37
(4 rows)

My panel JSON is

{
  "id": 28,
  "title": "Panel Title",
  "span": 12,
  "type": "graph",
  "datasource": "Warehouse Reporting",
  "targets": [
    {
      "policy": "default",
      "dsType": "influxdb",
      "resultFormat": "time_series",
      "orderByTime": "ASC",
      "tags": [],
      "groupBy": [
        {
          "type": "time",
          "params": [
            "$__interval"
          ]
        },
        {
          "type": "fill",
          "params": [
            "null"
          ]
        }
      ],
      "select": [
        [
          {
            "type": "field",
            "params": [
              "value"
            ]
          },
          {
            "type": "mean",
            "params": []
          }
        ]
      ],
      "refId": "A",
      "format": "time_series",
      "alias": "",
      "rawSql": "SELECT\n  $__timeGroup(purchase_datetime,'10m') as time,\n  company as metric,\n  sum(paid_price) as value\nFROM\n  purchases\nWHERE\n  $__timeFilter(purchase_datetime)\nGROUP BY 1,2\nORDER BY 1,2 ASC\n"
    }
  ],
  "renderer": "flot",
  "yaxes": [
    {
      "label": null,
      "show": true,
      "logBase": 1,
      "min": null,
      "max": null,
      "format": "currencyUSD"
    },
    {
      "label": null,
      "show": true,
      "logBase": 1,
      "min": null,
      "max": null,
      "format": "short"
    }
  ],
  "xaxis": {
    "show": true,
    "mode": "time",
    "name": null,
    "values": [],
    "buckets": null
  },
  "lines": false,
  "fill": 1,
  "linewidth": 1,
  "dashes": false,
  "dashLength": 10,
  "spaceLength": 10,
  "points": false,
  "pointradius": 5,
  "bars": true,
  "stack": true,
  "percentage": false,
  "legend": {
    "show": true,
    "values": true,
    "min": false,
    "max": false,
    "current": false,
    "total": true,
    "avg": false,
    "hideEmpty": true,
    "hideZero": true,
    "alignAsTable": false
  },
  "nullPointMode": "null as zero",
  "steppedLine": false,
  "tooltip": {
    "value_type": "individual",
    "shared": true,
    "sort": 0
  },
  "timeFrom": null,
  "timeShift": null,
  "aliasColors": {},
  "seriesOverrides": [],
  "thresholds": []
}
@rezolutiontech
Copy link
Author

Example panel

dataissue-3

@svenklemm
Copy link
Contributor

svenklemm commented Dec 4, 2017

The problem is you are not resetting the series. For every interval where a series has no data you need to insert a row for that series with NULL or 0.

Your query should produce something like the following query

SELECT * FROM (
SELECT 1512379200 as time, 'CO1' as metric, 100.0 UNION 
SELECT 1512379800 as time, 'CO1' as metric, NULL UNION 
SELECT 1512380400 as time, 'CO1' as metric, NULL UNION
SELECT 1512381000 as time, 'CO1' as metric, 93 UNION

SELECT 1512379200 as time, 'CO2' as metric, 20 UNION
SELECT 1512379800 as time, 'CO2' as metric, NULL UNION 
SELECT 1512380400 as time, 'CO2' as metric, NULL UNION
SELECT 1512381000 as time, 'CO2' as metric, NULL UNION

SELECT 1512379200 as time, 'CO3' as metric, NULL UNION 
SELECT 1512379800 as time, 'CO3' as metric, 47 UNION 
SELECT 1512380400 as time, 'CO3' as metric, 62.5 UNION
SELECT 1512381000 as time, 'CO3' as metric, 30.5 UNION

SELECT 1512379200 as time, 'CO4' as metric, NULL UNION 
SELECT 1512379800 as time, 'CO4' as metric, NULL UNION 
SELECT 1512380400 as time, 'CO4' as metric, NULL UNION
SELECT 1512381000 as time, 'CO4' as metric, 37
) as purchases ORDER BY 1,2

This is certainly not optimal and this will make the query you have to write more complicated cause you have to do multiple joins to get that result.

Its probably much better to generate those NULLs in the grafana backend so I'm considering implementing it there. I'm not yet sure though how to best control this behaviour.

@rezolutiontech
Copy link
Author

rezolutiontech commented Dec 4, 2017

Right. I thought it might be something like that. I was trying to do something using coalesce to make the equivalent happen but haven't stumbled on the right solution yet.

I agree that having grafana fill in the missing values would be ideal to keep the query complexity down.

The equivalent we use for influxdb with a min time interval options value of ">10m" looks like

SELECT sum("usdPrice") as price FROM "purchases" WHERE usdPrice > 0 AND $timeFilter GROUP BY company, time($interval) fill(0)
which is nice and simple but sadly storing a copy of the data from postgres in influx isn't an option for me currently.

Thanks for the quick response

@svenklemm
Copy link
Contributor

svenklemm commented Dec 4, 2017

This should produce the results you want/need:

SELECT 
  base.time,
  base.company as metric,
  paid_price 
FROM 
  (
    SELECT 
      time,
      company 
    FROM generate_series(($__unixEpochFrom()/600)::int*600,($__unixEpochTo()/600)::int*600,600) as times(time), 
    (SELECT distinct company from purchases) as companies
  ) as base 
  LEFT OUTER JOIN (
    SELECT
      $__timeGroup(purchase_datetime,'10m') as time,
      company,
      sum(paid_price) as paid_price
    FROM purchases
    WHERE
      $__timeFilter(purchase_datetime)
      GROUP BY 1,2
  ) as p ON (p.company=base.company and p.time = base.time) ORDER BY 1,2;

@rezolutiontech
Copy link
Author

Wow! This was unexpected and awesome. Thank you! I completely forgot generate_series existed.

@daniellee
Copy link
Contributor

Great work @svenklemm Is this something we want to add to the pg data source? Having a fill macro or option would be a good addition I think. The query above would be too complicated for some people. We could add a new macro if we to preserve backwards compatability.

@svenklemm
Copy link
Contributor

I was thinking about either adding a new macro $__timeGroupFill(column,'5m',NULL) or adding a $__fill(NULL) macro which would be used in conjunction with the $__timeGroup macro. $__timeGroupFill seems like the cleaner way to implement this as you need the interval length for the filling but I'm open for other ideas. Not sure we need the fill value configurable but might aswell.

@svenklemm
Copy link
Contributor

@daniellee Alternatively I could add an optional 3rd parameter to $__timeGroup() which would be the fill value. Any preferences?
I dont like the $__fill macro variant because it would just be a dangling macro with no real relation to any part of the sql query.

@torkelo torkelo closed this as completed Dec 6, 2017
@daniellee
Copy link
Contributor

@torkelo as you closed this - should I open a new issue for the fill macro functionality?

@daniellee
Copy link
Contributor

Should probably add the fill function for MySQL at the same time. Ref #9487

@torkelo
Copy link
Member

torkelo commented Dec 6, 2017

I closed it as it seemed the problem was more a usage issue and not a bug / feat req

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants