Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for multiple time series #16

Closed
theodor-n opened this issue Nov 27, 2020 · 13 comments
Closed

Support for multiple time series #16

theodor-n opened this issue Nov 27, 2020 · 13 comments
Labels
TODO something to be done when time allows

Comments

@theodor-n
Copy link

Hi,

This is a somewhat mixed post, related to the PieChart issue and to visualisation panels requiring multiple time series.
It also accidentally covers the $__inteval_ms issue.

I took the liberty of experimenting with your plugin a bit, and I got it working.
What I impleneted:

  • check the incoming query text:
    • replace $__interval_ms as necessary
    • extract query configuration options (see below), I chose to have the options as "in-comment" directives rather than having a front-end configuration
  • query configuration options - these are present in comments in the query text, currently there are 2 options:
    • format(timeseries|): specifying format(timeseries) activates the extended behaviour and will format the output as multiple time series data frames
    • fill(prev||) - this provides functionality for filling missing values in case of using time groups, similar to $__timeGroup() for other database datasources

example:
-- this is a SQL comment which includes the configuration options: format(timeseries)
SELECT "time"/1000.0 as "time", temperature, city
from temperature_data
where "time" >= $__from and "time" <= $__to
order by city, "time"

explanation:
time - time column, temperature - float column, city - string column
With the existing behaviour this would return a single data frame containing a table with 3 columns
with the extended behaviour:

  • format(timeseries) is detected to activate the new timeseries behaviour
  • the first time, numeric and string columns returned by the query are identified as significant
  • multiple data frames will be returned, each dataframe will have a name depending on the returned values in the string column (city in the example), and contain time series with two columns "Time" and "Value", the value being the value from the identified numeric colum.
    This should somehow mimic the behaviour of the built-in data sources, still returning the output in data frames and not "in the old way"

In attachment there are 2 sources:
query.go - this is the original one with a few added code lines to activate the new behaviour
extension.go - this is the new code which implements the logic described above
extension.zip

Final words:
What I built is not very elegant (I'm an absolute novice in Go), its not integrated with your solution but basically sits on top of it.
I was a kind of hoping that having a working "prototype" would give you ideas on how to properly implement this (or analogous) functionality in the "official" plugin. Please feel free to use the attached code as you see fit.

Please let me know what you think and also share a preliminary opinion on whether you are willing to implement/integrate similar time series behaviour in the plugin.

Kind Regards,
Theodor

@theodor-n theodor-n changed the title Support for time series output format Support for multiple time series Nov 27, 2020
@JohnnyBee2
Copy link

+1

@fr-ser
Copy link
Owner

fr-ser commented Nov 27, 2020

Hi @theodor-n . Thank you very much for your code contribution. I appreciate it very much!
However, the normal Github way for this would be a Pull Request. This way it is easier to spot changes as well. So please use a Pull Request to submit code changes.

On the other hand I would still like to understand the issue a bit better before diving into code changes.
I see weird behavior in relation to the pie-chart plugin as described here: #15
But I cannot determine yet if it is due to use of dataframes, wrong use of dataframes or something else entirely.

If you would provide some more examples for me to grasp the problems with the dataframe format then just switching to a plain old timeseries might be the way to go here, but before jumping into code solutions I would like to clarify the problem a bit more.

@fr-ser fr-ser added the waiting-for-information This issue is not actionable without more inforamtion label Nov 27, 2020
@theodor-n
Copy link
Author

theodor-n commented Nov 27, 2020

Hi,
Thank you for your answer.
I avoided pull requests as I don't believe what I did is a proper solution, but rather a demo / PoC.
I also aimed at minimal changes to your existing solution, this is why the only change (actually addition only) is just a few lines in the "query" func in query.go

The problem with the charts is not the use of data frames, per se, but in the format of the data.

As in the "old" way, there are (at least) two formats : Table and Time Series. (ref: https://grafana.com/docs/grafana/latest/developers/plugins/data-frames/
Data frames are available in Grafana 7.0+, and replaced the Time series and Table structures with a more generic data structure that can support a wider range of data types.
)

The plugin can currently provide output only in the table format, but not in the time series format. Note that the table format is not wrong, it also has its applications, as does the time series format.

From this point on documentation is very non-specific, as you have already noticed, however, I did find this:
https://github.com/grafana/grafana/blob/master/packages/grafana-data/src/dataframe/processDataFrame.ts
it has methods conveniently named: convertTableToDataFrame, convertTimeSeriesToDataFrame, etc. from where I got the information about how a time series data frame should look like.

I will shortly provide some examples, in a follow-up post.

Edit: another useful source: https://github.com/grafana/grafana/blob/master/pkg/tsdb/sqleng/sql_engine.go
This is "the old way" of transforming data into "table" or "time series" formats - it has nothing to do with data frames, but shows the principles of how one or the other is built.

@theodor-n
Copy link
Author

theodor-n commented Nov 27, 2020

Sample table data:

time, temperature, city
-------------------------
time01, 10, London
time02, 11, London,
time03, 20, New York
time04, 21, New York
time01, 25, Washington
time04, 26, Washington

Response in table format, single data frame, as currently provided by the plugin:

{
  "Frames": [
    {
      "Name": "response",
      "Fields": [
        {
          "Name": "time",
          "Labels": null,
          "Config": null,
		  "vector": [time01,time02,time03,time04,time01,time04]
        },
        {
          "Name": "temperature",
          "Labels": null,
          "Config": null,
		  "vector": [10,11,20,21,25,26]
        },
        {
          "Name": "city",
          "Labels": null,
          "Config": null,
		  "vector": ["London","London","New York","New York","Washington","Washington"]
        }
      ],
      "RefID": "",
      "Meta": null
    }
  ],
  "Error": null
}

Response in time series format, multiple data frames. One data frame for eash series, data is grouped by the "city" column

{
  "Frames": [
    {
      "Name": "London",
      "Fields": [
        {
          "Name": "Time",
          "Labels": null,
          "Config": null,
		  "vector": [time01,time02]
        },
        {
          "Name": "Value",
          "Labels": null,
          "Config": null,
		  "vector": [10,11]
        }
      ],
      "RefID": "",
      "Meta": null
    },
    {
      "Name": "New York",
      "Fields": [
        {
          "Name": "Time",
          "Labels": null,
          "Config": null,
		  "vector": [time03,time04]
        },
        {
          "Name": "Value",
          "Labels": null,
          "Config": null,
		  "vector": [20,21]
        }
      ],
      "RefID": "",
      "Meta": null
    },
    {
      "Name": "Washington",
      "Fields": [
        {
          "Name": "Time",
          "Labels": null,
          "Config": null,
		  "vector": [time01,time04]
        },
        {
          "Name": "Value",
          "Labels": null,
          "Config": null,
		  "vector": [25,26]
        }
      ],
      "RefID": "",
      "Meta": null
    }
  ],
  "Error": null
}

Note the fixed "magic" column names "Time" and "Value" (taken from here: https://github.com/grafana/grafana/blob/master/packages/grafana-data/src/dataframe/processDataFrame.ts, TIME_SERIES_TIME_FIELD_NAME and TIME_SERIES_VALUE_FIELD_NAME correspondingly)

Again, the table format is not wrong and also has its applications. So ideally the plugin should be able to provide data in one or the other format, similar to the built-in database datasource plugins.

@theodor-n
Copy link
Author

In addition, here is a sample database and a dashboard:
database&dashboard.zip
Graphs

@fr-ser
Copy link
Owner

fr-ser commented Nov 29, 2020

I read around in the documentation and backend code regarding data frames. https://grafana.com/docs/grafana/latest/developers/plugins/data-frames/#data-frames-as-time-series

What I previously returned was the "long format" of time series'. There is, however, a method to convert the "long format" to the "wide format", which is supposedly better supported in the frontend right now.
Feel free to have a look at the code changes if you are interested (#18).

Please also check out the pre-release I created. It looks fine on the regular time series use cases, but the pie-chart plugin still seems to have trouble https://github.com/fr-ser/grafana-sqlite-datasource/releases/tag/v0.2.4-rc1

@theodor-n
Copy link
Author

theodor-n commented Nov 29, 2020

Hi,
I've checked the 0.2.4-rc1 version and indeed it works with the graph visualization panel, which seems to support wide format of a single time series.
Although the series name is the original column name + the value as a tag/label (e.g. "temperature London" instead of just "London"), but I guess this could be solved with renaming in Grafana
Graph

Even if this solves a private case with the "Graph" panel (single time series wide format with labels), it does not solve the general case where multiple time series are expected (e.g. the Pie Chart).

E.g. consider the data arriving as a result of the query:
Data formatted as table: (note that this is still a time series, as it has a time column)
Table

Data formatted as a single time series, wide format: (0.2.4-rc1):
SingleTimeSeriesWideFormat

Data formatted as multiple time series, "narrow" format:
MultipleTimeSeriesNarrow0
MultipleTimeSeriesNarrow1

In order to return multiple time series, it would be needed to return multiple data frames (i.e. split the data into separate frames, one for each value of the string column(s).

@fr-ser
Copy link
Owner

fr-ser commented Nov 29, 2020

Is there another example except for the pie chart where the wide data frame is not supported?
Since there was a method to convert the long format to a wide data frame, but nothing (as far as I saw) to convert to a narrow data frame I am a bit hesitant to implement it myself.
The problem with an own implementation lies in the details (by which columns to split under which conditions, etc.).
I would like to be certain that the current solution is really insufficient for more than one use case before venturing into this direction

@theodor-n
Copy link
Author

I don't know for other examples - I guess anything which expects multiple time series.

About which columns and which conditions, there is this:
https://github.com/grafana/grafana/blob/master/pkg/tsdb/sqleng/sql_engine.go (see transformToTimeSeries)
which shows how multiple time series are created for the built-in database data sources.

Basically

  1. a column named "time" is required and it must be of the time type
  2. splitting/grouping is done by the values of the first text column, or the column named "metric". Multiple text columns are not allowed, as all remaining columns will be considered "value" columns, see below
  3. All remaining columns, apart from "time" (1) and "metric" (2) will be considered numeric/float value columns, and must therefore be numeric (otherwise an error is raised)
  • if there is only one remaining column, each time series will be named as the value of the metric column (2)
  • if there are multiple numeric columns, then each series is named "<value of text/metric column>" + " " + "name of numeric column"
  • the number of produced series is therefore equal to the number of distinct values in the metric/text column, multiplied by the number of numeric columns.

I hope this helps, let me know if you have further doubts.

@fr-ser
Copy link
Owner

fr-ser commented Dec 6, 2020

I tried to go with the simplest possible approach and leveraging whatever the Grafana library already provides.
The result is https://github.com/fr-ser/grafana-sqlite-datasource/releases/tag/v0.2.4-rc2

Have a look and check if this makes sense. If I find no bugs in the coming weeks this is what I would release as the initial time series support (which still feels weird, but at least it works 🤷 )

@JohnnyBee2
Copy link

Hi,
I'm running tests.
Seems to be working fine.
Thanks for the good job.
ps.
Is the $__interval_ms variable implemented?

@fr-ser
Copy link
Owner

fr-ser commented Dec 11, 2020

ps.
Is the $__interval_ms variable implemented?

No. I don't like mixing too many things together. I might have decided to skip the time series feature or do it in a year only.
For the variable replacement, there is already an issue since quite a while ago.

I'd rather have the related discussion there: #2 (comment)

@fr-ser fr-ser added TODO something to be done when time allows and removed waiting-for-information This issue is not actionable without more inforamtion labels Dec 11, 2020
@fr-ser
Copy link
Owner

fr-ser commented Dec 13, 2020

I created the new release and with this would consider this issue closed.
If there are problems with the new release let's create a new issue 👌
https://github.com/fr-ser/grafana-sqlite-datasource/releases/tag/v0.2.4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
TODO something to be done when time allows
Projects
None yet
Development

No branches or pull requests

3 participants