Time Series Database #173

Open
lightsofapollo opened this Issue Jul 3, 2017 · 8 comments

7 participants
@lightsofapollo
Contributor

lightsofapollo commented Jul 3, 2017

As a user I want to see a graph of my sensor data over time so that I can see trends in the data.

@lightsofapollo lightsofapollo added this to the Next (date TBD) milestone Jul 3, 2017

@benfrancis benfrancis added the epic label Jul 5, 2017

@benfrancis benfrancis removed this from the 0.2 milestone Nov 1, 2017

@benfrancis benfrancis changed the title from Pluggable Storage Engine to Time Series Database Feb 23, 2018

@benfrancis

Contributor

benfrancis commented Feb 23, 2018

Example use cases:

  • As a user I want to see a graph of the temperature of my home over time
  • As a user I want to find out when my front door was last opened so I know whether my kids got home from school
  • As a user I want to see a graph of the values from my Internet connected weather station
@hobinjk

Contributor

hobinjk commented Feb 23, 2018

These are good examples of use cases, thanks!

The most important facet of a time-series database is that it is a mechanism for storing and retrieving data over time. That might sound trivial, but I think the main thing to keep in mind is that, in addition to widgets displaying simple queries of the data, there's also the opportunity to do learning, prediction, and other complex operations.

@Timmeey

Timmeey commented Apr 24, 2018

Maybe we could get some inspiration from Munin and RRDTool, which it uses as its time-series database.

A very easy first approach, to at least store some data, would be to build a Munin plugin that queries all sensors. This would be fairly easy, but the data would be stored and visualized outside the gateway UI rather than integrated into it.

@rzr

Contributor

rzr commented Aug 4, 2018

Any database backend to suggest? Should it be on-device or remote?

Is Redis suited for this?

@webwurst

webwurst commented Aug 4, 2018

@benfrancis

Contributor

benfrancis commented Aug 9, 2018

I'd suggest an on-gateway time series database should probably use SQLite like everything else, unless there's a really good reason for adding a second database back end or switching to a different one.

@mrstegeman mrstegeman referenced this issue Aug 30, 2018

Open

Logging #1299

@hongquan

hongquan commented Sep 7, 2018

If you want a time-series database that stores data for all time, InfluxDB or TimescaleDB is the way to go.
RRDTool deletes data after a set duration, so it cannot be used for storing all-time data.

For time-series data, you will very often query a batch of the most recent data. A relational database like SQLite will become slow once you have stored enough data. Even if you build an index for it, you still face a problem:

  • If you index all points, it will cost a lot of storage.
  • If you only index the N most recent points, queries for data outside that range will still be slow.

So, using a time-series DB is still better.
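
To make the indexing trade-off above concrete, here is a minimal sketch of the "index all points" option, using Python's standard sqlite3 module. The table name readings, its columns, the index name, and the helper most_recent are all hypothetical and chosen only for illustration; nothing here reflects the gateway's actual schema.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "sensor_id TEXT NOT NULL, ts INTEGER NOT NULL, value REAL NOT NULL)"
)
# Indexing every point helps "most recent" queries,
# at the cost of extra storage for the index itself.
conn.execute("CREATE INDEX idx_readings_sensor_ts ON readings (sensor_id, ts)")

# Made-up sample data: 100 readings for one hypothetical sensor.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("temp-1", t, 20.0 + t * 0.1) for t in range(100)],
)


def most_recent(conn, sensor_id, limit):
    """Return the newest `limit` (ts, value) rows for one sensor, newest first."""
    cur = conn.execute(
        "SELECT ts, value FROM readings "
        "WHERE sensor_id = ? ORDER BY ts DESC LIMIT ?",
        (sensor_id, limit),
    )
    return cur.fetchall()


recent = most_recent(conn, "temp-1", 5)
```

Whether this stays fast enough at gateway scale would need measuring; the sketch only shows the shape of the query and index being discussed.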

Here is a comparison of InfluxDB and TimescaleDB:

  • InfluxDB can summarize (downsample) old data.
  • InfluxDB uses a lot of RAM when running (maybe because it is written in Go).
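
For readers unfamiliar with downsampling, the idea can be sketched in plain Python as below. This is a generic illustration, not InfluxDB's API: the function name downsample, the integer-seconds timestamps, and the choice of averaging per bucket are all assumptions.

```python
from collections import defaultdict


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns a list of (bucket_start, mean_value) sorted by bucket_start.
    Timestamps are assumed to be integer seconds.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Round each timestamp down to the start of its bucket.
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]


# Four made-up half-hourly readings reduced to two hourly averages.
raw = [(0, 20.0), (1800, 22.0), (3600, 21.0), (5400, 23.0)]
hourly = downsample(raw, 3600)
```

Old raw points could then be replaced by their bucket averages, trading resolution for storage.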
@Timmeey

Timmeey commented Oct 7, 2018

Maybe we could break this task up into smaller pieces, so it's easier to comprehend what has to be done.

I could imagine this being built in three individual steps.

  • Build a system that lets the user define what happens when a thing (sensor, button, etc.) is updated with a new value, e.g. a hook such as gotUpdate(sensorId, oldValue, newValue, timestamp).
  • The actual TimeSeriesDatabase, which does nothing but receive an identifier, a value, and a timestamp.
    • As a first step, it could just be a dumb SQLite database that appends the given ID, value, and timestamp, without any time-series compaction.
    • As a second step, it could provide ALL data as a map of lists, with the ID as the map key.
  • A UI system that can display simple x/y graphs and takes exactly one method that provides a list of x (time) / y (value) tuples.

From there on we could work on the problem of connecting those three parts together in small bites and improve them independently. I think that in such small steps it's more likely to get implemented.

I think the TimeSeriesPart in particular is a blocker here, which we could mitigate by providing a simple database first.
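
As a rough illustration, the first two parts listed above could be sketched as follows in Python with the standard sqlite3 module. Everything here is an assumption made for the example: the class name TimeSeriesStore, the samples table, the got_update hook (modelled on the gotUpdate signature above), and the sample sensor IDs. The (time, value) tuples it returns are the kind of input the proposed graph UI would take.

```python
import sqlite3
from collections import defaultdict


class TimeSeriesStore:
    """Append-only store: receives an ID, a value, and a timestamp."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS samples ("
            "id TEXT NOT NULL, value REAL NOT NULL, timestamp INTEGER NOT NULL)"
        )

    def append(self, sensor_id, value, timestamp):
        # Plain insert, with no compaction or downsampling.
        with self.conn:
            self.conn.execute(
                "INSERT INTO samples (id, value, timestamp) VALUES (?, ?, ?)",
                (sensor_id, value, timestamp),
            )

    def all_data(self):
        """Return ALL data as a map of lists keyed by ID.

        Each list holds (timestamp, value) tuples in time order.
        """
        data = defaultdict(list)
        rows = self.conn.execute(
            "SELECT id, value, timestamp FROM samples ORDER BY timestamp"
        )
        for sensor_id, value, timestamp in rows:
            data[sensor_id].append((timestamp, value))
        return dict(data)


store = TimeSeriesStore()


def got_update(sensor_id, old_value, new_value, timestamp):
    """Hypothetical update hook: record the new value only."""
    store.append(sensor_id, new_value, timestamp)


# Made-up updates from two hypothetical things.
got_update("door", 0, 1, 1000)
got_update("temp", 20.5, 21.0, 1005)
got_update("door", 1, 0, 1010)

series = store.all_data()
```

Each part stays replaceable: the store could later be swapped for a dedicated time-series database without changing the hook or the graph's input format.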
