Documentation: Add descriptions about databases InfluxDB and MongoDB
amotl committed Jun 19, 2023
1 parent 9f5ec6b commit 9f76fe4
Showing 7 changed files with 607 additions and 21 deletions.
9 changes: 9 additions & 0 deletions doc/source/_resources.rst
@@ -171,15 +171,24 @@

.. NEW
.. _curl: https://en.wikipedia.org/wiki/CURL
.. _Flux data scripting language: https://docs.influxdata.com/flux/
.. _Funky v3: https://harizanov.com/product/funky-v3/
.. _InfluxDB OSS documentation: https://docs.influxdata.com/influxdb/
.. _Influx Query Language (InfluxQL): https://docs.influxdata.com/influxdb/v1.8/query_language/spec/
.. _InfluxDB line protocol: https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_reference/
.. _InfluxDB storage engine: https://docs.influxdata.com/influxdb/v2.7/reference/internals/storage-engine/
.. _LoRaWAN: https://en.wikipedia.org/wiki/LoRa#LoRaWAN
.. _MongoDB manual: https://www.mongodb.com/docs/manual/
.. _MongoDB Wire Protocol: https://www.mongodb.com/docs/manual/reference/mongodb-wire-protocol/
.. _OpenXC: https://openxcplatform.com/
.. _OpenXC for Python: http://python.openxcplatform.com/
.. _multi-tenant: https://en.wikipedia.org/wiki/Multitenancy
.. _NodeUSB: https://web.archive.org/web/20210621192219/http://www.nodeusb.com/
.. _trunking: https://en.wikipedia.org/wiki/Trunking
.. _webhook: https://en.wikipedia.org/wiki/Webhook
.. _webhooks: https://en.wikipedia.org/wiki/Webhook
.. _WiredTiger: https://github.com/wiredtiger



.. Companies
83 changes: 83 additions & 0 deletions doc/source/database/index.md
@@ -0,0 +1,83 @@
(databases)=
(kotori-databases)=
# Databases

Database adapter components know about vendor-specific dialects and the best
strategies for communicating with time-series databases.

This documentation section enumerates the collection of database adapters shipped
with Kotori. Adding more adapters is possible.


```{toctree}
:caption: Databases
:maxdepth: 1
:hidden:
influxdb
mongodb
```


::::::{grid} 1
:margin: 0
:padding: 0


:::::{grid-item-card}
::::{grid} 2
:margin: 0
:padding: 0

:::{grid-item}
:columns: 8
#### [](#database-influxdb)

InfluxDB is a scalable datastore and time series platform for metrics, events,
and real-time analytics. It covers storing and querying data, background ETL processing
for monitoring and alerting purposes, and visualization and exploration features.

<small>
<strong>Categories:</strong> timeseries-database
</small>
:::
:::{grid-item}
:columns: 4
{bdg-primary-line}`eth` {bdg-primary-line}`wifi` {bdg-primary-line}`http`

{bdg-success-line}`ilp` {bdg-success-line}`influxql` {bdg-success-line}`flux`

{bdg-secondary-line}`amd64` {bdg-secondary-line}`arm64`
:::
::::
:::::


:::::{grid-item-card}
::::{grid} 2
:margin: 0
:padding: 0

:::{grid-item}
:columns: 8
#### [](#database-mongodb)

MongoDB is a document database designed for ease of application development and scaling.

<small>
<strong>Categories:</strong> document-database
</small>
:::
:::{grid-item}
:columns: 4
{bdg-primary-line}`eth` {bdg-primary-line}`wifi` {bdg-primary-line}`http`

{bdg-success-line}`json` {bdg-success-line}`bson`

{bdg-secondary-line}`amd64` {bdg-secondary-line}`arm64`
:::
::::
:::::


::::::
239 changes: 239 additions & 0 deletions doc/source/database/influxdb.rst
@@ -0,0 +1,239 @@
.. include:: ../_resources.rst

.. _database-influxdb:

########
InfluxDB
########


*****
About
*****

`InfluxDB`_ is a scalable datastore and time series platform for metrics, events,
and real-time analytics. It covers storing and querying data, background ETL processing
for monitoring and alerting purposes, and visualization and exploration features.


*******
Details
*******

This section summarizes InfluxDB's data model and query interface.

Data model
==========

InfluxDB stores data points (records) in measurements, which are analogous to
tables in relational databases. In InfluxDB 1.x, measurements are grouped into
databases; InfluxDB 2.x replaces databases with "buckets".

Each data point within a measurement has a timestamp, fields, and tags.

.. figure:: https://github.com/daq-tools/kotori/assets/453543/6f9c00bb-d834-4adf-b752-a069c48f7b56
    :target: https://invidious.fdn.fr/watch?v=1Iw_0J5UkYs&t=257
    :width: 400
    :alt: An InfluxDB data point by example.

On disk, timestamps are stored in epoch nanosecond format. InfluxDB formats timestamps
in RFC3339 UTC.
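
As a rough illustration of these two representations, the following Python sketch
converts an epoch nanosecond timestamp into an RFC3339 UTC string. The helper name
``epoch_ns_to_rfc3339`` is hypothetical, and not part of InfluxDB or Kotori.

.. code-block:: python

    from datetime import datetime, timezone


    def epoch_ns_to_rfc3339(timestamp_ns: int) -> str:
        """Format nanoseconds since the Unix epoch as an RFC3339 UTC string."""
        # Split into whole seconds and the sub-second remainder, because
        # `datetime` only offers microsecond resolution.
        seconds, nanos = divmod(timestamp_ns, 1_000_000_000)
        base = datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
        # Append fractional seconds only when present, without trailing zeros.
        fraction = f".{nanos:09d}".rstrip("0").rstrip(".")
        return f"{base}{fraction}Z"


    print(epoch_ns_to_rfc3339(1439856000000000000))  # 2015-08-18T00:00:00Z
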

Tags are indexed, and store low-cardinality metadata, for example location information
about the data point. Fields are not indexed, and store the actual measurement values
of the data point.

Not sure what to store in tags and what to store in fields?

- Store commonly-queried and grouping (``group()`` or ``GROUP BY``) metadata in tags.
- Store data in fields if each data point contains a different value.
- Store numeric values as fields (tag values only support string values).

Query interface
===============

Languages
---------
InfluxDB 1.x supports both the `Influx Query Language (InfluxQL)`_ and the `Flux data
scripting language`_ for querying data, and the `InfluxDB line protocol`_ for inserting
data. Please inspect the :ref:`influxdb-query-examples`, as well as the corresponding
upstream documentation about how to `insert data`_, `query data using InfluxQL`_, and
`query data using Flux`_.

Protocols
---------
InfluxDB clients communicate with servers using HTTP or UDP.
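
For the HTTP variant, a minimal sketch of how request URLs for the InfluxDB 1.x
``/query`` and ``/write`` endpoints could be assembled in Python is given here.
The helper names, the host ``localhost:8086``, and the database name ``mydb``
are assumptions for illustration only. The sketch builds URLs and does not send
any request.

.. code-block:: python

    from urllib.parse import urlencode


    def build_query_url(base_url: str, database: str, influxql: str) -> str:
        """Build a URL for the InfluxDB 1.x `/query` endpoint."""
        params = urlencode({"db": database, "q": influxql})
        return f"{base_url.rstrip('/')}/query?{params}"


    def build_write_url(base_url: str, database: str, precision: str = "ns") -> str:
        """Build a URL for the InfluxDB 1.x `/write` endpoint.

        The request body would carry records in InfluxDB line protocol.
        """
        params = urlencode({"db": database, "precision": precision})
        return f"{base_url.rstrip('/')}/write?{params}"


    print(build_query_url("http://localhost:8086", "mydb", 'SELECT * FROM "h2o_feet" LIMIT 1'))
    print(build_write_url("http://localhost:8086", "mydb"))
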


************
Key features
************

This section enumerates the key features of InfluxDB, as advertised in its documentation.

Storage engine
==============

The `InfluxDB storage engine`_ includes the following components.

- **Write Ahead Log (WAL)**

  The Write Ahead Log (WAL) retains InfluxDB data when the storage engine restarts.
  The WAL ensures data is durable in case of an unexpected failure.

- **Cache**

  The cache is an in-memory copy of data points currently stored in the WAL. The WAL
  and cache are separate entities and do not interact with each other. The storage
  engine coordinates writes to both.

- **Time-Structured Merge Tree (TSM)**

  To efficiently compact and store data, the storage engine groups field values by series
  key, and then orders those field values by time. The storage engine uses a Time-Structured
  Merge Tree (TSM) data format. TSM files store compressed series data in a columnar format.
  To improve efficiency, the storage engine only stores differences (or deltas) between
  values in a series. Column-oriented storage lets the engine read by series key and omit
  extraneous data.

- **Time Series Index (TSI)**

  As data cardinality (the number of series) grows, queries read more series keys and become
  slower. The Time Series Index ensures queries remain fast as data cardinality grows. The
  TSI stores series keys grouped by measurement, tag, and field, and allows the database to
  answer metadata queries about what measurements, tags, or fields exist, and, given a
  measurement, tags, and fields, what series keys exist.
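
The idea of storing only differences between values can be sketched with plain
integer delta encoding in Python. This is a conceptual illustration under
simplifying assumptions, not InfluxDB's actual TSM compression, and the function
names are hypothetical.

.. code-block:: python

    def delta_encode(values: list[int]) -> list[int]:
        """Keep the first value, then store only the difference to the predecessor."""
        if not values:
            return []
        return [values[0]] + [current - previous for previous, current in zip(values, values[1:])]


    def delta_decode(deltas: list[int]) -> list[int]:
        """Rebuild the original values by accumulating the deltas."""
        values = []
        total = 0
        for delta in deltas:
            total += delta
            values.append(total)
        return values


    # Regularly spaced timestamps collapse into small, repetitive numbers,
    # which compress much better than the raw values.
    timestamps = [1439856000, 1439856360, 1439856720, 1439857080]
    encoded = delta_encode(timestamps)
    print(encoded)  # [1439856000, 360, 360, 360]
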

Query API
=========

- RESTful API and a set of client libraries (InfluxDB API, Arduino, C#, Go, Java,
  JavaScript, Kotlin, Node.js, PHP, Python, R, Ruby, Scala, and Swift) to collect,
  transform, and visualize your data.

- The Flux query language is a functional language for working with time series data.

Ecosystem
=========

InfluxDB is supported by a massive community and ecosystem, offering a wide
range of connectivity options.

- Telegraf is an open source collector agent with more than 300 plugins.
- Write data with AWS Lambda or the InfluxDB CLI.
- Run Flux scripts natively and show results in VS Code.
- Use the Flux REPL (Read–Eval–Print Loop) to execute Flux scripts.
- Use the Flux language to interact with InfluxDB and other data sources.
- Connectors to Grafana, Google Data Studio, and PTC ThingWorx.
- Use Postman to interact with the InfluxDB API.

User interface
==============

A best-in-class UI that includes a data explorer, dashboarding tools, and a script editor.

- Quickly browse through stored metric and event data.
- Apply common transformations.
- The dashboarding tool includes a number of visualizations that help you to see insights
  from your data faster.
- The script editor offers easily accessible examples, in order to quickly learn the Flux
  query language, and features auto-completion and real-time syntax checking.


.. _influxdb-query-examples:

**************
Query examples
**************

This section demonstrates a few basic query examples from InfluxDB's documentation.

Insert
======

Data is inserted into InfluxDB using the `InfluxDB line protocol`_, without using a
query language.
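
To make the record format more tangible, here is a minimal Python sketch that
serializes a data point (measurement, tags, fields, timestamp) into a single line
of line protocol. The function names are hypothetical, and the escaping rules are
a simplified reading of the line protocol reference, so treat it as an
illustration rather than a complete encoder.

.. code-block:: python

    def _escape(text: str, chars: str) -> str:
        """Backslash-escape each of the given characters."""
        for char in chars:
            text = text.replace(char, "\\" + char)
        return text


    def _format_field_value(value) -> str:
        # `bool` must be checked before `int`, because it is a subclass of `int`.
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return f"{value}i"  # Integers carry an `i` suffix.
        if isinstance(value, float):
            return repr(value)
        # Everything else is written as a double-quoted string.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'


    def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
        """Serialize one data point; tags are optional, fields are required."""
        if not fields:
            raise ValueError("A data point needs at least one field")
        key = _escape(measurement, ", ")
        # Sorting tags by key is a recommendation, not a requirement.
        for name, value in sorted(tags.items()):
            key += f",{_escape(name, ',= ')}={_escape(str(value), ',= ')}"
        field_set = ",".join(
            f"{_escape(name, ',= ')}={_format_field_value(value)}"
            for name, value in fields.items()
        )
        return f"{key} {field_set} {timestamp_ns}"


    print(to_line_protocol(
        "h2o_feet", {"location": "coyote_creek"}, {"water_level": 8.12}, 1439856000000000000))
    # h2o_feet,location=coyote_creek water_level=8.12 1439856000000000000
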

Select
======

.. code-block:: sql

    -- InfluxQL: Basic select statement with date range filtering.
    SELECT "water_level"
    FROM "h2o_feet"
    WHERE
        "location"='coyote_creek' AND
        time >= '2015-08-18T00:00:00Z' AND
        time <= '2015-08-18T00:18:00Z'

.. code-block:: cpp

    // Flux: Select most recent reading from a measurement, with date range filtering.
    from(bucket:"turbines")
      |> range(start: -1h)
      |> filter(fn: (r) => r._measurement == "turbine3000")
      |> last()

Advanced queries
================

.. code-block:: sql

    -- InfluxQL: Aggregations with date range filtering and
    -- time bucketing using specified intervals.
    SELECT mean("humidity")
    FROM "readings"
    WHERE time > now()-1h
    GROUP BY time(5m)

.. code-block:: cpp

    // Flux: Group records using regular time intervals.
    // Window every 20 seconds covering 40 second periods.
    data
      |> window(every: 20s, period: 40s)

.. code-block:: cpp

    // Flux: Time bucketing.
    // Apply downsampling by grouping data into fixed windows of time and applying an
    // aggregate or selector function to each window.
    data
      |> aggregateWindow(every: 1mo, fn: mean)

.. code-block:: cpp

    // Flux: Time bucketing with parameterized aggregation function.
    data
      |> aggregateWindow(
          column: "_value",
          every: 20s,
          fn: (column, tables=<-) => tables |> quantile(q: 0.99, column: column),
      )

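
The time bucketing performed by ``GROUP BY time(5m)`` and ``aggregateWindow()``
can be approximated in a few lines of Python. This sketch only mirrors the concept
of grouping points into fixed windows aligned to the epoch and averaging each
window. The function name and the sample data are made up for illustration, and
the code does not reproduce the exact semantics of InfluxDB.

.. code-block:: python

    from collections import defaultdict


    def mean_by_window(points: list[tuple[int, float]], window_seconds: int) -> dict[int, float]:
        """Group (epoch_seconds, value) pairs into fixed windows and average each one."""
        buckets = defaultdict(list)
        for timestamp, value in points:
            # Align each timestamp to the start of its window.
            window_start = timestamp - timestamp % window_seconds
            buckets[window_start].append(value)
        return {
            start: sum(values) / len(values)
            for start, values in sorted(buckets.items())
        }


    readings = [(0, 10.0), (60, 20.0), (300, 30.0)]
    print(mean_by_window(readings, 300))  # {0: 15.0, 300: 30.0}
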
*****
Usage
*****

Purpose
=======

Kotori uses InfluxDB to store **timeseries-data** of data acquisition channels.

Documentation
=============

See :ref:`influxdb-handbook` and the `InfluxDB OSS documentation`_.

Compatibility
=============

Kotori supports data acquisition and export with InfluxDB 1.x.

.. todo:: It is not compatible with InfluxDB 2.x and 3.x.


.. _insert data: https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/#writing-data-to-influxdb
.. _query data using InfluxQL: https://docs.influxdata.com/influxdb/v1.8/query_language/sample-data/#test-queries
.. _query data using Flux: https://docs.influxdata.com/influxdb/v1.8/flux/guides/execute-queries/
