Commit

Merge 0b8bab4 into d28a1e0
c0c0n3 committed Oct 17, 2019
2 parents d28a1e0 + 0b8bab4 commit 985ac5e
Showing 7 changed files with 464 additions and 88 deletions.
75 changes: 70 additions & 5 deletions docs/manuals/admin/dataMigration.md
@@ -1,9 +1,74 @@
# Data Migration

A few tools are available to assist with migrating data to QuantumLeap.


## Migrating STH Comet data

[Data-Migration-Tool][dmt] is a program designed to automatically
migrate data stored in [STH-Comet][comet] to a QuantumLeap [CrateDB][crate]
database. After migration, the data can be accessed through QuantumLeap's
[REST API][ql-api].

[Data-Migration-Tool][dmt] is developed in [Java][java] using the
[Eclipse IDE][eclipse]. A Python script transforms data in [MongoDB][mongo]
into the format expected by the QuantumLeap [CrateDB][crate] back end.

The tool can be downloaded [here][dmt] and the accompanying user guide
is also [available online][dmt-man].


## Migrating from QuantumLeap Crate to Timescale

QuantumLeap provides a self-contained Python script to help with
migrating tables from a QuantumLeap CrateDB database to a QuantumLeap
Timescale database. The script is located in the `timescale-container`
directory and is called `crate-exporter.py`.
It exports rows in a given Crate table and generates, on `stdout`,
all the SQL statements needed to import that data into Timescale.
These include creating a corresponding schema, table and hypertable
in PostgreSQL as needed. Note that the script generates DDL statements
that, when executed, will result in the exact same table structures
the QuantumLeap Timescale back end would have generated on seeing
NGSI entities corresponding to the rows stored in the Crate table.

Here's an example usage:

    $ python crate-exporter.py --schema mtyoutenant --table etdevice \
        > mtyoutenant.etdevice-import.sql

where we export all the rows in the Crate table `mtyoutenant.etdevice`.
The generated file contains all the SQL statements to recreate the
table and insert the data in Timescale. You may want to put this file
in the `quantumleap-db-setup` script's init directory so that data
are migrated automatically for you when you bootstrap the QuantumLeap
DB on Timescale as explained in the [Timescale section][ts-admin].

By default the script exports all the rows in the Crate table, but
you can also use the `--query` argument to specify a query to select
only a subset of interest as shown below:

    $ python crate-exporter.py --schema mtyoutenant --table etdevice --query \
        "SELECT * FROM mtyoutenant.etdevice where time_index > '2019-04-15';"




[comet]: https://github.com/telefonicaid/fiware-sth-comet
"FiWare STH Comet Home"
[crate]: https://crate.io
"CrateDB Home"
[dmt]: https://github.com/Data-Migration-Tool/STH-to-QuantumLeap
"Data-Migration-Tool Home"
[dmt-man]: https://github.com/Data-Migration-Tool/STH-to-QuantumLeap/blob/master/docs/manuals/README.md
"Data-Migration-Tool Manual"
[eclipse]: https://www.eclipse.org/
"Eclipse Home"
[java]: https://en.wikipedia.org/wiki/Java_(software_platform)
"Wikipedia - Java"
[mongo]: https://github.com/mongodb/mongo
"MongoDB Home"
[ql-api]: https://app.swaggerhub.com/apis/smartsdk/ngsi-tsdb/0.2
"QuantumLeap REST API"
[ts-admin]: ./timescale.md
"QuantumLeap Timescale"
44 changes: 44 additions & 0 deletions docs/manuals/admin/db-selection.md
@@ -0,0 +1,44 @@
# Database Selection

QuantumLeap can use different time series databases to persist and
query NGSI data. Currently both [CrateDB][crate] and [Timescale][timescale]
are supported as back ends, even though query functionality is
not yet available for Timescale.

If no configuration is provided QuantumLeap assumes CrateDB is
the back end to use and will store all incoming NGSI data in it.
However, different back ends can be configured for specific tenants
through a YAML configuration file. To use this feature, you have
to set the environment variable below:

* `QL_CONFIG`: absolute pathname of the QuantumLeap YAML configuration
file. If not set, the default configuration will be used where only
the Crate back end is available.

The YAML configuration file specifies what back end to use for which
tenant as well as the default back end to use for any other tenant
not explicitly mentioned in the file. Here's an example YAML
configuration:

    tenants:
        t1:
            backend: Timescale
        t2:
            backend: Crate
        t3:
            backend: Timescale

    default-backend: Crate

With this configuration, any NGSI entity coming in for tenant `t1`
or `t3` will be stored in Timescale whereas tenant `t2` will use
Crate. Any tenant other than `t1`, `t2`, or `t3` gets the default
Crate back end.
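
For example, if the configuration above were saved to
`/config/ql-config.yml` (an illustrative path), QuantumLeap could be
pointed at it like so:

    $ export QL_CONFIG=/config/ql-config.yml

When running QuantumLeap in Docker, the same variable can be passed
to the container instead.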




[crate]: ./crate.md
"QuantumLeap Crate"
[timescale]: ./timescale.md
"QuantumLeap Timescale"
27 changes: 19 additions & 8 deletions docs/manuals/admin/grafana.md
@@ -3,19 +3,22 @@
[**Grafana**](https://grafana.com/) is a powerful visualisation tool that we
can use to display graphics of the persisted data.

You can easily connect Grafana to either the QuantumLeap [CrateDB](./crate.md)
or [Timescale](./timescale.md) back end to visualise QuantumLeap data on
your dashboards. In both cases, the Grafana data source to use is the
[Postgres Datasource](http://docs.grafana.org/features/datasources/postgres/)
which normally ships with recent versions of Grafana.

If you followed the [Installation Guide](./index.md), you already have
Grafana running in a Docker container. If deployed locally, it's probably
at [http://0.0.0.0:3000](http://0.0.0.0:3000).

If you're using the CrateDB back end, we suggest you read
[this blog post](https://crate.io/a/pair-cratedb-with-grafana-an-open-platform-for-time-series-data-visualization/)
and follow Crate's recommendations on how to configure the Grafana
datasource, which we have summarised in the section below.


## Configuring the DataSource for CrateDB

Explore your deployed Grafana instance (e.g. [http://0.0.0.0:3000](http://0.0.0.0:3000)).
If you didn't change the default credentials, use `admin` as both user and
@@ -41,6 +44,14 @@ look like

Click *Save & Test* and you should get an OK message.


## Configuring the DataSource for PostgreSQL

The process is pretty much the same as outlined above and is well documented
in the Grafana [PostgreSQL data source manual](https://grafana.com/docs/features/datasources/postgres/).
Note that you should enable the *TimescaleDB* data source option.


## Using the DataSource in your Graph

Having set up your datasource, you can start using it in the different
1 change: 1 addition & 0 deletions docs/manuals/admin/ports.md
@@ -18,6 +18,7 @@ are used within the cluster.
|TCP | 27017 | Mongo database |
|TCP | 4200 | CrateDB Admin UI |
|TCP | 4300 | CrateDB Transport Protocol |
|TCP | 5432 | PostgreSQL Protocol |
|TCP | 6379 | Redis cache (used by geocoding) |

For more info on port numbers, you can always inspect the ports being exposed
100 changes: 100 additions & 0 deletions docs/manuals/admin/timescale.md
@@ -0,0 +1,100 @@
# Timescale

[Timescale][timescale] is one of the time series databases that can be
used with QuantumLeap as a back end to store NGSI entity time series.
As documented in the [Database Selection section][admin.db], it is
possible to dynamically select, at runtime, which storage back end to
use (Crate or Timescale) depending on the tenant who owns the entity
being persisted. Moreover, QuantumLeap ships with tools to automate the
Timescale back end setup and generate Crate-to-Timescale migration
scripts---details in the [Data Migration section][admin.dm].


## QuantumLeap Timescale DB setup

In order to start using the Timescale back end, a working PostgreSQL
installation is required. Specifically, QuantumLeap requires
**PostgreSQL server 10 or above with the Timescale and PostGIS
extensions already installed** on it. The Docker file in the
`timescale-container/test` directory can be used to quickly spin up a
Timescale server back end to which QuantumLeap can connect, but for
production deployments a more sophisticated setup is likely to
be needed---e.g. configuring PostgreSQL for high availability.
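
Alternatively, if you just need a scratch server to experiment with,
something along the lines of the sketch below should also work. It
assumes the `timescale/timescaledb-postgis` Docker image; the image
tag, container name and password are purely illustrative:

    $ docker run -d --name timescale \
        -p 5432:5432 -e POSTGRES_PASSWORD=secret \
        timescale/timescaledb-postgis:latest-pg10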

Once Timescale is up and running, you will have to bootstrap the
QuantumLeap DB, and you may also want to migrate some data
from Crate. QuantumLeap ships with a self-contained Python script
that can automate most of the steps in the process. The script file
is named `quantumleap-db-setup` and is located in the
`timescale-container` directory. It does these three things, in order:

1. Bootstrap the QuantumLeap database if it doesn't exist. It creates
a database for QuantumLeap with all required extensions as well as
an initial QuantumLeap role. If the specified QuantumLeap DB already
exists, the bootstrap phase is skipped.
2. Run any SQL script found in the specified init directory---defaults
to `./ql-db-init`. It picks up any `.sql` file in this directory
tree and, in turn, executes each one in ascending alphabetical
order, stopping at the first one that errors out, in which case
the script exits.
3. Load any data file found in the above init directory. A data file
is any file with a `.csv` extension found in the init directory
tree. Each data file is expected to contain a list of records in
the CSV format to be loaded in a table in the QuantumLeap
database---the field delimiter is `,` and quoted fields must be
quoted using a single quote char `'`. The file name without the `.csv`
extension is taken to be the FQN of the table in which data should
be loaded, whereas the column spec is given by the names in the
CSV header, which is expected to be in the file. Data files are
loaded in turn following their alphabetical order, stopping at
the first one that errors out, in which case the script exits.

(2) and (3) are mostly relevant for data migration (more about it
in the [Data Migration section][admin.dm]), but the script can just
as well be used to
execute arbitrary SQL statements. Note that the Docker compose
file mentioned earlier spins up a Timescale container (with PostGIS)
and another container that will run the script using
`timescale-container/test/ql-db-init` as init directory,
providing a working Timescale DB, complete with some tables
and test data.
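
Purely for illustration, an init directory could look like the
listing below. File, table and column names are made up, and the way
the script is invoked may differ in your setup:

    $ ls ql-db-init
    000-schema.sql
    mtyoutenant.etdevice.csv
    $ cat ql-db-init/mtyoutenant.etdevice.csv
    entity_id,entity_type,time_index
    'urn:ngsi-ld:Device:001','Device','2019-04-15T08:30:00'
    'urn:ngsi-ld:Device:002','Device','2019-04-15T08:31:00'
    $ ./quantumleap-db-setup

With this layout, `000-schema.sql` runs first and the CSV rows end up
in the `mtyoutenant.etdevice` table, as per the naming rule above.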


## Using the Timescale back end

Once you have a Postgres+Timescale+PostGIS server with a freshly
minted QuantumLeap DB in it, you are ready to connect QuantumLeap
to the DB server. To do that, some environment variables have to
be set and a YAML file edited. The environment variables to use
are:

* `POSTGRES_HOST`: the hostname or IP address of your Timescale server.
Defaults to `timescale` if not specified.
* `POSTGRES_PORT`: the server port to connect to, defaults to `5432`.
* `POSTGRES_DB_NAME`: the name of the QuantumLeap DB, defaults to
`quantumleap`.
* `POSTGRES_DB_USER`: the DB user QuantumLeap should use to connect,
defaults to `quantumleap`.
* `POSTGRES_DB_PASS`: the above user's password, defaults to `*`.
* `POSTGRES_USE_SSL`: should QuantumLeap connect to PostgreSQL using
SSL? If so, then set this variable to any of: `true`, `yes`, `1`, `t`.
Specify any other value or don't set the variable at all to use a
plain TCP connection.
* `QL_CONFIG`: absolute pathname of the QuantumLeap YAML configuration
file. If not set, the default configuration will be used where only
the Crate back end is available. For details about how to select a
back end and YAML configuration, refer to the [Database Selection
section][admin.db].
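
As an example, the snippet below wires QuantumLeap to a Timescale
server; host name, password and configuration file path are made up:

    $ export POSTGRES_HOST=timescale.example.org
    $ export POSTGRES_PORT=5432
    $ export POSTGRES_DB_NAME=quantumleap
    $ export POSTGRES_DB_USER=quantumleap
    $ export POSTGRES_DB_PASS=quantumleap-rules
    $ export POSTGRES_USE_SSL=yes
    $ export QL_CONFIG=/config/ql-config.yml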




[admin.db]: ./db-selection.md
"QuantumLeap Database Selection"
[admin.dm]: ./dataMigration.md
"QuantumLeap Data Migration"
[postgres]: https://www.postgresql.org
"PostgreSQL Home"
[postgis]: https://postgis.net/
"PostGIS Home"
[timescale]: https://www.timescale.com
"Timescale Home"