Commit
Showing 7 changed files with 464 additions and 88 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,74 @@ | ||
# Data-Migration-Tool | ||
# Data Migration | ||
|
||
Data-Migration-Tool is designed to automatically migrate data stored in [STH-Comet](https://github.com/telefonicaid/fiware-sth-comet) to [QuantumLeap](https://github.com/smartsdk/ngsi-timeseries-api) database. After data migration, data can be accessed by using QuantumLeap's [API](https://app.swaggerhub.com/apis/smartsdk/ngsi-tsdb/0.2). | ||
A few tools are available to assist with migrating data to QuantumLeap. | ||
|
||
This tool is developed in [Java](https://en.wikipedia.org/wiki/Java_(software_platform)) using [Eclipse](https://www.eclipse.org/). A python script is used to convert data in [MongoDB](https://github.com/mongodb/mongo) to be compatible for [CrateDB](https://github.com/crate/crate). | ||
|
||
Tool is available [here](https://github.com/Data-Migration-Tool/STH-to-QuantumLeap). | ||
## Migrating STH Comet data | ||
|
||
User guide for the tool is available [here](https://github.com/Data-Migration-Tool/STH-to-QuantumLeap/blob/master/docs/manuals/README.md). | ||
[Data-Migration-Tool][dmt] is a program designed to automatically | ||
migrate data stored in [STH-Comet][comet] to a QuantumLeap [CrateDB][crate] | ||
database. After migration, the data can be accessed through QuantumLeap's | ||
[REST API][ql-api]. | ||
|
||
[Data-Migration-Tool][dmt] is developed in [Java][java] using the | ||
[Eclipse IDE][eclipse]. A Python script transforms data in [MongoDB][mongo] | ||
into the format expected by the QuantumLeap [CrateDB][crate] back end. | ||
|
||
The tool can be downloaded [here][dmt] and the accompanying user guide | ||
is also [available online][dmt-man]. | ||
|
||
|
||
## Migrating from QuantumLeap Crate to Timescale | ||
|
||
QuantumLeap provides a self-contained Python script to help with | ||
migrating tables from a QuantumLeap CrateDB database to a QuantumLeap | ||
Timescale database. The script is located in the `timescale-container` | ||
directory and is called `crate-exporter.py`. | ||
It exports rows in a given Crate table and generates, on `stdout`, | ||
all the SQL statements needed to import that data into Timescale. | ||
These include creating a corresponding schema, table and hypertable | ||
in PostgreSQL as needed. Note that the script generates DDL statements | ||
that, when executed, will result in the exact same table structures | ||
the QuantumLeap Timescale back end would have generated on seeing | ||
NGSI entities corresponding to the rows stored in the Crate table. | ||
|
||
Here's an example usage: | ||
|
||
$ python crate-exporter.py --schema mtyoutenant --table etdevice \ | ||
> mtyoutenant.etdevice-import.sql | ||
|
||
where we export all the rows in the Crate table `mtyoutenant.etdevice`. | ||
The generated file contains all the SQL statements to recreate the | ||
table and insert the data in Timescale. You may want to put this file | ||
in the `quantumleap-db-setup` script's init directory so that data | ||
are migrated automatically for you when you bootstrap the QuantumLeap | ||
DB on Timescale as explained in the [Timescale section][ts-admin]. | ||
|
||
By default the script exports all the rows in the Crate table, but | ||
you can also use the `--query` argument to specify a query to select | ||
only a subset of interest as shown below: | ||
|
||
$ python crate-exporter.py --schema mtyoutenant --table etdevice --query \ | ||
"SELECT * FROM mtyoutenant.etdevice where time_index > '2019-04-15';" | ||
|
||
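When several tables need exporting, the two invocations above can be driven from a small Python wrapper. The following is only a sketch: the helper name `exporter_command` is hypothetical, and it relies solely on the `--schema`, `--table` and `--query` arguments shown above. It builds the argument list without running anything, so the commented-out `subprocess` call is where the exporter would actually be launched.

```python
import shlex


def exporter_command(schema, table, query=None, script="crate-exporter.py"):
    """Build the argument list for one crate-exporter.py run.

    Hypothetical helper: it only uses the --schema, --table and --query
    arguments documented above and does not execute anything.
    """
    args = ["python", script, "--schema", schema, "--table", table]
    if query is not None:
        args += ["--query", query]
    return args


full_export = exporter_command("mtyoutenant", "etdevice")
partial_export = exporter_command(
    "mtyoutenant",
    "etdevice",
    query="SELECT * FROM mtyoutenant.etdevice where time_index > '2019-04-15';",
)

# Print shell-quoted versions of both commands for inspection.
print(shlex.join(full_export))
print(shlex.join(partial_export))

# To run the export and capture the generated SQL in a file:
# import subprocess
# with open("mtyoutenant.etdevice-import.sql", "w") as out:
#     subprocess.run(full_export, stdout=out, check=True)
```

Passing the query as a single list element avoids shell quoting problems with the embedded single quotes.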
|
||
|
||
|
||
[comet]: https://github.com/telefonicaid/fiware-sth-comet | ||
"FIWARE STH Comet Home" | ||
[crate]: https://crate.io | ||
"CrateDB Home" | ||
[dmt]: https://github.com/Data-Migration-Tool/STH-to-QuantumLeap | ||
"Data-Migration-Tool Home" | ||
[dmt-man]: https://github.com/Data-Migration-Tool/STH-to-QuantumLeap/blob/master/docs/manuals/README.md | ||
"Data-Migration-Tool Manual" | ||
[eclipse]: https://www.eclipse.org/ | ||
"Eclipse Home" | ||
[java]: https://en.wikipedia.org/wiki/Java_(software_platform) | ||
"Wikipedia - Java" | ||
[mongo]: https://github.com/mongodb/mongo | ||
"MongoDB Home" | ||
[ql-api]: https://app.swaggerhub.com/apis/smartsdk/ngsi-tsdb/0.2 | ||
"QuantumLeap REST API" | ||
[ts-admin]: ./timescale.md | ||
"QuantumLeap Timescale" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
# Database Selection | ||
|
||
QuantumLeap can use different time series databases to persist and | ||
query NGSI data. Currently both [CrateDB][crate] and [Timescale][timescale] | ||
are supported as back ends, even though query functionality is | ||
not yet available for Timescale. | ||
|
||
If no configuration is provided, QuantumLeap assumes CrateDB is | ||
the back end to use and will store all incoming NGSI data in it. | ||
However, different back ends can be configured for specific tenants | ||
through a YAML configuration file. To use this feature, you have | ||
to set the environment variable below: | ||
|
||
* `QL_CONFIG`: absolute pathname of the QuantumLeap YAML configuration | ||
file. If not set, the default configuration will be used where only | ||
the Crate back end is available. | ||
|
||
The YAML configuration file specifies what back end to use for which | ||
tenant as well as the default back end to use for any other tenant | ||
not explicitly mentioned in the file. Here's an example YAML | ||
configuration: | ||
|
||
    tenants: | ||
        t1: | ||
            backend: Timescale | ||
        t2: | ||
            backend: Crate | ||
        t3: | ||
            backend: Timescale | ||
|
||
    default-backend: Crate | ||
|
||
With this configuration, any NGSI entity coming in for tenant `t1` | ||
or `t3` will be stored in Timescale whereas tenant `t2` will use | ||
Crate. Any tenant other than `t1`, `t2`, or `t3` gets the default | ||
Crate back end. | ||
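
The lookup rule described above can be sketched in a few lines of Python. This is an illustration, not QuantumLeap's actual implementation: the function name `backend_for` is hypothetical, and the configuration is written as a plain dictionary mirroring the example YAML so that the sketch needs no YAML parser.

```python
# Dictionary equivalent of the example YAML configuration above.
CONFIG = {
    "tenants": {
        "t1": {"backend": "Timescale"},
        "t2": {"backend": "Crate"},
        "t3": {"backend": "Timescale"},
    },
    "default-backend": "Crate",
}


def backend_for(tenant, config=CONFIG):
    """Return the back end name for a tenant (hypothetical helper).

    A tenant listed under 'tenants' gets its configured back end; any
    other tenant gets 'default-backend'. With no configuration at all,
    Crate is assumed, as stated in the text above.
    """
    tenants = config.get("tenants") or {}
    entry = tenants.get(tenant)
    if entry and "backend" in entry:
        return entry["backend"]
    return config.get("default-backend", "Crate")


for name in ("t1", "t2", "t3", "some-other-tenant"):
    print(name, "->", backend_for(name))
```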
|
||
|
||
|
||
|
||
[crate]: ./crate.md | ||
"QuantumLeap Crate" | ||
[timescale]: ./timescale.md | ||
"QuantumLeap Timescale" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
# Timescale | ||
|
||
[Timescale][timescale] is one of the time series databases that can be | ||
used with QuantumLeap as a back end to store NGSI entity time series. | ||
As documented in the [Database Selection section][admin.db], it is | ||
possible to dynamically select, at runtime, which storage back end to | ||
use (Crate or Timescale) depending on the tenant who owns the entity | ||
being persisted. Moreover, QuantumLeap ships with tools to automate the | ||
Timescale back end setup and generate Crate-to-Timescale migration | ||
scripts---details in the [Data Migration section][admin.dm]. | ||
|
||
|
||
## QuantumLeap Timescale DB setup | ||
|
||
In order to start using the Timescale back end, a working PostgreSQL | ||
installation is required. Specifically, QuantumLeap requires | ||
**PostgreSQL server 10 or above with the Timescale and PostGIS | ||
extensions already installed** on it. The Docker Compose file in the | ||
`timescale-container/test` directory can be used to quickly spin up a Timescale | ||
server back end to which QuantumLeap can connect, but for | ||
production deployments a more sophisticated setup is likely to | ||
be needed---e.g. configuring PostgreSQL for high availability. | ||
|
||
Once Timescale is up and running, you will have to bootstrap the | ||
QuantumLeap DB, and you may also want to migrate some data | ||
from Crate. QuantumLeap ships with a self-contained Python script | ||
that can automate most of the steps in the process. The script file | ||
is named `quantumleap-db-setup` and is located in the | ||
`timescale-container` directory. It does these three things, in order: | ||
|
||
1. Bootstrap the QuantumLeap database if it doesn't exist. It creates | ||
a database for QuantumLeap with all required extensions as well as | ||
an initial QuantumLeap role. If the specified QuantumLeap DB already | ||
exists, the bootstrap phase is skipped. | ||
2. Run any SQL script found in the specified init directory---defaults | ||
to `./ql-db-init`. It picks up any `.sql` file in this directory | ||
tree and, in turn, executes each one in ascending alphabetical | ||
order, stopping at the first one that errors out, in which case | ||
the script exits. | ||
3. Load any data file found in the above init directory. A data file | ||
is any file with a `.csv` extension found in the init directory | ||
tree. Each data file is expected to contain a list of records in | ||
the CSV format to be loaded in a table in the QuantumLeap | ||
database---the field delimiter is `,` and quoted fields must be | ||
enclosed in single quote chars `'`. The file name without the `.csv` | ||
extension is taken to be the FQN of the table in which data should | ||
be loaded, whereas the column spec is given by the names in the | ||
CSV header, which is expected to be in the file. Data files are | ||
loaded in turn following their alphabetical order, stopping at | ||
the first one that errors out, in which case the script exits. | ||
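
To preview what steps 2 and 3 would pick up from an init directory, the file discovery can be approximated as below. This sketch only plans the work and executes nothing. The function name `plan_init_dir` is hypothetical, and sorting by full path is an assumption: the text only says "alphabetical order", so the real script may order files differently (for example by file name alone).

```python
import tempfile
from pathlib import Path


def plan_init_dir(init_dir):
    """List SQL scripts and CSV data files found under init_dir.

    Returns (sql_files, loads), where loads pairs each CSV file with the
    table FQN derived from its name minus the '.csv' extension.
    Assumption: files are ordered by their full path.
    """
    root = Path(init_dir)
    sql_files = sorted(p for p in root.rglob("*.sql") if p.is_file())
    csv_files = sorted(p for p in root.rglob("*.csv") if p.is_file())
    loads = [(p.stem, p) for p in csv_files]
    return sql_files, loads


# Demo on a throwaway directory tree with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ["02-b.sql", "01-a.sql", "sub/03-c.sql",
                 "mtyoutenant.etdevice.csv"]:
        (root / name).write_text("")
    sql_files, loads = plan_init_dir(root)
    sql_names = [p.relative_to(root).as_posix() for p in sql_files]
    tables = [fqn for fqn, _ in loads]

print(sql_names)
print(tables)
```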
|
||
(2) and (3) are mostly relevant for data migration (more about it | ||
in the section below), but the script can just as well be used to | ||
execute arbitrary SQL statements. Note that the Docker compose | ||
file mentioned earlier spins up a Timescale container (with PostGIS) | ||
and another container that will run the script using | ||
`timescale-container/test/ql-db-init` as init directory, | ||
providing a working Timescale DB, complete with some tables | ||
and test data. | ||
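
A data file in the format described in step 3 can be produced with Python's standard `csv` module. This is a sketch under stated assumptions: the column names `entity_id` and `note` are made up for illustration, and doubling an embedded quote char inside a quoted field is the `csv` module's default behaviour, not something the text above confirms the script expects.

```python
import csv
import io


def to_data_file(header, rows):
    """Render rows as CSV text with ',' delimiter and "'" quote char.

    The header row comes first because the loader takes the column spec
    from it. Only fields that need quoting are quoted.
    """
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar="'",
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical columns and values, for illustration only.
text = to_data_file(["entity_id", "note"],
                    [["dev1", "a,b"], ["dev2", "plain"]])
print(text)

# Reading the text back with the same dialect recovers the rows.
parsed = list(csv.reader(io.StringIO(text), delimiter=",", quotechar="'"))
```

Saving the result as, say, `mtyoutenant.etdevice.csv` in the init directory would target the table `mtyoutenant.etdevice`, following the file naming rule in step 3.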
|
||
|
||
## Using the Timescale back end | ||
|
||
Once you have a Postgres+Timescale+PostGIS server with a freshly | ||
minted QuantumLeap DB in it, you are ready to connect QuantumLeap | ||
to the DB server. To do that, some environment variables have to | ||
be set and a YAML file edited. The environment variables to use | ||
are: | ||
|
||
* `POSTGRES_HOST`: the hostname or IP address of your Timescale server. | ||
Defaults to `timescale` if not specified. | ||
* `POSTGRES_PORT`: the server port to connect to, defaults to `5432`. | ||
* `POSTGRES_DB_NAME`: the name of the QuantumLeap DB, defaults to | ||
`quantumleap`. | ||
* `POSTGRES_DB_USER`: the DB user QuantumLeap should use to connect, | ||
defaults to `quantumleap`. | ||
* `POSTGRES_DB_PASS`: the above user's password, defaults to `*`. | ||
* `POSTGRES_USE_SSL`: should QuantumLeap connect to PostgreSQL using | ||
SSL? If so, then set this variable to any of: `true`, `yes`, `1`, `t`. | ||
Specify any other value or don't set the variable at all to use a | ||
plain TCP connection. | ||
* `QL_CONFIG`: absolute pathname of the QuantumLeap YAML configuration | ||
file. If not set, the default configuration will be used where only | ||
the Crate back end is available. For details about how to select a | ||
back end and YAML configuration, refer to the [Database Selection | ||
section][admin.db]. | ||
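
The defaults listed above can be summarised in a short Python sketch that reads the variables from the environment. It is an illustration rather than QuantumLeap's own code: the function name `postgres_settings` and the dictionary keys are hypothetical, and treating the SSL flag case-insensitively is an assumption, since the text only lists the lowercase values.

```python
import os

# Values of POSTGRES_USE_SSL that enable SSL, as listed above.
TRUTHY = {"true", "yes", "1", "t"}


def postgres_settings(env=os.environ):
    """Collect connection settings, applying the documented defaults.

    Assumption: the SSL flag is compared after trimming and lowercasing.
    """
    return {
        "host": env.get("POSTGRES_HOST", "timescale"),
        "port": int(env.get("POSTGRES_PORT", "5432")),
        "db_name": env.get("POSTGRES_DB_NAME", "quantumleap"),
        "db_user": env.get("POSTGRES_DB_USER", "quantumleap"),
        "db_pass": env.get("POSTGRES_DB_PASS", "*"),
        "use_ssl": env.get("POSTGRES_USE_SSL", "").strip().lower() in TRUTHY,
        # None means no YAML file: only the Crate back end is available.
        "config_file": env.get("QL_CONFIG"),
    }


defaults = postgres_settings({})
with_ssl = postgres_settings({"POSTGRES_USE_SSL": "yes",
                              "POSTGRES_PORT": "5433"})
print(defaults)
print(with_ssl)
```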
|
||
|
||
|
||
|
||
[admin.db]: ./db-selection.md | ||
"QuantumLeap Database Selection" | ||
[admin.dm]: ./dataMigration.md | ||
"QuantumLeap Data Migration" | ||
[postgres]: https://www.postgresql.org | ||
"PostgreSQL Home" | ||
[postgis]: https://postgis.net/ | ||
"PostGIS Home" | ||
[timescale]: https://www.timescale.com | ||
"Timescale Home" |