TimescaleDB is an open-source database designed to make SQL scalable for time-series data. It is engineered up from PostgreSQL, providing automatic partitioning across time and space (partitioning key), as well as full SQL support.
TimescaleDB is packaged as a PostgreSQL extension and set of scripts.
For a more detailed description of our architecture, please read the technical paper. Additionally, more documentation can be found on our docs website.
There are several ways to install TimescaleDB: (1) Homebrew (for MacOS), (2) Docker, or (3) from source.
NOTE: Currently, upgrading to new versions requires a fresh install.
Prerequisites
- The Postgres client (psql) is required for all of the following installation methods.
This will install PostgreSQL 9.6 via Homebrew as well. If you have another installation (such as Postgres.app), this will cause problems. We recommend removing other installations before using this method.
Prerequisites
- The Homebrew package manager
Build and install
# Add our tap
brew tap timescale/tap
# To install
brew install timescaledb
Update postgresql.conf
Also, you will need to edit your `postgresql.conf` file to include the necessary libraries:
# Modify postgresql.conf to uncomment this line and add required libraries.
# For example:
shared_preload_libraries = 'dblink,timescaledb'
To get started you'll now need to restart PostgreSQL and add a `postgres` superuser (used in the rest of the docs):
# Restart PostgreSQL
brew services restart postgresql
# Add a superuser postgres:
createuser postgres -s
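As an optional sanity check (illustrative, not a required step), you can confirm the library is loaded by connecting with psql and inspecting the setting from within PostgreSQL:
-- Run inside psql; 'timescaledb' should appear in the output
SHOW shared_preload_libraries;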
You can pull our Docker images from Docker Hub.
docker pull timescale/timescaledb:latest
We have only tested our build process on MacOS and Linux. We do not support building on Windows yet. Windows users may be able to use our Docker image on Docker Hub (see above).
Prerequisites
- A standard PostgreSQL 9.6 installation with development environment (header files) (e.g., Postgres.app for MacOS)
Build and install with local PostgreSQL
# To build the extension
make
# To install
make install
Update postgresql.conf
Also, you will need to edit your `postgresql.conf` file to include the necessary libraries, and then restart PostgreSQL:
# Modify postgresql.conf to uncomment this line and add required libraries.
# For example:
shared_preload_libraries = 'dblink,timescaledb'
# Then, restart PostgreSQL
Now, we'll install our extension and create an initial database.
You again have two options for setting up your initial database:
- Empty Database - To set up a new, empty database, please follow the instructions below.
- Database with pre-loaded sample data - To help you quickly get started, we have also created some sample datasets. See Using our Sample Datasets for further instructions. (Includes installing our extension.)
When creating a new database, it is necessary to install the extension and then run an initialization function.
# Connect to Postgres, using a superuser named 'postgres'
psql -U postgres -h localhost
-- Install the extension
CREATE database tutorial;
\c tutorial
CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;
-- Run initialization function
SELECT setup_timescaledb();
For convenience, this can also be done in one step by running a script from the command-line:
DB_NAME=tutorial ./scripts/setup-db.sh
You should now have a brand new time-series database running in Postgres.
# To access your new database
psql -U postgres -h localhost -d tutorial
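As an optional sanity check (illustrative, not part of the required setup), you can confirm the extension is installed by querying the PostgreSQL catalog from within psql:
-- Should return a row for 'timescaledb' along with its installed version
SELECT extname, extversion FROM pg_extension WHERE extname = 'timescaledb';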
Next, let's load some data.
One of the core ideas of our time-series database is the time-series optimized data table, called a hypertable.
To create a hypertable, you start with a regular SQL table, and then convert it into a hypertable via the function `create_hypertable()` (API definition).
The following example creates a hypertable for tracking temperature and humidity across a collection of devices over time.
-- We start by creating a regular SQL table
CREATE TABLE conditions (
time TIMESTAMPTZ NOT NULL,
location TEXT NOT NULL,
temperature DOUBLE PRECISION NULL,
humidity DOUBLE PRECISION NULL
);
Next, transform it into a hypertable using the provided function `create_hypertable()`:
-- This creates a hypertable that is partitioned by time
-- using the values in the `time` column.
SELECT create_hypertable('conditions', 'time');
-- OR you can additionally partition the data on another dimension
-- (what we call 'space') such as `location`.
-- For example, to partition `location` into 2 partitions:
SELECT create_hypertable('conditions', 'time', 'location', 2);
Inserting data into the hypertable is done via normal SQL `INSERT` commands, e.g.:
INSERT INTO conditions(time,location,temperature,humidity)
VALUES(NOW(), 'office', 70.0, 50.0);
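Multi-row inserts also work as they would in vanilla PostgreSQL; for example (the values here are made up for illustration):
-- Insert several readings in a single statement
INSERT INTO conditions(time,location,temperature,humidity)
  VALUES (NOW(), 'office', 70.0, 50.0),
         (NOW(), 'garage', 77.5, 65.0);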
Similarly, querying data is done via normal SQL `SELECT` commands. SQL `UPDATE` and `DELETE` commands also work as expected.
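For instance, a typical aggregation over the `conditions` hypertable might look like the following (an illustrative query, not taken from our docs):
-- Average temperature per location over the last 24 hours
SELECT location, avg(temperature) AS avg_temp
  FROM conditions
  WHERE time > NOW() - INTERVAL '24 hours'
  GROUP BY location;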
Data is indexed using normal SQL `CREATE INDEX` commands. For instance,
CREATE INDEX ON conditions (location, time DESC);
This can be done before or after converting the table to a hypertable.
Indexing suggestions:
Our experience has shown that, for time-series data, the most useful type of index depends on your data.
For indexing columns with discrete (limited-cardinality) values (e.g., where you are most likely to use an "equals" or "not equals" comparator), we suggest using an index like this (using our hypertable `conditions` for the example):
CREATE INDEX ON conditions (location, time DESC);
For all other types of columns, i.e., columns with continuous values (e.g., where you are most likely to use a "less than" or "greater than" comparator) the index should be in the form:
CREATE INDEX ON conditions (time DESC, temperature);
Having a `time DESC` column specification in the index allows for efficient queries by column-value and time. For example, the index on `(location, time DESC)` defined above would optimize the following query:
SELECT * FROM conditions WHERE location = 'garage' ORDER BY time DESC LIMIT 10;
For sparse data where a column is often NULL, we suggest adding a `WHERE column IS NOT NULL` clause to the index (unless you are often searching for missing data). For example,
CREATE INDEX ON conditions (time DESC, humidity) WHERE humidity IS NOT NULL;
This creates a more compact, and thus efficient, index.
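For example, a query that only reads non-NULL humidity values (an illustrative query, assuming the partial index above) can take advantage of this index:
-- Fetch the most recent non-NULL humidity readings
SELECT time, humidity
  FROM conditions
  WHERE humidity IS NOT NULL
  ORDER BY time DESC
  LIMIT 100;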
Below are a few current limitations of our database, which we are actively working to resolve:
- Any user has full read/write access to the metadata tables for hypertables.
- Permission changes on hypertables are not correctly propagated.
- `create_hypertable()` can only be run on an empty table.
- `COPY`ing a dataset will currently put all data in the same chunk, even if chunk size goes over max size. For now we recommend breaking down large files for `COPY` (e.g., large CSVs) into smaller files that are slightly larger than max_chunk size (currently 1GB by default). We provide `scripts/migrate_data.sh` to help with this.
- Custom user-created triggers on hypertables are currently not allowed.
- `drop_chunks()` (see our API Reference) is currently only supported for hypertables that are not partitioned by space.
For more information on TimescaleDB's APIs, check out our API Reference.
If you want to contribute, please make sure to run the test suite before submitting a PR.
If you are running locally:
make installcheck
If you are using Docker:
make -f docker.mk test