

PyStore - Fast data store for Pandas timeseries data


PyStore is a simple (yet powerful) datastore for Pandas dataframes, and while it can store any Pandas object, it was designed with storing timeseries data in mind.

It's built on top of Pandas, Numpy, Dask, and Parquet (via Fastparquet) to provide an easy-to-use datastore for Python developers that can easily query millions of rows per second per client.

==> Check out this Blog post for the reasoning and philosophy behind PyStore, as well as a detailed tutorial with code examples.

==> Follow this PyStore tutorial in Jupyter notebook format.


Install PyStore

Install using pip:

$ pip install PyStore

Or upgrade using:

$ pip install PyStore --upgrade --no-cache-dir

INSTALLATION NOTE: If you don't have Snappy (a compression/decompression library) installed, you'll need to install it first.
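
A quick way to see whether the Python bindings for Snappy are already available is to look for the module before installing. The sketch below assumes the bindings come from the python-snappy package, which is imported as `snappy`; the helper name `has_module` is made up for illustration.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if has_module("snappy"):
    print("python-snappy found")
else:
    print("python-snappy is missing; install Snappy first")
```

This only checks that the Python module can be located. It does not verify that the underlying Snappy C library works.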

Using PyStore

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pystore
import quandl

# Set storage path (optional, default is `~/.pystore`)
pystore.set_path('/usr/share/pystore')  # example path

# List stores
pystore.list_stores()

# Connect to datastore (create it if it does not exist)
store = pystore.store('mydatastore')

# List existing collections
store.list_collections()

# Access a collection (create it if it does not exist)
collection = store.collection('NASDAQ')

# List items in collection
collection.list_items()

# Load some data from Quandl
aapl = quandl.get("WIKI/AAPL", authtoken="your token here")

# Store the first 100 rows of the data in the collection under "AAPL"
collection.write('AAPL', aapl[:100], metadata={'source': 'Quandl'})

# Reading the item's data
item = collection.item('AAPL')
data = item.data  # <-- Dask dataframe
metadata = item.metadata
df = item.to_pandas()

# Append the rest of the rows to the "AAPL" item
collection.append('AAPL', aapl[100:])

# Reading the item's data
item = collection.item('AAPL')
data = item.data
metadata = item.metadata
df = item.to_pandas()

# --- Query functionality ---

# Query available symbols based on metadata
collection.list_items(some_key='some_value', other_key='other_value')

# --- Snapshot functionality ---

# Snapshot a collection
# (Point-in-time named reference for all current symbols in a collection)
collection.create_snapshot('snapshot_name')

# List available snapshots
collection.list_snapshots()

# Get a version of a symbol given a snapshot name
collection.item('AAPL', snapshot='snapshot_name')

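# Delete a collection snapshot
collection.delete_snapshot('snapshot_name')

# ...

# Delete the item from the current version
collection.delete_item('AAPL')

# Delete the collection
store.delete_collection('NASDAQ')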
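
The metadata query shown above (`list_items` with keyword filters) can be thought of as an exact-match filter over each item's metadata dictionary. The sketch below illustrates that idea in plain Python. The function `filter_items` and the sample catalog are hypothetical, and this is not PyStore's implementation.

```python
def filter_items(catalog, **filters):
    """Return names of items whose metadata matches every key/value in `filters`."""
    return sorted(
        name
        for name, metadata in catalog.items()
        if all(metadata.get(key) == value for key, value in filters.items())
    )


# Hypothetical metadata, keyed by item name
catalog = {
    "AAPL": {"source": "Quandl"},
    "MSFT": {"source": "Quandl", "adjusted": True},
    "TSLA": {"source": "other"},
}

print(filter_items(catalog, source="Quandl"))
```

With no filters, every item matches, which mirrors calling `list_items()` without arguments.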


PyStore provides namespaced collections of data. These collections allow bucketing data by source, user or some other metric (for example frequency: End-Of-Day; Minute Bars; etc.). Each collection (or namespace) maps to a directory containing partitioned parquet files for each item (e.g. symbol).

A good practice is to create collections that look something like this:

  • collection.EOD
  • collection.ONEMINUTE
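
To make the mapping from collections to directories more concrete, here is a minimal sketch using only the standard library. It assumes a store/collection/item nesting under the default `~/.pystore` path. The function `item_dir` and the exact layout are illustrative assumptions, not PyStore's documented on-disk format.

```python
from pathlib import Path


def item_dir(store, collection, item, root="~/.pystore"):
    """Build the directory that would hold one item's parquet files (hypothetical layout)."""
    return Path(root).expanduser() / store / collection / item


# One collection per data frequency, as suggested above
for collection in ("EOD", "ONEMINUTE"):
    print(item_dir("mydatastore", collection, "AAPL"))
```

Keeping each frequency in its own collection means the same symbol name can be reused across collections without clashing.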


Requirements

  • Python >= 3.5
  • Pandas
  • Numpy
  • Dask
  • Fastparquet
  • Snappy (Google's compression/decompression library)

PyStore was tested to work on *nix-like systems, including macOS.


PyStore uses Snappy, a fast and efficient compression/decompression library from Google. You'll need to install Snappy on your system before installing PyStore.

* See the python-snappy GitHub repo for more information.

*nix Systems:

  • APT: sudo apt-get install libsnappy-dev
  • RPM: sudo yum install libsnappy-devel


macOS:

$ brew install snappy  # Snappy's C library
$ CPPFLAGS="-I/usr/local/include -L/usr/local/lib" pip install python-snappy


Windows:

Windows users should check out Snappy for Windows and this Stack Overflow post for help on installing Snappy and python-snappy.

Known Limitation

PyStore currently supports only the local filesystem. I plan on adding support for Amazon S3 (via s3fs), Google Cloud Storage (via gcsfs) and Hadoop Distributed File System (via hdfs3) in the future.


PyStore is hugely inspired by Man AHL's Arctic, which uses MongoDB for storage and allows for versioning and other features. I highly recommend you check it out.


PyStore is licensed under the Apache License, Version 2.0, a copy of which is included in LICENSE.txt.

I'm very interested in your experience with PyStore. Please drop me a note with any feedback you have.

Contributions welcome!

- Ran Aroussi