tss

tss is a simple time series store built on top of MongoDB. It lets users store pandas DataFrames directly in MongoDB under a simple schema. Data is stored in MongoDB's native format so that programs written in other languages can read or modify it directly.

tss also supports DynamoDB as a backend in addition to MongoDB. More information can be found here.

tss uses two collections to store data in MongoDB:

series - stores the time series metadata and the metadata of its chunks (slices). Each document is defined as follows:

Attribute	Data Type	Notes
_id	ObjectId	id for the time series
frequency	String	the frequency of the data: 1d, 1m and 1s
name	String	the name of the time series
columns	String[]	the name of the columns
slices	array of slice	slices meta data

The slice subdocument is defined as:

Attribute	Data Type	Notes
id	ObjectId	_id of the document in the data collection that holds this slice's data
start	DateTime	start time of the slice
num_of_samples	int	the number of data points stored in this slice
is_sparse	boolean	whether the slice's data is stored in sparse form

data - stores the actual data. A slice can be stored in either sparse or non-sparse form (currently, only the sparse form is supported).

Attribute	Data Type	Notes
_id	ObjectId	id for slice
data	array	the actual data

For a sparse slice, each element of the data array is a subdocument defined as follows:

Attribute	Data Type	Notes
timestamp	DateTime	the timestamp of the data point
data	array	the data array representing a row of data
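To make the schema above concrete, the following sketch builds one series document and its data documents from plain Python rows. It does not use tss or MongoDB. The function name build_documents, the slice_size parameter, and the string ids standing in for ObjectId values are all assumptions made for illustration, and tss may split data into slices differently.

```python
from datetime import datetime


def build_documents(name, frequency, columns, rows, slice_size=1):
    """Return (series_doc, data_docs) laid out like the tables above.

    rows is a list of (timestamp, values) pairs sorted by timestamp.
    Ids are plain strings here; in MongoDB they would be ObjectId values.
    slice_size (rows per slice) is a hypothetical chunking rule.
    """
    series_doc = {
        "_id": "series-0",
        "frequency": frequency,
        "name": name,
        "columns": list(columns),
        "slices": [],
    }
    data_docs = []
    for i in range(0, len(rows), slice_size):
        chunk = rows[i:i + slice_size]
        slice_id = "slice-%d" % len(data_docs)
        # Slice metadata lives in the series document.
        series_doc["slices"].append({
            "id": slice_id,
            "start": chunk[0][0],
            "num_of_samples": len(chunk),
            "is_sparse": True,
        })
        # The rows themselves live in a separate data document.
        data_docs.append({
            "_id": slice_id,
            "data": [{"timestamp": ts, "data": list(values)}
                     for ts, values in chunk],
        })
    return series_doc, data_docs


rows = [
    (datetime(2017, 3, 8), [1, 2, 3]),
    (datetime(2017, 3, 9), [4, 5, 6]),
    (datetime(2017, 3, 10), [7, 8, 9]),
]
series_doc, data_docs = build_documents(
    "test1", "1d", ["col1", "col2", "col3"], rows)
```

Each entry in series_doc["slices"] points at exactly one document in data_docs through its id, which is the only link between the two collections.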

Examples:

Creating a new time series from pandas DataFrame:

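from datetime import datetime

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3

import numpy as np
import pandas as pd

from tss import utils

# Connect using the default (or environment-configured) MongoDB settings.
db = utils.get_mongo_db()

input_data = StringIO("""col1,col2,col3
1,2,3
4,5,6
7,8,9
""")
df = pd.read_csv(input_data, sep=",")

# Index the rows by timestamp, one row per day.
df['time'] = pd.Series([np.datetime64(datetime(2017, 3, 8)),
                        np.datetime64(datetime(2017, 3, 9)),
                        np.datetime64(datetime(2017, 3, 10))])
df.set_index(['time'], inplace=True)

# Store the DataFrame as series 'test1' with frequency '1d'.
result = utils.create_with_sparse_slices_from_df(df, 'test1', '1d', 3, db)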

By default, tss connects to MongoDB at localhost:27017 and uses the database name tss. These settings can be overridden with the environment variables MONGO_SERVER, MONGO_PORT, and MONGO_DB_NAME.
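The defaults and overrides described above can be sketched as a small lookup. This is not tss's own code: the helper name mongo_settings and the shape of its return value are hypothetical, and only the three variable names and their default values come from this README.

```python
import os


def mongo_settings(environ=None):
    """Resolve connection settings, falling back to the documented defaults.

    Hypothetical helper for illustration; tss may read these variables
    differently internally.
    """
    env = os.environ if environ is None else environ
    return {
        "server": env.get("MONGO_SERVER", "localhost"),
        "port": int(env.get("MONGO_PORT", "27017")),
        "db_name": env.get("MONGO_DB_NAME", "tss"),
    }


# No variables set: the defaults apply.
print(mongo_settings({}))
# Overriding only the database name leaves server and port at their defaults.
print(mongo_settings({"MONGO_DB_NAME": "tss_test"}))
```

Passing a dictionary instead of reading os.environ directly keeps the sketch easy to try without changing the real environment.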

In mongo, the data is stored as:

> db.series.findOne()
{ 
	"_id" : ObjectId("58c45d4af4a6b0054cecdad6"), 
	"frequency" : "1d", 
	"name" : "test1", 
	"columns" : [ "col1", "col2", "col3" ], 
	"slices" : [ 
		{ 
			"start" : ISODate("2017-03-08T00:00:00Z"), 
			"num_of_samples" : 1, 
			"id" : ObjectId("58c45d4af4a6b0054cecdad7"), 
			"is_sparse" : true 
		}, 
		{ 
			"start" : ISODate("2017-03-09T00:00:00Z"), 
			"num_of_samples" : 1, 
			"id" : ObjectId("58c45d4af4a6b0054cecdad8"), 
			"is_sparse" : true 
		}, 
		{ 
			"start" : ISODate("2017-03-10T00:00:00Z"), 
			"num_of_samples" : 1, 
			"id" : ObjectId("58c45d4af4a6b0054cecdad9"), 
			"is_sparse" : true 
		} 
	] 
}
> db.data.find({})
{ "_id" : ObjectId("58c45d4af4a6b0054cecdad7"), "data" : [ { "timestamp" : ISODate("2017-03-08T00:00:00Z"), "data" : [ 1, 2, 3 ] } ] }
{ "_id" : ObjectId("58c45d4af4a6b0054cecdad8"), "data" : [ { "timestamp" : ISODate("2017-03-09T00:00:00Z"), "data" : [ 4, 5, 6 ] } ] }
{ "_id" : ObjectId("58c45d4af4a6b0054cecdad9"), "data" : [ { "timestamp" : ISODate("2017-03-10T00:00:00Z"), "data" : [ 7, 8, 9 ] } ] }
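Because the documents are stored in native format, any client can reassemble the rows by joining each slice's id against the data collection. The sketch below does this over plain Python dictionaries shaped like the output above. The function name rows_from_documents is hypothetical, and the short string ids stand in for the ObjectId values.

```python
from datetime import datetime


def rows_from_documents(series_doc, data_docs):
    """Reassemble (timestamp, {column: value}) rows from the two collections."""
    by_id = {doc["_id"]: doc for doc in data_docs}
    rows = []
    # Walk the slices in time order and look up each slice's data document.
    for meta in sorted(series_doc["slices"], key=lambda s: s["start"]):
        for point in by_id[meta["id"]]["data"]:
            values = dict(zip(series_doc["columns"], point["data"]))
            rows.append((point["timestamp"], values))
    return rows


series_doc = {
    "name": "test1",
    "frequency": "1d",
    "columns": ["col1", "col2", "col3"],
    "slices": [
        {"id": "dad7", "start": datetime(2017, 3, 8), "num_of_samples": 1, "is_sparse": True},
        {"id": "dad8", "start": datetime(2017, 3, 9), "num_of_samples": 1, "is_sparse": True},
        {"id": "dad9", "start": datetime(2017, 3, 10), "num_of_samples": 1, "is_sparse": True},
    ],
}
# Data documents may arrive in any order; the join is by id.
data_docs = [
    {"_id": "dad9", "data": [{"timestamp": datetime(2017, 3, 10), "data": [7, 8, 9]}]},
    {"_id": "dad7", "data": [{"timestamp": datetime(2017, 3, 8), "data": [1, 2, 3]}]},
    {"_id": "dad8", "data": [{"timestamp": datetime(2017, 3, 9), "data": [4, 5, 6]}]},
]

rows = rows_from_documents(series_doc, data_docs)
for timestamp, values in rows:
    print(timestamp.date(), values)
```

Against a live database, the inputs would come from the series and data collections instead, for example through pymongo's find_one and find.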
