THIS REPO HAS MOVED TO https://github.com/apache/incubator-iotdb. TsFile is a columnar file format designed for time series data, which supports efficient compression and query. It is easy to integrate TsFile into your IoT big data processing frameworks.

TsFile Document

___________    ___________.__.__          
\__    ___/____\_   _____/|__|  |   ____  
  |    | /  ___/|    __)  |  |  | _/ __ \ 
  |    | \___ \ |     \   |  |  |_\  ___/ 
  |____|/____  >\___  /   |__|____/\___  >  version 0.7.1
             \/     \/                 \/  

Abstract

TsFile is a columnar storage file format designed for time series data, which supports efficient compression and query. It is easy to integrate TsFile into your IoT big data processing frameworks.

Motivation

Nowadays, IoT is becoming increasingly popular in areas such as Industry 4.0, Smart Home, wearables and Connected Healthcare. Compared with traditional IT infrastructure usage monitoring scenarios, applications like intelligent control and alarm reporting drive more advanced analytics requirements for the time series data generated by sensors. Especially as IoT moves into the industrial Internet, intelligent equipment produces one to two orders of magnitude more data than consumer-oriented IoT, and the analytics needed to get actionable insights become more complicated. As an illustrative example, a single wind turbine can generate hundreds of data points every 20 ms for fault detection or prediction, which data scientists perform through a set of sophisticated operations on time series, such as signal decomposition and filtration, segmentation for varied working conditions, pattern matching, frequency domain analysis, etc.

Recent advances in time series data management systems have been developed for data center monitoring. Currently there is no file format optimized specifically for time series data in the above scenarios, so TsFile was created. TsFile is a specially designed file format rather than a database. Users can open, write, read, and close a TsFile as easily as they would a normal file. In addition, more interfaces are available on a TsFile.

The goal of the TsFile project is to support: high ingestion rates of up to tens of millions of data points per second, with rare updates only for the correction of low-quality data; compact data packaging and deep compression for long-lived historical data; and traditional sequential and conditional queries, complex exploratory queries, signal processing, data mining and machine learning.

The features of TsFile are as follows:

  • Write
    • Fast data import
    • Efficient compression
    • Diverse data encoding types
  • Read
    • Efficient query
    • Time-sorted query data set
  • Integration
    • HDFS
    • Spark and Hive
    • etc.

Online Documents