TSXor: A Simple Time Series Compression Algorithm

Developed @ ISTI CNR - HPC Lab (Pisa)

Full paper here

Code

Introduction
How it works
Building the code
Input data format
Run
Benchmarks
Authors
Notes

Introduction

TSXor, a simple yet effective encoder/decoder for time series that achieves high compression ratios and fast decompression speed. TSXor exploits the redundancy/similarity between close-in-time values through a window that acts as a cache.

How it works

Window

TSXor compares each v_n value with its preceding W ≤ 127 values, logically corresponding to the values seen in the time range [t_n−W , t_n−1]. The goal is to compress v_n relative to this “window” containing the previous W values. We distinguish between 3 cases, namely Reference, XOR, and Exception.

Reference

If v_n is equal to a value in the window, just output its position p in the window. Since the window contains at most 127 values, 1 byte suffices to write the position with the most significant bit always equal to 0.

If the window does not contain v_n, then we search for the value u in the window such that x = vn ⨁ u has the largest number of leading (LZ) and trailing (TZ) zeros bytes. Let p be the position of u in the window. We first write p + 128 using 1 byte. In this case the most significant bit will always be 1 because of sum, which allows us to distinguish this case from the Reference case.

XOR

If LZ +TZ ≥ 2, we output a byte where 4 bits are dedicated to TZ and the other 4 bits to the length (in bytes) of the segment of x between the leading and trailing zero bytes. We then write such middle bytes.

Exception

We output an exception code, i.e., the value 255 using 1 byte, followed by the plain double-precision representation of v_n using 8 bytes.

Builiding the code

The code has been tested both on Linux and MacOS.

No dependencies are needed.

Just clone this repo and execute:

make all

Input data format

The algorithm can process any .csv file containing numbers only. You need first to convert the .csv into a .bin file using the csv_to_bin utility as follows:

cd util

./csv_to_bin.o path/to/MY_DATASET.csv

Please note: the first column will be interpreted as the timestamp, the rest will be interpreted as values.

Run

Compression

To run a compression test of a .bin file, execute the following commands:

cd test

./compression.o path/to/MY_DATASET.bin

This will produce a file called compressed_data.tsx

Decompression

To decompress the file compressed_data.tsx, run the command:

./decompression.o

Benchmarks

The following tables show the comparison between TSXor with respect to Gorilla by Facebook and FPC by Burtscher and Ratanaworabhan. The experiments were run on an Ubuntu 18.04 machine with Intel i7-7700 CPU @ 3.60GHz.

Compression Speeds (MB/s)

	FPC	Gorilla	TSXor
AMPds2	339,28	703,72	66,59
Bar Crawl	423,71	466,49	28,74
Max-Planck	313,40	870,58	51,74
Kinect	166,28	696,10	17,14
Oxford-Man	170,27	630,33	15,43
PAMAP	181,59	521,41	45,05
UCI Gas Sensor Array	286,94	654,32	21,93

Decompression Speeds (MB/s)

	FPC	Gorilla	TSXor
AMPds2	411,29	666,52	1173,65
Bar Crawl	436,12	447,42	709,68
Max-Planck	355,30	858,68	1057,00
Kinect	287,18	635,74	665,47
Oxford-Man	221,80	573,67	604,54
PAMAP	223,86	487,41	949,28
UCI Gas Sensor Array	454,91	578,41	642,40

Compression Ratios

	FPC	Gorilla	TSXor
AMPds2	1,10x	2,03x	6,39x
Bar Crawl	1,20x	1,44x	2,36x
Max-Planck	1,06x	2,97x	4,84x
Kinect	1,09x	1,41x	1,37x
Oxford-Man	1,06x	1,28x	1,30x
PAMAP	1,01x	1,38x	4,85x
UCI Gas Sensor Array	1,19x	1,23x	3,50x

Authors

Notes

This is a beta version. Use it at your own risk.

Name		Name	Last commit message	Last commit date
Latest commit History 111 Commits
benchmark		benchmark
core		core
img		img
lib		lib
test		test
util		util
.gitattributes		.gitattributes
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TSXor: A Simple Time Series Compression Algorithm

Code

Introduction

How it works

Window

Reference

XOR

Exception

Builiding the code

Input data format

Run

Compression

Decompression

Benchmarks

Compression Speeds (MB/s)

Decompression Speeds (MB/s)

Compression Ratios

Authors

Notes

About

Releases

Packages

Languages

andybbruno/TSXor

Folders and files

Latest commit

History

Repository files navigation

TSXor: A Simple Time Series Compression Algorithm

Code

Introduction

How it works

Window

Reference

XOR

Exception

Builiding the code

Input data format

Run

Compression

Decompression

Benchmarks

Compression Speeds (MB/s)

Decompression Speeds (MB/s)

Compression Ratios

Authors

Notes

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages