GrammarViz 2.0 public release:

GrammarViz 3.0 (the new version is out)

This is the public source code repository for GrammarViz 3.0. The code is released under GPL v2.0.

For a detailed description of the software, please visit our demo site.

0.0 In a nutshell

GrammarViz 3.0 is software for exploratory time series analysis, with both GUI and CLI interfaces. The GUI supports an interactive workflow for discovering variable-length recurrent and anomalous patterns in time series [4].

[Screenshot: GrammarViz2 screen]

It is implemented in Java and is based on continuous signal discretization with SAX [1], grammatical inference with Sequitur [2] and Re-Pair [3], and algorithmic (Kolmogorov) complexity.
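
To make the discretization step concrete, here is a minimal, self-contained sketch of SAX applied to a single window: z-normalize, reduce with PAA, then map each segment mean to a letter. It is not the GrammarViz implementation; the class and method names (`SaxSketch`, `znorm`, `paa`, `toWord`, `sax`) are hypothetical, the alphabet is fixed at four letters, and the window length is assumed to be divisible by the PAA size.

```java
class SaxSketch {

    // Breakpoints that split the standard normal distribution into 4 equiprobable regions.
    static final double[] CUTS_4 = {-0.6745, 0.0, 0.6745};

    /** Z-normalizes a window to zero mean and unit standard deviation. */
    static double[] znorm(double[] window) {
        double mean = 0.0;
        for (double v : window) mean += v;
        mean /= window.length;
        double var = 0.0;
        for (double v : window) var += (v - mean) * (v - mean);
        double sd = Math.sqrt(var / window.length);
        double[] out = new double[window.length];
        for (int i = 0; i < window.length; i++) {
            // A flat window has no shape to normalize; map it to all zeros.
            out[i] = sd < 1e-9 ? 0.0 : (window[i] - mean) / sd;
        }
        return out;
    }

    /** Piecewise Aggregate Approximation: the mean of each equal-width segment. */
    static double[] paa(double[] series, int segments) {
        int width = series.length / segments; // assumes an exact division
        double[] out = new double[segments];
        for (int s = 0; s < segments; s++) {
            double sum = 0.0;
            for (int i = s * width; i < (s + 1) * width; i++) sum += series[i];
            out[s] = sum / width;
        }
        return out;
    }

    /** Maps each PAA value to a letter by counting the breakpoints it exceeds. */
    static String toWord(double[] paaValues, double[] cuts) {
        StringBuilder word = new StringBuilder();
        for (double v : paaValues) {
            int idx = 0;
            while (idx < cuts.length && v > cuts[idx]) idx++;
            word.append((char) ('a' + idx));
        }
        return word.toString();
    }

    /** Full pipeline for one window with a 4-letter alphabet. */
    static String sax(double[] window, int paaSize) {
        return toWord(paa(znorm(window), paaSize), CUTS_4);
    }

    public static void main(String[] args) {
        double[] window = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(sax(window, 4)); // prints "abcd"
    }
}
```

In a sliding-window setting, a function like `sax` would be applied to each window in turn, and the resulting sequence of words becomes the input to grammar inference.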

In contrast with 2.0, GrammarViz 3.0 introduces grammar rule pruning and an automated procedure for selecting discretization parameters, both based on greedy grammar rule pruning and MDL (minimum description length). By sampling the space of possible parameters, it finds the parameter set that yields the most concise grammar that best describes the observed time series, a choice that is often close to optimal (here "concise" and "describing" are defined by other specific criteria).
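
To give a feel for what a "concise grammar" can mean, the sketch below runs a naive Re-Pair-style pair replacement over a sequence of SAX words and reports a simple grammar size (symbols left in the top-level sequence plus two per rule). This is only an illustration under stated assumptions: the class `RePairSketch`, its methods, and the size measure are hypothetical, and GrammarViz's actual pruning and MDL criteria are not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RePairSketch {

    /** Inferred rules in creation order, e.g. "R1" -> {"ab", "cd"}. */
    final Map<String, String[]> rules = new LinkedHashMap<String, String[]>();
    /** The top-level sequence left after all replacements. */
    List<String> axiom = new ArrayList<String>();

    /** Illustrative size: top-level symbols plus two symbols per rule body. */
    int size() {
        return axiom.size() + 2 * rules.size();
    }

    /** Repeatedly replaces the most frequent adjacent pair with a new rule. */
    static RePairSketch infer(List<String> words) {
        RePairSketch grammar = new RePairSketch();
        List<String> seq = new ArrayList<String>(words);
        String[] pair;
        while ((pair = mostFrequentPair(seq)) != null) {
            String name = "R" + (grammar.rules.size() + 1);
            grammar.rules.put(name, pair);
            seq = replaceAll(seq, pair, name);
        }
        grammar.axiom = seq;
        return grammar;
    }

    /** Returns the most frequent adjacent pair that occurs at least twice, or null. */
    static String[] mostFrequentPair(List<String> seq) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        Map<String, Integer> lastPos = new HashMap<String, Integer>();
        for (int i = 0; i + 1 < seq.size(); i++) {
            // Words are assumed to contain no spaces, so a space is a safe separator.
            String key = seq.get(i) + " " + seq.get(i + 1);
            Integer last = lastPos.get(key);
            if (last != null && last == i - 1) {
                continue; // overlaps the previous occurrence, e.g. the middle of "x x x"
            }
            lastPos.put(key, i);
            Integer seen = counts.get(key);
            counts.put(key, seen == null ? 1 : seen + 1);
        }
        String bestKey = null;
        int bestCount = 1;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) { // strict ">" keeps the earliest pair on ties
                bestKey = e.getKey();
                bestCount = e.getValue();
            }
        }
        return bestKey == null ? null : bestKey.split(" ");
    }

    /** Replaces non-overlapping occurrences of the pair, scanning left to right. */
    static List<String> replaceAll(List<String> seq, String[] pair, String name) {
        List<String> out = new ArrayList<String>();
        int i = 0;
        while (i < seq.size()) {
            boolean match = i + 1 < seq.size()
                    && seq.get(i).equals(pair[0]) && seq.get(i + 1).equals(pair[1]);
            out.add(match ? name : seq.get(i));
            i += match ? 2 : 1;
        }
        return out;
    }

    public static void main(String[] args) {
        RePairSketch repetitive = infer(Arrays.asList("ab", "cd", "ab", "cd", "ab", "cd"));
        RePairSketch irregular = infer(Arrays.asList("ab", "cd", "ba", "dc", "aa", "bb"));
        System.out.println(repetitive.axiom + " size=" + repetitive.size()); // [R1, R1, R1] size=5
        System.out.println(irregular.axiom + " size=" + irregular.size());   // [ab, cd, ba, dc, aa, bb] size=6
    }
}
```

A discretization that exposes repeated structure compresses into fewer symbols, which is the intuition behind preferring parameter sets that lead to a smaller grammar.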

It also implements the "Rule Density Curve" and "Rare Rule Anomaly (RRA)" algorithms for time series anomaly discovery [5], which significantly outperform HOT-SAX, the current state-of-the-art algorithm for time series discord discovery. In the table below, performance is measured as the number of calls to the distance function (fewer is better). The last column shows the reduction in distance calls achieved by RRA relative to HOT-SAX:

| Dataset and SAX parameters | Dataset size | Brute Force | HOT-SAX | RRA | Reduction |
| --- | --- | --- | --- | --- | --- |
| Daily commute (350,15,4) | 17,175 | 271,442,101 | 879,067 | 112,405 | 87.2% |
| Dutch power demand (750,6,3) | 35,040 | 1.13 * 10^9 | 6,196,356 | 327,950 | 95.7% |
| ECG 0606 (120,4,4) | 2,300 | 4,241,541 | 72,390 | 16,717 | 76.9% |
| ECG 308 (300,4,4) | 5,400 | 23,044,801 | 327,454 | 14,655 | 95.5% |
| ECG 15 (300,4,4) | 15,000 | 207,374,401 | 1,434,665 | 111,348 | 92.2% |
| ECG 108 (300,4,4) | 21,600 | 441,021,001 | 6,041,145 | 150,184 | 97.5% |
| ECG 300 (300,4,4) | 536,976 | 288 * 10^9 | 101,427,254 | 17,712,845 | 82.6% |
| ECG 318 (300,4,4) | 586,086 | 343 * 10^9 | 45,513,790 | 10,000,632 | 78.0% |
| Respiration, NPRS 43 (128,5,4) | 4,000 | 14,021,281 | 89,570 | 45,352 | 49.3% |
| Respiration, NPRS 44 (128,5,4) | 24,125 | 569,753,031 | 1,146,145 | 257,529 | 77.5% |
| Video dataset (150,5,3) | 11,251 | 119,935,353 | 758,456 | 69,910 | 90.8% |
| Shuttle telemetry, TEK14 (128,4,4) | 5,000 | 22,510,281 | 691,194 | 48,226 | 93.0% |
| Shuttle telemetry, TEK16 (128,4,4) | 5,000 | 22,491,306 | 61,682 | 15,573 | 74.8% |
| Shuttle telemetry, TEK17 (128,4,4) | 5,000 | 22,491,306 | 164,225 | 78,211 | 52.4% |
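
As a worked example of reading the table, the Reduction column appears to be the relative decrease in distance calls from HOT-SAX to RRA, that is, 1 - RRA / HOT-SAX. That formula is an assumption inferred from the numbers rather than something the source states; the hypothetical helper below reproduces the "Daily commute" row.

```java
import java.util.Locale;

class ReductionSketch {

    /** Percentage of distance calls saved by RRA relative to HOT-SAX (assumed formula). */
    static double reductionPercent(long hotSaxCalls, long rraCalls) {
        return 100.0 * (1.0 - (double) rraCalls / hotSaxCalls);
    }

    public static void main(String[] args) {
        // "Daily commute" row: 879,067 HOT-SAX calls versus 112,405 RRA calls.
        double reduction = reductionPercent(879067L, 112405L);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", reduction)); // prints 87.2%
    }
}
```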

References:

[1] Lin, J., Keogh, E., Wei, L. and Lonardi, S., Experiencing SAX: a Novel Symbolic Representation of Time Series. DMKD Journal, 2007.

[2] Nevill-Manning, C.G., Witten, I.H., Identifying Hierarchical Structure in Sequences: A linear-time algorithm. arXiv:cs/9709102, 1997.

[3] Larsson, N. J., Moffat, A., Offline Dictionary-Based Compression, Proceedings of the IEEE 88 (11): 1722–1732, doi:10.1109/5.892708, 2000.

Citing this work:

[4] Pavel Senin, Jessica Lin, Xing Wang, Tim Oates, Sunil Gandhi, Arnold P. Boedihardjo, Crystal Chen, and Susan Frankenstein. 2018. GrammarViz 3.0: Interactive Discovery of Variable-Length Time Series Patterns. ACM Trans. Knowl. Discov. Data 12, 1, Article 10 (February 2018), 28 pages. DOI: https://doi.org/10.1145/3051126

[5] Senin, P., Lin, J., Wang, X., Oates, T., Gandhi, S., Boedihardjo, A.P., Chen, C., Frankenstein, S., Lerner, M., Time series anomaly discovery with grammar-based compression, The International Conference on Extending Database Technology, EDBT 15.

1.0 Building

We use Maven and Java 7 to build an executable.

$ java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

$ mvn -version
Apache Maven 2.2.1 (rdebian-8)
Java version: 1.7.0_80
Java home: /usr/lib/jvm/java-7-oracle/jre
Default locale: fr_FR, platform encoding: UTF-8
OS name: "linux" version: "3.2.0-86-generic" arch: "amd64" Family: "unix"

$ mvn package -Psingle
[INFO] Scanning for projects...
....

[INFO] Building jar: /media/Stock/git/grammarviz2_src.git/target/grammarviz2-0.0.1-SNAPSHOT-jar-with-dependencies.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 5 seconds
[INFO] Finished at: Wed Jun 17 15:43:01 CEST 2015
[INFO] Final Memory: 47M/238M
[INFO] ------------------------------------------------------------------------

2.0 Running

To run the GrammarViz 3.0 GUI, use the net.seninp.grammarviz.GrammarVizGUI class, or run the jar from the command line (the -Xmx2g option caps the JVM heap at 2 GB):

$ java -Xmx2g -jar target/grammarviz2-0.0.1-SNAPSHOT-jar-with-dependencies.jar

3.0 CLI interface

Using the CLI, as discussed in these tutorials, you can save the inferred grammar, motifs, and discords.

Made with Aloha!
