This repository contains the code accompanying the paper "Linear Time Complexity Time Series Clustering with Symbolic Pattern Forest" (Xiaosheng Li, Jessica Lin and Liang Zhao, IJCAI 2019). The paper proposes a time series clustering algorithm with linear time complexity.
To compile on a Linux system:
g++ -O3 SPF.cpp libmetis.a -std=c++11 -o SPF
Alternatively, use the precompiled binary SPF included in the folder directly.
./SPF [datasetname] [ensemble_size]
[datasetname] is the name of the dataset to run. Provide a folder named [datasetname] that contains a training file [datasetname]_TRAIN and a testing file [datasetname]_TEST, both in the UCR Archive format. [ensemble_size] is the ensemble size. See the FaceFour example included in this directory:
./SPF FaceFour 100
Output:
dataset:FaceFour, ensemble size:100
rand index: 1
The running time is: 1.860000seconds
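For readers unfamiliar with the reported score, the following is a minimal sketch of the standard Rand index: the fraction of item pairs on which two labelings agree (both place the pair together, or both place it apart). It is not taken from SPF.cpp; the function name `randIndex` and the pairwise loop are assumptions made for illustration, and the program may compute the score differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standard Rand index between two labelings of the same n items.
// Hypothetical helper for illustration; not part of SPF.cpp.
double randIndex(const std::vector<int>& truth, const std::vector<int>& pred) {
    assert(truth.size() == pred.size());  // both labelings must cover the same items
    const std::size_t n = truth.size();
    if (n < 2) return 1.0;  // no pairs, so nothing to disagree on

    std::size_t agree = 0;
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const bool sameTruth = (truth[i] == truth[j]);
            const bool samePred = (pred[i] == pred[j]);
            if (sameTruth == samePred) ++agree;  // pair treated the same way by both
        }
    }
    const double totalPairs = static_cast<double>(n) * static_cast<double>(n - 1) / 2.0;
    return static_cast<double>(agree) / totalPairs;
}
```

A value of 1 means the clustering matches the reference labels exactly, up to a renaming of the cluster ids. This pairwise evaluation is quadratic in the number of series and is separate from the clustering algorithm itself.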
The code reads each line of the input file into a char array buffer of 1000000 characters. If the time series are very long, a line of the input file may exceed this limit. In that case, increase the buffer size (MAX_PER_LINE, line 32 of SPF.cpp) accordingly.
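As an illustration of how a fixed line limit can be avoided, here is a hedged sketch that reads lines into a `std::string`, which grows as needed. It is not how SPF.cpp reads its input. It assumes each line holds a class label followed by the series values, separated by commas; pass a different delimiter (for example `'\t'`) if your files differ. The names `LabeledSeries`, `parseUcrLine` and `readSeries` are hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One time series with its class label (hypothetical type for this sketch).
struct LabeledSeries {
    int label;
    std::vector<double> values;
};

// Parse one non-empty line: first field is the label, the rest are values.
LabeledSeries parseUcrLine(const std::string& line, char delim = ',') {
    assert(!line.empty());  // callers are expected to skip blank lines
    std::stringstream fields(line);
    std::string field;
    LabeledSeries series{0, {}};
    bool first = true;
    while (std::getline(fields, field, delim)) {
        if (field.empty()) continue;  // tolerate a trailing or doubled delimiter
        const double v = std::stod(field);  // throws on non-numeric input
        if (first) {
            series.label = static_cast<int>(v);  // labels may be written as 1.0
            first = false;
        } else {
            series.values.push_back(v);
        }
    }
    return series;
}

// Read every line of the stream; std::string has no fixed per-line limit.
std::vector<LabeledSeries> readSeries(std::istream& in, char delim = ',') {
    std::vector<LabeledSeries> out;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // strip CR from CRLF files
        if (line.empty()) continue;
        out.push_back(parseUcrLine(line, delim));
    }
    return out;
}
```

To use it on a file, open a `std::ifstream` on, for example, FaceFour/FaceFour_TRAIN and pass it to `readSeries`.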
@inproceedings{ijcai2019-406,
title = {Linear Time Complexity Time Series Clustering with Symbolic Pattern Forest},
author = {Li, Xiaosheng and Lin, Jessica and Zhao, Liang},
booktitle = {Proceedings of the Twenty-Eighth International Joint Conference on
Artificial Intelligence, {IJCAI-19}},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
pages = {2930--2936},
year = {2019},
month = {7},
doi = {10.24963/ijcai.2019/406},
url = {https://doi.org/10.24963/ijcai.2019/406},
}