Cyyjenkins/time-series-pattern-detection

time-series-pattern-detection

An integrated framework for pattern detection in time-series data

This project implements a framework for finding maneuver patterns in time-series data. Note that the framework is not limited to lane-change sequences; it can be applied to any kind of time-series sequence.

The extraction task is carried out with a two-step framework. In the first step, time-series segmentation algorithms are applied, each separately, to split the lane-change series into blocks such that the samples within each block share a common trend. In the second step, the segmented blocks are clustered using several modified LDA (latent Dirichlet allocation) models. The original LDA algorithm is not used here because it cannot handle floating-point input: LDA expects discrete (enumerated) features.

Since many time-series segmentation algorithms with elegant mathematical structure already exist, we use the following models in our framework:

1. BMASS:

Bayesian model-based agglomerative sequence segmentation, created by @Gabriel Agamennoni. This model uses a greedy agglomerative search (similar to hierarchical clustering) to find a locally optimal marginal likelihood of the conjugate distribution parameters for the given samples.

2. BMOSS:

Bayesian model-based online sequence segmentation, also created by @Gabriel Agamennoni. It makes the same conjugate-distribution assumption as BMASS, but replaces the search method with the forward-backward algorithm (as used in HMMs).

3. HDP-HSMM:

Hierarchical Dirichlet process hidden semi-Markov model, created by @Matt Johnson. When an HMM is fitted to a time series, adjacent runs of samples with the same hidden state can be regarded as inferred segments, so an HMM can also be used to segment time-series sequences. Because a standard HMM tends to switch states frequently, a hidden semi-Markov model (HSMM) is used instead, which allows the hidden state to persist for longer periods. In addition, the hierarchical Dirichlet process (HDP), a Bayesian nonparametric prior, can be added to HMM-based models to infer the number of hidden states automatically, reducing manual intervention.
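The idea that runs of identical hidden states define segments can be sketched in a few lines. This is only an illustration and not code from this repository: the function name `states_to_segments` and the example state sequence are hypothetical, and the decoded states are assumed to come from whichever HMM-style model is used.

```python
from itertools import groupby


def states_to_segments(states):
    """Collapse a per-sample hidden-state sequence into (start, end, state) runs.

    `end` is exclusive, so samples[start:end] all share `state`.
    """
    segments = []
    start = 0
    for state, run in groupby(states):
        length = sum(1 for _ in run)  # number of consecutive samples in this run
        segments.append((start, start + length, state))
        start += length
    return segments


# Hypothetical decoded state sequence for a 10-sample series.
decoded = [0, 0, 0, 2, 2, 1, 1, 1, 1, 0]
print(states_to_segments(decoded))
# [(0, 3, 0), (3, 5, 2), (5, 9, 1), (9, 10, 0)]
```

A model that switches states often produces many short runs here, which is the motivation given above for preferring an HSMM over a plain HMM.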

Note that BMASS is implemented in MATLAB, so users must have MATLAB installed. In addition, the HDP-HSMM implementation uses the pyhsmm package, whose underlying C++ code has a number of unresolved problems on Windows. We therefore strongly recommend running this program in a Linux environment.
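To give a feel for the greedy agglomerative search described for BMASS, here is a minimal pure-Python sketch of bottom-up segmentation. It is not the BMASS algorithm: the Bayesian marginal likelihood is swapped for a simple sum-of-squared-errors cost, the stopping rule (a target number of segments) is an assumption, and all names are hypothetical.

```python
def sse(xs):
    """Sum of squared deviations from the mean (stand-in for a Bayesian score)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def greedy_agglomerative_segments(series, n_segments):
    """Start with one block per sample and repeatedly merge the adjacent pair
    whose merge increases the total cost the least (a greedy, local choice)."""
    bounds = [(i, i + 1) for i in range(len(series))]  # half-open blocks
    while len(bounds) > n_segments:
        best_i, best_delta = None, None
        for i in range(len(bounds) - 1):
            (a, b), (_, c) = bounds[i], bounds[i + 1]
            # Cost increase caused by merging blocks [a, b) and [b, c).
            delta = sse(series[a:c]) - sse(series[a:b]) - sse(series[b:c])
            if best_delta is None or delta < best_delta:
                best_i, best_delta = i, delta
        a, _ = bounds[best_i]
        _, c = bounds[best_i + 1]
        bounds[best_i:best_i + 2] = [(a, c)]  # replace the pair with the merged block
    return bounds


# Hypothetical series with three visibly different levels.
series = [0.0, 0.1, 0.0, 5.0, 5.1, 4.9, 10.0, 10.2]
print(greedy_agglomerative_segments(series, n_segments=3))
# [(0, 3), (3, 6), (6, 8)]
```

As with hierarchical clustering, each merge is only locally optimal, so the final segmentation is not guaranteed to be the global optimum.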

In the second step, we use two extended LDA models to cluster the segments:

1. GMM-LDA:

Gaussian mixture model-latent Dirichlet allocation, which is our own implementation in this framework. Clusters are learned from the input samples with a GMM, and the cluster label of each sample is then used as the input to LDA. Here we improved the GMM implementation created by @Jeremy M. Stober, since the original program is prone to numerical errors on large sample sets.

2. mLDA:

Multimodal latent Dirichlet allocation, created by @Naka Tomo. The topic-word distribution of each feature in the sample is modeled separately to obtain a juxtaposed (side-by-side) distribution model. Some construction and implementation details of the program have been adjusted to make the model more robust and to retain richer inferential information in the model output.
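The GMM-LDA idea of feeding cluster labels to LDA can be illustrated as follows. This sketch assumes that each segment plays the role of an LDA "document" and each GMM component label plays the role of a "word"; that mapping, the function name `segments_to_documents`, and the example data are assumptions for illustration, not the repository's code.

```python
from collections import Counter


def segments_to_documents(labels, segments, n_components):
    """Turn per-sample cluster labels into one word-count vector per segment.

    labels:       discrete cluster label for every sample (e.g. from a GMM)
    segments:     list of half-open (start, end) index pairs
    n_components: number of clusters, i.e. the LDA vocabulary size
    """
    docs = []
    for start, end in segments:
        counts = Counter(labels[start:end])
        docs.append([counts.get(k, 0) for k in range(n_components)])
    return docs


# Hypothetical labels for 8 samples, split into 3 segments.
labels = [0, 0, 1, 2, 2, 2, 1, 1]
segments = [(0, 3), (3, 6), (6, 8)]
print(segments_to_documents(labels, segments, n_components=3))
# [[2, 1, 0], [0, 0, 3], [0, 2, 0]]
```

The resulting integer count matrix is the kind of discrete input that LDA expects, which is how this approach sidesteps the floating-point limitation of the original algorithm.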

By default, this framework assumes that all data share the same feature dimensions, and it uses a single model instance to learn the characteristics of samples from different data sources. Readers can swap the algorithm used in each step according to their own needs.

If you encounter any problems, please contact us at cyychenyaoyu@163.com
