
Shining up README and adding full reference

RasmusFonseca committed Apr 30, 2018
1 parent 98c79bb commit 890797a8b7cac805796986cfd3e5acbabe6b05bc
Showing with 23 additions and 47 deletions.
  1. +23 −47 README.md
@@ -1,67 +1,43 @@
# TICC
TICC is a python solver for efficiently segmenting and clustering a multivariate time series. For implementation details, refer to the paper [1].
TICC is a Python solver for efficiently segmenting and clustering a multivariate time series. It takes as input a T-by-n data matrix, a regularization parameter `lambda`, a smoothness parameter `beta`, the window size `w` and the number of clusters `k`. TICC breaks the T timestamps into segments where each segment belongs to one of the `k` clusters. The total number of segments is affected by the smoothness parameter `beta`. The segmentation is computed with an EM algorithm in which TICC alternately assigns points to clusters using a dynamic programming algorithm and updates the cluster parameters by solving a Toeplitz Inverse Covariance Estimation problem.
----
The TICC method takes as input a T-by-n data matrix, a regularization parameter "lambda" and smoothness parameter "beta", the window size "w" and the number of clusters "k". TICC breaks the T timestamps into segments where each segment belongs to one of the "k" clusters. The total number of segments is defined by the smoothness parameter "beta". It does so by running an EM algorithm where TICC alternately assigns points to clusters using a DP algorithm and updates the cluster parameters by solving a Toeplitz Inverse Covariance Estimation problem. The details can be found in the paper.
For details about the method and implementation see the paper [1].
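The role of `beta` in the assignment step can be illustrated with a small dynamic program. This is a simplified sketch, not the repository's implementation: the function name `assign_clusters` and the cost matrix are hypothetical, and the per-point costs are assumed to be given (in the paper they come from the cluster likelihoods). Each switch between clusters is charged `beta`, so a larger `beta` yields fewer segments.

```python
def assign_clusters(costs, beta):
    """Return the cluster sequence minimizing total cost plus beta per switch.

    costs[t][c] is a hypothetical cost of assigning time point t to cluster c.
    beta is assumed to be non-negative.
    """
    T, k = len(costs), len(costs[0])
    best = list(costs[0])  # best[c]: minimal cost of a path ending in cluster c
    back = []              # back[t - 1][c]: predecessor of cluster c at time t
    for t in range(1, T):
        cheapest = min(range(k), key=lambda c: best[c])
        new_best, pointers = [], []
        for c in range(k):
            # Either stay in c, or switch from the cheapest cluster and pay beta.
            if best[c] <= best[cheapest] + beta:
                prev, base = c, best[c]
            else:
                prev, base = cheapest, best[cheapest] + beta
            new_best.append(base + costs[t][c])
            pointers.append(prev)
        best = new_best
        back.append(pointers)

    # Backtrack from the cheapest final cluster to recover the full path.
    c = min(range(k), key=lambda j: best[j])
    path = [c]
    for pointers in reversed(back):
        c = pointers[c]
        path.append(c)
    path.reverse()
    return path


# Toy costs: time point 2 prefers cluster 1, all others prefer cluster 0.
toy_costs = [[0, 1], [0, 1], [1, 0], [0, 1], [0, 1]]
no_penalty = assign_clusters(toy_costs, beta=0)
high_penalty = assign_clusters(toy_costs, beta=10)
```

With `beta=0` every point goes to its cheapest cluster, which splits the toy series into three segments; with `beta=10` the single outlier is absorbed and one segment remains.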
Download & Setup
======================
## Download & Setup
Download the source code, by running in the terminal:
```
git clone https://github.com/davidhallac/TICC.git
```
Using TICC
======================
Alternatively, install the python library using
```
TICC()
pip install ticc
```
Initializes problem:
**Parameters**
window_size : the size of the sliding window
number_of_clusters: the number of underlying clusters 'k'
lambda_parameter: sparsity of the MRF for each of the clusters. The sparsity of the inverse covariance matrix of each cluster.
beta: The switching penalty used in the TICC algorithm. Same as the beta parameter described in the paper.
maxIters : the maximum iterations of the TICC algorithm before convergence. Default value is 100.
threshold: convergence threshold
write_out_file : Boolean. Flag indicating if the computed inverse covariances for each of the clusters should be saved.
prefix_string: Location of the folder to which you want to save the outputs.
```
TICC.fit()
```
Runs the TICC algorithm on a specific dataset to learn the model parameters.
**Parameter**
## Using TICC
The `TICC` constructor takes the following parameters:
input_file: Location of the Data matrix of size T-by-n.
* `window_size`: the size of the sliding window
* `number_of_clusters`: the number of underlying clusters 'k'
* `lambda_parameter`: regularization parameter controlling the sparsity of the Markov Random Field (MRF), i.e. of the inverse covariance matrix, of each cluster.
* `beta`: The switching penalty used in the TICC algorithm. Same as the beta parameter described in the paper.
* `maxIters`: the maximum number of iterations the TICC algorithm runs if it has not converged earlier. Default value is 100.
* `threshold`: convergence threshold
* `write_out_file`: Boolean. Flag indicating if the computed inverse covariances for each of the clusters should be saved.
* `prefix_string`: Location of the folder to which you want to save the outputs.
**Returns**
The `TICC.fit(input_file)` function runs the TICC algorithm on a specific dataset to learn the model parameters.
returns an array of cluster assignments for each time point.
* `input_file`: Location of the data matrix of size T-by-n.
returns a dictionary with keys being the cluster_id (from 0 to k-1) and the values being the cluster MRFs.
The function returns an array of cluster assignments for each time point, together with a dictionary whose keys are the `cluster_id` (from `0` to `k-1`) and whose values are the cluster MRFs.
----
Example Usage
======================
## Example Usage
See example.py for proper usage of TICC.
See `example.py`.
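For orientation, a minimal sketch of a call using the parameters documented in this README. Several things here are assumptions rather than confirmed API: the import path `TICC_solver` (it may differ for the pip-installed package), the numeric values and the output folder (illustrative placeholders only), the helper name `run_ticc`, and the two-value return of `fit`.

```python
# Hypothetical parameter values; tune them for your own data.
ticc_params = {
    "window_size": 1,
    "number_of_clusters": 8,
    "lambda_parameter": 0.11,
    "beta": 600,
    "maxIters": 100,
    "threshold": 2e-5,
    "write_out_file": False,
    "prefix_string": "output_folder/",
}


def run_ticc(input_file, params=ticc_params):
    """Fit TICC on the T-by-n data matrix stored at `input_file`."""
    # Assumption: the solver module from the cloned repository is importable.
    from TICC_solver import TICC

    ticc = TICC(**params)
    # Assumption: fit returns the assignments and the cluster MRFs together.
    cluster_assignment, cluster_MRFs = ticc.fit(input_file=input_file)
    return cluster_assignment, cluster_MRFs
```

The import sits inside `run_ticc` so that the parameter dictionary can be inspected or reused without the solver being installed.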
References
==========
[1] TICC paper : http://stanford.edu/~hallac/TICC.pdf
## References
[1] D. Hallac, S. Vare, S. Boyd, and J. Leskovec. [Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data](http://stanford.edu/~hallac/TICC.pdf). Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 215--223.
