Generalizing the Wasserstein Time Series Kernel Using Locality-Sensitive Hashing

This is the source code for my master's thesis, submitted on August 17, 2020. The thesis proposes a kernel method (LSH-WTK) that generalizes the Wasserstein Time Series Kernel (WTK). It builds on WTK by using the distribution of time series shapelets (subsequences). 'Histograms' of subsequences are created by clustering subsequences of similar shape with a novel clustering algorithm based on Locality-Sensitive Hashing (LSH). The files are organized as follows.

clustering

This folder contains the code that illustrates the LSH-based clustering algorithm on a synthetic time series. To generate the figures used in the thesis, execute the main.py script. You can also set a different subsequence length (--w), hash size (--hash_size), and number of hash rounds (--num_tables) for the clustering algorithm.

$ python main.py --w 15 --hash_size 15 --num_tables 30
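To give a feel for what these three parameters control, here is a minimal sketch of bucketing subsequences with random-hyperplane LSH. It is an illustration under stated assumptions, not the algorithm from the thesis: the function names sliding_windows and lsh_buckets, the z-normalisation step, and the choice of random hyperplanes as the hash family are all hypothetical. The code in this repository may hash and merge buckets differently.

```python
import numpy as np
from collections import defaultdict


def sliding_windows(series, w):
    """Return all length-w subsequences of a 1-D series as rows (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    return np.lib.stride_tricks.sliding_window_view(series, w)


def lsh_buckets(subsequences, hash_size, num_tables, seed=0):
    """Hash z-normalised subsequences with random hyperplanes (assumed hash family).

    Returns one dict per hash round, mapping a bit signature to the
    indices of the subsequences that share it.
    """
    rng = np.random.default_rng(seed)
    # z-normalise each row so the hash reflects shape rather than offset or scale
    mean = subsequences.mean(axis=1, keepdims=True)
    std = subsequences.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant subsequences: avoid division by zero
    normed = (subsequences - mean) / std

    tables = []
    for _ in range(num_tables):
        # hash_size random hyperplanes; each contributes one sign bit
        planes = rng.standard_normal((normed.shape[1], hash_size))
        bits = (normed @ planes) >= 0
        buckets = defaultdict(list)
        for idx, row in enumerate(bits):
            buckets[row.tobytes()].append(idx)
        tables.append(dict(buckets))
    return tables


# Synthetic example: a sine wave, hashed with the same values as the command above
series = np.sin(np.linspace(0, 4 * np.pi, 200))
subs = sliding_windows(series, w=15)
tables = lsh_buckets(subs, hash_size=15, num_tables=30)

# Bucket sizes of the first hash round, i.e. a crude 'histogram' of shapes
histogram = sorted((len(members) for members in tables[0].values()), reverse=True)
```

In this sketch a larger hash_size splits the subsequences into finer buckets, while more hash rounds give more independent partitions. How the rounds are combined into final clusters is left out here because it depends on the thesis algorithm.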

data

This folder contains the synthetic time series and Chinatown, a small dataset from the UEA & UCR archive, which can be used to run our kernelized SVM classifier.

lsh-wtk

This folder contains the source code of our classifier algorithm. To run LSH-WTK on the example dataset, simply execute the main.py script and provide the name of the dataset.

$ python main.py Chinatown

As LSH-WTK is a generalization of WTK, the latter method can be run by setting the percentile threshold (--pctl) to a value larger than 100.

$ python main.py Chinatown --pctl 101

results

This folder contains all accuracy results of our method reported in the thesis. analisys.py produces summary statistics and a histogram of the accuracy differences between LSH-WTK and WTK, and performs the Wilcoxon signed-rank test on those differences.
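The kind of paired comparison described above can be sketched as follows. This is not the content of analisys.py: the accuracy values are randomly generated placeholders (not results from the thesis), and the variable names are hypothetical. It only shows how per-dataset differences, a histogram of them, and a Wilcoxon signed-rank test fit together, assuming SciPy is available.

```python
import numpy as np
from scipy.stats import wilcoxon

# Placeholder accuracies, one entry per dataset.
# These are random numbers for illustration, NOT results from the thesis.
rng = np.random.default_rng(0)
acc_wtk = rng.uniform(0.60, 0.95, size=20)
acc_lsh_wtk = np.clip(acc_wtk + rng.normal(0.0, 0.02, size=20), 0.0, 1.0)

# Per-dataset accuracy differences and their summary statistics
diff = acc_lsh_wtk - acc_wtk
summary = {
    "mean": diff.mean(),
    "median": np.median(diff),
    "std": diff.std(ddof=1),
}

# Histogram of the differences (bin counts and bin edges)
counts, edges = np.histogram(diff, bins=5)

# Two-sided paired test: are the differences symmetric around zero?
stat, p_value = wilcoxon(acc_lsh_wtk, acc_wtk)
print(summary, p_value)
```

The signed-rank test is a common choice here because it compares the two methods dataset by dataset without assuming the differences are normally distributed.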

src

This folder contains the source code of the LSH-based clustering algorithm and LSH-WTK.

BorgwardtLab/LSH-WTK

Folders and files

Latest commit

History

Repository files navigation

Generalizing the Wasserstein Time Series Kernel Using Locality-Sensitive Hashing

clustering

data

lsh-wtk

results

src

About

Resources

Stars

Watchers

Forks

Languages