fast-cluster

* fast_cluster *

Cluster documents using locality-sensitive hashing (LSH) in linear time.

$ make
$ ./fast_cluster <file> <ngrams>

example:
$ find path_to_documents | xargs -I{} ./fast_cluster {} 5 | cut -f1 | sort -n | uniq -c
... list of count id pairs ...
$ find path_to_documents | grep '<interesting_id>'
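
To illustrate the idea of LSH clustering over n-grams, here is a minimal sketch. It is not the code in src: this README does not say which LSH scheme fast_cluster uses or whether its n-grams are characters or words. The sketch assumes MinHash over character n-grams, and the names minhash_signature and cluster_id are hypothetical. Each document is hashed once and grouped by the resulting id, with no pairwise comparison, which is what keeps the total work linear in the input size.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Hypothetical helper (not from fast_cluster): MinHash signature over the
// character n-grams of `text`. Slot k holds the minimum of the k-th mixed
// hash over all n-grams. Texts shorter than n keep the initial max values.
std::vector<std::uint64_t> minhash_signature(const std::string& text,
                                             std::size_t n,
                                             std::size_t num_hashes) {
    std::vector<std::uint64_t> sig(
        num_hashes, std::numeric_limits<std::uint64_t>::max());
    if (n == 0 || text.size() < n) return sig;

    std::hash<std::string> base_hash;
    for (std::size_t i = 0; i + n <= text.size(); ++i) {
        const std::uint64_t h = base_hash(text.substr(i, n));
        for (std::size_t k = 0; k < num_hashes; ++k) {
            // Derive the k-th hash by mixing the base hash with a
            // per-slot constant (an assumption, chosen for brevity).
            std::uint64_t x = h ^ (0x9e3779b97f4a7c15ULL * (k + 1));
            x ^= x >> 33;
            x *= 0xff51afd7ed558ccdULL;
            x ^= x >> 33;
            if (x < sig[k]) sig[k] = x;
        }
    }
    return sig;
}

// Hypothetical helper: fold a signature into one bucket id (FNV-1a style).
// Documents with identical signatures receive the same id.
std::uint64_t cluster_id(const std::vector<std::uint64_t>& sig) {
    std::uint64_t id = 1469598103934665603ULL;  // FNV offset basis
    for (std::uint64_t v : sig) {
        id ^= v;
        id *= 1099511628211ULL;  // FNV prime
    }
    return id;
}
```

A caller would read a file into a string, call cluster_id(minhash_signature(text, 5, 4)), and print the id followed by a tab and the file name, so the output could be counted per id with a sort and uniq pipeline like the one in the example. With few hash slots, more near-duplicate documents collide into one bucket; with many, only very similar documents do.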