
similarity_search

Build with make, then run:

similarity_search <input-file> <number of lines (sets) to search for common integers (tokens, words)> <jaccard-threshold (0..1)>

Demo with threshold 0.9 (nearly identical sets/lines):

./similarity_search dblp_first500.txt 50 0.9
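
For orientation, here is a minimal sketch of what the Jaccard threshold measures. It is not the code in src; it assumes each input line is a whitespace-separated list of integer tokens, and the helper names parse_set and jaccard are hypothetical. Two lines count as similar when |A ∩ B| / |A ∪ B| reaches the threshold.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse one line of whitespace-separated integer tokens
// into a sorted, deduplicated vector (a set representation).
std::vector<int> parse_set(const std::string& line) {
    std::istringstream in(line);
    std::vector<int> tokens;
    for (int t; in >> t;) tokens.push_back(t);
    std::sort(tokens.begin(), tokens.end());
    tokens.erase(std::unique(tokens.begin(), tokens.end()), tokens.end());
    return tokens;
}

// Hypothetical helper: Jaccard similarity |A ∩ B| / |A ∪ B| of two sets
// given as sorted, deduplicated vectors.
double jaccard(const std::vector<int>& a, const std::vector<int>& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    // Assumption: two empty sets are treated as identical.
    if (a.empty() && b.empty()) return 1.0;
    std::vector<int> common;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(common));
    const std::size_t union_size = a.size() + b.size() - common.size();
    return static_cast<double>(common.size()) / static_cast<double>(union_size);
}
```

With a threshold of 0.9, the lines "1 2 3 4" and "2 3 4 5" would not match: they share 3 of 5 distinct tokens, a similarity of 0.6. A naive join would call jaccard on every pair of lines; the allpairs algorithm named in the original implementation's command line is designed to avoid comparing most pairs.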

Demo for the original implementation:

./set_sim_join --timings --statistics --whitespace '/home/liquid/similarity_search/enron.format' allpairs 0.9