similarity_search

Build with make, then run:

similarity_search <input-file> <#lines (sets) in which to find common integers (tokens, words)> <jaccard-threshold (0..1)>

Demo with threshold 0.9 (nearly identical sets/lines):

./similarity_search dblp_first500.txt 50 0.9
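
To make the threshold concrete, here is a minimal Python sketch of what a Jaccard threshold of 0.9 means for lines treated as sets of integer tokens. It is not the code in this repository. It assumes each line is a whitespace-separated list of integers, and the names `jaccard`, `parse_line` and `similar_pairs` are hypothetical. It uses a naive all-pairs comparison rather than whatever filtering the actual tool applies.

```python
def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two token sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


def parse_line(line):
    """Parse one line of whitespace-separated integer tokens into a set (assumed format)."""
    return {int(tok) for tok in line.split()}


def similar_pairs(lines, threshold):
    """Compare every pair of lines; return (i, j, similarity) for pairs at or above threshold."""
    sets = [parse_line(line) for line in lines]
    result = []
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            sim = jaccard(sets[i], sets[j])
            if sim >= threshold:
                result.append((i, j, sim))
    return result


lines = [
    "1 2 3 4 5 6 7 8 9 10",
    "1 2 3 4 5 6 7 8 9 10 11",
    "1 2 3 20 21",
]
# Lines 0 and 1 share 10 of 11 distinct tokens (10/11, about 0.909), so they pass 0.9.
# The third line shares only 3 tokens with either of the others and is dropped.
pairs = similar_pairs(lines, 0.9)
```

With a threshold this high, only lines that differ in roughly one token out of ten or fewer are reported, which is why the demo describes them as nearly identical.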

Demo for the original implementation:

./set_sim_join --timings --statistics --whitespace '/home/liquid/similarity_search/enron.format' allpairs 0.9
