liquidsunset/similarity_search

similarity_search

Build with make, then run:

similarity_search <input-file> <number of lines (sets) to search for common integers (tokens, words)> <jaccard-threshold (0..1)>

Demo with threshold 0.9 (nearly identical sets/lines):

./similarity_search dblp_first500.txt 50 0.9
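
To illustrate what the Jaccard threshold means, here is a minimal Python sketch. It is not the code in this repository: it assumes each line is a whitespace-separated set of tokens and compares every pair of lines by brute force, whereas set similarity joins normally use filtering to avoid that. The names `jaccard` and `similar_pairs` are hypothetical and exist only for this example.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similar_pairs(lines, threshold):
    """Return (i, j, similarity) for every pair of lines meeting the threshold.

    Naive O(n^2) comparison, for illustration only.
    """
    # Assumption: each line is one set of whitespace-separated tokens.
    sets = [set(line.split()) for line in lines]
    result = []
    for i, j in combinations(range(len(sets)), 2):
        sim = jaccard(sets[i], sets[j])
        if sim >= threshold:
            result.append((i, j, sim))
    return result


lines = [
    "1 2 3 4 5 6 7 8 9 10",
    "1 2 3 4 5 6 7 8 9 10",
    "1 2 3 4 5 6 7 8 9 11",
    "20 21 22",
]
print(similar_pairs(lines, 0.9))
```

With a threshold of 0.9, only the two identical lines are reported. The third line shares 9 of 11 distinct tokens with the first (similarity of about 0.82), which is below the threshold.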

Demo for the original implementation:

./set_sim_join --timings --statistics --whitespace '/home/liquid/similarity_search/enron.format' allpairs 0.9