Fast Jaccard similarity search for abstract sets (documents, products, users, etc.) using MinHashing and Locality Sensitive Hashing
Updated May 21, 2020 - Python
BagMinHash - Minwise Hashing Algorithm for Weighted Sets
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
Set of tasks solved in Big Data Algorithms course
SetSketch: Filling the Gap between MinHash and HyperLogLog
SuperMinHash: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation, Simhash and SimhashIndex
HyperLogLog in C++ and OpenMP for computing genome similarity using the Jaccard index
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
This contains all the projects that I did during my master's degree.
The goal of this project is to provide a tool for building new rankers easily and comparing them with existing ones in terms of result overlap.