MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Updated Jun 4, 2024 - Python
Quickly search, compare, and analyze genomic and metagenomic data sets.
Locality Sensitive Hashing using MinHash in Python/Cython to detect near-duplicate text documents
JS implementation of probabilistic data structures: Bloom Filter (and its derivatives), HyperLogLog, Count-Min Sketch, Top-K and MinHash
Weighted MinHash implementation on CUDA (multi-gpu).
Sketching Algorithms for Clojure (bloom filter, min-hash, hyper-loglog, count-min sketch)
Elasticsearch plugin for b-bit minhash algorithm
C++ Implementations of sketch data structures with SIMD Parallelism, including Python bindings
Union, intersection, and set cardinality in loglog space
Locality Sensitive Hashing In R
Quickly estimate the similarity between many sets
Easy-to-use Java similarity algorithms for text and numeric-series
Dynatrace hash library for Java
Detect and visualize text reuse
A method to mine beyond-pairwise relationships using Min-Hashing for large-scale pattern discovery
A resistome profiler for Graphing Resistance Out Of meTagenomes
Python 2.7 code and learning notes for Spark 2.1.1