Big data homework solutions
Updated Jun 21, 2017 - Python
Aurora karton for calculating minhash from input dataset.
Text similarity (Jaccard similarity, MinHash, LSH)
Assignment-2 for CS F469 Information Retrieval Course
👯 Algorithms using Jaccard similarity to identify questions from a list that are similar to one another
This is the repo for the Spring 2019 version of MCBS913
Scalable and Multifaceted Search and Its Application for Malware Binary Files
A simple MinHash implementation based on the explanation in the Mining of Massive Datasets course by Stanford
Aurora karton for similarity matching.
Attempt to use MinHash to find duplicates in an Elasticsearch index
Analysis of Massive Datasets FER labs
Fast Jaccard similarity search for abstract sets (documents, products, users, etc.) using MinHashing and Locality Sensitive Hashing