Implement MinHash Matrix and Locality Sensitive Hashing (LSH) to estimate Jaccard similarity among documents and to identify near-duplicate documents.
ydengGitHub/Implementation-of-Large-Data-Set-Algorithm-in-Text-Analysis
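
The description above names two techniques: MinHash signatures for estimating Jaccard similarity, and LSH banding for finding near-duplicate candidates. The following is a minimal sketch of how they can fit together, not the repository's actual code. Everything in it is an assumption made for illustration: the function names (`shingles`, `minhash_signature`, `estimate_jaccard`, `lsh_candidates`), the use of word shingles as terms, CRC32 as the term-to-integer mapping, and the linear hash family `(a*x + b) mod p`.

```python
import random
import zlib
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # 2^31 - 1, modulus for the (assumed) linear hash family


def shingles(text, k=3):
    """Return the set of word k-shingles of a document (assumed term model)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) coefficients for num_hashes functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(terms, params):
    """One column of the MinHash matrix: the minimum hash value per hash function.

    Assumes `terms` is non-empty. CRC32 gives a term-to-integer mapping that
    is stable across runs, unlike Python's built-in hash() for strings.
    """
    ids = [zlib.crc32(t.encode("utf-8")) for t in terms]
    return [min((a * x + b) % PRIME for x in ids) for a, b in params]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two documents agree."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def lsh_candidates(signatures, bands):
    """Return pairs of document names that share a bucket in at least one band.

    `signatures` maps a document name to its MinHash signature. Each signature
    is split into `bands` bands of equal length (any remainder is ignored).
    """
    rows = len(next(iter(signatures.values()))) // bands
    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for name, sig in signatures.items():
            key = tuple(sig[band * rows:(band + 1) * rows])
            buckets[key].append(name)
        for names in buckets.values():
            candidates.update(combinations(sorted(names), 2))
    return candidates


if __name__ == "__main__":
    docs = {
        "d1": "the quick brown fox jumps over the lazy dog near the river bank",
        "d2": "the quick brown fox jumps over the lazy dog near the river shore",
        "d3": "an entirely different sentence about hashing large text collections",
    }
    params = make_hash_params(num_hashes=200, seed=42)
    term_sets = {name: shingles(text) for name, text in docs.items()}
    sigs = {name: minhash_signature(ts, params) for name, ts in term_sets.items()}
    for x, y in combinations(sorted(docs), 2):
        print(x, y, "exact:", round(jaccard(term_sets[x], term_sets[y]), 3),
              "estimated:", round(estimate_jaccard(sigs[x], sigs[y]), 3))
    print("candidate pairs:", sorted(lsh_candidates(sigs, bands=40)))
```

With `b` bands of `r` rows each, two documents whose Jaccard similarity is `s` become a candidate pair with probability roughly `1 - (1 - s^r)^b`, so the choice of `bands` trades false positives against missed near-duplicates. Candidate pairs would normally be re-checked with the estimated or exact similarity before being reported.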