Minhashing is an efficient similarity estimation technique that is often used to identify near-duplicate documents in large text collections. This package offers a JavaScript implementation of the minhash algorithm and an efficient Locality Sensitive Hashing Index for finding similar minhashes in Node.js or web applications.
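To make the idea concrete, here is a minimal sketch of minhash similarity estimation in plain JavaScript. It does not use this package and is not its implementation: the function names (`seededHash`, `signature`, `estimateJaccard`, `exactJaccard`), the choice of hash function, and the default of 128 hash functions are all assumptions made for illustration. Each document is reduced to a signature holding the minimum hash value per hash function, and the fraction of matching signature positions approximates the Jaccard similarity of the two token sets.

```javascript
// Illustrative sketch only: this is NOT the minhash package's code or API.
// All function names below are hypothetical.

// Seeded 32-bit string hash (FNV-1a style loop plus a final bit mixer).
// Each distinct seed stands in for one of the hash functions minhash needs.
function seededHash(str, seed) {
  let h = (0x811c9dc5 ^ Math.imul(seed + 1, 0x9e3779b1)) >>> 0;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  // Final avalanche step so that similar inputs spread across the 32-bit range.
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force an unsigned 32-bit result
}

// Build a signature: for every hash function, keep the smallest hash
// value seen across all tokens of the document.
function signature(tokens, numHashes = 128) {
  const sig = new Array(numHashes).fill(Infinity);
  for (const token of tokens) {
    for (let i = 0; i < numHashes; i++) {
      const h = seededHash(token, i);
      if (h < sig[i]) sig[i] = h;
    }
  }
  return sig;
}

// Estimate Jaccard similarity as the fraction of signature positions that agree.
function estimateJaccard(sigA, sigB) {
  if (sigA.length !== sigB.length) throw new Error('signature lengths differ');
  let matches = 0;
  for (let i = 0; i < sigA.length; i++) {
    if (sigA[i] === sigB[i]) matches++;
  }
  return matches / sigA.length;
}

// Exact Jaccard similarity (intersection size over union size), for comparison.
function exactJaccard(tokensA, tokensB) {
  const a = new Set(tokensA);
  const b = new Set(tokensB);
  let shared = 0;
  for (const t of a) if (b.has(t)) shared++;
  return shared / (a.size + b.size - shared);
}

const docA = 'minhash estimates how similar two sets of words are'.split(' ');
const docB = 'minhash estimates how similar two bags of tokens are'.split(' ');
const estimate = estimateJaccard(signature(docA), signature(docB));
const exact = exactJaccard(docA, docB);
console.log({ estimate, exact });
```

The signature has a fixed length regardless of document size, which is what makes comparisons cheap on large collections. Using more hash functions generally tightens the estimate at the cost of more computation per document.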
To get started with Minhash.js, you can install the package with npm:
npm install minhash --save
The sample application uses minhash.js to compute the similarity between sample files.
There is also a sample Node.js script that can be run with node examples/index.js.
The test suite can be run with npm run test.
minhash.min.js can be compiled and minified with npm run build.
