A collection of algorithms that generate a signature (fingerprint or hash) for detecting duplicate and near-duplicate documents.
rust profile text fingerprint hash cargo deduplication de-duplication text-profile-signature lookup3
Updated Aug 9, 2017 - Rust
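
To give a sense of what a near-duplicate signature can look like, here is a minimal sketch that loosely follows the text-profile idea named in the topics above. It is not taken from the repository: the function names `profile_signature` and `fnv1a64`, the parameters `min_token_len` and `quant_rate`, and the quantization details are all assumptions made for illustration, and a simple FNV-1a hash stands in for lookup3 or any other hash the actual crate may use.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a. Used here only as a simple stand-in hash (an assumption,
/// not the hash used by the repository).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hypothetical helper: builds a coarse term-frequency profile of `text`
/// and hashes it, so that small edits tend to leave the result unchanged.
fn profile_signature(text: &str, min_token_len: usize, quant_rate: f32) -> u64 {
    // 1. Lowercase, split on non-alphanumeric characters, count tokens.
    let lowered = text.to_lowercase();
    let mut counts: HashMap<&str, u32> = HashMap::new();
    for token in lowered.split(|c: char| !c.is_alphanumeric()) {
        if token.chars().count() >= min_token_len {
            *counts.entry(token).or_insert(0) += 1;
        }
    }

    // 2. Derive a quantization step from the highest frequency.
    let max_freq = counts.values().copied().max().unwrap_or(0);
    let mut quant = (max_freq as f32 * quant_rate).round() as u32;
    if quant < 2 {
        quant = if max_freq > 1 { 2 } else { 1 };
    }

    // 3. Round counts down to a multiple of `quant`; drop tokens that fall below it.
    let mut profile: Vec<(&str, u32)> = counts
        .into_iter()
        .map(|(tok, n)| (tok, (n / quant) * quant))
        .filter(|&(_, n)| n >= quant)
        .collect();

    // 4. Sort by frequency (descending), then by token, for a stable order.
    profile.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));

    // 5. Serialize the profile and hash it.
    let mut buf = String::new();
    for (tok, n) in &profile {
        buf.push_str(tok);
        buf.push(' ');
        buf.push_str(&n.to_string());
        buf.push('\n');
    }
    fnv1a64(buf.as_bytes())
}
```

Because rare tokens are dropped and frequencies are rounded down, two documents that differ only in case, punctuation, or a few infrequent words are likely to produce the same signature under this scheme, while documents with different dominant vocabulary should not.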