quickhash

Fast non-cryptographically secure file hash approximation

This provides an approximate hash for large files by reading N bytes for every X bytes of a file. Typically it hashes 4KB for every 2MB of a file. To cover the end of files it also hashes the last 4KB of the file.

For very large files, this is much faster than creating a hash for the entire file, and under certain assumptions, just as useful.

The main assumption is that your use case has no need for cryptographically secure hashes as it would be easy to craft collisions with intent.

The main use is to create a 'relatively' unique hash for a file that can help in identifying 'duplicate' files where you're not worried about bad actors. For example, you may receive largeish (1GB+) video clips that you need to hash quickly for a responsive app to determine if you've already seen that file or not before. This code will create an approximate MD5 hash about 40x to 60x faster than a full MD5 hash. By varying the amount of N bytes hashed and X bytes skipped, you can adjust to your use case.

I've made PHP and Perl versions, but basic idea could be easily expanded.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.editorconfig		.editorconfig
LICENSE		LICENSE
README.md		README.md
class.quickhash.php		class.quickhash.php
example_quickhash.php		example_quickhash.php
quickhash_functional.php		quickhash_functional.php

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

quickhash

About

Releases

Packages

Languages

License

filmo/quickhash

Folders and files

Latest commit

History

Repository files navigation

quickhash

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages