WIP: Use size check to quickly reject unequal files #3

Merged
merged 2 commits into from Oct 25, 2018


chadnetzer
Contributor

In multiple-mode, it can be much faster to first verify that the files cannot be equal by checking if their sizes don't match. In that case, it makes no sense to compute a hash for potentially large files.

Checking size first also ensures that files whose hashes match agree on size too, which is a much stronger assurance of equivalence for the crypto-hashed files.

@chadnetzer chadnetzer changed the title Use size check to quickly reject unequal files WIP: Use size check to quickly reject unequal files Oct 25, 2018
Contributor Author

chadnetzer commented Oct 25, 2018

Worth postponing for a bit until the fix in PR#5 (or a similar one) is applied.

The hash computations use MaxSize as a limit on their work, for example
when given a device file with unbounded I/O (e.g. /dev/zero).  A MaxSize
of zero should be replaced by the default value before getHash attempts
to compute a hash, so that the hash input is not truncated to zero bytes.

Rather than special handling for character devices, it may be preferable
for such cases to use CompareReader().  For example, what should the
expected result be when comparing /dev/zero to a 1K file of all zeros?
If MaxSize is not set, or is set to > 1K, they probably should not
match.  And if MaxSize is set to zero, they probably should match.
However, Stat() will return size 0 for /dev/zero (and other character
device files), so arguably CompareFiles() is not well defined on
character device files (and that's just for Unix).
@udhos udhos merged commit 347b3b5 into udhos:master Oct 25, 2018
Owner

udhos commented Oct 25, 2018

Good, thanks!
