Support blake3 / b3sum as hash #7765
Comments
Do you specifically want sftp to support executing

I've seen the same as you. A while ago I played around with using this as a hash for the local filesystem backend in rclone, but did not get consistently better performance, so I never finalized a PR for it. I/O, caching, etc. seemed to affect the results far more than the actual hash calculation. There might be niche cases where it could be relevant; I just didn't spend more time on it.

When speaking of hash performance, xxHash (XXH3) is also often part of the discussion, and is normally even faster, probably the fastest around currently. In contrast to blake3 it is not a cryptographic hash, so it is in a different league, but for file checksumming that may not be a requirement. As a curiosity, some even use a combination of both:
I just updated my previous experimental implementation, and pushed a draft #7767, which will create beta builds at https://beta.rclone.org/branch/add-xxh-blake-hash/ in case anyone feels like testing it out.
Having a tree-based hash is a very interesting idea, and one which, for example, the dropboxhash emulates in a simplistic way. The rclone internals aren't currently optimized for tree-based hashes though; they expect sequential hashes. I'm not sure the Go interface supports nonsequential hashes. However, getting sftp to support b3sum will work well in conjunction.

I have a slight concern about sftp startup times. Lots of people use sftp without a config file, which means that it probes for shells/supported hashes each time it is used. Perhaps we should delay hash support probing until it is asked for?
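To make the block-wise hashing idea mentioned in this comment a bit more concrete, here is a minimal Go sketch of a two-level hash: each fixed-size block is hashed independently (so blocks can be processed in parallel), and the block digests are then hashed together. The function name `blockHash`, the 4 MiB block size, and the use of SHA-256 are assumptions for illustration; this is not BLAKE3 and is not claimed to match rclone's dropboxhash implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// blockSize is an assumed chunk size, chosen only for illustration.
const blockSize = 4 * 1024 * 1024

// blockHash (hypothetical helper) hashes every size-byte block of data
// independently, then hashes the concatenation of the block digests.
// size must be greater than zero.
func blockHash(data []byte, size int) string {
	n := (len(data) + size - 1) / size // number of blocks, rounding up
	digests := make([][sha256.Size]byte, n)

	// One goroutine per block keeps the sketch short; real code would
	// bound the number of workers.
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			end := (i + 1) * size
			if end > len(data) {
				end = len(data)
			}
			digests[i] = sha256.Sum256(data[i*size : end])
		}(i)
	}
	wg.Wait()

	// The top-level hash covers the block digests in block order.
	top := sha256.New()
	for i := range digests {
		top.Write(digests[i][:])
	}
	return hex.EncodeToString(top.Sum(nil))
}

func main() {
	data := make([]byte, 10*1024*1024) // 10 MiB of zeros, i.e. three blocks
	fmt.Println(blockHash(data, blockSize))
}
```

Because each block's digest depends only on that block, the per-block work does not need the sequential `Write` ordering that a streaming hash interface assumes; only the final combining step is ordered.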
Good point, I agree we need to look into that if/when an additional hash is added to the sftp backend.
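The suggestion of delaying hash support probing until it is asked for could be sketched roughly as follows. This is only an illustration using `sync.Once`; the `hashProber` type and its `probe` callback are hypothetical names and do not describe how the sftp backend is actually structured.

```go
package main

import (
	"fmt"
	"sync"
)

// hashProber is a hypothetical type, not part of rclone. It wraps an
// expensive detection step (for example, running commands on a remote
// host) so that it runs at most once, and only when first needed.
type hashProber struct {
	once   sync.Once
	probe  func() []string // assumed detection callback
	hashes []string
}

// Supported runs the probe on first use and returns the cached result
// on every later call.
func (p *hashProber) Supported() []string {
	p.once.Do(func() { p.hashes = p.probe() })
	return p.hashes
}

func main() {
	calls := 0
	p := &hashProber{probe: func() []string {
		calls++
		return []string{"md5", "sha1"}
	}}

	fmt.Println("probes before first lookup:", calls)
	p.Supported()
	p.Supported()
	fmt.Println("probes after two lookups:", calls)
}
```

With this shape, creating the object costs nothing, and operations that never ask for a hash never pay for the probe.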
On second thought... I assumed you meant it probes hashes on each NewFs or similar, but I don't think it does?
Some parts of rclone, such as the SFTP checksum, currently support only `md5sum` and `sha1sum`. These are both very slow, necessarily sequential hashes.

BLAKE3 with `b3sum` is a tree hash and thus scales with CPUs, parallel single disk access (SSDs), and multi-disk array access (RAID, striped networked drives), e.g. > 6 GB/s single threaded from the official benchmark:

It would be nice if rclone could support `b3sum` as an alternative to `md5sum` and `sha1sum`.

There are other planned uses of it in rclone, e.g.:

And rclone already indirectly depends on the `blake3` Go package:

rclone/go.mod
Line 173 in cc3ae93
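As a rough illustration of what treating `b3sum` as an alternative to `md5sum` and `sha1sum` might involve on the parsing side, here is a small Go sketch that validates one line of checksum command output. It assumes all three tools print `<hex digest>  <filename>` and that `b3sum` is run with its default 256-bit output; `parseSumLine` and `hexLen` are hypothetical names, not rclone code.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// hexLen maps a command name to the expected digest length in hex
// characters (assumption: b3sum with its default 32-byte output).
var hexLen = map[string]int{"md5sum": 32, "sha1sum": 40, "b3sum": 64}

// parseSumLine (hypothetical helper) extracts and validates the digest
// from a single "<hex digest>  <filename>" output line.
func parseSumLine(cmd, line string) (string, error) {
	want, ok := hexLen[cmd]
	if !ok {
		return "", fmt.Errorf("unknown checksum command %q", cmd)
	}
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return "", fmt.Errorf("empty output from %s", cmd)
	}
	sum := strings.ToLower(fields[0])
	if len(sum) != want {
		return "", fmt.Errorf("%s: got %d hex characters, want %d", cmd, len(sum), want)
	}
	if _, err := hex.DecodeString(sum); err != nil {
		return "", fmt.Errorf("%s: digest is not valid hex: %w", cmd, err)
	}
	return sum, nil
}

func main() {
	// MD5 of empty input, as md5sum would print it for an empty file.
	sum, err := parseSumLine("md5sum", "d41d8cd98f00b204e9800998ecf8427e  empty.txt")
	fmt.Println(sum, err)
}
```

Checking the digest length per command is a cheap guard against accidentally accepting the output of the wrong tool or an error message as a checksum.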