Add BLAKE3 and xxHash (XXH3/XXH3-64 and XXH128/XXH3-128) #7767

Status: Open, 3 commits, base: master

albertony (Contributor) commented Apr 15, 2024

What is the purpose of this change?

Add support for BLAKE3 and xxHash (XXH3/XXH3-64 and XXH128/XXH3-128).

  • Added to rclone's hash library, with internal names blake3, xxh3 and xxh128.
  • Implicitly supported in rclone hashsum command.
  • Implicitly supported by the local backend.
  • Added local backend option --hashes to configure which hashes it should support.
    • E.g. :local,hashes=blake3:C:\Temp.
    • This can be used to test hashing performance of the different types of hashes, e.g. with local-to-local operations.
      • This is needed because the default, as before, is for the local backend to support every hash in rclone's hash library. When a hash is needed, it picks the first in the internally ordered list (currently md5), and for operations between two backends it picks the first hash they have in common. For local-to-local operations this means md5 is always used.
  • BLAKE3 support in SFTP backend.
    • Probes for blake3sum executable, or rclone hashsum blake3, configurable with blake3sum_command.

To experiment, beta builds are here: https://beta.rclone.org/branch/add-xxh-blake-hash/

TODO:

  • Name and type (e.g. repeated parameter vs comma-separated value) of the local backend option, and general consistency of options to specify hash types:
    • Name of the new local backend option introduced here is currently hashes, and it takes a comma-separated list.
    • The hasher backend already has a similar option, with same name hashes, and also takes a comma-separated list.
    • The lsjson command has option hash to enable hashes in the output, and option hash-type, which can be repeated, to specify which ones.
    • The lsf command has option hash to specify which hash to use when option format is used with keyword "h".
  • SFTP startup times, quoting "Support blake3 / b3sum as hash" #7765 (comment):
    "I have a slight concern about sftp startup times. Lots of people use sftp without a config file which means that it probes for shells/supported hashes each time it is used. Perhaps we should delay hash support probing until it is asked for?"
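
One possible shape for the delayed probing suggested in the last item is to wrap the probe in a `sync.Once`, so it runs only when hash support is first requested. This is a sketch under assumptions, not the SFTP backend's code: `lazyHashes`, its `probe` field and `Get` are hypothetical names, and the probe here is a stub instead of a remote command.

```go
package main

import (
	"fmt"
	"sync"
)

// lazyHashes defers an expensive capability probe until first use.
// All names here are hypothetical and not taken from rclone.
type lazyHashes struct {
	once  sync.Once
	probe func() []string // stand-in for running remote hash commands
	found []string
}

// Get runs the probe at most once and returns the cached result afterwards.
func (l *lazyHashes) Get() []string {
	l.once.Do(func() { l.found = l.probe() })
	return l.found
}

func main() {
	calls := 0
	l := &lazyHashes{probe: func() []string {
		calls++
		return []string{"md5", "blake3"}
	}}

	// Nothing is probed at construction time, so startup stays cheap.
	fmt.Println("probe calls after startup:", calls)

	l.Get()
	l.Get()
	fmt.Println("probe calls after two lookups:", calls)
}
```

The trade-off is that the first operation needing a hash pays the probing cost instead of startup, and operations that never need hashes pay nothing.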

Was the change discussed in an issue or in the forum before?

#7765 and https://forum.rclone.org/t/faster-non-cryptographic-hashing-algorithm-for-faster-file-comparison/23601

Checklist

  • I have read the contribution guidelines.
  • I have added tests for all changes in this PR if appropriate.
  • I have added documentation for the changes if appropriate.
  • All commit messages are in house style.
  • I'm done, this Pull Request is ready for review :-)
