feat(mm): use blake3_single as default hashing algo #6020

Merged
merged 2 commits into main from psyche/feat/mm/hash-default-blake3_single on Mar 21, 2024

Conversation

psychedelicious
Collaborator

Summary

For SSDs, blake3 is about 10x faster than blake3_single - 30 files/second vs 3 files/second.

For spinning HDDs, blake3 is about 100x slower than blake3_single - 300 seconds/file vs 3 seconds/file.

For external drives, blake3 is always worse, but the difference is highly variable. For external spinning drives, it's probably way worse than internal.

The least offensive algorithm is blake3_single, and it's still much faster than any other algorithm.


Also rename blake3 to blake3_multi for clarity.
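For readers unfamiliar with the two variants, here is a minimal sketch of how the single-threaded and multi-threaded hashers can differ. It assumes the third-party `blake3` Python package; the function name `hash_file` and the fallback to `hashlib` are hypothetical and only for illustration, not necessarily how this PR implements it.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, algorithm: str = "blake3_single") -> str:
    """Return the hex digest of one file using the named algorithm."""
    if algorithm in ("blake3_single", "blake3_multi"):
        # Third-party dependency, imported lazily so hashlib-only use works without it.
        from blake3 import blake3

        if algorithm == "blake3_multi":
            # AUTO lets the library choose the thread count. Parallel reads
            # suit SSDs but can cause heavy seeking on spinning disks.
            hasher = blake3(max_threads=blake3.AUTO)
        else:
            # Default construction hashes on a single thread.
            hasher = blake3()
        # Memory-map the file rather than reading it into a Python buffer.
        hasher.update_mmap(path)
        return hasher.hexdigest()

    # Any other name is treated as a hashlib algorithm, e.g. "sha256".
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Both BLAKE3 variants produce the same digest for the same file; only the read pattern and thread count differ, which is why the choice of default matters mostly for disk behaviour.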

Related Issues / Discussions

Numerous discussions on discord from users with spinning disks or external drives.

QA Instructions

Enable memory db and start up. You should still get reasonable hashing speeds as it migrates everything.
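If you want a number rather than a feel for "reasonable", a rough timing harness like the one below can help. It is a sketch only: `files_per_second` and `sha256_file` are hypothetical helpers, and the SHA-256 stand-in is there so it runs with the standard library alone. Swap in whichever hasher you are testing.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Iterable


def files_per_second(paths: Iterable[Path], hash_fn: Callable[[Path], str]) -> float:
    """Hash every path with hash_fn and return the observed files/second."""
    count = 0
    start = time.perf_counter()
    for path in paths:
        hash_fn(path)
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0:
        return 0.0
    return count / elapsed if elapsed > 0 else float("inf")


def sha256_file(path: Path) -> str:
    # Stand-in hasher so the sketch needs no third-party packages.
    return hashlib.sha256(path.read_bytes()).hexdigest()
```

Keep in mind that a second run over the same files will likely be served from the OS page cache, so only the first pass reflects actual disk speed.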

Merge Plan

N/A

Checklist

  • The PR has a short but descriptive title
  • Tests added / updated
  • Documentation added / updated

@github-actions github-actions bot added the python, backend, services, python-tests, and docs labels Mar 21, 2024
@psychedelicious psychedelicious changed the title from "feat(mm): hash default blake3 single" to "feat(mm): use blake3_single as default hashing algo" Mar 21, 2024
Collaborator

@lstein lstein left a comment

LGTM

@psychedelicious psychedelicious enabled auto-merge (rebase) March 21, 2024 21:17
For SSDs, `blake3` is about 10x faster than `blake3_single` - 30 files/second vs 3 files/second.

For spinning HDDs, `blake3` is about 100x slower than `blake3_single` - 300 seconds/file vs 3 seconds/file.

For external drives, `blake3` is always worse, but the difference is highly variable. For external spinning drives, it's probably way worse than internal.

The least offensive algorithm is `blake3_single`, and it's still _much_ faster than any other algorithm.
Just make it clearer which is which.
@psychedelicious psychedelicious force-pushed the psyche/feat/mm/hash-default-blake3_single branch from 6946abd to 6a980e8 Compare March 21, 2024 21:17
@psychedelicious psychedelicious merged commit 72b44f7 into main Mar 21, 2024
14 checks passed
@psychedelicious psychedelicious deleted the psyche/feat/mm/hash-default-blake3_single branch March 21, 2024 21:26
Labels
backend, docs, python, python-tests, services
4 participants