New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Blake3 digest support #403
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know and I'll show you how to review just the last commit (the other commits are already up in review).
Reviewable status: 0 of 40 files reviewed, all discussions resolved (waiting on @aaronmondal)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 13 of 13 files at r1, 4 of 4 files at r2, 34 of 34 files at r3, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @allada)
cas/cas_main.rs
line 646 at r3 (raw file):
global_cfg .default_digest_hash_function .unwrap_or(ConfigDigestHashFunction::sha256),
I wonder whether we'd want to make blake3 the default here. I'm aware that bazel defaults to SHA2 (and blake3 support was only just added in 6.4), but this would force users to explicitly set sha2
in their config. Considering that blake3 is ~15x faster than SHA2 this seems reasonable to me to make users think twice whether they actually want to keep using the significantly slower SHA2.
cas/scheduler/tests/utils/scheduler_utils.rs
line 42 at r3 (raw file):
}, skip_cache_lookup: false, digest_function: DigestHasherFunc::Sha256,
Is this intentionally hardcoded?
cas/worker/tests/local_worker_test.rs
line 216 at r3 (raw file):
platform_properties: PlatformProperties::default(), priority: 0, load_timestamp: SystemTime::UNIX_EPOCH,
Will this still work on Windows?
cas/worker/tests/local_worker_test.rs
line 305 at r3 (raw file):
}, skip_cache_lookup: true, digest_function: DigestHasherFunc::Sha256,
Note: Not for this commit, but we should look into a way to make tests parametric. This way we could "matrix-test" stuff like this.
config/cas_server.rs
line 550 at r3 (raw file):
/// by client. /// /// Default: ConfigDigestHashFunction::Sha256
Nit: I vote for making Blake3 the default and make users set sha2 if they want 15x slower hashing 😈
util/digest_hasher.rs
line 90 at r3 (raw file):
)), } }
4 variations of From
/TryFrom
seems overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @aaronmondal)
cas/cas_main.rs
line 646 at r3 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
I wonder whether we'd want to make blake3 the default here. I'm aware that bazel defaults to SHA2 (and blake3 support was only just added in 6.4), but this would force users to explicitly set
sha2
in their config. Considering that blake3 is ~15x faster than SHA2 this seems reasonable to me to make users think twice whether they actually want to keep using the significantly slower SHA2.
I would rather trade ease of use in exchange for slower builds.
Bazel and buck2 are the only clients that support blake3 to my knowledge and Bazel requires a command line argument to enable it.
Also, most companies cannot just upgrade bazel at any time, it's a big deal to go through an upgrade.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @aaronmondal)
cas/scheduler/tests/utils/scheduler_utils.rs
line 42 at r3 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Is this intentionally hardcoded?
Yes, its a test util.
cas/worker/tests/local_worker_test.rs
line 216 at r3 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Will this still work on Windows?
It should, Unix timestamps are supported on windows.
cas/worker/tests/local_worker_test.rs
line 305 at r3 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Note: Not for this commit, but we should look into a way to make tests parametric. This way we could "matrix-test" stuff like this.
I thought of that but was worried of the complexity it'd introduce and difficulty to debug.
Id use integration tests to validate entire builds to for matrix-like tests.
config/cas_server.rs
line 550 at r3 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
Nit: I vote for making Blake3 the default and make users set sha2 if they want 15x slower hashing 😈
but then most clients wont work.
Remember there is a capabilities property that tells the client we support blake3. the client can choose whatever. This is only used when none is specified.
util/digest_hasher.rs
line 90 at r3 (raw file):
Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…
4 variations of
From
/TryFrom
seems overkill.
I was trying to hide complexity here. Otherwise some places will need to define the complexity where the .into()
/.from()
is called.
9a0899e
to
302e547
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @allada)
cas/worker/tests/local_worker_test.rs
line 34 at r3 (raw file):
use action_messages::{ActionInfo, ActionInfoHashKey, ActionResult, ActionStage, ExecutionMetadata}; use common::{encode_stream_proto, fs, DigestInfo}; use config::cas_server::{LocalWorkerConfig, WrokerProperty};
WorkerProperty
is correct. This was previously changed in #390
302e547
to
87aee70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 22 of 22 files at r5, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @allada)
Fully support Blake3 digest upload format. This can increase performance significantly for cases where hashing files is a bottleneck. closes #395
87aee70
to
f684028
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r6, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @allada)
Fully support Blake3 digest upload format. This can increase performance significantly for cases where hashing files is a bottleneck. closes TraceMachina#395
Fully support Blake3 digest upload format. This can increase
performance significantly for cases where hashing files is a
bottleneck.
closes #395
This change is