Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup hash functions to be more idomatic #691

Merged
merged 1 commit into from
Mar 3, 2024

Conversation

allada
Copy link
Member

@allada allada commented Mar 2, 2024

Minor change to cleanup the way we deal with the hashing functions to be flexible and easier to work with.


This change is Reviewable

Copy link

vercel bot commented Mar 2, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
nativelink-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 2, 2024 11:03pm

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@zbirenbaum

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Local / ubuntu-22.04, Remote / large-ubuntu-22.04, Vercel, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, publish-image, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, zig-cc ubuntu-20.04, zig-cc ubuntu-22.04 (waiting on @zbirenbaum)

@zbirenbaum
Copy link
Contributor

There was one change to an import I wasn't sure about since it looks like all usages reference the more specific one there prior, but otherwise looks good

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04


nativelink-store/tests/completeness_checking_store_test.rs line 27 at r1 (raw file):

use nativelink_store::memory_store::MemoryStore;
use nativelink_util::common::DigestInfo;
use nativelink_util::digest_hasher::DigestHasherFunc::Blake3;

Keep this


nativelink-store/tests/completeness_checking_store_test.rs line 77 at r1 (raw file):

        let tree_digest =
            serialize_and_upload_message(&tree, pinned_cas, &mut DigestHasherFunc::Blake3.hasher()).await?;

Code snippet:

&mut Blake3.hasher()

nativelink-store/tests/completeness_checking_store_test.rs line 84 at r1 (raw file):

        };

        serialize_and_upload_message(&output_directory, pinned_cas, &mut DigestHasherFunc::Blake3.hasher()).await?;

&mut Blake3.hasher()


nativelink-store/tests/completeness_checking_store_test.rs line 99 at r1 (raw file):

        // The structure of the action result is not following the spec, but is simplified for testing purposes.
        let action_result_digest =
            serialize_and_upload_message(&action_result, pinned_ac, &mut DigestHasherFunc::Blake3.hasher()).await?;

&mut Blake3.hasher()

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 8 files at r1, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04


nativelink-util/src/digest_hasher.rs line 114 at r1 (raw file):

pub trait DigestHasher {
    /// Update the hasher with some additional data.
    fn update(&mut self, input: &[u8]);

u8 isn't long enough

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04


nativelink-util/src/digest_hasher.rs line 114 at r1 (raw file):

Previously, zbirenbaum (Zach Birenbaum) wrote…

u8 isn't long enough

Sorry misread this. It's fine


nativelink-worker/src/running_actions_manager.rs line 1048 at r1 (raw file):

            let data = execution_result.stderr;
            let digest = compute_buf_digest(&data, &mut hasher.into())
                .await

It doesn't look like this function was changed, so why was the await and error handling removed

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04


nativelink-worker/src/running_actions_manager.rs line 1038 at r1 (raw file):

            let data = execution_result.stdout;
            let digest = compute_buf_digest(&data, &mut hasher.into())
                .await

Here too

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04 (waiting on @zbirenbaum)


nativelink-util/src/digest_hasher.rs line 114 at r1 (raw file):

Previously, zbirenbaum (Zach Birenbaum) wrote…

Sorry misread this. It's fine

This is a slice of u8's. Any data of any size can be represented this way.

This is also the standard "arbitrary data" type in rust.

Example: https://docs.rs/sha2/latest/sha2/trait.Digest.html#tymethod.update

We could use AsRef<[u8]>, which is the exact same as we have here, but the template version.

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04 (waiting on @zbirenbaum)


nativelink-worker/src/running_actions_manager.rs line 1048 at r1 (raw file):

Previously, zbirenbaum (Zach Birenbaum) wrote…

It doesn't look like this function was changed, so why was the await and error handling removed

Because this function is no longer async.

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04 (waiting on @zbirenbaum)


nativelink-worker/src/running_actions_manager.rs line 1048 at r1 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

Because this function is no longer async.

Also, it can no longer error any more.

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04


nativelink-worker/src/running_actions_manager.rs line 1048 at r1 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

Also, it can no longer error any more.

Got it, I didn't see that change made in this PR so wasn't sure. LGTM then after the import thing is fixed

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Cargo Dev / ubuntu-22.04, asan / ubuntu-22.04 (waiting on @zbirenbaum)


nativelink-store/tests/completeness_checking_store_test.rs line 27 at r1 (raw file):

Previously, zbirenbaum (Zach Birenbaum) wrote…

Keep this

I did this because all other files do it this way. When there are different enum structs that have the same keys, i prefer to be verbose. In this case sometimes there's:
ProtoDigestFunction::Blake3 and DigestHasherFunc::Blake3

Depending on if the data being referenced is the protobuf version our local version.

On top of it all Blake3 is a type in the blake3 crate.

Minor change to cleanup the way we deal with the hashing functions
to be flexible and easier to work with.
Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Local / ubuntu-22.04, Vercel, asan / ubuntu-22.04, pre-commit-checks, publish-image, vale, zig-cc ubuntu-20.04


nativelink-store/tests/completeness_checking_store_test.rs line 27 at r1 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

I did this because all other files do it this way. When there are different enum structs that have the same keys, i prefer to be verbose. In this case sometimes there's:
ProtoDigestFunction::Blake3 and DigestHasherFunc::Blake3

Depending on if the data being referenced is the protobuf version our local version.

On top of it all Blake3 is a type in the blake3 crate.

Got it

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 1 of 1 LGTMs obtained, and pending CI: docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04)

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 1 LGTMs obtained, and pending CI: docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04)


nativelink-store/tests/completeness_checking_store_test.rs line 27 at r1 (raw file):

Previously, zbirenbaum (Zach Birenbaum) wrote…

Got it

I think ProtoDigestFunction is a private enum, and under the hood it uses TryInto to convert Self::Blake3 and Self::SHA256 to the DigestHasherFunc types from a generic Value. Do we need to do this specification if we are only importing from DigestHasherFunc and that enum uses the same underlying type?

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 8 files at r1.
Reviewable status: 1 of 1 LGTMs obtained

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 1 LGTMs obtained


nativelink-store/tests/completeness_checking_store_test.rs line 27 at r1 (raw file):

Previously, zbirenbaum (Zach Birenbaum) wrote…

I think ProtoDigestFunction is a private enum, and under the hood it uses TryInto to convert Self::Blake3 and Self::SHA256 to the DigestHasherFunc types from a Value. Do we need to do this generalization if we are only importing from DigestHasherFunc?

It's more about consistency. People tend to copy-pasta stuff around and I'd rather be consistent first and foremost.

Copy link
Contributor

@zbirenbaum zbirenbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 LGTMs obtained

@allada allada merged commit 8dd786a into TraceMachina:main Mar 3, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants