Reproducible Hashes in 0.8? #132

Closed
domenukk opened this issue Oct 18, 2022 · 4 comments

Comments

@domenukk

We are currently using ahash 0.7 to hash data that we store to disk, as fast as possible, and to deduplicate it between different processes (which might also be compiled differently).
It doesn't matter if the hash is attackable, since we are the only users in this case.
So far we have used new_with_keys(key1: u128, key2: u128) -> AHasher with 0, 0 as keys to create such reproducible hashes.
However, starting in 0.8, this method is private.

Are we holding it wrong/misunderstanding something?
What would be the fastest way to create such a reproducible ahasher in 0.8 and up?
Even RandomState::with_seed(0).build_hasher(); seems to internally use the keys.

Thank you

@rrichardson

I am in a similar boat. I've tried setting
default-features = false
and then doing:

static HASH_KEY: usize = 12345;

lazy_static! {
    static ref HASH_STATE: RandomState = { 
        set_random_source(NotRandomState).expect("Setting hash state to not-random");
        RandomState::with_seed(HASH_KEY)
    };
}
struct NotRandomState;

impl RandomSource for NotRandomState {
    fn gen_hasher_seed(&self) -> usize {
        HASH_KEY
    }
}

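// A and B must implement std::hash::Hash so they can be fed to the hasher.
fn hash_stuff<A: Hash, B: Hash>(a: A, b: B) -> u64 {
    let mut hasher = HASH_STATE.build_hasher();
    a.hash(&mut hasher);
    b.hash(&mut hasher);
    // finish() already returns a u64, so no conversion is needed.
    hasher.finish()
}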

This seems to produce self-consistent hashes between builds on the same platform.

However, on different platforms (e.g. aarch64) it produces different values.

Is there an easier or more reliable method of ensuring consistency?

@rrichardson

rrichardson commented Oct 21, 2022

@domenukk - digging into the code a bit further, it seems that using RandomState::with_seeds should achieve the old new_with_keys functionality.

e.g.

let st = RandomState::with_seeds(42, 1, 0, 5);
let hasher = st.build_hasher();

build_hasher just calls (private) AHasher::from_random_state (https://docs.rs/ahash/latest/src/ahash/aes_hash.rs.html#73)

However, I still don't get consistency across multiple platforms (and I suspect 0.7 is the same)

@tkaitchuck
Owner

with_seeds is the method you are looking for.
However, please note that aHash does not guarantee identical output across architectures, flags, or versions. For example, when 0.8.1 is released, it will have performance improvements, but the same data will hash to different values than it did in 0.8.0.
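
Since aHash makes no cross-architecture or cross-version guarantee, hashes that are persisted to disk and compared between differently compiled processes may need an algorithm whose output is fixed by its specification. The following is only a minimal sketch of that idea using FNV-1a (64-bit), which is not aHash and is likely slower. The names `Fnv1a64` and `stable_hash` are hypothetical and do not come from this thread or from the ahash crate. It hashes raw bytes directly rather than going through the `Hash` trait, because `Hash` implementations may feed platform-dependent data (such as `usize` lengths) to the hasher.

```rust
use std::hash::Hasher;

/// FNV-1a, 64-bit variant. Illustrative stand-in, NOT part of ahash.
/// Its output is defined byte-by-byte, so it does not depend on the
/// target architecture or on compiler flags.
struct Fnv1a64(u64);

impl Fnv1a64 {
    // Standard FNV-1a 64-bit parameters.
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    fn new() -> Self {
        Fnv1a64(Self::OFFSET_BASIS)
    }
}

impl Hasher for Fnv1a64 {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            // XOR the byte in first, then multiply (the "1a" ordering).
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(Self::PRIME);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Hypothetical helper: hash a byte slice to a value that is the same
/// on every platform. Callers serialize their data to bytes themselves.
fn stable_hash(data: &[u8]) -> u64 {
    let mut hasher = Fnv1a64::new();
    hasher.write(data);
    hasher.finish()
}
```

Because this sketch offers no resistance to crafted collisions, it only fits cases like the one described in this issue, where the hashed data is not attacker-controlled.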

@tkaitchuck
Owner

This is now better documented in 0.8.1:
https://docs.rs/ahash/0.8.1/ahash/random_state/struct.RandomState.html

rchildre3 added a commit to rchildre3/LibAFL that referenced this issue Feb 3, 2023
Reduces total number of packages from 577 to 571 on building with:
`cargo +nightly build --workspace --all-features`

* ahash 0.7 -> 0.8
  * Move `AHasher::new_with_keys` to `RandomState::with_seeds` given the
    recommendation from: aHash maintainer:
    tkaitchuck/aHash#132 (comment)

* bindgen: 0.61 -> 0.63

* c2rust-bitfields: 0.3 -> 0.17

* criterion: 0.3 -> 0.4

* crossterm: 0.25 -> 0.26

* dynasmrt: 1.2 -> 2

* frida-gum/frida-gum-sys
  * frida-gum:     0.8.1 -> 0.10
  * frida-gum-sys: 0.4.1 => 0.6
  * Update underlying frida version from 15.2 -> 16.0

* goblin: 0.5.3 -> 0.6

* hashbrown: 0.12 -> 0.13

* nix: 0.25 -> 0.26
  * The `addr` arg of `mmap` is now of type `Option<NonZeroUsize>`
  * The `length` arg of `mmap` is now of type `NonZeroUsize`

* prometheus-client: 0.18.0 -> 0.19
  * Do not box metrics
  * Gauges (a majority of the LibAFL metrics) are now i64 types so there
    is a small chance of overflow, with the u64 values that LibAFL
    tracks, but unlikely to be problematic.
 * Keep `exec_rate` as a floating point value

* serial_test: 0.8 -> 1

* typed-builder: 0.10.0 -> 0.12

* windows: 0.42.0 -> 0.44
rchildre3 added a commit to rchildre3/LibAFL that referenced this issue Feb 5, 2023
Reduces total number of packages from 577 to 571 on building with:
`cargo +nightly build --workspace --all-features`

* ahash 0.7 -> 0.8
  * Move `AHasher::new_with_keys` to `RandomState::with_seeds` given the
    recommendation from: aHash maintainer:
    tkaitchuck/aHash#132 (comment)

* bindgen: 0.61 -> 0.63

* c2rust-bitfields: 0.3 -> 0.17

* criterion: 0.3 -> 0.4

* crossterm: 0.25 -> 0.26

* dynasmrt: 1.2 -> 2

* goblin: 0.5.3 -> 0.6

* hashbrown: 0.12 -> 0.13

* nix: 0.25 -> 0.26
  * The `addr` arg of `mmap` is now of type `Option<NonZeroUsize>`
  * The `length` arg of `mmap` is now of type `NonZeroUsize`
  * Requires updating implementers to update `nix` as well

* prometheus-client: 0.18.0 -> 0.19
  * Do not box metrics
  * Gauges (a majority of the LibAFL metrics) are now i64 types so there
    is a small chance of overflow, with the u64 values that LibAFL
    tracks, but unlikely to be problematic.
 * Keep `exec_rate` as a floating point value

* serial_test: 0.8 -> 1

* typed-builder: 0.10.0 -> 0.12

* windows: 0.42.0 -> 0.44
domenukk added a commit to AFLplusplus/LibAFL that referenced this issue Feb 6, 2023
Co-authored-by: Dominik Maier <domenukk@gmail.com>