Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Url hash values have changed between 0.2.34 and 0.2.35 #117

Closed
aidanhs opened this issue Jun 13, 2015 · 5 comments
Closed

Url hash values have changed between 0.2.34 and 0.2.35 #117

aidanhs opened this issue Jun 13, 2015 · 5 comments

Comments

@aidanhs
Copy link

aidanhs commented Jun 13, 2015

main.rs

#![feature(hash)]
extern crate url;

use url::Url;
use std::hash::{hash, SipHasher};

fn main() {
    let issue_list_url = Url::parse(
        "https://github.com/rust-lang/rust/issues?labels=E-easy&state=open"
    ).unwrap();
    println!("{}", hash::<_, SipHasher>(&issue_list_url));
}

Terminal session:

~/Desktop/rust $ cargo new --bin test
~/Desktop/rust $ cd test
~/Desktop/rust/test $ vim src/main.rs # edit main.rs per above
~/Desktop/rust/test $ echo '[dependencies]' >> Cargo.toml
~/Desktop/rust/test $ echo 'url = "0.2.34"' >> Cargo.toml
~/Desktop/rust/test $ cargo run
    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling rustc-serialize v0.3.15
   Compiling matches v0.1.2
   Compiling url v0.2.35
   Compiling test v0.1.0 (file:///home/aidanhs/Desktop/rust/test)
     Running `target/debug/test`
10598872291636267386
~/Desktop/rust/test $ sed -i 's/35/34/' Cargo.lock
~/Desktop/rust/test $ cargo run                                                                                                                                                                 
   Compiling url v0.2.34
   Compiling test v0.1.0 (file:///home/aidanhs/Desktop/rust/test)
     Running `target/debug/test`
11128727563018172050

Note the editing Cargo.lock by hand - url = "0.2.34" in Cargo.toml is not sufficient.

This is a pretty surprising change for a patch release! I'd expect a more significant version bump.
Possibly went unnoticed though? There don't seem to be any hashing tests.

@aidanhs
Copy link
Author

aidanhs commented Jun 13, 2015

Ah, commit 3d4430b / issue #113.

Wouldn't you consider this a backwards incompatible change? I'd be pretty nervous if two rust programs could miscommunicate url hashes (e.g. via network or via serialisation) just because they have a single patch version difference in a dependency.

@SimonSapin
Copy link
Member

I don’t know, are hash values significant? I’ve always thought of them as only meaningful for comparison with each other within one execution of a program and had not considered they could be relied on in anything serialized. Are they guaranteed to be the same on architectures? (Bit depth, endianness, etc.)

@aidanhs
Copy link
Author

aidanhs commented Jun 13, 2015

Interesting question.

Scenario: a bunch of website-hitting applications (let's say they're all written in rust) which contact a central caching server before actually making their requests. You need to choose a suitable cache key.

My personal expectation would be that using a hash would be acceptable, since that's what hashmaps do. I'd (unthinkingly) expect this to work consistently for all minor/patch variations on library versions, and across different architectures.

However, reading the documentation of the Hash trait gives no hints about whether this is expected or not. Perhaps this is something to raise on rust internals. Thoughts?

@aidanhs
Copy link
Author

aidanhs commented Jun 18, 2015

I've made my peace with this change after some discussion on rust internals - https://internals.rust-lang.org/t/stability-of-hash-values/2241.

It seems like my understanding of 'hash' is different to the popular one, so I'll see if I can improve the documentation of the hash trait to better spell it out for people like me.

@aidanhs aidanhs closed this as completed Jun 18, 2015
@SimonSapin
Copy link
Member

The serialization of URLs should be much more stable (since it affect interoperability and has a spec) so I’d recommend basing your hashing on that rather than the built-in Hash trait.

To avoid allocating a serialized string, you could have a specialized std::fmt::Write implementation that does the hashing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants