Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make uniqueify_with generate deterministic macro names, to fix build failures #2797

Merged
merged 1 commit into from
May 29, 2024

Conversation

cormacrelf
Copy link
Contributor

@cormacrelf cormacrelf commented May 28, 2024

Overview

In a nutshell, some slightly exotic ways of compiling rocket with rustc fail due to nondeterminism in rocket_codegen leading to rustc correctly noticing that you've compiled two incompatible versions of rocket. The nondeterminsim is in IdentExt::uniqueify_with.

I believe you can fix this by just doing it deterministically. I don't think it needs to be random. Reviewers please help determine if the things that are hashed for the route macro are enough to avoid collisions in the generated code. Specifically these. My gut feeling is you will also want the HTTP method to be hashed.

let inner_macro_name = macro_name.uniqueify_with(|mut hasher| {
route.handler.sig.ident.hash(&mut hasher);
route.attr.uri.path().hash(&mut hasher);
route.attr.uri.query().hash(&mut hasher)
});

Background -- rocket

See this for why random identifiers are being generated at all. #964

It currently does so using a combination of

and some deterministic ones

The randomness seems like someone tried really hard to make it as random as possible. I don't think anything but the DefaultHasher's initialization was necessary to achieve that, and I put it to you that no randomness is required at all.

Background -- rustc

  • rustc hashes crates as it compiles them. It produces a "Strict Version Hash" that basically hashes the entire source code and every identifier, --cfg flag, --Cmetadata flag, all the dependencies, or something like that.
  • The crucial thing is that it hashes every identifier
  • Making random identifiers will produce a different SVH if you compile rocket twice with the exact same flags

My particular circumstances that led to experiencing this issue are:

  • Not using cargo
  • Using buck2 instead
  • Buck2 does two compile passes with rustc on every crate -- one for metadata only, and one for codegen.
    • Cargo does not, rather cargo check metadata is not reused for cargo build and when you do run cargo build, the metadata + codegen outputs are produced during the same rustc invocation. (Incidentally this is why running cargo check does not make a subsequent cargo build any faster.)
  • A library crate linking to rocket (the metadata build)
  • A binary crate linking to rocket (the full codegen build, with a different SVH)

The build errors you get look like found possibly newer version of crate rocket which thing depends on. Almost all of the mentions of this error online are due to people compiling libstd / libcore from scratch and linking the wrong version, or compiler developers mixing different stages. Rustc does very reliably compute SVH, so generally when it's telling you this, it's not lying.

Side note

Apart from the build failure, not ever being able to build identical copies of rocket / rocket applications is a problem if your build tool does caching and yeets the build products off to be used on other machines. That's a whole other can of beans.

Illustration

Not a repro because too many steps and buck2 is complicated. But I reduced rocket down to a few lines of code to discover the cause of the build failure, so may as well show you.

image

// rocket_src.rs
rocket_codegen::export! {
    macro_rules! some_macro {
        () => {};
    }
}
pub mod request {}
// lib.rs
pub use rocket::request;
// main.rs
// You get a more informative error by writing out the `extern crate`s
extern crate thing;
extern crate rocket;
fn main() {}

Notable compile steps:

# metadata
rustc rocket_src.rs --crate-name=rocket_src --crate-type=rlib -Zno-codegen ...

# codegen
rustc rocket_src.rs --crate-name=rocket_src --crate-type=rlib ...

# library build links to the metadata (rlib-pic-static_pic-metadata-full/...)
rustc lib.rs --crate-name=thing --crate-type=rlib
--extern=rocket_src=buck-out/v2/gen/root/5c688cbd5131fe95/thing/rocket_src/__rocket_src__/rlib-pic-static_pic-metadata-full/librocket_src-6901b3c9.rlib

# binary build links to the codegen (rlib-pic-static_pic-link/...)
rustc main.rs --crate-type=bin
--extern=rocket_src=buck-out/v2/gen/root/5c688cbd5131fe95/thing/rocket_src/__rocket_src__/rlib-pic-static_pic-link/librocket_src-6901b3c9.rlib

And the binary build fails as below:

error[E0460]: found possibly newer version of crate `rocket_src` which `thing` depends on
 --> ./main.rs:1:1
  |
1 | extern crate thing;
  | ^^^^^^^^^^^^^^^^^^^
  |
  = note: perhaps that crate needs to be recompiled?
  = note: the following crate versions were found:
          crate `rocket_src`: .../buck-out/v2/gen/root/5c688cbd5131fe95/thing/rocket_src/__rocket_src__/rlib-pic-static_pic-link/librocket_src-6901b3c9.rlib
          crate `thing`: .../buck-out/v2/gen/root/5c688cbd5131fe95/thing/__thing__/rlib-pic-static_pic-link/libthing-16aa3fe0.rlib

@cormacrelf
Copy link
Contributor Author

cormacrelf commented May 28, 2024

The following code will fail to compile after this change, because we made two identical macro calls and they get hashed to the same value.

Which makes me think we should just hash as many things as we can get our hands on -- until Span::source_file / line / column are stabilised.

#[macro_use]
extern crate rocket;

#[get("/")]
fn index() -> &'static str {
    "GET"
}

mod module {
    #[get("/")]
    pub fn index() -> &'static str {
        "whatever"
    }
}

#[launch]
fn rocket() -> _ {
    rocket::build().mount("/", routes![index])
}
error[E0428]: the name `rocket_uri_macro_index_3643259843385121855` is defined multiple times
  --> ./thing/main.rs:10:5
   |
4  | #[get("/")]
   | ----------- previous definition of the macro `rocket_uri_macro_index_3643259843385121855` here
...
10 |     #[get("/")]
   |     ^^^^^^^^^^^ `rocket_uri_macro_index_3643259843385121855` redefined here
   |
   = note: `rocket_uri_macro_index_3643259843385121855` must be defined only once in the macro namespace of this module
   = note: this error originates in the attribute macro `get` (in Nightly builds, run with -Z macro-backtrace for more info)

@SergioBenitez
Copy link
Member

SergioBenitez commented May 28, 2024

One thing you can try is hashing the debug output of the relevant Span, i.e. format!("{span:?}"). Today, this is roughly equivalent to file/line/column.

That being said, I'm not sure this is sufficient to avoid collisions. In particular, will this work if the route is generated by a declarative macro?

@cormacrelf cormacrelf force-pushed the master branch 2 times, most recently from cb9f324 to 8a1bb81 Compare May 28, 2024 05:17
@cormacrelf
Copy link
Contributor Author

Good idea! It seems to have worked. That test case now compiles. I guess I should add it as an actual test case.

@SergioBenitez
Copy link
Member

Can you add both that test and a test that generates routes from a macro_rules macro?

@cormacrelf
Copy link
Contributor Author

Done. I think that should cover it.

Comment on lines 84 to 85
// Use rustc_hash because it's deterministic, whereas the DefaultHasher
// in std is nondeterministic. This prevents problems where compiling
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this is necessarily true. DefaultHasher::new() always returns a SipHasher with keys of 0, so it should always hash to the same values and should thus be deterministic.

@SergioBenitez SergioBenitez merged commit 120d1e7 into rwf2:master May 29, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants