Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crate_universe has non hermetic tests #654

Closed
UebelAndre opened this issue Mar 29, 2021 · 2 comments
Closed

crate_universe has non hermetic tests #654

UebelAndre opened this issue Mar 29, 2021 · 2 comments

Comments

@UebelAndre
Copy link
Collaborator

UebelAndre commented Mar 29, 2021

The tests are currently relying on cargo being in the user's PATH and then appear to reach out to a crate registry to pull some metadata. cargo-raze had a similar issue but was solved by committing metadata dumps to the repo and writing wrappers around the objects that would perform metadata fetching to instead read from the file (google/cargo-raze#300). This rule should also have hermetic testing and not require network access.

@UebelAndre
Copy link
Collaborator Author

Continuing a conversation from #655 (comment)

and re control flow of main.rs: I'd prefer it to be more explicit. I currently have the concern that too much unnecessary data is going into the resolver

Yeah, when we stop the lockfile containing the whole rendered output, we can pass a lot less to the resolver.

and I'm worried that just as users can accidentally forget to hook something up to the hashing mechanism, that they can make a change to an underlying component and unknowingly have that affect the hash.

I think those things should all be very easy to notice in code review at the moment. Currently the rule is very simple: Don't open any sources of external data (e.g. file paths, network addresses) unless you were constructed with them. So if someone started reading a hard-coded file path, that would be bad, but easily noticed.

The two places that are allowed to violate this are:

1. Constructing Config - we deserialize it from an external path.

2. Env vars are currently handled in `Resolver` rather than `Config` - we should probably make `Config` be the thing to read and filter env vars, rather than `Resolver`, so that it doesn't violate the rule :)

The two corners which are trickier are:

1. Stuff that `cargo metadata` does. For instance, if it will read files at arbitrary paths (spoilers: it does), or read the contents of paths that env vars point at (it probably does that too, but we are making a plan!), those are things we need to be aware of.

2. Things which contain addresses, where we only digest the address and not the results of reading from it. e.g. if an env var contains an absolute path, we currently factor that absolute path into the hash, but not the contents of the file. By filtering out more env vars and such, I hope we'll get a bit better at this - I think one of the biggest holes we have at the moment is cargo workspaces, where we probably don't properly digest/process the non-primary `Cargo.toml` files. But that's definitely something we can add tests for as we improve support :)

Though perhaps proper testing can be used to solve for the latter? I'd be willing to adapt to things the way they are but would appreciate some thought-out tests with "mocked" objects. Maybe something for #654

I am very supportive of making our tests hermetic! The only real requirements we have that scrape at the edges of hermeticity are:

1. We need _some_ way of showing that the system actually works end-to-end - that could be by running the examples we have, but that should actually interface with crates.io (or another registry) and build real code.

2. We need some way of testing that resolution actually works. For instance, testing that resolving crate `foo` leads to also resolving its dependencies even if they're not listed in our `Cargo.toml`. We could go about this with a high-fidelity mock of the crates.io endpoints, or we could do it by using less restrictive tests, something like "Try to resolve `cargo-raze==0.9.0` and verify that one of the things we get back is `anyhow`, rather than verifying the exact versions of everything that's resolved.

There's always a tension between "how accurate are my mocks" vs "how reliable are my tests", and going all the way in either direction has issues :)

cc @illicitonion

@illicitonion
Copy link
Collaborator

This has since been fixed :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants