Fix #3658 .cargo/config prevents freshness (sort after HashMap) #3659
Conversation
lilith
added some commits
Feb 6, 2017
rust-highfive
assigned
brson
Feb 7, 2017
rust-highfive
commented
Feb 7, 2017
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @brson (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
lilith
added some commits
Feb 7, 2017
|
Fixes #3658
One change to note (specific to TOML configuration only; this PR does not touch other sources, like build output): while the order between rustc-link-lib and rustc-flags values was always undefined, the order of link paths/libs within a single rustc-flags value was maintained. Now we sort them all. Thus, the order of link paths and libraries may change when specifying multiple values on a single line.
To preserve this order without destroying all caching, we would need ordered access to (or to duplicate, then sort) the TOML table used in scrape_target_config. Specifically, replace this HashMap with something order-preserving: https://github.com/rust-lang/cargo/blob/master/src/cargo/util/config.rs#L423 |
lilith
referenced this pull request
Feb 7, 2017
Closed
Adds world to the link path; bad neighbor #447
alexcrichton
reviewed
Feb 7, 2017
/// Suggested if populated from a HashMap instead of an order-preserving data source
pub fn sort(&mut self){
    self.library_paths.sort();
    self.library_links.sort();
alexcrichton
Feb 7, 2017
Member
I believe the ordering of this and library_paths above can be quite significant. Would it be possible to preserve the original orderings rather than sorting these?
|
Thanks for the PR! The fix looks right to me. Can you clarify for me, though, where the nondeterminism is coming from? I don't quite see how that hash map would connect to nondeterminism in the output just yet. |
|
If you run the failing unit test, you should see randomized ordering (without the patch). From code review, the loss of order would have to occur in TOML parsing or in a HashMap.
This is the data source: `pub fn get_table(&self, key: &str) -> CargoResult<Option<Value<HashMap<String, CV>>>>`
Playground demonstrating different HashMap ordering based on execution order: https://is.gd/0gM9zr
HashMap pairs are dumped into vectors, then hashed.
|
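The last point, that vectors filled from a HashMap are later hashed, can be illustrated with a minimal sketch. It assumes an order-sensitive hash over a list, computed here with the standard library's DefaultHasher; the `fingerprint` helper and the sample paths are hypothetical and are not Cargo's actual code. The two arrays stand in for two iteration orders that the same HashMap contents might produce.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hash a list of strings in the order given.
fn fingerprint(items: &[&str]) -> u64 {
    let mut hasher = DefaultHasher::new();
    items.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // The same entries, as two different HashMap iteration orders might yield them.
    let first_run = ["native=/opt/lib", "native=/usr/lib"];
    let second_run = ["native=/usr/lib", "native=/opt/lib"];

    // Hashing a slice feeds the elements to the hasher in sequence, so a
    // different order produces a different byte stream for the hasher.
    let same = fingerprint(&first_run) == fingerprint(&second_run);
    println!("same fingerprint: {}", same);
}
```

If the fingerprint is used as a freshness check, a change in iteration order alone would be enough to make an unchanged build look stale.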
|
FWIW, I would LOVE to preserve ordering. What should we replace HashMaps with? |
|
Is there a way to use a custom build of cargo with an otherwise standard nightly toolchain from rustup? |
|
Yes, hash maps introduce nondeterminism, but Cargo's full of hash maps and we shouldn't change them all! The actual problem here is not that all arrays need to be sorted (which is unfortunately incorrect wrt linking) but that just one array needs sorting. That metadata array can be created in a nondeterministic order, but all other keys should always be deterministic. Can you change this PR to only sort that one array in that one location? Other than that, looks good to me! |
|
If I understand correctly, multiple TOML files are merged, which means there can be duplicate keys. TOML tables should be sorted (assuming normal BTreeMap behavior), https://is.gd/f9rI49, but these are later dumped into a HashMap. Additionally, multiple keys feed into the same arrays in that method. |
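The point about several keys feeding one array can be sketched as follows. This is a simplified, hypothetical stand-in for the scraping step, not Cargo's actual parsing: `collect_links` and the pre-split library names are assumptions made for illustration. Each array keeps its own order, yet the combined vector still depends on which key is visited first.

```rust
/// Hypothetical, simplified scraping step: two different keys append
/// to the same output vector, in the order the keys are visited.
fn collect_links(entries: &[(&str, Vec<&str>)]) -> Vec<String> {
    let mut library_links = Vec::new();
    for (key, values) in entries {
        match *key {
            // Both keys contribute to the same vector.
            "rustc-link-lib" | "rustc-flags" => {
                library_links.extend(values.iter().map(|v| v.to_string()));
            }
            // Other keys are ignored in this sketch.
            _ => {}
        }
    }
    library_links
}

fn main() {
    // Two visiting orders that a HashMap could plausibly produce.
    let one = collect_links(&[("rustc-link-lib", vec!["z"]), ("rustc-flags", vec!["ssl"])]);
    let other = collect_links(&[("rustc-flags", vec!["ssl"]), ("rustc-link-lib", vec!["z"])]);
    println!("{:?} vs {:?}", one, other);
}
```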
|
Expanding an array is done deterministically, iteration of a map is not. The only case iteration shows up in the output is the |
|
If you remove the hash of BuildOutput, then you can get informative output. For example:
Note that both library_links and library_paths are non-deterministic, as they are sourced from multiple keys within the same HashMap. This case is less common, and See 0c43678 for the failing test demonstrating a |
lilith
added some commits
Feb 8, 2017
|
This approach no longer mutates BuildOutput at all. Instead, it iterates over the HashMap in a deterministic order. A Vec is used to sort by key. As no other changes exist (anymore), this ensures order is always preserved within TOML arrays. |
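A minimal sketch of that idea, assuming a table keyed by String: the `sorted_entries` helper and the sample keys and values are hypothetical and do not reproduce the code in this PR. The entries are collected into a Vec and sorted by key, while each value (here a Vec standing in for a TOML array) is left untouched.

```rust
use std::collections::HashMap;

/// Hypothetical helper: borrow the map's entries and return them sorted by key.
/// The values themselves are not reordered.
fn sorted_entries<V>(table: &HashMap<String, V>) -> Vec<(&String, &V)> {
    let mut entries: Vec<(&String, &V)> = table.iter().collect();
    entries.sort_by(|a, b| a.0.cmp(b.0));
    entries
}

fn main() {
    let mut table: HashMap<String, Vec<&str>> = HashMap::new();
    table.insert("rustc-link-search".to_string(), vec!["/usr/lib", "/opt/lib"]);
    table.insert("rustc-link-lib".to_string(), vec!["z", "ssl"]);
    table.insert("rustc-flags".to_string(), vec!["-l", "foo"]);

    // Keys are visited in sorted order on every run; the order inside
    // each array is exactly what was inserted.
    for (key, values) in sorted_entries(&table) {
        println!("{} = {:?}", key, values);
    }
}
```

Sorting only the keys fixes the visiting order without touching the order of link paths or libraries inside any single array, which is the concern raised earlier in the review.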
|
Hm that feels like it may be an excessively big hammer for the problem at hand here? We don't need everything to be deterministic, just this one array? |
|
Wouldn't it be the smallest possible scope/hammer, as this can't have any ill effects? See the fresh_builds_possible_with_link_libs test for why 3 arrays are affected.
|
|
Ah right yeah good point about visiting the flags vs array fix. I feel like that's a bit overly pedantic, but hey it's fixing a bug! @bors: r+ |
bors
added a commit
that referenced
this pull request
Feb 8, 2017
lilith commented Feb 7, 2017
The many vectors of BuildOutput are populated from a HashMap in cargo_compile... and later these vectors are hashed.
HashMaps are the bane of Cargo's existence.
But for now, we sort.
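The "sort, then hash" idea described here can be sketched as below. `BuildOutputSketch` is a hypothetical stand-in that borrows only two field names from this thread and uses String for both; it is not Cargo's BuildOutput. The trade-off is that sorting also reorders link paths and libraries, which can matter for linking.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in with two of the vectors discussed in this PR.
#[derive(Hash)]
struct BuildOutputSketch {
    library_paths: Vec<String>,
    library_links: Vec<String>,
}

impl BuildOutputSketch {
    /// Build a value from string slices, keeping the given order.
    fn new(paths: &[&str], links: &[&str]) -> Self {
        BuildOutputSketch {
            library_paths: paths.iter().map(|p| p.to_string()).collect(),
            library_links: links.iter().map(|l| l.to_string()).collect(),
        }
    }

    /// Normalize the vectors so that the hash no longer depends on insertion order.
    fn sort(&mut self) {
        self.library_paths.sort();
        self.library_links.sort();
    }

    /// Order-sensitive hash of the whole struct.
    fn fingerprint(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        hasher.finish()
    }
}

fn main() {
    // The same contents, filled in two different orders.
    let mut a = BuildOutputSketch::new(&["/usr/lib", "/opt/lib"], &["z", "ssl"]);
    let mut b = BuildOutputSketch::new(&["/opt/lib", "/usr/lib"], &["ssl", "z"]);
    a.sort();
    b.sort();
    println!("equal after sort: {}", a.fingerprint() == b.fingerprint());
}
```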