Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upMake the resolver non-recursive #1941
Conversation
jyasskin
added some commits
Aug 15, 2015
rust-highfive
assigned
alexcrichton
Aug 26, 2015
This comment has been minimized.
This comment has been minimized.
rust-highfive
commented
Aug 26, 2015
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
This comment has been minimized.
This comment has been minimized.
|
Ok, mach defeated me. How do I tell it to use a particular cargo binary? Or, how do you want me to benchmark this? |
This comment has been minimized.
This comment has been minimized.
|
Holy cow! Awesome! I haven't had a chance to take a close look at this just yet, but yeah I'd definitely benchmark this against Servo to make sure we don't regress at all. You can do that by running You can also get Servo to successfully build by using their |
This comment has been minimized.
This comment has been minimized.
|
Ah I was curious and tried this out myself on Servo, and I saw what was at most a few ms difference, so overall I don't think this had any impact at all in terms of performance. I will try to review soon! |
This comment has been minimized.
This comment has been minimized.
|
Yeah, I wasn't able to get it to print enough to verify that I was using the cargo I expected. Thanks for checking the speed for me. |
jyasskin
reviewed
Aug 28, 2015
| trace!("{}[{}]>{} -- None", parent.name(), cur, dep.name()); | ||
| let last_err = activation_error(&cx, registry, None, &parent, &dep, | ||
| &prev_active, &candidates); | ||
| try!(find_candidate(&mut backtrack_stack, &mut cx, &mut remaining_deps, |
This comment has been minimized.
This comment has been minimized.
jyasskin
Aug 28, 2015
Author
Contributor
I've realized this probably needs to also reset parent, dep, and cur, but that mistake doesn't make any tests fail. Can you suggest a new test that would show the difference between the old and new versions?
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
Hm yeah this is interesting. I haven't quite groked this enough to be able to recommend a test, but stylistically it looks like this branch of the match should probably execute a continue at some point to go back up the to the top of the loop. That should end up doing all the resetting of local state as necessary, right?
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 1, 2015
Author
Contributor
I couldn't find a way to get it to continue here. The successful path through the function pushes a frame onto bracktrack_stack, so if this one continued, it'd need to somehow prevent that frame from getting pushed. In the original code, the loop over my_candidates also starts halfway down the function, so I think the structure here may be natural for the algorithm, even though it requires resetting a lot of variables.
jyasskin
reviewed
Aug 28, 2015
| match my_cx { | ||
| Ok(cx) => return Ok(Ok(cx)), | ||
| Err(e) => { last_err = Some(e); } | ||
| cx.visited.clear(); |
This comment has been minimized.
This comment has been minimized.
jyasskin
Aug 28, 2015
Author
Contributor
This looks like a pre-existing bug, where the visited set stays cleared even after this branch of the deps tree has finished activating. I'd like to avoid fixing that until the main refactoring is done. :)
alexcrichton
reviewed
Aug 31, 2015
| } | ||
| }; | ||
| if let Some(candidate) = maybe_candidate { | ||
| backtrack_stack.push(frame); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
The Some branch above shouldn't have any active borrows of frame, so perhaps this logic could be directly inlined above?
This comment has been minimized.
This comment has been minimized.
jyasskin
Aug 31, 2015
Author
Contributor
frame.remaining_candidates.next() borrows frame because of the non-Iterator next() signature. It might make sense to have this next() function auto-clone its result, since I think I'm explicitly cloning all the uses now.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
Oh oops, right, thought that was an iterator. You could probably throw in a .cloned() on the Option to kill the borrow though
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 1, 2015
Author
Contributor
Neat. .cloned() didn't work because it's an Option((usize, &T)), but .map(|(_, candidate)| candidate.clone()) did. But I wound up making RcVecIter auto-clone anyway because it shortens all three uses of .next() and lets me make it a real Iterator.
This comment has been minimized.
This comment has been minimized.
|
While you're at it, could you remove the custom |
alexcrichton
reviewed
Aug 31, 2015
| let mut backtrack_stack: Vec<BacktrackFrame> = Vec::new(); | ||
| let mut remaining_deps: RemainingDeps = Vec::new(); | ||
| remaining_deps.extend( | ||
| try!(activate(&mut cx, registry, top, &top_method))); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
Stylistically Cargo tends to prefer something along the lines of:
remaining_deps.extend(try!(activate(&mut cx, registry, top,
&top_method)));But if this fits in 80 chars on one line then that's fine.
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Aug 31, 2015
| #[derive(Clone)] | ||
| struct RcVecIter<Elem> { | ||
| vec: Rc<Vec<Elem>>, | ||
| next_index: usize, |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
This might be able to store something like:
struct RcVecIter<T> {
vec: Rc<Vec<T>>,
next: Range<usize>,
}
impl<T> Iterator for RcVecIter<T> {
fn next(&mut self) -> Option<T> {
self.range.next().and_then(|i| self.vec.get(i))
}
}
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 1, 2015
Author
Contributor
It winds up as self.rest.next().and_then(|i| self.vec.get(i).map(|val| (i, val.clone()))) because of the need to return the index, which may or may not be more readable than the explicit match. I've kept the one-liner for now.
alexcrichton
reviewed
Aug 31, 2015
| platform: Option<String>, | ||
| id: PackageId, | ||
| } | ||
| type RemainingDeps = Vec<DepsFrame>; |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
I personally tend to prefer to avoid type aliases like this, could Vec<DepsFrame> be used below instead of the alias?
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Aug 31, 2015
| remaining_deps.push(deps_frame); | ||
| info | ||
| } | ||
| }; |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
Stylistically Cargo also tends to prefer something like:
let foo = match bar {
Some(p) => p,
None => { ... }
};
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 1, 2015
Author
Contributor
I don't follow. Are you saying you want the None branch to come second or ...? This got a bunch simpler when I made RcVecIter auto-clone, so see if I accidentally put it in the right form.
alexcrichton
reviewed
Aug 31, 2015
| let mut remaining_candidates = RcVecIter::new(my_candidates); | ||
| let maybe_candidate = | ||
| remaining_candidates.next().map(|(_, candidate)| candidate.clone()); | ||
| let candidate: Rc<Summary> = match maybe_candidate { |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Aug 31, 2015
Member
In general I tend to try to avoid as many type ascriptions as possible, were these (and some of the others around here) necessary?
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 1, 2015
Author
Contributor
When I first tried reading this code, I found it very hard to keep the expected types of variables in my head, so as a courtesy to the next person trying to come up to speed on this code, I left in the annotations that helped me understand it. Obviously you're the maintainer so I'll take them back out if you want, but I think it could use more than the bare minimum.
This comment has been minimized.
This comment has been minimized.
|
|
jyasskin
added some commits
Sep 1, 2015
alexcrichton
reviewed
Sep 1, 2015
| registry: &mut Registry, | ||
| top: Rc<Summary>, | ||
| top_method: &Method) -> CargoResult<Resolve> { | ||
| let mut backtrack_stack: Vec<BacktrackFrame> = Vec::new(); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
I can see where the type annotations here can be helpful, but in general as it doesn't follow the general idioms of Cargo and Rust as a whole I'd prefer to leave them out for now (unless they're needed to guide type inference)
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 5, 2015
Author
Contributor
'k, I still disagree and hope the Cargo and Rust communities will reconsider, but done for this change.
alexcrichton
reviewed
Sep 1, 2015
| @@ -244,149 +190,240 @@ struct Context { | |||
| /// Builds the list of all packages required to build the first argument. | |||
| pub fn resolve(summary: &Summary, method: &Method, | |||
| registry: &mut Registry) -> CargoResult<Resolve> { | |||
| trace!("resolve; summary={}", summary.package_id()); | |||
| debug!("resolve; summary={}", summary.package_id()); | |||
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
Ah you can actually leave these as trace! because that's defined in liblog as a separate logging level, the special part for this module was the guard with cfg!(debug_assertions)
alexcrichton
reviewed
Sep 1, 2015
| loop { | ||
| // Retrieves the next dependency to try, from `remaining_deps`. | ||
| let (mut parent, mut cur, mut dep, candidates, features) = | ||
| match remaining_deps.pop() { |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
Ah in the stylistic thing beforehand I basically meant that let and match should be on the same line and the closing brace of the match should be at the same indentation level as the let. In this case it's fine to destructure the tuple after the match.
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Sep 1, 2015
| let mut candidates: Vec<Summary> = match registry.query(&new_dep) { | ||
| Ok(candidates) => candidates, | ||
| Err(e) => return e, | ||
| }; |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 3, 2015
Author
Contributor
It can't trivially because I changed the return type of activation_error from CargoResult<Box<Context>> to Box<CargoError> (since it never actually returned a Context). I could change that back, but I'd need to explicitly unwrap last_err in find_candidate, and it just seems odd to offer an Ok result that can never actually happen.
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Sep 1, 2015
| debug!("{}[{}]>{} -- {:?}", frame.parent.name(), frame.cur, frame.dep.name(), | ||
| Some(&last_err)); | ||
| last_err = activation_error(cx, registry, Some(last_err), &frame.parent, &frame.dep, | ||
| &frame.prev_active, &frame.all_candidates); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
Can you also be sure to clip all lines to 80 chars? (the style of Cargo)
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 3, 2015
Author
Contributor
Whoops, I assumed from the 100-column check in make test that that was the limit. I'll switch everything back to 80.
alexcrichton
reviewed
Sep 1, 2015
| parent: Rc<Summary>, | ||
| cur: usize, | ||
| dep: Rc<Dependency>, | ||
| prev_active: Vec<Rc<Summary>>, |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
I think this can avoid storing prev_active, right? It looks like it's recalculated on each iteration of the main loop anyway so I think that value should be usable for generating an error.
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 3, 2015
Author
Contributor
I hadn't realized that activation_error only computed anything when it was passed None. That simplifies some more things.
alexcrichton
reviewed
Sep 1, 2015
| remaining_candidates: RcVecIter<Rc<Summary>>, | ||
| // For building an activation error: | ||
| parent: Rc<Summary>, | ||
| cur: usize, |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
This is only ever used for logging, so it's fine to remove. Perhaps the index can be derived from one of the other iterators here?
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 5, 2015
Author
Contributor
Yep, deps_backup.last().unwrap().remaining_siblings.rest.start - 1
alexcrichton
reviewed
Sep 1, 2015
|
|
||
| struct BacktrackFrame { | ||
| context_backup: Context, | ||
| deps_backup: Vec<DepsFrame>, |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
I'm a little worried about storing this as a Vec, it seems pretty allocation heavy as it's possible for this to get relatively large and a copy of this is lying around per-frame.
A better strategy here may be to use a linked list with a bunch of Rc and RefCell instances. This list only needs two operations: push and pop, so it should lend itself to a linked list quite well (where you can just go up the list)
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 5, 2015
Author
Contributor
I believe you're suggesting that DepsFrame hold an Option<Rc<RefCell<DepsFrame>>> to implement the linked list, and BacktrackFrame hold an Rc<RefCell<DepsFrame>> tracking the top of the remaining_deps stack when the BacktrackFrame (call it bf) was pushed. Unfortunately, the algorithm will pop the remaining_deps past the level bf is holding, and mutate its remaining_siblings in order to finish traversing the dependencies. If we then backtrack past bf, it'll reset the remaining_deps stack to hold a bunch of frames whose siblings have been exhausted, instead of remembering to revisit them with the new candidate.
We could reduce the size of DepsFrame by changing RcVecIter back to holding the next index instead of a Range. Other than that, I think the biggest memory savings might come from changing Context so it doesn't need to be copied for each BacktrackFrame. I'm not volunteering to implement the necessary persistent maps and sets though. :)
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
Hm I'm not sure I quite understand this. Just taking a look at the operations on remaining_deps it's only push/pop
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
Er sorry, hit comment too soon!
Although after thinking again I think I see where you're coming from, the persistence means that it'll retain too much state (or at least moreso than keep a Vec on the side). This doesn't seem to regress the profiles too much so I think it's ok to leave this as an open opportunity to explore in the future.
alexcrichton
reviewed
Sep 1, 2015
| struct BacktrackFrame { | ||
| context_backup: Context, | ||
| deps_backup: Vec<DepsFrame>, | ||
| remaining_candidates: RcVecIter<Rc<Summary>>, |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 1, 2015
Member
In the interest of cutting down on allocations and reducing the amount of redundantly stored information, I think that this may want to actually just not be stored separately and use perhaps a totally custom iterator type. This is just a filtered list of all_candidates below and with prev_active already calculated on each iteration of the loop I think it should be possible to just walk among the list of all candidates without storing a pre-filtered list of actual candidates.
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 5, 2015
Author
Contributor
It turns out I could drop all_candidates from BacktrackFrame entirely after your revelation that activation_error only needed to be called at the start of backtracking, so we're no longer storing redundant candidates here.
This comment has been minimized.
This comment has been minimized.
alexcrichton
referenced this pull request
Sep 8, 2015
Closed
Cargo crashes with SIGILL sometimes #1971
alexcrichton
reviewed
Sep 8, 2015
| } | ||
| trace!("activating {}", parent.package_id()); | ||
| debug!("activating {}", parent.package_id()); |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Sep 8, 2015
| /// provided is an iterator over all dependencies where each element yielded | ||
| /// informs us what the candidates are for the dependency in question. | ||
| #[derive(Clone)] | ||
| struct RcVecIter<Elem> { |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
Conventionally type parameters like this are typically called something like T instead of Elem (same below as well)
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Sep 8, 2015
| !compatible(a.version(), b.version()) | ||
| }) | ||
| }); | ||
| let prev_active = cx.prev_active(&dep).to_vec(); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
Can this to_vec be avoided? I think this borrow only needs to stay alive until my_candidates is created so it should be ok to not allocate a new vector here I think
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 9, 2015
Author
Contributor
prev_active is used in the call to activation_error() below, which makes it hard to drop before cx is borrowed mutably.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 9, 2015
Member
I think you can get away with an extra scope here which kills the borrow right after my_candidates is assembled.
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 11, 2015
Author
Contributor
I'll have to drop the use of prev_active in activation_error or re-compute it. I'll see if recomputing works.
alexcrichton
reviewed
Sep 8, 2015
| @@ -687,7 +721,7 @@ impl Context { | |||
| feature))); | |||
| } | |||
| } | |||
| ret.push((dep, base)); | |||
| ret.push((Rc::new(dep.clone()), base)); | |||
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
Looking at some profiles, this is actually pretty expensive. A Dependency is quite large and has a good deal of information inside it, so I think it may be best to actually just take the same strategy as PackageId here and have Dependency internally store an Arc with a DependencyInner that actually has all the data. That way you could just pass around Dependency by value here and it'd just be cheap to cloen.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 9, 2015
Author
Contributor
Because of the set_foo methods on Dependency, having it store an Arc<DependencyInner> doesn't seem to work that well. Every use of set_foo would have to copy the whole Dependency. I'm most of the way through just switching all the non-creating uses over to Rc<Dependency>, but because of Summary::map_dependencies I'm using the experimental Rc::try_unwrap to avoid unnecessary copies. Let me know if you see a better way to do this.
This comment has been minimized.
This comment has been minimized.
jyasskin
Sep 9, 2015
Author
Contributor
'k, this is done.
Another way to avoid this would be if there were an Rc equivalent to C++11's template<class Y> shared_ptr(const shared_ptr<Y>& r, T *ptr) constructor that holds a reference to an owning object but dereferences to something it owns.
alexcrichton
reviewed
Sep 8, 2015
|
|
||
| // Next, transform all dependencies into a list of possible candidates | ||
| // which can satisfy that dependency. | ||
| let mut deps = try!(deps.into_iter().map(|(dep, features)| { | ||
| let mut candidates = try!(registry.query(dep)); | ||
| let mut deps: Vec<DepInfo> = try!(deps.into_iter().map(|(dep, features)| { |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Sep 8, 2015
| trace!("{}[{}]>{} -- None", parent.name(), cur, dep.name()); | ||
| let err = activation_error(&cx, registry, &parent, | ||
| &dep, &prev_active, | ||
| &candidates); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
It looks like this error isn't actually needed unless find_candidate doesn't find anything, so could this creation be deferred until after find_candidate has failed?
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Sep 8, 2015
| remaining_candidates: remaining_candidates, | ||
| parent: parent.clone(), | ||
| dep: dep.clone(), | ||
| }); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
Another possible optimization in the future I think is that if remaining_candidates is empty then this push can almost be elided entirely, the information I believe is only later used to generate an error which may be returned. That being said it's easier to just push for now, so that's fine.
alexcrichton
reviewed
Sep 8, 2015
| let top_deps_frame = frame.deps_backup.last().unwrap(); | ||
| trace!("{}[{}]>{} -- {:?}", frame.parent.name(), | ||
| top_deps_frame.remaining_siblings.cur_index(), | ||
| frame.dep.name(), Some(&err)); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
It's not necessarily super important that the logging here remains exactly the same, for example it may not be worth it to bend over backwards to get the same information. Along those lines I think it's fine to shorten this or just elide it entirely. These sorts of tracing messages can always be added in an on-demand fashion later.
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Sep 8, 2015
| // find a dependency that does have a candidate to try, and try | ||
| // to activate that one. This resets the `remaining_deps` to | ||
| // their state at the found level of the `backtrack_stack`. | ||
| trace!("{}[{}]>{} -- None", parent.name(), cur, dep.name()); |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 8, 2015
Member
Similarly to down below, this message of "None" doesn't need to stay exactly the same, it can be tweaked to something like "no candidates"
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Ah sorry this isn't quite what I had in mind for The problem with |
alexcrichton
reviewed
Sep 9, 2015
| @@ -1,5 +1,6 @@ | |||
| #![deny(unused)] | |||
| #![cfg_attr(test, deny(warnings))] | |||
| #![feature(rc_unique)] | |||
This comment has been minimized.
This comment has been minimized.
alexcrichton
Sep 9, 2015
Member
Cargo currently builds on stable, and I'd personally very much like to keep it that way, so can this avoid using unstable features for now?
This comment has been minimized.
This comment has been minimized.
|
I'll probably get to the new |
This comment has been minimized.
This comment has been minimized.
|
No worries, thanks for keeping up with this @jyasskin! |
jyasskin
force-pushed the
jyasskin:non-recursive-activate
branch
from
5328257
to
aa3005f
Sep 17, 2015
This comment has been minimized.
This comment has been minimized.
|
Here you go. |
jyasskin commentedAug 26, 2015
I'm very new to Rust, so lots of this probably has the wrong style, misses lifetime optimizations, etc. Let me know what to fix.
This passes the cargo tests, but let me know if there are other validations to run. Cargo has no benchmarks, so I'm currently downloading Servo to time
cargo updateon it before and after this patch.