Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the resolver non-recursive #1941

Merged
merged 14 commits into from Sep 21, 2015

Conversation

Projects
None yet
4 participants
@jyasskin
Copy link
Contributor

jyasskin commented Aug 26, 2015

I'm very new to Rust, so lots of this probably has the wrong style, misses lifetime optimizations, etc. Let me know what to fix.

This passes the cargo tests, but let me know if there are other validations to run. Cargo has no benchmarks, so I'm currently downloading Servo to time cargo update on it before and after this patch.

@rust-highfive

This comment has been minimized.

Copy link

rust-highfive commented Aug 26, 2015

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @alexcrichton (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. The way Github handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@jyasskin

This comment has been minimized.

Copy link
Contributor Author

jyasskin commented Aug 26, 2015

Ok, mach defeated me. How do I tell it to use a particular cargo binary? Or, how do you want me to benchmark this?

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Aug 26, 2015

Holy cow! Awesome! I haven't had a chance to take a close look at this just yet, but yeah I'd definitely benchmark this against Servo to make sure we don't regress at all. You can do that by running cargo build --release to build Cargo itself, and then cd into Servo's components/servo directory and running path/to/your/target/release/cargo build. The build will die pretty quickly, but the amount of time it takes to get to the build is indicative of how long this takes.

You can also get Servo to successfully build by using their servobuild.example file and setting system-cargo to true and pointing it at your own. This is somewhat involved, however, and may not always work out.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Aug 27, 2015

Ah I was curious and tried this out myself on Servo, and I saw what was at most a few ms difference, so overall I don't think this had any impact at all in terms of performance. I will try to review soon!

@jyasskin

This comment has been minimized.

Copy link
Contributor Author

jyasskin commented Aug 28, 2015

Yeah, I wasn't able to get it to print enough to verify that I was using the cargo I expected. Thanks for checking the speed for me.

trace!("{}[{}]>{} -- None", parent.name(), cur, dep.name());
let last_err = activation_error(&cx, registry, None, &parent, &dep,
&prev_active, &candidates);
try!(find_candidate(&mut backtrack_stack, &mut cx, &mut remaining_deps,

This comment has been minimized.

@jyasskin

jyasskin Aug 28, 2015

Author Contributor

I've realized this probably needs to also reset parent, dep, and cur, but that mistake doesn't make any tests fail. Can you suggest a new test that would show the difference between the old and new versions?

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

Hm yeah this is interesting. I haven't quite groked this enough to be able to recommend a test, but stylistically it looks like this branch of the match should probably execute a continue at some point to go back up the to the top of the loop. That should end up doing all the resetting of local state as necessary, right?

This comment has been minimized.

@jyasskin

jyasskin Sep 1, 2015

Author Contributor

I couldn't find a way to get it to continue here. The successful path through the function pushes a frame onto bracktrack_stack, so if this one continued, it'd need to somehow prevent that frame from getting pushed. In the original code, the loop over my_candidates also starts halfway down the function, so I think the structure here may be natural for the algorithm, even though it requires resetting a lot of variables.

match my_cx {
Ok(cx) => return Ok(Ok(cx)),
Err(e) => { last_err = Some(e); }
cx.visited.clear();

This comment has been minimized.

@jyasskin

jyasskin Aug 28, 2015

Author Contributor

This looks like a pre-existing bug, where the visited set stays cleared even after this branch of the deps tree has finished activating. I'd like to avoid fixing that until the main refactoring is done. :)

}
};
if let Some(candidate) = maybe_candidate {
backtrack_stack.push(frame);

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

The Some branch above shouldn't have any active borrows of frame, so perhaps this logic could be directly inlined above?

This comment has been minimized.

@jyasskin

jyasskin Aug 31, 2015

Author Contributor

frame.remaining_candidates.next() borrows frame because of the non-Iterator next() signature. It might make sense to have this next() function auto-clone its result, since I think I'm explicitly cloning all the uses now.

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

Oh oops, right, thought that was an iterator. You could probably throw in a .cloned() on the Option to kill the borrow though

This comment has been minimized.

@jyasskin

jyasskin Sep 1, 2015

Author Contributor

Neat. .cloned() didn't work because it's an Option((usize, &T)), but .map(|(_, candidate)| candidate.clone()) did. But I wound up making RcVecIter auto-clone anyway because it shortens all three uses of .next() and lets me make it a real Iterator.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Aug 31, 2015

While you're at it, could you remove the custom trace! macro in this module as well?

let mut backtrack_stack: Vec<BacktrackFrame> = Vec::new();
let mut remaining_deps: RemainingDeps = Vec::new();
remaining_deps.extend(
try!(activate(&mut cx, registry, top, &top_method)));

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

Stylistically Cargo tends to prefer something along the lines of:

remaining_deps.extend(try!(activate(&mut cx, registry, top,
                                    &top_method)));

But if this fits in 80 chars on one line then that's fine.

This comment has been minimized.

@jyasskin

jyasskin Sep 1, 2015

Author Contributor

Done.

#[derive(Clone)]
struct RcVecIter<Elem> {
vec: Rc<Vec<Elem>>,
next_index: usize,

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

This might be able to store something like:

struct RcVecIter<T> {
    vec: Rc<Vec<T>>,
    next: Range<usize>,
}

impl<T> Iterator for RcVecIter<T> {
    fn next(&mut self) -> Option<T> {
        self.range.next().and_then(|i| self.vec.get(i))
    }
}

This comment has been minimized.

@jyasskin

jyasskin Sep 1, 2015

Author Contributor

It winds up as self.rest.next().and_then(|i| self.vec.get(i).map(|val| (i, val.clone()))) because of the need to return the index, which may or may not be more readable than the explicit match. I've kept the one-liner for now.

platform: Option<String>,
id: PackageId,
}
type RemainingDeps = Vec<DepsFrame>;

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

I personally tend to prefer to avoid type aliases like this, could Vec<DepsFrame> be used below instead of the alias?

This comment has been minimized.

@jyasskin

jyasskin Sep 1, 2015

Author Contributor

Done.

remaining_deps.push(deps_frame);
info
}
};

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

Stylistically Cargo also tends to prefer something like:

let foo = match bar {
    Some(p) => p,
    None => { ... }
};

This comment has been minimized.

@jyasskin

jyasskin Sep 1, 2015

Author Contributor

I don't follow. Are you saying you want the None branch to come second or ...? This got a bunch simpler when I made RcVecIter auto-clone, so see if I accidentally put it in the right form.

let mut remaining_candidates = RcVecIter::new(my_candidates);
let maybe_candidate =
remaining_candidates.next().map(|(_, candidate)| candidate.clone());
let candidate: Rc<Summary> = match maybe_candidate {

This comment has been minimized.

@alexcrichton

alexcrichton Aug 31, 2015

Member

In general I tend to try to avoid as many type ascriptions as possible, were these (and some of the others around here) necessary?

This comment has been minimized.

@jyasskin

jyasskin Sep 1, 2015

Author Contributor

When I first tried reading this code, I found it very hard to keep the expected types of variables in my head, so as a courtesy to the next person trying to come up to speed on this code, I left in the annotations that helped me understand it. Obviously you're the maintainer so I'll take them back out if you want, but I think it could use more than the bare minimum.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Sep 1, 2015

☔️ The latest upstream changes (presumably #1939) made this pull request unmergeable. Please resolve the merge conflicts.

registry: &mut Registry,
top: Rc<Summary>,
top_method: &Method) -> CargoResult<Resolve> {
let mut backtrack_stack: Vec<BacktrackFrame> = Vec::new();

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

I can see where the type annotations here can be helpful, but in general as it doesn't follow the general idioms of Cargo and Rust as a whole I'd prefer to leave them out for now (unless they're needed to guide type inference)

This comment has been minimized.

@jyasskin

jyasskin Sep 5, 2015

Author Contributor

'k, I still disagree and hope the Cargo and Rust communities will reconsider, but done for this change.

@@ -244,149 +190,240 @@ struct Context {
/// Builds the list of all packages required to build the first argument.
pub fn resolve(summary: &Summary, method: &Method,
registry: &mut Registry) -> CargoResult<Resolve> {
trace!("resolve; summary={}", summary.package_id());
debug!("resolve; summary={}", summary.package_id());

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

Ah you can actually leave these as trace! because that's defined in liblog as a separate logging level, the special part for this module was the guard with cfg!(debug_assertions)

loop {
// Retrieves the next dependency to try, from `remaining_deps`.
let (mut parent, mut cur, mut dep, candidates, features) =
match remaining_deps.pop() {

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

Ah in the stylistic thing beforehand I basically meant that let and match should be on the same line and the closing brace of the match should be at the same indentation level as the let. In this case it's fine to destructure the tuple after the match.

This comment has been minimized.

@jyasskin

jyasskin Sep 5, 2015

Author Contributor

'k, how's this?

let mut candidates: Vec<Summary> = match registry.query(&new_dep) {
Ok(candidates) => candidates,
Err(e) => return e,
};

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

Can this be switched back to using try!?

This comment has been minimized.

@jyasskin

jyasskin Sep 3, 2015

Author Contributor

It can't trivially because I changed the return type of activation_error from CargoResult<Box<Context>> to Box<CargoError> (since it never actually returned a Context). I could change that back, but I'd need to explicitly unwrap last_err in find_candidate, and it just seems odd to offer an Ok result that can never actually happen.

This comment has been minimized.

@alexcrichton

alexcrichton Sep 3, 2015

Member

oh, duh!

debug!("{}[{}]>{} -- {:?}", frame.parent.name(), frame.cur, frame.dep.name(),
Some(&last_err));
last_err = activation_error(cx, registry, Some(last_err), &frame.parent, &frame.dep,
&frame.prev_active, &frame.all_candidates);

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

Can you also be sure to clip all lines to 80 chars? (the style of Cargo)

This comment has been minimized.

@jyasskin

jyasskin Sep 3, 2015

Author Contributor

Whoops, I assumed from the 100-column check in make test that that was the limit. I'll switch everything back to 80.

parent: Rc<Summary>,
cur: usize,
dep: Rc<Dependency>,
prev_active: Vec<Rc<Summary>>,

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

I think this can avoid storing prev_active, right? It looks like it's recalculated on each iteration of the main loop anyway so I think that value should be usable for generating an error.

This comment has been minimized.

@jyasskin

jyasskin Sep 3, 2015

Author Contributor

I hadn't realized that activation_error only computed anything when it was passed None. That simplifies some more things.

remaining_candidates: RcVecIter<Rc<Summary>>,
// For building an activation error:
parent: Rc<Summary>,
cur: usize,

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

This is only ever used for logging, so it's fine to remove. Perhaps the index can be derived from one of the other iterators here?

This comment has been minimized.

@jyasskin

jyasskin Sep 5, 2015

Author Contributor

Yep, deps_backup.last().unwrap().remaining_siblings.rest.start - 1


struct BacktrackFrame {
context_backup: Context,
deps_backup: Vec<DepsFrame>,

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

I'm a little worried about storing this as a Vec, it seems pretty allocation heavy as it's possible for this to get relatively large and a copy of this is lying around per-frame.

A better strategy here may be to use a linked list with a bunch of Rc and RefCell instances. This list only needs two operations: push and pop, so it should lend itself to a linked list quite well (where you can just go up the list)

This comment has been minimized.

@jyasskin

jyasskin Sep 5, 2015

Author Contributor

I believe you're suggesting that DepsFrame hold an Option<Rc<RefCell<DepsFrame>>> to implement the linked list, and BacktrackFrame hold an Rc<RefCell<DepsFrame>> tracking the top of the remaining_deps stack when the BacktrackFrame (call it bf) was pushed. Unfortunately, the algorithm will pop the remaining_deps past the level bf is holding, and mutate its remaining_siblings in order to finish traversing the dependencies. If we then backtrack past bf, it'll reset the remaining_deps stack to hold a bunch of frames whose siblings have been exhausted, instead of remembering to revisit them with the new candidate.

We could reduce the size of DepsFrame by changing RcVecIter back to holding the next index instead of a Range. Other than that, I think the biggest memory savings might come from changing Context so it doesn't need to be copied for each BacktrackFrame. I'm not volunteering to implement the necessary persistent maps and sets though. :)

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Hm I'm not sure I quite understand this. Just taking a look at the operations on remaining_deps it's only push/pop

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Er sorry, hit comment too soon!

Although after thinking again I think I see where you're coming from, the persistence means that it'll retain too much state (or at least moreso than keep a Vec on the side). This doesn't seem to regress the profiles too much so I think it's ok to leave this as an open opportunity to explore in the future.

struct BacktrackFrame {
context_backup: Context,
deps_backup: Vec<DepsFrame>,
remaining_candidates: RcVecIter<Rc<Summary>>,

This comment has been minimized.

@alexcrichton

alexcrichton Sep 1, 2015

Member

In the interest of cutting down on allocations and reducing the amount of redundantly stored information, I think that this may want to actually just not be stored separately and use perhaps a totally custom iterator type. This is just a filtered list of all_candidates below and with prev_active already calculated on each iteration of the loop I think it should be possible to just walk among the list of all candidates without storing a pre-filtered list of actual candidates.

This comment has been minimized.

@jyasskin

jyasskin Sep 5, 2015

Author Contributor

It turns out I could drop all_candidates from BacktrackFrame entirely after your revelation that activation_error only needed to be called at the start of backtracking, so we're no longer storing redundant candidates here.

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Excellent!

Round 2 of review:
* Converted pre-existing trace!s back from debug!
* Computed `cur` from the top-most frame instead of storing it.
* Wrapped to 80 columns.
* Removed prev_active and all_candidates from `BacktrackStack`
* Removed unnecessary type annotations.
* Called activation_error() only when starting to backtrack, since it bailed
  early when passed an error anyway.
}
trace!("activating {}", parent.package_id());
debug!("activating {}", parent.package_id());

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Could this stay as trace!?

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

Sure, done.

/// provided is an iterator over all dependencies where each element yielded
/// informs us what the candidates are for the dependency in question.
#[derive(Clone)]
struct RcVecIter<Elem> {

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Conventionally type parameters like this are typically called something like T instead of Elem (same below as well)

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

Done.

!compatible(a.version(), b.version())
})
});
let prev_active = cx.prev_active(&dep).to_vec();

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Can this to_vec be avoided? I think this borrow only needs to stay alive until my_candidates is created so it should be ok to not allocate a new vector here I think

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

prev_active is used in the call to activation_error() below, which makes it hard to drop before cx is borrowed mutably.

This comment has been minimized.

@alexcrichton

alexcrichton Sep 9, 2015

Member

I think you can get away with an extra scope here which kills the borrow right after my_candidates is assembled.

This comment has been minimized.

@jyasskin

jyasskin Sep 11, 2015

Author Contributor

I'll have to drop the use of prev_active in activation_error or re-compute it. I'll see if recomputing works.

@@ -687,7 +721,7 @@ impl Context {
feature)));
}
}
ret.push((dep, base));
ret.push((Rc::new(dep.clone()), base));

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Looking at some profiles, this is actually pretty expensive. A Dependency is quite large and has a good deal of information inside it, so I think it may be best to actually just take the same strategy as PackageId here and have Dependency internally store an Arc with a DependencyInner that actually has all the data. That way you could just pass around Dependency by value here and it'd just be cheap to cloen.

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

Working on this now.

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

Because of the set_foo methods on Dependency, having it store an Arc<DependencyInner> doesn't seem to work that well. Every use of set_foo would have to copy the whole Dependency. I'm most of the way through just switching all the non-creating uses over to Rc<Dependency>, but because of Summary::map_dependencies I'm using the experimental Rc::try_unwrap to avoid unnecessary copies. Let me know if you see a better way to do this.

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

'k, this is done.

Another way to avoid this would be if there were an Rc equivalent to C++11's template<class Y> shared_ptr(const shared_ptr<Y>& r, T *ptr) constructor that holds a reference to an owning object but dereferences to something it owns.


// Next, transform all dependencies into a list of possible candidates
// which can satisfy that dependency.
let mut deps = try!(deps.into_iter().map(|(dep, features)| {
let mut candidates = try!(registry.query(dep));
let mut deps: Vec<DepInfo> = try!(deps.into_iter().map(|(dep, features)| {

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Extra type annotations here and above can be elided

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

Whoops, sorry for missing these.

trace!("{}[{}]>{} -- None", parent.name(), cur, dep.name());
let err = activation_error(&cx, registry, &parent,
&dep, &prev_active,
&candidates);

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

It looks like this error isn't actually needed unless find_candidate doesn't find anything, so could this creation be deferred until after find_candidate has failed?

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

Good point. I've moved it.

remaining_candidates: remaining_candidates,
parent: parent.clone(),
dep: dep.clone(),
});

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Another possible optimization in the future I think is that if remaining_candidates is empty then this push can almost be elided entirely, the information I believe is only later used to generate an error which may be returned. That being said it's easier to just push for now, so that's fine.

let top_deps_frame = frame.deps_backup.last().unwrap();
trace!("{}[{}]>{} -- {:?}", frame.parent.name(),
top_deps_frame.remaining_siblings.cur_index(),
frame.dep.name(), Some(&err));

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

It's not necessarily super important that the logging here remains exactly the same, for example it may not be worth it to bend over backwards to get the same information. Along those lines I think it's fine to shorten this or just elide it entirely. These sorts of tracing messages can always be added in an on-demand fashion later.

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

I've dropped it and simplified the match.

// find a dependency that does have a candidate to try, and try
// to activate that one. This resets the `remaining_deps` to
// their state at the found level of the `backtrack_stack`.
trace!("{}[{}]>{} -- None", parent.name(), cur, dep.name());

This comment has been minimized.

@alexcrichton

alexcrichton Sep 8, 2015

Member

Similarly to down below, this message of "None" doesn't need to stay exactly the same, it can be tweaked to something like "no candidates"

This comment has been minimized.

@jyasskin

jyasskin Sep 9, 2015

Author Contributor

Done.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Sep 9, 2015

Ah sorry this isn't quite what I had in mind for Rc<Dependency>, the idea is to hide that kind of ownership (e.g. the outer Rc) from users of the type rather than having it leak into the public API (e.g. there's a few map(Rc::new).collect() which would be nice to not have).

The problem with set_foo I think isn't so bad, it's only ever actually called during creation of the original Dependency and it'd probably be fine in this case to expose DependencyInner with builder-like methods and an into_dependency method which puts it in an Arc or Rc. The lock_to method can look like SourceId::with_precise

@@ -1,5 +1,6 @@
#![deny(unused)]
#![cfg_attr(test, deny(warnings))]
#![feature(rc_unique)]

This comment has been minimized.

@alexcrichton

alexcrichton Sep 9, 2015

Member

Cargo currently builds on stable, and I'd personally very much like to keep it that way, so can this avoid using unstable features for now?

@jyasskin

This comment has been minimized.

Copy link
Contributor Author

jyasskin commented Sep 11, 2015

I'll probably get to the new Dependency refactoring Tuesday. Sorry for the delay.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Sep 14, 2015

No worries, thanks for keeping up with this @jyasskin!

@jyasskin jyasskin force-pushed the jyasskin:non-recursive-activate branch from 5328257 to aa3005f Sep 17, 2015

@jyasskin

This comment has been minimized.

Copy link
Contributor Author

jyasskin commented Sep 17, 2015

Here you go.

@bors bors merged commit aa3005f into rust-lang:master Sep 21, 2015

1 of 2 checks passed

continuous-integration/appveyor AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Sep 21, 2015

Awesome, thanks so much again @jyasskin! I added a few small bits in #1995 but they were all just minor things here and there

@jyasskin jyasskin deleted the jyasskin:non-recursive-activate branch Sep 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.