Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: keep directly the intersection of derivations in Memory #33

Closed
mpizenberg opened this issue Oct 11, 2020 · 9 comments
Closed

perf: keep directly the intersection of derivations in Memory #33

mpizenberg opened this issue Oct 11, 2020 · 9 comments

Comments

@mpizenberg
Copy link
Member

mpizenberg commented Oct 11, 2020

Memory stores a Vec of derivations per package. If we are to evaluate potential packages in a smarter way #32 , we need the intersection of derivation terms.

There are two methods in Memory that can return iterators to individual assignment terms, terms_for_package and potential_packages. The latter is the use case we are discussing for performance improvements in #32 . And the former is to compute relations in incompatibilities that eventually ends up computing the intersection of those terms. Considering that we seem to spend much time in relation computation #27 , this could be a non-negligible speedup.

@Eh2406
Copy link
Member

Eh2406 commented Oct 12, 2020

One thing I was wondering about is how this works with backtracking. If we backtrack to before a decision that introduced that derivation then does that mean that some terms need to come out of that intersection? If so how do we do that? Even if this is a problem, it is not insurmountable, we can keep the Vec of derivations and a pre computed intersection. In the case of backtracking, the vec gets smaller and the intersection needs to be recomputed.

@mpizenberg
Copy link
Member Author

mpizenberg commented Oct 12, 2020

That's a very relevant question! Backtracking is currently operated thanks to history which is a chronological vec of all the assignments. Once we backtrack to a given level, we iterate on assignments with a lower level and rebuild the memory.

Conflict resolution is not common I think in a normal setting since newer versions tend to be compatible with older ones, and we choose newer versions by default. Also, once we've found the root cause for a conflict, I don't see a reason we should have another conflict soon. But backtracking happens once per prior cause until the root cause is found so it could typically happen multiple times until we've found the root cause. <- woops, that is wrong, there is 1 backtrack per conflict resolution

It's probably worth having two things for memory derivations, a precomputed intersection, and terms that have not been intersected yet in a complementary vec. Something like this in the Memory.assignments.

struct PackageAssignments<V: Version> {
    decision: Option<(V, Term<V>)>,
    derivations_intersected: Term,
    derivations_not_intersected_yet: Vec<Term<V>>,
}

The Memory method terms_for_package(&self, p: &P) would become terms_intersection_for_package(&mut self, p: &P). It should empty the derivations_not_intersected_yet vec into the derivations_intersected term and return that. There should also be changes in the potential_packages.
This would provide both the benefit of never computing that intersection more than once, and not computing the intersection if it's not required yet (for backtracking for example).

Slightly unrelated, but related performance-wise, I couldn't figure how to mutate in place the history and memory and so had to clone the whole history which is quite bad. Is this another case where interior mutability is needed? is there a better way to do that?

@mpizenberg mpizenberg changed the title Optimization: keep directly the intersection of derivations in Memory perf: keep directly the intersection of derivations in Memory Oct 12, 2020
@Eh2406
Copy link
Member

Eh2406 commented Oct 12, 2020

std::mem::take(&mut self.history) would be the escape hatch that will work there, when that clone becomes significant. (it did not change the runtime of the current benchmarks.)

@mpizenberg
Copy link
Member Author

mpizenberg commented Oct 12, 2020

I've had a try at this today. I chose better-heuristic (smart package choice) as a base and ran benchmark large_case_1, then cherry-picked the improvements from dont-readd-deps and finally implemented the improvement discussed here on top. The timings I got for large_case_1 on my computer are the following:

  • better-heuristic: 160 ms / iter
  • + dont-readd-deps: 115 ms / iter
  • + this: 60 ms / iter

So on the large_case_1 benchmark, this looks like a 2x speedup. It is not cleaned up (few things are not needed anymore) but the commit is here (f803428) in branch precompute_intersections.

@mpizenberg
Copy link
Member Author

mpizenberg commented Oct 12, 2020

And on large_case_2 im seeing:

  • better-heuristic: 312 ms / iter
  • + dont-readd-deps: 220 ms / iter
  • + this: 122 ms / iter

@Eh2406
Copy link
Member

Eh2406 commented Oct 12, 2020

On my computer the improvements are similarly grate! I look forward to that getting polished into a PR!

@Eh2406
Copy link
Member

Eh2406 commented Oct 12, 2020

I was excited, so pushed a commit to your branch with some cleaned ups. If that was rude I am sorry and ignore the commit.

@mpizenberg
Copy link
Member Author

ahah don't worry, I didn't even plan to make a PR of this soon. I used the commits for the two new benchmarks, which definitely shouldn't end in Git (we'll have to re-assess that git LFS situation, but that's another subject). I see that branch more as exploration stuff. Feel free to fill it with all the ideas and cleanup you want ^^

This was referenced Oct 14, 2020
@mpizenberg
Copy link
Member Author

Done in #37 which is on its way to being merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants