New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve time complexity of equality relations #32062

Merged
merged 1 commit into from Mar 22, 2016

Conversation

Projects
None yet
7 participants
@Marwes
Copy link
Contributor

Marwes commented Mar 5, 2016

This PR adds a UnificationTable to the TypeVariableTable type which is used store information about variable equality instead of just storing them in a vector for later processing. By using a UnificationTable equality relations can be resolved in O(n) (for all realistic values of n) rather than O(n!) which can give massive speedups in certain cases (see combine as an example).

Link to combine: https://github.com/Marwes/combine

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 5, 2016

This is an alternative solution to the same compilation time problem as in #32042. This on the other tackles the problem in a much more complex way which may add a slight slow down for typechecking in the normal case due to its increased complexity (most of the time n is very small leading to the old and simpler way being slightly faster).

This change essentially makes it so that when two variables are stated to be equal the relation is immediately created and made available to users of the TypeVariableTable which is quite different from the old approach which would not make this connection until one of the variables was assigned a concrete type (and even then it would take time for it to propagate to all the variables). As that is a rather large change I cannot say for certain if I missed some subtlety in how it works so hopefully someone with more experienced can give some insight. I do however think it should only change the order of when errors are detected.

Minimal example which displays this issue, add on more calls to the chain method to make it more noticeable.

fn main() {
    let _ = test(Some(0).into_iter());
}

trait Parser {
    type Input: Iterator;
    type Output;
    fn parse(self, input: Self::Input) -> Result<(Self::Output, Self::Input), ()>;
    fn chain<P>(self, p: P) -> Chain<Self, P> where Self: Sized {
        Chain(self, p)
    }
}

struct Token<T>(T::Item) where T: Iterator;

impl<T> Parser for Token<T> where T: Iterator {
    type Input = T;
    type Output = T::Item;
    fn parse(self, _input: Self::Input) -> Result<(Self::Output, Self::Input), ()> {
        Err(())
    }
}

struct Chain<L, R>(L, R);

impl<L, R> Parser for Chain<L, R> where L: Parser, R: Parser<Input = L::Input> {
    type Input = L::Input;
    type Output = (L::Output, R::Output);
    fn parse(self, _input: Self::Input) -> Result<(Self::Output, Self::Input), ()> {
        Err(())
    }
}

fn test<I>(i: I) -> Result<((), I), ()> where I: Iterator<Item = i32> {
    Chain(Token(0), Token(1))
        .chain(Chain(Token(0), Token(1)))
        .chain(Chain(Token(0), Token(1)))
        .chain(Chain(Token(0), Token(1)))
        .chain(Chain(Token(0), Token(1)))
        .chain(Chain(Token(0), Token(1)))
        .chain(Chain(Token(0), Token(1)))
        .chain(Chain(Token(0), Token(1)))
        .chain(Chain(Token(0), Token(1)))
        .parse(i)
        .map(|(_, i)| ((), i))
}

Fixes #21231

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 5, 2016

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 5, 2016

This does not currently fix the compile times of the example @asajeffrey posted. I do however have a theory about that specific issue which I will investigate closer. If it turns out to be correct then this PR (or something close to it) is almost guaranteed to be necessary to solved the issue).

In the iterator chaining example (pasted below) the main problem is the large number of instantiations of associated types. #20304 indicates that this should be possible to solve by adding making normalize_projection_type memorized. While it is easy to add a FnvHashMap<ProjectionTy, Ty> which stores the result of each call it won't actually help much as it may be called with <$0 as Test>::Type and <$1 as Test>::Type (where $0 is a type variable) and these two will not be found equal even if it has already been inferred that they are. By doing the equivalence relation as early as possible the chance of this kind of cache-miss can at least be reduced by always inserting types which have only "parent" variables filled.

Example:

$0 <=> $1 // $1 becomes the "parent"
normalize_projection_type($0::Type) // Cache miss, inserts $1::Type
normalize_projection_type($1::Type) // Cache hit on $1::Type
normalize_projection_type($0::Type) // Cache miss, inserts $0::Type
$0 <=> $1 // $1 becomes the "parent"
normalize_projection_type($1::Type) // Cache miss, inserts $1::Type

Ideally the second case would work as well but I am unaware of a key-value data structure where the keys may change.

fn main() {
    let iter = (0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2)
        .chain(0..2);
    for i in iter { println!("{:?}", i); }
}
@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Mar 6, 2016

That looks +1. I want to check how this affects performance on normal rustc builds.

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 6, 2016

Pushed a minor change to avoid retrieving the root variable more than is necessary. Also renamed "parent_*" to "root_" as that seems more like the naming scheme in the unification table module.

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 6, 2016

☔️ The latest upstream changes (presumably #32056) made this pull request unmergeable. Please resolve the merge conflicts.

@Marwes Marwes force-pushed the Marwes:unification_table_for_eq_relations branch from d78fefb to 5cfd971 Mar 7, 2016

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 8, 2016

@Marwes I didn't get a chance to read this patch in detail yet, but I was actually thinking of taking precisely this same approach for other independent reasons, so I'm definitely +1 on the overall idea. I'll try to take a look tomorrow!

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 8, 2016

Can you add the test case that you mention here into the repo? How about making a directory src/test/run-pass/bench and putting it in there? I can move a few other similar "benchmark" test cases into there. It's great to be able to check these when later refactoring things.

Another option -- perhaps a better option -- is to add the test case to as a PR to https://github.com/nrc/rustc-perf. I plan to do this in any case though =) along with a few other similar microbenchmarks for extreme cases.

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 8, 2016

Can you add the test case that you mention here into the repo?

I assume that it won't be run during make check? While this significantly reduces type checking times it does not fix them completely as the example is also affected by #20304 so It might actually slow down check times otherwise.

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 8, 2016

@nikomatsakis I read all your comments and I will go through and do a commit tomorrow.

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 9, 2016

Those two commits should fix all nits. I can rebase and squash all commits if that would be cleaner.

I tried an alternate way of handling relations as well. I figured it may be more memory efficient to move the non-root relation vector into the RelateRange action so that it can be quickly rolled back while also allowing the vector to be deallocated. Instead of a decrease I got an increase in memory use which seems odd so either I am missing something or the undo log is never cleared. I may investigate this a bit further but this PR should should be ok modulo a rebase.

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 11, 2016

Seems that make tidy fails because of the missing license. I tried copying the license from a different file but that does not solve it. Any ideas?

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Mar 11, 2016

@Marwes, src/test/run-pass/bench/issue-32062.rs doesn't seem to have a license at the top. Maybe you forgot to push again after you tried to fix it?

@Marwes Marwes force-pushed the Marwes:unification_table_for_eq_relations branch from d027683 to ea4a051 Mar 11, 2016

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 11, 2016

@steveklabnik I only added it locally and tested running make tidy after which it still failed. Anyway, turns out the file had CRLF line endings locally as I had copy pasted the example from this PR. Fixed that and amended the commit so it should pass now.

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 11, 2016

I am not able to reproduce the current travis failure locally so I am unfortunately not sure how to fix it.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 11, 2016

I'm trying to reproduce locally.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 11, 2016

r=me btw -- I think we can (and will) do some more work in this area, but this seems like a great step forward. But let's see if we can get to the bottom of this travis failure.

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 20, 2016

Were a bit busy last week but I finally got time to go through the changes again. Turns out I missed/forgot that drop is not called until after the function is called when in argument position. Still not sure why I can't reproduce it locally (might be because of different order of iteration over a hashmap somewhere I guess?).

Let me know if I should rebase.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 21, 2016

On Sun, Mar 20, 2016 at 06:38:07AM -0700, Markus Westerlind wrote:

Were a bit busy last week but I finally got time to go through the changes again. Turns out I missed/forgot that drop is not called until after the function is called when in argument position. Still not sure why I can't reproduce it locally (might be because of different order of iteration over a hashmap somewhere I guess?).

Let me know if I should rebase.

OK, please do rebase. I am also surprised that I didn't see this
locally. Perhaps a hashmap indeed.

Improve time complexity of equality relations
This PR adds a `UnificationTable` to the `TypeVariableTable` type which
is used store information about variable equality instead of just
storing them in a vector for later processing. By using a
`UnificationTable` equality relations can be resolved in O(n) (for all
realistic values of n) rather than O(n!) which can give massive
speedups in certain cases (see combine as an example).

Link to combine: https://github.com/Marwes/combine

@Marwes Marwes force-pushed the Marwes:unification_table_for_eq_relations branch from c6a771f to e00cdd7 Mar 21, 2016

@Marwes

This comment has been minimized.

Copy link
Contributor

Marwes commented Mar 21, 2016

Rebased

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 21, 2016

@bors r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 21, 2016

📌 Commit e00cdd7 has been approved by nikomatsakis

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Mar 21, 2016

@Marwes it would be fantastic if you could upload that test case to https://github.com/nrc/benchmarks -- you just have to make a makefile, see this directory for an example.

Marwes added a commit to Marwes/benchmarks that referenced this pull request Mar 21, 2016

Add a test for #32062
Add a test on @nikomatsakis for the improvement for equality relations
implemented in rust-lang/rust#32062

Marwes added a commit to Marwes/benchmarks that referenced this pull request Mar 21, 2016

Add a test for #32062
Add a test on @nikomatsakis suggestion for the improvement for equality relations implemented in rust-lang/rust#32062

@Marwes Marwes referenced this pull request Mar 21, 2016

Merged

Add a test for #32062 #4

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 21, 2016

⌛️ Testing commit e00cdd7 with merge 21922e1...

bors added a commit that referenced this pull request Mar 21, 2016

Auto merge of #32062 - Marwes:unification_table_for_eq_relations, r=n…
…ikomatsakis

Improve time complexity of equality relations

This PR adds a `UnificationTable` to the `TypeVariableTable` type which is used store information about variable equality instead of just storing them in a vector for later processing. By using a `UnificationTable` equality relations can be resolved in O(n) (for all realistic values of n) rather than O(n!) which can give massive speedups in certain cases (see combine as an example).

Link to combine: https://github.com/Marwes/combine

@bluss bluss added the relnotes label Mar 21, 2016

@bors bors merged commit e00cdd7 into rust-lang:master Mar 22, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details

@Marwes Marwes deleted the Marwes:unification_table_for_eq_relations branch Mar 23, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment