Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arithmetic operations with inferred variables take quadratic time to type-check #25916

Closed
huonw opened this Issue May 31, 2015 · 5 comments

Comments

Projects
None yet
4 participants
@huonw
Copy link
Member

huonw commented May 31, 2015

This takes ~2 seconds to type check (according to -Z time-passes):

fn main() {
    macro_rules! f {
        () => { 0 + 0 }
    }
    // 16 per line
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();

    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();

    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();

    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
    f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();f!();
}

Doubling (respectively, halving) the number of f!() invocations multiplies (resp. divides) the time by 4.

It applies to arithmetic in consts/statics too which is, I imagine, where this will occur most often in practice (it's how I found it).

Observations:

  • for +, -, * and /, annotating either side (e.g. 0 + 0u8) or both sides makes it take only 0.03s and behave linearly (AFAICT)
  • for << and >>:
    • annotating the LHS (0u8 >> 0) reduces the time to ~1.5s,
    • annotating the RHS (0 >> 0u8) reduces it to ~1.1s but both are still quadratic,
    • annotating both takes it to 0.04s and linear
  • it seems to only be dependent on the number of operations in a single function/item, separating the 4 big chunks above across 4 functions reduces the time to ~0.5s.
  • there's no significant difference between using the macro or writing the operations inline (I only use f! to allow easier experimentation)

cc @rust-lang/compiler

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented May 31, 2015

This must be what's making my old macro-abusing inflate implementation take almost 300 seconds in typeck.
It almost sounds like type-checking a function is quadratic over the number of inference variables.

Having ran callgrind on the sample and also half of it, I'm seeing the following trace:

Function Calls (N=128) Calls (N=256) Calls (N)
check::op::check_binop 128 256 N
check::FnCtxt::resolve_type_vars_if_possible 512 1024 4N
check::FnCtxt::select_obligations_where_possible 512 + 12 1024 + 12 4N + 12
traits::FulfillmentContext::select_where_possible 512 + 27 1024 + 27 4N + 27
traits::FulfillmentContext::select 1024 + 155 2048 + 183 8N + M
infer::InferCtxt::commit_if_ok 83072 329984 5N² + 9N

An unrefined conclusion is that either check_binop shouldn't call fcx.resolve_type_vars_if_possible() or FulfillmentContext::select should be constant-time.

eddyb added a commit to eddyb/rust that referenced this issue May 31, 2015

Inline hash_table::calculate_offsets, used by iterators.
The `HashMap` and `HashSet` iterators use `RawTable::first_bucket_raw` which is generic and will get inlined cross-crate.
However, `first_bucket_raw` calls `calculate_offsets` and the call doesn't get inlined, despite being a simple function.
This missing `#[inline]` results in `hash_table::calculate_offsets` showing up at the top of a callgrind profile with 3 million calls (for the testcase in rust-lang#25916).
@bluss

This comment has been minimized.

Copy link
Contributor

bluss commented May 31, 2015

Is this the same issue as #23977?

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented May 31, 2015

@nikomatsakis I'm not too familiar with fulfill and infer, but couldn't select_where_possible track the latest InferCtxt snapshot it processed and look at the new information there?

bors added a commit that referenced this issue May 31, 2015

Auto merge of #25921 - eddyb:inline-calculate-offsets, r=Aatch
The `HashMap` and `HashSet` iterators use `RawTable::first_bucket_raw` which is generic and will get inlined cross-crate.
However, `first_bucket_raw` calls `calculate_offsets` and the call doesn't get inlined, despite being a simple function.
This missing `#[inline]` results in `hash_table::calculate_offsets` showing up at the top of a callgrind profile with 3 million calls (for the testcase in #25916).
@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented May 31, 2015

@eddyb

We have select_new_obligations exactly for this reason. It doesn't work for some reason tho.

@arielb1

This comment has been minimized.

Copy link
Contributor

arielb1 commented Jun 2, 2015

This is the same issue as
#25069
#23977

This is a different issue from #22204 (I think I will fix that one separately).

arielb1 pushed a commit that referenced this issue Aug 9, 2015

Ariel Ben-Yehuda
Mark round_up_to_next as inline
This speeds up rustc on #25916 from 1.36Â0.022s to 1.326Â0.025s

tbu- added a commit to tbu-/rust that referenced this issue Aug 9, 2015

Mark round_up_to_next as inline
This speeds up rustc on rust-lang#25916 from 1.36Â0.022s to 1.326Â0.025s

arielb1 pushed a commit to arielb1/rust that referenced this issue Aug 13, 2015

arielb1
Mark round_up_to_next as inline
This speeds up rustc on rust-lang#25916 from 1.36±0.022s to 1.326±0.025s

Tests pass locally (even on 32-bit :-)

bors added a commit that referenced this issue Aug 14, 2015

Auto merge of #27822 - arielb1:inline-round-take-2, r=Gankro
This speeds up rustc on #25916 from 1.36±0.022s to 1.326±0.025s

Tests pass locally (even on 32-bit :-)

r? @Gankro

Gankro added a commit to Gankro/rust that referenced this issue Aug 14, 2015

Mark round_up_to_next as inline
This speeds up rustc on rust-lang#25916 from 1.36±0.022s to 1.326±0.025s

Tests pass locally (even on 32-bit :-)

arielb1 pushed a commit to arielb1/rust that referenced this issue Jan 31, 2016

arielb1 pushed a commit that referenced this issue Feb 15, 2016

Ariel Ben-Yehuda
use stalled_on in all obligation types
this improves typeck performance by 5% (LLVM times are still huge).

Basically fixes #25916 (still O(n^2), but the example takes <1s to
compile).

bors added a commit that referenced this issue Feb 16, 2016

Auto merge of #31680 - arielb1:fast-fulfill, r=nikomatsakis
this improves typeck performance by 5% (LLVM times are still huge).

Basically fixes #25916 (still O(n^2), but the example takes <1s to
compile).

r? @nikomatsakis

bors added a commit that referenced this issue Feb 16, 2016

Auto merge of #31680 - arielb1:fast-fulfill, r=nikomatsakis
this improves typeck performance by 5% (LLVM times are still huge).

Basically fixes #25916 (still O(n^2), but the example takes <1s to
compile).

r? @nikomatsakis

@bors bors closed this in #31680 Feb 17, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.