Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

100x compile time slowdown in BandedMatrices.jl #27512

Closed
dlfivefifty opened this issue Jun 9, 2018 · 22 comments
Closed

100x compile time slowdown in BandedMatrices.jl #27512

dlfivefifty opened this issue Jun 9, 2018 · 22 comments
Labels
latency Compiler latency regression Regression in behavior compared to a previous version

Comments

@dlfivefifty
Copy link
Contributor

BandedMatrices.jl has had an almost 100x slowdown (see "bug-slowmul" branch) in 0.7-alpha:

julia> A = brand(5,5,1,1); b = randn(5); @time A*b # v0.6.2
  0.375991 seconds (335.46 k allocations: 16.421 MiB)

VS

julia> A = brand(5,5,1,1); b = randn(5); @time A*b  # v0.7, branch "bug-slowmul"
 24.676427 seconds (57.04 M allocations: 5.038 GiB, 3.76% gc time)

I sped this up to "only" 15x slowdown on "master" by making the code type stable:

julia> A = brand(5,5,1,1); b = randn(5); @time A*b # v0.7, branch "master"
  5.644144 seconds (16.42 M allocations: 1.252 GiB, 4.54% gc time)
@ufechner7
Copy link

ufechner7 commented Jun 9, 2018

On my PC the slow down is much smaller:

julia> A = brand(5,5,1,1); b = randn(5); @time A*b # v0.6.2
WARNING: importing deprecated binding Base.uninitialized into BandedMatrices.
WARNING: Base.uninitialized is deprecated, use undef instead.
 in module BandedMatrices
WARNING: Base.uninitialized is deprecated, use undef instead.
 in module BandedMatrices
WARNING: Base.uninitialized is deprecated, use undef instead.
 in module BandedMatrices
┌ Warning: using `A[I...] = x` to implicitly broadcast `x` across many locations is deprecated. Use `A[I...] .= x` instead.
│   caller = _banded_generic_matvecmul!(::Array{Float64,1}, ::Char, ::BandedMatrix{Float64}, ::Array{Float64,1}) at interface.jl:184
└ @ BandedMatrices interface.jl:184
  2.024492 seconds (7.35 M allocations: 432.613 MiB, 5.60% gc time)

With julia 0.63 I also need about 0.38 seconds.
I am using:

julia> versioninfo()
Julia Version 0.7.0-alpha.58
Commit c9b2274 (2018-06-09 05:25 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_PKG3_PRECOMPILE = enable

julia> 

And part of the slow down might be due to the deprecations.

@dlfivefifty
Copy link
Contributor Author

dlfivefifty commented Jun 9, 2018 via email

@ufechner7
Copy link

How can I checkout master with Pkg3?

@KristofferC
Copy link
Sponsor Member

You can pkg> add BandedMatrices#master.

@ufechner7
Copy link

ufechner7 commented Jun 9, 2018

Now I get:

julia> A = brand(5,5,1,1); b = randn(5); @time A*b # v0.6.2
  4.385049 seconds (16.40 M allocations: 1.245 GiB, 4.53% gc time)

So this is a slow-down by a factor of 11.5 .
But perhaps the problem is the new implementation of BandedMatrices in master and not Julia?

@dlfivefifty
Copy link
Contributor Author

The 100x slowdown is in a different branch “bug-slowmul”. The comparison between 0.6.2 and 0.7 was done on the same branch.

In any case, 30s compilation time is a problem with the compiler, not a package.

@juliohm
Copy link
Sponsor Contributor

juliohm commented Jun 10, 2018

Naive question, is the slow down present if you start julia --depwarn=no?

@dlfivefifty
Copy link
Contributor Author

There's no deprecation warnings, so julia --depwarn=no doesn't improve anything. (And it would be strange if it did.)

@dlfivefifty
Copy link
Contributor Author

dlfivefifty commented Jun 13, 2018

Here's another ridiculously slow example on master:

using Compat, BandedMatrices

BandedMatrixWithZero = Union{BandedMatrix{Float64}, UniformScaling}
# need to define the concept of zero
Base.zero(::Type{BandedMatrixWithZero}) = 0*I
A = BandedMatrix{BandedMatrixWithZero}(undef, 1, 2, 0, 1);
A[1,1] = BandedMatrix(Eye(1),(0,1))
A[1,2] = BandedMatrix(Zeros(1,2),(0,1))
A[1,2][1,1] = -1/3
A[1,2][1,2] = 1/3
B = BandedMatrix{BandedMatrixWithZero}(undef, 2, 1, 1, 1);
B[1,1] = 0.2BandedMatrix(Eye(1),(0,1))
B[2,1] = BandedMatrix(Zeros(2,1), (1,0))
B[2,1][1,1] = -2/30
B[2,1][2,1] = 1/3
@time A*B # 6s (v0.6.2) VS 620s (1.56 G allocations: 170.395 GiB, 5.82% gc time) (v0.7-alpha)

@ViralBShah
Copy link
Member

I can't just paste it in the repl and reproduce. Can you provide a complete script?

@ViralBShah
Copy link
Member

Just needed using BandedMatrices.

@ViralBShah
Copy link
Member

Took 1714 seconds on my computer 0.7-alpha. The code doesn't run on 0.6.3 with undef not defined. What is the 0.6 version like?

@dlfivefifty
Copy link
Contributor Author

Add using Compat

@ViralBShah
Copy link
Member

Even with Compat:

julia> B = BandedMatrix{BandedMatrixWithZero}(undef, 2, 1, 1, 1)
2×1 BandedMatrices.BandedMatrix{Union{BandedMatrices.BandedMatrix{Float64}, UniformScaling}}:
Error showing value of type BandedMatrices.BandedMatrix{Union{BandedMatrices.BandedMatrix{Float64}, UniformScaling}}:
ERROR: UndefRefError: access to undefined reference

@dlfivefifty
Copy link
Contributor Author

That’s just an error in show. If you ignore it the rest of the code should run.

@ViralBShah
Copy link
Member

ViralBShah commented Jun 13, 2018

Ok. I am just doing it mechanically, but these comments should help others reproducing. It did run in 15s for me on 0.6.3.

@ViralBShah ViralBShah added regression Regression in behavior compared to a previous version triage This should be discussed on a triage call labels Jun 13, 2018
@dlfivefifty
Copy link
Contributor Author

dlfivefifty commented Jun 13, 2018

I updated it so it should now be copy-and-pastable.

@StefanKarpinski StefanKarpinski removed the triage This should be discussed on a triage call label Jun 14, 2018
@StefanKarpinski
Copy link
Sponsor Member

Important but #triage is not appropriate.

@ViralBShah
Copy link
Member

The reason I had tagged triage was if this needs to go on the 1.0 milestone.

@dlfivefifty
Copy link
Contributor Author

I'm not one to over emphasise the "publicity" angle of 1.0, but I think it's a bad idea to release 1.0 with known serious bugs. Indeterministic compile time is a serious bug to me. (I don't know if it'll impact BandedMatrices.jl itself, as I'm planning to rewrite the whole matrix multiplication code.)

@JeffBezanson
Copy link
Sponsor Member

I tried the example in #27512 (comment) and it took about 5 seconds. Is this considered fixed?

@dlfivefifty
Copy link
Contributor Author

Is this considered fixed?

Yep, it only takes me 3s 🎉 And the original example(on master) is only about 1s:

julia> A = brand(5,5,1,1); b = randn(5); @time A*b;

  1.106104 seconds (5.57 M allocations: 300.408 MiB, 9.57% gc time)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
latency Compiler latency regression Regression in behavior compared to a previous version
Projects
None yet
Development

No branches or pull requests

7 participants