Reduce the first time to solve from 5 seconds to 1 second for Tsit5 #1465
ChrisRackauckas merged 9 commits into master
Ya'll think you write good compilers? Well, I'm the compiler now!

```julia
using OrdinaryDiffEq, SnoopCompile

function lorenz(du,u,p,t)
  du[1] = 10.0(u[2]-u[1])
  du[2] = u[1]*(28.0-u[3]) - u[2]
  du[3] = u[1]*u[2] - (8/3)*u[3]
end

u0 = [1.0;0.0;0.0]
tspan = (0.0,100.0)
prob = ODEProblem(lorenz,u0,tspan)
alg = Tsit5()
tinf = @snoopi_deep solve(prob,alg)
itrigs = inference_triggers(tinf)
itrig = itrigs[13]
ascend(itrig)
@time solve(prob,alg)

using ProfileView
ProfileView.view(flamegraph(tinf))

# v5.60.2
# InferenceTimingNode: 1.249748/4.881587 on Core.Compiler.Timings.ROOT() with 2 direct children

# Before
# InferenceTimingNode: 1.136504/3.852949 on Core.Compiler.Timings.ROOT() with 2 direct children

# Without `@turbo`
# InferenceTimingNode: 0.956948/3.460591 on Core.Compiler.Timings.ROOT() with 2 direct children

# With `@inbounds @simd`
# InferenceTimingNode: 0.941427/3.439566 on Core.Compiler.Timings.ROOT() with 2 direct children

# With `@turbo`
# InferenceTimingNode: 1.174613/11.118534 on Core.Compiler.Timings.ROOT() with 2 direct children

# With `@inbounds @simd` everywhere
# InferenceTimingNode: 0.760500/1.151602 on Core.Compiler.Timings.ROOT() with 2 direct children
```
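For anyone reading the numbers: as far as I understand the SnoopCompile output, the two values on a `ROOT()` line are exclusive/inclusive seconds, where the exclusive part is everything that is not type inference (codegen, LLVM, runtime) and the inclusive part is the total. Under that assumption, a rough sketch of the time spent in inference for a few of the rows above (`inference_seconds` is just an illustrative helper, not part of SnoopCompile):

```julia
# (exclusive, inclusive) seconds copied from the ROOT() lines above.
timings = [
    "v5.60.2"                         => (1.249748, 4.881587),
    "With @turbo"                     => (1.174613, 11.118534),
    "With @inbounds @simd everywhere" => (0.760500, 1.151602),
]

# Assumption: inclusive minus exclusive on ROOT is the time spent in inference.
inference_seconds(t) = t[2] - t[1]

for (label, t) in timings
    println(rpad(label, 34), round(inference_seconds(t); digits = 2), " s in inference")
end
```

Read this way, the final version leaves well under half a second of inference, and most of what remains is outside inference.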
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
LoopVectorization = "bdcacae8-1622-11e9-2a5c-532679323890"
Laptop
Before:
InferenceTimingNode: 1.585750/5.363441 on Core.Compiler.Timings.ROOT() with 2 direct children
After:
InferenceTimingNode: 0.885957/1.254411 on Core.Compiler.Timings.ROOT() with 2 direct children

LOL Vern7
Before:
InferenceTimingNode: 5.962703/13.461966 on Core.Compiler.Timings.ROOT() with 1 direct children
After:
InferenceTimingNode: 2.979609/3.563301 on Core.Compiler.Timings.ROOT() with 2 direct children

Vern9
Before:
InferenceTimingNode: 17.960255/23.513515 on Core.Compiler.Timings.ROOT() with 2 direct children
After:
InferenceTimingNode: 6.969864/7.531495 on Core.Compiler.Timings.ROOT() with 2 direct children

* reduce compile times by specializing broadcasts to loops

  Companion PR to SciML/OrdinaryDiffEq.jl#1465
* Update src/calculate_residuals.jl

  Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>
* Update src/calculate_residuals.jl

  Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>
* Update src/calculate_residuals.jl

  Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>
* simd ivdep
* remove reduction compile

Co-authored-by: Yingbo Ma <mayingbo5@gmail.com>

Yep, broadcasting is quite expensive for the compiler. Sorry you had to write all this out by hand, but nice outcome! You're getting to be a master of the tools!

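To make the broadcast-versus-loop point concrete, here is a minimal sketch, assuming plain `Array{Float64}` inputs. It is not the actual `calculate_residuals` code; the function names and the simplified residual formula are made up for illustration. The broadcast version makes the compiler infer a nested `Broadcasted` expression, while the loop version only contains scalar operations on concrete types.

```julia
# Hypothetical helpers for illustration, not the DiffEqBase implementation.

# Generic version: the fused broadcast builds a nested Broadcasted object
# that has to be inferred and specialized for each combination of argument types.
function residuals_broadcast!(out, utilde, u0, u1, alpha, rho)
    @. out = utilde / (alpha + max(abs(u0), abs(u1)) * rho)
    return out
end

# Specialized version for plain Float64 arrays: same arithmetic, written as a loop.
# `ivdep` asserts that `out` does not alias memory in a way that creates
# loop-carried dependencies, so only use it when that is true.
function residuals_loop!(out::Array{Float64}, utilde::Array{Float64},
                         u0::Array{Float64}, u1::Array{Float64},
                         alpha::Float64, rho::Float64)
    @inbounds @simd ivdep for i in eachindex(out, utilde, u0, u1)
        out[i] = utilde[i] / (alpha + max(abs(u0[i]), abs(u1[i])) * rho)
    end
    return out
end

utilde = [0.1, -0.2, 0.3]
u0 = [1.0, 2.0, -3.0]
u1 = [-1.5, 0.5, 2.0]
a = residuals_broadcast!(zeros(3), utilde, u0, u1, 1e-6, 1e-3)
b = residuals_loop!(zeros(3), utilde, u0, u1, 1e-6, 1e-3)
```

Keeping the generic broadcast method as the fallback and adding the loop only for the common concrete array type is one way to retain generality while paying less compile time in the default case.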
Writing it out by hand is fine. The fact that I cannot do the same for RecursiveFactorization.jl's compile times bugs me though...

In the next month I'm planning to go through some of https://github.com/JuliaLang/julia/issues?q=is%3Aopen+is%3Aissue+label%3Aprecompile. That might make the inference part go away. Any overhead due to codegen/LLVM won't be helped though (yet).

From these studies and #1467, JuliaLinearAlgebra/RecursiveFactorization.jl#29, and JuliaSIMD/TriangularSolve.jl#8, the biggest thing for us would be to figure out why the RecursiveFactorization/TriangularSolve/LoopVectorization stack won't cache the precompiles. In some sense it should be easy: the lowest level is just functions on

I suspect a big part of the problem is that LoopVectorization owns the An additional problem is that most of the time is not spent during inference. However, this is from compiling a large number of methods. The first EDIT:

AFAICT @ChrisRackauckas, you duplicated #1467 up there, feel free to edit and then I will take a look.

Edited. Ahh yes, that's the one piece in the chain I didn't set up to precompile! I'll go add something to DiffEqBase and see if that handles it.
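
For reference, a minimal sketch of what "setting up a piece to precompile" can look like, using Base's `precompile(f, argtypes)`. The function `axpy_loop!` is a hypothetical stand-in, not something in DiffEqBase. Inside a package, a directive like this at module top level asks Julia to compile that signature while the package is precompiled; as discussed above, whether the result is actually cached and reused downstream is not guaranteed.

```julia
# Hypothetical stand-in for a low-level kernel with concrete argument types.
function axpy_loop!(y::Vector{Float64}, a::Float64, x::Vector{Float64})
    @inbounds @simd for i in eachindex(y, x)
        y[i] += a * x[i]
    end
    return y
end

# Ask Julia to compile this exact signature ahead of the first call.
# `precompile` returns `true` if the method could be compiled for these types.
ok = precompile(axpy_loop!, (Vector{Float64}, Float64, Vector{Float64}))

y = axpy_loop!([1.0, 2.0], 2.0, [10.0, 20.0])
```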