Improve time-to-first-read via precompile + despecialization #875
I've been wanting/needing to do this for a while. I worked through the excellent tutorials in https://timholy.github.io/SnoopCompile.jl/stable/snoopi_deep_parcel/ to help figure out cases where we were overspecializing certain methods (`CSV.Context`) and to generate useful precompile statements. On my machine, this reduces the time-to-first-read (TTFR) from ~8s to ~1.5s, while package loading time goes from 3s to 10s, which both seem like good tradeoffs.

To avoid overspecializing the `CSV.Context` method, which has a huge method body, we introduce a simple `Arg` wrapper struct to kill inference and make the method signature concrete. The `@refargs` macro conveniently wraps each arg in a _call_ with `Arg(x)`; for a _definition_, it annotates each arg as `x::Arg` and inserts an unwrap as the first thing in the method body, like `x = x[]::Int` where `x` was declared as `x::Int` in the argument list.
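For readers who want to see the wrapping pattern spelled out, here is a minimal hand-written sketch of what the description above amounts to. The `Arg` definition and the `label_count` function are assumptions made for illustration; they are not copied from CSV.jl, and the real `@refargs` macro generates the wrapping and unwrapping rather than having it written out by hand.

```julia
# Hypothetical stand-in for the wrapper described above (not CSV.jl's definition).
# The field is untyped, so the wrapped value's type is hidden from the caller's
# signature and every call site passes the same concrete type, `Arg`.
struct Arg
    x::Any
end

# `a[]` unwraps the value.
Base.getindex(a::Arg) = a.x

# Definition side, written by hand: each argument is annotated `::Arg`, and the
# first statements unwrap it with a type assertion matching the type the
# argument was originally declared with.
function label_count(n::Arg, label::Arg)
    n = n[]::Int
    label = label[]::String
    return string(label, "=", n)
end

# Call side, written by hand: each argument is wrapped in `Arg(...)`.
result = label_count(Arg(3), Arg("rows"))
```

Because every argument has the concrete type `Arg`, the method is compiled for a single signature regardless of what the wrapped values are; the type assertions in the body restore type information after the unwrap.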
Codecov Report
@@ Coverage Diff @@
## main #875 +/- ##
==========================================
- Coverage 90.77% 89.19% -1.58%
==========================================
Files 9 9
Lines 1961 2008 +47
==========================================
+ Hits 1780 1791 +11
- Misses 181 217 +36
Awesome! ❤️
Nice work, @quinnj! With reference to https://discourse.julialang.org/t/csv-jl-fails-precompling-typeerror-in-type-expression-expected-unionall-got-type-parsers-options/67195, one thing I've noticed (but not yet updated the SnoopCompile docs for, sorry) is that to more easily support multiple Julia versions & architectures (e.g., 32-bit vs 64-bit), often a better strategy is to have your precompile file look like this:

```julia
function _precompile_()
    # Only run the workload while Julia is generating precompile output.
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    # Placeholders: substitute a small, typical workload for your package.
    x = my_pkg_object(args...)
    do_some_work(x, otherargs...)
end
```

for some tiny workloads typical of the data types you want your package to work on. This has the same effect as explicit
In other words, rather than creating a workload for snooping via
It's the same as
Very nice @timholy. Now, I get what you meant when we were talking about it during JuliaCon. I did get it, but couldn't find any example anywhere. Looking forward to docs too 👍
Yeah, the problem with voluminous docs is that when you realize new stuff it's a bit daunting to update them. I really need to restructure it in more cookbook format ("do this..."). |
* A little refactoring to improve inference and precompilation

The changes in files that are not precompile.jl are inference improvements, mainly from inspecting the results of `@code_typed`, Cthulhu.jl, and SnoopCompile.jl. The changes in precompile.jl come from comments by @timholy recommending that in our precompile process we can just call regular code instead of needing to call `precompile` with methods/arg types. I'm aware I don't understand all the details around precompilation, method invalidation, etc., but unfortunately I feel a bit blocked with CSV.jl's precompilation. With the changes in #875, we now see a fixed overhead of allocations when parsing, due (I'm told) to an issue in Base Julia (JuliaLang/julia#34055).
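To make the two precompile styles mentioned above concrete, here is a small sketch that contrasts them. The `parse_row` function is a hypothetical stand-in for package code, not anything from CSV.jl; `precompile` and the `jl_generating_output` check are the same calls that appear earlier in this thread.

```julia
# Hypothetical package function used only for illustration.
parse_row(line::AbstractString) = [parse(Int, s) for s in split(line, ',')]

# Style 1: an explicit directive. The argument types must be spelled out by
# hand, and those types can differ across Julia versions and architectures.
precompile(parse_row, (String,))

# Style 2: a tiny workload. Calling regular code lets Julia infer and compile
# whatever methods the call actually reaches.
function _precompile_()
    # Skip the workload unless Julia is currently generating precompile output.
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    parse_row("1,2,3")
    return nothing
end

_precompile_()
```

In a package, the call to `_precompile_()` would sit at the end of the module body so that it runs while the package is being precompiled; in an ordinary session the guard makes it return immediately.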
One thing I noticed that I would still like to look into, if time permits, is that if we do `f = CSV.File(IOBuffer("a,b,c\n1,2,3\n\n"))` in a fresh session, it takes ~1.5s, but if we then call `f = CSV.File(IOBuffer("a,b,c\n1,2,3\n\n"); header=false)`, it takes 3s to... recompile a bunch of stuff, I guess? I'd love to figure out why that's causing so much recompiling.
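One rough way to observe this kind of per-call compilation cost is to time the first call, a call with a new keyword argument, and a repeat call in a fresh session. The sketch below uses a hypothetical `read_table` function as a stand-in so that it runs without CSV.jl; it only illustrates the measurement pattern and says nothing about where the extra time in `CSV.File` actually goes.

```julia
# Hypothetical stand-in for a reader with a keyword argument (not CSV.File).
function read_table(io::IO; header::Bool=true)
    lines = readlines(io)
    rows = [split(l, ',') for l in lines if !isempty(l)]
    # With a header, the first row holds column names rather than data.
    return header ? rows[2:end] : rows
end

data = "a,b,c\n1,2,3\n\n"

# In a fresh session the first call includes compilation time.
t_first = @elapsed read_table(IOBuffer(data))
# Same function with an explicit keyword argument.
t_kw = @elapsed read_table(IOBuffer(data); header=false)
# Repeating the previous call measures run time only.
t_repeat = @elapsed read_table(IOBuffer(data); header=false)

println((first=t_first, keyword=t_kw, repeat=t_repeat))
```

A large gap between the keyword call and the repeat call points at compilation rather than parsing work; tools such as SnoopCompile.jl, mentioned above, can then show which methods are being inferred.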