Precompile some functions for faster startup #673

Merged: 1 commit, Sep 7, 2015


Collaborator

@timholy timholy commented Sep 1, 2015

See discussion in JuliaLang/julia#12897

Collaborator Author

timholy commented Sep 2, 2015

OK, now I can give a more comprehensive report about the impact of this PR. First, the executive summary:

For someone who develops Gadfly:

  • this PR is superficially a net negative because of the large increase in build times
  • however, while hacking on Gadfly one could just comment out the top precompile directive, restoring faster build times (even faster than the one shown for master)
  • one should then receive net benefit in testing from the directives in Compose and other packages that contribute to making time-to-plot shorter
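As a rough illustration of the directives discussed in the list above, here is a minimal sketch in present-day Julia syntax. The module `DemoPlots` and the function `scale_point` are hypothetical stand-ins, not Gadfly code, and reading "the top precompile directive" as the file-level `__precompile__()` opt-in used in Julia 0.4 is an assumption; it appears only as a comment here.

```julia
# Hypothetical example: DemoPlots and scale_point are illustrative names only.
#
# In Julia 0.4 a package opted into caching with a line like the following,
# placed before `module`. Commenting it out (the assumed meaning of "the top
# precompile directive") would skip building the cache file while developing.
# __precompile__()

module DemoPlots

export scale_point

# A small function standing in for plotting internals.
scale_point(x::Float64, factor::Float64) = x * factor

# Explicit directive: compile this method for the given argument types now,
# rather than on the first call.
precompile(scale_point, (Float64, Float64))

end # module

using .DemoPlots

println(scale_point(2.0, 1.5))
```

Each `precompile` line names a function and a tuple of argument types, so a file of such directives can grow long, and that is where the extra build time comes from.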

For users:

  • The load time is about 20% longer, which is enough to notice
  • Time to first plot is about half that of master, and this time now dominates user experience

So, with some genuine caveats, I think this is worth doing.

The data

Master (with existing precompilation)

julia> tic(); using Gadfly; toc()
elapsed time: 5.778354972 seconds
5.778354972

# Time to plot "iris" dataset from a cold start (but with Gadfly already loaded)
julia> @time display(plot(ds,x="SepalLength", y="SepalWidth", Geom.point))
 36.515327 seconds (26.37 M allocations: 1.182 GB, 2.65% gc time)

# From a cold start,
julia> @time include("runtests.jl")
256.179169 seconds (302.63 M allocations: 10.059 GB, 3.34% gc time)

# If I "touch" Gadfly.jl (and all dependent packages are already compiled),
julia> tic(); using Gadfly; toc()
INFO: Precompiling module Gadfly...
elapsed time: 29.519638773 seconds

$ ls -lh ~/.julia/lib/v0.4/Gadfly.ji 
-rw------- 1 tim holy 926K Sep  2 03:22 /home/tim/.julia/lib/v0.4/Gadfly.ji

This PR

These numbers include the similar PR over at Compose, plus precompiles that I haven't yet posted for Iterators, DataArrays, and DataFrames (of 60, 187, and 90 lines respectively).

julia> tic(); using Gadfly; toc()
elapsed time: 6.994830364 seconds

julia> @time display(plot(ds,x="SepalLength", y="SepalWidth", Geom.point))
 15.677082 seconds (8.81 M allocations: 362.872 MB, 1.12% gc time)

julia> @time include("runtests.jl")
 142.193677 seconds (254.59 M allocations: 7.978 GB, 3.58% gc time)

julia> tic(); using Gadfly; toc()
INFO: Recompiling stale cache file /home/tim/.julia/lib/v0.4/Gadfly.ji for module Gadfly.
elapsed time: 104.22046293 seconds
104.22046293

$ ls -lh ~/.julia/lib/v0.4/Gadfly.ji
-rw------- 1 tim holy 2.8M Sep  1 22:31 /home/tim/.julia/lib/v0.4/Gadfly.ji
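To make the comparison easier to read, the ratios can be computed directly from the timings reported above. This is a small sketch in present-day Julia syntax; the labels and variable names are illustrative, and the numbers are copied from the two transcripts.

```julia
# Timings in seconds, copied from the transcripts above: (label, master, this PR).
measurements = [
    ("load (using Gadfly)", 5.778354972, 6.994830364),
    ("time to first plot", 36.515327, 15.677082),
    ("runtests.jl", 256.179169, 142.193677),
    ("rebuild after touch", 29.519638773, 104.22046293),
]

# Ratio of PR time to master time; a value below 1 means the PR is faster.
ratios = [pr / master for (_, master, pr) in measurements]

for ((label, _, _), r) in zip(measurements, ratios)
    println(rpad(label, 24), round(r; digits = 2))
end
```

By these figures, loading is roughly 21% slower, the first plot takes about 43% of master's time, the test suite about 56%, and rebuilding the cache takes about 3.5 times as long.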

Collaborator Author

timholy commented Sep 2, 2015

(Test failure on 0.4 is due to simonster/Reexport.jl#3).

@timholy timholy changed the title from "RFC (don't merge yet): precompile some functions for faster startup" to "Precompile some functions for faster startup" Sep 2, 2015
Collaborator

dcjones commented Sep 7, 2015

I compared this to @spencerlyon2's patch in #672, and this one does better so I'm merging it, but surprisingly it's only a little better (runtests.jl takes 130 seconds, versus 145 seconds in #672).

It's fun to see Gadfly creep towards not being the slowest plotting software ever written. 😀

dcjones added a commit that referenced this pull request Sep 7, 2015
Precompile some functions for faster startup
@dcjones dcjones merged commit 42b3ae4 into master Sep 7, 2015
Collaborator Author

timholy commented Sep 8, 2015

It might get better still if I get around to submitting my precompiles to Iterators, DataStructures, and DataFrames.

> It's fun to see Gadfly creep towards not being the slowest plotting software ever written. 😀

While performance continues to be a challenge in some areas, it's clearly much better than when I last looked. And some of the issues are due to limitations in Julia itself. As I dig further into the code, I have to say I grow more and more fond of the architecture. You've been able to keep a fairly clean design over ~20k LOC (including Compose) for something that has to handle a huge number of cases and messy user input. No small feat. I just hope that the performance improvements I submit don't uglify things too much.

@timholy timholy deleted the teh/snoop_precompile branch September 8, 2015 01:15

sglyon commented Sep 8, 2015

Please tell me you automated this somehow?

If not, we all owe you, @timholy (well, we owe you either way)!

Collaborator Author

timholy commented Sep 8, 2015

A package I'll soon try to polish, document, tag, and release: https://github.com/timholy/SnoopCompile.jl


sglyon commented Sep 8, 2015

That's sneaky and great!
