Proposal for more first-class handling of sysimages in Pkg #2008

Open
KristofferC opened this issue Sep 6, 2020 · 9 comments

@KristofferC
Sponsor Member

KristofferC commented Sep 6, 2020

Issue

Using a custom sysimage can drastically reduce package load times. The go-to solution for this is PackageCompiler.jl, which works well but is used less than its benefits warrant. From some discussion, it seems that using it carries a bit too much "mental overhead". Building and using a sysimage with PackageCompiler.jl requires you to:

  • Install the package
  • Manually list the packages you want to precompile into a sysimage
  • Make sure you start Julia with that sysimage
  • Make sure you update the sysimage when your dependencies update or you will load old versions of everything.
    There is currently no way to automatically detect this.
  • For best effect, also provide a "precompilation execution script" that is used to gather data of what functions to precompile.
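To make the "no way to automatically detect this" point concrete, here is a minimal sketch of what a staleness check could look like, assuming the sysimage records the version of every package baked into it. The function name `outdated_packages` and the two dictionaries are hypothetical; nothing like this exists in Pkg or PackageCompiler today.

```julia
# Hypothetical helper: compare the versions baked into a sysimage with the
# versions currently resolved in the manifest and report the mismatches.
function outdated_packages(baked::Dict{String,VersionNumber},
                           manifest::Dict{String,VersionNumber})
    stale = Pair{String,Pair{VersionNumber,VersionNumber}}[]
    for (name, baked_version) in baked
        current = get(manifest, name, nothing)
        # Packages that are no longer in the manifest are not reported here.
        current === nothing && continue
        if current != baked_version
            push!(stale, name => (baked_version => current))
        end
    end
    return sort!(stale; by = first)
end

# Illustrative data only, mirroring the Crayons update in the session below.
baked = Dict("Crayons" => v"4.0.3", "Example" => v"0.5.3")
manifest = Dict("Crayons" => v"4.0.4", "Example" => v"0.5.3")

for (name, (old, new)) in outdated_packages(baked, manifest)
    println(name, " ", old, " => ", new)
end
```

A real implementation would read both sides from the manifest and from metadata stored in the sysimage rather than from hand-written dictionaries.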

Since load time of packages and "time to first plot" are a frequent gripe about Julia, it makes sense to see if we can give a better interface to PackageCompiler.

Proposal

The proposal here is to introduce a new set of Pkg API that handles sysimages. To give a taste of what a session would look like:

pkg> sysimage create
Info: Creating a new sysimage based on the packages in current project at `~/.julia/environments/v1.5/Project.toml`
Packages tracked by path and their dependencies not put into sysimage:
    - OhMyREPL
    └ DataStructures, Crayons
Info: Package `OhMyREPL` not put into sysimage because it is tracked by path. This caused its dependencies `OrderedCollect

pkg> sysimage status
(@v1.5) pkg> status
Status `~/.julia/environments/v1.5/sysimage.dylib`
  [6e4b80f9] BenchmarkTools v0.5.0
  [f68482b8] Cthulhu v1.2.0
  [0c46a032] DifferentialEquations v6.15.0
  [7876af07] Example v0.5.3
  [8fb92a4a] Exfiltrator v0.1.0
  [b22a6f82] FFMPEG_jll v4.3.1+2
Package in project not in sysimage
  [6e4b80f9] OhMyREPL v0.7.0 `~/JuliaPkgs/OhMyREPL.jl`
  [ae3bc0f9] DataStructures v0.5.0
 [a8cc5b0e] + Crayons v4.0.4

pkg> up
Updating `~/.julia/environments/v1.5/Project.toml`
  [a8cc5b0e] + Crayons v4.0.4
Updating `~/.julia/environments/v1.5/Manifest.toml`
  [a8cc5b0e] ↑ Crayons v4.0.3 ⇒ v4.0.4

pkg> sysimage status
Warn: Some packages in the sysimage are out of date with project, run `sysimage update` to update it:
  [a8cc5b0e] ↑ Crayons v4.0.3 ⇒ v4.0.4
...

pkg> sysimage update
...

Next time we start julia:

> julia --project

❯ /usr/local/bin/julia -q

Info: Automatically using sysimage at `~/.julia/environments/v1.5/sysimage.dylib`
julia> @time using DifferentialEquations # look how fast
0.0202 seconds (144.73 k allocations: 7.456 MiB)

So the concrete proposal here is to add convenience functionality to Pkg that makes dealing with sysimages easier.

In addition, this proposes adding some functionality to Julia itself that allows it to automatically detect a custom sysimage next to the project and use that for the Julia process. This could be done via some naming convention.
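As a rough illustration of the naming-convention idea, the lookup could be as small as the sketch below. The file name `sysimage.<dlext>` is taken from the mock session above; the function `find_project_sysimage` is a hypothetical name, and this is not how Julia currently starts up.

```julia
using Libdl  # provides `Libdl.dlext`, the platform's shared-library extension

# Hypothetical convention: a file named `sysimage.<dlext>` sitting next to the
# project file is treated as the custom sysimage for that project.
function find_project_sysimage(project_file::AbstractString)
    candidate = joinpath(dirname(project_file), "sysimage." * Libdl.dlext)
    return isfile(candidate) ? candidate : nothing
end

# Usage on a throwaway directory.
mktempdir() do dir
    project = joinpath(dir, "Project.toml")
    touch(project)
    @show find_project_sysimage(project)   # no sysimage next to the project yet
    touch(joinpath(dir, "sysimage." * Libdl.dlext))
    @show find_project_sysimage(project)   # now the convention-named file is found
end
```

Returning `nothing` when no file is found keeps the fallback to the default sysimage explicit for the caller.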

Why in Pkg and not in a separate package?

The main point of this proposal is to reduce the friction in using e.g. PackageCompiler. Bundling it with Pkg allows it to use the super user-friendly Pkg REPL with no need to manually install anything. Also, we likely want to reuse a lot of the code in Pkg for dealing with projects, for status printing, etc., so from that point of view it makes sense to have it in Pkg. One question is whether the code for PackageCompiler itself should move into Pkg. I think it is best not to do this but instead to just install PackageCompiler into the global project from Pkg when the sysimage command is used for the first time.

Possible complications:

  • Sysimages are not usable across different Julia versions. Right now, upgrading the Julia minor version is super easy. With a custom sysimage one needs to refresh all the sysimages.
  • Creating a sysimage requires a compiler. PackageCompiler provides a compiler via the artifact system on Windows and tries to use a local one on mac/linux. We could just ship a compiler via the artifact system on Mac and Linux as well if there are problems with using the system one.
  • If one has a custom sysimage for the default environment, starts julia, changes the environment, and then starts loading packages, packages from the sysimage for the default environment will still be "locked-in". Pkg could warn about this when a new project is activated.

cc @tkf since I think you have thought a bit about stuff like this

@fredrikekre
Member

In addition, this proposes adding some functionality to Julia itself that allows it to automatically detect a custom sysimage next to the project and use that for the Julia process. This could be done via some naming convention.

xref JuliaLang/julia#35794

@tkf
Member

tkf commented Sep 6, 2020

Yeah, I wrote JuliaLang/julia#35794 so that user-friendly interface like this would be easy to implement.

FWIW, the proposal LGTM. A few minor comments:

just install PackageCompiler into the global project from Pkg

Maybe do what --bug-report does? IIUC it checks the current environment and then installs BugReporting.jl in a temporary environment if it's not installed. This approach was very handy for fixing BugReporting.jl bugs. (Ref @StefanKarpinski's comment JuliaLang/julia#35494 (comment))

  • Right now, upgrading the Julia minor version is super easy. With a custom sysimage one needs to refresh all the sysimages.

My approach in JuliaLang/julia#35794 was to compute the system image storage path from (the hash of) the path to the julia binary. This way, a mismatched system image is not used and julia falls back to the default system image.

I think it's likely that a new minor version would be installed in a different path so this may be enough. To be more careful, I think we can include, e.g., the Julia version in the hash.

@KristofferC
Sponsor Member Author

Maybe do what --bug-report does? IIUC it checks for the current environment and then install BugReporting.jl in a temporary environment if it's not installed.

Yes, that is better.

I think it's likely that minor version would be installed in a different path so this may be enough.

Not sure it will be enough on mac where I think it is /Applications/Julia-1.5.app/Contents/Resources/julia/bin/julia for all 1.5 versions.

@tkf
Member

tkf commented Sep 6, 2020

/Applications/Julia-1.5.app/Contents/Resources/julia/bin/julia for all 1.5 versions.

Ah, that's unfortunate. I guess I'd have to put the version in the hash if we are going to use JuliaLang/julia#35794.
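For illustration, here is a minimal sketch of a storage path keyed on both the binary path and the Julia version. The depot layout, the `sysimages` directory name, and the function `sysimage_dir` are assumptions made up for this example; this is not the implementation in JuliaLang/julia#35794.

```julia
using SHA  # standard library; provides `sha1`

# Hypothetical layout: <depot>/sysimages/<sha1 of binary path and version>/
# Including the version means two releases installed at the same path do not
# share (and therefore cannot pick up) each other's sysimage.
function sysimage_dir(depot::AbstractString, julia_bin::AbstractString,
                      version::VersionNumber)
    key = string(julia_bin, '\0', version)   # '\0' separates the two fields
    return joinpath(depot, "sysimages", bytes2hex(sha1(key)))
end

# The macOS path from the comment above, shared by all 1.5.x releases.
bin = "/Applications/Julia-1.5.app/Contents/Resources/julia/bin/julia"
dir_150 = sysimage_dir("/tmp/depot", bin, v"1.5.0")
dir_151 = sysimage_dir("/tmp/depot", bin, v"1.5.1")
println(dir_150 == dir_151 ? "same directory" : "different directories")
```

In practice one would pass `Sys.BINDIR` (or the full binary path) and `VERSION` of the running process instead of literals.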

@ericphanson
Contributor

ericphanson commented Oct 16, 2020

This is only semi related to the feature proposed here, but I think it would be helpful if the resolver could take into account the sysimage when choosing versions of dependencies. E.g. if I start with a "base" sysimage with say things like Plots, and then I make an environment and start adding packages to it, it would be great if my shared dependencies / transitive dependencies chose versions that did not conflict with those baked into the sysimage already. Already the manifest you get from a Project.toml depends on the version of Julia you use, but maybe it should actually depend on the sysimage you use (of which Julia version is kind of a special case).

My actual use-case isn't plots, but is very similar, something like: some process has produced a docker image with Julia and a sysimage and some code and/or other artifacts. Now I want to start from that image and add some more packages. I don't want to regenerate the sysimage (since that takes a while and the code I'm adding is comparatively light) but I do want versions resolved correctly so I don't run into weird bugs.

If one has a custom sysimage for the default environment, starts julia, changes the environment, and then starts loading packages, packages from the sysimage for the default environment will still be "locked-in". Pkg could warn about this when a new project is activated.

I think the above could help with this; at least, if you don't have a manifest, it could try to resolve a compatible manifest instead of just giving a warning. And if you do have a Manifest, it could warn or maybe even prompt to regenerate the manifest.
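One way to picture this is a pre-filter that runs before resolution: for every package baked into the sysimage, only the baked version stays among the candidates. The sketch below is hypothetical (the names `constrain_to_sysimage` and `conflicts` are made up, and the real Pkg resolver works on a much richer data structure); the version numbers are illustrative only.

```julia
# Hypothetical pre-filter: packages baked into the sysimage are pinned to the
# baked version; all other packages keep their full candidate list.
function constrain_to_sysimage(candidates::Dict{String,Vector{VersionNumber}},
                               baked::Dict{String,VersionNumber})
    constrained = Dict{String,Vector{VersionNumber}}()
    for (name, versions) in candidates
        if haskey(baked, name)
            constrained[name] = filter(==(baked[name]), versions)
        else
            constrained[name] = versions
        end
    end
    return constrained
end

# Packages left without any candidate cannot be reconciled with the sysimage;
# this is where a warning or a prompt to regenerate would come in.
conflicts(constrained) = sort!(String[n for (n, vs) in constrained if isempty(vs)])

# Made-up candidate sets for illustration.
candidates = Dict("Crayons" => [v"4.0.3", v"4.0.4"], "Plots" => [v"1.6.0"])
baked = Dict("Crayons" => v"4.0.3", "Plots" => v"1.5.0")

constrained = constrain_to_sysimage(candidates, baked)
println(conflicts(constrained))
```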

@KristofferC
Sponsor Member Author

This is only semi related to the feature proposed here, but I think it would be helpful if the resolver could take into account the sysimage when choosing versions of dependencies.

If we store the version of Plots into the sysimage, this could be done. And yes, I think that should be done as a part of this proposal.

@ericphanson
Contributor

ericphanson commented Oct 16, 2020

It seems to me there might be some overlap with #1233 in terms of merging projects. I.e. if you have a sysimage loaded, then any project you activate is kind of a "subproject" of the project baked into that sysimage (with regard to version resolution; I'm not talking about folder structure or anything like that, of course).

@KristofferC
Sponsor Member Author

Yes, I think that is a pretty good way to look at it.

@ericphanson
Contributor

It would be great if this proposal had some way of excluding packages from the sysimage, e.g. just compile the registered dependencies of the package whose environment you are in, so you don't need to recompile the sysimage if you update a dev'd / add'd dependency or the package code itself.
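A minimal sketch of such a split, assuming each dependency carries a tag saying how it is tracked. The `Dep` type, the tracking symbols, and `split_for_sysimage` are all hypothetical names for illustration, not part of Pkg.

```julia
# Hypothetical record of how a dependency is tracked in the manifest.
struct Dep
    name::String
    tracking::Symbol   # :registry, :path (dev'd) or :repo (add'd by URL/branch)
end

# Only registry-tracked dependencies go into the sysimage; everything else is
# left out so that editing it does not force a sysimage rebuild.
function split_for_sysimage(deps::Vector{Dep})
    included = [d.name for d in deps if d.tracking === :registry]
    excluded = [d.name for d in deps if d.tracking !== :registry]
    return (included = included, excluded = excluded)
end

deps = [Dep("BenchmarkTools", :registry),
        Dep("OhMyREPL", :path),
        Dep("Example", :registry)]

result = split_for_sysimage(deps)
println("in sysimage:  ", join(result.included, ", "))
println("left outside: ", join(result.excluded, ", "))
```

As in the mock `sysimage create` output earlier in the thread, dependencies that are only reachable through an excluded package would need to be excluded as well; this sketch does not handle that.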
