Creating a system image fails in Julia 1.10.0-beta2 while it works with 1.9 #50729
Cc: @pchintalapudi
What's the reasoning behind the 6 GB figure?
This 6 GB number is going to vary with the size of the system image and the number of threads (2 threads might not exceed it even for larger images). There is always the escape hatch of cutting it down to one thread (set `JULIA_IMAGE_THREADS=1`).
RAM usage with Julia 1.9.2: 10 GB. Strange: Julia 1.10 needs 24 GB on a 4-core CPU, but also on a 16-core CPU.
Does it work with one thread? (`JULIA_IMAGE_THREADS=1`)
Yes, after adding a `GC.gc()` at the end of the script. So we have two issues:
With 2, 4, and 16 threads it always needs 24 GB of RAM.
With or without multithreading? `--heap-size-hint` may be helpful here if the reduction is that steep.
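For reference, `--heap-size-hint` is passed on the Julia command line; the script name and the 8G value below are placeholders, not something prescribed in this thread:

```shell
# Hypothetical invocation showing where the flag goes; --heap-size-hint
# accepts a size such as 8G or 8000M. The script name is a placeholder.
CMD="julia --heap-size-hint=8G create_sys_image.jl"
echo "$CMD"
```

Note that, as discussed below, the flag is advisory to the GC and may not help during sysimage builds (see #50658).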
Without multithreading the needed size is 12.7 GB, worse than with Julia 1.9.2 (which needs 10.1 GB), but good enough to be able to build the image with 16 GB of RAM. And no, `--heap-size-hint` has no effect, but that is another bug: #50658
Same behavior on Windows 10. Works with 1.9, but fails with 1.10.0-beta1.
precompile.jl
The error:
That is a different issue, happening during linking rather than from running out of memory.
This is probably related to our use of external hidden symbols to link across compilation units in multithreaded image generation.
Why are there so many exports? There should only be a small handful.
Maybe the Windows `ld` counts external hidden symbols as exports?
@MariusDrulea can you check if your issue still happens on #50791 with
I noticed the PR got merged into the main branch. |
Here is the test for #50752. Still fails. Note I haven't set

Note the number of globals. The package compiler has other globals than what you count with this line of code: https://github.com/JuliaLang/julia/pull/50752/files#diff-dcb7645cc96941c621f6aeaaebbdcdd54e83fe925eed0135e4fd840806fb4d91R937
Full console output:
I set up the
@pchintalapudi I have also tested #50791 with
It needs to be set with `JULIA_IMAGE_THREADS=1`, not `JULIA_NUM_THREADS`.
Update:
#50791 is the change that makes
#50752 works, this is the PR I have tested. Isn't this expected?
#50874 works just fine on my Windows machine.
To answer my own question: I just have to copy the binaries of some commit into the `juliaup` folder, for example.
Well, this has not been completed. Please re-open. Two issues have been mixed up here: one Linux issue and one Windows issue. The original issue, running out of memory, is not solved.
I think the solution, for now, is for you to just set `JULIA_IMAGE_THREADS=1`.
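Concretely, the workaround discussed in this thread looks like the following (the script name is the one from the MWE below; adjust to your setup):

```shell
# Restrict sysimage code generation to a single thread: slower build,
# but much lower peak memory usage.
export JULIA_IMAGE_THREADS=1
# julia create_sys_image.jl   # then run the build as usual
echo "JULIA_IMAGE_THREADS=$JULIA_IMAGE_THREADS"
```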
I think the simplest way to resolve this regression is either to use one thread by default, or to use the number of threads passed with the `-p` parameter... What do you think? It is not user-friendly to assume that everybody has 32 GB of RAM...
That would make it worse in all cases where one does not run out of memory, which is most cases.
Not sure what
That is not the assumption. It is not very strange that one has to manually restrict certain resource usage when building very large things (like the sysimage in this example) on relatively less powerful hardware.
Well, my philosophy is better safe than sorry. So by default use one thread, use an existing command-line option like `-p` to specify the number of threads, or add a new command-line option specifically for this purpose for people who want more threads... Alternatively, check the free memory before starting a second thread. I don't think that tuning Julia for speed while accepting crashes that cannot even be avoided with a command-line parameter is a good idea... As a reminder: in my case my computer stalled completely, and I had to press the reset button to continue... Not a nice experience for people who upgrade to Julia 1.10...
How do you come to the conclusion that this is "most cases"? I was running out of memory with 16 GB of RAM, and many people have only 4 GB or 8 GB...
The number of threads we use is calculated here: Line 1411 in 1c536dd
From the versioninfo you posted it seems you have 8 threads with 32 GB of RAM. Could you observe how much memory Julia uses in your case? Does it succeed with

I am not sure what would be a satisfactory resolution. We use the same code for small-ish workloads like pkgimages as well, and for those this heuristic has worked well.
The `versioninfo()` I posted is from my private laptop, which has 16 GB of physical RAM. And as I wrote before:
Ah, sorry, I misread the numbers then. We could maybe apply a heuristic like
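A purely illustrative sketch of what such a memory-based cap could look like (the function name and the 5 GiB-per-thread figure are my assumptions, not the actual proposal or Julia's real heuristic):

```julia
# Illustrative only: cap the number of image-generation threads by the
# machine's total memory, assuming a peak of roughly 5 GiB per thread.
function default_image_threads(total_mem_bytes::Integer = Sys.total_memory(),
                               cpu_threads::Integer = Sys.CPU_THREADS)
    # How many ~5 GiB workers the available memory could feed, at least 1.
    mem_budget = max(1, Int(total_mem_bytes ÷ (5 * 2^30)))
    # Never exceed the hardware thread count.
    return clamp(mem_budget, 1, cpu_threads)
end
```

On a 16 GiB / 8-thread machine this sketch would pick 3 threads; on a 4 GiB machine it falls back to 1. As noted later in the thread, basing this on *free* rather than *total* memory is unreliable, because the OS deliberately keeps free memory low by using it for caches.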
That would not have helped in my case. Only
Operating systems usually don't show much free RAM; if RAM is available they should, and do, use it for caching. macOS is an example where free memory is very often in the hundreds of megabytes, but that is data the OS will flush to disk if another process needs the memory.
MWE for reproducing this issue on Linux (Ubuntu) with 16 GB RAM, e.g. a virtual machine with 16 GB RAM and 8 cores:

Project.toml:
create_sys_image.jl:

```julia
using Pkg
@info "Loading packages ..."
using ModelingToolkit, ControlSystems, DataFrames, PackageCompiler
@info "Creating sysimage ..."
PackageCompiler.create_sysimage(
    [:ModelingToolkit, :ControlSystems, :DataFrames];
    sysimage_path="sys-image.so",
    precompile_execution_file="test_for_precompile.jl"
)
```

test_for_precompile.jl:

```julia
using ModelingToolkit, ControlSystems, DataFrames
df = DataFrame([[1, 2], [0, 0]], [:a, :b])
```

Executing the test:
This crashes with the OOM killer on Ubuntu 23.04 and Julia 1.10-beta1. It works fine with Julia 1.9.2. Just three packages and one line of code...
Well, if we implement heuristics, we might as well make them OS-specific...
@ufechner7 why do you turn swap off? When there is no swap, the OOM killer is likely to be more aggressive, since an actual system OOM is much more severe if nothing can be freed by swapping. If you only do it for the purposes of an MWE that's fine, but in real life having lots of swap might in fact allow the build to complete (slowly).
Also, I wonder how much IO cache is used? I would expect generating a sysimage to do quite a bit of IO. But to my mind the question is why the memory usage increases significantly with the number of threads. It is doing the same total work, making the sysimage, so unless the multithreaded version duplicates many structures(?) I would not have expected multiple threads to use a huge amount more memory. Or is it because multiple threads generate garbage faster than the GC can recycle it, so total memory usage goes up, or is it something else? If it's GC-related, that's #50658, about the GC needing to be more aggressive when approaching a memory hint.
I turn swap off to have exactly 16 GB of RAM, for a reproducible test case... And yes, I do not really understand why it needs about 24 GB of RAM with two to 16 threads... Two or more threads make no difference, but one thread needs significantly less RAM...
It would be good to see the total allocations and GC % for the one- and two-thread cases that complete.
@ufechner7 I ran your MWE and the largest RAM usage I have seen is 12.4 GB for that specific Julia process; see the picture. I do have a 32 GB machine. However, I agree with you that sysimage creation in Julia 1.10 should work by default also on 8 GB and 16 GB computers.
Which OS are you using? How many cores do you have? I see only three Julia threads; I had 4 threads at 100% CPU usage each when using 1.10.0-beta1... I use
Julia 1.10.0-beta2 has a slightly lower memory consumption, so the MWE posted above no longer fails... I am working on a new MWE.
Well, if it works and 1.9 works, can you please collect the allocation statistics and GC % for both, with a single thread vs. two threads.
@elextr How can I collect allocation statistics? Do I need to start Julia with a specific parameter?
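One simple option (not necessarily what @elextr had in mind) is Julia's built-in `@timed`, which reports elapsed time, bytes allocated, and GC time for an expression. The expression below is a stand-in; note the caveat raised later in the thread that allocations made by a child `julia` process spawned by PackageCompiler will not show up here:

```julia
# Measure allocations and GC time of an expression in *this* process.
# sum(rand(10^6)) is just a placeholder workload for illustration.
stats = @timed sum(rand(10^6))
println("elapsed:   ", stats.time, " s")
println("allocated: ", stats.bytes, " bytes")
println("GC time:   ", stats.gctime, " s")
```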
Julia 1.10, using gcc and g++ (not Clang) on Ubuntu 22.04, second run:
Julia 1.9.2, second run (after restarting Julia):
What you can clearly see is that the reported allocations are pretty misleading, at least with 1.10. I assume a separate Julia instance is launched and we do not see its allocations... If we assume that the memory usage at the beginning was from the OS and not from Julia, then Julia 1.9.2 needed 6.9 GB of RAM (1.7 GB of allocations reported) and Julia 1.10-beta2 needed 12.6 GB (0.8 GB of allocations reported). With 10 GB of RAM, e.g. 8 GB physical plus 2 GB swap, this compilation would succeed with 1.9.2 and fail with 1.10.0-beta2. Perhaps somebody else has an idea where the large difference between the reported allocations and the observed memory usage comes from?
Swap is required, as otherwise it is not possible to use all 16 GB of RAM, and a considerable amount of physical memory must be wasted on the overcounting needed for all of the virtual memory that otherwise does not require physical memory due to de-duplication. For compilation machines we expect considerable resources, and we don't expect to reduce that requirement, as we know users can access bigger machines if necessary to run the build.
On Ubuntu Linux 20.04 with 16 GB of physical RAM, creating a system image using PackageCompiler fails badly when using Julia 1.10. The problem is that the system runs out of memory. With "fails badly" I mean that my laptop gets stuck and the OOM killer is not activated; probably the memory usage rises too fast for the OOM killer.
This should not happen. Before starting multi-threaded system image creation, Julia should check the available free RAM (including the swap file) and, if less than 6 GB is free, start only one thread. Alternatively there could be a command-line option that determines the number of threads to use.
I use the following script to create a system image: