Gen on master fails to precompile when inside Docker on a Mac #311
Comments
@fplk, @agarret7, @nishadgothoskar, do you have access to a Mac, and are you able to help out with step B here?
I think I can reproduce this with a minimal example:
Building and running this (
What is really disturbing to me is that I was certain it would have to work on Ubuntu 20.04 inside a type-2 hypervisor, so I installed Ubuntu in a VirtualBox VM on a MacBook Pro, built the image there, reran this and... incredibly, this fails with the same error. This is OS-level virtualization inside type-2 virtualization - this should be so encapsulated that I'm very surprised it can break. Is this kernel related?! Or did I overlook something?
Other packages like
Image above working on Ubuntu 20.04 workstation:
Image above breaking on Ubuntu 20.04 VM on macOS host:
PS:
PPS:
PPPS:
This is via Julia 1.3.1 on macOS 10.15.6.
Given that this bug has replicated, does it still make sense for me to install VirtualBox and try the Linux Docker version? From Falk's comment, it looks like it still somehow fails that way. Happy to do it if we need a second attempt. I could try to get an IBM VM and work from there. Or continue debugging my non-Docker install.
@fplk - Thanks so much for doing this work. If I were not on the Gen team I would have waited to file an upstream bug until I had a reproducible minimal example, and tracked that as an issue in my own repo. But I figured this way was ok especially because the Step B approach could have been doable with very little work by someone who had a Mac. But the degree of investigation you did here is much more extensive and super helpful - thanks!
It is really a bummer that we are seeing an error that appears to involve the virtualization software itself...
@fplk, could this be anything other than a bug in VirtualBox? (I mean sure, it could be multiple bugs, but it seems like at least one of them has to be in VirtualBox.) Are you able to try the Docker image on a different hypervisor? [Edit: Oh shoot, IIUC you're saying this may be a bug in some virtualization facility of the Darwin kernel that all reasonable hypervisors rely on. Well, it still seems worth trying...]
A user of the docker image for the 6.885 psets has encountered the following error upon running this command on an Intel Mac Pro:
Perhaps this involves an interaction between OpenBLAS, Julia, and their actual CPU architecture (see, e.g., JuliaLang/julia#29652). It's not yet clear if they are encountering the same problem as in this issue #311, in which case it probably isn't just a bug in Julia, but also in virtualization on the host machine -- perhaps in the Darwin kernel or one of its modules (or analogous things).
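To compare a failing machine with a working one along these lines, it may help to record which CPU the container actually sees. The following is a minimal sketch, assuming a Linux container where /proc/cpuinfo is readable. The helper names cpu_model and cpu_has_flag are hypothetical and not part of Julia, Gen, or Docker.

```shell
# Hypothetical helpers: report the CPU model and feature flags visible
# inside the container. Both read /proc/cpuinfo by default; pass another
# path to inspect a copy saved from a different machine.

cpu_model() {
    local file="${1:-/proc/cpuinfo}"
    # Print the text after "model name :" from the first matching line.
    sed -n 's/^model name[[:space:]]*:[[:space:]]*//p' "$file" | head -n 1
}

cpu_has_flag() {
    local flag="$1" file="${2:-/proc/cpuinfo}"
    # Take the first "flags" line, put one flag per line, match exactly.
    grep -m1 '^flags' "$file" | cut -d: -f2 | tr ' ' '\n' | grep -qx "$flag"
}
```

Running cpu_model and, say, cpu_has_flag avx2 in both a failing and a working container would show whether the two see different CPU features. Inside Julia, versioninfo() prints similar information.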
Also getting this issue when running a Docker image of Ubuntu within a Windows host this time, sigh:
julia> using Gen
[ Info: Precompiling Gen [ea4f424c-a589-11e8-07c0-fd5c91b9da4a]
ERROR: Failed to precompile Gen [ea4f424c-a589-11e8-07c0-fd5c91b9da4a] to /root/.julia/compiled/v1.5/Gen/OEZG1_Bbn6e.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1305
[3] _require(::Base.PkgId) at ./loading.jl:1030
[4] require(::Base.PkgId) at ./loading.jl:928
[5] require(::Module, ::Symbol) at ./loading.jl:923
It's very strange since all the other packages seem to precompile fine??
I was hoping to work around this by
julia> using Gen
[ Info: Precompiling Gen [ea4f424c-a589-11e8-07c0-fd5c91b9da4a]
[ Info: Skipping precompilation since __precompile__(false). Importing Gen [ea4f424c-a589-11e8-07c0-fd5c91b9da4a].
Killed
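A bare "Killed" is what the shell prints when a process dies from SIGKILL. Inside a container, one common cause is the kernel's out-of-memory killer. Nothing in this thread confirms that this is what happened here, so the following is only a sketch of how one might check. It assumes a Linux container with /proc/meminfo, and mem_total_mib and was_sigkill are hypothetical helper names.

```shell
# Hypothetical helpers for checking whether a process was SIGKILLed and
# how much memory the container can see. The link to the failure above
# is an assumption, not something established in this thread.

mem_total_mib() {
    local file="${1:-/proc/meminfo}"
    # MemTotal is reported in kB; integer-divide to get MiB.
    awk '/^MemTotal:/ { print int($2 / 1024); exit }' "$file"
}

was_sigkill() {
    # Shells report 128 + signal number for signal deaths; SIGKILL is 9.
    [ "$1" -eq 137 ]
}
```

For example, after running julia -e 'using Gen', calling was_sigkill $? tells you whether the process was killed, and mem_total_mib shows how many MiB the container has available.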
Was this puzzle solved? I'm noticing the same kind of error precompiling Gen in an Ubuntu Docker image on macOS, on certain machines and not on others. Specifically, I'm using a slightly modified version of the 6.885 psets, and I've had two users report this error, both on macOS, but I haven't been able to reproduce it on the Mac I have, nor does it occur on the Windows machine I have access to. I'm puzzled about where to look for clues. However, I've found some mention of macOS 10.15.6 introducing some errors with virtualization. The users that have the issue are on 10.15, as is the above error reproduced by @fplk (and I do not have the issue with macOS 11.2.3). Could this be related? Seems very unlikely.
So to follow up with a little more detail: now I’ve gotten reports from users getting this issue on macOS 10.15.4, 10.15.6, and 10.15.7, and reports of users not getting this issue on macOS 11 and Windows 10. I have tested it personally on macOS 10.15.7, and get the issue, but don’t get it on my machine running 11.2.3, nor on a Windows 10 machine I tried. Here’s the minimal Dockerfile, more or less replicating @fplk's test above:
FROM ubuntu:20.04
ARG DEBIAN_FRONTEND=noninteractive
ARG JULIA_VERSION_SHORT="1.5"
ARG JULIA_VERSION_FULL="${JULIA_VERSION_SHORT}.3"
ENV JULIA_INSTALLATION_PATH=/opt/julia
RUN apt-get update -qq \
    && apt-get install -qq -y --no-install-recommends \
        wget \
        git \
        python3-dev \
        python3-pip \
        python3-tk \
        zlib1g-dev \
    && rm -rf /var/lib/apt/lists/* && \
    ln -s /usr/bin/python3 /usr/bin/python
RUN wget https://julialang-s3.julialang.org/bin/linux/x64/${JULIA_VERSION_SHORT}/julia-${JULIA_VERSION_FULL}-linux-x86_64.tar.gz && \
    tar zxf julia-${JULIA_VERSION_FULL}-linux-x86_64.tar.gz && \
    mkdir -p "${JULIA_INSTALLATION_PATH}" && \
    mv julia-${JULIA_VERSION_FULL} "${JULIA_INSTALLATION_PATH}/" && \
    ln -fs "${JULIA_INSTALLATION_PATH}/julia-${JULIA_VERSION_FULL}/bin/julia" /usr/local/bin/ && \
    rm julia-${JULIA_VERSION_FULL}-linux-x86_64.tar.gz && \
    julia -e 'import Pkg; Pkg.add(["IJulia","Gen"])'
On macOS 10.15.7, running
Other packages I tried precompile fine; only Gen fails. I also tried it and got the same issue with using
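The reports collected so far can be condensed into a small lookup, which may help when triaging new reports. This only summarizes what has been reported in this thread (failures on macOS 10.15.4, 10.15.6, and 10.15.7; no failures on macOS 11), not a confirmed root cause. The function name classify_macos_report is hypothetical.

```shell
# Hypothetical triage helper: map a macOS host version to what has been
# reported in this thread so far. Anything not listed is simply unknown.

classify_macos_report() {
    case "$1" in
        10.15.4|10.15.6|10.15.7) echo "reported failing" ;;
        11|11.*)                 echo "reported working" ;;
        *)                       echo "no report" ;;
    esac
}
```

On a Mac host, the version string can be obtained with sw_vers -productVersion, for example classify_macos_report "$(sw_vers -productVersion)".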
When trying to run GenSceneGraphs from Docker, @jcrosenb encountered the error
This persisted even after deleting the directory ~/.julia/compiled.
Possible next step A:
Build a Docker image containing Gen at a specific recent commit (e.g. current head, 07a5427) and see whether the same precompile error occurs on @jcrosenb's machine. If it does, we should get some specs of her machine and then see if it reproduces on a different Mac.
Possible next step B:
Test out https://github.com/probcomp/GenSceneGraphs.jl/pull/205 on a different Mac. If we can find someone who has a Mac and is able to do this, that would be the easiest path to finding out whether this is really a Mac-wide issue or is specific to some aspect of @jcrosenb's setup. (For the time being, she's attempting to get the Docker image running from within a Linux VM).
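For next step A, one way to pin Gen to a specific commit inside the image is to pass a revision to Pkg.add. The following is a sketch under two assumptions: that Pkg accepts the abbreviated hash 07a5427 as a rev (a full hash or branch name may be needed instead), and that Gen resolves by name from the registry. gen_pin_expr is a hypothetical helper that only builds the Julia expression; it does not run Julia itself.

```shell
# Hypothetical helper: build the Julia expression that installs Gen at a
# given revision. Whether an abbreviated hash is accepted is an assumption.

gen_pin_expr() {
    local rev="$1"
    printf 'import Pkg; Pkg.add(Pkg.PackageSpec(name="Gen", rev="%s"))' "$rev"
}
```

In an image build, this could replace the plain Pkg.add call, for example julia -e "$(gen_pin_expr 07a5427)". Recording the resolved commit alongside the host's specs would make results from different Macs directly comparable.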