M1 tech preview 7 core dump in qemu with python+pyarrow #5251

Closed
codekitchen opened this issue Jan 14, 2021 · 6 comments

@codekitchen

Simply importing the pyarrow library in Python causes a core dump. Importing numpy alone does not core dump, as far as I can tell.

The error:

qemu: uncaught target signal 4 (Illegal instruction) - core dumped

Repro using an amd64 python image:

% docker run --rm -it python:3.8@sha256:d3bfb7ed79d0eea29b12fe876c4c086797c973529eb293dd778319bf479dd07d bash
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

root@70d3758579e4:/# pip install pyarrow
Collecting pyarrow
  Downloading pyarrow-2.0.0-cp38-cp38-manylinux2014_x86_64.whl (17.8 MB)
     |████████████████████████████████| 17.8 MB 20.0 MB/s 
Collecting numpy>=1.14
  Downloading numpy-1.19.5-cp38-cp38-manylinux2010_x86_64.whl (14.9 MB)
     |████████████████████████████████| 14.9 MB 17.1 MB/s 
Installing collected packages: numpy, pyarrow
Successfully installed numpy-1.19.5 pyarrow-2.0.0

root@70d3758579e4:/# python
Python 3.8.7 (default, Jan 12 2021, 17:06:28) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
Illegal instruction

Let me know if this would be better filed in another project; I'm not really sure of the protocol here.
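
To check several libraries without crashing the interactive session, the import can be run in a child interpreter and the exit status inspected. The sketch below is only an illustration: `probe_import` is a hypothetical helper name, and it assumes the emulated process is terminated by a signal (as the "Illegal instruction" line above suggests), which `subprocess` reports as a negative return code.

```python
import signal
import subprocess
import sys


def probe_import(module_name, python=sys.executable):
    """Import a module in a child interpreter and describe how the child exited.

    Hypothetical helper for illustration; it does not fix the crash, it only
    reports it without taking down the calling interpreter.
    """
    proc = subprocess.run(
        [python, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        signum = -proc.returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"failed with exit code {proc.returncode}"


if __name__ == "__main__":
    for module in ("numpy", "pyarrow"):
        print(module, "->", probe_import(module))
```

If the assumption holds, a module that triggers the crash would be reported as killed by SIGILL, while a module that is simply not installed shows up as an ordinary non-zero exit code.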

@stephen-turner stephen-turner added the area/m1 M1 preview builds label Jan 14, 2021
@jchomarat

Same issue here with a custom data science image. I ended up rebuilding one for arm64/v8 with miniforge (a conda-like distribution with arm64 support), and I managed to install those libraries from the conda package manager. It works fine, but it's not optimal.
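
The pip log in the original report shows wheels whose filenames end in an `x86_64` platform tag, which is what an amd64 image is expected to pull. As a rough sketch, the architecture can be read from a wheel filename by taking the last dash-separated field (the platform tag in the standard wheel naming scheme). The helper names `wheel_platform_tag` and `wheel_arch`, and the short list of architectures, are assumptions made for this example.

```python
# Architectures this sketch knows about; the list is illustrative, not exhaustive.
KNOWN_ARCHES = ("x86_64", "aarch64", "arm64", "i686", "ppc64le", "s390x")


def wheel_platform_tag(filename):
    """Return the platform tag (the last dash-separated field) of a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version-python-abi-platform, optionally with a build tag after version
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return parts[-1]


def wheel_arch(filename):
    """Return the CPU architecture named in the platform tag, or None if not recognized."""
    tag = wheel_platform_tag(filename)
    for arch in KNOWN_ARCHES:
        if tag.endswith(arch):
            return arch
    return None  # e.g. pure-Python wheels tagged "any"


if __name__ == "__main__":
    print(wheel_arch("pyarrow-2.0.0-cp38-cp38-manylinux2014_x86_64.whl"))
```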

@fatafranci

@jchomarat I'm having the same issue. Can you please share more info about your data science image? I'm not having any luck.

@jchomarat

Hello @fatafranci ,

Basically, I started from Ubuntu focal for arm64 and then installed miniforge. To make sure the data science packages I need can be installed, I install some of them with the conda package manager rather than pip.

Below is the Dockerfile I am using:

FROM --platform=linux/arm64/v8 ubuntu:focal

ENV CONDA_DIR=/opt/conda
ENV LANG=C.UTF-8 LC_ALL=C.UTF-8
ENV PATH=${CONDA_DIR}/bin:${PATH}

ARG USERNAME=alex
ARG USER_UID=1000
ARG USER_GID=$USER_UID

RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME

RUN apt-get update > /dev/null && \
    export DEBIAN_FRONTEND=noninteractive && \
    apt-get install --no-install-recommends --yes \
        wget bzip2 ca-certificates python3-dev \
        gnupg software-properties-common \
        sudo git curl make > /dev/null && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

RUN echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME && \
    chmod 0440 /etc/sudoers.d/$USERNAME

RUN wget -qO - https://adoptopenjdk.jfrog.io/adoptopenjdk/api/gpg/key/public | apt-key add - && \
    add-apt-repository --yes https://adoptopenjdk.jfrog.io/adoptopenjdk/deb/ && \
    apt-get update > /dev/null && \
    apt-get install --yes adoptopenjdk-8-hotspot

ENV JAVA_HOME=/usr/lib/jvm/adoptopenjdk-8-hotspot-arm64/jre/

RUN wget --no-hsts --quiet https://github.com/conda-forge/miniforge/releases/download/4.9.2-5/Miniforge3-Linux-aarch64.sh -O /tmp/miniforge.sh && \
    /bin/bash /tmp/miniforge.sh -b -p ${CONDA_DIR} && \
    rm /tmp/miniforge.sh && \
    conda clean -tipsy && \
    find ${CONDA_DIR} -follow -type f -name '*.a' -delete && \
    find ${CONDA_DIR} -follow -type f -name '*.pyc' -delete && \
    conda clean -afy

RUN echo ". ${CONDA_DIR}/etc/profile.d/conda.sh && conda activate base" >> /etc/skel/.bashrc

RUN echo ". ${CONDA_DIR}/etc/profile.d/conda.sh && conda activate base" >> ~/.bashrc

RUN chown -R $USERNAME:$USERNAME /opt/conda

USER $USERNAME

COPY requirements.txt /home/$USERNAME/

RUN conda init bash && \
    conda create -n localspark python=3.8 -y && \
    conda create -n db-connect python=3.8 -y

SHELL ["conda", "run", "-n", "localspark", "/bin/bash", "-c"]
RUN conda install -c conda-forge scikit-learn==0.23.2 && \
    conda install -c conda-forge pyspark==3.0.1 && \
    conda install -c conda-forge numpy==1.18.5 && \
    conda install -c conda-forge pandas==1.1.3 && \
    conda install -c conda-forge koalas==1.3.0 && \
    conda install -c conda-forge numba==0.52.0 && \
    conda install -c conda-forge psutil==5.8.0 && \
    python -m pip install pip --upgrade && \
    pip install -r /home/$USERNAME/requirements.txt

SHELL ["conda", "run", "-n", "db-connect", "/bin/bash", "-c"]
RUN python -m pip install pip --upgrade && \
    pip install databricks-connect==7.3.5 && \
    pip install nutter==0.1.34 && \
    pip install -r /home/$USERNAME/requirements.txt

CMD [ "sleep", "infinity" ]

And the requirements.txt file:

cookiecutter==1.7.2
databricks-cli==0.12.2
mlflow==1.11.0
nbstripout==0.3.9

bandit==1.6.2
flake8==3.8.4
gitpython==3.1.9
isort==5.6.4
mypy==0.790
pydocstyle==5.1.1
pytest==6.1.1
pytest-cov==2.10.1

@fatafranci

@jchomarat that's really helpful! I will give it a try. Thanks a lot

@stephen-turner
Contributor

This is a bug in qemu, the upstream component we use for running Intel (amd64) containers on M1 (arm64) chips, and unfortunately it is not something we control. In general we recommend running arm64 containers on M1 chips because (even ignoring any crashes) they will always be faster and use less memory.

Please encourage the author of this container to supply an arm64 or multi-arch image, not just an Intel one. Now that M1 is a mainstream platform, we think that most container authors will be keen to do this.
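
One way to tell which kind of container a Python process is running in is to look at the machine name the interpreter reports. This is a minimal sketch, assuming that an amd64 container running under emulation reports `x86_64` while a native arm64 container reports `aarch64`; `describe_arch` is a hypothetical helper name.

```python
import platform


def describe_arch(machine=None):
    """Map a machine name to the Docker-style architecture label.

    Hypothetical helper for illustration. With no argument it inspects the
    current interpreter via platform.machine().
    """
    machine = (machine or platform.machine()).lower()
    if machine in ("aarch64", "arm64"):
        return "arm64"
    if machine in ("x86_64", "amd64"):
        return "amd64"
    return machine  # anything else is returned unchanged


if __name__ == "__main__":
    arch = describe_arch()
    print(f"This interpreter reports architecture: {arch}")
    if arch == "amd64":
        # Assumption: on an M1 host this means the container runs under emulation.
        print("On an M1 host, consider an arm64 or multi-arch image instead.")
```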

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Mar 26, 2021