Update docker build #16

Merged 6 commits on May 28, 2024
61 changes: 0 additions & 61 deletions docker/Dockerfile-bionic

This file was deleted.

43 changes: 27 additions & 16 deletions docker/Dockerfile-focal
@@ -1,10 +1,24 @@
# -*- mode: dockerfile -*-

FROM nvidia/cuda:11.4.3-devel-ubuntu20.04
FROM nvidia/cuda:12.4.1-devel-ubuntu20.04

ARG TARGETPLATFORM
RUN echo "Building image for $TARGETPLATFORM"

ARG PYTHON_VERSION=3.8
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/3bf863cc.pub

RUN apt-get update && \
apt-get install -yq wget

RUN case ${TARGETPLATFORM} in \
"linux/amd64") CUDA_ARCH=x86_64 ;; \
"linux/x86_64") CUDA_ARCH=x86_64 ;; \
"linux/arm64") CUDA_ARCH=arm64 ;; \
esac && \
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/${CUDA_ARCH}/cuda-keyring_1.0-1_all.deb

RUN dpkg -i cuda-keyring_1.0-1_all.deb

RUN apt-get update && apt-get install -yq \
bison \
@@ -19,11 +33,16 @@ RUN apt-get update && apt-get install -yq \

WORKDIR /opt/conda_setup

RUN curl -o miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
chmod +x miniconda.sh && \
./miniconda.sh -b -p /opt/conda && \
/opt/conda/bin/conda install -y python=$PYTHON_VERSION && \
/opt/conda/bin/conda clean -ya
RUN case ${TARGETPLATFORM} in \
Owner: These days, I'd probably use uv and nothing else but we don't need to change this in this PR.

Owner: That would probably help save GitHub minutes too as the non-nethack parts are likely the majority of the build time.

Collaborator (Author): What's uv?

Owner: https://astral.sh/blog/uv. I'm already using it in one of the tests, here. It's from the ruff developer and one of the many Rust-for-Python tools that massively speed things up these days.
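
(For context, a minimal sketch of what a uv-based setup might look like in place of the conda steps below; hypothetical, not part of this PR, and details such as the installer's binary path may vary by uv version:)

```dockerfile
# Hypothetical uv-based variant of the Python setup (not part of this PR)
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -yq \
    bison build-essential cmake curl flex git libbz2-dev ninja-build
# Install uv; the standalone installer places the binary under ~/.local/bin
# in recent versions (the path may differ in older releases)
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
ENV PATH="/root/.local/bin:$PATH"
# uv can provision a managed Python itself, so no conda/miniconda is needed
RUN uv venv /opt/venv --python 3.11
ENV PATH="/opt/venv/bin:$PATH"
COPY . /opt/nle/
WORKDIR /opt/nle
RUN uv pip install '.[all]'
CMD ["/bin/bash"]
```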

Collaborator (Author): On my M3 Mac, the Docker build takes 15 minutes and the image size is a chunky 23GB.

Owner: Is it faster the second time, so that we just have to cache the Docker layers? Alternatively, I think the base image with CUDA is quite heavy.
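
(A sketch of what sharing cached layers between builds could look like with buildx's registry cache backend; hypothetical, `<your-registry>` is a placeholder and a reasonably recent buildx with the docker-container driver is assumed:)

```bash
# Hypothetical: reuse layers across machines/CI runs via a registry-backed cache
$ docker buildx build -f docker/Dockerfile-jammy -t nle \
    --cache-from type=registry,ref=<your-registry>/nle:buildcache \
    --cache-to type=registry,ref=<your-registry>/nle:buildcache,mode=max .
```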

Collaborator (Author): It is faster, but it still takes nearly 10 minutes; most of the time is in `RUN pip install '.[all]'` on amd64 (~400s). Rather strangely, the same step for arm64 only takes about 40s. I wonder if this is due to me running the build on a Mac.

Owner: Yes, that's likely. Cross-compilation is rather slow on ARM MacBooks.

"linux/amd64") MINI_ARCH=x86_64 ;; \
"linux/x86_64") MINI_ARCH=x86_64 ;; \
"linux/arm64") MINI_ARCH=aarch64 ;; \
esac && \
curl -o miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-${MINI_ARCH}.sh && \
chmod +x miniconda.sh && \
./miniconda.sh -b -p /opt/conda && \
/opt/conda/bin/conda install -y python=$PYTHON_VERSION && \
/opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/bin:$PATH

RUN python -m pip install --upgrade pip ipython ipdb
@@ -36,12 +55,4 @@ RUN pip install '.[all]'

WORKDIR /workspace

CMD ["/bin/bash"]


# Docker commands:
# docker rm nle -v
# docker build -t nle -f docker/Dockerfile-focal .
# docker run --gpus all --rm --name nle nle
# or
# docker run --gpus all -it --entrypoint /bin/bash nle
CMD ["/bin/bash"]
58 changes: 58 additions & 0 deletions docker/Dockerfile-jammy
@@ -0,0 +1,58 @@
# -*- mode: dockerfile -*-

FROM nvidia/cuda:12.4.1-devel-ubuntu22.04

ARG TARGETPLATFORM
RUN echo "Building image for $TARGETPLATFORM"

ARG PYTHON_VERSION=3.11
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
apt-get install -yq wget

RUN case ${TARGETPLATFORM} in \
"linux/amd64") CUDA_ARCH=x86_64 ;; \
"linux/x86_64") CUDA_ARCH=x86_64 ;; \
"linux/arm64") CUDA_ARCH=arm64 ;; \
esac && \
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/${CUDA_ARCH}/cuda-keyring_1.0-1_all.deb

RUN dpkg -i cuda-keyring_1.0-1_all.deb

RUN apt-get update && apt-get install -yq \
bison \
build-essential \
cmake \
curl \
flex \
git \
libbz2-dev \
ninja-build \
wget

WORKDIR /opt/conda_setup

RUN case ${TARGETPLATFORM} in \
"linux/amd64") MINI_ARCH=x86_64 ;; \
"linux/x86_64") MINI_ARCH=x86_64 ;; \
"linux/arm64") MINI_ARCH=aarch64 ;; \
esac && \
curl -o miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-${MINI_ARCH}.sh && \
chmod +x miniconda.sh && \
./miniconda.sh -b -p /opt/conda && \
/opt/conda/bin/conda install -y python=$PYTHON_VERSION && \
/opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/bin:$PATH

RUN python -m pip install --upgrade pip ipython ipdb

COPY . /opt/nle/

WORKDIR /opt/nle

RUN pip install '.[all]'

WORKDIR /workspace

CMD ["/bin/bash"]
61 changes: 0 additions & 61 deletions docker/Dockerfile-xenial

This file was deleted.

62 changes: 26 additions & 36 deletions docker/README.md
@@ -1,47 +1,37 @@
# Docker images for NLE

This directory -- i.e., `docker/` -- contains some docker images that we use for
testing both locally and in CI. They contain dependencies and a pre-built
version of NLE.
This directory -- i.e., `docker/` -- contains some dockerfiles to
create NLE images. They are based on the NVIDIA Ubuntu images and
so include CUDA support. However, you don't need any GPUs to build
and run containers from the images.

You can try out the latest stable version in Ubuntu 18.04 by doing:
Docker now provides multi-architecture support, so whether you are
using an x86_64/amd64 or arm64 CPU, you can use the same image to
run an NLE container.

```bash
$ docker pull fairnle/nle:stable
$ docker run --gpus all --rm -it fairnle/nle:stable python # or bash
# Then you can simply use the nle package as normal
```
# Building Images Locally

The git repository is installed inside a conda distribution, and can be found in
`/opt/nle` inside the images.

The DockerHub repository also contains pre-built images per each released
version of `nle`, following a specific templates:

``` bash
1. fairnle/nle:stable
2. fairnle/nle:<nle-version> # corresponds to (1), based on Ubuntu 18.04
3. fairnle/nle-xenial:<nle-version> # Based on Ubuntu 16.04
4. fairnle/nle-focal:<nle-version> # Based on Ubuntu 20.04
5. fairnle/nle:<sha> # bionic image built on dockerfile changes
6. fairnle/nle-xenial:<sha> # xenial image built on dockerfile changes
7. fairnle/nle-focal:<sha> # focal image built on dockerfile changes
8. fairnle/nle:dev # points to latest built sha
9. fairnle/nle-xenial:dev # points to latest built sha
10. fairnle/nle-focal:dev # points to latest built sha
```
To build and run an image (e.g. `Dockerfile-jammy`) for your local
architecture do:

`<nle-version>` is the latest pip version released, and follows semantic versioning (so something like `X.Y.Z`).
```bash
$ git clone https://github.com/heiner/nle --recursive
$ cd nle
$ docker build -f docker/Dockerfile-jammy . -t nle
$ docker run -it --gpus all --rm --name nle nle
# or alternatively if you don't have GPUs
$ docker run -it --name nle nle
```

# Building images locally
# Building Multi-Architecture Images

To build and run any of them (e.g. `Dockerfile-bionic`) do:
To build an image on your machine that can be deployed on multiple
architectures (e.g. x86_64/amd64 and arm64), use the following docker
command. Run it from the nle directory.

```bash
$ git clone https://github.com/heiner/nle --recursive
$ cd nle
$ docker build -f docker/Dockerfile-bionic . -t nle
$ docker run --gpus all --rm --name nle nle
# or alternatively
$ docker run --gpus all -it --entrypoint /bin/bash nle
$ docker buildx build --platform linux/amd64,linux/arm64 -t nle -f docker/Dockerfile-jammy .
```

The run instructions are as before. Docker will load the correct
binaries for the architecture you are running the container on.
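
Note that a multi-platform build generally cannot be loaded directly into the classic local image store; you typically push it to a registry instead (or use a containerd-backed image store). A hedged sketch, assuming a builder with the docker-container driver and a registry you can push to (`<your-registry>` is a placeholder):

```bash
$ docker buildx create --use --name nle-builder
$ docker buildx build --platform linux/amd64,linux/arm64 \
    -t <your-registry>/nle:latest -f docker/Dockerfile-jammy --push .
```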