Skip to content
This repository has been archived by the owner on Aug 30, 2023. It is now read-only.

meta: Add deprecation notice #208

Merged
merged 1 commit into from Jan 15, 2020
Merged

meta: Add deprecation notice #208

merged 1 commit into from Jan 15, 2020

Conversation

BYK
Copy link
Contributor

@BYK BYK commented Dec 30, 2019

Add a deprecation notice with links to getsentry/sentry and getsentry/onpremise before archiving the repo.

/cc @tianon - we'd like to make the new Docker image releases from https://github.com/getsentry/sentry/tree/master/docker. How do we proceed?

Add a deprecation notice with links to getsentry/sentry and getsentry/onpremise before archiving the repo.

/cc @tianon - we'd like to make the new Docker image releases from https://github.com/getsentry/sentry/tree/master/docker. How do we proceed?
@tianon
Copy link
Contributor

tianon commented Jan 14, 2020

There's a lot to unpack here, but I think the most obvious place to start is pointing out that the Dockerfile you've linked to has very little chance of passing the official-images review, let alone the build process -- the biggest issue is that it doesn't package a specific release (and instead opts to use ARG to allow specifying version at build-time, which we don't support), but it's also a fairly complex multi-stage build.

More generally, we've had quite a few upstreams try to incorporate their official image Dockerfile into their "main" repository with varying levels of success. Typically, what ends up happening is that the official-image Dockerization requires a change post-release which is then more complicated to embed/consume in that release Dockerfile because the "tag is already cut" so to speak.

What I'd suggest is starting with https://github.com/docker-library/official-images#contributing-to-the-standard-library and https://github.com/docker-library/faq#readme and seeing if they help you get a better sense of the official-images program at large and then perhaps we can have a more focused conversation?

(also FYI @yosifkit)

@BYK BYK merged commit b02d4cc into master Jan 15, 2020
@BYK
Copy link
Contributor Author

BYK commented Jan 15, 2020

There's a lot to unpack here, but I think the most obvious place to start is pointing out that the Dockerfile you've linked to has very little chance of passing the official-images review, let alone the build process -- the biggest issue is that it doesn't package a specific release (and instead opts to use ARG to allow specifying version at build-time, which we don't support), but it's also a fairly complex multi-stage build.

It does package a specific release, the arg is only to bake the version information into the image. It simply builds any SHA on master and that is guaranteed via build checks. The image itself is derived from the base 9.1 image in this repo, which obviously passed the review so I'm not sure why you think it wouldn't pass the review or can be modified to pass it.

More generally, we've had quite a few upstreams try to incorporate their official image Dockerfile into their "main" repository with varying levels of success. Typically, what ends up happening is that the official-image Dockerization requires a change post-release which is then more complicated to embed/consume in that release Dockerfile because the "tag is already cut" so to speak.

That's what patch releases are for, I guess? We don't do this: we tag things and treat tags as immutable unless they are explicitly expected to be mutable (like :latest or :9.1 which actually stands for latest :9.1.x etc.).

What I'd suggest is starting with docker-library/official-images#contributing-to-the-standard-library and docker-library/faq#readme and seeing if they help you get a better sense of the official-images program at large and then perhaps we can have a more focused conversation?

I did and I'm not very sure whether this program fits our releases anymore as we switched to releasing every new commit. We are considering a monthly "official release" cycle which might be more suitable.

The two blockers are the need for Dockerfile to be in the root directory, and the slow approval process.

@tianon @yosifkit

@BYK BYK deleted the byk/meta/deprecation branch January 15, 2020 18:15
@tianon
Copy link
Contributor

tianon commented Jan 15, 2020

The image itself is derived from the base 9.1 image in this repo, which obviously passed the review so I'm not sure why you think it wouldn't pass the review or can be modified to pass it.

The current sentry:latest image is pointing at https://github.com/getsentry/docker-sentry/blob/09a7761e841eee7fab758526b14d46ae56134952/9.1/Dockerfile, which when compared against https://github.com/getsentry/sentry/blob/67273fa2fd9f0ade4218fb19101d2202b0454f96/docker/Dockerfile (master at this time of writing), there's a non-trivial diff to review:

$ diff -u --ignore-all-space <(wget -qO- 'https://raw.githubusercontent.com/getsentry/docker-sentry/09a7761e841eee7fab758526b14d46ae56134952/9.1/Dockerfile') <(wget -qO- 'https://raw.githubusercontent.com/getsentry/sentry/67273fa2fd9f0ade4218fb19101d2202b0454f96/docker/Dockerfile')
--- /dev/fd/63	2020-01-15 11:45:47.579471315 -0800
+++ /dev/fd/62	2020-01-15 11:45:47.579471315 -0800
@@ -1,136 +1,189 @@
-FROM python:2.7.16-slim-stretch
+FROM python:2.7.16-slim-buster as sdist
 
-# add our user and group first to make sure their IDs get assigned consistently
-RUN groupadd -r sentry && useradd -r -m -g sentry sentry
+LABEL maintainer="oss@sentry.io"
+LABEL org.opencontainers.image.title="Sentry PyPI Wheel"
+LABEL org.opencontainers.image.description="PyPI Wheel Builder for Sentry"
+LABEL org.opencontainers.image.url="https://sentry.io/"
+LABEL org.opencontainers.image.source="https://github.com/getsentry/sentry"
+LABEL org.opencontainers.image.vendor="Functional Software, Inc."
+LABEL org.opencontainers.image.authors="oss@sentry.io"
 
 RUN apt-get update && apt-get install -y --no-install-recommends \
-        gcc \
-        git \
-        libffi-dev \
-        libjpeg-dev \
-        libmaxminddb-dev \
-        libpq-dev \
-        libxml2-dev \
-        libxmlsec1-dev \
-        libxslt-dev \
-        libyaml-dev \
-        pkg-config \
+    # Needed for GPG
+    dirmngr \
+    gnupg \
+    # Needed for fetching stuff
+    wget \
     && rm -rf /var/lib/apt/lists/*
 
-# Sane defaults for pip
-ENV PIP_NO_CACHE_DIR off
-ENV PIP_DISABLE_PIP_VERSION_CHECK on
+# Fetch trusted keys
+RUN for key in \
+      # gosu
+      B42F6819007F00F88E364FD4036A9C25BF357DD4 \
+      # tini
+      595E85A6B1B4779EA4DAAEC70B588DFF0527A9B7 \
+      # Node - gpg keys listed at https://github.com/nodejs/node
+      94AE36675C464D64BAFA68DD7434390BDBE9B9C5 \
+      FD3A5288F042B6850C66B31F09FE44734EB7990E \
+      71DCFD284A79C3B38668286BC97EC7A07EDE3FC1 \
+      DD8F2338BAE7501E3DD5AC78C273792F7D83545D \
+      C4F0DFFF4E8C1A8236409D08E73BC641CC11F4C8 \
+      B9AE9905FFD7803F25714661B63B535A4C206CA9 \
+      77984A986EBC2AA786BC0F66B01FBB92821C587A \
+      8FCCA13FEF1D0C2E91008E09770F7A9A5AE15600 \
+      4ED778F539E3634C779C87C6D7062848A1AB005C \
+      A48C2BEE680E841632CD4E44F07496B3EB3C1762 \
+      B9E2F5981AA6E0CD28160D9FF13993A75599653C \
+    ; do \
+      # TODO(byk): Replace the keyserver below w/ something owned by Sentry
+      gpg --batch --keyserver hkps://mattrobenolt-keyserver.global.ssl.fastly.net:443 --recv-keys "$key"; \
+    done
 
 # grab gosu for easy step-down from root
+ENV GOSU_VERSION 1.11
 RUN set -x \
-    && export GOSU_VERSION=1.11 \
-    && fetchDeps=" \
-        dirmngr \
-        gnupg \
-        wget \
-    " \
-    && apt-get update && apt-get install -y --no-install-recommends $fetchDeps && rm -rf /var/lib/apt/lists/* \
     && wget -O /usr/local/bin/gosu "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$(dpkg --print-architecture)" \
     && wget -O /usr/local/bin/gosu.asc "https://github.com/tianon/gosu/releases/download/$GOSU_VERSION/gosu-$(dpkg --print-architecture).asc" \
-    && export GNUPGHOME="$(mktemp -d)" \
-    && for key in \
-      B42F6819007F00F88E364FD4036A9C25BF357DD4 \
-    ; do \
-      gpg --batch --keyserver hkp://ha.pool.sks-keyservers.net --recv-keys "$key" || \
-      gpg --batch --keyserver hkp://ipv4.pool.sks-keyservers.net --recv-keys "$key" || \
-      gpg --batch --keyserver hkp://pgp.mit.edu:80 --recv-keys "$key" ; \
-    done \
     && gpg --batch --verify /usr/local/bin/gosu.asc /usr/local/bin/gosu \
-    && gpgconf --kill all \
-    && rm -r "$GNUPGHOME" /usr/local/bin/gosu.asc \
-    && chmod +x /usr/local/bin/gosu \
-    && gosu nobody true \
-    && apt-get purge -y --auto-remove $fetchDeps
+    && rm -r /usr/local/bin/gosu.asc \
+    && chmod +x /usr/local/bin/gosu
 
 # grab tini for signal processing and zombie killing
+ENV TINI_VERSION 0.18.0
 RUN set -x \
-    && export TINI_VERSION=0.18.0 \
-    && fetchDeps=" \
-        dirmngr \
-        gnupg \
-        wget \
-    " \
-    && apt-get update && apt-get install -y --no-install-recommends $fetchDeps && rm -rf /var/lib/apt/lists/* \
     && wget -O /usr/local/bin/tini "https://github.com/krallin/tini/releases/download/v$TINI_VERSION/tini" \
     && wget -O /usr/local/bin/tini.asc "https://github.com/krallin/tini/releases/download/v$TINI_VERSION/tini.asc" \
-    && export GNUPGHOME="$(mktemp -d)" \
-    && for key in \
-      595E85A6B1B4779EA4DAAEC70B588DFF0527A9B7 \
-    ; do \
-      gpg --batch --keyserver hkp://ha.pool.sks-keyservers.net --recv-keys "$key" || \
-      gpg --batch --keyserver hkp://ipv4.pool.sks-keyservers.net --recv-keys "$key" || \
-      gpg --batch --keyserver hkp://pgp.mit.edu:80 --recv-keys "$key" ; \
-    done \
     && gpg --batch --verify /usr/local/bin/tini.asc /usr/local/bin/tini \
-    && gpgconf --kill all \
-    && rm -r "$GNUPGHOME" /usr/local/bin/tini.asc \
-    && chmod +x /usr/local/bin/tini \
-    && tini -h \
-&& apt-get purge -y --auto-remove $fetchDeps
+    && rm /usr/local/bin/tini.asc \
+    && chmod +x /usr/local/bin/tini
 
-# Support for RabbitMQ and GeoIP
-RUN set -x \
-    && apt-get update && apt-get install -y --no-install-recommends make && rm -rf /var/lib/apt/lists/* \
-    && pip install librabbitmq==1.6.1 maxminddb==1.4.1 \
-    && python -c 'import librabbitmq' \
-    # Fully verify that the C extension is correctly installed, it unfortunately
-    # requires a full check into maxminddb.extension.Reader
-    && python -c 'import maxminddb.extension; maxminddb.extension.Reader' \
-    && apt-get purge -y --auto-remove make
+# Get and set up Node for front-end asset building
+COPY .nvmrc /usr/src/sentry/
+RUN cd /usr/src/sentry \
+    && export NODE_VERSION="$(cat .nvmrc)" \
+    && wget "https://nodejs.org/dist/v$NODE_VERSION/node-v$NODE_VERSION-linux-x64.tar.gz" \
+    && wget "https://nodejs.org/dist/v$NODE_VERSION/SHASUMS256.txt.asc" \
+    && gpg --batch --verify SHASUMS256.txt.asc \
+    && grep " node-v$NODE_VERSION-linux-x64.tar.gz\$" SHASUMS256.txt.asc | sha256sum -c - \
+    && tar -xzf "node-v$NODE_VERSION-linux-x64.tar.gz" -C /usr/local --strip-components=1 \
+    && rm -r "node-v$NODE_VERSION-linux-x64.tar.gz" SHASUMS256.txt.asc
+
+ARG SOURCE_COMMIT
+ENV SENTRY_BUILD=${SOURCE_COMMIT:-unknown}
+LABEL org.opencontainers.image.revision=$SOURCE_COMMIT
+LABEL org.opencontainers.image.licenses="https://github.com/getsentry/sentry/blob/${SOURCE_COMMIT:-master}/LICENSE"
+
+COPY . /usr/src/sentry/
+RUN export YARN_CACHE_FOLDER="$(mktemp -d)" \
+    && cd /usr/src/sentry \
+    && python setup.py bdist_wheel \
+    && rm -r "$YARN_CACHE_FOLDER" \
+    && mv /usr/src/sentry/dist /dist
+
+# This is the image to be run
+FROM python:2.7.16-slim-buster
+
+LABEL maintainer="oss@sentry.io"
+LABEL org.opencontainers.image.title="Sentry"
+LABEL org.opencontainers.image.description="Sentry runtime image"
+LABEL org.opencontainers.image.url="https://sentry.io/"
+LABEL org.opencontainers.image.documentation="https://github.com/getsentry/onpremise/tree/v10"
+LABEL org.opencontainers.image.source="https://github.com/getsentry/sentry"
+LABEL org.opencontainers.image.vendor="Functional Software, Inc."
+LABEL org.opencontainers.image.authors="oss@sentry.io"
+
+
+# add our user and group first to make sure their IDs get assigned consistently
+RUN groupadd -r sentry && useradd -r -m -g sentry sentry
 
-ENV SENTRY_VERSION 9.1.2
+COPY --from=sdist /usr/local/bin/gosu /usr/local/bin/tini /usr/local/bin/
+
+# Sane defaults for pip
+ENV PIP_NO_CACHE_DIR=off \
+    PIP_DISABLE_PIP_VERSION_CHECK=1 \
+    # Sentry config params
+    SENTRY_CONF=/etc/sentry \
+    SENTRY_FILESTORE_DIR=/var/lib/sentry/files \
+    # Disable some unused uWSGI features, saving dependencies
+    # Thank to https://stackoverflow.com/a/25260588/90297
+    UWSGI_PROFILE_OVERRIDE=ssl=false;xml=false;routing=false \
+    # UWSGI dogstatsd plugin
+    UWSGI_NEED_PLUGIN=/var/lib/uwsgi/dogstatsd
 
+COPY --from=sdist /dist/*.whl /tmp/dist/
 RUN set -x \
-    && buildDeps=" \
+    && buildDeps="" \
+    # uwsgi
+    && buildDeps="$buildDeps \
+      gcc \
         g++ \
-        dirmngr \
-        gnupg \
         wget \
     " \
-    && apt-get update && apt-get install -y --no-install-recommends $buildDeps && rm -rf /var/lib/apt/lists/* \
-    && mkdir -p /usr/src/sentry \
-    && wget -O /usr/src/sentry/sentry-${SENTRY_VERSION}-py27-none-any.whl "https://github.com/getsentry/sentry/releases/download/${SENTRY_VERSION}/sentry-${SENTRY_VERSION}-py27-none-any.whl" \
-    && wget -O /usr/src/sentry/sentry-${SENTRY_VERSION}-py27-none-any.whl.asc "https://github.com/getsentry/sentry/releases/download/${SENTRY_VERSION}/sentry-${SENTRY_VERSION}-py27-none-any.whl.asc" \
-    && wget -O /usr/src/sentry/sentry_plugins-${SENTRY_VERSION}-py2.py3-none-any.whl "https://github.com/getsentry/sentry/releases/download/${SENTRY_VERSION}/sentry_plugins-${SENTRY_VERSION}-py2.py3-none-any.whl" \
-    && wget -O /usr/src/sentry/sentry_plugins-${SENTRY_VERSION}-py2.py3-none-any.whl.asc "https://github.com/getsentry/sentry/releases/download/${SENTRY_VERSION}/sentry_plugins-${SENTRY_VERSION}-py2.py3-none-any.whl.asc" \
-    && export GNUPGHOME="$(mktemp -d)" \
-    && for key in \
-      D8749766A66DD714236A932C3B2D400CE5BBCA60 \
-      70DBC4D958026B46032EAB75A17EE621C962DE46 \
-      4EBA9A94CC7DC65988662672C2F03C406631065D \
-    ; do \
-      gpg --batch --keyserver hkp://ha.pool.sks-keyservers.net --recv-keys "$key" || \
-      gpg --batch --keyserver hkp://ipv4.pool.sks-keyservers.net --recv-keys "$key" || \
-      gpg --batch --keyserver hkp://pgp.mit.edu:80 --recv-keys "$key" ; \
-    done \
-    && gpg --batch --verify /usr/src/sentry/sentry-${SENTRY_VERSION}-py27-none-any.whl.asc /usr/src/sentry/sentry-${SENTRY_VERSION}-py27-none-any.whl \
-    && gpg --batch --verify /usr/src/sentry/sentry_plugins-${SENTRY_VERSION}-py2.py3-none-any.whl.asc /usr/src/sentry/sentry_plugins-${SENTRY_VERSION}-py2.py3-none-any.whl \
-    && gpgconf --kill all \
-    && pip install \
-        /usr/src/sentry/sentry-${SENTRY_VERSION}-py27-none-any.whl \
-        /usr/src/sentry/sentry_plugins-${SENTRY_VERSION}-py2.py3-none-any.whl \
-    && sentry --help \
-    && sentry plugins list \
-    && rm -r "$GNUPGHOME" /usr/src/sentry \
-    && apt-get purge -y --auto-remove $buildDeps
-
-ENV SENTRY_CONF=/etc/sentry \
-    SENTRY_FILESTORE_DIR=/var/lib/sentry/files
-
-RUN mkdir -p $SENTRY_CONF && mkdir -p $SENTRY_FILESTORE_DIR
-
-COPY sentry.conf.py /etc/sentry/
-COPY config.yml /etc/sentry/
+    # maxminddb
+    && buildDeps="$buildDeps \
+      libmaxminddb-dev \
+    "\
+    # librabbitmq
+    && buildDeps="$buildDeps \
+      make \
+    " \
+       # xmlsec
+    && buildDeps="$buildDeps \
+      libxmlsec1-dev \
+      pkg-config \
+    " \
+    && apt-get update \
+    && apt-get install -y --no-install-recommends $buildDeps \
+    && pip install /tmp/dist/*.whl \
+    # Separate these due to https://git.io/fjyz6
+    # Otherwise librabbitmq will install the latest amqp version,
+    # violating kombu's amqp<2.0 constraint.
+    && pip install librabbitmq==1.6.1 \
+    && mkdir /tmp/uwsgi-dogstatsd \
+    && wget -O - https://github.com/eventbrite/uwsgi-dogstatsd/archive/filters-and-tags.tar.gz | \
+       tar -xzf - -C /tmp/uwsgi-dogstatsd --strip-components=1 \
+    && UWSGI_NEED_PLUGIN="" uwsgi --build-plugin /tmp/uwsgi-dogstatsd \
+    && mkdir -p /var/lib/uwsgi \
+    && mv dogstatsd_plugin.so /var/lib/uwsgi/ \
+    && rm -rf /tmp/dist /tmp/uwsgi-dogstatsd .uwsgi_plugins_builder \
+    && apt-get purge -y --auto-remove $buildDeps \
+    # We install run-time dependencies strictly after
+    # build dependencies to prevent accidental collusion.
+    # These are also installed last as they are needed
+    # during container run and can have the same deps w/
+    # build deps such as maxminddb.
+    && apt-get install -y --no-install-recommends \
+      # pillow
+      libjpeg-dev \
+      # rust bindings
+      libffi-dev \
+      # maxminddb bindings
+      libmaxminddb-dev \
+      # SAML needs these run-time
+      libxmlsec1-dev \
+      libxslt-dev \
+      # pyyaml needs this run-time
+      libyaml-dev \
+      # other
+      pkg-config \
+    \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/* \
+    && python -c 'import librabbitmq' \
+    # Fully verify that the C extension is correctly installed, it unfortunately
+    # requires a full check into maxminddb.extension.Reader
+    && python -c 'import maxminddb.extension; maxminddb.extension.Reader' \
+    && mkdir -p $SENTRY_CONF
 
-COPY docker-entrypoint.sh /entrypoint.sh
+COPY ./docker/docker-entrypoint.sh ./docker/sentry.conf.py ./docker/config.yml $SENTRY_CONF/
 
 EXPOSE 9000
-VOLUME /var/lib/sentry/files
+VOLUME /data
 
-ENTRYPOINT ["/entrypoint.sh"]
+ENTRYPOINT exec $SENTRY_CONF/docker-entrypoint.sh $0 $@
 CMD ["run", "web"]
+
+ARG SOURCE_COMMIT
+ENV SENTRY_BUILD=${SOURCE_COMMIT:-unknown}
+LABEL org.opencontainers.image.revision=$SOURCE_COMMIT
+LABEL org.opencontainers.image.licenses="https://github.com/getsentry/sentry/blob/${SOURCE_COMMIT:-master}/LICENSE"

The change that I noticed first was the multi-stage build, which doesn't appear to fall in line with the guidelines at https://github.com/docker-library/faq#multi-stage-builds.

COPY . /usr/src/sentry/

This change really stands out because it essentially means the entire Sentry source code is now part of the Docker build context, and thus becomes part of the reviewed artifacts. Is there not some other official "release" artifact which would then be consumed by Docker as previously? (ENV SENTRY_VERSION 9.1.2 which then leads to pip install, given that pypi is the official release platform)

That's what patch releases are for, I guess? We don't do this: we tag things and treat tags as immutable unless they are explicitly expected to be mutable (like :latest or :9.1 which actually stands for latest :9.1.x etc.).
, this is exactly what I was getting at -- most projects treat tagged releases as immutable, which becomes a problem when there's Dockerfile review that happens post-release. That's why most of our maintainers opt to put the Dockerization into a separate repository so that it can be modified and maintained independently (since it's really packaging for the release, similar to a .deb).

I did and I'm not very sure whether this program fits our releases anymore as we switched to releasing every new commit. We are considering a monthly "official release" cycle which might be more suitable.

Indeed -- releasing every commit is definitely not suitable for our program, given that it's reviewed. A monthly official release would definitely be much more suitable.

We don't have any problem with the Dockerization consuming from Git instead of something like pypi, but if that means it's doing COPY . /..., that's going to be unreviewable (given that the build context will be the entire application source code).

The two blockers are the need for Dockerfile to be in the root directory, and the slow approval process.

The first of these is overcome by additional fields in library/sentry -- the latter is unfortunately the main selling point of the program, so all I can offer is my apologies for not getting to reviewing things as quickly as we'd always like to. 😞

@BYK
Copy link
Contributor Author

BYK commented Jan 15, 2020

The change that I noticed first was the multi-stage build, which doesn't appear to fall in line with the guidelines at docker-library/faq#multi-stage-builds.

It does fully align with the second item from the section you refer to:

two-stage build where the necessary artifact does not exist and must be built from source and/or the build process is going to be similarly highly deterministic (thus mitigating the cache concern somewhat); for example:

This change really stands out because it essentially means the entire Sentry source code is now part of the Docker build context, and thus becomes part of the reviewed artifacts. Is there not some other official "release" artifact which would then be consumed by Docker as previously?

Nope as if you look closely, that pip artifact is generated by this build stage and then consumed in the runtime image. I'm okay publishing this as a separate image and then runtime image for the official library doing a COPY --from getsentry/sentry:latest-pip but that would mean we need to maintain another special image for you folks and that's exactly what I'm trying to avoid.

We don't have any problem with the Dockerization consuming from Git instead of something like pypi, but if that means it's doing COPY . /..., that's going to be unreviewable (given that the build context will be the entire application source code).

I don't see how this makes a different as a PIP wheel is essentially a zip file with the whole source code in it. Even worse, the front-end artifacts would be minified JS so this is actually way less reviewable?

The first of these is overcome by additional fields in library/sentry

I checked the additional fields and there doesn't seem to be a way to separate the Dockerfile location and the build context directory. So whenever we tell the system to use the docker folder, which we place the Dockerfile in, then we cannot use the main repo as our build context. The command line allows this but not the automated build process.

the latter is unfortunately the main selling point of the program, so all I can offer is my apologies for not getting to reviewing things as quickly as we'd always like to. 😞

No need to apologize! Our interactions so far has been quite speedy. It's just my worry that it will never be as fast as our own images at the getsentry/sentry hub repo.

@yosifkit
Copy link
Contributor

It does fully align with the second item from the section you refer to:

The list was meant as possible examples of multi-stage builds that are likely acceptable, not really a comprehensive list of broad categories that are acceptable (apologies for that being unclear). One of the major issues we have with multi-stage builds is below.

On the official images build infrastructure, "we don't have a clean way to preserve the cache for the intermediate stages", so the layers will get deleted when we clean up images that are "ripe". The practical implication of this is that since the build cache for these untagged stages could be deleted at any time, we will end up spending extra time rebuilding them and users will pull "new" images that are unchanged.

(Ref docker-library/official-images#7134 (comment) and docker-library/official-images#5929 (comment))

So this python build is not deterministic. I ran a quick test after checking out the getsentry/sentry:

$ docker build -t sentry-test -f docker/Dockerfile .
...
$ # everything completed successfully
$ # clean up dangling images similar to the build servers (which removes the first stage):
$ docker image prune -f
$ # build again and the first stage has to build again as expected
$ docker build -t sentry-test -f docker/Dockerfile .
$ # BUT, docker build cache is also broken in the second stage
$ # causing a rebuild there and later, but the image didn't really change
$ # so users will get a new image on docker pull that has no relevant change
Step 34/45 : ENV PIP_NO_CACHE_DIR=off     PIP_DISABLE_PIP_VERSION_CHECK=1     SENTRY_CONF=/etc/sentry     SENTRY_FILESTORE_DIR=/var/lib/sentry/files     UWSGI_PROFILE_OVERRIDE=ssl=false;xml=false;routing=false     UWSGI_NEED_PLUGIN=/var/lib/uwsgi/dogstatsd
 ---> Using cache
 ---> 28556a6a24bf
Step 35/45 : COPY --from=sdist /dist/*.whl /tmp/dist/
 ---> 65309de20a56

Nope as if you look closely, that pip artifact is generated by this build stage and then consumed in the runtime image.

The docker build context is still ., i.e. the entire getsentry/sentry repo, so any changes there between versions handed to us will show up in our review. (see diff-pr.sh for how we generate our diff; it is basically taking all docker build contexts from the current library/sentry file and diffing them to the new contexts from the PR)

I don't see how this makes a different as a PIP wheel is essentially a zip file with the whole source code in it. Even worse, the front-end artifacts would be minified JS so this is actually way less reviewable?

We review the Dockerfile and the docker build context, not the content that they download from pip. What we are concerned with is how the software runs and that it securely gets resources from the internet (like gpg verification: https://github.com/docker-library/official-images#security).

@BYK
Copy link
Contributor Author

BYK commented Jan 17, 2020

@yosifkit, thanks for providing more context around this.

The list was meant as possible examples of multi-stage builds that are likely acceptable, not really a comprehensive list of broad categories that are acceptable (apologies for that being unclear). One of the major issues we have with multi-stage builds is below.

For this, I proposed the following in that comment:

I'm okay publishing this as a separate image and then runtime image for the official library doing a COPY --from getsentry/sentry:latest-pip but that would mean we need to maintain another special image for you folks and that's exactly what I'm trying to avoid.

It is not ideal but I'm okay doing it if that'd be the major blocker.

So this python build is not deterministic. I ran a quick test after checking out the getsentry/sentry:

Is there a way to debug this as I don't really understand why this would be the case? Is it the wildcard COPY instruction? Is it the wget here: https://github.com/getsentry/sentry/blob/1e99064dc4d6558290d7aead56c414111901c3aa/docker/Dockerfile#L143

Or is it something else? Leveraging the cache is very important to us too and I was wondering what might be some things we could do to improve the situation.

The docker build context is still ., i.e. the entire getsentry/sentry repo, so any changes there between versions handed to us will show up in our review. (see diff-pr.sh for how we generate our diff; it is basically taking all docker build contexts from the current library/sentry file and diffing them to the new contexts from the PR)

If we move on to using that :pip-latest image, then we can get away by only using the docker directory as the build context which should solve your issue. I personally find it weird that you'd be okay with this as it literally doesn't change anything: we still have the whole Sentry codebase to review, if you want a proper review but it should address your build context related concern.

We review the Dockerfile and the docker build context, not the content that they download from pip. What we are concerned with is how the software runs and that it securely gets resources from the internet (like gpg verification: docker-library/official-images#security).

Isn't it equivalent to building from source? I'm guessing this is your way of optimizing for not rebuilding from source every time but again, I'm not really convinced that the approach taken by the official library adds any extra value regarding security etc.

@BYK
Copy link
Contributor Author

BYK commented Jan 27, 2020

@yosifkit are you willing to provide help/feedback regarding making the dockerfile deterministic? Also I'd rather finalize this discussion: re do you folks see a way forward keeping Sentry in the official library or shall we just keep using the getsentry/sentry images and remove the stale one from the official library?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants