This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Update integration image
Summary:
This diff makes several fixes:

* Rebase the integration image on the slim Debian variants, and install the few dependencies we still need that the slim images omit (such as ssh)
* Fix the integration image Dockerfile, which appears to have broken because of upstream changes
* Fixes dagster-io#1999
* Our snapshotted requirements.txt dependencies for Python 3 were previously generated from a Python 3.5 environment, which was increasingly being left behind. This diff snapshots requirements.txt from Python 3.7 instead, and adds [[ https://www.python.org/dev/peps/pep-0496/ | PEP 496 environment markers ]] to constrain installed dependencies so that the 3.5 and 3.8 image builds still succeed.
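
As a rough illustration of the environment-marker approach in the last bullet, the sketch below builds requirements.txt lines whose `python_version` markers restrict a pin to certain interpreters. The helper name `constrain` and the package names are hypothetical and are not taken from the real snapshot files.

```python
def constrain(requirement, min_python=None, max_python_exclusive=None):
    """Append a python_version environment marker to a pinned requirement line.

    Hypothetical helper for illustration; the real snapshot files may have
    been edited by hand or by another tool.
    """
    clauses = []
    if min_python is not None:
        clauses.append('python_version >= "{}"'.format(min_python))
    if max_python_exclusive is not None:
        clauses.append('python_version < "{}"'.format(max_python_exclusive))
    if not clauses:
        # No bounds requested: leave the pin unconstrained.
        return requirement
    return "{}; {}".format(requirement, " and ".join(clauses))


# Hypothetical pins: one release for newer interpreters, an older one for 3.5.
lines = [
    constrain("example-lib==2.0.0", min_python="3.6"),
    constrain("example-lib==1.9.0", max_python_exclusive="3.6"),
    constrain("example-backport==1.0.0", max_python_exclusive="3.7"),
]
print("\n".join(lines))
```

With lines like these, pip skips any requirement whose marker does not match the running interpreter, which is how a single snapshot file can serve several image builds.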

Final integration image build is here: https://buildkite.com/dagster/integration-image-builds/builds/46

Test Plan: buildkite

Reviewers: schrockn

Reviewed By: schrockn

Differential Revision: https://dagster.phacility.com/D1750
Nate Kupp committed Dec 30, 2019
1 parent 800ee22 commit ab99f49
Showing 13 changed files with 364 additions and 280 deletions.
10 changes: 5 additions & 5 deletions .buildkite/defines.py
@@ -1,10 +1,10 @@
 # This should be an enum once we make our own buildkite AMI with py3
 class SupportedPython(object):
-    V3_8 = "3.8.0"
-    V3_7 = "3.7.4"
-    V3_6 = "3.6.9"
-    V3_5 = "3.5.7"
-    V2_7 = "2.7.16"
+    V3_8 = "3.8.1"
+    V3_7 = "3.7.6"
+    V3_6 = "3.6.10"
+    V3_5 = "3.5.8"
+    V2_7 = "2.7.17"


 # See: https://github.com/dagster-io/dagster/issues/1960
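
The comment at the top of `defines.py` says `SupportedPython` should become an enum once the Buildkite AMI runs Python 3. A minimal sketch of what that might look like, using the new version strings from this diff; the `major_minor` helper is hypothetical and not part of the repository:

```python
from enum import Enum


class SupportedPython(Enum):
    # Values mirror the updated version strings in this diff.
    V3_8 = "3.8.1"
    V3_7 = "3.7.6"
    V3_6 = "3.6.10"
    V3_5 = "3.5.8"
    V2_7 = "2.7.17"


def major_minor(version):
    """Return the 'X.Y' prefix of a SupportedPython member (hypothetical helper)."""
    major, minor, _patch = version.value.split(".")
    return "{}.{}".format(major, minor)


# Enum members can be looked up by value and iterated in definition order.
assert SupportedPython("3.7.6") is SupportedPython.V3_7
print([major_minor(v) for v in SupportedPython])
```

An enum gives lookup by value and iteration for free, at the cost of requiring callers to use `.value` wherever the raw version string is needed.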
10 changes: 5 additions & 5 deletions .buildkite/images/Makefile
@@ -1,8 +1,8 @@
-py27 = 2.7.16
-py35 = 3.5.7
-py36 = 3.6.9
-py37 = 3.7.4
-py38 = 3.8.0
+py27 = 2.7.17
+py35 = 3.5.8
+py36 = 3.6.10
+py37 = 3.7.6
+py38 = 3.8.1

 ####################################################################################################
 # Update snapshots
6 changes: 3 additions & 3 deletions .buildkite/images/README.md
@@ -5,6 +5,6 @@
 example: https://buildkite.com/dagster/integration-image-builds/builds/19. Hit "New Build", then
 make sure that the value for "Branch" is something like `phabricator/diff/7663` and the value for
 "Commit" is `HEAD`. ("Message" doesn't matter.)
-3. Supply the version string to unblock the buildkite build (in this case, "v7")
-4. After the images successfully publish to ECR, update the diff to set `INTEGRATION_IMAGE_VERSION`
-to the new version
+3. After the images successfully publish to ECR, update the diff to set `INTEGRATION_IMAGE_VERSION`
+to the new version (check ECR for the version string, which is the YYYY-mm-ddTHHMMSS when the
+image was created.)
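
The updated README step describes the version string as a `YYYY-mm-ddTHHMMSS` timestamp. A small sketch of producing and validating a string in that shape follows; the function names are hypothetical, and whether the real build uses UTC or local time is not stated in the source.

```python
from datetime import datetime

# strftime/strptime pattern matching the YYYY-mm-ddTHHMMSS shape from the README.
VERSION_FORMAT = "%Y-%m-%dT%H%M%S"


def image_version(created_at):
    """Format a datetime as an image version string (hypothetical helper)."""
    return created_at.strftime(VERSION_FORMAT)


def parse_image_version(version):
    """Parse a version string back to a datetime; raises ValueError if malformed."""
    return datetime.strptime(version, VERSION_FORMAT)


version = image_version(datetime(2019, 12, 30, 14, 5, 9))
print(version)
```

Parsing the value copied from ECR before setting `INTEGRATION_IMAGE_VERSION` is a cheap way to catch a mistyped version string.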
195 changes: 103 additions & 92 deletions .buildkite/images/docker/Dockerfile
@@ -8,59 +8,88 @@
ARG DEBIAN_VERSION
ARG PYTHON_VERSION

FROM python:"${PYTHON_VERSION}-${DEBIAN_VERSION}"
FROM python:"${PYTHON_VERSION}-slim-${DEBIAN_VERSION}"

# ARG must be both before and after FROM
# See https://docs.docker.com/engine/reference/builder/#understand-how-arg-and-from-interact
ARG DEBIAN_VERSION
ARG PYTHON_MAJOR_VERSION

LABEL maintainer="Elementl"

# Never prompts the user for choices on installation/configuration of packages
ENV DEBIAN_FRONTEND noninteractive
ENV TERM linux
ENV DEBIAN_FRONTEND=noninteractive \
TERM=linux

# Add files needed for build
ADD trigger_maven.py .
ADD snapshot-reqs-$PYTHON_MAJOR_VERSION.txt /snapshot-reqs.txt
ADD scala_modules scala_modules

# Define en_US.
# Set correct locale first and install deps for installing debian packages
RUN apt-get update -yqq \
&& apt-get install -y locales \
&& apt-get upgrade -yqq \
&& apt-get install -yqq --no-install-recommends \
apt-transport-https \
curl \
ca-certificates \
gnupg2 \
locales \
lsb-release \
&& sed -i -e 's/# en_US.UTF-8 UTF-8/en_US.UTF-8 UTF-8/' /etc/locale.gen \
&& dpkg-reconfigure locales \
&& dpkg-reconfigure --frontend=noninteractive locales \
&& update-locale LANG=en_US.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

# Set env vars
# This installs Java 8 (required by pyspark) - see: http://bit.ly/2ZIuHRh
ENV JAVA_HOME /docker-java-home
ENV JAVA_VERSION 8u222
ENV JAVA_DEBIAN_VERSION 8u222-b10-1~deb9u1
ENV SBT_VERSION=1.2.8
ENV PYSPARK_VERSION=2.4.4
ENV KUBECTL_VERSION=v1.16.3
ENV KIND_VERSION=v0.5.1
ENV LANG=en_US.UTF-8 \
LANGUAGE=en_US.UTF-8 \
LC_ALL=en_US.UTF-8 \
JAVA_HOME=/docker-java-home \
JAVA_VERSION=8u222 \
JAVA_DEBIAN_VERSION=8u222-b10-1~deb9u1 \
SBT_VERSION=1.2.8 \
PYSPARK_VERSION=2.4.4 \
KUBECTL_VERSION=v1.16.4 \
KIND_VERSION=v0.5.1

# Install kubectl
RUN echo "--- \033[32m:k8s: Install kubectl\033[0m" \
&& curl -LO "https://storage.googleapis.com/kubernetes-release/release/$KUBECTL_VERSION/bin/linux/amd64/kubectl" \
&& chmod +x ./kubectl \
&& mv ./kubectl /usr/local/bin/kubectl

# Install kind & helm
RUN echo "--- \033[32m:k8s: Install kind\033[0m" \
&& curl -L "https://github.com/kubernetes-sigs/kind/releases/download/$KIND_VERSION/kind-linux-amd64" -o ./kind \
&& chmod +x ./kind \
&& mv ./kind /usr/local/bin/kind \
&& echo "--- \033[32m:k8s: Install helm\033[0m" \
&& curl "https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3" | bash

# add a simple script that can auto-detect the appropriate JAVA_HOME value
# based on whether the JDK or only the JRE is installed
RUN { \
# This installs Java 8 (required by pyspark) - see: http://bit.ly/2ZIuHRh
RUN set -ex \
&& echo "--- \033[32m:debian: Install Debian packages\033[0m" \
# Add a simple script that can auto-detect the appropriate JAVA_HOME value based on whether the JDK
# or only the JRE is installed
&& { \
echo '#!/bin/sh'; \
echo 'set -e'; \
echo; \
echo 'dirname "$(dirname "$(readlink -f "$(which javac || which java)")")"'; \
} > /usr/local/bin/docker-java-home \
&& chmod +x /usr/local/bin/docker-java-home \
# do some fancy footwork to create a JAVA_HOME that's cross-architecture-safe
# Do some fancy footwork to create a JAVA_HOME that's cross-architecture-safe
&& ln -svT "/usr/lib/jvm/java-8-openjdk-$(dpkg --print-architecture)" /docker-java-home \
# Pull in debian stretch stable repo so we can still get openjdk-8 on buster
&& echo "APT::Default-Release \"${DEBIAN_VERSION}\";" > /etc/apt/apt.conf.d/99defaultrelease \
&& echo "deb http://http.us.debian.org/debian stretch main contrib non-free" > /etc/apt/sources.list.d/91-debian-stretch.list

RUN set -ex \
# Pull in debian stretch stable repo so we can still get openjdk-8 on buster
&& echo "APT::Default-Release \"$(lsb_release -cs)\";" > /etc/apt/apt.conf.d/99defaultrelease \
&& echo "deb http://http.us.debian.org/debian stretch main contrib non-free" > /etc/apt/sources.list.d/91-debian-stretch.list \
# PostgreSQL debian repo so that we can install PG 11 on both stretch and buster
&& echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list \
&& curl -sS https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - \
# Node JS
&& curl -sL https://deb.nodesource.com/setup_11.x | bash - \
# Add yarn repo
&& curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add - \
&& echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list \
# Deal with slim variants not having man page directories (which causes "update-alternatives"
# Deal with slim variants not having man page directories (which causes "update-alternatives" to fail)
&& mkdir -p /usr/share/man/man1 \
# Refresh apt
&& apt-get update -yqq \
@@ -69,23 +98,28 @@ RUN set -ex \
&& apt-get install -yqq --no-install-recommends \
bzip2 \
cron \
openjdk-8-jre-headless="$JAVA_DEBIAN_VERSION" \
openjdk-8-jdk-headless="$JAVA_DEBIAN_VERSION" \
gcc \
git \
# These three packages are needed for Python 3.8 until the associated libraries publish wheels
libc-dev \
libgeos-dev \
libpq-dev \
make \
nodejs \
openjdk-8-jdk-headless="$JAVA_DEBIAN_VERSION" \
openjdk-8-jre-headless="$JAVA_DEBIAN_VERSION" \
pandoc \
rabbitmq-server \
rsync \
ssh \
software-properties-common \
# Manually installing sudo else apt-get autoremove will fail trying to remove sudo later
sudo \
unzip \
xz-utils \
yarn \
# Clean up after install process
&& apt-get autoremove -yqq --purge \
&& apt-get clean \
&& rm -rf \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/* \
/usr/share/doc \
/usr/share/doc-base \
# Need to ensure PG installs from PG debian repo
&& apt-get install -yqq postgresql-11 -t "$(lsb_release -cs)-pgdg" \
# update-alternatives so that future installs of other OpenJDK versions don't change /usr/bin/java
&& update-alternatives --get-selections | awk -v home="$(readlink -f "$JAVA_HOME")" 'index($3, home) == 1 { $2 = "manual"; print | "update-alternatives --set-selections" }' \
# Validate installation
@@ -94,78 +128,55 @@ RUN set -ex \
&& java -version

# This will frequently OOM without --no-cache-dir
RUN pip --no-cache-dir install pyspark==$PYSPARK_VERSION

RUN echo "--- \033[32m:python: Install Python dependencies\033[0m" \
&& pip --no-cache-dir install pyspark==$PYSPARK_VERSION \
# This instigates some package downloads required by the airline-demo. Fails on Python 3.8
ADD trigger_maven.py .
RUN python trigger_maven.py; exit 0

RUN apt-get update -yqq \
&& apt-get install -yqq \
libgdal-dev \
libgeos-dev \
libpq-dev

# pip install all the downstream deps to speed up our CI jobs
ADD snapshot-reqs-$PYTHON_MAJOR_VERSION.txt /snapshot-reqs.txt
&& python trigger_maven.py; exit 0
# pip install all of our deps to speed up our CI jobs
RUN pip install -U pip setuptools wheel \
&& pip install \
awscli \
pipenv \
tox \
&& pip install -r /snapshot-reqs.txt

# Install gcloud CLI
RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | \
tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - && apt-get update -y && apt-get install google-cloud-sdk -y

# Install Docker and other dependencies
RUN curl -L "https://github.com/docker/compose/releases/download/1.24.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose \
# Add gcloud CLI debian pkg source
RUN echo "--- \033[32m:linux: Misc installs and cleanup\033[0m" \
&& echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | \
tee -a /etc/apt/sources.list.d/google-cloud-sdk.list \
&& curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | \
apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - \
# Install docker-compose
&& curl -L "https://github.com/docker/compose/releases/download/1.24.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose \
&& chmod +x /usr/local/bin/docker-compose \
&& curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add - \
# Get SBT
&& curl -LO https://dl.bintray.com/sbt/debian/sbt-$SBT_VERSION.deb \
&& curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add - \
# Add SBT debian pkg
&& curl -LO "https://dl.bintray.com/sbt/debian/sbt-$SBT_VERSION.deb" \
&& dpkg -i sbt-$SBT_VERSION.deb \
&& rm sbt-$SBT_VERSION.deb \
&& add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable" \
&& apt-get -yqq update \
&& apt-get -yqq install \
apt-transport-https \
ca-certificates \
curl \
docker-ce \
gnupg2 \
pandoc \
google-cloud-sdk \
sbt \
rabbitmq-server \
postgresql \
postgresql-contrib \
# Validate that SBT works
&& sbt sbtVersion \
# Pre-load jars for scala_modules by running a compile
&& cd /scala_modules && make compile \
# Clean up after install process
&& apt-get remove -yqq \
gcc \
libc-dev \
libgeos-dev \
libpq-dev \
&& apt-get autoremove -yqq --purge \
&& apt-get clean \
&& rm -rf \
/sbt-$SBT_VERSION.deb \
/var/lib/apt/lists/* \
/tmp/* \
/var/tmp/* \
/usr/share/doc \
/usr/share/doc-base \
# Validate
&& sbt sbtVersion

# Install kubectl and kind CLI utils
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/$KUBECTL_VERSION/bin/linux/amd64/kubectl \
&& chmod +x ./kubectl \
&& mv ./kubectl /usr/local/bin/kubectl \
&& curl -Lo ./kind https://github.com/kubernetes-sigs/kind/releases/download/$KIND_VERSION/kind-linux-amd64 \
&& chmod +x ./kind \
&& mv ./kind /usr/local/bin/kind

# Pre-load jars for scala_modules by running a compile
ADD scala_modules scala_modules
RUN cd scala_modules && make compile

# Final cleanup
RUN rm -rf /scala_modules \
/snapshot-reqs.txt \
/trigger_maven.py
/scala_modules \
/snapshot-reqs.txt \
/trigger_maven.py
1 change: 0 additions & 1 deletion .buildkite/images/docker/build.sh
@@ -36,7 +36,6 @@ fi

 # Build the integration image
 docker build . \
-    --no-cache \
     --build-arg DEBIAN_VERSION=$DEBIAN_VERSION \
     --build-arg PYTHON_VERSION=$PYTHON_VERSION \
     --build-arg PYTHON_MAJOR_VERSION=$PYTHON_MAJOR_VERSION \
2 changes: 2 additions & 0 deletions .buildkite/images/docker/push.sh
@@ -21,6 +21,8 @@ IMAGE_VERSION=$2

 TAG=`date '+%Y-%m-%d'`

+echo -e "--- \033[32m:docker: Tag and push Docker images\033[0m"
+
 # Log into ECR
 aws ecr get-login --no-include-email --region us-west-1 | sh
