Reduce docker image size #1648

Merged (10 commits) on Oct 23, 2019
86 changes: 69 additions & 17 deletions .circleci/config.yml
@@ -1,11 +1,12 @@
version: 2
version: 2.1

references:
workspace_root: &workspace_root /tmp/workspace
attach_workspace: &attach_workspace
attach_workspace:
at: *workspace_root


jobs:
# ----------------------------------
# Check formatting
@@ -147,7 +148,7 @@ jobs:
# Run unit tests in Python 3.5-3.7
# ----------------------------------

test-3.5:
test-35:
docker:
- image: python:3.5

@@ -176,7 +177,7 @@ jobs:
paths:
- coverage

test-3.6:
test-36:
docker:
- image: python:3.6
steps:
@@ -204,7 +205,7 @@ jobs:
paths:
- coverage

test-3.7:
test-37:
docker:
- image: python:3.7
steps:
@@ -266,41 +267,74 @@ jobs:
name: Upload Coverage
command: bash <(curl -s https://codecov.io/bash) -cF python -s "/tmp/workspace/coverage/"

build_image:
build-docker-image:
docker:
- image: docker
parameters:
python-version:
type: string
tag-latest:
type: boolean
default: false
environment:
PYTHON_VERSION: << parameters.python-version >>
PYTHON_TAG: python<< parameters.python-version >>
steps:
- setup_remote_docker
- checkout
- run:
name: Docker Build
command: docker build -t prefecthq/prefect .
- setup_remote_docker:
docker_layer_caching: true
- run:
name: Build image
command: >-
docker build
--build-arg GIT_SHA=$CIRCLE_SHA1
--build-arg BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
--build-arg PREFECT_VERSION=$CIRCLE_TAG
-t prefecthq/prefect:${CIRCLE_TAG}-${PYTHON_TAG}
-t prefecthq/prefect:$PYTHON_TAG
.
- when:
condition: << parameters.tag-latest >>
steps:
- run:
name: Tag latest image
command: |
docker tag prefecthq/prefect:${CIRCLE_TAG}-${PYTHON_TAG} prefecthq/prefect:latest
- run:
name: Test image
command: |
docker run -dit prefecthq/prefect /bin/bash -c 'curl -fL0 https://raw.githubusercontent.com/PrefectHQ/prefect/master/examples/retries_with_mapping.py | python'
- run:
name: Authenticate with Docker Hub and push
name: Push versioned tags
command: |
docker login --username $DOCKER_HUB_USER --password $DOCKER_HUB_PW
docker push prefecthq/prefect
docker push prefecthq/prefect:${CIRCLE_TAG}-${PYTHON_TAG}
Member

I think we might want to finesse the logic here a little, because I suspect this will end up pushing an image with a messy git tag

Author

I'm open to a better way to do it. I added CircleCI git tag filters here to help reduce the risk of an odd tag coming through.

Member

Hmmm yea but won't this job still run on master?

Author

Good call, I thought these were AND'ed instead of OR'ed. I'll update...

Author

Interestingly enough there isn't a good way to do this, @TylerWanner and I came up with this combo:

I'm open to a better approach, but this is all I've got until CircleCI supports AND'ed filters.
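
To make the AND/OR point in this thread concrete, here is a minimal Python sketch of how the branch filter and the tag filter combine. It is a simplified model and not CircleCI's implementation: the function name `job_is_scheduled` and its arguments are hypothetical, and it assumes a build is triggered by either a branch push or a tag push.

```python
import re
from typing import Optional

# Hypothetical, simplified model of the two workflow filters in this PR.
BRANCH_ONLY = "master"
TAG_ONLY = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+$")


def job_is_scheduled(branch: Optional[str], tag: Optional[str]) -> bool:
    """Return True if either filter matches (the filters behave like an OR)."""
    branch_matches = branch == BRANCH_ONLY
    tag_matches = tag is not None and TAG_ONLY.match(tag) is not None
    return branch_matches or tag_matches


# A plain merge to master carries no tag, yet the job is still scheduled.
print(job_is_scheduled(branch="master", tag=None))   # True
# A release tag push is scheduled as well.
print(job_is_scheduled(branch=None, tag="0.6.7"))    # True
# A feature branch without a tag is not.
print(job_is_scheduled(branch="feature", tag=None))  # False
```

Under this model the job can run for an untagged commit on master, which is the case the reviewer is asking about.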

Member

I think this is the only place I'm still hesitant; here's what I imagine happening:

  • a PR is merged to master
  • build-docker-image is run
  • CIRCLE_TAG is empty, because this isn't a tagged commit
  • either this step or the build step errors out in some unique way based on this

Everything else looks great though! And to be honest, we could always clean things up if that happens but I want to make sure I understand this piece
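
The scenario in the list above can be sketched in Python. This is only an illustration: `build_tags` and `is_valid_tag` are hypothetical helpers, and the validity check assumes the documented Docker rule that a tag may contain letters, digits, underscores, periods and dashes, may not start with a period or a dash, and is at most 128 characters long.

```python
import re

# Assumption: Docker's documented tag grammar, expressed as a regex.
_TAG_PATTERN = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def build_tags(circle_tag: str, python_version: str) -> list:
    """Mirror the tag parts of the two -t arguments used by the build step."""
    python_tag = "python{}".format(python_version)
    return ["{}-{}".format(circle_tag, python_tag), python_tag]


def is_valid_tag(tag: str) -> bool:
    """Check a tag against the assumed Docker tag grammar."""
    return _TAG_PATTERN.fullmatch(tag) is not None


# Tagged release commit: both tags are well formed.
print(build_tags("0.6.7", "3.7"))
# Untagged commit on master: the first tag starts with a dash.
untagged = build_tags("", "3.7")
print(untagged, [is_valid_tag(t) for t in untagged])
```

With an empty CIRCLE_TAG the versioned tag becomes `-python3.7`, which the assumed grammar rejects, so the build step would be expected to fail before anything is pushed.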

Author (@wagoodman, Oct 23, 2019)

I don't believe it is possible to reach this step because of the workflow filter. However, I'm all on board for an extra measure of safety. I just added the set -u mentioned below, which will fail the build image step on unset env vars.
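
For context on the guard described in this comment: in a POSIX shell, `set -u` aborts when an unset variable is expanded, but a variable that is set to an empty string still passes. A stricter check that rejects both cases could look like the sketch below. The helper `require_env` is hypothetical and is not part of this PR.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the variable's value, failing if it is unset or empty."""
    env = os.environ if env is None else env
    if name not in env:
        # This is the case that `set -u` would also catch.
        raise RuntimeError("{} is not set".format(name))
    value = env[name]
    if value == "":
        # `set -u` alone would let this case through.
        raise RuntimeError("{} is set but empty".format(name))
    return value


# Example with an explicit mapping instead of the real environment.
print(require_env("CIRCLE_TAG", {"CIRCLE_TAG": "0.6.7"}))
```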

docker push prefecthq/prefect:$PYTHON_TAG
- when:
condition: << parameters.tag-latest >>
steps:
- run:
name: Push latest tag
command: |
docker login --username $DOCKER_HUB_USER --password $DOCKER_HUB_PW
docker push prefecthq/prefect:latest

workflows:
version: 2

"Run tests":
jobs:
- test-3.5
- test-3.6
- test-3.7
- test-35
- test-36
- test-37
- test-lower-prefect
- test-vanilla-prefect
- test-py352-import-prefect
- test-airflow
- upload-coverage:
requires:
- test-3.5
- test-3.6
- test-35
- test-36
- test-vanilla-prefect
- test-airflow

@@ -312,7 +346,25 @@ workflows:

"Build docker images":
jobs:
- build_image:
- build-docker-image:
python-version: "3.5"
filters:
branches:
only: master
tags:
only: /^[0-9]+\.[0-9]+\.[0-9]+$/
- build-docker-image:
python-version: "3.6"
filters:
branches:
only: master
tags:
only: /^[0-9]+\.[0-9]+\.[0-9]+$/
- build-docker-image:
python-version: "3.7"
tag-latest: true
filters:
branches:
only: master
tags:
only: /^[0-9]+\.[0-9]+\.[0-9]+$/
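
As a rough illustration of what the three parameterized job invocations above would publish for one release, the following Python sketch enumerates the image references. The function name `pushed_images` and the `JOBS` table are hypothetical; the tag layout is taken from the build and push steps in this diff.

```python
# Hypothetical table mirroring the three build-docker-image invocations.
JOBS = [
    {"python_version": "3.5", "tag_latest": False},
    {"python_version": "3.6", "tag_latest": False},
    {"python_version": "3.7", "tag_latest": True},
]


def pushed_images(release: str) -> list:
    """List the image references the jobs would push for one release tag."""
    images = []
    for job in JOBS:
        python_tag = "python{}".format(job["python_version"])
        images.append("prefecthq/prefect:{}-{}".format(release, python_tag))
        images.append("prefecthq/prefect:{}".format(python_tag))
        if job["tag_latest"]:
            images.append("prefecthq/prefect:latest")
    return images


for image in pushed_images("0.6.7"):
    print(image)
```

Only the Python 3.7 invocation sets `tag-latest`, so `latest` appears once in the result.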
22 changes: 17 additions & 5 deletions Dockerfile
@@ -1,8 +1,20 @@
ARG PYTHON_VERSION=3.6
ARG PYTHON_VERSION=${PYTHON_VERSION:-3.6}
FROM python:${PYTHON_VERSION}-slim

FROM python:${PYTHON_VERSION}
LABEL maintainer="help@prefect.io"
ARG GIT_POINTER=master
RUN apt update && apt install -y gcc git && rm -rf /var/lib/apt/lists/*

RUN pip install git+https://github.com/PrefectHQ/prefect.git@${GIT_POINTER}#egg=prefect[kubernetes]
ARG PREFECT_VERSION
RUN pip install git+https://github.com/PrefectHQ/prefect.git@${PREFECT_VERSION}#egg=prefect[kubernetes]
RUN mkdir /root/.prefect/

ARG GIT_SHA
ARG BUILD_DATE

LABEL maintainer="help@prefect.io"
LABEL io.prefect.python-version=${PYTHON_VERSION}
LABEL org.label-schema.schema-version="1.0"
LABEL org.label-schema.name="prefect"
LABEL org.label-schema.url="https://www.prefect.io/"
LABEL org.label-schema.version=${PREFECT_VERSION}
LABEL org.label-schema.vcs-ref=${GIT_SHA}
LABEL org.label-schema.build-date=${BUILD_DATE}
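
The build arguments consumed by this Dockerfile are supplied by the CI build step. A minimal Python sketch of assembling the same `docker build` invocation is shown below; `docker_build_argv` is a hypothetical helper and the example values are made up for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def docker_build_argv(
    git_sha: str,
    prefect_version: str,
    python_version: str,
    now: Optional[datetime] = None,
) -> list:
    """Assemble the docker build command line used by the CI build step."""
    now = now or datetime.now(timezone.utc)
    # Same timestamp layout as `date -u +'%Y-%m-%dT%H:%M:%SZ'`.
    build_date = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    python_tag = "python{}".format(python_version)
    return [
        "docker", "build",
        "--build-arg", "GIT_SHA={}".format(git_sha),
        "--build-arg", "BUILD_DATE={}".format(build_date),
        "--build-arg", "PREFECT_VERSION={}".format(prefect_version),
        "-t", "prefecthq/prefect:{}-{}".format(prefect_version, python_tag),
        "-t", "prefecthq/prefect:{}".format(python_tag),
        ".",
    ]


print(" ".join(docker_build_argv("abc123", "0.6.7", "3.7")))
```

The command in this diff does not pass PYTHON_VERSION through `--build-arg`, so the sketch does not either.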
61 changes: 43 additions & 18 deletions src/prefect/environments/storage/docker.py
@@ -2,6 +2,7 @@
import json
import logging
import os
import re
import shutil
import sys
import tempfile
@@ -63,14 +64,6 @@ def __init__(
) -> None:
self.registry_url = registry_url

if base_image is None:
python_version = "{}.{}".format(
sys.version_info.major, sys.version_info.minor
)
self.base_image = "python:{}".format(python_version)
else:
self.base_image = base_image

if sys.platform == "win32":
default_url = "npipe:////./pipe/docker_engine"
else:
@@ -79,19 +72,50 @@ def __init__(
self.image_name = image_name
self.image_tag = image_tag
self.python_dependencies = python_dependencies or []
self.python_dependencies.append("wheel")

self.env_vars = env_vars or {}
self.env_vars["PREFECT__USER_CONFIG_PATH"] = "/root/.prefect/config.toml"

self.files = files or {}
self.flows = dict() # type: Dict[str, str]
self._flows = dict() # type: Dict[str, "prefect.core.flow.Flow"]
self.base_url = base_url or default_url
self.local_image = local_image
self.extra_commands = [] # type: List[str]

version = prefect.__version__.split("+")
if prefect_version is None:
self.prefect_version = "master" if len(version) > 1 else version[0]
else:
self.prefect_version = prefect_version

if base_image is None:
python_version = "{}.{}".format(
sys.version_info.major, sys.version_info.minor
)
if re.match(r"^[0-9]+\.[0-9]+\.[0-9]+$", self.prefect_version) is not None:
# note: this does not necessarily mean that we have built/pushed all previous prefect versions to dockerhub
self.base_image = "prefecthq/prefect:{}-python{}".format(
self.prefect_version, python_version
)
elif self.prefect_version == "master":
# use the latest image for the given python version
self.base_image = "prefecthq/prefect:python{}".format(python_version)
else:
# create an image from python:*-slim directly
self.base_image = "python:{}-slim".format(python_version)
self.extra_commands.extend(
[
"apt update && apt install -y gcc git && rm -rf /var/lib/apt/lists/*",
"pip install git+https://github.com/PrefectHQ/prefect.git@{}#egg=prefect[kubernetes]".format(
self.prefect_version
),
]
)
else:
self.base_image = base_image

not_absolute = [
file_path for file_path in self.files if not os.path.isabs(file_path)
]
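
The version and base image selection in the hunk above can be read as two small pure functions. The sketch below restates that logic outside the class so the three branches are easy to try; `resolve_prefect_version` and `choose_base_image` are hypothetical names, not functions in the Prefect codebase.

```python
import re
from typing import List, Optional, Tuple


def resolve_prefect_version(installed: str, requested: Optional[str] = None) -> str:
    """Use the requested version, else map a local build (x.y.z+...) to master."""
    if requested is not None:
        return requested
    parts = installed.split("+")
    return "master" if len(parts) > 1 else parts[0]


def choose_base_image(prefect_version: str, python_version: str) -> Tuple[str, List[str]]:
    """Return (base_image, extra_commands) following the branches in the diff."""
    if re.match(r"^[0-9]+\.[0-9]+\.[0-9]+$", prefect_version) is not None:
        # Released version: use the matching prebuilt image.
        image = "prefecthq/prefect:{}-python{}".format(prefect_version, python_version)
        return image, []
    if prefect_version == "master":
        # Latest image for the given Python version.
        return "prefecthq/prefect:python{}".format(python_version), []
    # Anything else: start from python:*-slim and install Prefect from git.
    extra = [
        "apt update && apt install -y gcc git && rm -rf /var/lib/apt/lists/*",
        "pip install git+https://github.com/PrefectHQ/prefect.git@{}#egg=prefect[kubernetes]".format(
            prefect_version
        ),
    ]
    return "python:{}-slim".format(python_version), extra


print(choose_base_image(resolve_prefect_version("0.6.7"), "3.7"))
print(choose_base_image(resolve_prefect_version("0.6.7+12.gabc"), "3.7"))
print(choose_base_image("my-branch", "3.7")[0])
```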
@@ -311,11 +335,11 @@ def create_dockerfile_object(self, directory: str = None) -> None:

with open(os.path.join(directory, "Dockerfile"), "w+") as dockerfile:

# Generate RUN pip install commands for python dependencies
pip_installs = ""
# Generate single pip install command for python dependencies
pip_installs = "RUN pip install "
if self.python_dependencies:
for dependency in self.python_dependencies:
pip_installs += "RUN pip install {}\n".format(dependency)
pip_installs += "{} ".format(dependency)

# Generate ENV variables to load into the image
env_vars = ""
@@ -355,6 +379,11 @@ def create_dockerfile_object(self, directory: str = None) -> None:
source="{}.flow".format(clean_name), dest=flow_location
)

# Write all extra commands that should be run in the image
extra_commands = ""
for cmd in self.extra_commands:
extra_commands += "RUN {}\n".format(cmd)

# Write a healthcheck script into the image
with open(
os.path.join(os.path.dirname(__file__), "_healthcheck.py"), "r"
@@ -369,28 +398,24 @@ def create_dockerfile_object(self, directory: str = None) -> None:
FROM {base_image}

RUN pip install pip --upgrade
RUN pip install wheel
{extra_commands}
{pip_installs}

RUN mkdir /root/.prefect/
RUN mkdir -p /root/.prefect/
{copy_flows}
COPY healthcheck.py /root/.prefect/healthcheck.py
{copy_files}

ENV PREFECT__USER_CONFIG_PATH="/root/.prefect/config.toml"
{env_vars}

# update version if base image already has prefect installed
RUN pip install -U git+https://github.com/PrefectHQ/prefect.git@{version}#egg=prefect[kubernetes]

RUN python /root/.prefect/healthcheck.py '[{flow_file_paths}]' '{python_version}'
""".format(
extra_commands=extra_commands,
base_image=self.base_image,
pip_installs=pip_installs,
copy_flows=copy_flows,
copy_files=copy_files,
env_vars=env_vars,
version=self.prefect_version,
flow_file_paths=", ".join(
['"{}"'.format(k) for k in self.flows.values()]
),
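
The Dockerfile fragments introduced in this diff (one combined pip install line and one RUN line per extra command) can be sketched as a standalone function. `render_fragments` is a hypothetical helper; unlike the code in the diff, it joins the dependencies without a trailing space.

```python
from typing import Iterable, List, Optional, Tuple


def render_fragments(
    python_dependencies: Optional[List[str]],
    extra_commands: Iterable[str],
) -> Tuple[str, str]:
    """Build the pip install line and the extra RUN lines for the Dockerfile."""
    # "wheel" is always appended, as in the constructor shown in the diff.
    deps = list(python_dependencies or []) + ["wheel"]
    # One RUN instruction for all dependencies instead of one per dependency.
    pip_installs = "RUN pip install " + " ".join(deps)
    extra = "".join("RUN {}\n".format(cmd) for cmd in extra_commands)
    return pip_installs, extra


pip_line, extra_lines = render_fragments(["numpy", "pandas"], ["apt update"])
print(pip_line)
print(extra_lines, end="")
```

Installing everything in one RUN instruction produces a single image layer for the dependencies rather than one layer per package.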