Reduce docker image size #1648

Merged
merged 10 commits into from
Oct 23, 2019
119 changes: 90 additions & 29 deletions .circleci/config.yml
@@ -1,11 +1,12 @@
version: 2
version: 2.1

references:
workspace_root: &workspace_root /tmp/workspace
attach_workspace: &attach_workspace
attach_workspace:
at: *workspace_root


jobs:
# ----------------------------------
# Check formatting
@@ -56,7 +57,7 @@ jobs:
# test a standard install of prefect
# is importable in python 3.5.2
# there was a typing bug in 3.5.2 that this attempts to catch
test-py352-import-prefect:
test_py352_import_prefect:
docker:
- image: python:3.5.2

@@ -74,7 +75,7 @@ jobs:
# test a standard install of prefect
# with all requirements pinned to their lowest allowed versions
# to ensure our requirements.txt file is accurate
test-lower-prefect:
test_lower_prefect:
docker:
- image: python:3.5.2

@@ -114,7 +115,7 @@ jobs:
# test a standard install of prefect
# this ensures we correctly capture all ImportError situations
# caused by many package dependency options
test-vanilla-prefect:
test_vanilla_prefect:
docker:
- image: python:3.6

@@ -147,7 +148,7 @@ jobs:
# Run unit tests in Python 3.5-3.7
# ----------------------------------

test-3.5:
test_35:
docker:
- image: python:3.5

@@ -176,7 +177,7 @@ jobs:
paths:
- coverage

test-3.6:
test_36:
docker:
- image: python:3.6
steps:
@@ -204,7 +205,7 @@ jobs:
paths:
- coverage

test-3.7:
test_37:
docker:
- image: python:3.7
steps:
@@ -226,7 +227,7 @@ jobs:
name: Run tests
command: pytest -rfEsx .

test-airflow:
test_airflow:
docker:
- image: continuumio/miniconda3:4.6.14
steps:
@@ -257,7 +258,7 @@ jobs:
paths:
- coverage

upload-coverage:
upload_coverage:
docker:
- image: python:3.6
steps:
@@ -266,43 +267,85 @@ jobs:
name: Upload Coverage
command: bash <(curl -s https://codecov.io/bash) -cF python -s "/tmp/workspace/coverage/"

build_image:
build_docker_image:
docker:
- image: docker
parameters:
python_version:
type: string
tag_latest:
type: boolean
default: false
environment:
PYTHON_VERSION: << parameters.python_version >>
PYTHON_TAG: python<< parameters.python_version >>
steps:
- setup_remote_docker
- checkout
- run:
name: Docker Build
command: docker build -t prefecthq/prefect .
# todo: is there a better way to ensure that this is a commit on master?
name: Master branch check
command: |
apk add git
if [[ $(git branch --contains $CIRCLE_SHA1 --points-at master | wc -l) -ne 1 ]]; then
echo "commit $CIRCLE_SHA1 is not a member of the master branch"
exit 1
fi
- setup_remote_docker:
docker_layer_caching: true
- run:
name: Build image
command: >-
docker build
--build-arg GIT_SHA=$CIRCLE_SHA1
--build-arg BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ')
--build-arg PREFECT_VERSION=$CIRCLE_TAG
-t prefecthq/prefect:${CIRCLE_TAG}-${PYTHON_TAG}
-t prefecthq/prefect:$PYTHON_TAG
.
- when:
condition: << parameters.tag_latest >>
steps:
- run:
name: Tag latest image
command: |
docker tag prefecthq/prefect:${CIRCLE_TAG}-${PYTHON_TAG} prefecthq/prefect:latest
- run:
name: Test image
command: |
docker run -dit prefecthq/prefect /bin/bash -c 'curl -fL0 https://raw.githubusercontent.com/PrefectHQ/prefect/master/examples/retries_with_mapping.py | python'
- run:
name: Authenticate with Docker Hub and push
name: Push versioned tags
command: |
docker login --username $DOCKER_HUB_USER --password $DOCKER_HUB_PW
docker push prefecthq/prefect
docker push prefecthq/prefect:${CIRCLE_TAG}-${PYTHON_TAG}
Member: I think we might want to finesse the logic here a little, because I suspect this will end up pushing an image with a messy git tag.

Author: I'm open to a better way to do it. I added CircleCI git tag filters here to help reduce the risk of an odd tag coming through.

Member: Hmm, yeah, but won't this job still run on master?

Author: Good call, I thought these were ANDed instead of ORed. I'll update...

Author: Interestingly enough, there isn't a good way to do this. @TylerWanner and I came up with this combo:

I'm open to a better approach, but this is all I've got until CircleCI supports ANDed filters.

Member: I think this is the only place I'm still hesitant; here's what I imagine happening:

  • a PR is merged to master
  • build_docker_image is run
  • CIRCLE_TAG is empty, because this isn't a tagged commit
  • either this step or the build step errors out in some unique way based on this

Everything else looks great though! And to be honest, we could always clean things up if that happens, but I want to make sure I understand this piece.

Author (@wagoodman, Oct 23, 2019): I don't believe it is possible to reach this step because of the workflow filter; however, I'm all on board for an extra measure of safety. I just added the `set -u` mentioned below, which will fail the build image step on empty env vars.
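
To make the concern in this thread concrete, here is a minimal Python sketch of the guard being discussed. It is not part of the PR: `image_tags` is a hypothetical helper that mirrors how the job combines `CIRCLE_TAG` and `PYTHON_TAG`, and it refuses to produce tags when the git tag is missing. Note that shell `set -u` aborts on unset variables only; this sketch is deliberately stricter and also rejects an empty string.

```python
def image_tags(circle_tag, python_version, tag_latest=False):
    """Return the image references the job would build and push.

    Hypothetical helper for illustration only. Raises ValueError when
    the git tag is missing or empty, so an untagged commit can never
    yield a reference such as "prefecthq/prefect:-python3.7".
    """
    if not circle_tag:
        raise ValueError("CIRCLE_TAG is empty; refusing to build an untagged image")

    python_tag = "python{}".format(python_version)
    tags = [
        "prefecthq/prefect:{}-{}".format(circle_tag, python_tag),
        "prefecthq/prefect:{}".format(python_tag),
    ]
    if tag_latest:
        # Mirrors the `tag_latest` job parameter in the config above.
        tags.append("prefecthq/prefect:latest")
    return tags
```

With an empty tag the function raises instead of returning a malformed reference, which is the failure mode the reviewer describes.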

docker push prefecthq/prefect:$PYTHON_TAG
- when:
condition: << parameters.tag_latest >>
steps:
- run:
name: Push latest tag
command: |
docker login --username $DOCKER_HUB_USER --password $DOCKER_HUB_PW
docker push prefecthq/prefect:latest

workflows:
version: 2

"Run tests":
jobs:
- test-3.5
- test-3.6
- test-3.7
- test-lower-prefect
- test-vanilla-prefect
- test-py352-import-prefect
- test-airflow
- upload-coverage:
- test_35
- test_36
- test_37
- test_lower_prefect
- test_vanilla_prefect
- test_py352_import_prefect
- test_airflow
- upload_coverage:
requires:
- test-3.5
- test-3.6
- test-vanilla-prefect
- test-airflow
- test_35
- test_36
- test_vanilla_prefect
- test_airflow

"Check code style and docs":
jobs:
@@ -312,7 +355,25 @@ workflows:

"Build docker images":
jobs:
- build_image:
- build_docker_image:
python_version: "3.5"
filters:
branches:
ignore: /.*/
tags:
only: /^[0-9]+\.[0-9]+\.[0-9]+$/
- build_docker_image:
python_version: "3.6"
filters:
branches:
ignore: /.*/
tags:
only: /^[0-9]+\.[0-9]+\.[0-9]+$/
- build_docker_image:
python_version: "3.7"
tag_latest: true
filters:
branches:
only: master
ignore: /.*/
tags:
only: /^[0-9]+\.[0-9]+\.[0-9]+$/
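
As a quick check of which git tags the `tags.only` filter above would accept, the same pattern can be exercised in Python. This is only a sketch: `is_release_tag` is a hypothetical name, and the example tags are made up for illustration. The pattern itself is copied from the workflow filter.

```python
import re

# Pattern copied from the `tags.only` filter in the workflow above.
RELEASE_TAG = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+$")


def is_release_tag(tag):
    """Return True only for plain MAJOR.MINOR.PATCH tags."""
    # fullmatch avoids Python's `$` also matching before a trailing newline.
    return tag is not None and RELEASE_TAG.fullmatch(tag) is not None


for candidate in ["0.7.0", "v0.7.0", "0.7.0rc1", "0.7", "10.20.30"]:
    print(candidate, is_release_tag(candidate))
```

Tags with a `v` prefix or a pre-release suffix are rejected, so under this filter only plain three-part version tags would trigger an image build.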
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -11,7 +11,8 @@ These changes are available in the [master branch](https://github.com/PrefectHQ/
### Enhancements

- Add the ability to delete task tag limits using the client - [#1622](https://github.com/PrefectHQ/prefect/pull/1622)
- Adds an "Ask for help" button with a link to the prefect.io support page [#1637](https://github.com/PrefectHQ/prefect/pull/1637)
- Adds an "Ask for help" button with a link to the prefect.io support page - [#1637](https://github.com/PrefectHQ/prefect/pull/1637)
- Reduces the size of the `prefecthq/prefect` Docker image by ~400MB, which is now the base Docker image used in Flows - [#1648](https://github.com/PrefectHQ/prefect/pull/1648)

### Task Library

22 changes: 17 additions & 5 deletions Dockerfile
@@ -1,8 +1,20 @@
ARG PYTHON_VERSION=3.6
ARG PYTHON_VERSION=${PYTHON_VERSION:-3.6}
FROM python:${PYTHON_VERSION}-slim

FROM python:${PYTHON_VERSION}
LABEL maintainer="help@prefect.io"
ARG GIT_POINTER=master
RUN apt update && apt install -y gcc git && rm -rf /var/lib/apt/lists/*

RUN pip install git+https://github.com/PrefectHQ/prefect.git@${GIT_POINTER}#egg=prefect[kubernetes]
ARG PREFECT_VERSION
RUN pip install git+https://github.com/PrefectHQ/prefect.git@${PREFECT_VERSION}#egg=prefect[kubernetes]
RUN mkdir /root/.prefect/

ARG GIT_SHA
ARG BUILD_DATE

LABEL maintainer="help@prefect.io"
LABEL io.prefect.python-version=${PYTHON_VERSION}
LABEL org.label-schema.schema-version="1.0"
LABEL org.label-schema.name="prefect"
LABEL org.label-schema.url="https://www.prefect.io/"
LABEL org.label-schema.version=${PREFECT_VERSION}
LABEL org.label-schema.vcs-ref=${GIT_SHA}
LABEL org.label-schema.build-date=${BUILD_DATE}
63 changes: 44 additions & 19 deletions src/prefect/environments/storage/docker.py
@@ -2,6 +2,7 @@
import json
import logging
import os
import re
import shutil
import sys
import tempfile
@@ -35,7 +36,8 @@ class Docker(Storage):

Args:
- registry_url (str, optional): URL of a registry to push the image to; image will not be pushed if not provided
- base_image (str, optional): the base image for this environment (e.g. `python:3.6`), defaults to `python:3.6`
- base_image (str, optional): the base image for this environment (e.g. `python:3.6`), defaults to the `prefecthq/prefect` image
matching your python version and prefect core library version used at runtime.
- python_dependencies (List[str], optional): list of pip installable dependencies for the image
- image_name (str, optional): name of the image to use when building, populated with a UUID after build
- image_tag (str, optional): tag of the image to use when building, populated with a UUID after build
@@ -63,14 +65,6 @@ def __init__(
) -> None:
self.registry_url = registry_url

if base_image is None:
python_version = "{}.{}".format(
sys.version_info.major, sys.version_info.minor
)
self.base_image = "python:{}".format(python_version)
else:
self.base_image = base_image

if sys.platform == "win32":
default_url = "npipe:////./pipe/docker_engine"
else:
@@ -79,19 +73,49 @@ def __init__(
self.image_name = image_name
self.image_tag = image_tag
self.python_dependencies = python_dependencies or []
self.python_dependencies.append("wheel")

self.env_vars = env_vars or {}
self.env_vars["PREFECT__USER_CONFIG_PATH"] = "/root/.prefect/config.toml"

self.files = files or {}
self.flows = dict() # type: Dict[str, str]
self._flows = dict() # type: Dict[str, "prefect.core.flow.Flow"]
self.base_url = base_url or default_url
self.local_image = local_image
self.extra_commands = [] # type: List[str]

version = prefect.__version__.split("+")
if prefect_version is None:
self.prefect_version = "master" if len(version) > 1 else version[0]
else:
self.prefect_version = prefect_version

if base_image is None:
python_version = "{}.{}".format(
sys.version_info.major, sys.version_info.minor
)
if re.match(r"^[0-9]+\.[0-9]+\.[0-9]+$", self.prefect_version) is not None:
self.base_image = "prefecthq/prefect:{}-python{}".format(
self.prefect_version, python_version
)
elif self.prefect_version == "master":
# use the latest image for the given python version
self.base_image = "prefecthq/prefect:python{}".format(python_version)
else:
# create an image from python:*-slim directly
self.base_image = "python:{}-slim".format(python_version)
self.extra_commands.extend(
[
"apt update && apt install -y gcc git && rm -rf /var/lib/apt/lists/*",
"pip install git+https://github.com/PrefectHQ/prefect.git@{}#egg=prefect[kubernetes]".format(
self.prefect_version
),
]
)
else:
self.base_image = base_image

not_absolute = [
file_path for file_path in self.files if not os.path.isabs(file_path)
]
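
The base image selection in the hunk above can be read as a small pure function. The sketch below restates that logic outside the class so it is easier to follow; `resolve_prefect_version` and `choose_base_image` are hypothetical names, not part of the `Docker` storage API, and the version strings in the usage lines are made-up examples.

```python
import re

RELEASE_RE = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+$")


def resolve_prefect_version(installed_version, prefect_version=None):
    """Pick the version to install, as the constructor does.

    A local version suffix (anything after "+") is treated as a
    development install and mapped to "master".
    """
    if prefect_version is not None:
        return prefect_version
    parts = installed_version.split("+")
    return "master" if len(parts) > 1 else parts[0]


def choose_base_image(prefect_version, python_version):
    """Return (base_image, extra_commands) for the generated Dockerfile."""
    if RELEASE_RE.fullmatch(prefect_version):
        # Released version: use the matching versioned image.
        image = "prefecthq/prefect:{}-python{}".format(prefect_version, python_version)
        return image, []
    if prefect_version == "master":
        # Latest image for the given Python version.
        return "prefecthq/prefect:python{}".format(python_version), []
    # Any other git ref: start from python:*-slim and install prefect from source.
    return "python:{}-slim".format(python_version), [
        "apt update && apt install -y gcc git && rm -rf /var/lib/apt/lists/*",
        "pip install git+https://github.com/PrefectHQ/prefect.git@{}#egg=prefect[kubernetes]".format(
            prefect_version
        ),
    ]


print(choose_base_image(resolve_prefect_version("0.7.0"), "3.6"))
print(choose_base_image(resolve_prefect_version("0.7.0+12.gabc123"), "3.7"))
```

Returning the extra commands alongside the image name keeps the three branches side by side: only the fallback branch needs to install prefect itself.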
@@ -311,11 +335,11 @@ def create_dockerfile_object(self, directory: str = None) -> None:

with open(os.path.join(directory, "Dockerfile"), "w+") as dockerfile:

# Generate RUN pip install commands for python dependencies
pip_installs = ""
# Generate single pip install command for python dependencies
pip_installs = "RUN pip install "
if self.python_dependencies:
for dependency in self.python_dependencies:
pip_installs += "RUN pip install {}\n".format(dependency)
pip_installs += "{} ".format(dependency)

# Generate ENV variables to load into the image
env_vars = ""
@@ -355,6 +379,11 @@ def create_dockerfile_object(self, directory: str = None) -> None:
source="{}.flow".format(clean_name), dest=flow_location
)

# Write all extra commands that should be run in the image
extra_commands = ""
for cmd in self.extra_commands:
extra_commands += "RUN {}\n".format(cmd)

# Write a healthcheck script into the image
with open(
os.path.join(os.path.dirname(__file__), "_healthcheck.py"), "r"
@@ -369,28 +398,24 @@ def create_dockerfile_object(self, directory: str = None) -> None:
FROM {base_image}

RUN pip install pip --upgrade
RUN pip install wheel
{extra_commands}
{pip_installs}

RUN mkdir /root/.prefect/
RUN mkdir -p /root/.prefect/
{copy_flows}
COPY healthcheck.py /root/.prefect/healthcheck.py
{copy_files}

ENV PREFECT__USER_CONFIG_PATH="/root/.prefect/config.toml"
{env_vars}

# update version if base image already has prefect installed
RUN pip install -U git+https://github.com/PrefectHQ/prefect.git@{version}#egg=prefect[kubernetes]

RUN python /root/.prefect/healthcheck.py '[{flow_file_paths}]' '{python_version}'
""".format(
extra_commands=extra_commands,
base_image=self.base_image,
pip_installs=pip_installs,
copy_flows=copy_flows,
copy_files=copy_files,
env_vars=env_vars,
version=self.prefect_version,
flow_file_paths=", ".join(
['"{}"'.format(k) for k in self.flows.values()]
),
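
The change to `create_dockerfile_object` collapses one `RUN pip install` per dependency into a single instruction and adds a `RUN` line per extra command. A minimal sketch of that rendering step follows; `render_install_lines` is a hypothetical helper, and unlike the PR's code it emits no `pip install` line when the dependency list is empty (in the PR the list always contains at least `wheel`, so the difference does not arise there).

```python
def render_install_lines(python_dependencies, extra_commands):
    """Render the RUN instructions that precede the flow COPY steps."""
    # One RUN per extra command, in order, as in the PR.
    lines = "".join("RUN {}\n".format(cmd) for cmd in extra_commands)
    # A single pip install for all dependencies, giving one image layer
    # instead of one layer per package.
    if python_dependencies:
        lines += "RUN pip install {}\n".format(" ".join(python_dependencies))
    return lines


print(render_install_lines(["wheel", "numpy"], ["apt update"]), end="")
```

Fewer `RUN` instructions mean fewer layers in the built image, which fits the stated goal of this PR.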