Support for optional and alternative docker image builder for secure, smaller and faster image builds #138

@mumoshu

Hi @rhs, thanks a lot for maintaining this great project 👍

Before I dive into #127, I wanted to make sure I can secure another aspect of my deployment pipeline: the contents of docker images. #137 is a fix for it.

Problem

For example, many docker images for rails/npm/golang/etc. apps suffer from one or more of the problems below:

  1. Insecure image metadata
  • docker build --build-arg MY_SSH_PRIVATE_KEY=... is insecure. It leaves the build arg inside the resulting docker image metadata. Run docker inspect <image> | grep MY_SSH_PRIVATE_KEY and you'll see the secret data remaining in the metadata.
  2. Insecure image content
  • If you have e.g. ADD your_ssh_private_key ~/.ssh/ in your Dockerfile and you did not squash the layers after the build (so that you can keep leveraging docker caching), your image contains the ssh key.
  • Just run docker save -o foobar.tar foobar && tar -xf foobar.tar and grep through the extracted layers to see the key remaining there.
  3. Slow builds
  • If you (1) squashed the layers containing secrets or (2) used docker multi-stage builds and chose not to docker-push the intermediate images containing secrets for security reasons, you'll lose docker layer caching for that image.
  4. Unreadable, relatively complicated script to form a docker-image-build pipeline
  • To avoid these issues, you'd probably need a pipeline to (1) obtain any sources requiring your secret outside of docker build, either by running the package manager on your machine or in a docker container, and (2) add e.g. ADD node_modules ... or ADD vendor/bundle ... to your Dockerfile to populate the resulting app docker image w/ dependencies.
  • There is already prior art named openshift/source-to-image which looks useful for implementing the docker-image-build pipeline outlined above. It basically chains docker-builds and docker-runs to finally produce a secure, efficient docker image for your app.
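
To make the last item above a bit more concrete, here is a minimal sketch of such a two-step pipeline for a rails app. It is only an illustration under assumptions: the function names, the DRY_RUN switch, and the image names (my-builder, my-app) are hypothetical, and the builder image is assumed to already contain bundler, git and ssh.

```shell
#!/bin/sh
# Sketch only: run(), build_app_image(), DRY_RUN and the image names
# are hypothetical and not part of forge or any other tool.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# build_app_image <builder-image> <app-image>
build_app_image() {
  builder=$1
  app=$2

  # Step 1: resolve private dependencies in a throwaway container.
  # The ssh directory is bind-mounted read-only, so the key is never
  # written to an image layer or to image metadata.
  run docker run --rm \
    -v "$PWD:/app" -w /app \
    -v "$HOME/.ssh:/root/.ssh:ro" \
    "$builder" bundle install --deployment --path vendor/bundle

  # Step 2: a plain docker build. The Dockerfile is assumed to ADD
  # vendor/bundle and to contain no ARG or ADD lines for secrets.
  run docker build -t "$app" .
}

# Usage (prints the commands instead of running them):
#   DRY_RUN=1 build_app_image my-builder my-app
```

It works, but it is exactly the kind of extra scripting around docker build that I'd like to avoid maintaining per app.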

Why not docker multi-stage builds?

Docker multi-stage builds are great as long as what you need is public library deps only.

With multi-stage builds, you'll likely end up with two images at minimum.

CAUTION: There are multiple bad points in the example below. Do not use it for production images!

FROM ruby:2.3.5-alpine as builder

RUN apk --update add --virtual build-dependencies openssh git build-base ruby-dev openssl-dev libxml2-dev libxslt-dev \
    mysql-dev postgresql-dev libc-dev linux-headers nodejs tzdata

RUN echo 'gem: --no-document' > /etc/gemrc

RUN gem install bundler

## ADVANCED USE-CASE 1: add vendored gems
ADD myvendoredgem /app/myvendoredgem
ADD Gemfile /app/
ADD Gemfile.lock /app/

WORKDIR /app

# Either a private key or username/token pair would be enough
ARG SSH_PRIVATE_KEY
ARG GITHUB_USERNAME
ARG GITHUB_TOKEN

# If you want to authenticate against private git repos w/ the GH token, you'll need a git credential helper like this
ADD ci/git-credential-github-token /usr/local/bin
ADD ci/git-global-configs /usr/local/bin

RUN git-global-configs

RUN bundle config build.nokogiri --use-system-libraries

RUN mkdir -p /root/.ssh \
  && touch /root/.ssh/known_hosts \
  && ssh-keyscan github.com >> /root/.ssh/known_hosts \
  && \
    if [ ! -z "${SSH_PRIVATE_KEY}" ]; then \
      echo "using ssh private key for git-cloning..." \
      && echo "${SSH_PRIVATE_KEY}" > /root/.ssh/id_rsa \
      && chmod 400 /root/.ssh/id_rsa \
      && (set +e; ssh git@github.com; status=$?; if [ $status != 1 ]; then echo unexpected exit status: $status 1>&2; exit 1; fi; set -e); \
    fi \
  && echo running bundle install... \
  && bundle install -j3 --deployment --path vendor/bundle --without development test --no-cache \
  && du -sh vendor/bundle \
  && echo "Removing object files" \
  && find . -iname '*.o' -exec rm {} \; \
  && find . -iname '*.a' -exec rm {} \; \
  && du -sh vendor/bundle \
  && rm -rf /root/.ssh

FROM ruby:2.3.5-alpine as runner

ENV LANG ja_JP.UTF-8

RUN apk --update add tzdata imagemagick mariadb-dev libxml2 libxslt openssl ruby-bundler \
  && rm /usr/lib/libmysqld* \
  && apk del openssl-dev mariadb-client-libs mariadb-common

COPY --from=builder /app /app

ADD . /app
RUN chown -R nobody:nogroup /app  
USER nobody

WORKDIR /app

EXPOSE 8080

## Don't do this in production, of course! This is just a basic example to illustrate issues in multi-stage builds
CMD ["bundle", "exec", "rails", "s", "-p", "8080"]

What's the problem?

This:

ARG SSH_PRIVATE_KEY
ARG GITHUB_USERNAME
ARG GITHUB_TOKEN

is very very suspicious.

If you don't care much about fast docker builds, this is ok.
However, once you want to make builds fast and start docker-pushing the intermediate image (= builder) to a docker registry, you leak the secrets to the registry.

Where are the secrets? In the metadata of the docker image.

You can see the secrets remaining in the metadata by running docker build --build-arg GITHUB_USERNAME=yourgithubuser --build-arg GITHUB_TOKEN=yourpersonalaccesstoken and then docker inspect <the intermediate image> | grep yourpersonalaccesstoken.

ADD secrets instead of ARG?

The outcome is almost the same, except that you now end up leaving your secrets inside one of the docker image layers instead of in the image metadata.
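
If you want to check your own images for this, something like the following could help. It is a rough sketch, not a real scanner: scan_saved_image is a hypothetical helper, it assumes you have already extracted the output of docker save into a directory, and it only greps each layer archive for a text pattern (by default the "PRIVATE KEY" marker found in PEM-encoded keys).

```shell
#!/bin/sh
# Hypothetical helper: scan_saved_image <dir> [pattern]
# <dir> is assumed to hold an extracted `docker save` archive, e.g.:
#   mkdir foobar && docker save foobar | tar -x -C foobar
scan_saved_image() {
  dir=$1
  pattern=${2:-PRIVATE KEY}

  find "$dir" -type f | while IFS= read -r blob; do
    # Only layer blobs are tar archives; skip JSON manifests and configs.
    tar -tf "$blob" >/dev/null 2>&1 || continue

    # Stream the layer contents to stdout and look for the pattern.
    if tar -xOf "$blob" 2>/dev/null | grep -q -- "$pattern"; then
      echo "possible secret in layer: $blob"
    fi
  done
}

# Usage:
#   scan_saved_image foobar
#   scan_saved_image foobar yourpersonalaccesstoken
```

A hit only tells you which layer to look at; deleting the file in a later layer does not remove it from the earlier one.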

Proposed fix

I believe the third point can be addressed with forge + imagebuilder.

With imagebuilder, you can basically run docker build w/ volume mounts which may contain secrets. In other words, you can safely use the imagebuilder to run your package manager from your Dockerfile, which simplifies the pipeline.

imagebuilder also supports squashing to produce smaller images, and builds are faster because it doesn't upload a build context.
