Faster image rebuilding when only model artifacts are updated #1199

bojiang · 2020-10-21T07:01:56Z

Motivation and Context

Slack thread:

Hi there! I assume bentoml is intended to create docker images that contain all of the dependencies. Are there any plans on injecting model weights dynamically after the docker is already created? Maybe it's already possible? This would dramatically reduce the size of the docker image when the model files are big.

There are two ways to achieve this:

1. Inject model weights directly into the container and hot reload the service without a restart.
2. Inject model weights into the image and restart a new container.

For [1], hot-updating weights depends heavily on each framework's API. Since bentoml supports more than one framework, supporting hot reload for only some of them would be inconsistent.

And [2] is practically the same as:

Build a new image with the new model weights without rebuilding the full image.

Also, model files are not much bigger than the so-called weights. In fact, weights account for 99% of the size of a saved model.
Take a TensorFlow saved model as an example. TensorFlow has both saved_model and checkpoint formats; from the official tf docs:

A SavedModel contains a complete TensorFlow program, including weights and computation. If you just want to save/load weights during training see the guide to training checkpoints.

If you save a tf model in both formats, you will find they are nearly the same size (the difference is about 100K).
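One way to check this size claim on your own model is to export it in both formats and compare the totals on disk. The helper below is a minimal sketch that only measures directories; it assumes each format was written to its own directory, and the function names and paths are hypothetical, not part of bentoml or TensorFlow.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def size_gap(saved_model_dir, checkpoint_dir):
    """Bytes by which the SavedModel export exceeds the checkpoint export."""
    return dir_size_bytes(saved_model_dir) - dir_size_bytes(checkpoint_dir)


# Hypothetical usage, after exporting the same model twice:
#   size_gap("/tmp/mnist_saved_model", "/tmp/mnist_checkpoint")
```

A small gap relative to the total supports treating the whole artifact directory as one layer instead of separating weights from the rest.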

Therefore I think leaving saved models (artifacts) as a separate layer is enough.

Description

Make the copy operation of artifacts (models in most cases) the last step in the docker build, thereby saving artifacts as a separate layer.

Also reduced image size by calling conda clean --all in bentoml-init.sh
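The effect of moving the artifact copy to the end can be illustrated with a toy model of layer caching. This is only a sketch: it assumes a layer's cache key depends on its parent layer, its instruction, and the content it copies, which approximates but does not reproduce Docker's real cache. The step names and digests are simplified placeholders, not the actual generated Dockerfile.

```python
import hashlib


def layer_keys(steps):
    """Return one cache key per step, chained through the parent key."""
    keys, parent = [], ""
    for instruction, content in steps:
        raw = f"{parent}|{instruction}|{content}".encode()
        parent = hashlib.sha256(raw).hexdigest()
        keys.append(parent)
    return keys


def rebuilt_steps(old_steps, new_steps):
    """List the instructions whose cache key differs between two builds."""
    old, new = layer_keys(old_steps), layer_keys(new_steps)
    return [step[0] for step, a, b in zip(new_steps, old, new) if a != b]


def build_steps(model_digest, artifacts_last):
    """Hypothetical, simplified build plans before and after this change."""
    env = "env-v1"  # placeholder digest of the environment files
    if artifacts_last:
        return [
            ("COPY environment files", env),
            ("RUN bentoml-init.sh", ""),
            ("COPY . /bento", f"{env}+{model_digest}"),
        ]
    return [
        ("COPY . /bento", f"{env}+{model_digest}"),
        ("RUN bentoml-init.sh", ""),
    ]


# Only the model changes between the two builds in each comparison.
before = rebuilt_steps(build_steps("model-a", False), build_steps("model-b", False))
after = rebuilt_steps(build_steps("model-a", True), build_steps("model-b", True))
print("old ordering rebuilds:", before)
print("new ordering rebuilds:", after)
```

In this model the old ordering invalidates the dependency installation step whenever the model changes, while the new ordering rebuilds only the final copy.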

(py36) git:bojiang(docker) ➜  bentoml docker history 76ff00c17411
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
76ff00c17411        26 seconds ago      /bin/sh -c #(nop)  CMD ["bentoml" "serve-gun…   0B
b6b0ff3a0f00        26 seconds ago      /bin/sh -c #(nop)  ENTRYPOINT ["docker-entry…   0B
d2a190b08ac2        26 seconds ago      /bin/sh -c #(nop)  EXPOSE 5000                  0B
8d15d9caf6a3        26 seconds ago      /bin/sh -c #(nop)  ENV PORT=5000                0B
98e5920dc1c2        26 seconds ago      /bin/sh -c #(nop) COPY dir:a13e2e0bc61e4515d…   1.86MB
c1c2781a9086        27 seconds ago      /bin/sh -c if [ -d /bento/bundled_pip_depend…   0B
77ed248c4a8f        27 seconds ago      /bin/sh -c #(nop) COPY multi:29273dad2100153…   537kB
6bf475223430        28 seconds ago      /bin/sh -c if [ -f /bento/bentoml-init.sh ];…   968MB
a80dbb5db084        12 minutes ago      /bin/sh -c chmod +x /bento/bentoml-init.sh /…   3.04kB
72e7263541db        12 minutes ago      /bin/sh -c #(nop) COPY file:268f29c046e12bc7…   699B
456f438a91e4        12 minutes ago      /bin/sh -c #(nop) WORKDIR /bento                0B
371730e1ccdf        12 minutes ago      /bin/sh -c #(nop) COPY multi:e78a88d8d2bd58a…   2.51kB
658d204a1d06        17 hours ago        /bin/sh -c #(nop)  ENV EXTRA_PIP_INSTALL_ARG…   0B
e43597df26ca        17 hours ago        /bin/sh -c #(nop)  ARG EXTRA_PIP_INSTALL_ARG…   0B
1685aadc028b        3 weeks ago         /bin/sh -c #(nop)  CMD ["bentoml" "serve-gun…   0B
<missing>           3 weeks ago         /bin/sh -c #(nop)  ENTRYPOINT ["entrypoint.s…   0B
<missing>           3 weeks ago         /bin/sh -c #(nop) COPY file:183fc6d06e11b722…   741B
<missing>           3 weeks ago         /bin/sh -c pip install bentoml==$BENTOML_VER…   167MB
<missing>           3 weeks ago         /bin/sh -c conda install pip python=$PYTHON_…   212MB
<missing>           3 weeks ago         /bin/sh -c #(nop)  ENV PYTHON_VERSION=3.6       0B
<missing>           3 weeks ago         /bin/sh -c #(nop)  ARG PYTHON_VERSION           0B
<missing>           3 weeks ago         /bin/sh -c #(nop)  ENV BENTOML_VERSION=0.9.0    0B
<missing>           8 weeks ago         /bin/sh -c #(nop)  ARG BENTOML_VERSION          0B
<missing>           8 weeks ago         /bin/sh -c conda update -n base -c defaults …   114MB
<missing>           4 months ago        /bin/sh -c apt-get update --fix-missing &&  …   184MB
<missing>           7 months ago        /bin/sh -c #(nop)  CMD ["/bin/bash"]            0B
<missing>           7 months ago        /bin/sh -c wget --quiet https://repo.anacond…   149MB
<missing>           7 months ago        /bin/sh -c apt-get update --fix-missing &&  …   210MB
<missing>           7 months ago        /bin/sh -c #(nop)  ENV PATH=/opt/conda/bin:/…   0B
<missing>           7 months ago        /bin/sh -c #(nop)  ENV LANG=C.UTF-8 LC_ALL=C…   0B
<missing>           7 months ago        /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>           7 months ago        /bin/sh -c #(nop) ADD file:e5a364615e0f69616…   69.2MB

This is the docker history of an image for a typical MNIST classifier service built with this change. The artifact copying now happens in the top layer, 98e5920dc1c2, which is small (1.86MB).
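To put the layer sizes from the history above in proportion, the SIZE column can be converted to bytes and compared. The sketch below assumes the column uses decimal units (kB, MB, GB); the helper name is hypothetical and the two values are copied from the listing.

```python
import re

# Assumed decimal multipliers for the units seen in `docker history` output.
_UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text):
    """Convert a size string such as '1.86MB' into a number of bytes."""
    match = re.fullmatch(r"([\d.]+)\s*(B|kB|MB|GB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit]


artifact_layer = parse_size("1.86MB")  # layer 98e5920dc1c2 (artifact COPY)
init_layer = parse_size("968MB")       # layer 6bf475223430 (bentoml-init.sh)
ratio = artifact_layer / init_layer
print(f"artifact layer is {ratio:.2%} of the dependency layer")
```

For this MNIST example the artifact layer is a small fraction of the dependency layer, so rebuilding only that layer is cheap.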

How Has This Been Tested?

Tested with an MNIST classifier example.
It only takes a few seconds to build newly trained models into docker images now (if the requirements don't change).

Types of changes

Breaking change (fix or feature that would cause existing functionality to change)
New feature and improvements (non-breaking change which adds/improves functionality)
Bug fix (non-breaking change which fixes an issue)
Code Refactoring (an internal change which is not user-facing)
Documentation
Test, CI, or build

Component(s) if applicable

BentoService (service definition, dependency management, API input/output adapters)
Model Artifact (model serialization, multi-framework support)
Model Server (micro-batching, dockerisation, logging, OpenAPI, instruments)
YataiService gRPC server (model registry, cloud deployment automation)
YataiService web server (nodejs HTTP server and web UI)
Internal (BentoML's own configuration, logging, utility, exception handling)
BentoML CLI

Checklist:

My code follows the bentoml code style; both the ./dev/format.sh and ./dev/lint.sh scripts have passed (instructions).
My change reduces project test coverage and requires unit tests to be added
I have added unit tests covering my code change
My change requires a change to the documentation
I have updated the documentation accordingly

codecov · 2020-10-21T07:03:53Z

Codecov Report

Merging #1199 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1199   +/-   ##
=======================================
  Coverage   66.05%   66.05%           
=======================================
  Files         135      135           
  Lines        8544     8544           
=======================================
  Hits         5644     5644           
  Misses       2900     2900

Impacted Files	Coverage Δ
bentoml/saved_bundle/templates.py	`100.00% <ø> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 25c319d...756ed75.

parano · 2020-10-21T17:56:35Z

bentoml/saved_bundle/templates.py

-# copy over model files
-COPY . /bento
+# copy over bundled_pip_dependencies
+COPY environment.yml bundled_pip_dependencies* /bento/


there is no need to copy environment.yml right?

bundled_pip_dependencies is only used for BentoML development - for installing local build of BentoML itself. This change will not benefit the end-user, nor the BentoML developers because the build cache will be invalidated if you are editing the BentoML source code.

> there is no need to copy environment.yml right?

We need it because bundled_pip_dependencies may not exist. Listing environment.yml as well prevents the COPY command from failing.

> bundled_pip_dependencies is only used for BentoML development - for installing local build of BentoML itself. This change will not benefit the end-user, nor the BentoML developers because the build cache will be invalidated if you are editing the BentoML source code.

I'm aware of that. It does actually help when bundled_pip_dependencies exists. Also, the main purpose of this PR is to make sure that the artifact copying is in the top layer.
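The reasoning about environment.yml in this thread can be sketched as a toy model of how COPY sources are resolved. This is an assumption-laden illustration, not Docker's implementation: it only mimics the case described in the reply, where a COPY whose patterns match nothing at all fails. The function name and error text are hypothetical, and real behaviour may differ between Docker versions.

```python
import fnmatch


def resolve_copy_sources(patterns, context_files):
    """Expand COPY source patterns against top-level build-context names.

    Raises FileNotFoundError when no pattern matches any file, which is
    the failure the extra environment.yml source is meant to avoid.
    """
    matched = []
    for pattern in patterns:
        matched.extend(sorted(fnmatch.filter(context_files, pattern)))
    if not matched:
        raise FileNotFoundError("no source files matched the COPY patterns")
    return matched


# A bundle without bundled_pip_dependencies (the normal end-user case).
context = ["environment.yml", "bentoml-init.sh", "requirements.txt"]
print(resolve_copy_sources(["environment.yml", "bundled_pip_dependencies*"], context))
```

With the optional pattern alone the model raises, while pairing it with a file that always exists lets the step succeed whether or not the directory is present.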

Faster image rebuilding when only model artifacts are updated

756ed75

parano reviewed Oct 21, 2020

parano merged commit 76ec5cf into bentoml:master Oct 22, 2020

bojiang deleted the docker branch October 22, 2020 05:24

pncnmnp pushed a commit to MLH-Fellowship/BentoML that referenced this pull request Nov 1, 2020

Faster image rebuilding when only model artifacts are updated (bentom…

854b0d3

…l#1199)

aarnphm pushed a commit to aarnphm/BentoML that referenced this pull request Jul 29, 2022

Faster image rebuilding when only model artifacts are updated (bentom…

724de92

…l#1199)
