PoC: Modify envbuilder to support build resumption from any layer #185

Closed
mafredri opened this issue May 14, 2024 · 2 comments
Assignees
Labels
spike Investigation to prove feasibility or validate an idea

Comments

@mafredri

This issue tracks the implementation of a PoC to validate the path forward for #128.

To make better use of the envbuilder cache, we want to extend envbuilder with support for build resumption from any previous layer, so that the container runtime's layer caching can be used to avoid file extraction overhead.

Put simply, where we currently do:

docker run -it --rm ghcr.io/coder/envbuilder:0.2.9
# build layer 1
# push ghcr.io/myorg/envbuilder-cache:aaa
# build layer 2
# push ghcr.io/myorg/envbuilder-cache:bbb
# build layer 3
# push ghcr.io/myorg/envbuilder-cache:ccc
# start

We want to be able to do this instead:

docker run -it --rm ghcr.io/myorg/envbuilder-cache:bbb
# resuming build from layer 2
# build new layer 3
# push ghcr.io/myorg/envbuilder-cache:ddd
# start
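
The resumption idea above can be sketched in Go. This is a minimal illustration, not envbuilder's or Kaniko's actual logic: it assumes each layer's cache key is a hash chained over the previous key and the build instruction (Kaniko's real key derivation takes more inputs), and `layerKeys` and `resumeIndex` are hypothetical names introduced only for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// layerKeys returns one chained key per instruction. Each key hashes the
// previous key together with the instruction text, so changing instruction N
// changes the keys of layer N and every later layer, but no earlier one.
// (Assumed scheme for illustration only.)
func layerKeys(instructions []string) []string {
	keys := make([]string, 0, len(instructions))
	prev := ""
	for _, ins := range instructions {
		sum := sha256.Sum256([]byte(prev + "\n" + ins))
		prev = hex.EncodeToString(sum[:])
		keys = append(keys, prev)
	}
	return keys
}

// resumeIndex returns how many leading layers are present in the cache.
// A build would resume after that many layers; 0 means build from scratch.
func resumeIndex(keys []string, cached map[string]bool) int {
	n := 0
	for _, k := range keys {
		if !cached[k] {
			break
		}
		n++
	}
	return n
}

func main() {
	// First build: all three layers end up in the cache (aaa, bbb, ccc above).
	oldBuild := layerKeys([]string{"FROM ubuntu", "RUN apt-get update", "RUN make v1"})
	cached := map[string]bool{}
	for _, k := range oldBuild {
		cached[k] = true
	}

	// Second build: only the third instruction changed, so the build can
	// resume from layer 2 and produce a new third layer (ddd above).
	newBuild := layerKeys([]string{"FROM ubuntu", "RUN apt-get update", "RUN make v2"})
	fmt.Println("layers reusable from cache:", resumeIndex(newBuild, cached))
}
```

With the sample instructions, two layers are reusable, which corresponds to starting from the `bbb` image and building only a new layer 3.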

Note: In isolation, this may not improve performance. The benefit depends on the container runtime having previously extracted some or all of the cached layers that we're resuming from.

mafredri self-assigned this May 14, 2024
coder-labeler bot added the enhancement and spike (Investigation to prove feasibility or validate an idea) labels May 14, 2024
mafredri added a commit that referenced this issue May 17, 2024
This commit enables pushing the final image to the given cache repo. As
part of #185, the goal is to allow for faster startup when the image has
already been built.
@mafredri

mafredri commented May 17, 2024

Turns out this was not entirely trivial as cache layers uploaded by Kaniko can't be run as images. This does make sense as it's pretty much the same with Docker.

For now, we'll just enable uploading of the final image as per #197. This should allow us to start from the complete image once #186 is implemented, given that it's been built.

In future, it may be possible to allow bootstrapping from any layer by modifying Kaniko slightly to create/upload intermediate images to the registry. A quick hack that enabled layers to be run with docker run (this broke the final image, mind you) looked like this:

diff --git pkg/executor/build.go pkg/executor/build.go
index 8c1f353f..d2b6063d 100644
--- pkg/executor/build.go
+++ pkg/executor/build.go
@@ -416,6 +417,9 @@ func (s *stageBuilder) build() error {
                        if err := s.saveLayerToImage(layer, command.String()); err != nil {
                                return errors.Wrap(err, "failed to save layer")
                        }
+                       if err := s.opts.DoPush(s.image, s.opts); err != nil {
+                               return errors.Wrap(err, "failed to push layer")
+                       }
                } else {
                        tarPath, err := s.takeSnapshot(files, command.ShouldDetectDeletedFiles())
                        if err != nil {
@@ -441,6 +445,9 @@ func (s *stageBuilder) build() error {
                        if err := s.saveSnapshotToImage(command.String(), tarPath); err != nil {
                                return errors.Wrap(err, "failed to save snapshot to image")
                        }
+                       if err := s.opts.DoPush(s.image, s.opts); err != nil {
+                               return errors.Wrap(err, "failed to push layer")
+                       }
                }
        }

mafredri added a commit that referenced this issue May 30, 2024
This commit enables pushing the final image to the given cache repo. As
part of #185, the goal is to allow for faster startup when the image has
already been built.
@mafredri

mafredri commented Jun 1, 2024

The conclusion from this PoC is that:

  1. Resumption from the build layer cache is not possible by default. (A cached layer lacks data, such as the architecture, that Docker needs to run it.)
  2. It is possible to modify Kaniko to turn a cached build layer into a complete image, but more investigation into how to do it properly without affecting the final image is needed.
  3. When we enable pushing (feat: push final image to cache repo #197), the complete image can be started, but as-is it can't be used to resume envbuilder startup.
    • Can be fixed by sanitizing the Docker image (USER root, etc.) and bundling the envbuilder binary.
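
Point 1 can be illustrated with a small check on an image configuration. This is a sketch under assumptions, not what Docker does internally: it relies on the OCI image configuration format, where `architecture` and `os` are required fields, and only architecture is named in the findings above. `missingPlatformFields` is a hypothetical helper and the JSON samples are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageConfig holds only the image configuration fields this sketch inspects.
type imageConfig struct {
	Architecture string `json:"architecture"`
	OS           string `json:"os"`
}

// missingPlatformFields reports which of the checked fields are absent or
// empty in the given image configuration JSON.
func missingPlatformFields(raw []byte) ([]string, error) {
	var cfg imageConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parse image config: %w", err)
	}
	var missing []string
	if cfg.Architecture == "" {
		missing = append(missing, "architecture")
	}
	if cfg.OS == "" {
		missing = append(missing, "os")
	}
	return missing, nil
}

func main() {
	// Made-up configs: one resembling a bare cached layer, one a complete image.
	cachedLayer := []byte(`{"rootfs": {"type": "layers"}}`)
	finalImage := []byte(`{"architecture": "amd64", "os": "linux"}`)

	for name, raw := range map[string][]byte{"cached layer": cachedLayer, "final image": finalImage} {
		missing, err := missingPlatformFields(raw)
		if err != nil {
			fmt.Println(name, "error:", err)
			continue
		}
		fmt.Printf("%s: %d missing field(s) %v\n", name, len(missing), missing)
	}
}
```

A check like this could tell apart a cached layer from an image that is complete enough to start, before any attempt to resume from it.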
