feat: implement reproducible build and get cached image #213

Merged
merged 15 commits into from
Jun 12, 2024

@mafredri mafredri commented May 30, 2024

This PR is a PoC. It builds upon #197 and coder/kaniko#12 and shows how the build cache and a pushed image can be used together.

Example get-cached-image:

❯ registry_ip=$(docker inspect envbuilder-registry | jq -r '.[].NetworkSettings.IPAddress'); docker run -it --rm -v /tmp/envbuilder:/workspaces -e ENVBUILDER_CACHE_REPO=${registry_ip}:5000/local/cache -e ENVBUILDER_GIT_URL=https://github.com/coder/envbuilder-starter-devcontainer -e ENVBUILDER_INIT_SCRIPT=exit -e ENVBUILDER_GET_CACHED_IMAGE=true envbuilder:latest
envbuilder - Build development environments from repositories in a container
#1: 📦 Cloning https://github.com/coder/envbuilder-starter-devcontainer to /workspaces/envbuilder-starter-devcontainer...
...
2: RUN apt-get update
2: unable to build fake image: error fake building stage: uncached command *commands.RunMarkerCommand is not supported in fake build

Now, do build:

❯ registry_ip=$(docker inspect envbuilder-registry | jq -r '.[].NetworkSettings.IPAddress'); docker run -it --rm -v /tmp/envbuilder:/workspaces -e ENVBUILDER_CACHE_REPO=${registry_ip}:5000/local/cache -e ENVBUILDER_GIT_URL=https://github.com/coder/envbuilder-starter-devcontainer -e ENVBUILDER_INIT_SCRIPT=exit -e ENVBUILDER_PUSH_IMAGE=true envbuilder:latest
envbuilder - Build development environments from repositories in a container
#1: 📦 Cloning https://github.com/coder/envbuilder-starter-devcontainer to /workspaces/envbuilder-starter-devcontainer...
...
2: Running: [/bin/sh -c echo "PS1='🐳 \[\033[1;36m\]\h \[\033[1;34m\]\W\[\033[0;35m\] \[\033[1;36m\]# \[\033[0m\]'" > ~/.bashrc]
2: Taking snapshot of files...
2: Pushing layer 172.17.0.3:5000/local/cache:445691a7ed1a9ce9faac28095b3f3b77a0043c7604b099c299bd4abed5089f57 to cache now
2: Pushing image to 172.17.0.3:5000/local/cache:445691a7ed1a9ce9faac28095b3f3b77a0043c7604b099c299bd4abed5089f57
2: Pushed 172.17.0.3:5000/local/cache@sha256:440a1a035eb3ba7597aa7b3f1b5e08c67c3c5afed33bea057dd18640420ca115
#2: 🏗️ Built image! [16.509477016s]
#3: 🏗️ Pushing image...
3: Pushing image to 172.17.0.3:5000/local/cache
3: Pushed 172.17.0.3:5000/local/cache@sha256:9c102cd02585011898abd11ea367ba5368415b9cd9ecebbb07dcbb57f6e4cd63
#3: 🏗️ Pushed image! [1.003674439s]
=== Running the init command /bin/sh [-c exit] as the "coder" user...

Try get-cached-image again:

❯ registry_ip=$(docker inspect envbuilder-registry | jq -r '.[].NetworkSettings.IPAddress'); docker run -it --rm -v /tmp/envbuilder:/workspaces -e ENVBUILDER_CACHE_REPO=${registry_ip}:5000/local/cache -e ENVBUILDER_GIT_URL=https://github.com/coder/envbuilder-starter-devcontainer -e ENVBUILDER_INIT_SCRIPT=exit -e ENVBUILDER_GET_CACHED_IMAGE=true envbuilder:latest
envbuilder - Build development environments from repositories in a container
#1: 📦 Cloning https://github.com/coder/envbuilder-starter-devcontainer to /workspaces/envbuilder-starter-devcontainer...
...
#2: 🏗️ Building fake image...
2: Retrieving image manifest ubuntu
2: Retrieving image ubuntu from registry index.docker.io
2: Built cross stage deps: map[]
2: Retrieving image manifest ubuntu
2: Returning cached image manifest
2: Executing 0 build triggers
2: Checking for cached layer 172.17.0.3:5000/local/cache:bd0def545b590c39144a57152c120a5baf8e3cad7c70298bbacfa5116ef1ca3c...
2: Using caching version of cmd: RUN apt-get update
2: Checking for cached layer 172.17.0.3:5000/local/cache:591f8c35a55cc60f7bc49fa3f2c99360616a9b6b35908b8aaa7392e7820b0c36...
2: Using caching version of cmd: RUN apt-get install vim sudo curl -y
2: Checking for cached layer 172.17.0.3:5000/local/cache:1cb1d8b00537df6f0638913717931b76e7b80b8c150448ce4096f4eef042b355...
2: Using caching version of cmd: RUN useradd -m -s /bin/bash -G sudo coder
2: Cmd: USER
2: Checking for cached layer 172.17.0.3:5000/local/cache:445691a7ed1a9ce9faac28095b3f3b77a0043c7604b099c299bd4abed5089f57...
2: Using caching version of cmd: RUN echo "PS1='🐳 \[\033[1;36m\]\h \[\033[1;34m\]\W\[\033[0;35m\] \[\033[1;36m\]# \[\033[0m\]'" > ~/.bashrc
2: RUN apt-get update
2: Found cached layer, faking extraction to filesystem
2: RUN apt-get install vim sudo curl -y
2: Found cached layer, faking extraction to filesystem
2: RUN useradd -m -s /bin/bash -G sudo coder
2: Found cached layer, faking extraction to filesystem
2: USER coder
2: Cmd: USER
2: No files changed in this command, skipping snapshotting.
2: RUN echo "PS1='🐳 \[\033[1;36m\]\h \[\033[1;34m\]\W\[\033[0;35m\] \[\033[1;36m\]# \[\033[0m\]'" > ~/.bashrc
2: Found cached layer, faking extraction to filesystem
2: Mapping stage idx 0 to digest sha256:3ff463dd27a13ebcdfc74fc790e7a8da0b5373110b4edd1dc1426c4b4ed80e27
2: Mapping digest sha256:3ff463dd27a13ebcdfc74fc790e7a8da0b5373110b4edd1dc1426c4b4ed80e27 to cachekey 445691a7ed1a9ce9faac28095b3f3b77a0043c7604b099c299bd4abed5089f57
#2: 🏗️ Built fake image! [1.298919769s]
172.17.0.3:5000/local/cache@sha256:9c102cd02585011898abd11ea367ba5368415b9cd9ecebbb07dcbb57f6e4cd63

And finally:

❯ docker run -it --rm --entrypoint /usr/bin/bash localhost:5000/local/cache@sha256:9c102cd02585011898abd11ea367ba5368415b9cd9ecebbb07dcbb57f6e4cd63
Unable to find image 'localhost:5000/local/cache@sha256:9c102cd02585011898abd11ea367ba5368415b9cd9ecebbb07dcbb57f6e4cd63' locally
localhost:5000/local/cache@sha256:9c102cd02585011898abd11ea367ba5368415b9cd9ecebbb07dcbb57f6e4cd63: Pulling from local/cache
49b384cc7b4a: Already exists
4e5c7f007bcd: Pull complete
b3cda1764760: Pull complete
62220e490978: Pull complete
07141b433862: Pull complete
Digest: sha256:9c102cd02585011898abd11ea367ba5368415b9cd9ecebbb07dcbb57f6e4cd63
Status: Downloaded newer image for localhost:5000/local/cache@sha256:9c102cd02585011898abd11ea367ba5368415b9cd9ecebbb07dcbb57f6e4cd63
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.

🐳 e54cf491d1a7 / #

Closes #186.
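
The probe, build, probe sequence in the logs above can be sketched in a few lines of Go. This is only an illustration under loud assumptions: the registry is an in-memory map, and `cacheKey`, `probe`, and `build` are hypothetical helpers. The key derivation (a SHA-256 chained over the command text) is a stand-in and is not how kaniko computes its cache keys.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKey chains the previous key with the command text.
// Hypothetical stand-in: a real builder would also account for things
// such as file contents and the base image.
func cacheKey(prev, cmd string) string {
	sum := sha256.Sum256([]byte(prev + "\n" + cmd))
	return hex.EncodeToString(sum[:])
}

// probe walks the commands without executing anything. It returns the
// final key only if every layer is already in the registry, and fails
// on the first miss.
func probe(base string, cmds []string, registry map[string]bool) (string, error) {
	key := base
	for _, cmd := range cmds {
		key = cacheKey(key, cmd)
		if !registry[key] {
			return "", fmt.Errorf("cache miss for %q", cmd)
		}
	}
	return key, nil
}

// build pretends to run each command and records its layer in the registry.
func build(base string, cmds []string, registry map[string]bool) string {
	key := base
	for _, cmd := range cmds {
		key = cacheKey(key, cmd)
		registry[key] = true
	}
	return key
}

func main() {
	registry := map[string]bool{}
	cmds := []string{"RUN apt-get update", "RUN apt-get install vim sudo curl -y"}

	// First probe: nothing is cached yet, so it fails on the first command.
	if _, err := probe("ubuntu", cmds, registry); err != nil {
		fmt.Println("first probe:", err)
	}

	// Build with pushing enabled, then probe again.
	built := build("ubuntu", cmds, registry)
	probed, err := probe("ubuntu", cmds, registry)
	fmt.Println("second probe matches build:", err == nil && probed == built)
}
```

The point of the sketch is the asymmetry: the probe never executes a command, so it can only succeed when every step resolves to a layer that an earlier build already pushed.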

This commit enables pushing the final image to the given cache repo. As
part of #185, the goal is to allow for faster startup when the image has
already been built.
// Boilerplate!
CustomPlatform: platforms.Format(platforms.Normalize(platforms.DefaultSpec())),
SnapshotMode: "redo",
RunV2: true,
RunStdout: stdoutWriter,
RunStderr: stderrWriter,
Destinations: []string{"local"},
Destinations: destinations,
NoPush: !options.PushImage || len(destinations) == 0,
Member

I'm assuming the reason for the logical disjunction here is that Kaniko will push all intermediate cache layers to destinations, while specifying options.PushImage signifies that only the final result should be pushed?

Member Author

Yep 👍🏻
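
To make the `NoPush` expression from the diff easier to reason about, here is a minimal truth-table sketch. The `noPush` helper and the registry address are hypothetical. Only the boolean expression itself comes from the diff.

```go
package main

import "fmt"

// noPush mirrors the expression in the diff: the final push is skipped
// unless pushing was requested AND at least one destination exists.
func noPush(pushImage bool, destinations []string) bool {
	return !pushImage || len(destinations) == 0
}

func main() {
	// Hypothetical cache repo, used only for illustration.
	repo := []string{"registry.example:5000/local/cache"}

	cases := []struct {
		pushImage    bool
		destinations []string
	}{
		{false, nil},
		{false, repo},
		{true, nil},
		{true, repo},
	}
	for _, c := range cases {
		fmt.Printf("pushImage=%v destinations=%d -> noPush=%v\n",
			c.pushImage, len(c.destinations), noPush(c.pushImage, c.destinations))
	}
}
```

Only the last case (push requested and a destination present) leaves `NoPush` false.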

})

// For cached image utilization, produce reproducible builds.
Reproducible: options.PushImage,
Contributor

Where is this getting used?

@kylecarbs
Member

I'd love to try this in dogfood asap with our devcontainer!

@kylecarbs
Member

And since we added support for sysbox, this might just work!

@johnstcn
Member

> I'd love to try this in dogfood asap with our devcontainer!

Samesies! Do you think we'd want a registry on each dogfood host or just one shared registry?

@kylecarbs
Member

I think we should try using the Docker Registry to see what the performance looks like with a remote!

envbuilder.go Outdated
Comment on lines 102 to 104
if options.CacheRepo == "" && options.PushImage {
	return fmt.Errorf("--cache-repo must be set when using --push-image!")
}
Member

review: I don't see any good reason to not do this. It prevents potential head-scratching if someone sets ENVBUILDER_PUSH_IMAGE without specifying ENVBUILDER_CACHE_REPO.

Member

dry run maybe?
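
The early check discussed in this thread can be exercised in isolation. The sketch below is a standalone approximation. The `options` struct and `validate` function are hypothetical reductions of envbuilder's real types, and the error is built with `errors.New` because no formatting is needed.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical subset of the real options struct.
type options struct {
	CacheRepo string
	PushImage bool
}

// validate rejects the one inconsistent combination up front:
// pushing requested but no cache repo to push to.
func validate(o options) error {
	if o.CacheRepo == "" && o.PushImage {
		return errors.New("--cache-repo must be set when using --push-image")
	}
	return nil
}

func main() {
	// Hypothetical registry address, used only for illustration.
	for _, o := range []options{
		{PushImage: true},
		{CacheRepo: "registry.example:5000/local/cache", PushImage: true},
		{},
	} {
		fmt.Printf("%+v -> %v\n", o, validate(o))
	}
}
```

Failing before any cloning or building starts means the misconfiguration surfaces immediately, not after a full build that then has nowhere to push.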

@johnstcn
Member

@dannykopping @BrunoQuaresma @mtojek I have updated this branch with integration tests.

It's worth noting that --get-cached-image currently does not work with multi-stage builds. I have filed a separate issue for that.

@mtojek mtojek self-requested a review June 11, 2024 16:53
Member

@mtojek mtojek left a comment


We agreed on DoCacheProbe in kaniko's PR, so this round is fairly quick. Approved 👍

Contributor

@dannykopping dannykopping left a comment


Only nits, great work!

})

t.Run("CacheAndPushMultistage", func(t *testing.T) {
// Currently fails with:
Contributor

Is there a way we can make this fail more gracefully? This output is rather difficult to reason about.

Member

There probably is, but I'd say we should handle it as part of #230

@johnstcn johnstcn merged commit eb01b08 into main Jun 12, 2024
4 checks passed
@johnstcn johnstcn deleted the mafredri/feat-fake-build branch June 12, 2024 10:12
Development

Successfully merging this pull request may close these issues.

PoC: Build a tool to validate file hashes against layer caches
6 participants