
podvm-mkosi: use toolchain from nixpkgs #1523

Merged

Conversation

@katexochen (Contributor)

If we want to have reproducible OS image builds, we need to use the toolchain from nixpkgs. The mkosi package in nixpkgs uses a patched version of systemd (with commits that are already upstream but will only be part of the next release). We have also fixed many of the tools mkosi uses so that their output is reproducible.
I maintain the mkosi package in nixpkgs with a colleague. We have nightly tests running on nixpkgs master + mkosi main to ensure things stay reproducible and working.

The use of Nix in general is discussed in #1516. Nevertheless, as I already have this ready, I wanted to open it at least as a draft. I don't think using Nix must be an all-or-nothing decision, and I don't think there is currently a way around using it for what it is used for in this PR if we want reproducible images.
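
In CI, the idea is simply to enter the pinned dev shell and run the mkosi build inside it. A simplified sketch of such a job (the step names, the flake path and the make target are placeholders, not the exact workflow file added in this PR):

jobs:
  build-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Nix
        uses: cachix/install-nix-action@v22
        with:
          extra_nix_config: |
            experimental-features = nix-command flakes
      - name: Build nix shell to cache dependencies
        # Entering the dev shell once fetches the pinned mkosi/systemd toolchain into the Nix store.
        run: nix develop ./podvm-mkosi --command true
      - name: Build image
        # The actual image build runs inside the same pinned environment.
        run: nix develop ./podvm-mkosi --command make image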

@katexochen added the "podvm" (Related to podvm images) label on Oct 16, 2023
@katexochen marked this pull request as ready for review on October 19, 2023 06:31
@katexochen force-pushed the feat/mkosi-builds-nix branch 2 times, most recently from 23a3b64 to b90d180 on November 18, 2023 10:22
Adding a Nix flake to pin the toolchain and dependencies for building OS images
with mkosi.

Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
This commit switches the mkosi image build process to use a Nix
environment instead of Docker.

Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
Adding a workflow to build podvm images with mkosi in the CI.

Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
@mkulke (Contributor) left a comment

LGTM. Built an image and tested it in a peerpod deployment.

I didn't test the GH action code. I'm not entirely sure about the build-space issue on a worker: locally my podvm-mkosi folder consumes 2.5G, and this includes the vhd copy. I suspected docker build leftovers, but we should be on a fresh node 🤔

@bpradipt (Member)

I tried the gh workflow using nektos/act and it failed during the build image step.
Binaries got built successfully.

warning: Git tree '/home/ubuntu/cloud-api-adaptor' is dirty
error: getting status of '/nix/store/7pqci4rwn43ng88610r6a79xxx1j2irm-source/flake.nix': No such file or directory
[Create a Pod VM image with mkosi/Build image   ]   ❌  Failure - Main Build nix shell to cache dependencies
[Create a Pod VM image with mkosi/Build image   ] exitcode '1': failure
[Create a Pod VM image with mkosi/Build image   ] ⭐ Run Post Install Nix
[Create a Pod VM image with mkosi/Build image   ]   🐳  docker cp src=/home/ubuntu/.cache/act/cachix-install-nix-action@v22/ dst=/var/run/act/actions/cachix-install-nix-action@v22/
[Create a Pod VM image with mkosi/Build image   ]   ✅  Success - Post Install Nix
[Create a Pod VM image with mkosi/Build image   ] 🏁  Job failed
Error: Job 'Build image' failed

@bpradipt (Member)

Another question on the gh-action: is it possible to reuse pre-built binaries for the podvm "build image" job? In my local env, re-running the "build image" job results in rebuilding the binaries, and it took roughly 30 min to build the binaries on a VM with 2 vCPUs and 16 GB mem.

@katexochen (Contributor, Author)

I tried the gh workflow using nektos/act and it failed during the build image step. Binaries got built successfully.

I haven't tried this with act, so I'm not sure what caused the error you observed. If you want to test this workflow, you can just trigger it on your fork. Container images are automatically pushed to the registry of the repo in which the workflow is executed. Here is a successful run on my fork: https://github.com/katexochen/cloud-api-adaptor/actions/runs/6928972339

@katexochen (Contributor, Author)

Another question on the gh-action: is it possible to reuse pre-built binaries for the podvm "build image" job? In my local env, re-running the "build image" job results in rebuilding the binaries, and it took roughly 30 min to build the binaries on a VM with 2 vCPUs and 16 GB mem.

Yes, this can definitely be optimized. Should we just add an input for the image name? Then we can skip the first job if the image is passed as input.
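
Roughly what I have in mind (a sketch; the input name and the skip condition are the important parts, the rest is simplified):

on:
  workflow_dispatch:
    inputs:
      binaries-image:
        description: "Existing binaries image to reuse; leave empty to build it"
        required: false
        default: ""

jobs:
  build-binaries:
    name: Build binaries
    # Skipped entirely when a prebuilt binaries image is passed in.
    if: ${{ github.event.inputs.binaries-image == '' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "build the binaries image here"  # placeholder for the real build steps

The image job would then use the input when it is set, and the image produced by build-binaries otherwise.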

As we've already discussed a few times, in the mid-term we should get rid of the binaries container image build step, build the binaries separately, and cache them.

@bpradipt (Member)

Another question on the gh-action: is it possible to reuse pre-built binaries for the podvm "build image" job? In my local env, re-running the "build image" job results in rebuilding the binaries, and it took roughly 30 min to build the binaries on a VM with 2 vCPUs and 16 GB mem.

Yes, this can definitely be optimized. Should we just add an input for the image name? Then we can skip the first job if the image is passed as input.

Yeah, this might be helpful.

As we've already discussed a few times, in the mid-term we should get rid of the binaries container image build step, build the binaries separately, and cache them.

Would it make sense to split the workflow into two - one for building the binaries and another for the podvm image?

@katexochen (Contributor, Author)

Would it make sense to split the workflow into two - one for building the binaries and another for the podvm image?

Not sure about this; that would place the burden of managing cache invalidation on the developer, which hasn't been working out well with the old image pipeline, where we would only update these images once or twice a release cycle.

This reduces the build time if you have to rebuild often and know
that the binaries haven't changed.

Signed-off-by: Paul Meyer <49727155+katexochen@users.noreply.github.com>
@bpradipt (Member)

Would it make sense to split the workflow into two - one for building the binaries and another for the podvm image?

Not sure about this; that would place the burden of managing cache invalidation on the developer, which hasn't been working out well with the old image pipeline, where we would only update these images once or twice a release cycle.

I see..

@katexochen (Contributor, Author)

Workflow run with prebuilt binaries image: https://github.com/katexochen/cloud-api-adaptor/actions/runs/7043741095/job/19170078950

jobs:
  build-binaries:
    name: Build binaries
    if: ${{ github.event.inputs.binaries-image == '' }}
Member

nice.. thanks for adding this.

@bpradipt (Member) left a comment

/lgtm
Thanks @katexochen

@katexochen merged commit 8881b3a into confidential-containers:main on Nov 30, 2023
21 checks passed
@katexochen deleted the feat/mkosi-builds-nix branch on November 30, 2023 08:26