multi-kernel, cross-compiling, bash based Hook & (default+foreign) kernels build (incl GHA matrix) #205
Thanks @rpardini. Awesome OP. I've got a busy week so probably won't get to it until next weekend.
No problem, meanwhile I keep on working -- I just added a 2nd drop. 2nd drop of rpardini's take on multi-hook
- `showcase` is a chart that, based on values.yaml dictionaries:
  - generates Tinkerbell CRs (Hardware/Template/Workflow) for both standard (UEFI) & exotic (supported by Armbian) devices
  - generates download/process jobs for multiple Hook flavors (see tinkerbell/hook#205)
  - generates download/process jobs for a few OS images (Ubuntu Cloud Images, Armbian, etc)
  - should be independent of how one deployed Tinkerbell itself (stack chart, individual components, etc)
- A few features:
  - validates values.yaml for common mistakes; arch must match, etc.
  - validates & handles rootDisk differences (re-invents "formatPartition()" a bit)
  - avoids re-downloading Hooks and Images that are already on disk, even if Job re-runs
  - allows easy way to use
    - custom Hooks
    - custom Kernel cmdline parameters at both the Hook & device level
      - for example `acpi=off` at Hook level and `console=ttyS0` at board level
    - custom OS images for deployment
    - reboot or kexec to finish deployment
    - different partition numbers for OS image's rootfs (some images have ESP, some have a separate `/boot`, etc)
    - control if growpart and/or ssh/user setup is done during provisioning or not
    - conversion of OS images (`qemu-to-raw-gzip` and `xz-to-gz`)
  - has a "merge" mechanism with a common way to set parameters like net gateway, UEFI, etc (also easy to override per-device)
- default values have everything `enabled: false` thus showcase should produce nothing by default.
  - Hooks & Images can be forced `enabled: true` in values.yaml, or
  - `enabled: true` Devices automatically enable their Hook & Image
- Probably missing:
  - More validations
  - Currently pointing to my Tinkerbell Actions, which I haven't PR'ed yet
- How to use:
  - Clone it, edit the values.yaml to your liking, and deploy.

Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
Pushed: forced initial no-offset-limit NTP sync via busybox (fixes RaspberryPi & others without an RTC), support (WiP) for Also, an initial PR for the
Hey @rpardini. Thanks for this. I'm playing around with it and the x86_64 build works great. I have some non-technical concerns though. It's not clear how to cross-compile HookOS for aarch64. Also, it seems like there are some envs involved for different build.sh commands, but it isn't clear what all the available env options are and how to use them properly. As this is a significant change from the status quo, and in order for this to land and be maintainable, we're going to need to be able to understand this better. Can you provide docs for all functionality? Most users aren't going to need or want to rebuild kernels, so docs around all the options for building the final HookOS are the most important to me. Again, thanks for all this work! I think we really needed it and I want to see it land.
I think I found it.
Also, I see in the RFC.md that you mention
FYI, just got Hook 5.15 and 6.6 kernels built and booted into them! I'm liking this!
Super thanks for the review!
Definitely! Docs are a big challenge. I hope to massage the RFC.md into README.md over time. To customize a kernel:
Perfect. I'd like to keep the ability to cross-build on GH Hosted runners, for people without self-hosted runners. What I propose is adding environment variables like
9bedc43 to 328f5e8
77e3230 to 13b9bfd
Another large drop. I didn't squash this time, so I might have missed some sign-offs. Ref the GitHub Actions runners: see the same workflow running against an org with self-hosted arm64 runners, and another fork with just plain GH-hosted amd64 runners. Ended up with finer-grained control than proposed above.
In the last drop I added
build.sh
Outdated
declare -g HOOK_KERNEL_OCI_BASE="${HOOK_KERNEL_OCI_BASE:-"quay.io/tinkerbellrpardini/kernel-"}"
declare -g HOOK_LK_CONTAINERS_OCI_BASE="${HOOK_LK_CONTAINERS_OCI_BASE:-"quay.io/tinkerbellrpardini/linuxkit-"}"
These need to default to `quay.io/tinkerbell/`. Also, if there are new repos that don't exist in `quay.io/tinkerbell`, I'll need to create them. Do we maybe want different kernels to be image tags?
Yes, those were mostly examples/leftovers from my two setups.
Ref quay.io: I think you're facing the same dilemma as me: it creates new repos/images with "private" visibility by default?
We could, yes, twist this so the image is always the same and only the tag changes across kernel flavors.
I changed the default -- but didn't change the naming scheme.
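To make the two naming options from this thread concrete, here is a minimal bash sketch. `HOOK_KERNEL_OCI_BASE` comes from the snippet under review; the `quay.io/tinkerbell/kernel-` default is an assumption based on the request above, and the two helper functions, the `hook-kernel` repo name, and the version-in-tag layout are hypothetical, for illustration only.

```bash
#!/usr/bin/env bash
# Sketch only: compares "one repo per kernel flavor" with "one repo, one tag per flavor".
# The default below is an assumption based on the review request to use quay.io/tinkerbell/.
HOOK_KERNEL_OCI_BASE="${HOOK_KERNEL_OCI_BASE:-quay.io/tinkerbell/kernel-}"

# Scheme as in the reviewed snippet: the flavor id is appended to the base,
# so every flavor ends up in its own repository.
kernel_image_per_repo() {
	local flavor="$1" version="$2"
	echo "${HOOK_KERNEL_OCI_BASE}${flavor}:${version}"
}

# Hypothetical alternative: a single repository, with flavor and version folded into the tag.
kernel_image_per_tag() {
	local repo="$1" flavor="$2" version="$3"
	echo "${repo}:${flavor}-${version}"
}

kernel_image_per_repo "hook-default-amd64" "5.10.213"
kernel_image_per_tag "quay.io/tinkerbell/hook-kernel" "hook-default-amd64" "5.10.213"
```

With the tag-based variant only one repository would need to be created (and made public) on quay.io, at the cost of longer tags.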
Hey @rpardini, I believe I have found the issue with Docker not starting with Linuxkit v1.2.0. We need to enable cgroups v2 in the DinD container.
- name: hook-docker
  image: "${HOOK_CONTAINER_DOCKER_IMAGE}"
  capabilities:
    - all
  net: host
  pid: host
  mounts:
    - type: cgroup2
      options: [ "rw", "nosuid", "noexec", "nodev", "relatime" ]
      destination: /sys/fs/cgroup
# Apart from the quay/ghcr coordinates above (used for both pulling & pushing), we might also want to
# log in to DockerHub (with a read-only token) so we aren't hit by rate limits when pulling the linuxkit pkgs.
# To do so, set the secret DOCKER_USERNAME and DOCKER_PASSWORD in the repo secrets, and set the below to yes.
LOGIN_TO_DOCKERHUB: "${{ github.repository_owner == 'rpardini' && 'yes' || 'no' }}"
Sorry, the DockerHub login is ok, but the check for `rpardini` will need to be removed.
Definitely. I just wanted to exemplify the conditional -- not everyone will have it I guess.
Since I added LK caching (to GHA cache), the need for this is lower as well.
Removed conditional for this PR.
- name: Docker Login to DockerHub # read-only token, required to be able to pull all the linuxkit pkgs without getting rate limited.
  if: ${{ env.LOGIN_TO_DOCKERHUB == 'yes' }}
  uses: docker/login-action@v3
  with: { registry: "docker.io", username: "${{ secrets.DOCKER_USERNAME }}", password: "${{ secrets.DOCKER_PASSWORD }}" }
FYI, I've added DockerHub read-only access creds for a Tinkerbell org account: `DOCKERHUB_USERNAME` and `DOCKERHUB_PASSWORD`.
Nice!
Set the default to use them and removed the conditional.
…der `cache` dir - also: ensure our cwd is always the dir containing build.sh - overridable via `CACHE_DIR` environment variable - also move kernel-releases.json into cache dir - refresh disk-cached kernel-releases.json if existing file is older than 120 minutes Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
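The 120-minute refresh described in this commit could be sketched roughly as follows. This is not build.sh's actual code: the function names are hypothetical, the staleness check assumes a `find` that supports `-mmin`, and the download URL (kernel.org's public `releases.json`) is an assumption about where the file comes from.

```bash
#!/usr/bin/env bash
# Sketch only: function names and the download step are assumptions, not build.sh's actual code.
CACHE_DIR="${CACHE_DIR:-"$(pwd)/cache"}" # overridable via the environment, as in the commit

# Returns 0 (stale) if the file is missing or was modified more than <max_age_minutes> ago.
cache_file_is_stale() {
	local file="$1" max_age_minutes="$2"
	[[ -f "${file}" ]] || return 0
	# find prints the path only when its mtime is older than the limit
	[[ -n "$(find "${file}" -mmin "+${max_age_minutes}" 2> /dev/null)" ]]
}

refresh_kernel_releases_json() {
	local target="${CACHE_DIR}/kernel-releases.json"
	mkdir -p "${CACHE_DIR}"
	if cache_file_is_stale "${target}" 120; then
		echo "Refreshing ${target}" >&2
		# Download to a temp file first so a failed download never clobbers a good cache.
		curl -sSfL "https://www.kernel.org/releases.json" -o "${target}.tmp" &&
			mv "${target}.tmp" "${target}"
	else
		echo "Using cached ${target}" >&2
	fi
}
```

Calling `CACHE_DIR=/tmp/hook-cache refresh_kernel_releases_json` twice in a row should hit the network only the first time.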
…ners; include build num in version - squashes most TODO's in the workflow file Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- better logging in that area Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- build: - expose HOOK_KERNEL_VERSION to the LK .yaml template - stop polluting exported HOOK_KERNEL_ID with version information - hook: - include flavor & kernel version in PRETTY_NAME - better motd Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
…oval - this showed up when trying 6.6.y kernel Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
…cal+cant-pull scenario gracefully - fix: was tagging the configurator image as the kernel proper and thus causing chaos - if kernel image can't be found locally nor pulled from remote, exit with a suggestion to build it locally Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
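A rough sketch of the "local, else pull, else suggest a local build" behaviour this commit describes, assuming the Docker CLI. The function name and messages are hypothetical; `build-kernel` is the command named in the RFC below.

```bash
#!/usr/bin/env bash
# Sketch only: the function name and messages are hypothetical.
# Succeeds if the image exists locally or can be pulled; otherwise suggests a local build.
ensure_kernel_image() {
	local image="$1" kernel_id="$2"
	if docker image inspect "${image}" > /dev/null 2>&1; then
		echo "Kernel image ${image} found locally." >&2
		return 0
	fi
	if docker pull "${image}" > /dev/null 2>&1; then
		echo "Kernel image ${image} pulled from registry." >&2
		return 0
	fi
	echo "Kernel image ${image} not found locally and not pullable." >&2
	echo "Try building it locally: ./build.sh build-kernel ${kernel_id}" >&2
	return 1
}
```

A caller would typically bail out on failure, for example `ensure_kernel_image "${image}" "${kernel_id}" || exit 1`.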
- reduces log pollution during package installs in GHA - also: set DEBIAN_FRONTEND=noninteractive where it was missing Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
…ns about using defaults for command and flavor/kernel id - this makes those equivalent: - `KEY=value ./build.sh [<command>] [<id>]` - `./build.sh [<command>] [<id>] KEY=value` - `./build.sh [<command>] KEY=value [<id>]` - `./build.sh KEY=value [<command>] [<id>]` - handle defaulting explicitly: - `[<command>]` defaults to `build`, but generates a warning - `[<id>]` defaults to `hook-default-<host-arch>`, but generates a warning Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
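One way to get position-independent `KEY=value` handling with explicit defaults is sketched below. It is an illustration, not the code from this commit: `parse_command_line`, `CLI_PARAMS`, `CLI_POSITIONAL`, `HOOK_COMMAND` and `HOOK_ID` are hypothetical names, and the host-arch mapping is simplified to arm64/amd64. It needs bash 4.2+ for `declare -g -A`.

```bash
#!/usr/bin/env bash
# Sketch only: all names here are hypothetical, not build.sh's actual implementation.
parse_command_line() {
	declare -g -A CLI_PARAMS=()     # KEY=value pairs found anywhere on the command line
	declare -g -a CLI_POSITIONAL=() # everything else, in order: [<command>] [<id>]
	local arg
	for arg in "$@"; do
		if [[ "${arg}" =~ ^([A-Za-z_][A-Za-z0-9_]*)=(.*)$ ]]; then
			CLI_PARAMS["${BASH_REMATCH[1]}"]="${BASH_REMATCH[2]}"
		else
			CLI_POSITIONAL+=("${arg}")
		fi
	done

	# Simplified host arch detection (assumption: anything not arm64 is treated as amd64).
	local host_arch
	case "$(uname -m)" in
		aarch64 | arm64) host_arch="arm64" ;;
		*) host_arch="amd64" ;;
	esac

	HOOK_COMMAND="${CLI_POSITIONAL[0]:-}"
	HOOK_ID="${CLI_POSITIONAL[1]:-}"
	if [[ -z "${HOOK_COMMAND}" ]]; then
		HOOK_COMMAND="build"
		echo "WARNING: no command given, defaulting to '${HOOK_COMMAND}'" >&2
	fi
	if [[ -z "${HOOK_ID}" ]]; then
		HOOK_ID="hook-default-${host_arch}"
		echo "WARNING: no id given, defaulting to '${HOOK_ID}'" >&2
	fi
}
```

A script would call `parse_command_line "$@"` once at startup; `KEY=value` set in the environment (the first form in the commit message) needs no parsing at all, since the shell exports it before the script runs.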
…" kernel; `TAG=lts` - these seem to be working pretty well, and we might wanna promote them soon Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- LinuxKit does not use the regular Docker cache - see https://github.com/linuxkit/linuxkit/blob/master/docs/image-cache.md - be explicit about the cache location, otherwise it ends up under the user's home directory - this allows for easier GitHub Actions cache-preserving and thus minimize hitting DockerHub pull-rate limits Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- this should allow us to avoid hitting DockerHub's pull rate limits - `restore-keys` list allow re-using inexact cache hits, as long as they're same-arch - uses actions/cache@v4's new feature `save-always: true` Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
…tion Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- might not be needed now that we've LK cache implemented - also: remove unused DO_BUILD_KERNEL leftover Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
… & not found locally or pullable Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
…es unexpectedly Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
…DOCKER_BUILDX_PROGRESS_TYPE` - We've too much debug logging for a normal user; hide it. - DOCKER_BUILDX_PROGRESS_TYPE defaults to `plain` if DEBUG=yes or under GitHub actions - otherwise defaults to `tty` unless `DOCKER_BUILDX_PROGRESS_TYPE` is set Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
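The defaulting rules in this commit message could be expressed as the small function below. The function name is hypothetical, and letting an explicitly set `DOCKER_BUILDX_PROGRESS_TYPE` win even when `DEBUG=yes` is an assumption about precedence; `GITHUB_ACTIONS=true` is the variable GitHub Actions sets on its runners.

```bash
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical; precedence of an explicit value is an assumption.
resolve_buildx_progress_type() {
	# An explicitly set value always wins (assumed).
	if [[ -n "${DOCKER_BUILDX_PROGRESS_TYPE:-}" ]]; then
		echo "${DOCKER_BUILDX_PROGRESS_TYPE}"
		return 0
	fi
	# Plain output reads better in debug logs and in CI, where there is no real terminal.
	if [[ "${DEBUG:-no}" == "yes" || "${GITHUB_ACTIONS:-}" == "true" ]]; then
		echo "plain"
	else
		echo "tty"
	fi
}
```

The result would then be passed along, for example as `docker buildx build --progress="$(resolve_buildx_progress_type)" ...`.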
…ary's keys - I still hate the stringified-dict syntax Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- Slightly better syntax IMHO Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- turns out I randomly called those "flavors", "flavours" or "kernels"; consolidate on "inventory" Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
… working afterwards - rmdir -> rm -rf Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- otherwise it overwrites the default ones during deployment Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
- Remove references to Nix from CONTRIBUTING.md too - this needs more work, editing and polishing Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
…ettings - reword some comments - quay.io/tinkerbell is the official base - remove 'dev' tag from CI default - use fixed - 'yes' for DockerHub login; - 'ARM64' tagged self-hosted runners for arm64 lk containers Signed-off-by: Ricardo Pardini <ricardo@pardini.net>
Hey @jacobweinstock -- I've fixed the DCOs, cleaned up a few commit messages, changed the default OCI coordinates & CI params to Tinkerbell org, and rebased onto main. I've not touched the existing CI workflows, though; feel free to push to my branch if fixes are needed before merge. Thanks so much for the reviews!
Let's do this!!! Thanks, @rpardini for all the hard work!
multi-kernel, cross-compiling, bash based Hook & (default+external) kernels build (incl GHA matrix)
Original RFC below; please check commits for the (many) updates done after the first drop. Original RFC kept for reference, below the line.
RFC: multi-kernel, cross-compiling, bash based Hook & (default+external) kernels build (incl GHA matrix)
This is a rewrite of the build system.
The produced default artifacts (aarch64/x86_64) should be equivalent, save for an updated 5.10.213+ kernel and arm64 fixes.
It's missing, at least, documentation and linters, possibly more, that I removed and intend to rewrite.
But since it's a large-ish change, I'd like to collect some feedback before continuing.
Main topics
- … `defconfig` versions, via Kbuild's `make savedefconfig`
- `hook-default-amd64` and `hook-default-arm64` kernels are equivalent to the two original.
- `armbian-` prefixed kernels are actually Armbian kernels for more exotic arm64 SBCs, or Armbian's generic UEFI kernels for both arches. Those are very fast to "build" since Armbian publishes their .deb packages in OCI images, and here we just download and massage them into the format required by Linuxkit.
- `hook.yaml` is replaced with `hook.template.yaml` which is templated via a limited-var invocation of `envsubst`; only the kernel image and the arch is actually different per-flavor.
- … `skopeo`. Can opt-out/use a fixed version.
- … `dtbs.tar.gz` artifact together with the initrd and vmlinuz.

Flavors (/kernels)
Hook's own kernels
- `hook-default-arm64`
- `hook-default-amd64`

Armbian kernels
- `edge`: release candidates or stable but rarely LTS, more aggressive patching
- `current`: LTS kernels, stable-ish patching

- `armbian-bcm2711-current`
- `armbian-meson64-edge`
- `armbian-rockchip64-edge`
- `armbian-uefi-arm64-edge`
- `armbian-uefi-x86-edge`
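Since only the kernel image and the arch differ between the flavors above, rendering the LinuxKit template per flavor with a limited-variable `envsubst` call could look roughly like this. It is a sketch, not the actual build.sh code: the function name and the `HOOK_KERNEL_IMAGE` / `HOOK_ARCH` variable names are hypothetical. It relies on `envsubst` (from GNU gettext) accepting a SHELL-FORMAT argument that restricts substitution to the listed variables.

```bash
#!/usr/bin/env bash
# Sketch only: function and variable names are hypothetical stand-ins.
# Passing a SHELL-FORMAT list to envsubst restricts substitution to the named variables;
# any other ${...} in the template is left untouched.
render_hook_template() {
	local template_file="$1" output_file="$2"
	# Single quotes are deliberate: envsubst needs the literal variable names.
	# shellcheck disable=SC2016
	envsubst '${HOOK_KERNEL_IMAGE} ${HOOK_ARCH}' < "${template_file}" > "${output_file}"
}
```

The variables must be exported before the call, for example `export HOOK_KERNEL_IMAGE="..." HOOK_ARCH="arm64"` followed by `render_hook_template hook.template.yaml hook.rendered.yaml` (the output file name here is made up).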
Proof of working-ness?
In my fork:
Future possibilities:
- It would be fairly simple to add Debian/Ubuntu kernels as well as Armbian firmware.
- Many, many more Armbian kernels could be added, but save for Allwinner and the Rockchip `-rkr` vendor kernel, I think they might be too niche. Users should have an easy time adding it themselves if they need, though.
- Better support for u-boot's "pxelinux" booting requires changes outside of Hook (namely in Smee/ipxedust) which I'll PR eventually.
- Certain arm64 SoCs require changes in iPXE (nap.h) -- same, I'll PR those to ipxedust repo.
- All these Hook flavors are used in a "showcase" Helm chart based on stack that I will also PR to the charts repo.
TO-DO
- `bash build.sh config-kernel <flavor>` & follow instructions to configure kernel; only works for default flavors
- `bash build.sh build-kernel <flavor>` builds the kernel
- `bash build.sh build <flavor>` builds Hook with that kernel
- … `actuated` for native arm64 building? -- https://actuated.dev/blog/arm-ci-cncf-ampere

Thanks for reading this far. I'm looking forward to your feedback!