Skip to content

Conversation

@chewi
Copy link
Contributor

@chewi chewi commented Oct 30, 2025

Install external mod build files with script

The kernel now includes a script for installing the files needed to build out-of-tree modules, rendering our existing code obsolete. The layout is different, but we were following Ubuntu's non-standard layout when there was no need to. Ubuntu's approach is seemingly designed to save space by symlinking common files across different platforms, but Flatcar doesn't need to do this.

More importantly, our previous approach relied on a kernel patch we have carried for years that no longer applies from v6.13. The patch cannot simply be reworked as the underlying mechanism has changed.

This clears the last major blocker for the arm64 SDK as the previous approach also relied on implicit execution by QEMU.

There has been concern that this may break compatibility with some modules, but I have not seen any issues in practise. I have symlinked source to build even though we don't install the full kernel sources because this is what Fedora does, and it makes the layout resemble Ubuntu a little more. Should any issues arise, I will gladly work with upstreams to resolve them or otherwise make adjustments.

How to use

Try triggering a build of the NVIDIA drivers or manually building custom kernel modules. For the latter, don't install coreos-sources, as that wouldn't be testing this change. Only ebuilds insist on the full kernel sources. I'll update the documentation later.

You can also try using the NVIDIA GPU Driver Container. You'll need my changes in jepio/GPU-Driver-Container#1. Note that this was already previously broken, so the changes there aren't solely to make the changes here work. The steps are basically:

On any system:

cd flatcar
DRIVER_VERSION=580.95.05
docker build --pull --build-arg DRIVER_VERSION=${DRIVER_VERSION} --tag nvidia/nvidia-driver-flatcar:${DRIVER_VERSION} --file Dockerfile .

Copy that container image into Flatcar. Give the VM 8GB RAM or thereabouts.

DRIVER_VERSION=580.95.05
docker run -d --privileged --pid=host -v /run/nvidia:/run/nvidia:shared -v /tmp/nvidia:/var/log -v /usr/lib64/modules:/usr/lib64/modules --name nvidia-driver nvidia/nvidia-driver-flatcar:${DRIVER_VERSION} update
docker logs -f nvidia-driver

If you attach a GPU, you can load the driver, but you'll need to sudo systemctl mask nvidia quickly on the first boot. It seems to rebuild when loading the driver, which it shouldn't, but I believe that is unrelated because I also saw this happen without these changes.

docker commit --change='ENTRYPOINT ["nvidia-driver", "init"]' nvidia-driver nvidia/nvidia-kmods-driver-flatcar:${DRIVER_VERSION}
docker run -d --privileged --pid=host -v /run/nvidia:/run/nvidia:shared -v /tmp/nvidia:/var/log -v /usr/lib64/modules:/usr/lib64/modules nvidia/nvidia-kmods-driver-flatcar:${DRIVER_VERSION}

Testing done

Jenkins is all green. There is a lot of churn in the change reports, but it's mostly files moving from source to build. I previously did a comparison that showed only unnecessary files have been dropped such as the device tree sources. These aren't needed to build modules, and Flatcar doesn't use them anyway.

I've also done all the above. That included attaching a GPU to ensure the freshly built NVIDIA driver actually loaded. I built a large handful of out-of-tree modules using the dev container, some with Portage (with a little force) but most manually using vanilla upstream sources. They all worked, except for a couple that failed for unrelated reasons.

  • Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, kernel modules, etc.

@chewi chewi self-assigned this Oct 30, 2025
@github-actions
Copy link

Test report for 4503.0.0+nightly-20251029-2100 / amd64 arm64

Platforms tested : qemu_uefi-amd64 qemu_update-amd64 qemu_uefi-arm64 qemu_update-arm64

ok bpf.execsnoop 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok bpf.local-gadget 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.basic 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.cloudinit.basic 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.cloudinit.multipart-mime 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.cloudinit.script 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid0.data 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid0.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid1.data 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.disk.raid1.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.etcd-member.discovery 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.etcd-member.etcdctlv3 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.etcd-member.v2-backup-restore 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.filesystem 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.flannel.udp 🟢 Succeeded: qemu_uefi-amd64 (1)

ok cl.flannel.vxlan 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.instantiated.enable-unit 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.kargs 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.luks 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.indirect 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.indirect.new 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.regular 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.regular.new 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.reuse 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.oem.wipe 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.partition_on_boot_disk 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.symlink 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.translation 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.btrfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.ext4root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.groups 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.once 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.sethostname 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.users 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v1.xfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.btrfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.ext4root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.users 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2.xfsroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2_1.ext4checkexisting 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2_1.swap 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.ignition.v2_1.vfat 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.install.cloudinit 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.internet 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.locksmith.cluster 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.network.initramfs.second-boot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.network.iptables 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.network.listeners 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.network.nftables 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.network.wireguard 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.omaha.ping 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.osreset.ignition-rerun 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.overlay.cleanup 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.swap_activation 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.sysext.boot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.sysext.fallbackdownload # SKIP 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tang.nonroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tang.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.toolbox.dnf-install 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.eventlog 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.nonroot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root-cryptenroll 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root-cryptenroll-pcr-noupdate 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.tpm.root-cryptenroll-pcr-withupdate 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.update.badverity 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.update.payload 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok cl.update.payload-boot-part-too-small 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok cl.update.reboot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.users.shells 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok cl.verity 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.auth.verify 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.groups 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.once 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.resource.local 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.resource.remote 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.resource.s3.versioned 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.security.tls 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.sethostname 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.ignition.systemd.enable-service 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.locksmith.reboot 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.locksmith.tls 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.selinux.boolean 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.selinux.enforce 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.tls.fetch-urls 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok coreos.update.badusr 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok devcontainer.docker 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok devcontainer.systemd-nspawn 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.btrfs-storage 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.containerd-restart 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.enable-service.sysext 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.lib-coreos-dockerd-compat 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.network-openbsd-nc 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.selinux 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok docker.userns 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok extra-test.[first_dual].cl.update.docker-btrfs-compat 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok extra-test.[first_dual].cl.update.payload 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok extra-test.[first_dual].cl.update.payload-boot-part-too-small 🟢 Succeeded: qemu_update-amd64 (1); qemu_update-arm64 (1)

ok kubeadm.v1.32.4.calico.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.32.4.cilium.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.32.4.flannel.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.33.0.calico.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.33.0.cilium.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.33.0.flannel.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.34.1.calico.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.34.1.cilium.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok kubeadm.v1.34.1.flannel.base 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok linux.nfs.v3 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok linux.nfs.v4 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok linux.ntp 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok misc.fips 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok packages 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.custom-docker.sysext 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.custom-oem 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.disable-containerd 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.disable-docker 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok sysext.simple 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok systemd.journal.remote 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok systemd.journal.user 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

ok systemd.sysusers.gshadow 🟢 Succeeded: qemu_uefi-amd64 (1); qemu_uefi-arm64 (1)

chewi added 2 commits October 31, 2025 18:36
The kernel now includes a script for installing the files needed to
build out-of-tree modules, rendering our existing code obsolete. The
layout is different, but we were following Ubuntu's non-standard layout
when there was no need to. Ubuntu's approach is seemingly designed to
save space by symlinking common files across different platforms, but
Flatcar doesn't need to do this.

More importantly, our previous approach relied on a kernel patch we have
carried for years that no longer applies from v6.13. The patch cannot
simply be reworked as the underlying mechanism has changed.

This clears the last major blocker for the arm64 SDK as the previous
approach also relied on implicit execution by QEMU.

There has been concern that this may break compatibility with some
modules, but I have not seen any issues in practise. I have symlinked
`source` to `build` even though we don't install the full kernel sources
because this is what Fedora does, and it makes the layout resemble
Ubuntu a little more. Should any issues arise, I will gladly work with
upstreams to resolve them or otherwise make adjustments.

Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>
Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>
chewi added a commit to chewi/GPU-Driver-Container that referenced this pull request Nov 5, 2025
This is no longer necessary following flatcar/scripts#3445. In fact, it
breaks because the files are already there.

Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>
@chewi chewi marked this pull request as ready for review November 5, 2025 12:22
@chewi chewi requested a review from a team as a code owner November 5, 2025 12:22
@chewi chewi merged commit 006eea2 into main Nov 7, 2025
4 of 5 checks passed
@chewi chewi deleted the chewi/extmod branch November 7, 2025 14:26
chewi added a commit to chewi/GPU-Driver-Container that referenced this pull request Nov 13, 2025
This is no longer necessary. In older releases, at least as far back as
Flatcar LTS 4081.3.6, the symlinks are already there. More recently,
flatcar/scripts#3445 restructures things so that the files are in build
to begin with. Trying to create new symlinks just breaks the script.

Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants