Support UKI #2753
Comments
How would we know from the initramfs when a rollback was performed? I'm thinking of the scenario in which the bootloader decides to roll back because the new deployment/update wasn't confirmed as good. Currently this is easy to do by booting the previous deployment (previous initramfs), which also has the previous ostree argument in place. |
What's wrong with using boot loader entries? Wouldn't we expect the UEFI boot loader participating in the scheme (e.g. sd-boot) to support the boot loader spec? Mucking with the UEFI boot entries doesn't sound that pleasant to me. I'm pretty sure sd-boot deals with it fine, as we've been using it with a UKI on a product for a few years. The only changes we have to make are horrifying ones to deal with the lack of symlinks on VFAT (#1719). I guess using UEFI boot variables would sidestep that issue, though. |
Also, if you're building a UKI, the initramfs is part of it and there's no need for ostree to find it. Are you suggesting the ostree take the separate kernel and initramfs and generate a UKI? |
Is the "how to find the rootfs" problem a chicken-and-egg problem, in that you need to populate the "ostree=" karg in the UKI, but you only know what that karg should be after you commit? So that you can only boot the n-1 commit in that case? I had thoughts on this: you could write an extra value client side, maybe as a new entry "ostree" in the bls. So you could do:
But I'm happy with whatever works :) Encountering the same issue in the UKI-like aboot bootloader. |
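For illustration only, a client-side "ostree" key in a BLS entry could be read back the same way as any other key. The key name comes from the comment above; the `bls_get` helper and the example value used with it are assumptions, not existing ostree behaviour.

```sh
#!/bin/sh
# bls_get KEY FILE: print the value of the first "KEY value" line in a
# Boot Loader Specification Type 1 entry file. Hypothetical helper.
# Prints nothing if the key is absent or has no value.
bls_get() {
    awk -v key="$1" '
        $1 == key && NF >= 2 {
            # Drop the key and the whitespace after it, keep the rest verbatim.
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")
            print
            exit
        }
    ' "$2"
}
```

An initramfs hook could then call something like `bls_get ostree "$entry_file"`, provided it has a way to know which entry file was booted.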
I think the main reason to embed the rootfs in the kernel cmdline is basically integration with bootloader menus - e.g. to be able to choose the previous deployment in the GRUB GUI. However, this is not a requirement. We could instead read a value from the target root, one could imagine something as simple as a symlink Perhaps a strawman here is that specifying a bare And then to boot the previous deployment, we support an |
By default, we don't do client side commits. Hence, the digest is actually fully predictable and known in advance on the build server. But certainly there is a circular dependency here for any systems which are doing fully sealed kernel command lines - we'd need to generate the rootfs and compute its digest, then patch the kernel binary (which would in theory invalidate that digest, but OTOH nothing actually reads the kernel from the rootfs; the fallout would just be things like Arguably perhaps, we should have better support on the client for something like "ghosting" the kernel/initramfs from |
I think many users/organizations that want to deploy UKIs will want to do so without involving any bootloader at all. But yes, we should probably also support deployment with a bootloader. |
That does make sense. But what's actually unpacking the UKI in that case? Some other UEFI program? I don't believe the linux kernel itself supports booting directly from a combined kernel+initramfs PE program. Ah, sd-stub. I missed that. I guess if you're all in on UEFI and want the minimal boot environment, then even sd-boot is superfluous. |
It depends on the use-case and the hardware mostly. Adding new bootloader entries to the EFI menu works only so-so on some hardware/firmware and often switching b/w different entries on boot is cumbersome (not to mention vendor-specific). Thus, having different boot environments with different kernels/deployments would be much easier with sd-boot than with "native" UEFI boot loading. This is the reason why I use sd-boot on all my systems in combination with UKIs. |
One thing that's not clear to me is how we deliver a UKI (is it its own rpm?), because it would be built on the osbuild side rather than on the end device... |
Sure, the same way you build the kernel and initramfs on an ostree system. They're just bundled together for a UKI. I think the only thing in there that doesn't fit that model is the kernel command line since ostree currently allows you to manage that locally and it often contains |
I guess the way we do it right now at Endless is that the initramfs is generated in our ostree builder. For our systems that use a unified kernel with sd-boot, that's also generated in our ostree builder. There's no reason they couldn't be packaged except that generating the initramfs requires installing all the dracut modules that you want in there. We decided that wasn't worth the effort and it was easier to do that in the ostree builder since it would by definition have all the modules installed. |
Just to be explicit, I didn't propose just shipping the commit in the detached metadata, as that is not trusted. What I proposed is to ship a public key in the initrd, sign the commit id with the private part, store it in detached metadata, and then throw away the private key. Then the initrd can validate the commit it reads from somewhere. |
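A rough sketch of that sign-once scheme, using the `openssl` CLI as a stand-in for whatever signing mechanism would really be used. The function names, file names and the choice of an EC key are all assumptions for illustration; how the signature gets into detached metadata is left out.

```sh
#!/bin/sh
# Build side: create a throwaway key pair, sign the commit ID, then discard
# the private key. Leaves commit.txt, commit.sig and initrd.pub in DIR.
sign_commit_once() {
    dir="$1"
    commit="$2"
    openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:prime256v1 \
        -out "$dir/build.key" 2>/dev/null || return 1
    # The public half is what would be embedded in the initrd.
    openssl pkey -in "$dir/build.key" -pubout -out "$dir/initrd.pub" || return 1
    printf '%s' "$commit" > "$dir/commit.txt"
    openssl dgst -sha256 -sign "$dir/build.key" \
        -out "$dir/commit.sig" "$dir/commit.txt" || return 1
    # Nobody can sign a different commit for this initrd afterwards.
    rm -f "$dir/build.key"
}

# Initrd side: succeed only if the commit ID read from untrusted storage
# matches the signature made with the discarded private key.
verify_commit() {
    openssl dgst -sha256 -verify "$1/initrd.pub" \
        -signature "$1/commit.sig" "$1/commit.txt" >/dev/null 2>&1
}
```

Because the private key is destroyed, each initrd only ever validates the one commit it was built for.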
Posting here the result of several discussions that we've had recently: The major change is the need to move the ostree deployment hash out of the kernel command line, as the kernel command line won't be modifiable in the UKI case. The suggested design is that ostree would take the UKI from the ostree commit, move it to the EFI partition and rename it with the following convention:
For example: We would then need to add support in the initramfs to read the ostree deployment hash from the name of the UKI that has been booted instead of reading it from the kernel command line. This could be done either by reading the name from EFI variables or from the TPM event log.
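A minimal sketch of the initramfs side of this, assuming the booted UKI's name has already been obtained (from an EFI variable or the TPM event log). The naming convention is not reproduced in this thread, so the `<anything>-<64 hex digits>.efi` form below is purely a placeholder, as is the function name.

```sh
#!/bin/sh
# deployment_from_uki_name NAME: print the hash embedded in a UKI file name
# of the ASSUMED form "<anything>-<64 hex digits>.efi".
# Returns non-zero if the name does not match that form.
deployment_from_uki_name() {
    name="${1##*/}"        # strip a /-separated directory part
    name="${name##*\\}"    # strip a \-separated (EFI-style) directory part
    case "$name" in
        *-*.efi) ;;
        *) return 1 ;;
    esac
    hash="${name%.efi}"
    hash="${hash##*-}"
    # Accept only something shaped like a full SHA-256 hex digest.
    printf '%s\n' "$hash" | grep -Eqx '[0-9a-f]{64}' || return 1
    printf '%s\n' "$hash"
}
```

The result is only a selector; it still has to be checked against what actually exists in the rootfs.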
While this would be nice, I don't think it's strictly needed if we still have a bootloader (systemd-boot preferably) that is capable of booting BLS config entries. |
The design above could be combined with the suggestion from #2753 (comment) and the use of composefs to verify the content of the deployment. |
Sorry, I deleted that as I wanted to rewrite a portion; it belongs before @alexlarsson's comment. So some automotive folk were discussing Android boot images, which are similar to UKIs in that it is a "kernel, initrd, cmdline and signature" that gets generated server-side and delivered to the client via ostree. This leaves the issue of how you deliver and boot the ostree SHA. It is difficult to boot via karg because that has the recursive problem: how do you deliver that SHA without altering the SHA? @alexlarsson suggested ostree detached metadata; that way you can deliver the SHA without altering the SHA, so we think this should solve the problem. But this requires booting via an alternate means to booting via karg, and I will explore the symlink techniques @cgwalters suggested above. Wondering what you guys think of this as a proposal? A similar technique could be used for UKIs and Android Boot Images. |
@travier thanks for sharing the output of the discussions here, in the case where you don't have an EFI partition available which is the case in Android Boot Images, do you think it's reasonable to move forward with @cgwalters symlink approach suggested above? |
The UEFI variable Reading this thread, I can't help but feel like this is getting over engineered (or rather "complex") ... I'm personally not interested in building the UKI on the server and losing the ability to specify command line arguments; however, I think that's a requirement if you want the UKI to be signed, e.g. by the distribution itself? Since that isn't my goal, I'm currently building the UKI on the host, supporting kernel arguments. If I need the image on the build server, e.g. for signing or attestation, I can simply take the kernel arguments and build the (fully reproducible) UKI. |
The problem with this is that it moves the identifier for the rootfs from a trusted location (in the signed UKI) to a completely untrusted location (the filename). Anyone can just rename the FAT file and make it boot some other rootfs. This is fine if you don't care about validation, but it is nowhere near enough for a Secure Boot trusted boot into the rootfs. |
Not any other rootfs; you'd include the key used to sign the composefs in the initramfs, and validate it from there. So the problem then turns to rollback protection, and that's a nuanced topic because it's absolutely valid to want to roll back sometimes. |
I liked the symlink approach over EFI partition. The problem with using EFI features, is you start to depend on fully implemented UEFI, which would be nice, but it's not always the case, especially on non-x86 systems. If we could self-contain the solution as much as we can in the main rootfs partition it would be better (over using EFI partitions). |
To be clear "symlink approach" = #2753 (comment) ? I edited that earlier comment to elaborate a bit about how rollbacks would work; so the previous bootloader entries would gain |
Yes, it removes the hard dependency on EFI.
|
We need a way to choose which deployment to boot as we need to support rollbacks (rollback protection is another topic that we are not covering here and would be implemented separately). As we can not change the command line, we need a way to pass that info to the initramfs. Using the filename of the UKI is one way of doing that. Note that this deployment hash isn't particularly trusted data: it only makes sense if the deployment exists in the rootfs. Whether or not it's a valid deployment is thus a question of whether or not we have integrity for the rootfs and that's a composefs / LUKS discussion. You can not use that to boot an arbitrary deployment that would not be in the rootfs already. |
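To make the "it only makes sense if the deployment exists in the rootfs" point concrete, here is a small sketch of the check an initramfs could run on whatever identifier it was handed. The function name is hypothetical; the directory layout is the usual `ostree/deploy/<stateroot>/deploy/<checksum>.<serial>` one.

```sh
#!/bin/sh
# deployment_exists SYSROOT STATEROOT ID: succeed only if ID looks like
# "<sha256>.<serial>" and names an existing deployment directory.
deployment_exists() {
    # Reject anything that is not a plain deployment name (no "/", no "..").
    printf '%s\n' "$3" | grep -Eqx '[0-9a-f]{64}\.[0-9]+' || return 1
    [ -d "$1/ostree/deploy/$2/deploy/$3" ]
}
```

A renamed UKI can therefore only select among deployments that are already present; whether those are trustworthy remains the composefs / LUKS question.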
@travier what problems do you see with #2753 (comment) ? |
As far as I understand this involves modifying the kernel command line which is not compatible with UKIs. |
If we do a mapping
|
Not sure how robust this would be in case of power failures as we would need to update two places at the same time every time we do a new deployment: UKI file name + deployment hash symlink. |
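For what it's worth, the symlink half of such an update can be made atomic on a POSIX filesystem by creating the new link under a temporary name and renaming it into place. This is only a sketch (the helper name is made up, and `mv -T` is a GNU coreutils option); it does nothing about keeping the symlink and the UKI file name on the ESP in step with each other, which is the harder part of the concern above.

```sh
#!/bin/sh
# atomic_symlink TARGET LINK: point LINK at TARGET so that a reader sees
# either the old or the new target, never a missing link.
atomic_symlink() {
    tmp="$2.tmp.$$"
    rm -f "$tmp"
    ln -s "$1" "$tmp" || return 1
    # rename(2) replaces LINK in one step; -T stops mv from descending into
    # the directory that the old LINK points to.
    mv -T "$tmp" "$2"
}
```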
But we don't change the UKI for every deployment. We don't want to have to touch the kernel config when only userspace changes in general, right? |
We would have one UKI per boot entry in all cases, even if they are the same files as vfat has no hardlinks. |
We can share kernel & initramfs right now because we use BLS configs to set the roothash, etc. We can't use that with UKIs. |
@ericcurtin All the work on UKIs assumes you're using UEFI, Secure boot and a TPM. |
Yes, but they have a case where they want to boot via adb, which is similar in architecture but not exactly UKI. Also, as far as I know nothing stops one from doing something UKI-like for e.g. zipl. From the perspective of the bootloader, they can't tell the difference between a kernel with an initramfs embedded and one without! So the argument here is really to just change the ostree default to reading a symlink and not to inject a hash into the kernel cmdline. |
s/adb/Android Boot Image/g; it's basically what influenced UKIs, and it's commonly used on Android devices, Chromebooks, automotive hardware, etc.
Yes this is true |
I don't understand how this would work. How do you know which boot entry you booted with a UKI if you can't pass any info via the bootloader or the filename? |
The only case I can think of which uses UKIs without UEFI and without Secure Boot is to rely on the TPM to measure the UKI to unlock a LUKS encrypted rootfs. In this case you also need something else to pass the information about which boot entry has been booted. I don't know if GRUB or other bootloaders have that kind of interface. U-Boot has an EFI mode that can emulate EFI behavior. |
Note that the whole design with UKI relies on getting rid of boot loader config entries and relying on the Type 2 BLS specification: https://uapi-group.org/specifications/specs/boot_loader_specification/#type-2-efi-unified-kernel-images |
FWIW, there's a proposal (and a proof-of-concept in works) to add an allowlist of options which are allowed to change to systemd-stub: systemd/systemd#24539 (comment) If implemented, ostree can probably be specified there. It is still an open question how to pass these additional options to UKIs, especially in the absence of a 'real' bootloader when e.g. UKI is booted directly from shim. This can probably be done through a UEFI variable or a file on ESP (e.g. systemd-stub can read BLS config), or something else. |
Inherently, the job of the initramfs is to mount the root filesystem. A common default is to mount the root filesystem via e.g. So the initramfs does:
Another way to say it is: with a UKI-based Again, the bridge between UKI and verified userspace is a key embedded in that UKI. |
For systemd/systemd#24539 to work we need BLS Type 1 entries (config files) and bootloader support to extend the UKI kernel command line with the options passed into that config file. |
How do you set that in the kernel command line and how do you update that when you change the order of deployments? |
We could generate a random hash and include it both in the UKI kernel command line and setup the symlinks in the rootfs but that would be another indirection like I mentioned in #2753 (comment). |
You're right, I wasn't covering a detail here. At this point though the thread is unwieldy, so I've amended the initial comment here. I think systemd-stub credentials are already a way to pass this data and it's what it's designed for. That said, I also do think we can't design solely for systemd-stub. A very interesting case that's entwined with all of this is whether systems using ostree want to explicitly support locally-initiated rollback. If you don't (and I think that's valid!) then there's no need for a "fallback" UKI that would appear as a separate bootable entry at all. Instead, it'd be up to userspace (whether initramfs or real root) to verify health and locally initiate a change in the default UKI/rootfs pair. |
Using credentials is indeed also an option. Note that this requires |
Looks like this won't let us share UKI if I understand correctly. |
I'll post my current test setup here, simply because it might be useful to someone. Obviously it won't be usable for the use case discussed here (firmware Secure Boot, i.e. with Microsoft's keys).
Instead of doing the Boot-entry dance using Here is the
#!/bin/sh
set -eu
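# ${N:-} keeps `set -u` from aborting with "unbound variable" before the
# usage message can be printed when an argument is missing.
if [ "${1:-}" != "-o" ]; then
    echo "Usage: $0 -o <cfg>"
    exit 1
fi
if [ -z "${2:-}" ]; then
    echo "Usage: $0 -o <cfg>"
    exit 1
fi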
# FIXME: assert, that _OSTREE_GRUB2_IS_EFI is not set, if it has been set, then
# ostree will use different logic, which is probably incompatible.
# FIXME: replace by using _OSTREE_GRUB2_BOOTVERSION, which also checks that we have been called by ostree
# We get called like `grub-mkconfig -o /boot/loader.0/grub.cfg`, use $2 to obtain the /boot/loader.$bootnum directory
if [ "$2" = "/boot/loader.0/grub.cfg" ]; then
    OLD_BOOTNUM="1"
    NEW_BOOTNUM="0"
elif [ "$2" = "/boot/loader.1/grub.cfg" ]; then
    OLD_BOOTNUM="0"
    NEW_BOOTNUM="1"
else
    echo "Usage: $0 -o /boot/loader.[01]/grub.cfg"
    exit 3
fi
LOADER_DIR="$(dirname "$2")"
if [ -d "$LOADER_DIR/uki" ]; then
    # Might be a left over from e.g. a failed previous run.
    echo "Removing (old) $LOADER_DIR/uki"
    rm -r "$LOADER_DIR/uki"
fi
mkdir "$LOADER_DIR/uki"
for entry_file in "$LOADER_DIR"/entries/*.conf; do
    echo "Parsing BLS entry file '$entry_file':"
    # 1. Parse the BLS configfile:
    ENTRY_TITLE="$(grep "^title " "$entry_file" | sed 's/^title //')"
    ENTRY_VERSION="$(grep "^version " "$entry_file" | sed 's/^version //')"
    ENTRY_OPTIONS="$(grep "^options " "$entry_file" | sed 's/^options //')"
    ENTRY_LINUX="$(grep "^linux " "$entry_file" | sed 's/^linux //')"
    ENTRY_INITRD="$(grep "^initrd " "$entry_file" | sed 's/^initrd //')"
    # Technically the 'version' is supposed to be sorted using debian version sort style, but we assume
    # that the filenames generated by ostree are enough for ordering, which will probably break once you have 9+ deployments
    ENTRY_FILENAME="${entry_file##*/}"
    UKI_PATH="$LOADER_DIR/uki/${ENTRY_FILENAME%.conf}.efi"
    echo "Resulting UKI will be stored in '$UKI_PATH'"
    echo "$ENTRY_OPTIONS" > "$UKI_PATH.cmdline"
    # Build the actual UKI; note that it is always rebuilt / shouldn't exist yet
    # --preserve-dates: For a reproducible timestamp in the PE header
    objcopy \
        --preserve-dates \
        --add-section .cmdline="$UKI_PATH.cmdline" --change-section-vma .cmdline=0x30000 \
        --add-section .linux="/boot/$ENTRY_LINUX" --change-section-vma .linux=0x2000000 \
        --add-section .initrd="/boot/$ENTRY_INITRD" --change-section-vma .initrd=0x3000000 \
        /usr/lib/systemd/boot/efi/linuxx64.efi.stub \
        "$UKI_PATH"
done
# Sync built images to /boot/efi
# See also <https://bugzilla.gnome.org/show_bug.cgi?id=724246>
ESP_DIR="/boot/efi/EFI/bauen1-uki"
mkdir -p "$ESP_DIR.0" "$ESP_DIR.1"
sync --file-system "/boot/efi/EFI"
echo "OLD_BOOTNUM: $OLD_BOOTNUM"
echo "NEW_BOOTNUM: $NEW_BOOTNUM"
# We assume that the currently used Boot variables point to "$ESP_DIR.$OLD_BOOTNUM", so we can safely
# remove "$ESP_DIR.$NEW_BOOTNUM"
# Figure out some values for modifying UEFI Boot variables:
ESP_DEVICE="$(df /boot/efi | tail -1 | awk '{ print $1 }')"
ESP_PARTNUM="$(cat /sys/class/block/"$(basename "$ESP_DEVICE")"/partition)"
ESP_PARTUUID="$(blkid "$ESP_DEVICE" -o export | awk -F'=' '/PARTUUID=/ { print $2 }' )"
echo "device=$ESP_DEVICE partnum=$ESP_PARTNUM partuuid=$ESP_PARTUUID"
cleanup_bootvars() {
    # Removes any boot variables referencing a certain $ESP_DIR.$BOOTNUM
    # $1: bootnum
    # Now we know that we are looking for something similar to:
    # HD($ESP_PARTNUM,GPT,$ESP_PARTUUID,somehex,somehex)/File(\EFI\bauen1-uki.$BOOTNUM\.*)
    # efibootmgr outputs like:
    # BootXXXX* title with possible spaces\tActualEntry
    ENTRIES="$(efibootmgr -v | grep -E '^Boot[[:xdigit:]]{4}' | awk -F'\t' '/^[^\t]+\tHD\('"$ESP_PARTNUM,GPT,$ESP_PARTUUID"',.*\)\/File\(\\EFI\\bauen1-uki.'"$1"'\\.*\)$/ { print $0 }')"
    printf "Boot entries that will be removed:\n%s\n" "$ENTRIES"
    for entry in $(echo "$ENTRIES" | grep -E '^Boot[[:xdigit:]]{4}' --only-matching | sed 's/^Boot//'); do
        echo "Removing $entry"
        efibootmgr --delete-bootnum --bootnum "$entry"
    done
}
# 1. Cleanup any left over Boot variables still pointing to $ESP_DIR.$NEW_BOOTNUM
cleanup_bootvars "$NEW_BOOTNUM"
# 2. Cleanup $ESP_DIR.$NEW_BOOTNUM
if [ -e "$ESP_DIR.$NEW_BOOTNUM" ]; then
    echo "Removing $ESP_DIR.$NEW_BOOTNUM"
    rm -r "$ESP_DIR.$NEW_BOOTNUM"
    sync --file-system "/boot/efi/EFI"
else
    echo "Skipping removal of $ESP_DIR.$NEW_BOOTNUM, does not exist"
fi
# 3. Create new $ESP_DIR.$NEW_BOOTNUM
echo "Creating $ESP_DIR.$NEW_BOOTNUM"
mkdir "$ESP_DIR.$NEW_BOOTNUM"
cp -v "$LOADER_DIR/uki"/*.efi "$ESP_DIR.$NEW_BOOTNUM"/
sync --file-system "/boot/efi/EFI"
# 4. Create new Boot variables
for f in "$ESP_DIR.$NEW_BOOTNUM"/*; do
    echo "Creating Boot entry for file '$f':"
    efibootmgr \
        --create \
        --disk="$ESP_DEVICE" \
        --part="$ESP_PARTNUM" \
        --label="${f##*/}" \
        --loader="${f##/boot/efi}"
done
# 5. Set BootOrder (and maybe BootNext ?)
# FIXME: efibootmgr --create adds the entries to the currently defined BootOrder, however I need to verify
# what order is used, and if that is already what is necessary
# It appears to already do everything correctly.
# 6. Remove now unused old Boot variables
cleanup_bootvars "$OLD_BOOTNUM"
# Finally actually touch the output file to make ostree happy
echo "Touching empty (fake) output file '$2'"
touch "$2"
|
UKIs are not supported on rpm-ostree based Fedora variants, so let's use Recommends for binutils for now to let those variants not include the package until needed.
See: coreos/fedora-coreos-tracker#1496
See: ostreedev/ostree#2753
See: https://src.fedoraproject.org/rpms/kexec-tools/c/ea7be0608ed719cc1cb134ecf6ef51a4b7e9f104?branch=rawhide
Btw, for the Android Boot Image implementation this is what we did (its high-level design is very similar to UKIs). UKIs aren't designed to have as malleable a cmdline as a BLS file locally client-side, so we set the ostree karg to simply: ostree=true Then we created symlinks like: /ostree/root.a which pointed to two different sysroots (the ostree systemd generator also parsed the osname/stateroot from this symlink). |
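As a sketch of the "parse the stateroot from the symlink" step: given a link such as /ostree/root.a, the stateroot can be recovered from the link target with plain parameter expansion. The function name is made up, and the code assumes the link points into the usual `deploy/<stateroot>/deploy/<checksum>.<serial>` layout; it is not the actual generator code.

```sh
#!/bin/sh
# stateroot_from_link LINK: print the stateroot (osname) that the deployment
# symlink LINK points into. Returns non-zero if LINK is not a symlink or its
# target does not look like a deployment path.
stateroot_from_link() {
    target="$(readlink "$1")" || return 1
    case "$target" in
        *deploy/*/deploy/*) ;;
        *) return 1 ;;
    esac
    rest="${target#*deploy/}"    # <stateroot>/deploy/<checksum>.<serial>
    printf '%s\n' "${rest%%/*}"
}
```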
See https://github.com/uapi-group/specifications/blob/main/specs/unified_kernel_image.md
and
https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_1
There are two major points here:
UEFI only
We'll need to add a UEFI backend to ostree, which explicitly controls the UEFI boot ordering via e.g. efibootmgr instead of using the /boot/loader/entries stuff.
Kernel cmdline ➡️ rootfs
One goal of the UKI work is to have generic Linux distributions sign both the kernel and initramfs and stock kernel cmdline. However, ostree today embeds the target rootfs in the kernel cmdline - this creates a recursion issue.
Option: ostree=N and symlinks and using systemd-stub credentials
We can change ostree-prepare-root in the initramfs to automatically find the latest symlink in /sysroot/ostree - we effectively do almost this with /ostree/boot.[01] today.
(Something to debate here is whether we require an ostree= karg at all; our initramfs code is conservative today in making ostree opt-in, but for people who are requiring it, we could also just add a flag to default it to on, finding the latest deployment)
The interesting thing here is what it looks like to fetch a userspace only update.
That flow would look like this:
ostree admin upgrade or bootc update or whatever, fetch new rootfs but not a new kernel UKI
Option: Parsing the UKI filename
See #2753 (comment)