AB: immutable root with per-slot /var, shared identity, logs, and home #120
Make each system slot rootfs read-only and move mutable state to a
persistent partition, with per-slot isolation for /var and shared
identity/data/logs across slots.
Key changes:
- Root immutability:
  - Mount / as ro; no remounting rw during boot.
- Per-slot /var:
  - Add an early systemd generator to mount persistent storage at
    /persistent and bind /var to /persistent/slots/<slot>/var based on
    the active slot. Requires GPT PARTLABEL support.
- Shared home:
  - Bind /home to /persistent/home so user data is retained across slot
    rotations.
- Stable machine-id across slots:
  - /etc/machine-id shipped as an empty regular file (no change).
  - Symlink /var/lib/dbus/machine-id to /etc/machine-id for legacy
    clients.
  - Add new machine-id-sync.service to:
    - first boot: copy /run/machine-id to /persistent/common/etc/machine-id
    - subsequent boots: write the persistent copy in place to
      /run/machine-id
- Persistent journald across slots:
  - Bind /var/log/journal to /persistent/log/journal.
  - Add journalling preferences.
- Image build tweaks:
  - Stage persistent (slots/*/var, common/etc).
  - Enable units under *basic/sysinit*.target.wants
- Hardening:
  - Add ConditionPathExists= checks for /run/machine-id and
    /persistent/common/etc to units.
  - Use RequiresMountsFor=/persistent/common/etc/machine-id where relevant.
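
As an illustration of the per-slot /var layout, here is a minimal shell sketch of how a slot name could be turned into the bind source under /persistent. It is not the generator from this PR: the function name `slot_var_dir` is hypothetical, and it assumes the slot name is taken verbatim from the root partition's GPT PARTLABEL, which the description does not spell out.

```shell
# Sketch only. Assumption: the slot name is the root partition's GPT
# PARTLABEL used verbatim; the PR does not state the exact mapping.

# Print the per-slot /var backing directory for a given PARTLABEL.
# Returns non-zero if the label is empty or contains a slash.
slot_var_dir() {
    label="$1"
    case "$label" in
        ""|*/*) return 1 ;;
    esac
    printf '/persistent/slots/%s/var\n' "$label"
}
```

On a running system the label could be read with something like `lsblk -no PARTLABEL "$(findmnt -no SOURCE /)"`, and the resulting directory bind mounted over /var. How the actual generator obtains the label may differ.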
Results:
- Consistent machine-id in /run, /etc, and /persistent across reboots and
slot rotations.
- Journald writes to /persistent/log/journal/ preserving machine-id
across slots.
- /home persists across A/B.
- /var remains per-slot, enabling predictable state.
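
The first-boot and subsequent-boot behaviour of machine-id-sync.service can be sketched roughly as follows. This is an assumption-laden illustration, not the unit's actual script: the function name `sync_machine_id` and its two-argument form are hypothetical, chosen so the logic can be exercised against temporary files.

```shell
# Sketch only: the function name and arguments are hypothetical; the real
# machine-id-sync.service may be implemented differently.
#   $1: runtime machine-id file (e.g. /run/machine-id)
#   $2: persistent copy (e.g. /persistent/common/etc/machine-id)
sync_machine_id() {
    run_id="$1"
    persistent_id="$2"
    if [ -s "$persistent_id" ]; then
        # Subsequent boots: rewrite the runtime file in place (same inode)
        # with the ID stored on the persistent partition.
        cat "$persistent_id" > "$run_id"
    else
        # First boot: record the ID generated for this boot.
        [ -s "$run_id" ] || return 1
        cp "$run_id" "$persistent_id"
    fi
}
```

Writing with a redirect rather than replacing the file keeps the existing inode, which matters if the runtime file is itself the source of a bind mount.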
Note:
With sandboxing enabled, some services fail to set up their mount
namespace (error 226/NAMESPACE) on systemd 252 (Bookworm). The failures
do not occur on 257 (Trixie) because of differences in how systemd
boots. As a workaround, drop PrivateDevices for the affected services
(timesyncd and resolved). This reduces the sandboxing for these units.
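
One way to express such a workaround is a systemd drop-in per affected unit. The sketch below stages those drop-ins under a given root directory; the function name, the drop-in file name `10-no-private-devices.conf`, and the choice of `PrivateDevices=no` in a drop-in are assumptions for illustration, since the note only says that PrivateDevices is dropped for timesyncd and resolved.

```shell
# Sketch only: file name and staging approach are assumptions, not taken
# from the PR.
#   $1: root directory to stage into (e.g. the image rootfs)
write_privatedevices_dropins() {
    root="$1"
    for unit in systemd-timesyncd.service systemd-resolved.service; do
        dir="$root/etc/systemd/system/$unit.d"
        mkdir -p "$dir" || return 1
        # Override the unit's sandboxing setting for device nodes.
        printf '[Service]\nPrivateDevices=no\n' > "$dir/10-no-private-devices.conf"
    done
}
```

Scoping the override to the two named units leaves the sandboxing of every other service untouched.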
Other:
- Relocate persistent partition inside LUKS for crypt PMAP
Tested on both Bookworm and Trixie with a Connect authkey. This PR enables support of AB delta/incremental updates via an immutable root partition and provides unified logging across slots, a unified machine-id across slots, persistent per-slot writable support (i.e. slot-specific rw /var), and a slot-agnostic HOME. Some eyes on the journald settings would be appreciated. See I've extensively tested the machine-id sync service and it seems solid across many reboot and tryboot cycles.
Seems a reasonable initial implementation on which we can develop delta updates. Had to add a small workaround for systemd 252 (Bookworm) in order to support immutable root for services that tried to have a private Immutable root support now mandates GPT PARTLABEL support (the GPT label for the root device is used to bind mount the slot-specific /var).
The Connect client is not currently part of either AB config (ie It's probably worth me adding this since it will add Connect support to these builds by default, therefore only requiring the
pelwell
left a comment
That's working for me. Using a custom config/otatest.yaml, I don't see any difference except for the persistent partition and the read-only root.
This is what I was just looking for. I'm still not up to speed with Anyways, I don't know how familiar you are with Mender, it's an open-source OTA solution with A/B partitioning and I'm trying to get it working with Cheers
Hi @adeldigital To get familiar with how
Thanks for the pointer, @learmj! I found that the user is defined in device-base.yaml. Will dig into it more when I get the chance. My use case is pretty close to the kiosk example: essentially Chromium running a locally hosted app. For now, I’ve had good results using the Lite version of Pi OS. With Trixie, it’s now possible to install the new “base” packages for Regarding your AB partitioning base layer, do you already have an update strategy in mind, something that handles rootfs switching and rollback on failed updates? I’d love to hear more about what you’re planning for this.
Yes, it's defined there, but no local account will be created in the chroot unless the creds layer is pulled in. Re: update strategy etc - yes we do. Stay tuned...
Also, the three machine-id values (/etc/machine-id, /run/machine-id and /persistent/common/etc/machine-id) have the same value, and they don't change after an A/B update.
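
A check along these lines can be scripted. The helper below is a hypothetical sketch (the name `machine_ids_match` is not from the PR) that succeeds only when every file passed to it is non-empty and byte-identical to the first.

```shell
# Sketch only: a hypothetical helper for comparing machine-id files.
# Succeeds if all given files exist, are non-empty, and match the first.
machine_ids_match() {
    first="$1"
    [ -s "$first" ] || return 1
    for f in "$@"; do
        [ -s "$f" ] || return 1
        cmp -s "$first" "$f" || return 1
    done
}
```

For example, `machine_ids_match /etc/machine-id /run/machine-id /persistent/common/etc/machine-id` could be run before and after an A/B update.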
@learmj To help us track progress for our new product development (targeting launch in early 2026), is there an existing or planned public repo where we might follow the development of the AB update agent? We're very eager to integrate this into our fleet management.
rpi-image-gen is just the build tool that lays down the foundation for an AB system. Once we release the update functionality, there will be more information available. |