40ignition-ostree: add coreos-inject-rootmap.service #503

Merged
jlebon merged 10 commits into coreos:testing-devel from the pr/rootmap branch
Aug 27, 2020


jlebon
Member

@jlebon jlebon commented Jun 29, 2020

This implements the rootmap functionality that figures out all the
dependencies required to find /sysroot, and injects them into the BLS
config. For more information, see:

coreos/fedora-coreos-tracker#94 (comment)

The rdcore code supports RAID and LUKS devices, though the latter
needs a new Clevis release with the following patches to be fully
supported:

latchset/clevis#211
latchset/clevis#217

This also implements the root=UUID=$uuid inject patch proposed in
coreos/fedora-coreos-tracker#465.

On its own, this unlocks reprovisioning FCOS with root on a RAID device,
or e.g. in-place reprovisioning of root on btrfs.

Closes: coreos/fedora-coreos-tracker#465
Closes: coreos/fedora-coreos-tracker#94

@jlebon
Member Author

jlebon commented Jun 29, 2020

Requires #184, #354, and the SELinux patch.

With this, it's now possible to configure FCOS with root on a RAID device!

Example Ignition config:

{
  "ignition": {
    "version": "3.2.0-experimental"
  },
  "storage": {
    "disks": [
      {
        "device": "/dev/disk/by-id/virtio-disk1",
        "partitions": [
          {
            "label": "foo"
          }
        ],
        "wipeTable": true
      },
      {
        "device": "/dev/disk/by-id/virtio-disk2",
        "partitions": [
          {
            "label": "bar"
          }
        ],
        "wipeTable": true
      }
    ],
    "raid": [
      {
        "devices": [
          "/dev/disk/by-partlabel/foo",
          "/dev/disk/by-partlabel/bar"
        ],
        "level": "raid1",
        "name": "myroot"
      }
    ],
    "filesystems": [
      {
        "device": "/dev/md/myroot",
        "format": "xfs",
        "wipeFilesystem": true,
        "label": "root"
      }
    ]
  }
}

Running with cosa:

(cosa)$ kola qemuexec -i config.ign --ignition-direct --memory 4096 --add-disk 5G --add-disk 5G
(vm)$ findmnt /sysroot
TARGET   SOURCE     FSTYPE OPTIONS
/sysroot /dev/md127 xfs    rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize
(vm)$ lsblk --inverse /dev/md127
NAME    MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
md127     9:127  0   5G  0 raid1 /sysroot
├─vda1  252:1    0   5G  0 part
│ └─vda 252:0    0   5G  0 disk
└─vdb1  252:17   0   5G  0 part
  └─vdb 252:16   0   5G  0 disk
(vm)$ reboot
...
(vm)$ lsblk --inverse /dev/md127
NAME    MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
md127     9:127  0   5G  0 raid1 /sysroot
├─vda1  252:1    0   5G  0 part
│ └─vda 252:0    0   5G  0 disk
└─vdb1  252:17   0   5G  0 part
  └─vdb 252:16   0   5G  0 disk
(vm)$ cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/fedora-coreos-cb9be02cb0885ea378b9763a62c5ca5a596b280d182553fb4f2522725f4257be/vmlinuz-5.6.19-300.jl.fc32.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu rd.md.uuid=139abc1a:9fde35c5:1021a2b6:748ce991 root=UUID=ff2ac1ab-f449c-4de0-8396-b19305747456 rw
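A quick way to sanity-check a command line like the one above after reboot is to split it into kargs and confirm that root is named by UUID and that at least one `rd.md.uuid` entry is present. The following is only a minimal sketch: `parse_kargs` and `check_rootmap` are hypothetical helper names, the sample values are placeholders, and quoted arguments containing spaces are not handled.

```python
from collections import defaultdict
from typing import Dict, List


def parse_kargs(cmdline: str) -> Dict[str, List[str]]:
    """Split a kernel command line into {key: [values]}.

    Bare flags such as `rw` map to an empty-string value. Quoted
    arguments containing spaces are not handled in this sketch.
    """
    kargs: Dict[str, List[str]] = defaultdict(list)
    for arg in cmdline.split():
        # Split on the first "=" only, so root=UUID=... keeps "UUID=..." as value.
        key, _, value = arg.partition("=")
        kargs[key].append(value)
    return dict(kargs)


def check_rootmap(cmdline: str) -> bool:
    """True if root is given by UUID and at least one MD array is listed."""
    kargs = parse_kargs(cmdline)
    roots = kargs.get("root", [])
    return (
        len(roots) == 1
        and roots[0].startswith("UUID=")
        and len(kargs.get("rd.md.uuid", [])) >= 1
    )


# Placeholder values, not taken from a real machine.
example = "console=ttyS0,115200n8 rd.md.uuid=aaaa:bbbb:cccc:dddd root=UUID=1234-abcd rw"
print(parse_kargs(example)["root"])
print(check_rootmap(example))
```

In a VM this could be fed the contents of /proc/cmdline; here it only runs on the placeholder string.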

@jlebon
Member Author

jlebon commented Jun 29, 2020

For fun, with coreos/ignition#1010, it's possible to do root on RAID10:

{
  "ignition": {
    "version": "3.2.0-experimental"
  },
  "storage": {
    "disks": [
      {
        "device": "/dev/disk/by-id/virtio-disk1",
        "partitions": [
          {
            "label": "foo"
          }
        ],
        "wipeTable": true
      },
      {
        "device": "/dev/disk/by-id/virtio-disk2",
        "partitions": [
          {
            "label": "bar"
          }
        ],
        "wipeTable": true
      },
      {
        "device": "/dev/disk/by-id/virtio-disk3",
        "partitions": [
          {
            "label": "baz"
          }
        ],
        "wipeTable": true
      },
      {
        "device": "/dev/disk/by-id/virtio-disk4",
        "partitions": [
          {
            "label": "boo"
          }
        ],
        "wipeTable": true
      }
    ],
    "raid": [
      {
        "devices": [
          "/dev/disk/by-partlabel/foo",
          "/dev/disk/by-partlabel/bar"
        ],
        "level": "raid1",
        "name": "foobar"
      },
      {
        "devices": [
          "/dev/disk/by-partlabel/baz",
          "/dev/disk/by-partlabel/boo"
        ],
        "level": "raid1",
        "name": "bazboo"
      },
      {
        "devices": [
          "/dev/md/foobar",
          "/dev/md/bazboo"
        ],
        "level": "raid0",
        "name": "myroot"
      }
    ],
    "filesystems": [
      {
        "device": "/dev/md/myroot",
        "format": "xfs",
        "wipeFilesystem": true,
        "label": "root"
      }
    ]
  }
}

Running with cosa:

(cosa)$ kola qemuexec -i config.ign --ignition-direct --memory 4096 --add-disk 5G --add-disk 5G --add-disk 5G --add-disk 5G
(vm)$ findmnt /sysroot
TARGET   SOURCE     FSTYPE OPTIONS
/sysroot /dev/md125 xfs    rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize
(vm)$ reboot
...
(vm)$ lsblk --inverse /dev/md125
NAME      MAJ:MIN RM SIZE RO TYPE  MOUNTPOINT
md125       9:125  0  10G  0 raid0 /sysroot
├─md126     9:126  0   5G  0 raid1
│ ├─vdc1  252:33   0   5G  0 part
│ │ └─vdc 252:32   0   5G  0 disk
│ └─vdd1  252:49   0   5G  0 part
│   └─vdd 252:48   0   5G  0 disk
└─md127     9:127  0   5G  0 raid1
  ├─vda1  252:1    0   5G  0 part
  │ └─vda 252:0    0   5G  0 disk
  └─vdb1  252:17   0   5G  0 part
    └─vdb 252:16   0   5G  0 disk
(vm)$ cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt1)/ostree/fedora-coreos-cb9be02cb0885ea378b9763a62c5ca5a596b280d182553fb4f2522725f4257be/vmlinuz-5.6.19-300.jl.fc32.x86_64 mitigations=auto,nosmt systemd.unified_cgroup_hierarchy=0 console=tty0 console=ttyS0,115200n8 ignition.platform.id=qemu ostree=/ostree/boot.1/fedora-coreos/cb9be02cb0885ea378b9763a62c5ca5a596b280d182553fb4f2522725f4257be/0  rd.md.uuid=2bbba6df:428aea5c:763ec8b4:9a2a357b rd.md.uuid=7e1033f7:117cde9d8:8cf8492d:619c679b rd.md.uuid=ac2ed991:60d98db2:7cef9638:773d4426 root=UUID=1b184d4a-f273-46c7-a359-b84f257f07b8 rw

@jlebon
Member Author

jlebon commented Jun 29, 2020

So... I initially wanted to write this in rpm-ostree so it can be in not-bash, and then call it from the initramfs, but it's hard to beat bash in prototyping speed. And then it ended up being much simpler than expected thanks to lsblk's --inverse switch, which gives us exactly what we need. One thing I was tempted to do was providing the ability to have Rust code in this repo which cosa compiles at build-time, but didn't want to tackle that discussion yet.
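To illustrate why the `--inverse` view is convenient: it puts the device backing /sysroot at the top and everything it depends on underneath, so collecting the kargs is a plain tree walk. The sketch below is not the code in this PR or in rdcore. It assumes a nested dict shaped like that tree, with a hypothetical `md_uuid` field carrying each array's UUID, and it only covers RAID (LUKS is left out).

```python
from typing import Dict, List


def rootmap_kargs(node: Dict, fs_uuid: str) -> List[str]:
    """Return the kargs needed to assemble and find the root device.

    `node` mirrors the nesting of `lsblk --inverse`: the device backing
    /sysroot at the top, with the devices it depends on under "children".
    The "md_uuid" key is a hypothetical field holding the array UUID.
    """
    md_uuids: List[str] = []

    def walk(dev: Dict) -> None:
        # Device types such as "raid0" and "raid1" mark MD arrays.
        if dev.get("type", "").startswith("raid"):
            uuid = dev["md_uuid"]
            if uuid not in md_uuids:  # a device can appear more than once
                md_uuids.append(uuid)
        for child in dev.get("children", []):
            walk(child)

    walk(node)
    return [f"rd.md.uuid={u}" for u in md_uuids] + [f"root=UUID={fs_uuid}"]


# Shaped like the nested RAID layout above; all UUIDs are placeholders.
tree = {
    "name": "md125", "type": "raid0", "md_uuid": "uuid-top",
    "children": [
        {"name": "md126", "type": "raid1", "md_uuid": "uuid-a",
         "children": [{"name": "vdc1", "type": "part",
                       "children": [{"name": "vdc", "type": "disk"}]}]},
        {"name": "md127", "type": "raid1", "md_uuid": "uuid-b",
         "children": [{"name": "vda1", "type": "part",
                       "children": [{"name": "vda", "type": "disk"}]}]},
    ],
}
print(rootmap_kargs(tree, "fs-uuid"))
```

Every array in the chain gets its own `rd.md.uuid` entry, which matches the three entries seen on the nested-RAID command line.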

@jlebon
Member Author

jlebon commented Jul 2, 2020

Re. root-on-LUKS, see coreos/ignition#960 (comment).

@jlebon
Member Author

jlebon commented Jul 2, 2020

OK, rebased this and now with #466, #184, and #354 folded in!

Will see how quickly we can get a respin of Clevis with latchset/clevis#211 included. Otherwise, we should be able to hack around it for now by carrying our own version of the path and service units until that happens.

I haven't addressed the comments yet. I'd also like to write a kola test for this before everything goes in. Also need to adapt growpart.

@jlebon jlebon changed the title from "WIP: Add coreos-inject-rootmap.service, unlock root-on-RAID" to "WIP: Add coreos-inject-rootmap.service, unlock root-on-{RAID,LUKS}" Jul 2, 2020
@jlebon jlebon changed the title from "WIP: Add coreos-inject-rootmap.service, unlock root-on-{RAID,LUKS}" to "WIP: Add support for root on complex devices" Jul 2, 2020
@jlebon
Member Author

jlebon commented Jul 7, 2020

This is blocked on https://bugzilla.redhat.com/show_bug.cgi?id=1845210.

@cgwalters
Member

I checked out this branch and also dropped in a locally built kernel from this commit, then provided:

# Example of reprovisioning / with btrfs
variant: fcos
version: 1.0.0
storage:
  filesystems:
    - device: /dev/disk/by-partlabel/root
      format: btrfs
      wipe_filesystem: true
      label: root

And it just worked, rootfs on btrfs was that easy! (And same for ext4 etc.)

(I did get a failure from ignition-ostree-uuid-boot.service but that's #503 (comment) I believe)

@cgwalters
Member

Another bug I just hit with this is when doing an install without providing an Ignition config:

[root@localhost ~]# systemctl status ignition-ostree-rootfs-detect
● ignition-ostree-rootfs-detect.service
     Loaded: not-found (Reason: Unit ignition-ostree-rootfs-detect.service not found.)
     Active: failed (Result: exit-code) since Mon 2020-07-13 15:08:54 UTC; 3min 16s ago
   Main PID: 537 (code=exited, status=1/FAILURE)
        CPU: 2ms

Jul 13 15:08:54 localhost systemd[1]: Starting Ignition OSTree: detect rootfs replacement...
Jul 13 15:08:54 localhost systemd[1]: ignition-ostree-rootfs-detect.service: Main process exited, code=exited, status=1/FAILURE
Jul 13 15:08:54 localhost ignition-ostree-dracut-rootfs[539]: /usr/libexec/ignition-ostree-dracut-rootfs: line 12: /run/ignition.json: No such file or directory
Jul 13 15:08:54 localhost systemd[1]: ignition-ostree-rootfs-detect.service: Failed with result 'exit-code'.
Jul 13 15:08:54 localhost systemd[1]: Failed to start Ignition OSTree: detect rootfs replacement.

jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Jul 13, 2020
The `platform.Conf` type allows abstracting over the different Ignition
versions so that different tests can use different versions. Using it
instead of the Ignition type directly means that we can now use the
3.2-experimental spec in external tests.

This is needed for testing the new LUKS support in Ignition[1] and the
related rootfs-on-complex-devices work[2].

[1] coreos/ignition#960
[2] coreos/fedora-coreos-config#503
jlebon added a commit to jlebon/coreos-assembler that referenced this pull request Jul 13, 2020
The `platform.Conf` type allows abstracting over the different Ignition
versions so that different tests can use different versions. Using it
instead of the Ignition type directly means that we can now use the
3.2-experimental spec in external tests.

This is needed for testing the new LUKS support in Ignition[1] and the
related rootfs-on-complex-devices work[2].

[1] coreos/ignition#960
[2] coreos/fedora-coreos-config#503

Closes: coreos#1589
@arithx
Contributor

arithx commented Jul 14, 2020

Another bug I just hit with this is when doing an install without providing an Ignition config:

At least for the "No such file" part: with the new Ignition release, the cached config (/run/ignition.json by default) should always exist (coreos/ignition#1002).

Member

@cgwalters cgwalters left a comment

This looks good at a high level!

@jlebon
Member Author

jlebon commented Aug 19, 2020

OK, so one thing I just realized is that with root=UUID=... always being appended, we lose the prjquota option on subsequent boots because now it's systemd that mounts and not us. And more generally, reprovisioning to a different filesystem with custom mount options will need those persisted as well. So I think rdcore rootmap needs to also inject rootflags=. Working on a patch for that. (Another approach is making it part of /etc/fstab since systemd is capable of re-applying mount options after the initial mount, but we're moving away from /etc/fstab and also some mount options can only be set on the first mount, including quota-related options on XFS).

(Edit: sorry I was kinda confused on this. Mount options for root can't live in the Ignition config. So in the reprovisioning case, it's up to users to add an appropriate rootflags karg.)

cgwalters and others added 10 commits August 20, 2020 17:19
This adds basic infrastructure units for "re-provisioning"
the root filesystem.  See:
coreos/fedora-coreos-tracker#94

A unit first detects if the Ignition configuration has a filesystem
with the label `root` - if so, we save the rootfs into RAM, let
`ignition-disks.service` run, then restore it from RAM.

Earlier attempts used the `brd` kernel module which is a RAM-backed
block device so we can just `dd`.  However, this has some limitations,
such as the need to save the full disk in RAM, and the inability for any
other initrd code to use `brd` devices. As well, `brd` doesn't support
discards, so we require at minimum $rootfs_size RAM (e.g. 3G) until
reprovisioning is complete.

Future work here will likely move the `restore` phase into `rpm-ostree`.

Co-authored-by: Jonathan Lebon <jonathan@jlebon.com>
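The detection step described in the commit message above (look for a filesystem labeled `root` in the Ignition config) can be sketched roughly as follows. This is only an illustration, not the unit's actual script: `wants_root_reprovision` is a hypothetical name, and the config is assumed to be already parsed into a dict.

```python
from typing import Dict


def wants_root_reprovision(config: Dict) -> bool:
    """True if any filesystem in the Ignition config is labeled `root`."""
    filesystems = config.get("storage", {}).get("filesystems", [])
    return any(fs.get("label") == "root" for fs in filesystems)


# Trimmed-down version of the RAID example config from this thread.
config = {
    "ignition": {"version": "3.2.0-experimental"},
    "storage": {
        "filesystems": [
            {"device": "/dev/md/myroot", "format": "xfs",
             "wipeFilesystem": True, "label": "root"}
        ]
    },
}
print(wants_root_reprovision(config))
print(wants_root_reprovision({"ignition": {}}))
```

Only when this check is true would the save, `ignition-disks.service`, restore sequence need to run; a config without a `root`-labeled filesystem leaves the existing rootfs alone.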
This is a general best practice; the intention of filesystem
UUIDs is that they're unique.  It helps backup systems and the like
if we change this.

This builds on coreos/coreos-assembler@e3905fd

In the future, we may also switch to using these UUIDs for subsequent
boots; see: coreos/fedora-coreos-tracker#465

This implements the rootmap functionality that figures out all the
dependencies required to find `/sysroot`, and injects them into the BLS
config. For more information, see:

coreos/fedora-coreos-tracker#94 (comment)

The `rdcore` code supports RAID and LUKS devices, though the latter
needs a new Clevis release with the following patches to be fully
supported:

latchset/clevis#211
latchset/clevis#217

This also implements the `root=UUID=$uuid` inject patch proposed in
coreos/fedora-coreos-tracker#465.

On its own, this unlocks reprovisioning FCOS with root on a RAID device,
or e.g. in-place reprovisioning of root on btrfs.

Closes: coreos/fedora-coreos-tracker#465
Closes: coreos/fedora-coreos-tracker#94

This stamp file was used to make sure coreos-growpart only ran on the
first boot when it ran in the real root. Nowadays, it runs in the
initramfs as part of `ignition-complete.target` before we even run
`ostree-prepare-root` (which meant we were actually creating the stamp
file in the initrd filesystem).

Having a stamp file is useful though for writing tests. So let's
repurpose the idea and put it in `/run` instead.

Add two basic tests: one where we reprovision in place to ext4, and one
where we reprovision onto a separate RAID1.

That generator no longer exists.

Split out a small script where the canonical rootflags live in the
non-reprovisioning case. This will be used by both
`ignition-ostree-mount-sysroot` and `rdcore rootmap`.
@jlebon
Member Author

jlebon commented Aug 20, 2020

OK, so one thing I just realized is that with root=UUID=... always being appended, we lose the prjquota option on subsequent boots because now it's systemd that mounts and not us.

This is fixed now in coreos/coreos-installer#358 and the latest commit here which adds coreos-rootflags. (Definitely a downside of the rdcore approach is that it's more work to get fixes in vs just some script in the overlay.)
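The idea of keeping the canonical root mount options in one place, so that the first-boot mount and later boots agree, might look something like the sketch below. It is not the `coreos-rootflags` script itself. Only the XFS `prjquota` case comes from this thread; returning no extra options for other filesystems is an assumption, and both function names and the device path are hypothetical.

```python
from typing import List


def default_rootflags(fstype: str) -> str:
    """Mount options for root when it has not been reprovisioned.

    Only the XFS case (`prjquota`) comes from the discussion above;
    returning no extra options for other filesystems is an assumption.
    """
    return "prjquota" if fstype == "xfs" else ""


def mount_command(fstype: str, rootpath: str) -> List[str]:
    """Build the argv for mounting root at /sysroot with the default flags."""
    argv = ["mount"]
    flags = default_rootflags(fstype)
    if flags:
        argv += ["-o", flags]
    return argv + [rootpath, "/sysroot"]


# The device path is a placeholder for illustration.
print(mount_command("xfs", "/dev/disk/by-label/root"))
print(mount_command("ext4", "/dev/disk/by-label/root"))
```

Having a single source for the flags means whatever injects `rootflags=` for later boots can reuse the same value instead of duplicating it.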

@jlebon
Member Author

jlebon commented Aug 27, 2020

--- PASS: ext.fedora-coreos-config_PR-503-4MMBZDZR33QM3J24KCIOGR3JPDUXCUHF5DYIZB4C3WVNVMA5OEJA.root-reprovision.raid1 (95.30s)
...
--- PASS: ext.fedora-coreos-config_PR-503-4MMBZDZR33QM3J24KCIOGR3JPDUXCUHF5DYIZB4C3WVNVMA5OEJA.root-reprovision.filesystem-only (80.73s)

🎉

Who wants to give an approving review on this so we can merge it in?

 fi
-mount -o "${mountflags}" "${rootpath}" /sysroot
+mount -o "$(coreos-rootflags)" "${rootpath}" /sysroot
Member

It took me a second to realize this was setting the options from a script, but 👍!

Member

@cgwalters cgwalters left a comment

LGTM

Member

@ashcrow ashcrow left a comment

Great work here! I seriously appreciate the additional commenting through the code 🙏

@jlebon jlebon merged commit f974aff into coreos:testing-devel Aug 27, 2020
@jlebon jlebon deleted the pr/rootmap branch August 27, 2020 15:55
@Conan-Kudo
Contributor

@jlebon Thanks for making this happen! 🎉 🙏

@fulminemizzega

(Edit: sorry I was kinda confused on this. Mount options for root can't live in the Ignition config. So in the reprovisioning case, it's up to users to add an appropriate rootflags karg.)

Hello,
I think this should be mentioned in the "Reconfiguring the root filesystem" section of the documentation. After reading this PR and issue #94, I think I understand why filesystems.mount_options in Ignition does not do anything for the root fs, but it was not that intuitive (at least to me). In my case, I changed the root fs to btrfs and enabled compression. I also understand this is at least frowned upon, so I guess I got what I deserved.

@jlebon
Member Author

jlebon commented Dec 7, 2020

Hmm, I think if we have magic for reprovisioning root, then we should also just handle mount_options too and translate that into rootflags. This is similar to how with_mount_unit also injects the mount options into the unit.
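As a rough illustration of that proposal (which is only floated here, not implemented by this PR), translating the root filesystem's mount options into a `rootflags=` karg could look like this. The function name is hypothetical, the entry is assumed to use the `mountOptions` key from the Ignition JSON spec (`mount_options` in FCC), and the btrfs compression option is just an example value.

```python
from typing import Dict, Optional


def rootflags_from_filesystem(fs: Dict) -> Optional[str]:
    """Translate a root filesystem entry's mount options into a karg.

    Returns None when the entry is not labeled `root` or has no options.
    """
    if fs.get("label") != "root":
        return None
    options = fs.get("mountOptions") or []
    if not options:
        return None
    # Mount options are comma-separated in a single rootflags= argument.
    return "rootflags=" + ",".join(options)


# Example entry; the compression option is an illustrative value.
fs = {"device": "/dev/disk/by-partlabel/root", "format": "btrfs",
      "label": "root", "mountOptions": ["compress=zstd"]}
print(rootflags_from_filesystem(fs))
```

Until something like this exists, users reprovisioning root with custom mount options still have to add the `rootflags` karg themselves, as noted earlier in the thread.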

Development

Successfully merging this pull request may close these issues.

switch to root=<uuid> post-firstboot?
Support for reconfiguring the root storage
7 participants