ex-integrity.composefs: Tracking issue #2867

cgwalters · 2023-05-30T18:23:37Z

composefs/ostree (and beyond)

Background

A key design goal of ostree at its creation was to not require any new functionality in the Linux kernel. The baseline mechanisms of hard links and read-only bind mounts suffice to manage views of read-only filesystem trees.

However, for Docker and then podman, overlayfs was created to more efficiently support copy-on-write semantics - also crucially, overlayfs is a layered filesystem; it can work with any underlying (modern) Linux filesystem as a backend.

More recently, composefs was created which builds on overlayfs with more integrity features. This tracking issue is for the integration of composefs and ostree.

System integrity

ostree does not provide significant support for truly immutable system state; a simple mount -o remount,rw /usr will allow direct persistent modification of the underlying files.

There is ostree fsck, but it is inefficient and manual, and even today it does not cover the checked-out deployment roots (so e.g. newly added binaries in the deployment root aren't found).

Accidental damage protection

It is important to ostree to support "user owns machine" scenarios, where the user is root on their own computer and must have the ability to make persistent changes.

But it's still useful to have stronger protection against accidental damage. Due to the way composefs works using fs-verity, a simple mount -o remount,rw can no longer silently modify files. First, the mounted composefs is always read-only; there is no write support in composefs. Second, access to the distinct underlying persistent root filesystem can be more strongly separated and isolated.

Support for "sealed" systems

It's however also desirable to support a scenario where an organization wants to produce computing devices that are "sealed" to run only code produced (or signed) by that organization. These devices should not support persistent unsigned code.

ostree does not have strong support for this model today, and composefs should fix it.

Phase 0: Basic integration (experimental)

In this phase, we will land an outstanding pull request which adds basic integration that enables booting a system using composefs as a root filesystem. In this phase, a composefs image is dynamically created on the client using the ostree metadata.

This has already led us to multiple systems integration issues. So far, all tractable.

Ignition tweaks to work with composefs coreos/fedora-coreos-config#2404

A good milestone to mark completion of this phase is landing a CI configuration to ostree which builds and deploys a system using composefs, and verifies it can be upgraded.

In this phase, there is no direct claimed support for "sealed" systems (i.e. files are not necessarily signed).

Phase 1: Basic rootfs sealing (experimental)

In this phase, support for signatures covering the composefs is added. A key question to determine is when the composefs file format is stable. Because the PR up until this point defaults to "re-synthesizing" the composefs on the client, the client must reproduce exactly what was generated server side and signed.

Phase 2: Secure Boot chaining (experimental)

This phase will document how to create a complete system using Secure Boot which chains to a root filesystem signature using composefs.

This may also depend on #2753 and #1951

Here is a sketch for how we can support trusted boot using composefs and fs-verity signatures.

During build:

1. Generate a public/private key pair
2. Copy the public key into the new rootfs for the commit (e.g. /etc/pki/fsverity/cfs.pub)
3. During initrd generation in the rootfs, pass --install /etc/pki/fsverity/cfs.pub to dracut, which will copy the public key into the initrd.
4. Add a module to dracut that loads the public key into the fs-verity keyring (see https://gitlab.com/CentOS/automotive/rpms/dracut-fsverity for an example)
5. Generate a UKI or aboot image containing the above initrd, the kernel and kernel command line. The kernel command line uses a generic ostree=latest argument, because at this point we don't know the final deployment id. See also discussion in Add ostree=aboot for signed Android Boot Images #2844
6. Sign the UKI with a private key that is trusted by your secureboot keyring.
7. Stuff the UKI into the rootfs next to the normal kernel
8. Save the rootfs as objects in the ostree repo, giving the digest of the rootdir
9. ostree commit normally just stores the above digest in the metadata. But now we also take the private key from step 1 (passed as argument to ostree commit) and generate a composefs image file based on the rootdir digest. We sign this file with the private key and store the signature as extra metadata in the commit object.
10. The entire commit object is GPG signed and pushed to a repo.

During install:

1. Pull a new commit
2. Verify the GPG signature
3. When deploying we look at the metadata from the commit, in particular the rootdir digest and the signature. The rootdir digest (and the repo objects) is used to construct a new composefs file, and the signature is used to sign the composefs image file when enabling fs-verity on it.
4. The BLS files are created in /boot/loader.[01] that points to the deploy dir with the composefs file, and the /boot/loader symlink is atomically switched to new loader.[01] dir. This BLS file contains the deploy id we deployed into in the kernel ostree=... arg.
5. The UKI is put somewhere where the boot loader can find it. (EFI partition, aboot partition, etc)
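
The atomic switch of the /boot/loader symlink in the install steps can be sketched as follows. This is only an illustration of the usual "create under a temporary name, then rename over the old link" pattern, not the actual ostree code; the function name, the `loader.tmp` temporary name, and the directory layout used in the example are assumptions.

```python
import os


def switch_loader_symlink(boot_dir: str, new_target: str, link_name: str = "loader") -> None:
    """Atomically repoint boot_dir/link_name at new_target (e.g. "loader.1").

    Hypothetical helper: names and layout are assumptions for illustration.
    """
    link_path = os.path.join(boot_dir, link_name)
    tmp_path = os.path.join(boot_dir, link_name + ".tmp")
    # Remove a stale temporary link left behind by an interrupted run.
    if os.path.lexists(tmp_path):
        os.unlink(tmp_path)
    # Create the new symlink under a temporary name first...
    os.symlink(new_target, tmp_path)
    # ...then rename it over the old link. rename(2) replaces the
    # destination in a single step, so readers see either the old
    # or the new target, never a missing link.
    os.replace(tmp_path, link_path)
```

A caller would write the new BLS files into the inactive loader.[01] directory first and only then call `switch_loader_symlink("/boot", "loader.1")`, so an interrupted deploy leaves the previous configuration in place.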

During boot:

1. The firmware loads the UKI, verifies it according to the secureboot keyring and boots kernel+initrd.
2. The initrd mounts the disk partitions.
3. The initrd notices the kernel arg "ostree=latest" and looks for the BLS file in /boot/loader with index=1 (i.e. most recent deploy, or index=2 if we're in fallback mode).
4. The initrd parses the BLS file, which contains the full ostree=... argument. This lets us find the deploy directory (like /ostree/deploy/fedora-coreos/deploy/443ae0cd86a7dd4c6f5486a2283471b3c8f76fc5dcc4766cf935faa24a9e3d34.0). (Note at this point that we can't trust either the BLS file or the deploy dir.)
5. The initrd loads /etc/pki/fsverity/cfs.pub into the kernel keyring for fs-verity. (Trusted, as it's in the signed initrd.)
6. The initrd mounts the composefs with the LCFS_MOUNT_FLAGS_REQUIRE_SIGNATURE flag. This ensures that the file to be mounted has a signature, and thus can only be read if the matching public key is loaded in the keyring.
7. On top of the composefs we bind mount writable things like /var and /etc.
8. Pivot-root into the new composefs mount, which now will verify all further reads from the readonly parts of rootfs are valid.
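
The BLS parsing step in the boot sequence above could look roughly like this. It is a minimal sketch that only extracts the ostree=... value from the `options` line of a Boot Loader Specification entry; the function name and the sample entry are hypothetical, and the result must still be treated as untrusted input, as noted in the steps.

```python
from typing import Optional


def ostree_arg_from_bls(bls_text: str) -> Optional[str]:
    """Return the value of ostree=... from the 'options' line of a BLS entry.

    Hypothetical helper for illustration; returns None if no such argument exists.
    """
    for line in bls_text.splitlines():
        parts = line.strip().split(None, 1)  # "key value..." separated by whitespace
        if len(parts) != 2 or parts[0].startswith("#"):
            continue
        key, value = parts
        if key == "options":
            for arg in value.split():
                if arg.startswith("ostree="):
                    return arg[len("ostree="):]
    return None


# Made-up sample entry; paths and IDs are placeholders, not real values.
SAMPLE_ENTRY = """\
title Example OS
version 1
options root=UUID=0000 rw ostree=/ostree/boot.1/example-os/0123abcd/0
linux /ostree/example-os-0123abcd/vmlinuz
"""

print(ostree_arg_from_bls(SAMPLE_ENTRY))
```

Because neither the BLS file nor the deploy directory is trusted at this point, the parsed value only tells the initrd where to look; the trust decision still comes from the signature check when the composefs image is mounted.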

Beyond

At this point, we should have gained significant experience with the system. We will determine when to mark this as officially stabilized after this.

Phase 3: "Native composefs"

Instead of "ostree using composefs", this proposes to flip things around, such that more code lives underneath the "composefs" project. A simple strawman proposal here is that we have the equivalent of ostree-prepare-root.service actually be composefs-prepare-root.service and live in github.com/containers/composefs.

Related issues:

Add shared library/tool for managing backing store files containers/composefs#125

Phase 4: Unified container and host systems

This phase builds on the native composefs for hosts and ensures that containers (e.g. podman) share backing storage with the host system and as much code as possible.

alexlarsson · 2023-05-31T09:38:59Z

About the composefs file format stability: the plan is to guarantee stability in general, and there is a way to change it by specifying a version when you generate the file. However, I don't want to give any stability guarantees until the overlay xattr changes have landed in the upstream kernel, because only then do we know they will not change.

alexlarsson · 2023-05-31T10:06:39Z

Ok, I ran into a snag with this approach:

When doing an update, the new deploy is written, and when we enable fs-verity on it, with the signature, fs-verity fails. The reason is that the new signature is signed with the new certificate and the public key is not in the kernel keyring at the time of deploy.

We have a similar issue at image building time, where we would need to load the public key into the keyring of the host (i.e. build) machine.

It doesn't feel right to load any keys like this into the keyring at any time other than boot (and the keyrings are bound to be sealed anyway). So, I think we need to delay the application of the signature to the first boot, as we can then guarantee that the right keys are loaded.

cgwalters · 2023-05-31T15:10:11Z

This issue will be discussed this Friday at 9:00am EST in https://meet.jit.si/moderated/2e9be89e0e9ee06647b4719784578a6251f72eec9a07829bc9212e57c4883816

alexlarsson · 2023-06-02T12:49:24Z

I wrote down some random ramblings about the Phase 3 approach to kickstart the meeting/discussion:

Basic assumptions:

All the tools for building images and handling images at build time (rpm-ostree, ostree repos, etc.) remain as is. All we change is how the OS is deployed on the target machine.
We assume that an initrd is used.
We assume that systemd is used.

Points that need consideration:

What mechanism/protocol is used to download a new update to the target machine?
What is the format of the data that is downloaded?
How does this format allow incremental downloads (i.e. avoiding downloading parts already available locally)? And to what degree is it incremental (i.e. just skipping already available files, or a full binary delta)?
composefs will add support for installing images to a shared /.composefs "repo", as well as some code similar to the ostree-prepare-root steps. But, who will be responsible for the boot part of the deployment, like generating bls files, putting initrds from the images in the right place, merging /etc, rollback, etc.? It's not clear where the border between composefs and ostree lies.
What other ostree system features do we support?
- overlay /usr?
- rpm layering?
- initrd regeneration (and/or injection) on host?
- local kernel cmdline changes?
- anything else?
How do we handle signatures? My current work is based on fs-verity signatures. So, we load a public key into the kernel fs-verity keyring and then just validate at mount time that an fs-verity signature exists, which will trigger validation by the kernel. This has some problems:
- You can't sign the file at deploy time unless you load the public key into the kernel keyring (which may be sealed).
- You can't even look at the contents for any composefs image other than the one you booted (at least when using per-build keys).
- Fedora (for example) currently doesn't enable the fs-verity signature support in the kernel.
Another approach is to store a regular x509 signature of the fs-verity digest in a file, then verify this signature in the initrd, passing the signed digest when mounting the composefs image.

This means adding some crypto code to the initrd prepare-root code, adding a dependency on openssl or similar.
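
For reference, the fs-verity digest that such a detached signature would cover can be computed in userspace. The following is a sketch of that computation as described in the kernel's fs-verity documentation, assuming SHA-256, 4096-byte Merkle tree blocks and no salt; it reads the whole input into memory and does not perform the x509 signature check itself, which would need a crypto library as noted above. The function names are made up for illustration.

```python
import hashlib
import struct

BLOCK_SIZE = 4096                          # assumed Merkle tree block size
HASH_SIZE = hashlib.sha256().digest_size   # 32 bytes for SHA-256


def _blocks(data: bytes):
    """Split data into BLOCK_SIZE chunks, zero-padding the last one."""
    for off in range(0, len(data), BLOCK_SIZE):
        yield data[off:off + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")


def merkle_root(data: bytes) -> bytes:
    """Root hash of the fs-verity Merkle tree over data (no salt)."""
    if not data:
        return b"\0" * HASH_SIZE  # empty file: root hash is all zeroes
    level = list(_blocks(data))
    # Hash every block, pack the hashes into new blocks, and repeat
    # until a single block remains; its hash is the root.
    while len(level) > 1:
        hashes = b"".join(hashlib.sha256(b).digest() for b in level)
        level = list(_blocks(hashes))
    return hashlib.sha256(level[0]).digest()


def fsverity_digest(data: bytes) -> bytes:
    """SHA-256 over the 256-byte fsverity_descriptor for data."""
    descriptor = struct.pack(
        "<BBBBIQ64s32s144s",
        1,                  # version
        1,                  # hash algorithm: SHA-256
        12,                 # log2 of the block size (4096)
        0,                  # salt size
        0,                  # reserved; zero when computing the digest
        len(data),          # file size in bytes
        merkle_root(data),  # zero-padded to 64 bytes by struct
        b"",                # salt (unused), zero-padded to 32 bytes
        b"",                # reserved, zero-padded to 144 bytes
    )
    return hashlib.sha256(descriptor).digest()


print(fsverity_digest(b"hello composefs\n").hex())
```

With a detached signature, the initrd would verify the signature over this digest and then pass the expected digest at mount time, so the image file itself stays readable without any key in the kernel keyring.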

travier · 2023-06-02T13:02:33Z

More recently, composefs was created which builds on overlayfs with more integrity features. This design document describes the high

Looks like this sentence is cut before the end (from the first comment)

travier · 2023-06-02T13:09:31Z

The initrd notices the kernel arg "ostree=latest" and looks for the BLS file in /boot/loader with index=1 (i.e. most recent deploy, or index=2 if we're in fallback mode).

How do we know that we are in fallback mode?

travier · 2023-06-02T13:13:50Z

But, who will be responsible for the boot part of the deployment, like generating bls files, putting initrds from the images in the right place, merging /etc, rollback, etc. It's not clear where the border between composefs and ostree lies.

This is done by ostree & rpm-ostree in Fedora CoreOS for example.

travier · 2023-06-02T13:16:41Z

You can't even look at the contents for any composefs image other than the one you booted (at least when using per-build keys).

Isn't it possible to validate fs-verity signatures from userspace without requiring the key to be loaded in the kernel?

travier · 2023-06-02T13:17:09Z

+1 from me for this approach in general. Thanks for writing it up!

travier · 2023-06-02T13:21:03Z

The BLS files are created in /boot/loader.[01] that points to the deploy dir with the composefs file, and the /boot/loader symlink is atomically switched to new loader.[01] dir. This BLS file contains the deploy id we deployed into in the kernel ostree=... arg.

Note that in some cases (direct UEFI boot, with or without systemd-boot, with UKIs), there won't be BLS configs or they won't be used. #2753 (comment) as an alternative proposal to let the initrd figure out which entry was booted and which ostree deployment should be used by storing the ostree deployment hash in the filename of the UKI and then reading it from the EFI variables in the initrd.

alexlarsson · 2023-06-02T14:56:32Z

You can't even look at the contents for any composefs image other than the one you booted (at least when using per-build keys).

Isn't it possible to validate fs-verity signatures from userspace without requiring the key to be loaded in the kernel?

The way fs-verity signatures work right now is that they are verified by the kernel automatically when you open the file.

If we had a standalone signature file paired with the non-signed composefs file we could do the validation in userspace like this. But if the composefs file was signed we can't even look at it until we've loaded the right key into the kernel.

osalbahr · 2023-06-02T15:10:43Z

Was the meeting recorded? I wanted to join but accidentally overslept.

cgwalters · 2023-06-02T15:29:09Z

Was the meeting recorded? I wanted to join but accidentally overslept.

Sorry, it wasn't. Probably should have. We decided to make this a recurring meeting, so there will be another one on Friday June 16 at the same time (9:30am EST).

I may also argue at some point that this should be a composefs meeting and not an ostree meeting and we'd do it alongside or in the github.com/containers context.

alexlarsson · 2023-06-02T15:31:31Z

So, there is a keyctl_pkey_verify() function (a libkeyutils wrapper around the keyctl() syscall):
https://man7.org/linux/man-pages/man3/keyctl_pkey_sign.3.html
I think using this during mount to verify a signature file is much better and more flexible than using the built-in fs-verity signatures, because you can then both access the composefs image file without the key, and enable fs-verity on it without knowing the public key. I will have a look at adding support for this to libcomposefs.

This is prep for supporting composefs, where the physical root is distinct from the deployment root.

Specifically for the LUKS case, we can find `/etc/crypttab` only in the deployment root. Otherwise, we suffix the passed path (usually `/sysroot`) that was mounted in the initramfs with `/sysroot` to find the physical root.

xref ostreedev/ostree#2867

This is to enable ostree+composefs: ostreedev/ostree#2867

When we care about the *physical* backing filesystem, we need to look at /sysroot/sysroot (which in the real root is `/sysroot`) because now `/sysroot` (aka `/` in the real root) is a composefs (really an `overlayfs` with a transient loop-mounted erofs), which is distinct from the physical root.

Co-authored-by: Colin Walters <walters@verbum.org>
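
As an illustration of the physical root versus deployment root distinction, the sketch below parses /proc/self/mountinfo-style text and appends /sysroot to the mounted path when that path is an overlay mount. This is a hypothetical helper, not the code from the commits above; in particular, treating "overlay at the mount point" as "composefs in use" is an assumption made for the example.

```python
from typing import Optional


def fstype_at(mountinfo_text: str, mount_point: str) -> Optional[str]:
    """Return the filesystem type mounted at mount_point, or None.

    Expects text in the /proc/self/mountinfo format; a later entry for
    the same mount point shadows an earlier one.
    """
    fstype = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Fields 0-5 are fixed; optional fields follow, ended by a lone "-".
        if len(fields) < 9 or "-" not in fields[6:]:
            continue
        sep = fields.index("-", 6)
        if fields[4] == mount_point and sep + 1 < len(fields):
            fstype = fields[sep + 1]
    return fstype


def physical_root(mountinfo_text: str, mounted_path: str = "/sysroot") -> str:
    """Guess where the physical root lives under mounted_path.

    Assumption: an overlay at mounted_path means composefs is in use,
    so the physical root is one level down at mounted_path + "/sysroot".
    """
    if fstype_at(mountinfo_text, mounted_path) == "overlay":
        return mounted_path.rstrip("/") + "/sysroot"
    return mounted_path


# Made-up mountinfo excerpt for an initramfs with a composefs root.
SAMPLE = (
    "22 1 0:21 / /sysroot ro,relatime - overlay overlay ro,metacopy=on\n"
    "23 22 8:3 / /sysroot/sysroot rw,relatime shared:1 - ext4 /dev/sda3 rw\n"
)
print(physical_root(SAMPLE))
```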

This pairs with ostreedev/ostree#2640

It's all off by default (to state the obvious). But one can do e.g.:

```
$ cat >> src/config/image.yaml << EOF
rootfs: ext4verity
composefs: unsigned
EOF
```

You can also try out `composefs: signed` and also do:

```
$ mkdir -p secrets
$ openssl req -newkey rsa:4096 -nodes -keyout secrets/root-composefs-key.pem -x509 -out secrets/root-composefs-cert.pem
```

(But this is not *yet* a focus)

More in ostreedev/ostree#2867

ericcurtin · 2023-06-06T15:26:13Z

This issue will be discussed this Friday at 9:00am EST in https://meet.jit.si/moderated/2e9be89e0e9ee06647b4719784578a6251f72eec9a07829bc9212e57c4883816

Is there an .ics file, etc. for this meeting so I can add to my calendar?

ericcurtin · 2024-05-07T12:24:06Z

Yeah, even rolling back won't necessarily work on cases where you also want to protect against rollback attacks, so the only way would be to do similar as done by android, and reboot into a recovery mode of some sort (application / product specific).

Even recovery mode can get corrupted, it's kinda a never ending chain.

One feature OSTree has is that if you want more than AB rollbacks, you can in theory have as many rollbacks as you want ABCD, but just AB is common.

ldts · 2024-05-07T12:48:12Z

Regarding fs-verity triggering on detected issues: does anyone know why the kernel doesn't implement a config option to just trigger a reboot on detection? Something along these lines (it might be a bit more complex, but this would be the idea), in a configurable way:

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 80391c687c2ad..dbbec0a9c862c 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -991,8 +991,11 @@ int ovl_verify_lowerdata(struct dentry *dentry)
        int err;
 
        err = ovl_maybe_lookup_lowerdata(dentry);
-       if (err)
+       if (err) {
+               if (err == -ENOENT)
+                       BUG_ON(1);
                return err;
+       }
 
        return ovl_maybe_validate_verity(dentry);
 }
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 89e0d60d35b6c..fd039df0851d9 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -1309,6 +1309,7 @@ int ovl_validate_verity(struct ovl_fs *ofs,
                                          &verity_algo, NULL);
        if (digest_size == 0) {
                pr_warn_ratelimited("lower file '%pd' has no fs-verity digest\n", datapath->dentry);
+               BUG_ON(1);
                return -EIO;
        }
 
@@ -1317,6 +1318,7 @@ int ovl_validate_verity(struct ovl_fs *ofs,
            memcmp(metacopy_data.digest, actual_digest, xattr_digest_size) != 0) {
                pr_warn_ratelimited("lower file '%pd' has the wrong fs-verity digest\n",
                                    datapath->dentry);
+               BUG_ON(1);
                return -EIO;
        }

ericcurtin · 2024-05-07T12:55:12Z

@ldts could be a useful feature. Something you are interested in hacking on @ldts ?

I think this would have to be dynamically switchable on/off, it's also dangerous to randomly power off sometimes.

But maybe if in the boot path it's the right thing to do for example (even then it may not be, depends on your use-case), but maybe not after that?

How do you alert the user of this problem is another thing.

ldts · 2024-05-07T13:01:35Z

sure I wouldnt mind. perhaps getting @alexlarsson input first since he extended overlayfs with this support. will follow up but it seems to me that for the use case - so must be configurable- where the full file system demands integrity would be the right thing to do

ericcurtin · 2024-05-07T13:29:17Z

Might be worth considering if this would integrate with:

systemd-bsod

alexlarsson · 2024-05-07T13:37:14Z

I don't really think that is a good approach. For example, in a safety situation, you might have some really important process running, and then some unimportant process hits an fs-verity issue, rebooting the system and stopping the important process. You might also be able to misuse this as a form of attack, i.e. loopback mount a file with known incorrect fs-verity data to reboot the system.

What might be more useful is to have the option of having the process issuing the failing operation get a signal that kills the process, say SIGBUS or something like that.

ldts · 2024-05-07T14:04:04Z

100%, that is a better design choice overall, but won't that be some time away for most products? IMO it is possible that some systems might want to choose a more conservative approach when it comes to security and completely shut down no matter what other processes might be running: I mean, the system is compromised already... And then, before the kernel reboot/shutdown, something could be logged (maybe attach some form of kernel notifier) so that persistent storage (RPMB?) can be updated to flag the situation during reboot... I am thinking that perhaps a router would fit in that sort of product, but I don't know for sure.

alexlarsson · 2024-05-07T14:22:16Z

I think you will have a hard time selling it upstream.

ldts · 2024-05-07T14:39:23Z

I think you will have a hard time selling it upstream.

yes I fully agree as well: but I think it is the sort of patch worth carrying off-tree

alexlarsson · 2024-05-08T07:07:09Z

Well, that is up to whoever wants to carry it. I'm not very interested in that kind of thing though.

vnd · 2024-05-09T15:00:03Z

Hey, I'm wondering what's the current state of file verification? It's a bit hard to process all relevant threads as a project outsider.

In particular I'm trying to figure out whether IMA is working or not (#3240), or is it supposed to be replaced with composefs?

ericcurtin · 2024-05-09T15:39:20Z

Most engineering effort is on composefs/erofs/fs-verity right now

ldts · 2024-05-13T15:56:41Z

one note: if using an old systemd (i.e. 250 (250.5+)) with systemd-boot and ostree+composefs, you might need this systemd patch to find the boot/EFI partition:

Subject: [PATCH] gpt: composefs: block device on sysroot

rootfs corresponds to the composefs overlay: use sysroot instead.

Signed-off-by: Jorge Ramirez-Ortiz <jorge@foundries.io>
---
 src/gpt-auto-generator/gpt-auto-generator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gpt-auto-generator/gpt-auto-generator.c b/src/gpt-auto-generator/gpt-auto-generator.c
index 64ca9bb2f9..a7c564ca5c 100644
--- a/src/gpt-auto-generator/gpt-auto-generator.c
+++ b/src/gpt-auto-generator/gpt-auto-generator.c
@@ -774,7 +774,7 @@ static int add_mounts(void) {
          * here. */
         r = readlink_malloc("/run/systemd/volatile-root", &p);
         if (r == -ENOENT) { /* volatile-root not found */
-                r = get_block_device_harder("/", &devno);
+                r = get_block_device_harder("/sysroot", &devno);
                 if (r == -EUCLEAN)
                         return btrfs_log_dev_root(LOG_ERR, r, "root file system");
                 if (r < 0)
-- 
2.34.1

otherwise boot will not be mounted

cgwalters · 2024-05-13T17:06:56Z

@ldts Hmm, yes this relates to containers/composefs#280 as well as #3193

ldts · 2024-05-13T17:26:51Z

@ldts Hmm, yes this relates to containers/composefs#280 as well as #3193

the gpt generator on systemd-boot 250 looks for the block device (expecting it contains the boot partition to mount besides the rootfs partition) using "/" (which with composefs is actually the overlay). So switching it to sysroot seems a better choice when using ostree. I hit this the other day so this is why I thought it would be worth sharing it here.

without it, the system would still boot, but ostree admin status would fail to find /boot info

ldts · 2024-05-14T06:44:47Z

Well, that is up to whoever wants to carry it. I'm not very interested in that kind of thing though.

what about adding debugfs/sysfs counters on those errors?

ldts · 2024-05-14T09:49:01Z

Well, that is up to whoever wants to carry it. I'm not very interested in that kind of thing though.

what about adding debugfs/sysfs counters on those errors?

I'll propose something upstream (unless someone beats me to it)

ldts · 2024-05-17T12:26:05Z

@cgwalters are there any performance tests being run comparing ostree vs ostree+composefs+fsverity?

I am simply running:
FIO: https://fio.readthedocs.io/en/latest/fio_doc.html

fio --name=test --readonly --filename=/usr/bin/openssl --size=1M
on:

root@intel-corei7-64:/var/rootdirs/home/fio# findmnt /
TARGET SOURCE                                                                                                                   FSTYPE OPTIONS
/      /dev/disk/by-label/otaroot[/ostree/deploy/lmp/deploy/a79146c3c72ad3cf8a652a711c0d705f68a49e81eb2b0db52c079d5ce3577751.0] ext4   rw,relatime

resulting in:

Run status group 0 (all jobs):
   READ: bw=229MiB/s (240MB/s), 229MiB/s-229MiB/s (240MB/s-240MB/s), io=936KiB (958kB), run=4-4msec

and on

root@intel-corei7-64:/var/rootdirs/home/fio# findmnt /
TARGET SOURCE FSTYPE  OPTIONS
/      none   overlay ro,relatime,lowerdir+=/run/ostree/.private/cfsroot-lower,datadir+=/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on,verity=require

resulting in

Run status group 0 (all jobs):
   READ: bw=41.5MiB/s (43.6MB/s), 41.5MiB/s-41.5MiB/s (43.6MB/s-43.6MB/s), io=936KiB (958kB), run=22-22msec

I still need to check what this test is doing but I was wondering if there are any performance tests that I could use before we start deploying to embedded devices.

ldts · 2024-05-19T09:24:33Z

[apologies for the removal of previous threads but I am just learning about the tool ...so was commenting in case anyone could steer/chip in]

With something like this (10 minutes random read buffered workload) I persistently measure ~4% performance degradation on CFS reads. Does this seem correct?

#!/bin/bash
# Compare buffered random-read bandwidth on the composefs root (CFS)
# against the same directory on the backing ext4 filesystem (EXT).
ext4="/sysroot/ostree/deploy/lmp/deploy/34da1439c2e5a9f6d57b5b48b827763f9d3d48b9f6cecd093669f61aa99b803c.0/usr/bin/"
cfsf="/usr/bin"

# Run a 10-minute 4k random-read job on directory $1 and print the READ summary line.
run_fio() {
    sudo fio --opendir="$1" --direct=0 --rw=randread --bs=4k --ioengine=sync --iodepth=256 --runtime=600 --numjobs=4 --time_based --group_reporting --name=iops-test-job --eta-newline=1 --readonly | grep READ
}

echo "CFS: "
cfsf_val=$(run_fio "$cfsf")
echo " ==> $cfsf_val"
echo ""
echo "EXT: "
ext4_val=$(run_fio "$ext4")
echo " ==> $ext4_val"

CFS: 
 ==>    READ: bw=359MiB/s (376MB/s), 359MiB/s-359MiB/s (376MB/s-376MB/s), io=210GiB (226GB), run=600001-600001msec

EXT: 
 ==>    READ: bw=371MiB/s (389MB/s), 371MiB/s-371MiB/s (389MB/s-389MB/s), io=218GiB (234GB), run=600001-600001msec
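
To turn two such summary lines into a relative figure, something like the following could be used. It is a small sketch that only understands bandwidth reported in MiB/s (fio can also report other units, which this regex ignores); the function names are made up for illustration.

```python
import re

_READ_BW = re.compile(r"READ:\s+bw=([0-9.]+)MiB/s")


def read_bandwidth_mib(fio_line: str) -> float:
    """Extract the aggregate READ bandwidth (MiB/s) from a fio summary line."""
    match = _READ_BW.search(fio_line)
    if match is None:
        raise ValueError("no READ bandwidth in MiB/s found")
    return float(match.group(1))


def relative_change_percent(baseline: float, measured: float) -> float:
    """Change of measured relative to baseline, in percent (negative = slower)."""
    return (measured - baseline) / baseline * 100.0


cfs = read_bandwidth_mib(
    "READ: bw=359MiB/s (376MB/s), 359MiB/s-359MiB/s (376MB/s-376MB/s), "
    "io=210GiB (226GB), run=600001-600001msec"
)
ext = read_bandwidth_mib(
    "READ: bw=371MiB/s (389MB/s), 371MiB/s-371MiB/s (389MB/s-389MB/s), "
    "io=218GiB (234GB), run=600001-600001msec"
)
print(f"{relative_change_percent(ext, cfs):.1f}%")  # prints "-3.2%"
```

For the two lines quoted above this works out to roughly a 3.2% lower read bandwidth on the CFS path than on the EXT path.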

ericcurtin · 2024-05-19T11:53:49Z

@ldts I haven't measured or anything but a 4% degradation seems reasonable with fs-verity on as it has to verify bytes as it reads.

What would be even more interesting would be, ext4 vs composefs with fs-verity on vs composefs with fs-verity off.

I expect some Desktop users would prefer fs-verity off for example so they can make some local atomic changes and maybe care less about local signatures (I dunno up for debate) and maybe don't care about fs-verity.

But in IoT or Automotive or somewhere like that it would make more sense to have fs-verity on.

With fs-verity off, I would expect composefs to be faster than ext4 as it is erofs backed, so it would be interesting to see that.

Maybe these things belong here also:

https://github.com/containers/composefs

ldts · 2024-05-19T18:17:31Z

@ericcurtin ok I'll measure fs-verity off as well. makes sense. thanks for the info.!

hsiangkao · 2024-05-20T02:08:22Z

With fs-verity off, I would expect composefs to be faster than ext4 as it is erofs backed, so it would be interesting to see that.

Note that EROFS can have some impacts on metadata access only since ostree keeps data in the underlay filesystem. If your fio workload mainly measures full-data rand/seq read access it will have minor impacts tho.

ldts · 2024-05-20T08:15:46Z

@ericcurtin ok I'll measure fs-verity off as well. makes sense. thanks for the info.!

sorry I misspoke earlier about the 4% loss (my bad, I wasn't measuring the right filesystem since fsverity was enabled everywhere);

I am still doing some benchmarking (just qemu x86_64 based) but this is what I see with randomized reads using a buffered synchronous API for the tests on a full system install:

kernel:
Linux intel-corei7-64 6.6.25-lmp-standard #1 SMP PREEMPT_DYNAMIC Thu Apr 4 18:23:07 UTC 2024 x86_64 GNU/Linux

command uses fio 3.30:

fio 
--opendir=/usr/bin      # directory to use for reads
--direct=0/1            # buffered/direct file access (0/1): test both
--rw=randread           
--bs=4k 
--ioengine=sync/libaio  # sync/asynch api : test both
--iodepth=256           # number of i/o units in flight against the file 
--runtime=3600          # run for an hour
--numjobs=4             # four threads
--time_based 
--group_reporting       # report the group instead of individual threads
--name=iops-test-job 
--readonly              # do not write 
--aux-path=~/           # use the home directory for temp files

Test
iops-test-job: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=256

composefs + signed image + full integrity on rootfs: rw bandwidth ~300MB/sec

TARGET SOURCE  FSTYPE  OPTIONS
/      overlay overlay ro,relatime,lowerdir=/run/ostree/.private/cfsroot-lower::/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on,verity=require

composefs : ~720MB/s

TARGET SOURCE  FSTYPE  OPTIONS
/      overlay overlay ro,relatime,lowerdir=/run/ostree/.private/cfsroot-lower::/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on

no composefs: ~670MB/sec

TARGET SOURCE                                                                                                                   FSTYPE OPTIONS
/      /dev/disk/by-label/otaroot[/ostree/deploy/lmp/deploy/d8d86dea56476c576f743097ce3a9dbee1927116893596725b39ac4bf5b17bdb.0] ext4   rw,relatime

I will stop here and comment further once all the benchmarking is done (I felt it was best if I corrected my earlier comment)
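
The three configurations above differ only in the filesystem type and mount options reported by findmnt, so they can be told apart mechanically. The sketch below parses an options string and labels the root accordingly; equating an overlay root with composefs is an assumption that holds for the setups shown in this thread, and the function names and labels are hypothetical.

```python
def parse_mount_options(options: str) -> dict:
    """Turn "ro,metacopy=on" into {"ro": True, "metacopy": "on"}."""
    parsed = {}
    for opt in options.split(","):
        if not opt:
            continue
        key, sep, value = opt.partition("=")
        parsed[key] = value if sep else True
    return parsed


def describe_root(fstype: str, options: str) -> str:
    """Label a root mount using the FSTYPE and OPTIONS columns of findmnt."""
    if fstype != "overlay":
        return "no composefs"
    if parse_mount_options(options).get("verity") == "require":
        return "composefs with fs-verity required"
    return "composefs without fs-verity enforcement"


opts = ("ro,relatime,lowerdir=/run/ostree/.private/cfsroot-lower::"
        "/sysroot/ostree/repo/objects,redirect_dir=on,metacopy=on,verity=require")
print(describe_root("overlay", opts))
```

Recording this label next to each fio result makes it harder to repeat the earlier mix-up of measuring a filesystem that had fs-verity enabled when it was assumed to be off.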

ldts · 2024-05-20T15:16:23Z

um, while testing I noticed that I cant enable image signatures without also enabling fsverity on all the files in the rootfs. Is this expected?

ericcurtin · 2024-05-21T06:35:06Z

@ldts makes sense to me, fs-verity is what checks the signatures. Doesn't seem very useful to have signatures without fs-verity.

ldts · 2024-05-21T08:08:32Z

@ldts makes sense to me, fs-verity is what checks the signatures. Doesn't seem very useful to have signatures without fs-verity.

Aren't we talking about two different things? The fs-verity kernel layer just checks file measurements if it is enabled (and supported) in the filesystem.
However, verifying an asymmetric ed25519 signature is a different, unrelated thing. I am wondering why they are linked together.

In my use case (userspace signature validation) the public key is not loaded in the kernel keyring but just used by ostree-prepare-root to validate the composefs image signature. Other than that, I don't see why fs-verity needs to depend on it?

To me it seems like a bug against userspace signature validation: an assumption that the kernel keyring must contain the key.

ldts · 2024-05-22T09:06:28Z

said differently, why is a composefs image signature without fs-verity rootfs integrity not supported: is it intentional or an implementation issue?

The current release only supports composefs image signature with fs-verity rootfs integrity&authentication

ericcurtin · 2024-05-22T09:48:15Z

I see what you mean now, yeah I guess both could be supported individually.

ldts · 2024-05-22T12:39:58Z

also, full filesystem integrity/authentication is amazing, no doubts about it. But perhaps many embedded devices wont be able to afford it - the performance drop in read-bandwidth testing can be too noticeable. So (an extension/feature?) maybe fs-verity could be allowed to be enabled on some of the deploy folders instead of requiring it on all of them? which is what I was trying to test when I noticed it wouldn't work.

[I havent tested this patch yet, maybe it makes a good enough difference]
https://patches.linaro.org/project/linux-crypto/patch/20240507002343.239552-7-ebiggers@kernel.org/

Incidentally, on imx8mp we are seeing a 13% improvement in read bandwidth performance tests by using ostree with CFS (without fs-verity) over EXT4. So really neat.

I feel I am polluting this thread - maybe I should open a performance evaluation issue?

cgwalters added the difficulty/hard, triaged, and reward/high labels May 30, 2023

alexlarsson mentioned this issue May 31, 2023: Want an option for transient /etc #2868 (Closed)

travier mentioned this issue Jun 2, 2023: Strategy for file verification (IMA, fs-verity, composefs) coreos/fedora-coreos-tracker#1252 (Open)

cgwalters mentioned this issue Jun 2, 2023: rdcore: Juggle physical root versus deployment root coreos/coreos-installer#1203 (Merged)

cgwalters mentioned this issue May 13, 2024: Canonical method to find backing filesystem (and block device) containers/composefs#280 (Open)
