Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build for measured rootfs improvements #7231

Merged
merged 14 commits into from Dec 5, 2023

Conversation

wainersm
Copy link
Contributor

@wainersm wainersm commented Jul 4, 2023

While helping to solve kata-containers/tests#5701 I looked more carefully at the kernel, image and shim-v2 build scripts, and I found some opportunities of improvements. Apart from that, we've got some plans for measured rootfs on main branch as #7415.

This PR has the goals:

  • move the logic to build measured rootfs out of the kata-deploy. IMO the component's builders should handle the build paths for enabling measured rootfs
  • ensure that when MEASURED_BOOTFS=yes then the expected files are found or the build process die. This is to ensure we don't silently produce bad components
  • implements one test case for current implementation

@katacontainersbot katacontainersbot added the size/tiny Smallest and simplest task label Jul 4, 2023
@wainersm
Copy link
Contributor Author

wainersm commented Jul 4, 2023

I set needs-backport in case @stevenhorsman prefers this out of the next merge PR. Of course, if those changes work and are approved :)

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me, but we don't currently have a way to test it in main which makes me a little nervous, but then again we don't have a code path that activates it either, so it's probably ok to have code review only until it's merged into CCv0, though I will need to update it to handle TDX root hashes then too.

@wainersm
Copy link
Contributor Author

wainersm commented Jul 5, 2023

The code looks good to me, but we don't currently have a way to test it in main which makes me a little nervous, but then again we don't have a code path that activates it either, so it's probably ok to have code review only until it's merged into CCv0, though I will need to update it to handle TDX root hashes then too.

Yeah, I was expecting it being partially tested by the new TDX job on GHB but it seems disabled?

This is how I tested it:

  • Build the vanilla kernel with support to measured rootfs:
make MEASURED_ROOTFS=yes kernel-tarball
make kernel-tdx-experimental-tarball
  • Build shim-v2 with support to measured rootfs:
cd tools/osbuilder
wget http://jenkins.katacontainers.io/job/kata-containers-2.0-rootfs-image-cc-x86_64/lastSuccessfulBuild/artifact/artifacts/root_hash_vanilla.txt
mv root_hash_vanilla.txt root_hash.txt
cd -
make MEASURED_ROOTFS=yes shim-v2-tarball
cat build/shim-v2/destdir/opt/kata/share/defaults/kata-containers/configuration.toml | grep hash
  • Build shim-v2 with support to measured rootfs but force a fail:
rm -f tools/osbuilder/root_hash.txt
make MEASURED_ROOTFS=yes shim-v2-tarball
  • Build the TDX kernel with support to measured rootfs but force a fail:
rm -rf tools/packaging/kernel/initramfs.cpio.gz
# Comment the line https://github.com/kata-containers/kata-containers/blob/b2b70e6d8a5880db4655ace6b981d5ce48acc07d/tools/packaging/static-build/kernel/build.sh#L25 so that initramfs is not built
make kernel-tdx-experimental-tarball

@wainersm
Copy link
Contributor Author

hi @arronwy , could I have your review on this one?

@stevenhorsman
Copy link
Member

Hey Wainer, based on a conversation we had earlier, we have the following extra suggestions for measured rootfs that could go into this PR (or spun out separately). We might want to run it past the AC before we merge though to ensure that people have awareness:

  • At the moment measured rootfs being built into the image is triggered by setting MEASURED_ROOTFS=yes during the build, however on main there is no jobs that set it, so it's never enabled
  • Instead of this we could always build the initramfs into the image so it's capable of enforcing the integrity checking, but only enable it selectively
  • As it is very difficult to calculate the root hash after the event, we can follow the current pattern of calculating it and pulling it into the runtime, but inject it to the config tomls but commented out:
    • We could have a new parameter that results in the kernel_params getting set like debug_console_enabled=true adds agent.debug_console agent.debug_console_vport=1026 to the kernel_params and then a use would just have to uncomment that option to use it
    • Alternatively (and a bit cheaper, but not as nicer user experience), we could just add something like:
    # Add the following line to the kernel_params to enable integrity checking of the guest image 
    # rootfs_verity.scheme=dm-verity cc_rootfs_verity.hash=${root_hash}
    
    in the line above the kernel_params in the config

\cc @fidencio

@wainersm wainersm added the do-not-merge PR has problems or depends on another label Jul 21, 2023
@katacontainersbot katacontainersbot added size/medium Average sized task and removed size/tiny Smallest and simplest task labels Oct 19, 2023
@wainersm
Copy link
Contributor Author

Rebased and fixed the conflicts. Apparently the logic is still working even with the new cache mechanism.

Leaving it WIP as I plan to address the last @stevenhorsman 's suggestions.

@katacontainersbot katacontainersbot added size/large Task of significant size and removed size/medium Average sized task labels Oct 19, 2023
@wainersm wainersm force-pushed the measured_rootfs-improvements branch 3 times, most recently from 0aa5291 to e8f817a Compare October 24, 2023 17:31
@wainersm
Copy link
Contributor Author

A pod with following annotation should be failing the measurement as I am passing an incorrect hash, but it is not:

apiVersion: v1
kind: Pod
metadata:
   name: busybox-cc
   annotations:
     io.katacontainers.config.hypervisor.kernel_params: "rootfs_verity.scheme=dm-verity rootfs_verity.hash=5180b1568c2ba972e4e06ee0a55976acae8329f2a5d1d2004395635e1ec4a76e"
 spec:
   runtimeClassName:
   containers:
     - name: nginx
       image: nginx
       imagePullPolicy: Always
   imagePullSecrets:
     - name: cococred

Am I passing the right annotation @stevenhorsman ?

@wainersm
Copy link
Contributor Author

A pod with following annotation should be failing the measurement as I am passing an incorrect hash, but it is not:

apiVersion: v1
kind: Pod
metadata:
   name: busybox-cc
   annotations:
     io.katacontainers.config.hypervisor.kernel_params: "rootfs_verity.scheme=dm-verity rootfs_verity.hash=5180b1568c2ba972e4e06ee0a55976acae8329f2a5d1d2004395635e1ec4a76e"
 spec:
   runtimeClassName:
   containers:
     - name: nginx
       image: nginx
       imagePullPolicy: Always
   imagePullSecrets:
     - name: cococred

Am I passing the right annotation @stevenhorsman ?

Just noticed that runtimeClassName is blank....

@stevenhorsman
Copy link
Member

Am I passing the right annotation @stevenhorsman ?

Yes - that looks like it matches the kata-deploy method that adds it when measured_rootfs is enabled at the moment:

root_measure_config="rootfs_verity.scheme=dm-verity rootfs_verity.hash=${root_hash}"

Just noticed that runtimeClassName is blank....

Yeah, that's worth looking at more 😄

@wainersm wainersm force-pushed the measured_rootfs-improvements branch 2 times, most recently from 7ecc725 to 371c8a7 Compare October 26, 2023 13:30
@wainersm wainersm force-pushed the measured_rootfs-improvements branch 2 times, most recently from aad934c to 3ea7dd8 Compare October 27, 2023 16:53
@wainersm wainersm removed the do-not-merge PR has problems or depends on another label Oct 27, 2023
@wainersm
Copy link
Contributor Author

The test that I am adding on this PR passed (https://github.com/kata-containers/kata-containers/actions/runs/6670016832/job/18130811556?pr=7231):

ok 1 Test cannnot launch pod with measured boot enabled and incorrect hash
1..2

Removing the wip as it is ready to review.

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from a couple of minor issue this looks great. Thanks @wainersm!

@@ -538,7 +539,7 @@ install_kata() {
}

main() {
while getopts "a:b:c:deEfg:hH:k:p:t:u:v:x:" opt; do
while getopts "a:b:c:deEfg:hH:mk:p:t:u:v:x:" opt; do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should this be k:mp: to preserve the alphabetically ordering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

local datetime="$3"
local message="$4"

# Note: with image-rs we get more that the default 1000 lines of logs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: more that the -> more than the and apologies as I think that's my typo you've copied!

@stevenhorsman stevenhorsman added the merge-to-main PRs relating to merging CCv0 content to main label Nov 20, 2023
wainersm and others added 14 commits November 28, 2023 11:21
By convention the caller of tools/packaging/kernel/build-kernel.sh changes
the script behavior by passing arguments, whereas, for measured rootfs
it has used an environment variable (MEASURED_ROOTFS). This refactor
the script so that the caller now must pass the "-m" argument to enable
the build of the kernel with measured rootfs support.

Fixes kata-containers#6674
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Moved the measure rootfs logic from kata-deploy-binaries.sh to the
kernel's builder script so that the former get less bloated with
components's specific code.

Fixes kata-containers#6674
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Moved the measure rootfs logic from kata-deploy-binaries.sh to the
shim-v2's builder script so that the former get less bloated with
components's specific code.

Fixes kata-containers#6674
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The KATA_BUILD_CC variable plus the existence (or not) of the initramfs
were used to determine whether to build the kernel for measured rootfs
or not. Currently the variable MEASURED_ROOTFS has been used
to trigger the feature build and when it is activated it should expect
the initramfs exist. In other words, this changed the kernel build
so that if `MEASURED_ROOTFS=yes` then the initramf file must exist and
be found.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
When measured toofs is enabled then the shim-v2 build should find the
guest rootfs hash file, otherwise might (silently) generate configuration
files with empty hash.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Re-wrote the logic of init.sh to follow the rules:

 * the root device MUST exist always because it will be either mounted
   or verified (then mounted)
 * if rootfs verifier is enabled then the hash device MUST exist. Avoid
   the case where dm-verity is set but the hash device does not exist and
   so the verification is silently skipped

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The following functions were copied from CCv0's branch test's
integration/kubernetes/confidential/lib.sh. I did just smalls
refactorings (shortened their names and delinted shellcheck warnings):

- k8s_delete_all_pods_if_any_exists()
- k8s_wait_pod_be_ready()
- k8s_create_pod()
- assert_pod_fail()

Co-authored-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Jordan Jackson <jordan.jackson@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Co-authored-by: Wang, Arron <arron.wang@intel.com>
Copied the new_pod_config() and pod-config.yaml.in from CCv0 branch
tests' integration/kubernetes/confidential/tests_common.sh and fixtures.
Unlike the original version, new_pod_config() now gets the runtimeclass
by parameter as the RUNTIMECLASS environment variable seems not broadly
used on main branch's CI.

The pod-config.yaml.in was changed as the diff shows below. In
particular the imagePullSecrets was removed to avoid it throwing a
warning on the pod's log.

```
--- a/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in
+++ b/tests/integration/kubernetes/runtimeclass_workloads/pod-config.yaml.in
@@ -5,12 +5,10 @@
 apiVersion: v1
 kind: Pod
 metadata:
-  name: busybox-cc
+  name: test-e2e
 spec:
   runtimeClassName: $RUNTIMECLASS
   containers:
-  - name: nginx
+  - name: test_container
     image: $IMAGE
-    imagePullPolicy: Always
-  imagePullSecrets:
-  - name: cococred
\ No newline at end of file
+    imagePullPolicy: Always
\ No newline at end of file
```

Co-authored-by: Georgina Kinge <georgina.kinge@ibm.com>
Co-authored-by: Megan Wright <Megan.Wright@ibm.com>
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
The new clean-generated-files make target allows for removing the
generated files (including the configuration.toml files).

The tools/packaging/static-build/shim-v2/build.sh script now uses that
target to always force the re-generation of those files.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
This new function allow to the annotations to metadata section in a yaml
configuration file.

Co-authored-by: Ryan Savino <ryan.savino@amd.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Bring the assert_logs_contain() from CCv0 branch tests'
integration/kubernetes/confidential/lib.sh.

Introduced the print_node_journal() which uses `kubectl debug` to print
the systemd's journal of a k8s's node.

Fixes kata-containers#7590
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Bring the setup_common() from CCv0 branch test's
integration/kubernetes/confidential/tests_common.sh. It should be used
to reduce boilerplates on the setup() of the tests.

Unlike the original code, this won't export the `test_start_time` variable
as it wouldn't be accurate to grab logs from the worker nodes due
date/time mismatch between the running tests machine and the worker
node. The function export the `node` variable which holds the name of
a random node which has kata installed. Apart from that, it exports the
`node_start_time` which capture the date/time when the test started,
relative to the `node`.

Tests that should inspect the logs can schedule pods/resources to the `node`
and use `node_start_time` as the value reference to grep the logs.

Fixes kata-containers#7590
Co-authored-by: stevenhorsman <steven@uk.ibm.com>
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Use this new function to set the node where the pod should be scheduled
to.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Implements the following test case:

  Scenario: Check incorrect hash fails
  **Given** I have a version of kata installed that has a kernel with the
  initramfs built and config with rootfs_verity.scheme=dm-verity
  rootfs_verity.hash=<incorrect hash of rootfs> set in the kernel_params
  **When** I try and create a container a basic pod
  **Then** The pod is doesn't run
  **And**  Ideally we'd get a helpful message to indicate why

Currently on CI only qemu-tdx is built with measured
rootfs support in the kernel, so the test is restriced to that
runtimeclass.

Fixes kata-containers#7415
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
@wainersm
Copy link
Contributor Author

Rebased this PR to solve a small conflict and addressed the last comments of @stevenhorsman

@wainersm
Copy link
Contributor Author

Hi @arronwy @fidencio ! Can I ask you a special attention to this PR on the next days? I explain why.

@ChengyuZhu6 's #8484 is pre-req to a lot of upcoming "merge to main" code. @stevenhorsman plans to pull some tests from CCv0 to test that basic pull mechanism and he will need to migrate some (if not all!) of the common test methods that I have on this PR. Steve's work will be based on this PR ,therefore, it will be important to merge it asap to avoid blocking him (Free Steve!)

@fidencio
Copy link
Member

/test

Copy link
Member

@fidencio fidencio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks @wainersm!

@fidencio
Copy link
Member

/test-s390x

@fidencio
Copy link
Member

fidencio commented Dec 5, 2023

/retest-s390x

@fidencio fidencio merged commit d149b9f into kata-containers:main Dec 5, 2023
165 of 172 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-to-main PRs relating to merging CCv0 content to main ok-to-test size/large Task of significant size
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants