Produce confidential workload images #4960

nalind · 2023-08-08T14:32:17Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

This introduces the ability to replace a container image's layered rootfs with a LUKS-encrypted image containing an ext4 filesystem. That filesystem holds the contents that would normally have been in the rootfs, plus configuration files and data that describe it as a confidential workload. If the build is given the URL of an attestation server, the workload's ID (autogenerated if not specified) is registered with it, along with the passphrase (random if not specified) that was used to encrypt the image.

How to verify it

If you have access to the hardware, use buildah mkcw, or the --cw option of buildah build or buildah commit, to produce an image, then run it using podman run --runtime krun. The "sev" type requires an AMD EPYC 1000-series or later processor; the "snp" type requires an AMD EPYC 3000-series or later processor.
For the rest of us, unit tests and integration tests that can only look at the output image!

Which issue(s) this PR fixes:

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Introduces the `buildah mkcw` command, and adds a `--cw` flag for `buildah build` and `buildah commit`.

rhatdan · 2023-08-22T13:17:10Z

cmd/buildah/mkcw.go

+	flags := mkcwCommand.Flags()
+	flags.SetInterspersed(false)
+
+	flags.StringVarP(&teeType, "type", "t", "", "TEE type")


Can you spell out what TEE stands for?

Sure, adding it.

rhatdan · 2023-08-22T13:17:55Z

cmd/buildah/mkcw.go

+
+	flags.StringVarP(&teeType, "type", "t", "", "TEE type")
+	flags.StringVarP(&options.AttestationURL, "attestation-url", "u", "", "attestation server URL")
+	flags.StringVarP(&options.AttestationURL, "attestation_url", "", "", "attestation server URL (alternate flag spelling)")


I found myself having to go back and forth between passing options as flags to mkcw, where the convention is "-", and as options to the cw flag, where the convention is "_". I can remove these.
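
For illustration only, a minimal sketch of how the two spellings could be reconciled by normalizing names rather than registering each flag twice. `normalizeFlagName` is a hypothetical helper, not code from this PR; pflag exposes a comparable hook through `FlagSet.SetNormalizeFunc`.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeFlagName is a hypothetical helper: it maps the "_" spelling used
// inside --cw option lists onto the "-" spelling used for mkcw flags.
func normalizeFlagName(name string) string {
	return strings.ReplaceAll(name, "_", "-")
}

func main() {
	for _, name := range []string{"attestation_url", "attestation-url", "workload_id"} {
		fmt.Printf("%s -> %s\n", name, normalizeFlagName(name))
	}
}
```

With a normalizer in place, only the dashed name would need to be registered, and no hidden alternates would be required.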

rhatdan · 2023-08-22T13:18:25Z

cmd/buildah/mkcw.go

+		panic("error marking attestation_url as hidden")
+	}
+	flags.StringVarP(&options.BaseImage, "base-image", "b", "", "alternate base image (default: scratch)")
+	flags.StringVarP(&options.BaseImage, "base_image", "", "", "alternate base image (default: scratch) (alternate flag spelling)")


It could come in handy for troubleshooting, and it's basically free. Can be removed if it offends.

rhatdan · 2023-08-22T13:18:54Z

cmd/buildah/mkcw.go

+		panic("error marking base_image as hidden")
+	}
+	flags.StringVarP(&options.DiskEncryptionPassphrase, "encryption-passphrase", "p", "", "disk encryption passphrase")
+	flags.StringVarP(&options.DiskEncryptionPassphrase, "encryption_passphrase", "", "", "disk encryption passphrase (alternate flag spelling)")


Why not just "passphrase"?

Changing it, though it's still not an option I want to encourage people to use.

rhatdan · 2023-08-22T13:19:12Z

cmd/buildah/mkcw.go

+	flags.IntVarP(&options.CPUs, "cpus", "c", 0, "number of CPUs to expect")
+	flags.IntVarP(&options.Memory, "memory", "m", 0, "amount of memory to expect (MB)")
+	flags.StringVarP(&options.WorkloadID, "workload-id", "w", "", "workload ID")
+	flags.StringVarP(&options.WorkloadID, "workload_id", "", "", "workload ID (alternate flag spelling)")


rhatdan · 2023-08-22T13:19:49Z

cmd/buildah/mkcw.go

+	}
+	flags.StringVarP(&options.Slop, "slop", "s", "25%", "extra space needed for converting a container rootfs to a disk image")
+	flags.StringVarP(&options.FirmwareLibrary, "firmware-library", "f", "", "location of libkrunfw-sev.so")
+	flags.StringVarP(&options.FirmwareLibrary, "firmware_library", "", "", "location of libkrunfw-sev.so (alternate flag spelling)")


It's gone now.

rhatdan · 2023-08-22T13:20:18Z

cmd/buildah/mkcw.go

+		panic("error marking firmware_library as hidden")
+	}
+	flags.BoolVarP(&options.IgnoreAttestationErrors, "ignore-attestation-errors", "", false, "ignore attestation errors")
+	flags.BoolVarP(&options.IgnoreAttestationErrors, "ignore_attestation_errors", "", false, "ignore attestation errors (alternate flag spelling)")


Can we just do something shorter like --ignore?

Feels like it would invite the question "ignore what?" Keep in mind that if we don't manage to register the workload with an attestation server, the init process running in the VM can't ask the attestation server for the disk encryption passphrase. So while an image that wasn't registered is still good enough to let our tests check that we formatted things correctly in a number of ways, the resulting image isn't usable beyond that.

rhatdan · 2023-08-22T13:20:58Z

cmd/buildah/mkcw.go

+		panic("error marking ignore_attestation_errors as hidden")
+	}
+	flags.BoolVarP(&options.IgnoreChainRetrievalErrors, "ignore-chain-retrieval-errors", "", false, "ignore errors retrieving the certificate chain")
+	flags.BoolVarP(&options.IgnoreChainRetrievalErrors, "ignore_chain_retrieval_errors", "", false, "ignore errors retrieving the certificate chain (alternate flag spelling)")


When would I want one versus the other? ignore_attestation_errors versus ignore_chain_retrieval_errors?

It's mainly about separating local errors (missing tools, missing firmware, permissions) from remote errors (server down, authentication errors). I guess we can just call them all attestation errors.

Who would use these options? If the mkcw command is not going to succeed, why would I want to ignore errors? And why would I need two flags to ignore specific errors?

Maybe it'd be undocumented, but we depend on it in order to be able to verify that the image looks like we expect it to look in our tests, to the extent that we can when we don't have the hardware.

Should we hide the option then?

rhatdan · 2023-08-22T13:22:27Z

convertcw.go

+			logrus.Warnf("unmounting target container: %v", err)
+		}
+	}()
+	if err := os.Mkdir(filepath.Join(targetDir, "tmp"), os.ModeSticky|0o777); err != nil && !errors.Is(err, os.ErrExist) {


Does this have to be 777?

That's what's usually set for /tmp. Why would this image's /tmp be different in some way?

OK, this is inside the image. I wanted to make sure it was not on disk somewhere.

You're right about one thing - we don't need to be doing this here. We already insert the directory with the right permissions into the tarball stream that we're generating and extracting into the working container, so the Mkdir() here is redundant.

rhatdan · 2023-08-22T13:33:03Z

convertcw.go

+	}
+	defer func() {
+		if err := source.Delete(); err != nil {
+			logrus.Warnf("deleting source container: %v", err)


Should we have a standard for Warnf messages? Should they be capitalized or not? This varies throughout the code.

That's going to require PRs for multiple repositories.

rhatdan · 2023-08-22T13:33:55Z

convertcw.go

+		return "", nil, "", fmt.Errorf("generating encrypted image content: %w", err)
+	}
+	if err = archive.Untar(rc, targetDir, &archive.TarOptions{}); err != nil {
+		if err = rc.Close(); err != nil {


Why isn't this a defer?

We want to be able to check its result before we create an image with everything we've read, in case it returns an error that it couldn't earlier.

flouthoc

PR LGTM other than small comments above

giuseppe

LGTM

openshift-ci · 2023-09-05T08:58:26Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, nalind, rhatdan

Needs approval from an approver in each of these files:

~~OWNERS~~ [giuseppe,nalind,rhatdan]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

TomSweeneyRedHat · 2023-09-05T22:31:28Z

docs/buildah-build.1.md

+**--cw** *options*
+
+Produce an image suitable for use as a confidential workload running in a
+trusted execution environment (TEE) using krun.  Instead of the conventional


can we add a man page reference to krun at the bottom of this page?

The package that provides /usr/bin/krun doesn't provide a man page for it for us to reference. Is there an alternate one you'd suggest?

TomSweeneyRedHat · 2023-09-05T22:33:40Z

docs/buildah-commit.1.md

+**--cw** *options*
+
+Produce an image suitable for use as a confidential workload running in a
+trusted execution environment (TEE) using krun.  Instead of the conventional


ditto prior krun comment

TomSweeneyRedHat · 2023-09-05T22:34:55Z

docs/buildah-mkcw.1.md

+If a value is specified, the new image's workload ID, along with the passphrase
+used to encrypt the disk image, will be registered with the server, and the
+server's location will be stored in the container image.
+At run-time, krun is expected to contact the server to retrieve the passphrase


ditto krun comment

TomSweeneyRedHat · 2023-09-05T22:35:26Z

docs/buildah-mkcw.1.md

+buildah\-mkcw - Convert a conventional container image into a confidential workload image.
+
+## SYNOPSIS
+**buildah mkcw** [*options*] *source* *destination*


maybe "mkcwi" instead?

Appending an "i" seems to have been the original approach for when the verb was also applicable to containers. I'd group this with pull/push, which have no corresponding commands which operate on containers.

rhatdan · 2023-09-07T11:15:21Z

/lgtm
Let's work on the krun man page. Thoughts, @slp?
Great work Nalin

TomSweeneyRedHat · 2023-09-07T14:56:00Z

Well, that explains why I couldn't find a krun page anywhere, but I was hoping I was just doing a bad search. If we could get one up somewhere to reference, it would be good. I looked for "krun" and only found travel guides to "Krün" in the state of Bavaria in Germany, and a Country AM Radio station in Texas. Not much technical help.

nalind · 2023-09-07T15:11:10Z

Rebased.

nalind · 2023-09-07T15:13:10Z

krun is crun being told to use libkrun.

TomSweeneyRedHat · 2023-09-07T17:35:21Z

@nalind It might be nice to have a parenthetical explanation like that in the man pages if there isn't a man page to point to.

Add a --cw option to `buildah build` and `buildah commit`, which takes a comma-separated list of arguments and produces an image laid out for use as a confidential workload:
  type: sev or snp
  attestation_url: location of a key broker server
  cpus: expected number of virtual CPUs to run with
  memory: expected megabytes of memory to run with
  workload_id: a distinguishing identifier for the key broker server
  ignore_attestation_errors: ignore errors registering the workload
  passphrase: for encrypting the disk image
  slop: extra space to allocate for the disk image

At least one of attestation_url and passphrase must be specified in order for the encrypted disk image to be decryptable at run-time. Other arguments can be omitted.

ignore_attestation_errors is intentionally undocumented, as it's mainly used to permit some amount of testing on systems which don't have the required hardware.

Add an `mkcw` top-level command, for converting directly from an image to a confidential workload.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
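
To make the argument list in this commit message concrete, here is a hedged sketch of a parser for it. It assumes each argument is written as key=value, which the commit message does not spell out, and `parseCW` and `knownKeys` are hypothetical names rather than the PR's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// knownKeys lists the argument names given in the commit message.
var knownKeys = map[string]bool{
	"type": true, "attestation_url": true, "cpus": true, "memory": true,
	"workload_id": true, "ignore_attestation_errors": true,
	"passphrase": true, "slop": true,
}

// parseCW is a hypothetical parser; it assumes each argument is written as
// key=value, with a bare key (no "=") meaning an empty value.
func parseCW(arg string) (map[string]string, error) {
	opts := make(map[string]string)
	for _, field := range strings.Split(arg, ",") {
		key, value, _ := strings.Cut(field, "=")
		if !knownKeys[key] {
			return nil, fmt.Errorf("unknown --cw argument %q", key)
		}
		opts[key] = value
	}
	// Without a server to ask or a known passphrase, the disk image
	// could not be decrypted at run-time.
	if opts["attestation_url"] == "" && opts["passphrase"] == "" {
		return nil, fmt.Errorf("at least one of attestation_url and passphrase is required")
	}
	return opts, nil
}

func main() {
	fmt.Println(parseCW("type=snp,attestation_url=https://kbs.example,cpus=2"))
	fmt.Println(parseCW("type=sev,cpus=2"))
}
```

The URL in the example is a placeholder, not a real key broker server.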

Add docs for the new --cw option recognized by both `commit` and `build`, and the new `mkcw` command. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>

nalind · 2023-09-07T18:06:28Z

Rebased and expanded the text in the man pages to mention that krun is crun built with the libkrun feature enabled and invoked as krun, and to mention that the image size is increased to 10 megabytes if the specified or estimated size is less than that.
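
The size rule mentioned here can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: it handles only a percentage-style slop such as the "25%" default, treats a megabyte as 1024*1024 bytes, and `imageSize` and `minImageSize` are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minImageSize is the assumed floor: "10 megabytes" taken as 10 MiB.
const minImageSize = 10 * 1024 * 1024

// imageSize is a hypothetical calculation: it pads the estimated rootfs size
// by a percentage such as "25%" and enforces the 10-megabyte floor.
func imageSize(estimated int64, slop string) (int64, error) {
	pct, err := strconv.ParseInt(strings.TrimSuffix(slop, "%"), 10, 64)
	if err != nil || pct < 0 {
		return 0, fmt.Errorf("parsing slop %q as a percentage", slop)
	}
	size := estimated + estimated*pct/100
	if size < minImageSize {
		size = minImageSize
	}
	return size, nil
}

func main() {
	for _, estimated := range []int64{1000, 100 << 20} {
		size, err := imageSize(estimated, "25%")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("estimated %d bytes -> image of %d bytes\n", estimated, size)
	}
}
```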

docs/buildah-build.1.md

TomSweeneyRedHat · 2023-09-07T18:50:10Z

LGTM

rhatdan · 2023-09-07T20:44:33Z

/lgtm

openshift-ci bot added do-not-merge/work-in-progress, kind/feature, and approved labels Aug 8, 2023

nalind force-pushed the mkcw branch 14 times, most recently from 9de06a3 to 5bd630a Compare August 14, 2023 21:26

nalind force-pushed the mkcw branch 2 times, most recently from 6550c17 to 78e55b4 Compare August 21, 2023 12:00

rhatdan reviewed Aug 22, 2023


flouthoc reviewed Sep 4, 2023


giuseppe approved these changes Sep 5, 2023


TomSweeneyRedHat reviewed Sep 5, 2023


nalind force-pushed the mkcw branch from f6582e9 to 921e4fb Compare September 6, 2023 19:39

openshift-ci bot assigned rhatdan Sep 7, 2023

openshift-ci bot added lgtm and removed lgtm labels Sep 7, 2023

nalind force-pushed the mkcw branch from b8606b3 to 9f8e63b Compare September 7, 2023 15:11

nalind force-pushed the mkcw branch from 9f8e63b to 5d9e6e1 Compare September 7, 2023 18:04

nalind added 2 commits September 7, 2023 14:05

Add some docs for build --cw, commit --cw, and mkcw

4f3abf9


nalind force-pushed the mkcw branch from 5d9e6e1 to 4f3abf9 Compare September 7, 2023 18:05

TomSweeneyRedHat reviewed Sep 7, 2023



openshift-ci bot added the lgtm label Sep 7, 2023

openshift-merge-robot merged commit 0cbe852 into containers:main Sep 7, 2023
36 checks passed

nalind deleted the mkcw branch September 7, 2023 20:48

github-actions bot added the locked - please file new issue/PR label Dec 7, 2023

github-actions bot locked as resolved and limited conversation to collaborators Dec 7, 2023
