New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Produce confidential workload images #4960
Conversation
9de06a3
to
5bd630a
Compare
6550c17
to
78e55b4
Compare
cmd/buildah/mkcw.go
Outdated
flags := mkcwCommand.Flags() | ||
flags.SetInterspersed(false) | ||
|
||
flags.StringVarP(&teeType, "type", "t", "", "TEE type") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you spell out what TEE stands for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, adding it.
cmd/buildah/mkcw.go
Outdated
|
||
flags.StringVarP(&teeType, "type", "t", "", "TEE type") | ||
flags.StringVarP(&options.AttestationURL, "attestation-url", "u", "", "attestation server URL") | ||
flags.StringVarP(&options.AttestationURL, "attestation_url", "", "", "attestation server URL (alternate flag spelling)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found myself having to go back and forth between passing options as flags to mkcw
, where the convention is "-", and as options to the cw
flag, where the convention is "_". I can remove these.
cmd/buildah/mkcw.go
Outdated
panic("error marking attestation_url as hidden") | ||
} | ||
flags.StringVarP(&options.BaseImage, "base-image", "b", "", "alternate base image (default: scratch)") | ||
flags.StringVarP(&options.BaseImage, "base_image", "", "", "alternate base image (default: scratch) (alternate flag spelling)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could come in handy for troubleshooting, and it's basically free. Can be removed if it offends.
cmd/buildah/mkcw.go
Outdated
panic("error marking base_image as hidden") | ||
} | ||
flags.StringVarP(&options.DiskEncryptionPassphrase, "encryption-passphrase", "p", "", "disk encryption passphrase") | ||
flags.StringVarP(&options.DiskEncryptionPassphrase, "encryption_passphrase", "", "", "disk encryption passphrase (alternate flag spelling)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just "passphrase"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing it, though it's still not an option I want to encourage people to use.
cmd/buildah/mkcw.go
Outdated
flags.IntVarP(&options.CPUs, "cpus", "c", 0, "number of CPUs to expect") | ||
flags.IntVarP(&options.Memory, "memory", "m", 0, "amount of memory to expect (MB)") | ||
flags.StringVarP(&options.WorkloadID, "workload-id", "w", "", "workload ID") | ||
flags.StringVarP(&options.WorkloadID, "workload_id", "", "", "workload ID (alternate flag spelling)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
cmd/buildah/mkcw.go
Outdated
} | ||
flags.StringVarP(&options.Slop, "slop", "s", "25%", "extra space needed for converting a container rootfs to a disk image") | ||
flags.StringVarP(&options.FirmwareLibrary, "firmware-library", "f", "", "location of libkrunfw-sev.so") | ||
flags.StringVarP(&options.FirmwareLibrary, "firmware_library", "", "", "location of libkrunfw-sev.so (alternate flag spelling)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's gone now.
cmd/buildah/mkcw.go
Outdated
panic("error marking firmware_library as hidden") | ||
} | ||
flags.BoolVarP(&options.IgnoreAttestationErrors, "ignore-attestation-errors", "", false, "ignore attestation errors") | ||
flags.BoolVarP(&options.IgnoreAttestationErrors, "ignore_attestation_errors", "", false, "ignore attestation errors (alternate flag spelling)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just do something shorter like --ignore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like it would invite the question "ignore what?" Keep in mind that if we don't manage to register the workload with an attestation server, the init process running in the VM can't ask the attestation server for the disk encryption passphrase. So while an image that wasn't registered is still good enough to let our tests check that we formatted things correctly in a number of ways, the resulting image isn't usable beyond that.
cmd/buildah/mkcw.go
Outdated
panic("error marking ignore_attestation_errors as hidden") | ||
} | ||
flags.BoolVarP(&options.IgnoreChainRetrievalErrors, "ignore-chain-retrieval-errors", "", false, "ignore errors retrieving the certificate chain") | ||
flags.BoolVarP(&options.IgnoreChainRetrievalErrors, "ignore_chain_retrieval_errors", "", false, "ignore errors retrieving the certificate chain (alternate flag spelling)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would I want one versus the other? ignore_attestation_errors versus ignore_chain_retrieval_errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mainly about separating local errors (missing tools, missing firmware, permissions) from remote errors (server down, authentication errors). I guess we can just call them all attestation errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who would use these options? If the mkcw command is not going to succeed, why would I want to ignore errors? And why would I need to flags to ignore specific errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it'd be undocumented, but we depend on it in order to be able to verify that the image looks like we expect it to look in our tests, to the extent that we can when we don't have the hardware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we hide the option then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
convertcw.go
Outdated
logrus.Warnf("unmounting target container: %v", err) | ||
} | ||
}() | ||
if err := os.Mkdir(filepath.Join(targetDir, "tmp"), os.ModeSticky|0o777); err != nil && !errors.Is(err, os.ErrExist) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have to be 777?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what's usually set for /tmp
. Why would this image's /tmp
be different in some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok this is in side of the image, I wanted to make sure it was not on disk somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about one thing - we don't need to be doing this here. We already insert the directory with the right permissions into the tarball stream that we're generating and extracting into the working container, so the Mkdir()
here is redundant.
} | ||
defer func() { | ||
if err := source.Delete(); err != nil { | ||
logrus.Warnf("deleting source container: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have a standard on Warnf, should they be capitalized or not? This varies throughout the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's going to require PRs for multiple repositories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup
return "", nil, "", fmt.Errorf("generating encrypted image content: %w", err) | ||
} | ||
if err = archive.Untar(rc, targetDir, &archive.TarOptions{}); err != nil { | ||
if err = rc.Close(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this a defer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to be able to check its result before we create an image with everything we've read, in case it returns an error that it couldn't earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR LGTM other than small comments above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: giuseppe, nalind, rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
docs/buildah-build.1.md
Outdated
**--cw** *options* | ||
|
||
Produce an image suitable for use as a confidential workload running in a | ||
trusted execution environment (TEE) using krun. Instead of the conventional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a man page reference to krun at the bottom of this page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The package that provides /usr/bin/krun
doesn't provide a man page for it for us to reference. Is there an alternate one you'd suggest?
docs/buildah-commit.1.md
Outdated
**--cw** *options* | ||
|
||
Produce an image suitable for use as a confidential workload running in a | ||
trusted execution environment (TEE) using krun. Instead of the conventional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto prior krun comment
If a value is specified, the new image's workload ID, along with the passphrase | ||
used to encrypt the disk image, will be registered with the server, and the | ||
server's location will be stored in the container image. | ||
At run-time, krun is expected to contact the server to retrieve the passphrase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto krun comment
buildah\-mkcw - Convert a conventional container image into a confidential workload image. | ||
|
||
## SYNOPSIS | ||
**buildah mkcw** [*options*] *source* *destination* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe "mkcwi" instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appending an "i" seems to have been the original approach for when the verb was also applicable to containers. I'd group this with pull/push, which have no corresponding commands which operate on containers.
/lgtm |
Well, that explains why I couldn't find a krun page anywhere, but I was hoping I was just doing a bad search. If we could get one up somewhere to reference, it would be good. I looked for "krun" and only found travel guides to "Krün" in the state of Bavaria in Germany, and a Country AM Radio station in Texas. Not much technical help. |
Rebased. |
@nalind It might be nice to have a parenthesis explanation like that in the man pages if there isn't a man page to point to. |
Add a --cw option to `buildah build` and `buildah commit`, which takes a comma-separated list of arguments and produces an image laid out for use as a confidential workload: type: sev or snp attestation_url: location of a key broker server cpus: expected number of virtual CPUs to run with memory: expected megabytes of memory to run with workload_id: a distinguishing identifier for the key broker server ignore_attestation_errors: ignore errors registering the workload passphrase: for encrypting the disk image slop: extra space to allocate for the disk image At least one of attestation_url and passphrase must be specified in order for the encrypted disk image to be decryptable at run-time. Other arguments can be omitted. ignore_attestation_errors is intentionally undocumented, as it's mainly used to permit some amount of testing on systems which don't have the required hardware. Add an `mkcw` top-level command, for converting directly from an image to a confidential workload. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Add docs for the new --cw option recognized by both `commit` and `build`, and the new `mkcw` command. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Rebased and expanded the text in the man pages to mention that krun is crun built with the libkrun feature enabled and invoked as krun, and to mention that the image size is increased to 10 megabytes if the specified or estimated size is less than that. |
LGTM |
/lgtm |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This introduces the ability to replace a container image's layered rootfs with a LUKS-encrypted image containing an ext4 filesystem which includes the contents which would normally have been in the rootfs, plus configuration files and data that describe it as a confidential workload. If the build is provided with the URL of an attestation server, the workload's ID (autogenerated, if not specified) will be registered with it, along with the random (if not specified) passphrase which was used to encrypt the image.
How to verify it
If you have access to the hardware, use
buildah mkcw
or the--cw
buildah build
orbuildah commit
options to produce an image, then run it usingpodman run --runtime krun
. The "sev" type requires an AMD EPYC 1000-series or later processor, the "snp" type requires an AMD EPYC 3000-series or later processor.For the rest of us, unit tests and integration tests that can only look at the output image!
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?