[RFC] Attestation Agent Proposal #254
Comments
@jimcadden @rudyjantz Generally, I think we are on the same page here, but let me make a few notes. First, a bit more detail on the SEV and SEV-ES cases, which we plan to support for v0. @dubek has been working on extending the SEV launch measurement to include the kernel, kernel params, and initrd. By default the SEV launch measurement only includes the firmware. By extending the measurement to the initrd et al, we can verify the kata agent. Patches for this are on the list here and here. SEV and SEV-ES use pre-launch attestation meaning that during the boot process the launch measurement can be queried and a secret can be securely injected to a guest physical address. @dubek is also working on a kernel module that allows us to read this secret from userspace. This can be found here. Now, for a slightly higher-level view, the way I like to think about the attestation agent is that it provides userspace key wrapping and unwrapping inside an attested guest by communicating with a trusted third party over a secure channel. For SEV-SNP (I can elaborate on this in a separate issue) the AA sets up the secure channel from inside the guest and drives the attestation. With SEV(-ES) the secure channel is provided by the firmware (and the PSP) and the AA doesn't need to do anything to trigger the attestation. By the time the guest has booted, the attestation will have already been performed and a secret will be available via the securityfs if the measurement was correct. For both SEV(-ES) and SEV-SNP the boot measurement is verified by a trusted third party that conditionally provides keys. We have been referring to this third party as the Guest Owner Proxy (GOP). You call it the KBS. The difference between SEV(-ES) and SEV-SNP attestation isn't so much that one of them is local and the other remote, but that the secure channel is setup differently and the attestation is triggered differently. 
In the SEV(-ES) case we will need some extra tooling on the host (an extension of the Kata Runtime) that will trigger the measurement and send it to the GOP/KBS. Here is a rough diagram we have been using internally. This shows the SEV(-ES) case where the measurement comes from the PSP, through the HV and the Kata Runtime, to the GOP/KBS. After verifying the measurement the GOP/KBS injects a secret that makes its way to the AA and ocicrypt-rs. For SEV-SNP the AA will talk directly with the GOP/KBS. |
As for the two options, we have been assuming that the Attestation Agent would support all applicable platforms, but that the GOP/KBS would be platform specific. There will likely be a GOP for Intel and one for AMD that, while similar, probably won't be exactly the same. It could make sense to take a modular approach to the Attestation Agent where the AA has a generic interface to modules that know how to communicate with corresponding GOP. On the other hand, if there are only two protocols to support, it might be sufficient to implement each one in the AA itself. If someone wants to add another they can patch the AA. |
Thanks, @jiazhang0! Your RFC aligns nicely with what we have been discussing here at IBM. I support "Option 1: KBS modularization" as the better choice for the architecture-specific KBS. A few initial questions/comments about your design:
@fitzthum Thanks for the info from your notes. I'm familiar with SEV/SEV-ES attestation, but not with SEV-SNP attestation. So I have an initial question about it: it sounds like SEV-SNP not only supports runtime/dynamic attestation as TDX does, but also supports pre-launch attestation (aka pre-attestation) to verify the boot measurements. Are you planning to load the kernel, kernel parameters and initrd during pre-launch attestation for SEV-SNP, as for SEV/SEV-ES in kata CC v0?
Theoretically speaking, each encrypted layer can contain multiple PLBCOs in the form of annotations corresponding to 1) different KeyIDs provided by different tenants, or 2) multiple KeyIDs provided by a single tenant. So whether a single PLBCO or multiple PLBCOs are used for each encrypted container image in a pod is decided by the image creator. Therefore, an encrypted PLBCO with all necessary info (e.g., KeyID and any other platform specific artifacts) needs to be wrapped into a KBC specific annotation packet. I believe the association info you mentioned should be recorded in the KBC specific annotation packet. However, I'm not sure why an association between the container being deployed and the KeyID is necessary. Could you clarify the details?
I'm not exactly sure what your question is asking for. About integrity protection, the newly added annotation ID "org.opencontainers.image.enc.pubopts" contains a field called "hmac" used to verify the integrity of the encrypted layer blob data. And the conventional field "digest" defined in the layer descriptor is used to verify the integrity of the plain/decrypted layer blob data. Both mechanisms are used automatically during image decryption, so it looks like AA doesn't need to do anything about integrity protection, unless I'm missing something important.
Makes sense.
Gerry will provide the details in this thread about the work on the implementation of pullImage() (etc.) in kata-agent.
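A minimal sketch of the two checks described above, assuming HMAC-SHA256 over the encrypted blob and a "sha256:<hex>" descriptor digest over the decrypted blob. The function names and the choice of HMAC key are hypothetical and only illustrate the order of the checks, not how ocicrypt implements them.

```python
import hashlib
import hmac


def verify_encrypted_layer(ciphertext: bytes, hmac_key: bytes, expected_hmac: bytes) -> bool:
    """Check the "hmac" value from the public options against the encrypted blob."""
    actual = hmac.new(hmac_key, ciphertext, hashlib.sha256).digest()
    return hmac.compare_digest(actual, expected_hmac)


def verify_plain_layer(plaintext: bytes, expected_digest: str) -> bool:
    """Check the layer descriptor "digest" against the decrypted blob."""
    algo, _, hex_digest = expected_digest.partition(":")
    if algo != "sha256":  # assumption: only sha256 is handled in this sketch
        return False
    return hmac.compare_digest(hashlib.sha256(plaintext).hexdigest(), hex_digest)


# Hypothetical layer data, for illustration only.
key = b"k" * 32
encrypted = b"encrypted-layer-bytes"
plain = b"plain-layer-bytes"
tag = hmac.new(key, encrypted, hashlib.sha256).digest()
digest = "sha256:" + hashlib.sha256(plain).hexdigest()
```

Both checks happen inside the decryption path, which is why the AA itself has nothing extra to do here.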
Agreed, and I believe there would be more modules with different protocols to communicate with corresponding GOP/KBS. |
SEV-SNP attestation is completely different from SEV(-ES) except for one key detail; both provide the launch measurement as part of the attestation. This means that our OVMF/QEMU patches that extend the launch measurement will be useful for both SEV(-ES) and SEV-SNP. In both cases the launch measurement is calculated only once based on the initial state of the VM. With SEV(-ES) the launch measurement must be queried by the guest owner during launch, but with SEV-SNP, it can be retrieved anytime from within the guest as part of an attestation report. This attestation report includes a few other things (versioning info for firmware, 512 bits of arbitrary guest-provided data, etc.). The attestation agent will get this attestation report from the PSP and send it to the GOP/KBS. I think this is fairly similar to TDX. SEV-SNP does not support pre-attestation in the same way that SEV(-ES) does. One thing I was trying to get across in the previous post is that pre-attestation and SNP attestation both provide the launch measurement to the GOP/KBS in exchange for a key. The mechanism for doing so is totally different. |
@fitzthum Fairly clear. Thanks! |
Yes, I agree. Thank you for clarifying.
The scheme you describe provides a mostly complete solution for integrity. One extension to this model would be to verify the decrypted PLBCO (e.g., with a digest provided by the KSM) prior to decrypting the layer. That said, this seems like it can be contained within the implementation of a KBC.
Great! We should discuss how we can collaborate on these components, and what else needs attention for V0. |
One question that came up is how the AA/KBS design could prevent unpermitted containers from being deployed? For example, a situation where an untrusted cloud provider attempts to deploy malicious containers alongside the trusted ones. Since the |
This design of AA and KBS is a bit different from what I thought/proposed earlier. In this design, each layer LEK has to be sent to the KBS for it to decrypt. The advantage is the KEK will not leave KMS/KBS. However, it will cause multiple trips to the KBS/KMS for LEK decryption if the container has many layers with different LEK. In most cases, the KEK is shared among layers to protect the LEK and that is the purpose of having LEK for each layer. In the architecture I proposed earlier, the AA will send the keyID of the KEK to KBS and KBS releases the wrapped KEK to AA (AA needs to generate a pub/priv so that the KBS can wrap the KEK using AA's pub key for protection). Once the AA gets the KEK, it can locally unwrap the LEK and feed to ocicrypt. The AA is running inside the TEE so the KEK is protected after being transferred to AA. the KEK transmission from KBS to AA is protected with the AA pub/priv key pair. So the KEK can be safely protected on the way. The advantage of passing the KEK to AA is that the KEK is usually shared among multiple layers and once AA retrieves the KEK, it can cache it to avoid multiple round trips to KBS to decrypt each LEK. |
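A minimal sketch of the caching argument above, assuming a hypothetical fetch function that stands in for one round trip to the KBS. The class and function names are illustrative and are not part of any real AA or KBS API.

```python
from typing import Callable, Dict


class CachingKeyClient:
    """Fetches each KEK from the KBS once and serves later requests from a cache."""

    def __init__(self, fetch_kek: Callable[[str], bytes]) -> None:
        self._fetch_kek = fetch_kek          # hypothetical: one call == one KBS round trip
        self._cache: Dict[str, bytes] = {}
        self.round_trips = 0

    def get_kek(self, key_id: str) -> bytes:
        if key_id not in self._cache:
            self.round_trips += 1
            self._cache[key_id] = self._fetch_kek(key_id)
        return self._cache[key_id]


def fake_kbs(key_id: str) -> bytes:
    # Stand-in for the KBS; a real one would release the KEK only after attestation.
    return ("kek-for-" + key_id).encode()


client = CachingKeyClient(fake_kbs)
layer_key_ids = ["kek-1", "kek-1", "kek-1", "kek-2"]  # four layers, two distinct KEKs
keks = [client.get_kek(key_id) for key_id in layer_key_ids]
```

With four layers sharing two KEKs, the client contacts the stand-in KBS twice rather than once per layer.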
The ideal way is for the kata agent to measure what containers are to be loaded. When kata agent calls into AA to retrieve KEK to decrypt the encrypted container, the KBS receives the request and sends attestation challenge. so the TEE attestation service can verify the entire stack before releasing the KEK. This requires the measurement to be extended all the way to kata agent and the containers. Intel TDX supports it. AMD-SEV/ES may not, but AMD-SNP may. |
Thanks for pointing this out @hdxia. For SEV(-ES) there is no persistent secure connection between the GOP/KBS and the AA. Our plan is to use SEV secret injection to provision a set of keys at boot (given a valid launch measurement). The AA can then use the keys to unwrap the LEKs. This may be what @jiazhang0 is referring to as local attestation, but I am a little unclear on the terminology still. It would be possible to use secret injection to provision a key that the AA could use to setup a persistent secure channel to the GOP/KBS. Then the AA could request individual keys or relay unwrap commands. Injecting the secrets directly at boot seems easier. I think the plan is to do something similar for SEV-SNP. Rather than relaying every unwrap request to the GOP/KBS, the GOP/KBS will probably send over a set of keys once the AA has provided a valid attestation report. I think that usually one of the main goals of remote attestation is to keep the key from traveling, but I don't know if that is our priority here. As you point out, the enclave is trusted. |
For all versions of SEV, the launch measurement will include the firmware, initrd, kernel, and kernel params (once our patches get upstream). The launch measurement is calculated at boot, however, and reflects only the initial state of these elements. Containers that are pulled in by the Kata Agent after the guest has booted will not be reflected in the measurement. That said, since the Kata Agent is measured at boot, we can trust it to measure any containers it pulls in and send the measurement to the GOP/KBS. For SEV(-ES), we don't have a persistent connection to the GOP/KBS, but we can potentially deliver a list of allowed measurements at boot (along with the KEKs). It might be a good idea to explicitly measure every container that is pulled into the enclave, but we do not get that for free with any version of SEV. We would need to add extra functionality to the Kata Agent or whatever service pulls the image in the guest and this would need to be coordinated with the AA. Could we potentially use a signature-based approach instead? |
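A minimal sketch of the allow-list idea above, assuming the list of permitted container measurements is delivered at boot alongside the KEKs. The hash algorithm and function names are hypothetical; nothing here is provided by SEV itself.

```python
import hashlib
from typing import Set


def measure(image_bytes: bytes) -> str:
    """Measure a pulled container image (assumption: SHA-384 over the raw bytes)."""
    return hashlib.sha384(image_bytes).hexdigest()


def is_allowed(image_bytes: bytes, allowed_measurements: Set[str]) -> bool:
    """Return True only if the image's measurement is on the boot-time allow-list."""
    return measure(image_bytes) in allowed_measurements


# Hypothetical allow-list, as it might be provisioned via secret injection.
trusted_image = b"trusted-container-image"
allowed = {measure(trusted_image)}
```

The check only means something because the component performing it (the Kata Agent or a service next to it) is itself covered by the launch measurement.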
Thanks @fitzthum. In addition to security, performance is also a big concern when a container is launched. At the same level of security, we should treat performance as a priority. Or at least, from a design perspective, we should keep it flexible in case some users are paranoid about security.
A signature may be possible, but the CA and the entity that verifies the signature have to be trusted. In this case, both the kata agent and the root CA used to verify the signature have to be measured and attested by the TEE attestation service.
Actually you are talking about unauthorized use of an image. I think image encryption plus remote attestation can partly mitigate this issue, because it prevents others from running an image at will and enforces a remote attestation procedure that produces auditable behavior observed by the KBS. With a strong policy-driven mechanism, the KBS can partly limit unauthorized use.
The term "local attestation" is mentioned in P19 https://docs.google.com/presentation/d/1469nSRFtlHMSDDDWVLj0i21dR9M_3SO76ehQsyVSTUk/edit#slide=id.gdccf80c723_0_1261 , so I use this term. I understand why you are confused :) Obviously the AA for SEV(-ES) just retrieves the injected secret as a result of pre-attestation rather than initiating a so-called local attestation procedure at that moment. In addition, I first heard the term "local attestation" when I investigated Intel SGX attestation, where it means that two SGX enclave instances on a local node attest each other. What is your preference for the naming?
Also, you are sending the LEK from KBS to AA, and there is no difference between sending the KEK and sending the LEK from KBS to AA. If you are worried about the security of the KEK, you also need to worry about the LEK, since anyone who gets hold of the LEK can decrypt your layers. That is why, in my original diagram, we plan to send the KEK rather than the LEK, for performance reasons.
@hdxia Thanks for your comments. It is a reasonable implementation option, and what you propose can be implemented as an ISECL KBS module for AA. Does that make sense to you?
Secret injection follows the verification of a valid launch measurement during the guest launch stage, so the guest is already trusted before it executes its first instruction. Is it necessary to inject a dedicated key used to set up a secure channel between AA and GOP/KBS? Is it possible to generate it at will for wrapping/unwrapping the payload when AA and KBS need to communicate?
I think @fitzthum is not talking about LEK traveling. His points include:
Yup, this seems like the easiest way to do SEV(-ES) support.
We could do this with SEV(-ES), but it's more complex than the above and I'm not sure it has significant benefits.
Yup, this should be fairly similar to the TDX approach except that the SEV-SNP measurement has slightly different properties (meaning that we might need additional support to measure containers). Are we all on the same page here? It seems like we have two categories: pre-attestation, and runtime attestation. At least for v0 seems like we are leaning towards just sending the KEK to the AA once the attestation passes. |
I have no objection to the two categories. And I think the "GetKey" style (sending the KEK to the AA once and caching it) and the "UnwrapKey" style (the KEK never leaves the KMS/KBS and each unwrapKey request is relayed to the KBS) would be platform specific implementations.
@fitzthum @jiazhang0 I will submit the initial code implementation with a sample KBC module using a hardcoded KEK.
@sameo @jimcadden @fitzthum @hdxia The initial PR is at confidential-containers/attestation-agent#2. In addition, the development plan of AA referring to https://github.com/containers/ocicrypt-rs/issues/1 is at https://github.com/containers/attestation-agent/issues/3. |
First, thank you for your patience, as I am still coming up to speed with some of the concepts being relayed here. In reference to @fitzthum's graphic, would it be possible / make sense to remove the |
I suspect it's best to have the attestation agent in the same enclave as the workloads. For one thing the interface between ocicrypt-rs and the attestation agent isn't really designed to cross the trust boundary. To decrypt a confidential workload ocicrypt-rs will use rpc to ask the attestation agent to unwrap a key for each layer of the image. The attestation agent gives back an unwrapped key which allows access to a layer. If the attestation agent were in a different vm from ocicrypt-rs, we would have to rework this communication to make sure the unwrapped key is not exposed. We could do something like that, but I'm not sure what the benefit is. In SEV(-ES) there is no direct connection between the attestation agent and the CPU. The AA uses keys that are injected via the launch secret mechanism. QEMU facilitates this via QMP on the host. A kernel module in the guest pulls the injected secret from a GPA and puts it into the filesystem. The AA just needs to read a file to access the secret. The host will have to connect to the PSP to provide the encrypted launch secret (on behalf of the guest owner), but QEMU will have to connect to the PSP anyway as part of the boot sequence. For SEV-SNP there is a direct connection between the AA and the PSP, but I think this may be required. For SNP the AA is responsible for setting up a secure connection to the KBS/GOP. To do so the AA requests an attestation report from the PSP. AFAIK attestation reports must be requested from inside the VM that they pertain to. Something else would have to take care of this if we moved the AA. I could be missing something here with minimizing direct connections. Is there a limit to the number of connections to the PSP? Is there a significant cost? Finally, it might be worth noting that the attestation agent would probably have to be tweaked to support a one-to-many relationship with nodes, particularly if one AA were servicing multiple tenants. 
In the current plan there is a one-to-many relationship between the GOP/KBS and the AA and a one-to-one relationship between the AA and the VM. Having the AA inside the VM seems like the simplest approach. |
Correct, adjusting for my suggested changes would mean a substantial deviation from the original design, so I am very appreciative of the opportunity for a discussion! I can see the adjustment significantly impacting both scalability and performance. Instead of creating an instance of the AA in each kata container, you could have far fewer instances for large-scale deployments, but still accomplish the same task.
You're right. My initial thought was based on SEV-SNP, so SEV(-ES) is not directly related to this discussion.
Correct. The attestation report must be requested from within the guest. As we are planning to use (g|tt)RPC, we should still be able to accomplish this. We would need to secure the RPC connection between the AA and the guest, and from my understanding the protocol should have this security built into it.
My usage of "connections" was probably the wrong word to describe my concern. The real issue would be surrounding the requests (for lack of a better word) being made to the PSP. The first question is, is the PSP in ring buffer mode, or is it in mailbox mode?
If the AA is separated, it would provide a somewhat centralized queue for the report requests to be made. |
After some additional research, the concerns mentioned above have been resolved. The Linux driver written for interacting with the PSP handles the necessary queuing of the requests. The Windows driver utilizes the ring buffer mode. |
@jiazhang0 is this issue still relevant or can be closed? |
Summary
This proposal describes the implementation of the attestation agent, with the aim of facilitating an E2E attestation reference implementation for kata CC v0. This RFC details the encryption and decryption procedure, and introduces the design options for implementing the attestation agent for kata CC v0.
Background
For kata cc v0 architecture, encryption and decryption operations are performed by different entities.
During image encryption, image creation tools such as skopeo, buildah and ctr-enc call ocicrypt, which uses a Layer Encryption Key (LEK) to encrypt the image layer. In the underlying implementation, ocicrypt generates the LEK randomly, serializes it together with its encryption parameters into a PrivateLayerBlockCipherOptions (PLBCO for short) object, and then encrypts the PrivateLayerBlockCipherOptions object by calling the WrapKey API defined by the key provider protocol. The details of the encryption process and of the returned annotation packet containing the encrypted PrivateLayerBlockCipherOptions object are determined by the implementation of the Key Broker Service (KBS for short). Eventually, the annotation packet is stored in the image layer's annotations field (for example, the annotation ID can be org.opencontainers.image.enc.keys.provider.kata_cc_key_broker_foo).
During image decryption, kata-agent calls ocicrypt-rs to retrieve the plain/decrypted PrivateLayerBlockCipherOptions object. In the underlying implementation, ocicrypt-rs calls the UnWrap API implemented by the Attestation Agent (AA for short). AA needs to access the KBS according to the parameters in the annotation packet to decrypt the PrivateLayerBlockCipherOptions object, and then returns the decrypted PrivateLayerBlockCipherOptions object to ocicrypt-rs as the return value of the UnWrap API. Eventually, ocicrypt-rs decrypts the encrypted image layer using the LEK from the PrivateLayerBlockCipherOptions object.
The above workflow is especially suitable for a remote attestation procedure that supports dynamic measurement, such as Intel TDX.
For the pre-attestation procedure of SEV/SEV-ES, which only supports static measurement, AA only needs to access the guest FW (or a kernel module driver) to get the plain PrivateLayerBlockCipherOptions object that was provisioned into the guest in the pre-attestation stage. This is the so-called pre-attestation (or local attestation, as mentioned in P19 https://docs.google.com/presentation/d/1469nSRFtlHMSDDDWVLj0i21dR9M_3SO76ehQsyVSTUk/edit#slide=id.gdccf80c723_0_1261).
In the long term, AA will do far more than just decrypt the PrivateLayerBlockCipherOptions object. For example, it can periodically report the TCB status to the relying party. Therefore, it is necessary to implement AA as a long-lived service. This proposal suggests implementing AA as a gRPC endpoint, instead of a standalone binary program.
Goal
The AA is specially designed for the kata cc architecture, so its initial goal is to decrypt the PrivateLayerBlockCipherOptions object according to the input parameters defined by the key provider protocol. This is the only high level function that AA must implement in the v0 architecture. This also means that AA does not need to implement the WrapKey API.
Further approaches include:
Implement UnWrap API defined by key provider protocol
Deserialize and parse the input parameters of UnWrap API, and serialize and return PrivateLayerBlockCipherOptions to ocicrypt-rs.
Support remote attestation and pre-attestation
Specifically, certain HW-TEEs need to obtain PrivateLayerBlockCipherOptions through pre-attestation, and others need to do it through remote attestation.
Abstract the procedure of PrivateLayerBlockCipherOptions decryption
The procedure of PrivateLayerBlockCipherOptions decryption is implemented by KBS, and is also related to the access to the attestation service.
Internal
Parse input parameter of UnWrapKey API
The format of input parameter of UnWrapKey API is KeyProviderKeyWrapProtocolInput.
where:
$scheme and optional $parameters are specific to the implementation of the callee of ocicrypt-rs. In ocicrypt, $scheme is specified in the key provider configuration file, and $extra_parameters is specified on the command line of image creation tools such as ctr-enc, skopeo and buildah. See examples for the details. In kata CC, the configuration file may use the pattern "kata_cc_attestation:$mode" as preferred, where $mode is either local or remote, corresponding to pre-attestation and remote attestation.
$extra_parameters contains the KBS specific annotation packet. A good example from the keyprovider test program shows the format.
$KBS_specific_annotation_packet is generated during image encryption and stored in the layer annotation. Its format is specific to the implementation of KBS.
Example:
Handle the return value of UnWrapKey API
No matter what method AA uses to obtain the plain/decrypted PrivateLayerBlockCipherOptions object, AA needs to serialize the PrivateLayerBlockCipherOptions object into the following JSON object and use it as the return value of the UnWrapKey API.
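A minimal sketch of the parse-and-reply round trip, assuming the JSON field names of ocicrypt's key provider protocol ("op", "keyunwrapparams", "annotation", "keyunwrapresults", "optsdata") with byte fields carried as base64 strings. These names are stated here as assumptions and should be checked against the ocicrypt source; the handler name and the unwrap callback are hypothetical.

```python
import base64
import json
from typing import Callable


def handle_unwrap_request(request_json: bytes, unwrap: Callable[[bytes], bytes]) -> bytes:
    """Parse a key-unwrap request and build the reply carrying the plain PLBCO bytes."""
    request = json.loads(request_json)
    if request.get("op") != "keyunwrap":
        raise ValueError("unsupported operation: %r" % request.get("op"))
    # Assumption: the annotation packet arrives base64-encoded inside the JSON.
    annotation = base64.b64decode(request["keyunwrapparams"]["annotation"])
    plbco_bytes = unwrap(annotation)  # delegated to pre-attestation or a KBC
    reply = {"keyunwrapresults": {"optsdata": base64.b64encode(plbco_bytes).decode()}}
    return json.dumps(reply).encode()


# Hypothetical request; the lambda stands in for the real decryption step.
request = json.dumps({
    "op": "keyunwrap",
    "keyunwrapparams": {"dc": None, "annotation": base64.b64encode(b"packet").decode()},
}).encode()
reply = json.loads(handle_unwrap_request(request, lambda a: a.upper()))
```

Keeping the decryption step behind a callback mirrors the proposal's point that AA returns the same serialized object regardless of how the PLBCO was obtained.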
Abstract the procedure of PrivateLayerBlockCipherOptions decryption
Deem KBS as a service providing the capability of PrivateLayerBlockCipherOptions decryption. KBS can be abstracted as one of the following types (but not limited to):
There are two options to implement this abstraction.
Option 1: KBC (Key Broker Client) modularization
KBS is a platform-specific implementation, so AA needs to define and implement a modularization framework that allows platform providers to communicate with their own KBS infrastructure through a corresponding KBC integrated into AA.
In this scheme, each KBC module needs to realize the following functions:
Function 1: implement a platform specific client for KBS.
AA doesn't need to care about the detail of communication protocol between KBS and KBC. The KBC selection can be done in this way:
Function 2: define and implement the communication protocol between KBS and KBC.
Include application protocol, transport type, API scheme, input and output parameters, etc.
Function 3: implement the corresponding attester logic for all potentially supported HW-TEE types
AA, as the role defined by RATS architecture, is responsible for collecting evidence about the TCB status from the attesting environment and reporting it to the verifier or relying party for verification. The purpose is to convince tenant that the workload is indeed running in a genuine HW-TEE. In order to establish the binding between evidence (called quote in TDX) and user-defined data structure (aka Enclave Held Data, EHD for short), the hash of EHD is embedded into evidence and then the evidence plus EHD is sent to remote peer. Usually, EHD is a public key used for wrapping a secret.
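A minimal sketch of the EHD binding described above, assuming a 64-byte user data field in the evidence that is filled with the SHA-512 hash of the EHD. The Evidence type and function names are hypothetical, and the hardware signature over real evidence is omitted entirely.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class Evidence:
    measurement: bytes   # TCB measurement reported by the HW-TEE
    report_data: bytes   # user data field; here it carries the hash of the EHD


def make_evidence(measurement: bytes, ehd: bytes) -> Evidence:
    """Attester side: embed the hash of the EHD (e.g. a public key) into the evidence."""
    return Evidence(measurement=measurement, report_data=hashlib.sha512(ehd).digest())


def verify_binding(evidence: Evidence, ehd: bytes) -> bool:
    """Verifier side: check that the EHD sent alongside matches the embedded hash."""
    return hmac.compare_digest(evidence.report_data, hashlib.sha512(ehd).digest())


# Hypothetical values; a real EHD would be the AA's wrapping public key.
public_key = b"-----BEGIN PUBLIC KEY----- (placeholder)"
evidence = make_evidence(b"\x00" * 48, public_key)
```

Once the binding holds and the evidence itself verifies, the remote peer can wrap a secret to that public key knowing it belongs to the attested environment.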
Option 2: AA-KBS E2E
Explicitly provide an implementation of AA and KBS for kata CC.
In this scheme, AA will eventually implement all KBS types mentioned above according to the requirements. Functions 2 and 3 are internal details between AA and KBS.
Comparison
Compared with option 2, option 1 asks each remote-attestation KBC module to implement the attester logic for every potential HW-TEE. This is wasteful, since the attester logic for a specific HW-TEE only needs to be implemented once.
KBS is a platform-specific implementation, so option 1 offers greater flexibility than option 2, and AA doesn't need to care about any implementation details of KBS (for example, AA doesn't need to parse the annotation data in the input parameter of the UnWrap API).
A Cloud HSM KBC is also implementation specific, so the Cloud HSM KBC in option 1 actually needs to implement a modular subsystem to support different Cloud HSMs.
At present, most existing attester implementations are not written in Rust, so option 2 asks AA to integrate potentially unsafe code. This is a particular concern for software running in a HW-TEE. Although option 1 has a similar problem, at least the KBC module and AA are separated, and security-focused platform providers will try to implement their KBC modules in Rust in the long term.
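The modular structure of option 1 can be sketched as a registry of KBC modules behind one generic interface. This is a sketch under assumptions: the interface, the registry, the module name "sample_kbc" and its trivial "decryption" are all hypothetical and do not reflect the actual AA code.

```python
from typing import Callable, Dict


class KeyBrokerClient:
    """Generic interface each platform-specific KBC module implements."""

    def decrypt_payload(self, annotation: bytes) -> bytes:
        raise NotImplementedError


KBC_REGISTRY: Dict[str, Callable[[], KeyBrokerClient]] = {}


def register_kbc(name: str):
    """Class decorator that makes a KBC module selectable by name."""
    def decorator(factory):
        KBC_REGISTRY[name] = factory
        return factory
    return decorator


@register_kbc("sample_kbc")
class SampleKbc(KeyBrokerClient):
    def decrypt_payload(self, annotation: bytes) -> bytes:
        # Placeholder logic; a real module would talk to its own KBS here.
        return annotation[::-1]


def unwrap_with(kbc_name: str, annotation: bytes) -> bytes:
    """AA core: pick the module by name and pass the packet through unparsed."""
    try:
        factory = KBC_REGISTRY[kbc_name]
    except KeyError:
        raise ValueError("no KBC module registered as %r" % kbc_name)
    return factory().decrypt_payload(annotation)


result = unwrap_with("sample_kbc", b"abc")
```

Because the AA core never inspects the annotation, adding a protocol means registering one more module rather than patching the core.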
Reference
Collected Suggestions
A Status() method to the KBS API would be useful to handle slow remote attestations and attestation failures. - by @jimcadden
One extension to the integrity model for layer encryption would be to verify the decrypted PLBCO (e.g., with a digest provided by the KSM) prior to decrypting the layer. That said, this seems like it can be contained within the implementation of a KBC. - by @jimcadden
The AA will send the keyID of the KEK to KBS and KBS releases the wrapped KEK to AA (AA needs to generate a pub/priv so that the KBS can wrap the KEK using AA's pub key for protection). Once the AA gets the KEK, it can locally unwrap the LEK and feed to ocicrypt. The advantage of passing the KEK to AA is that the KEK is usually shared among multiple layers and once AA retrieves the KEK, it can cache it to avoid multiple round trips to KBS to decrypt each LEK. - by @hdxia
SEV-SNP does not support pre-attestation in the same way that SEV(-ES) does. One thing I was trying to get across in the previous post is that pre-attestation and SNP attestation both provide the launch measurement to the GOP/KBS in exchange for a key. This should be fairly similar to the TDX approach except that the SEV-SNP measurement has slightly different properties (meaning that we might need additional support to measure containers). - by @fitzthum