Generic Key Broker System. #68

Open · 13 of 14 tasks · Tracked by #6365

jialez0 opened this issue Nov 21, 2022 · 28 comments

Comments

jialez0 commented Nov 21, 2022

Up to now, we have had two experimental Key Broker Systems for CoCo: offline_sev_kbc with simple-kbs for SEV, and eaa_kbc with verdictd for TDX and enclave-cc. Both are included in the quickstart guide for user deployment and can provide the confidential data required by image-rs (such as the key needed to decrypt the container image). However, each of these KBSes is specific to one HW-TEE architecture, and each is an incomplete solution built on its own KBC-KBS interaction protocol. In addition, verdictd relies on several external components and is complex to configure and deploy.

So, as we discussed before in #119 (most of whose content is now outdated), we need a standard, generic KBS: compatible with multiple HW-TEE architectures, implemented entirely in Rust, vendor-neutral, and fully owned by CoCo. It will be an out-of-the-box, production-ready KBS component in the CoCo solution.

Current Status

Attestation Agent

KBS

Attestation Service

Architecture

Here is the architecture of the generic Key Broker System (cc-kbc, KBS, AS):

[architecture diagram: cc-kbc, KBS, and AS]

According to the architecture, we can divide the work into four parts: cc-kbc, KBS, AS, and reference value publishing.

Reference measurement value provisioning

When building with GitHub Actions, we need to provide an interface for the RVPS (Reference Value Provider Service) in the AS to subscribe to reference values for component measurements, including the kernel, kernel parameters, and root file system. (This is an enhancement; before it is ready, we can temporarily send the measurement reference values to the RVPS manually, as sketched below.) For more details on this part of the work, @Xynnn007 may be able to give some proposals.
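To make the manual path concrete, here is a hedged sketch of what a reference value entry sent to the RVPS might look like; the field names are illustrative assumptions, not a finalized RVPS schema.

```rust
// Hypothetical reference value entry for the RVPS; all field names are
// assumptions for illustration, not a finalized schema.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct ReferenceValue {
    /// Component the measurement covers, e.g. "kernel", "kernel-parameters", "rootfs".
    name: String,
    /// Hash algorithm of the measurement, e.g. "sha384".
    hash_algorithm: String,
    /// Hex-encoded digest produced by the build pipeline.
    value: String,
    /// Expiry date, so reference values for stale builds age out.
    expiration: String,
}

fn main() -> Result<(), serde_json::Error> {
    let rv = ReferenceValue {
        name: "kernel".into(),
        hash_algorithm: "sha384".into(),
        value: "a1b2c3...".into(), // placeholder digest
        expiration: "2023-12-31T00:00:00Z".into(),
    };
    // Serialized form that could be delivered to the RVPS manually.
    println!("{}", serde_json::to_string_pretty(&rv)?);
    Ok(())
}
```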

@Xynnn007

Thanks for the great plan for the remote attestation overview of CoCo! I'll open another issue later to describe the measurement-related roadmap.

@fitzthum

This is great. Excited to help out.


sameo commented Nov 23, 2022

Very nice. Thanks for putting this together!


fitzthum commented Nov 30, 2022

btw @jialez0 Will the KBS be multi-tenant?

Also, are we planning to use HTTPS like the spec mentions?


jxyang commented Nov 30, 2022

I like the idea of abstracting out attestation service. Have you thought about using a cloud attestation service like Microsoft Attestation Service? It is designed to abstract out specific HW TEEs and provide a unified attestation flow. It supports SGX and SNP already.

Given a SNP/SGX attestation report, MAA returns a token after validating the report, which is signed by a private key. KBS can then validate the token with a well-known public key.

By offloading the AS to the cloud, we could reduce the footprint of the guest VM and the maintenance cost. Of course, the downside is that we have to include the cloud attestation service in the TCB. Our experience indicates some customers don't mind that.

@surajssd

@jxyang a similar suggestion has been raised here: confidential-containers/cloud-api-adaptor#379


jxyang commented Dec 1, 2022

@surajssd Thanks. I made a comment there too.


jialez0 commented Dec 1, 2022

btw @jialez0 Will the KBS be multi-tenant?

@fitzthum What do you mean by multi-tenant support? Does it mean that the KBS creates different user instances, where each instance includes the user's key and secret resource storage, and the AS uses that user's attestation policy? This would require the AA to carry a user ID when requesting the KBS, to indicate which user instance the KBS should use to process the request. Is my understanding accurate?

We need to think about two problems:

  1. What does the AA use to represent its ID?

  2. How do we register an ID in the KBS, and how do we ensure that the AA knows which ID its environment belongs to?


jialez0 commented Dec 1, 2022

Given a SNP/SGX attestation report, MAA returns a token after validating the report, which is signed by a private key. KBS can then validate the token with a well-known public key.

@jxyang @surajssd As you can see in our current KBS protocol design, the KBS can also issue and verify its own self-signed token. If the /auth request carries this token and verification passes, the KBS can skip the subsequent challenge and attestation phases. Therefore, the KBS can easily add verification support for tokens signed by other trusted third parties (such as MAA tokens), which can be implemented as an enhancement; a sketch of such token validation follows.
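For illustration, a hedged sketch of that token check, assuming RS256-signed JWTs and the jsonwebtoken crate; the claim names here are assumptions, not the final KBS token format.

```rust
// Hedged sketch of KBS-side token validation, assuming RS256-signed JWTs
// and the `jsonwebtoken` crate. The claim names are illustrative
// assumptions, not the final KBS token format.
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::Deserialize;

#[derive(Deserialize)]
struct Claims {
    // Expiration timestamp, checked automatically by `Validation`.
    exp: usize,
    // Hypothetical claim recording which TEE type was attested.
    tee: String,
}

fn validate_token(
    token: &str,
    trusted_pubkey_pem: &[u8],
) -> Result<Claims, jsonwebtoken::errors::Error> {
    // The key could be the KBS's own signing key, or the well-known key
    // of a trusted third party such as MAA.
    let key = DecodingKey::from_rsa_pem(trusted_pubkey_pem)?;
    let data = decode::<Claims>(token, &key, &Validation::new(Algorithm::RS256))?;
    Ok(data.claims)
}
```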


fitzthum commented Dec 1, 2022

@jialez0

What do you mean by multi-tenant support? Does it mean that the KBS creates different user instances, where each instance includes the user's key and secret resource storage, and the AS uses that user's attestation policy? This would require the AA to carry a user ID when requesting the KBS, to indicate which user instance the KBS should use to process the request. Is my understanding accurate?

Yes. I agree with the issues you point out. It seems hard for the KBS to be sure that it is delivering the correct secrets to a given user, especially given that the attestation reports for each user would probably be the same. I think we would need multiple, totally separate secure connections, with different KBS keypairs for each client.


jxyang commented Dec 1, 2022

@jialez0 Great to know that the design allows 3rd-party tokens. Do you mind making it clear in the above diagram?


jialez0 commented Dec 2, 2022

@jialez0 Great to know that the design allows 3rd-party tokens. Do you mind making it clear in the above diagram?

The diagram here is only meant to guide us in giving our Key Broker System the most basic functions. It is not a complete design spec, so it should stay as simple as possible. Token support and third-party token validation are finer-grained enhancements. They will be covered in the formal architecture spec document in the future, which will include a clear diagram illustrating this point. Don't worry.

In addition, I opened a new issue in the kbs repository (issue 20) to remind us.


sameo commented Dec 2, 2022

Also, are we planning to use HTTPS like the spec mentions?

Yes, we need HTTPS. This allows the AA to authenticate the identity of the KBS (although how to distribute the certificate is still under discussion...).

cc @sameo

See confidential-containers/trustee#19 for the WIP PR of the API server implementation. It is based on actix on the HTTP side, and we will make HTTPS mandatory.
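As a rough illustration of what that looks like (not the actual PR code), here is a minimal sketch of an HTTPS-only server, assuming actix-web with the openssl feature; the route path and certificate file names are placeholders.

```rust
// Minimal sketch of an HTTPS-only KBS API server on actix-web (openssl
// feature). The route path and certificate file names are placeholders,
// not the actual PR code.
use actix_web::{web, App, HttpResponse, HttpServer};
use openssl::ssl::{SslAcceptor, SslFiletype, SslMethod};

async fn auth() -> HttpResponse {
    // Placeholder: the real handler would start the challenge/attestation
    // handshake defined by the KBS protocol.
    HttpResponse::Ok().finish()
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    let mut tls = SslAcceptor::mozilla_intermediate(SslMethod::tls()).unwrap();
    tls.set_private_key_file("kbs-key.pem", SslFiletype::PEM).unwrap();
    tls.set_certificate_chain_file("kbs-cert.pem").unwrap();

    // No plain-HTTP bind: HTTPS is mandatory.
    HttpServer::new(|| App::new().route("/kbs/v0/auth", web::post().to(auth)))
        .bind_openssl("0.0.0.0:8443", tls)?
        .run()
        .await
}
```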

@Xynnn007

Hi, for the measurement-reference issues, we made a proposal for providing reference values: confidential-containers/trustee#187. Anyone interested in this, please take some time to help review. :)


zvonkok commented Jan 16, 2023

Will this architecture also support third-party artifacts? Meaning, customers bring their own guest OS, kernel, and reference values that could be attested in a later step: layered, composite attestation?


Xynnn007 commented Jan 16, 2023

Will this architecture also support third-party artifacts? Meaning, customers bring their own guest OS, kernel, and reference values that could be attested in a later step: layered, composite attestation?

Yes, certainly. The key point is that the reference values compared against the evidence can be set by the users themselves, either in the RVPS or in the OPA policy engine, both of which live in the AS.


zvonkok commented Jan 16, 2023

I assume a 3rd party KBS will also be supported?


zvonkok commented Jan 16, 2023

What about the use case of extending the attestation report? We may have a situation where another entity creates an attestation report inside the TEE; it would be nice if we could also attest this in a second step, or in an accumulated attestation report with accumulated or different reference values.


zvonkok commented Jan 16, 2023

verdictd does have support for remote KBS, AFAIK.


zvonkok commented Jan 16, 2023

Once TDISP/IDE is available, it will be natural for TEEs to have trusted accelerators that provide their own attestation reports, independently of the Kata stack. Even now, with bounce buffers, there is demand for accelerator attestation in our use case.


jialez0 commented Jan 17, 2023

I assume a 3rd party KBS will also be supported?

Third-party KBSes can indeed be supported. The verdictd and simple-kbs in the current CoCo release are in fact third-party KBSes; they connect to CoCo through their dedicated KBC modules in the Attestation Agent.

@fitzthum

@zvonkok

What about the use case of extending the attestation report? We may have a situation where another entity creates an attestation report inside the TEE; it would be nice if we could also attest this in a second step, or in an accumulated attestation report with accumulated or different reference values.

Getting a second report inside the guest, probably as part of the workload, is interesting. We have to be careful about how we do this to avoid creating security problems, but it should be possible. I am generally thinking that a workload would use this report to connect to some other service that wants to consume an attestation report (rather than connecting to our KBS again). Did you have anything specific in mind?

Once TDISP/IDE is available, it will be natural for TEEs to have trusted accelerators that provide their own attestation reports, independently of the Kata stack. Even now, with bounce buffers, there is demand for accelerator attestation in our use case.

We should think about the best way to support this. The easiest way would be to modify our verifiers to call out to vendor-specific plugins that know how to verify devices. Maybe we should try to come up with something more flexible, though: some form of composable attestation that allows verifiers to consume each other, along the lines of the sketch below.
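To make that idea slightly more concrete, here is a speculative sketch of verifiers that can consume each other; every name here is hypothetical, not an existing attestation-service API.

```rust
// Speculative sketch of composable attestation: platform verifiers
// (SNP, TDX, ...) and device verifiers (GPU, ...) implement one trait,
// so a composite verifier can chain them. All names are hypothetical.
use std::collections::HashMap;

type Claims = HashMap<String, String>;

trait Verifier {
    /// Check one piece of evidence and return the claims it proves.
    fn verify(&self, evidence: &[u8]) -> Result<Claims, String>;
}

/// Runs a platform verifier first, then each device verifier, and
/// merges the resulting claims into one attestation result.
struct CompositeVerifier {
    platform: Box<dyn Verifier>,
    devices: Vec<Box<dyn Verifier>>,
}

impl Verifier for CompositeVerifier {
    fn verify(&self, evidence: &[u8]) -> Result<Claims, String> {
        let mut claims = self.platform.verify(evidence)?;
        for device in &self.devices {
            // In a real design each device would carry its own evidence;
            // a single buffer is used here only to keep the sketch short.
            claims.extend(device.verify(evidence)?);
        }
        Ok(claims)
    }
}
```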


zvonkok commented Jan 18, 2023

@fitzthum You're on the right track with where I am heading. Yes, this architecture needs to be more flexible. Let's make it more concrete and introduce our use cases with confidential containers.

For the seamless running of GPU workloads on Kata we're packaging the driver and the toolkit (for container enablement, prestart hook) in the guest, which means customers do not see any difference when running GPU workloads with or without Kata.

ctr run <whatever> is the very same.

For this to work we're shipping a custom build of the kernel, the guest OS, and several GPU Kata runtimes. So it's a mix of standard components (e.g. OVMF) and additional components provided by a third party. In the case of Kubernetes, the GPU operator will pull down the needed parts from a repository (signed and encrypted) only if the base install of a confidential container can run and be attested (without the GPU).

Once we fire up a GPU confidential container, we would like to attest both the old components and the new components with our supplied reference values. That means the KBS needs to provide a way for us to supply our own policy files.

To make the attestation workload-agnostic, the idea was to have a component in the guest OS that retrieves the attestation report and provides it to an external entity like the KBS, which then does the attestation with additional reference values. Maybe something like a two-step approach: first attest the base stack, then attest the additional attestation reports that are supplied.

It does not stop there: what if we have multiple accelerators? Do we concatenate the attestation reports, attest them one by one, and just "disable" the devices that fail attestation? Or do we verify all devices at once?

It would be nice if we could do several rounds of attestation by providing a set of attestation reports, policies, and reference values on demand, even during runtime, at "any" time.

In the case of Kubernetes, we're relying on the GPU operator to provide the configuration (enable SR-IOV, create VFs, etc.) and the needed artifacts to enable GPUs in Kata, aka CoCo.

External entities should be able to retrieve attestation results or request attestation of a specific configuration. The GPU operator cannot attest itself; some higher-level entity needs to do that, as in the case of SNP, where the SP provides the attestation report and CoCo attests the environment with the provided policies and reference values.

There needs to be a terminal entity that the workload trusts, with a general interface to get the chained results of attestations for the complete stack. This terminal entity cannot be CoCo, because there would be higher-level entities introducing hardware and software that needs to be attested in a specific "way", in other environments not specifically tied to CoCo.

You were right when saying that the "workload would use this report to connect to some other service", but which service it connects to depends on what your stack looks like.


zvonkok commented Jan 18, 2023

@fitzthum Are we looking into https://www.ietf.org/archive/id/draft-ietf-rats-architecture-22.html as well as a reference?


zvonkok commented Jan 18, 2023

Did anyone take a look at https://github.com/veraison? This is what the Arm ecosystem is building.

It can be challenging to build just one Verification Service solution which can address all deployments for a technology that needs to produce Attestation reports to prove its trustworthiness. If that then implies that each deployment needs a custom service, there is a significant software barrier and hence cost of entry to establishing a system that can be used in a secure manner. Veraison aims to provide consistency and convenience to solving this problem by building the software components that can be used to build Attestation Verification Services.

@fitzthum @fidencio FYI

jialez0 changed the title from "Roadmap for Generic Key Broker System in v0.5.0 release." to "Generic Key Broker System." on Feb 24, 2023
@yaoxin1995

@fitzthum @jialez0

I am integrating the KBS with the Quark runtime, following the KBS attestation protocol.

I have a few questions.

  1. Will one KBS instance be owned by only one data owner and contain only secrets belonging to one party? Or can multiple data owners share one KBS instance?

  2. Where does the KBS run? Will it run on the data owner's side, or can it also run in a secure VM hosted in the public cloud, like the enclave? In the latter case, the data owner needs to ensure the KBS is running in an expected environment before uploading the secret; how do we achieve this?

  3. The KBS attestation protocol uses TLS to prevent a malicious attacker from hijacking the KBS address to impersonate the KBS and deceive the KBC. In turn, how does the KBS prevent a malicious attacker from impersonating a KBC to get secrets from it?

For example, assume a KBS only contains secrets from one party, and we have KBS instance A that stores the secret of a data owner. Now the data owner and an attacker each deploy an Nginx service using the same image. In this case, both services may generate the same attestation report, which means both services are able to get the secret from the KBS. However, the KBS should only provide the secret to the Nginx service deployed by the data owner. So how can the KBS be sure that it is delivering the secrets to the correct user?

@fitzthum

@yaoxin1995

I am integrating the KBS with the Quark runtime, following the KBS attestation protocol.

Cool

I have a few questions.

1. Will one KBS instance be owned by only one data owner and contain only secrets belonging to one party? Or can multiple data owners share one KBS instance?

The KBS is trusted, but there are potentially multiple ways for it to operate. The simplest is that each client will run their own KBS that serves only their secrets. You could also run a KBS as a service and either provide separate virtualized KBS instances to clients or serve multiple clients with one KBS (not really supported with this codebase yet). In any case the KBS must be trusted.

2. Where does the KBS run? Will it run on the data owner's side, or can it also run in a secure VM hosted in the public cloud, like the enclave? In the latter case, the data owner needs to ensure the KBS is running in an expected environment before uploading the secret; how do we achieve this?

Again, the simplest case is for the client to deploy the KBS themselves somewhere that they trust, but more complex arrangements are possible. A KBS could run inside of a TEE either operated by the client or by a CSP. Unless the client has a reason to trust the CSP, this TEE would need to be attested, which might require another KBS. Having two levels of KBS might seem redundant, but it could be useful. The client would only have to attest the KBS running in a TEE once. Then that KBS could handle many confidential guests.

3. The KBS attestation protocol uses TLS to prevent a malicious attacker from hijacking the KBS address to impersonate the KBS and deceive the KBC. In turn, how does the KBS prevent a malicious attacker from impersonating a KBC to get secrets from it?

The KBS identifies the KBC based on the attestation evidence. The content of the attestation evidence and the guarantees that it provides are implementation dependent. Confidential Containers uses generic guest images, meaning that the attestation evidence does not uniquely identify a guest. Rather, the evidence certifies that the guest is some valid Confidential Containers guest that the KBS can inject an identity into. I'm not sure what your goal with Quark will be.

For example, assume a KBS only contains secrets from one party, and we have KBS instance A that stores the secret of a data owner. Now the data owner and an attacker each deploy an Nginx service using the same image. In this case, both services may generate the same attestation report, which means both services are able to get the secret from the KBS. However, the KBS should only provide the secret to the Nginx service deployed by the data owner. So how can the KBS be sure that it is delivering the secrets to the correct user?

Generic workloads can be confusing for people. First of all, note that for CoCo the attestation report is the same for most guests, regardless of the pod that is deployed. For CoCo, restrictions on the workload are enforced via signatures. You might be doing something else with Quark, but for now let's just update your example to use signatures. The following is still true if you don't use signatures and just put the measurement of the container in the attestation report.

Using signatures the KBS would be provisioned with a signature policy that allows a container to run. The KBS will only send secrets to workloads that meet this signature check. The only workloads that a KBS should allow to run via signatures are workloads that the KBS will trust with its secrets. A KBS will only trust a workload with a secret if the workload won't reveal the secret to any other parties. If this is the case, it really doesn't matter if the workload is generic or if it was started on behalf of the client or some malicious party.

Nginx is a misleading example because secrets provided to an nginx container probably could be exfiltrated via HTTP. So a KBS should never really provide secrets to a stock nginx container. Let's imagine instead that we have a hardened container that receives a secret and does some computations. The secret can never leave the container. In this case, it doesn't matter who actually starts the container.
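To restate that gating logic as code, here is a hedged sketch; the types and functions are hypothetical, for illustration only, not the actual KBS implementation.

```rust
// Hedged sketch of the gating described above: secrets are released only
// to workloads whose images pass the provisioned signature policy.
// Every type and function here is hypothetical.
struct SignaturePolicy {
    trusted_signers: Vec<String>,
}

impl SignaturePolicy {
    /// Returns true if the image was signed by a trusted signer.
    fn allows(&self, image_signer: &str) -> bool {
        self.trusted_signers.iter().any(|s| s == image_signer)
    }
}

fn release_secret<'a>(
    policy: &SignaturePolicy,
    image_signer: &str,
    secret: &'a [u8],
) -> Option<&'a [u8]> {
    // If the signature check fails, the KBS never hands out the secret,
    // regardless of which party started the workload.
    policy.allows(image_signer).then_some(secret)
}
```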

Hopefully that helps. You might also check out this gist.


yaoxin1995 commented Mar 30, 2023

@fitzthum Thanks, I have a follow-up question.

Using signatures the KBS would be provisioned with a signature policy that allows a container to run. The KBS will only send secrets to workloads that meet this signature check. The only workloads that a KBS should allow to run via signatures are workloads that the KBS will trust with its secrets. A KBS will only trust a workload with a secret if the workload won't reveal the secret to any other parties. If this is the case, it really doesn't matter if the workload is generic or if it was started on behalf of the client or some malicious party.

After viewing your talk at FOSDEM and the discussion in the gist, I feel that verifying the signature alone cannot prevent an evidence-factory attack on the KBS. Besides image signature verification, the KBS needs to check the KBS public key in the host data field of the report.

Assume the client deploys the KBS themselves somewhere they trust. In this case, how does the KBS or the data owner distribute the KBS public key to the low-level runtime (the Kata/Quark runtime) so that it can insert the key into the host data field of the report before launching the guest? (A sketch of one possible binding follows.)
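One plausible shape for that binding (an assumption for illustration, not a documented CoCo mechanism) is to hash the KBS public key into the 32-byte SNP HOSTDATA field at launch time:

```rust
// Hypothetical illustration of binding the KBS identity into the guest's
// launch: hash the KBS public key and place the digest in the 32-byte
// SNP HOSTDATA field. This is one possible design, not a documented
// CoCo mechanism.
use sha2::{Digest, Sha256};

fn host_data_from_kbs_key(kbs_pubkey_pem: &[u8]) -> [u8; 32] {
    let digest = Sha256::digest(kbs_pubkey_pem);
    let mut host_data = [0u8; 32];
    host_data.copy_from_slice(digest.as_slice());
    // The runtime would pass `host_data` to the hypervisor at launch, and
    // the KBS would later compare it against the field in the report.
    host_data
}
```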

I am not sure how a confidential container is launched. But in a typical VM-based pod deployment, a low-level container runtime such as the Quark/Kata runtime would first launch a VM and create a pause container as the pod's root container, based on the pause container's runtime spec. After the root container is successfully deployed, the low-level container runtime notifies the high-level container runtime (containerd) to send over the runtime specification of the application container. That specification includes the environment variables and application arguments that the data owner defined for the application container in YAML. So we can send the IP of the KBS to the secure VM through the YAML, but it is not possible for the data owner to send the KBS public key to the low-level runtime through the YAML: the secure VM only gets the application container's specification after the VM is created, and I believe the pause container's runtime specification is generated by containerd automatically, so a user can't configure it in any way.
