
RFC: Separate trust realms for tenant and host #1834

Open
c3d opened this issue May 12, 2021 · 15 comments · May be fixed by #9159
Labels: area/confidential-containers (Issues related to confidential containers; see also CCv0 branch)

Comments

@c3d (Member) commented May 12, 2021

Summary

In the context of Confidential Computing Enablement (#1332), we break the assumption that the containers trust the host. This makes it explicit that the trust realms for the container workload and for the pod running it do not belong to the same owner. This RFC explores the consequences in terms of API and usage model.

Motivation

Hardware-level enablement of features such as memory encryption, CPU state encryption and memory integrity protection are not sufficient to ensure confidentiality. Exploring this question has shown that achieving real confidentiality raises serious issues regarding system administration, and may not be achievable without deep changes in the existing APIs, notably with respect to the data flow between components of Kata Containers (in particular between runtime and agent).

Objective

The objective of this issue is to explore the implications of confidentiality with respect to the architecture of Kata. The following aspects seem particularly important:

  1. Confidentiality: We don't want a design where the door is locked but the window is open. So we need to ensure that the benefits of technologies such as memory encryption are not defeated by Kata communication channels.
  2. Flexibility: Very early in the platform support for confidentiality, we already have several models that behave somewhat differently, notably with respect to attestation or integrity protection. We want to accommodate various platforms, and to not settle for a lowest common denominator.
  3. Compatibility: To the maximum extent, we want to preserve the ability to run existing workloads, or alternatively, to provide a clear upgrade path.
  4. Enforcement: We want to be able to enforce confidentiality when it is requested, not to rely on the user not making mistakes.

Non-objective

  1. All features with confidentiality: Some features will be incompatible with confidentiality, for example VGPU (at least until GPU vendors themselves provide support for confidentiality). The "enforcement" objective means it is not an objective to support confidentiality with these features, but to detect such incompatibilities as much as possible.
  2. Fork: As much as possible, we want to avoid a confidential-only fork of Kata, knowing that some other projects are already exploring other ways to deliver confidential containers.

Proposal

This RFC makes the following recommendations to address the problem:

  1. Introduce a new terminology, hosts vs. tenants, and the associated security realms.
  2. Introduce the notion of an immutable pod and its implications, notably regarding the OCI specification; one idea is to use annotations to pass a complete, non-incremental description of the pod and its containers.
  3. Split kubectl commands / APIs based on owner: host (create a pod), tenant (e.g. exec), both (e.g. statistics and metrics) - possibly with changes in semantics (e.g. Create() currently sets up stdin/stdout/stderr, but in a confidential case, those are tenant-owned).
  4. Change the RPC data path with the agent when running confidentially, for example using the same secure channel used for attestation, and cut communication between host and agent (i.e. no vsock with confidentiality).
  5. Restrict the semantics of the existing agent APIs, and explore the idea that some APIs could be locked down, e.g. through a configuration file in the image.
  6. Restrict storage to virtio-blk when confidentiality is requested, take over the block device using tenant-supplied secrets (e.g. format the volume with LUKS at first use), and explore the possibility of sharing volumes across pods owned by the same tenant (e.g. to limit storage explosion).
  7. Add APIs to the agent to securely download and mount images on such a tenant-owned block device.
  8. Enhance image building tools to create images that can be attested and encrypted.
  9. Add support for attestation services in the pod boot process, and only start the pod (and therefore agent) if attestation succeeded.
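Recommendation 2 (a complete, non-incremental pod description) could be carried in a single OCI annotation that the guest attests as one unit. A minimal sketch in Python; the annotation key, the schema, the registry name, and the digest are all hypothetical, not an agreed Kata format:

```python
import json

# Hypothetical complete pod description; nothing here is an agreed schema.
pod_description = {
    "containers": [
        {"name": "app", "image": "registry.example.com/app@sha256:" + "0" * 64},
    ],
    "resources": {"vcpus": 4, "memory_mib": 4096},
}

# Deterministic serialization (sorted keys) so the same description always
# produces the same bytes, and hence the same measurement.
annotations = {
    "io.katacontainers.confidential.pod": json.dumps(pod_description, sort_keys=True),
}
```

Because the description is passed whole rather than incrementally, the guest can measure it once at creation time instead of tracking a sequence of mutations.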

In addition, it assumes that some other proposals are completed, notably:

  1. The overall Confidential Computing enablement ([RFC] [WIP] Confidential Computing Enablement #1332), which is required to enable hardware support
  2. Moving image download inside the guest (image pulling inside sandbox #149), which is required for point (7) above.
  3. Some form of node feature discovery ([RFC] Preflight-check / node feature discovery #1716) notably to detect hypervisor / hardware support
  4. Making block devices directly visible in the guest (docs: add design proposal for direct-assigned volume #1568) and not through virtiofs, for point (6) above.

Details

Host vs. Tenant trust realms

This RFC proposes the introduction of the notion of tenant and host in the Kata Containers terminology:

  • The host owns the hardware running the virtual machines / pods. In the new model, the host has no access to the tenant's data, other than the ability to kill pods and measure them from the outside.
  • The tenant owns the containers and what runs inside the pods. In the new model, the tenant does not trust the host, except in so far as the host provides raw resources (CPU, memory, disk).

[Figure: Host vs. Tenant trust realms]

In the rest of this proposal, the host and tenant will be considered separate trust realms, and it is assumed that some users may have access only to one side. Therefore, the proposed processes and APIs "must work" given this constraint.

Immutable pod

During the creation of the pod, the pod content is attested. This includes the boot image, kernel, firmware and possibly some restrictions on which images the pod can download.

[Figure: Immutable pod]

This implies that we cannot mutate the pod once it has started running containers. Some mutations, e.g. hot-plugging memory, may make sense from a security point of view, but may be hard to implement in practice or not necessarily supported by the hardware / firmware implementations.

An important implication is that Kata in "confidential" mode will probably not need hot-plugging, which simplifies the architecture quite a bit. One interesting question is if we can leverage the work required for confidential computing in order to gain the same benefits in the "normal" case. In other words, could a future Kata 3 supporting confidential computing also only support immutable pods?

Note that hot-plugging memory may seem quite useful, but in practice today, when you resize a pod, I believe k8s will always restart the pod anyway rather than mutate it in place.

Host vs. Guest kubectl commands

A result of the approach is that we need to split APIs based on whether they belong to the host or to the tenant.

[Figure: Split APIs]

The following list is based on the usage for kubectl, prefixed with:

  • H(host): Host privileged

  • T(enant): Tenant privileged

  • B(oth): the command makes sense for both host and tenant users

  • S(plit): The command may need to be split based on the user, e.g. accesses objects that are host or tenant

  • C(omplicated): The command may need cooperation between host and tenant

  • H create - Create a resource from a file or from stdin.

  • C expose - Take a replication controller, service, deployment or pod and expose it as a new Kubernetes Service

  • T run - Run a particular image on the cluster

  • S set - Set specific features on objects

  • B explain - Documentation of resources

  • S get - Display one or many resources

  • S edit - Edit a resource on the server

  • H delete - Delete resources by filenames, stdin, resources and names, or by resources and label selector

  • H rollout - Manage the rollout of a resource

  • H scale - Set a new size for a Deployment, ReplicaSet or Replication Controller

  • H autoscale - Auto-scale a Deployment, ReplicaSet, or ReplicationController

  • S certificate - Modify certificate resources.

  • H cluster-info - Display cluster info

  • S top - Display Resource (CPU/Memory/Storage) usage.

  • H cordon - Mark node as unschedulable

  • H uncordon - Mark node as schedulable

  • H drain - Drain node in preparation for maintenance

  • H taint - Update the taints on one or more nodes

  • S describe - Show details of a specific resource or group of resources

  • T logs - Print the logs for a container in a pod

  • T attach - Attach to a running container

  • T exec - Execute a command in a container

  • T port-forward - Forward one or more local ports to a pod

  • T proxy - Run a proxy to the Kubernetes API server

  • T cp - Copy files and directories to and from containers.

  • S auth - Inspect authorization

  • T debug - Create debugging sessions for troubleshooting workloads and nodes

  • C diff - Diff live version against would-be applied version

  • C apply - Apply a configuration to a resource by filename or stdin

  • C patch - Update field(s) of a resource

  • C replace - Replace a resource by filename or stdin

  • T wait - Experimental: Wait for a specific condition on one or many resources.

  • C kustomize - Build a kustomization target from a directory or a remote url.

  • S label - Update the labels on a resource

  • S annotate - Update the annotations on a resource

  • T completion - Output shell completion code for the specified shell (bash or zsh)

  • ? api-resources - Print the supported API resources on the server

  • ? api-versions - Print the supported API versions on the server, in the form of "group/version"

  • S config - Modify kubeconfig files

  • ? plugin - Provides utilities for interacting with plugins.

  • B version - Print the client and server version information
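The classification above can be expressed as data and enforced mechanically, e.g. in a tenant-owned proxy. A hedged sketch: the mapping simply transcribes the prefixes in the list, and the `allowed` helper is hypothetical; S(plit) and C(omplicated) verbs are treated as "needs further splitting or cooperation" rather than decided here.

```python
# Transcription of the H/T/B/S/C prefixes from the kubectl list above.
KUBECTL_REALM = {
    "create": "H", "delete": "H", "rollout": "H", "scale": "H", "autoscale": "H",
    "cluster-info": "H", "cordon": "H", "uncordon": "H", "drain": "H", "taint": "H",
    "run": "T", "logs": "T", "attach": "T", "exec": "T", "port-forward": "T",
    "proxy": "T", "cp": "T", "debug": "T", "wait": "T", "completion": "T",
    "explain": "B", "version": "B",
    "set": "S", "get": "S", "edit": "S", "certificate": "S", "top": "S",
    "describe": "S", "auth": "S", "label": "S", "annotate": "S", "config": "S",
    "expose": "C", "diff": "C", "apply": "C", "patch": "C", "replace": "C",
    "kustomize": "C",
}

def allowed(verb, realm):
    """Return True if a user in `realm` ('host' or 'tenant') may run `verb`."""
    cls = KUBECTL_REALM.get(verb)
    if cls is None:
        return False          # unknown verb: deny by default
    if cls == "B":
        return True           # makes sense for both sides
    if cls in ("S", "C"):
        return True           # needs per-object splitting or cooperation
    return (cls == "H") == (realm == "host")
```

Encoding the split as data rather than code would also make it auditable: the tenant can verify exactly which verbs the host side may exercise.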

RPC data path

If we want to be able to support commands like kubectl exec without exposing too much data to the host, we need a secure channel of communication between kubectl and the agent.

One possible idea is to leverage the secure channel that needs to be set up for attestation. The startup sequence would become something like the following:

  1. Host: Start the guest
  2. Agent: Read credentials and configuration from image, switch to "confidential" mode
  3. Agent: Open secure channel for attestation using root credentials from step 2
  4. Agent: Receive initial secrets from attestation server
  5. Agent: Send kernel / firmware measurements over secure channel
  6. Agent: Verify attestation result, and if OK, initiate image download (see discussion in image pulling inside sandbox #149 and [RFC] [WIP] Confidential Computing Enablement #1332)
  7. Agent: Start container, capturing stdin, stderr, stdout and logs
  8. Agent: Respond to usual RPC over secure channel instead of vsock (e.g. console I/Os, exit codes, stats, logs)

Note that, technically, the secure channel for the RPC need not be the same as the one used for attestation, and we may later discover good reasons to use a different one.
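The startup sequence can be sketched as a gated pipeline, where step 6 acts as the gate: if attestation fails, no image is downloaded and no container starts. Step names below are illustrative, not actual agent symbols:

```python
# Illustrative model of the confidential boot flow described above.
BOOT_SEQUENCE = [
    "start_guest",              # 1. host starts the guest
    "load_credentials",         # 2. agent reads credentials, enters confidential mode
    "open_secure_channel",      # 3. secure channel for attestation
    "receive_initial_secrets",  # 4. initial secrets from the attestation server
    "send_measurements",        # 5. kernel / firmware measurements
    "verify_attestation",       # 6. gate: continue only if attestation succeeds
    "download_images",          # 6. (cont.) image download, see #149
    "start_containers",         # 7. capture stdin/stdout/stderr and logs
    "serve_rpc",                # 8. RPC over the secure channel, not vsock
]

def boot(attestation_ok):
    """Return the steps actually executed; stop at the gate on failure."""
    done = []
    for step in BOOT_SEQUENCE:
        if step == "download_images" and not attestation_ok:
            break
        done.append(step)
    return done
```

The useful property of this shape is that everything tenant-visible (images, workload I/O) sits strictly after the attestation gate.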

This implies that, on the other side, the APIs are no longer routed through the runtime; instead, clients talk either to the agent directly or through some tenant-owned proxy (the attestation server being a possibility that may reduce the complexity).

In this model, there is no longer any vsock between the runtime on the host and the agent. As a matter of fact, there are probably no good reasons for the runtime to be able to communicate with the agent once the pod has been created. The runtime, on the other hand, still has responsibility for killing the pod or for "external" metrics, e.g. resource utilisation as can be measured by cgroupv2.

Restrict the semantics of agent API

Step 2 in the previous section reads a configuration from the image, and switches to "confidential" mode.

During discussions in the Confidential Computing use cases meeting, we came to an agreement that it is probably better to have a single agent that exposes different APIs depending on the usage, rather than to configure out the irrelevant APIs at compile-time. The benefit is to simplify the delivery mechanism. For example, the image building process would have to deal with a single agent binary and not have to select the right agent based on whether the image is for confidential use or non-confidential use.

The approach proposed here is slightly more radical, since the idea is that the RPC channel with the host would entirely be closed when in confidential mode. In that model, the APIs may remain somewhat identical, though we receive them through a different transport channel.

There are still some restrictions that seem necessary when looking at the agent protocol:

  • CreateContainer and StartContainer may be on different sides of a payload attestation protocol. We may want to have CreateContainer before image download, but reject StartContainer if the image does not pass attestation. We probably need to clarify the difference / order with CreateSandbox and DestroySandbox (do we really need both?)
  • RemoveContainer is now primarily a tenant operation, but it might make sense to allow "friendly shutdown", i.e. relay a host-side pod tear-down request and give in-guest processes a chance to terminate cleanly.
  • Same kind of question with SignalContainer.
  • Are PauseContainer and ResumeContainer related to host-controlled resources initially, or should we treat them as tenant operations?
  • All console (stdio) operations are tenant-owned, so simply routing them through a secure channel may be sufficient.
  • All the networking APIs may need a bit of thought. Do we need cooperation from the host here?
  • The tracing / observability APIs are tenant-owned and seem to be fine if we go through a secure channel.
  • OnlineCPUMem and MemHotPlugByProbe would go away with an immutable pod approach.
  • ReseedRandomDev is probably one we would flatly reject in a confidential computing scenario. Can someone suggest any good use for this API in other contexts, assuming we have a complete pod description initially? What about SetGuestDateTime? Any reason that shouldn't be disabled in the confidential case?
  • GetGuestDetails: This probably becomes a tenant-only API in the confidential case, and we may need to add more details, e.g. the details of the hardware + software + firmware support (e.g. do we have integrity protection).
  • CopyFile becomes tenant-only.
  • GetOOMEvent is probably tenant-only in the confidential case, although I could see a case for reporting OOM events to the host.
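Taken together, the bullets above amount to a dispatch policy inside a single agent binary: some endpoints are blocked outright in confidential mode, others are reserved for the tenant-side channel. A hedged sketch; the sets merely transcribe the discussion and are not an agreed contract:

```python
# Endpoints the bullets above suggest rejecting entirely in confidential mode.
BLOCKED_IN_CONFIDENTIAL = {
    "OnlineCPUMem",        # immutable pod: no hotplug
    "MemHotplugByProbe",   # immutable pod: no hotplug
    "ReseedRandomDev",     # host must not influence guest entropy
    "SetGuestDateTime",    # host must not influence guest time
}

# Endpoints that become tenant-only (served over the secure channel).
TENANT_ONLY_IN_CONFIDENTIAL = {"CopyFile", "GetGuestDetails", "ExecProcess"}

def dispatch(api, caller, confidential):
    """Return True if a call from `caller` ('host' or 'tenant') is accepted."""
    if not confidential:
        return True                      # legacy behavior, unchanged
    if api in BLOCKED_IN_CONFIDENTIAL:
        return False
    if api in TENANT_ONLY_IN_CONFIDENTIAL and caller != "tenant":
        return False
    return True
```

A runtime policy like this keeps a single agent binary, per the agreement above, while still letting the confidential mode present a strictly smaller host-facing surface.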

Question: could an image that is confidentiality-capable be used for the non-confidential case? That would simplify image management further, but require some additional logic at startup to decide which mode to use. Probably too risky?

Question: How would agent-ctl need to be modified to address this? It already has a flexible addressing scheme for the agent, so maybe simply accepting an https:// URL for the agent instead of unix:/// might be sufficient as far as UI is concerned?

Restrict the storage to virtio-blk

This aspect depends specifically on #1568 and related work being completed. There are reasons to believe that using virtiofs to access a host-provided filesystem will never fly with confidential containers. In particular, it communicates too much information to the host, if only through metadata. This seems unfixable while retaining file-level semantics.

Therefore, it is likely that the model in confidential mode will be:

  1. The guest mounts host-provided block devices, including for image storage and overlays.
  2. All data on such devices will be encrypted using tenant-owned keys.
  3. At mount time, the agent will check that it can decrypt the disk. If not, it will format the disk using something like LUKS and the tenant-owned keys. The idea is that the tenant controls whether the disks can even be "replayed" from one run to the next, or shared between pods. By rotating keys between boots, you ensure that the disk is erased each time and that the host cannot attempt replay attacks. By reusing keys, including between pods, you may be able to share disks that contain container images and overlays between guests owned by the same tenant.
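Item 3 can be sketched as a small mount-time decision: try to open the disk with the tenant key; on failure, reformat, which destroys prior contents. Rotating the key between boots therefore guarantees a fresh disk, while reusing the key preserves data and enables sharing. The disk is modeled as a dict here; a real implementation would drive something like cryptsetup/LUKS:

```python
def mount_encrypted(disk, tenant_key):
    """Open the disk if the tenant key matches, otherwise (re)format it.

    `disk` is a plain dict standing in for a block device. Formatting
    clears it, which models the fact that reformatting with a new LUKS
    key makes the old contents unrecoverable.
    """
    if disk.get("key") == tenant_key:
        return "opened"          # key reuse: existing data preserved
    disk.clear()                 # format: old contents unrecoverable
    disk["key"] = tenant_key     # e.g. LUKS with the tenant-owned key
    return "formatted"
```

The interesting property is that the policy (fresh disk vs. shared disk) is expressed entirely through the tenant's key management, with no cooperation needed from the host.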

In the case of QEMU, we may be able to use qemu-nbd to share encrypted disk images between pods without the host having access to the underlying data. This would make it possible to have disk usage comparable to that of regular Kata or non-Kata containers. It is not entirely clear whether this works today (i.e. can we have a cluster file system on top of LUKS, for example).

Add APIs for secure image download

The agent APIs today do not offer features to download images. This is not really discussed in #149.

Question: We may want to be able to restrict the images that can be downloaded. How?
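One possible answer, sketched below, is to pin allowed images by digest in the attested pod description and have the agent reject everything else. The allowlist and the digest value are placeholders:

```python
# Hypothetical allowlist of image digests, carried in the attested pod
# description so the host cannot alter it after measurement.
ALLOWED_DIGESTS = {
    "sha256:" + "0" * 64,  # placeholder pinned digest
}

def may_pull(image_ref):
    """Accept only digest-pinned references that appear on the allowlist."""
    if "@sha256:" not in image_ref:
        return False                  # tags are mutable: reject
    digest = image_ref.split("@", 1)[1]
    return digest in ALLOWED_DIGESTS
```

Rejecting tag-based references matters here: a tag can be repointed by whoever controls the registry, whereas a digest is self-verifying.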

Enhance building tools to create images that can be attested and encrypted

Can we entirely rely on existing encrypted container image toolsets? Probably not: we also need osbuilder / image builder changes to make sure we can generate images that can be attested. This belongs to the tenant, but the images are then stored on the host, so we need host credentials as well.

Add support for attestations

This would be based on remote attestation protocols. We need some way to specify the URL of the broker.

@c3d c3d added the needs-review Needs to be assessed by the team. label May 12, 2021

@c3d c3d changed the title RFC [Do not read yet]: Separate trust domains for tenant and host RFC: Separate trust domains for tenant and host May 12, 2021
@jimcadden (Contributor):

Thanks, @c3d! I am largely in agreement with everything you've written. I do think Immutable Pod is a bit of a misnomer: presumably, after the separation, the tenant will still be able to update containers, or deploy ephemeral containers, within a running pod, as long as the operation and the image come from a trusted source. I do agree that the host-controlled VM/sandbox should be made effectively immutable.

@bpradipt (Contributor):

@c3d thanks for the proposal. It nicely lays down the host and tenant aspects.
While going through it, I had a thought on the possibility of having the host-guest API restrictions compiled into the agent. IOW, two different builds for the agent (based on build tags): one with a restricted API set for confidential computing and another generic build.

@c3d (Member, Author) commented May 17, 2021

> While going through it, I had a thought on the possibility of having the host-guest API restrictions compiled into the agent. IOW, two different builds for the agent (based on build tags): one with a restricted API set for confidential computing and another generic build.

@bpradipt We discussed the possibility during the confidential computing use-case meeting. This is why in the proposal I wrote:

> During discussions in the Confidential Computing use cases meeting, we came to an agreement that it is probably better to have a single agent that exposes different APIs depending on the usage, rather than to configure out the irrelevant APIs at compile-time. The benefit is to simplify the delivery mechanism. For example, the image building process would have to deal with a single agent binary and not have to select the right agent based on whether the image is for confidential use or non-confidential use.

This is definitely an option though.

@bpradipt (Contributor):

> While going through it, I had a thought on the possibility of having the host-guest API restrictions compiled into the agent. IOW, two different builds for the agent (based on build tags): one with a restricted API set for confidential computing and another generic build.
>
> @bpradipt We discussed the possibility during the confidential computing use-case meeting. This is why in the proposal I wrote:
>
> > During discussions in the Confidential Computing use cases meeting, we came to an agreement that it is probably better to have a single agent that exposes different APIs depending on the usage, rather than to configure out the irrelevant APIs at compile-time. The benefit is to simplify the delivery mechanism. For example, the image building process would have to deal with a single agent binary and not have to select the right agent based on whether the image is for confidential use or non-confidential use.
>
> This is definitely an option though.

Thanks for the clarification @c3d

@c3d c3d added this to To do in Confidential containers May 20, 2021
@ariel-adam ariel-adam added area/confidential-containers Issues related to confidential containers (see also CCv0 branch) and removed needs-review Needs to be assessed by the team. labels May 25, 2021
@c3d c3d self-assigned this May 26, 2021
@c3d c3d changed the title RFC: Separate trust domains for tenant and host RFC: Separate trust realms for tenant and host May 26, 2021
@c3d (Member, Author) commented May 26, 2021

Updated "trust domains" to "trust realms" to make it clear that we are not talking about TDs (Trust Domains) as defined by TDX.

@c3d (Member, Author) commented May 27, 2021

Added a few images to make things easier to understand.

@c3d (Member, Author) commented Oct 31, 2022

Updated slide deck from the October presentation: https://docs.google.com/presentation/d/16649JpQAdDb3jh3OVKWdNks0mvcqpidssrm0pOZNKZ4/edit?usp=sharing

@c3d (Member, Author) commented Feb 27, 2023

Current categorization of endpoints maintained by @ray-valdez: https://app.box.com/file/1109515066100?s=0ybmczv3mko7ub3zfob43bpfzkbw8ms6.

Summary here:

  • 38 endpoint APIs + 1 (ListContainers) = 39
  • 12 belong to the host (including ~5 shared); the rest belong to the tenant
  • Currently implemented on the tenant side: pause, resume, exec, listcontainers

The split host/tenant APIs presentation listed 35 endpoints, plus the health service (check and version) and the image service (pullimage).

@c3d (Member, Author) commented Feb 27, 2023

Host-side endpoints (details in document linked above):

| Name | Category | Description |
|------|----------|-------------|
| AddARPNeighbors | Networking | Add an ARP neighbor (netlink.rs) |
| CreateSandbox | Initialization | Initialize the sandbox (rpc.rs, mount.rs, network.rs, ..) |
| DestroySandbox | Termination | Destroy the sandbox (rpc.rs, sandbox.rs) |
| GuestDetails | Status / Stats | Get details on guest and agent |
| MemHotplugByProbe | Initialization | Add memory via hotplug |
| OnlineCPUMem | Initialization | Add CPU via hotplug |
| UpdateInterface | Networking | Update interfaces on links |
| UpdateRoutes | Networking | Update routes on links |
| ListContainers | Initialization | Return sandbox and container IDs |

The hotplug endpoints may have to be removed in a CC context. This depends on #2637 and related work. Until then, we can get the "final" layout of the pod using annotations to avoid hotplugging.

The ListContainers endpoint is currently somewhat special, because it is required by the OCI specification. However, like PullImage, this is something that we should be able to transfer to the guest.

@c3d (Member, Author) commented Feb 27, 2023

Tenant "admin" endpoints (if we ever draw a distinction between admin and non-admin guest access):

| Name | Category | Description |
|------|----------|-------------|
| AddSwap | Initialization | Use block device for swap |
| CreateContainer | Initialization | Create a container (rpc.rs) |
| GetIPTables | Networking | Return the result of IP6TABLES_SAVE or IPTABLES |
| GetVolumeStats | Storage / Stats | Get volume capacity and inode statistics |
| ListInterfaces | Networking | Get networking interfaces (rpc.rs, netlink.rs) |
| ListRoutes | Networking | Get networking routes (rpc.rs, netlink.rs) |
| PauseContainer | Status | Pause a container |
| ResumeContainer | Status | Resume a paused container |
| SetIPTables | Networking | Set IP tables in the guest |
| StartContainer | Initialization | Start a container |
| StatsContainer | Stats | Return statistics about the container |
| UpdateContainer | Status | Update a container's resources |
| PullImage | Initialization | Pull an image from within a container |

GetVolumeStats may be also necessary for the host to be able to perform storage resource allocation.

Overall, the networking endpoints are the most problematic. ListRoutesRequest handling, for example, may depend on the network interface card (NIC) configuration. For a pass-through device, the guest has complete control, so the host may not need to be involved at all. For virtual networks, on the other hand, the host may play a role in the routing, so coordination and sequencing between host and guest might be necessary.

The statistics endpoints seen by the agent do not seem relevant to the host. However, similar values are required for proper host-side scheduling and accounting, so these endpoints will need to be effectively split.

@c3d (Member, Author) commented Feb 27, 2023

Tenant non-admin endpoints:

| Name | Category | Description |
|------|----------|-------------|
| CloseStdin | Termination | Termination due to closing of standard input |
| CopyFile | Initialization | Copy files to a specific location |
| ExecProcess | User actions | Execute a process in a container |
| GetMetrics | Stats | Return metrics for the host/guest side of a container |
| GetOOMEvent | Errors | Get out-of-memory event for a container |
| ReadStderr | Logging | Read the standard error stream from a container |
| ReadStdout | Logging | Read the standard output stream from a container |
| RemoveContainer | Termination | Remove a container |
| ReseedRandomDev | User actions | Set the random seed |
| ResizeVolume | Storage | Resize a volume (host + guest) |
| SetGuestDateTime | User actions | Set guest time (using libc) |
| SignalProcess | User actions | Send a signal to a container / all processes in the container's cgroup |
| TtyWinResize | User actions | Set the process' stdout terminal size using libc |

CopyFile currently accesses files on the host. The exact semantics to give it in a CC context remain to be defined.

GetMetrics needs to be split between host and guest, since some metrics are relevant to the host, e.g. for scheduling or billing purposes.

ReadStdout and ReadStderr might need some extra buffering compared to the normal scenario, since we don't have access until a TLS / secure connection has been established with the agent.
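That buffering could look like the sketch below: output produced before the secure channel exists is queued, then flushed once the channel is established. Illustrative only, not the agent's actual implementation:

```python
class BufferedStream:
    """Queue container output until the tenant's secure channel is up."""

    def __init__(self):
        self.pending = []      # chunks produced before the channel exists
        self.connected = False

    def write(self, chunk):
        """Return the chunks actually delivered to the tenant."""
        if not self.connected:
            self.pending.append(chunk)   # hold until connect
            return []
        return [chunk]

    def on_secure_channel_ready(self):
        """Mark the channel up and flush everything buffered so far."""
        self.connected = True
        flushed, self.pending = self.pending, []
        return flushed
```

A real implementation would also need a bound on the buffer (a container can produce output faster than the tenant connects), but the ordering guarantee is the important part.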

ReseedRandomDev has strong implications for cryptographic security. It may need to be moved to the "tenant admin" category, or disabled entirely. Some TEE platforms have special handling for crypto seeds.

ResizeVolume needs some cooperation from the host and some sequencing.

@c3d (Member, Author) commented Feb 27, 2023

Minimal set of endpoints:

  • PullImage
  • CreateContainer
  • ExecProcess (Done)
  • PauseContainer (Done)
  • RemoveContainer
  • ResumeContainer (Done)
  • StartContainer
  • ListContainers (Done)

@c3d (Member, Author) commented Feb 28, 2023

@ray-valdez Could you check that the above comments capture our latest status?

@ray-valdez:

@c3d Yes, the above comments summarize our latest status. One point to add is that the CreateContainer endpoint is also invoked by the host during sandbox creation, so before handing off workload management to the tenant side we'll need to disable host-side access to this endpoint.

ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Feb 23, 2024
This commit is an initial step towards mitigating the risk of untrustworthy host systems when running Kata Containers in the context of confidential computing. It serves to safeguard against malicious, privileged users gaining access to the vulnerable Kata control plane. In scenarios where a malicious cloud service provider or administrator might intercept or compromise commands from the Kata control plane, tamper with container configuration files, execute processes within the container, retrieve workload statistics, or obtain sensitive container workload information, this protective measure becomes crucial.

This commit addresses the following open issues: Securing the Kata Control Plane (confidential-containers/confidential-containers#53) and RFC: Separate trust realms for tenant and host (kata-containers#1834). A detailed history can be found at: https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

The commit introduces a new split API mode in the kata-agent, which partitions the kata-agent's API endpoints between **host-side** and **owner-side** controllers. When this mode is enabled, the host-side controller is restricted to managing resource allocation during startup and resource recycling at termination. In contrast, the owner-side controller allows workload owners to directly manage their deployment pod and containers. This partitioning implicitly labels the kata-agent's endpoint APIs as _host-exclusive_, _owner-exclusive_, or _shared_.

Host-exclusive and owner-exclusive APIs are assigned specifically to either the host-side or owner-side. For instance, `CreateSandbox` and `DestroySandbox` are examples of host-exclusive APIs, while `CopyFile` and `ExecProcess` are examples of owner-exclusive APIs. Shared APIs include those that must be shared to some extent between the control planes, such as `GetOOMEvent` and `GetGuestDetails`.

This commit focuses on providing a secure channel for the owner-side to access owner-exclusive and shared APIs. Future commit(s) will restrict host-side access to owner-exclusive APIs when split mode is enabled on the kata-agent, and will address the sharing of APIs between host-side and owner-side.

This commit implements the following changes:
- Introduces the split mode to the kata-agent.
- Integrates a gRPC TLS server to handle API requests from the owner-side. We refer to this as the kata-agent's API proxy server, which ensures that workload owners can establish a secure end-to-end communication channel with the kata-agent for invoking API endpoint commands.
- Utilizes the Key Broker Service (KBS) to provision secrets, i.e., cryptographic public and private key pairs. These secrets are crucial for establishing a secure communication channel between the owner-side and the API proxy server.

To enable split mode functionality, the following steps are required:

1. Configuration: Modify the `kernel_params` option in Kata's configuration.toml file to enable split mode and specify the IP address of the KBS.
- Add the following settings to the `kernel_params` option: `agent.split_api=true` and `agent.aa_kbc_params=cc_kbc::http://[IP_ADDRESS]:[PORT]`.

2. Dependency on KBS: The kata-agent relies on the KBS to provision cryptographic keys to the split API proxy server, facilitating the establishment of a secure channel.

- Generate TLS keys and certificates for the kata-agent's API proxy server and client (owner-side):

```
$ KATA_DIR="<PATH to cloned repo>"
$ pushd ${KATA_DIR}/src/agent/grpc_tls_keys
$ ./gen_key_cert.sh
```

 - Create a zip file named 'tls-keys.zip' containing the CA public key and the server’s public and private key pair

`$ zip tls-keys.zip server.pem server.key ca.pem`

  - Place this zip file in the KBS resource path '/default/tenant-keys/'. During sandbox creation, the kata-agent retrieves this file using the KBS 'get resource' API.  It's important to note that the KBS conducts a background check on the key request, verifying evidence provided by the Trusted Execution Environment (TEE). Future extensions to the KBS will automate the creation of the server’s public and private key pair for each sandbox.

`$ popd`

To exercise the API proxy server, we provide the Kata Containers agent TLS control tool (kata-agent-tls-ctl), derived from the kata-agent-ctl tool in another commit. This tool communicates over a gRPC TLS channel with the kata-agent. Similar to kata-agent-ctl, this is a low-level tool intended for advanced users. Future commit(s) will introduce a more user-friendly tool that maintains state, designed to function as a kubectl plugin for managing owners' workloads.

Examples of creating and starting a container using kata-agent-tls-ctl:

Setup environment

```
$ export guest_addr=10.89.0.28   # IP address associated with the confidential VM
$ export guest_port=50090        # port the API proxy server listens on
$ export ctl=./target/x86_64-unknown-linux-musl/release/kata-agent-tls-ctl
$ export key_dir=${KATA_DIR}/src/agent/grpc_tls_keys
$ export bundle_dir="<PATH to OCI bundle directory>"   # used by the commands below
```

Display the status of containers in the sandbox environment

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
  --server-address "ipaddr://${guest_addr}:${guest_port}" \
  -c "ListContainers"
```

Set a container ID and specify an OCI spec:

```
$ container_id=9e3d1d4750e4e20945d22c358e13c85c6b88922513bce2832c0cf403f065dc6
$ OCI_SPEC_CONFIG=${KATA_DIR}/src/tools/agent-tls-ctl/config.json
```

_Note: the next two commands require `pull_image` support in the guest!_
**Create container request**
```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
  --server-address "ipaddr://${guest_addr}:${guest_port}" \
  -c "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
```

**Start container request**

```
$ ${ctl} -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}"  \
--server-address "ipaddr://${guest_addr}:${guest_port}" \
-c "StartContainer json://{\"container_id\": \"${container_id}\"}"
```
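The two requests above can be wired into one helper. This is a dry-run sketch built only from the flags shown in this message: all values are placeholders, and by default it echoes each `kata-agent-tls-ctl` invocation instead of executing it, since actually running them requires a live API proxy server (set `DRY_RUN` to empty against a real sandbox):

```
#!/bin/sh
# Dry-run sketch: compose the CreateContainer/StartContainer requests
# shown above. DRY_RUN=echo prints each command instead of contacting
# a live kata-agent API proxy server.
set -e
ctl="${ctl:-kata-agent-tls-ctl}"
key_dir="${key_dir:-./grpc_tls_keys}"
bundle_dir="${bundle_dir:-./bundle}"
guest_addr="${guest_addr:-10.89.0.28}"
guest_port="${guest_port:-50090}"
container_id="${container_id:-demo}"
OCI_SPEC_CONFIG="${OCI_SPEC_CONFIG:-./config.json}"
DRY_RUN="${DRY_RUN:-echo}"

run_ctl() {
  # $1: extra flags (may be empty), $2: API command string
  ${DRY_RUN} "${ctl}" -l trace connect $1 \
    --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "$2"
}

run_ctl ""                 "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
run_ctl "--no-auto-values" "StartContainer json://{\"container_id\": \"${container_id}\"}"
```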
Fixes: kata-containers#1834
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 23, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 23, 2024
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Feb 23, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 23, 2024
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
This commit is an initial step towards mitigating the risk of
untrustworthy host systems when running Kata Containers in the context
of confidential computing. It serves to safeguard against malicious,
privileged users gaining access to the vulnerable Kata control
plane. This protective measure becomes crucial in scenarios where a
malicious cloud service provider or administrator might intercept or
compromise commands from the Kata control plane, tamper with container
configuration files, execute processes within the container, retrieve
workload statistics, or obtain sensitive container workload
information.

** Problem statement **

This commit addresses the following open issues:
[Securing the Kata Control Plane](confidential-containers/confidential-containers#53)
and RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

** Architectural changes **

The commit introduces a new split API mode in the kata-agent, which
partitions the kata-agent’s API endpoints between **host-side**
and **owner-side** controllers. When this mode is enabled, the
host-side controller is restricted to manage resource allocation
during startup and resource recycling at termination. In contrasts,
the owner-side controller allows workload owners to directly manage
theIR deployment pod and containers. This partitioning implicitly
labels kata-agent’s endpoint APIs as _host-exclusive_,
_owner-exclusive_, or _shared_.

Host-exclusive and owner-exclusive APIs are assigned specifically to
either the host-side or owner-side. For instance, `CreateSandbox` and
`DestroySandbox` are examples of host-exclusive APIs, while `CopyFile`
and `ExecProcess` are examples of owner-exclusive APIs. Shared APIs
include those that must be shared to some extent between the control
planes, such as `GetOOMEvent` and `GetGuestDetails`.

** Content of this commit **

This commit focuses on providing a secure channel for the owner-side
to access owner-exlusive and shared APIs. Future commit(s) will
restrict the host-side access to owner-exclusive APIs when split mode
is enabled on the kata-agent and will address the sharing of APIs
between host-side and owner-side.

This commit implements the following changes:
- Introduces the split mode to the kata-agent.
- Integrates a gRPC TLS server to handle API requests from the
  owner-side.  We refer to this as the kata-agent’s API proxy server
  which ensures that workload owners can establish a secure end-to-end
  communication channel with the kata-agent or invoking API endpoint
  commands.
- Utilizes the Key Broker Service (KBS) to provision secrets, i.e.,
  cryptographic public and private key pairs. These secrets are crucial
  for establishing a secure communication channel between the owner-side
  and the API proxy server.

** Testing **

To enable split mode functionality, the following steps are required:

1. Configuration: Modify the `kernel_params` option in the Kata's
  configuration.toml file to enable split mode and specify the IP
  address of the KBS.
- Add following settings to the `kernel_params` option:
  `agent.split_api=true` and
  `agent.aa_kbc_params=cc_kbc::http://[IP_ADDRESS]:[PORT]`.

2. Dependency on KBS: The kata-agent relies on the KBS to provision
   cryptographic keys to the split API proxy server, facilitating the
   establishment of a secure channel.

- Generate TLS keys and certificates for kata-agent’s API proxy server
  and client (owner-side)

```
$  KATA_DIR=”<PATH to cloned repo>”
$ pushd ${KATA_DIR}/src/agent/grpc_tls_keys
$ ./gen_key_cert.sh
```

 - Create a zip file named 'tls-keys.zip' containing the CA public key
   and the server’s public and private key pair

` $ zip tls-keys.zip server.pem server.key ca.pem`

  - Place this zip file in the KBS resource path
  '/default/tenant-keys/'. During sandbox creation, the kata-agent
  retrieves this file using the KBS 'get resource' API.  It's
  important to note that the KBS conducts a background check on the
  key request, verifying evidence provided by the Trusted Execution
  Environment (TEE). Future extensions to the KBS will automate the
  creation of the server’s public and private key pair for each
  sandbox.

` $ popd`

** External tools required for testing **

To  exercise the API proxy server, we provide the Kata Containers
agent TLS control tool (kata-agent-tls-ctl), derived from the
`kata-agent-ctl` tool in another commit (see `split-api-feature` branch
referenced above).

This tool communicates over a gRPC TLS channel with the kata-agent.
Similar to the kata-agent-ctl, this is a low level tool that is
intended for advanced users. Future commit(s) will introduce a more
user-friendly tool that maintains state, designed to function as a
kubectl plugin for managing owners’ workloads.

Examples of creating and starting a container using
`kata-agent-tls-ctl`:

Setup environment

```
$ export guest_addr=10.89.0.28 # IP address associated with the confidential VM
$ export guest_port=50090         # API proxy server’s port (listens on)
$ export ctl=./target/x86_64-unknown-linux-musl/release/kata-agent-tls-ctl
$ export key_dir=${KATA_DIR}/src/agent/grpc_tls_key
```

Display the status of containers in the sandbox environment

```
$ ${ctl} -l trace connect --key-dir "${key_dir}"  --bundle-dir "${bundle_dir}"  \
--server-address  "ipaddr://${guest_addr}:${guest_port}" \
 -c  "ListContainers"
```

Set a container ID and specify an OCI spec:

```
$ container_id=9e3d1d4750e4e20945d22c358e13c85c6b88922513bce2832c0cf403f065dc6
$ OCI_SPEC_CONFIG=${KATA_DIR}/src/tools/agent-tls-ctl/config.json
```
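The `config.json` referenced above is a standard OCI runtime spec. A heavily trimmed sketch of the fields such a spec carries follows; this is illustrative only — the real file in the tree sets many more fields, and the image-name annotation shown is an assumption about how a guest-pull setup would resolve the container image:

```json
{
    "ociVersion": "1.1.0",
    "process": {
        "args": ["/bin/sh"],
        "cwd": "/",
        "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"]
    },
    "root": { "path": "rootfs", "readonly": false },
    "annotations": {
        "io.kubernetes.cri.image-name": "docker.io/library/busybox:latest"
    }
}
```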

_Note: the next two commands require `pull_image` support in the guest!_
**Create container request**

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
```

**Start container request**

```
$ ${ctl} -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "StartContainer json://{\"container_id\": \"${container_id}\"}"
```

Fixes: kata-containers#1834
@ray-valdez ray-valdez linked a pull request Feb 26, 2024 that will close this issue
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 27, 2024
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Mar 5, 2024
This commit provides a tool for testing the kata-agent's proxy API
server, addressing the following open issues: [Securing the Kata
Control Plane](confidential-containers/confidential-containers#53) and
RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

Depends on: kata-containers#9159

Signed-off-by: Ray Valdez <rvaldez@us.ibm.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Mar 15, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Mar 25, 2024
This commit is an initial step towards mitigating the risk of
untrustworthy host systems when running Kata Containers in the context
of confidential computing. It serves to safeguard against malicious,
privileged users gaining access to the vulnerable Kata control
plane. This protective measure becomes crucial in scenarios where a
malicious cloud service provider or administrator might intercept or
compromise commands from the Kata control plane, tamper with container
configuration files, execute processes within the container, retrieve
workload statistics, or obtain sensitive container workload
information.

** Problem statement **

This commit addresses the following open issues:
[Securing the Kata Control Plane](confidential-containers/confidential-containers#53)
and RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

** Architectural changes **

The commit introduces a new split API mode in the kata-agent, which
partitions the kata-agent’s API endpoints between **host-side**
and **owner-side** controllers. When this mode is enabled, the
host-side controller is restricted to manage resource allocation
during startup and resource recycling at termination. In contrasts,
the owner-side controller allows workload owners to directly manage
theIR deployment pod and containers. This partitioning implicitly
labels kata-agent’s endpoint APIs as _host-exclusive_,
_owner-exclusive_, or _shared_.

Host-exclusive and owner-exclusive APIs are assigned specifically to
either the host-side or owner-side. For instance, `CreateSandbox` and
`DestroySandbox` are examples of host-exclusive APIs, while `CopyFile`
and `ExecProcess` are examples of owner-exclusive APIs. Shared APIs
include those that must be shared to some extent between the control
planes, such as `GetOOMEvent` and `GetGuestDetails`.

** Content of this commit **

This commit focuses on providing a secure channel for the owner-side
to access owner-exlusive and shared APIs. Future commit(s) will
restrict the host-side access to owner-exclusive APIs when split mode
is enabled on the kata-agent and will address the sharing of APIs
between host-side and owner-side.

This commit implements the following changes:
- Introduces the split mode to the kata-agent.
- Integrates a gRPC TLS server to handle API requests from the
  owner-side.  We refer to this as the kata-agent’s API proxy server
  which ensures that workload owners can establish a secure end-to-end
  communication channel with the kata-agent or invoking API endpoint
  commands.
- Utilizes the Key Broker Service (KBS) to provision secrets, i.e.,
  cryptographic public and private key pairs. These secrets are crucial
  for establishing a secure communication channel between the owner-side
  and the API proxy server.

** Testing **

To enable split mode functionality, the following steps are required:

1. Configuration: Modify the `kernel_params` option in the Kata's
  configuration.toml file to enable split mode and specify the IP
  address of the KBS.
- Add following settings to the `kernel_params` option:
  `agent.split_api=true` and
  `agent.aa_kbc_params=cc_kbc::http://[IP_ADDRESS]:[PORT]`.

2. Dependency on KBS: The kata-agent relies on the KBS to provision
   cryptographic keys to the split API proxy server, facilitating the
   establishment of a secure channel.

- Generate TLS keys and certificates for kata-agent’s API proxy server
  and client (owner-side)

```
$  KATA_DIR=”<PATH to cloned repo>”
$ pushd ${KATA_DIR}/src/agent/grpc_tls_keys
$ ./gen_key_cert.sh
```

 - Create a zip file named 'tls-keys.zip' containing the CA public key
   and the server’s public and private key pair

` $ zip tls-keys.zip server.pem server.key ca.pem`

- Place this zip file in the KBS resource path
  '/default/tenant-keys/'. During sandbox creation, the kata-agent
  retrieves this file using the KBS 'get resource' API. It's
  important to note that the KBS conducts a background check on the
  key request, verifying evidence provided by the Trusted Execution
  Environment (TEE). Future extensions to the KBS will automate the
  creation of the server's public and private key pair for each
  sandbox.

```
$ popd
```
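
Since 'tls-keys.zip' carries the server's private key, it can help to
record a digest of the bundle so the copy later retrieved from the
KBS can be compared against it. This is an optional sketch using
standard tooling, not part of the documented flow:

```shell
# Optional: record a digest of the key bundle for later comparison.
sha256sum tls-keys.zip | tee tls-keys.zip.sha256
```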

** External tools required for testing **

To exercise the API proxy server, we provide the Kata Containers
agent TLS control tool (`kata-agent-tls-ctl`), derived from the
`kata-agent-ctl` tool in another commit (see the `split-api-feature`
branch referenced above).

This tool communicates over a gRPC TLS channel with the kata-agent.
Like `kata-agent-ctl`, it is a low-level tool intended for advanced
users. Future commit(s) will introduce a more user-friendly tool that
maintains state, designed to function as a kubectl plugin for
managing owners' workloads.

Examples of creating and starting a container using
`kata-agent-tls-ctl`:

Set up the environment:

```
$ export guest_addr=10.89.0.28  # IP address associated with the confidential VM
$ export guest_port=50090       # port the API proxy server listens on
$ export ctl=./target/x86_64-unknown-linux-musl/release/kata-agent-tls-ctl
$ export key_dir=${KATA_DIR}/src/agent/grpc_tls_keys
```
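
Before invoking the tool, a quick sanity check on these variables can
save a confusing failure later. This helper is a sketch; the function
name `check_env` is ours, not part of the tool:

```shell
# Sketch: fail fast if any of the variables used below is unset.
check_env() {
  missing=0
  for v in guest_addr guest_port ctl key_dir; do
    eval "val=\${$v:-}"
    if [ -z "$val" ]; then
      echo "error: $v is not set" >&2
      missing=1
    fi
  done
  return $missing
}
```

Run `check_env` after the exports above; it returns non-zero and
names each missing variable on stderr.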

Display the status of containers in the sandbox environment:

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "ListContainers"
```

Set a container ID and specify an OCI spec:

```
$ container_id=9e3d1d4750e4e20945d22c358e13c85c6b88922513bce2832c0cf403f065dc6
$ OCI_SPEC_CONFIG=${KATA_DIR}/src/tools/agent-tls-ctl/config.json
```

_Note: the next two commands require pull_image support in the guest!_

**Create container request**

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
```

**Start container request**

```
$ ${ctl} -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "StartContainer json://{\"container_id\": \"${container_id}\"}"
```
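
The create and start requests above can be combined into a small
helper that simply replays the two documented invocations, reusing
the variables set earlier. This is a sketch; the function name
`run_container` is ours:

```shell
# Sketch: create a container, then start it, reusing the documented commands.
# Assumes ctl, key_dir, bundle_dir, guest_addr and guest_port are set as above.
run_container() {
  cid="$1"; spec="$2"
  "${ctl}" -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "CreateContainer cid=${cid} spec=file:///${spec}" || return 1
  "${ctl}" -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "StartContainer json://{\"container_id\": \"${cid}\"}"
}
```

For example: `run_container "${container_id}" "${OCI_SPEC_CONFIG}"`.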

Fixes: kata-containers#1834

Signed-off-by: Ray Valdez <rvaldez@us.ibm.com>
Signed-off-by: Salman Ahmed <sahmed@ibm.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Zhongshu Gu <zgu@us.ibm.com>
Signed-off-by: Pau-Chen Cheng <pau@us.ibm.com>
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Mar 26, 2024
This commit provides a tool for testing the kata-agent's API proxy
server, addressing the following open issues: [Securing the Kata
Control Plane](confidential-containers/confidential-containers#53) and
RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

Signed-off-by: Ray Valdez <rvaldez@us.ibm.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Mar 27, 2024
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Mar 27, 2024