
RFC: Separate trust realms for tenant and host #1834

Open
c3d opened this issue May 12, 2021 · 15 comments · May be fixed by #9159
Labels: area/confidential-containers (Issues related to confidential containers; see also CCv0 branch)

Comments

@c3d (Member) commented May 12, 2021

Summary

In the context of Confidential Computing Enablement (#1332), we break the assumption that the containers trust the host. This makes it explicit that the trust realms for the container workload and for the pod running it do not belong to the same owner. This RFC explores the consequences in terms of API and usage model.

Motivation

Hardware-level enablement of features such as memory encryption, CPU state encryption and memory integrity protection are not sufficient to ensure confidentiality. Exploring this question has shown that achieving real confidentiality raises serious issues regarding system administration, and may not be achievable without deep changes in the existing APIs, notably with respect to the data flow between components of Kata Containers (in particular between runtime and agent).

Objective

The objective of this issue is to explore the implications of confidentiality with respect to the architecture of Kata. The following aspects seem particularly important:

  1. Confidentiality: We don't want a design where the door is locked but the window is open. So we need to ensure that the benefits of technologies such as memory encryption are not defeated by Kata communication channels.
  2. Flexibility: Very early in the platform support for confidentiality, we already have several models that behave somewhat differently, notably with respect to attestation or integrity protection. We want to accommodate various platforms, and to not settle for a lowest common denominator.
  3. Compatibility: To the maximum extent, we want to preserve the ability to run existing workloads, or alternatively, to provide a clear upgrade path.
  4. Enforcement: We want to be able to enforce confidentiality when it is requested, not to rely on the user not making mistakes.

Non-objective

  1. All features with confidentiality: Some features will be incompatible with confidentiality, for example VGPU (at least until GPU vendors themselves provide support for confidentiality). The "enforcement" objective means it is not an objective to support confidentiality with these features, but to detect such incompatibilities as much as possible.
  2. Fork: As much as possible, we want to avoid a confidential-only fork of Kata, knowing that some other projects are already exploring other ways to deliver confidential containers.

Proposal

This RFC makes the following recommendations to address the problem:

  1. Introduce a new terminology, hosts vs. tenants, and the associated security realms.
  2. Introduce the notion of an immutable pod and its implications, notably regarding the OCI specification; one idea is to use annotations to pass a complete, non-incremental description of the pod and its containers.
  3. Split kubectl commands / APIs based on owner: host (create a pod), tenant (e.g. exec), both (e.g. statistics and metrics) - possibly with changes in semantics (e.g. Create() currently sets up stdin/stdout/stderr, but in a confidential case, those are tenant-owned).
  4. Change the RPC data path with the agent when running confidentially, for example using the same secure channel used for attestation, and cut communication between host and agent (i.e. no vsock with confidentiality).
  5. Restrict the semantics of the existing agent APIs, and explore the idea that some APIs could be locked down, e.g. through a configuration file in the image.
  6. Restrict storage to virtio-blk when confidentiality is requested, take over the block device using tenant-supplied secrets (e.g. format the volume with LUKS at first use), and explore the possibility of sharing volumes across pods owned by the same tenant (e.g. to limit storage explosion).
  7. Add APIs to the agent to securely download and mount images on such a tenant-owned block device.
  8. Enhance image building tools to create images that can be attested and encrypted.
  9. Add support for attestation services in the pod boot process, and only start the pod (and therefore agent) if attestation succeeded.
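Recommendation 2 (a complete, non-incremental pod description) could be carried in a single OCI annotation that the guest attests as one unit. A minimal sketch in Python; the annotation key, the schema, the registry name, and the digest are all hypothetical, not an agreed Kata format:

```python
import json

# Hypothetical complete pod description; nothing here is an agreed schema.
pod_description = {
    "containers": [
        {"name": "app", "image": "registry.example.com/app@sha256:" + "0" * 64},
    ],
    "resources": {"vcpus": 4, "memory_mib": 4096},
}

# Deterministic serialization (sorted keys) so the same description always
# produces the same bytes, and hence the same measurement.
annotations = {
    "io.katacontainers.confidential.pod": json.dumps(pod_description, sort_keys=True),
}
```

Because the description is passed whole rather than incrementally, the guest can measure it once at creation time instead of tracking a sequence of mutations.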

In addition, it assumes that some other proposals are completed, notably:

  1. The overall Confidential Computing enablement ([RFC] [WIP] Confidential Computing Enablement #1332), which is required to enable hardware support
  2. Moving image download inside the guest (image pulling inside sandbox #149), which is required for point (7) above.
  3. Some form of node feature discovery ([RFC] Preflight-check / node feature discovery #1716) notably to detect hypervisor / hardware support
  4. Making block devices directly visible in the guest (docs: add design proposal for direct-assigned volume #1568) and not through virtiofs, for point (6) above.

Details

Host vs. Tenant trust realms

This RFC proposes the introduction of the notion of tenant and host in the Kata Containers terminology:

  • The host owns the hardware running the virtual machines / pods. In the new model, the host has no access to the tenant's data, other than the ability to kill pods and measure them from the outside.
  • The tenant owns the containers and what runs inside the pods. In the new model, the tenant does not trust the host, except in so far as the host provides raw resources (CPU, memory, disk).

[Figure: Host vs. Tenant trust realms]

In the rest of this proposal, the host and tenant will be considered separate trust realms, and it is assumed that some users may have access only to one side. Therefore, the proposed processes and APIs "must work" given this constraint.

Immutable pod

During the creation of the pod, the pod content is attested. This includes the boot image, kernel, firmware and possibly some restrictions on which images the pod can download.

[Figure: Immutable pod]

This implies that we cannot mutate the pod once it has started running containers. Some mutations, e.g. hot-plugging memory, may make sense from a security point of view, but may be hard to implement in practice or not necessarily supported by the hardware / firmware implementations.

An important implication is that Kata in "confidential" mode will probably not need hot-plugging, which simplifies the architecture quite a bit. One interesting question is if we can leverage the work required for confidential computing in order to gain the same benefits in the "normal" case. In other words, could a future Kata 3 supporting confidential computing also only support immutable pods?

Note that hot-plugging memory may seem quite useful, but in practice today, when you resize a pod, I believe k8s will always restart the pod anyway rather than mutate it in place.

Host vs. Guest kubectl commands

A result of the approach is that we need to split APIs based on whether they belong to the host or to the tenant.

[Figure: Split APIs]

The following list is based on the usage for kubectl, prefixed with:

  • H(host): Host privileged

  • T(enant): Tenant privileged

  • B(oth): the command makes sense for both host and tenant users

  • S(plit): The command may need to be split based on the user, e.g. accesses objects that are host or tenant

  • C(omplicated): The command may need cooperation between host and tenant

  • H create - Create a resource from a file or from stdin.

  • C expose - Take a replication controller, service, deployment or pod and expose it as a new Kubernetes Service

  • T run - Run a particular image on the cluster

  • S set - Set specific features on objects

  • B explain - Documentation of resources

  • S get - Display one or many resources

  • S edit - Edit a resource on the server

  • H delete - Delete resources by filenames, stdin, resources and names, or by resources and label selector

  • H rollout - Manage the rollout of a resource

  • H scale - Set a new size for a Deployment, ReplicaSet or Replication Controller

  • H autoscale - Auto-scale a Deployment, ReplicaSet, or ReplicationController

  • S certificate - Modify certificate resources.

  • H cluster-info - Display cluster info

  • S top - Display Resource (CPU/Memory/Storage) usage.

  • H cordon - Mark node as unschedulable

  • H uncordon - Mark node as schedulable

  • H drain - Drain node in preparation for maintenance

  • H taint - Update the taints on one or more nodes

  • S describe - Show details of a specific resource or group of resources

  • T logs - Print the logs for a container in a pod

  • T attach - Attach to a running container

  • T exec - Execute a command in a container

  • T port-forward - Forward one or more local ports to a pod

  • T proxy - Run a proxy to the Kubernetes API server

  • T cp - Copy files and directories to and from containers.

  • S auth - Inspect authorization

  • T debug - Create debugging sessions for troubleshooting workloads and nodes

  • C diff - Diff live version against would-be applied version

  • C apply - Apply a configuration to a resource by filename or stdin

  • C patch - Update field(s) of a resource

  • C replace - Replace a resource by filename or stdin

  • T wait - Experimental: Wait for a specific condition on one or many resources.

  • C kustomize - Build a kustomization target from a directory or a remote url.

  • S label - Update the labels on a resource

  • S annotate - Update the annotations on a resource

  • T completion - Output shell completion code for the specified shell (bash or zsh)

  • ? api-resources - Print the supported API resources on the server

  • ? api-versions - Print the supported API versions on the server, in the form of "group/version"

  • S config - Modify kubeconfig files

  • ? plugin - Provides utilities for interacting with plugins.

  • B version - Print the client and server version information
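The classification above can be expressed as data and enforced mechanically, e.g. in a tenant-owned proxy. A hedged sketch: the mapping simply transcribes the prefixes in the list, and the `allowed` helper is hypothetical; S(plit) and C(omplicated) verbs are treated as "needs further splitting or cooperation" rather than decided here.

```python
# Transcription of the H/T/B/S/C prefixes from the kubectl list above.
KUBECTL_REALM = {
    "create": "H", "delete": "H", "rollout": "H", "scale": "H", "autoscale": "H",
    "cluster-info": "H", "cordon": "H", "uncordon": "H", "drain": "H", "taint": "H",
    "run": "T", "logs": "T", "attach": "T", "exec": "T", "port-forward": "T",
    "proxy": "T", "cp": "T", "debug": "T", "wait": "T", "completion": "T",
    "explain": "B", "version": "B",
    "set": "S", "get": "S", "edit": "S", "certificate": "S", "top": "S",
    "describe": "S", "auth": "S", "label": "S", "annotate": "S", "config": "S",
    "expose": "C", "diff": "C", "apply": "C", "patch": "C", "replace": "C",
    "kustomize": "C",
}

def allowed(verb, realm):
    """Return True if a user in `realm` ('host' or 'tenant') may run `verb`."""
    cls = KUBECTL_REALM.get(verb)
    if cls is None:
        return False          # unknown verb: deny by default
    if cls == "B":
        return True           # makes sense for both sides
    if cls in ("S", "C"):
        return True           # needs per-object splitting or cooperation
    return (cls == "H") == (realm == "host")
```

Encoding the split as data rather than code would also make it auditable: the tenant can verify exactly which verbs the host side may exercise.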

RPC data path

If we want to be able to support commands like kubectl exec without exposing too much data to the host, we need a secure channel of communication between kubectl and the agent.

One possible idea is to leverage the secure channel that needs to be set up for attestation. The startup sequence would become something like the following:

  1. Host: Start the guest
  2. Agent: Read credentials and configuration from image, switch to "confidential" mode
  3. Agent: Open secure channel for attestation using root credentials from step 2
  4. Agent: Receive initial secrets from attestation server
  5. Agent: Send kernel / firmware measurements over secure channel
  6. Agent: Verify attestation result, and if OK, initiate image download (see discussion in image pulling inside sandbox #149 and [RFC] [WIP] Confidential Computing Enablement #1332)
  7. Agent: Start container, capturing stdin, stderr, stdout and logs
  8. Agent: Respond to usual RPC over secure channel instead of vsock (e.g. console I/Os, exit codes, stats, logs)

Note that, technically, the secure channel for the RPC need not be the same as the one used for attestation, and we may later discover good reasons to use a different one.
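The startup sequence can be sketched as a gated pipeline, where step 6 acts as the gate: if attestation fails, no image is downloaded and no container starts. Step names below are illustrative, not actual agent symbols:

```python
# Illustrative model of the confidential boot flow described above.
BOOT_SEQUENCE = [
    "start_guest",              # 1. host starts the guest
    "load_credentials",         # 2. agent reads credentials, enters confidential mode
    "open_secure_channel",      # 3. secure channel for attestation
    "receive_initial_secrets",  # 4. initial secrets from the attestation server
    "send_measurements",        # 5. kernel / firmware measurements
    "verify_attestation",       # 6. gate: continue only if attestation succeeds
    "download_images",          # 6. (cont.) image download, see #149
    "start_containers",         # 7. capture stdin/stdout/stderr and logs
    "serve_rpc",                # 8. RPC over the secure channel, not vsock
]

def boot(attestation_ok):
    """Return the steps actually executed; stop at the gate on failure."""
    done = []
    for step in BOOT_SEQUENCE:
        if step == "download_images" and not attestation_ok:
            break
        done.append(step)
    return done
```

The useful property of this shape is that everything tenant-visible (images, workload I/O) sits strictly after the attestation gate.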

This implies that, on the other side, the APIs are no longer routed through the runtime; instead, clients talk either to the agent directly or through some tenant-owned proxy (the attestation server being a possibility that may reduce the complexity).

In this model, there is no longer any vsock between the runtime on the host and the agent. As a matter of fact, there are probably no good reasons for the runtime to be able to communicate with the agent once the pod has been created. The runtime, on the other hand, still has responsibility for killing the pod or for "external" metrics, e.g. resource utilisation as can be measured by cgroupv2.

Restrict the semantics of agent API

Step 2 in the previous section reads a configuration from the image, and switches to "confidential" mode.

During discussions in the Confidential Computing use cases meeting, we came to an agreement that it is probably better to have a single agent that exposes different APIs depending on the usage, rather than to configure out the irrelevant APIs at compile-time. The benefit is to simplify the delivery mechanism. For example, the image building process would have to deal with a single agent binary and not have to select the right agent based on whether the image is for confidential use or non-confidential use.

The approach proposed here is slightly more radical, since the idea is that the RPC channel with the host would entirely be closed when in confidential mode. In that model, the APIs may remain somewhat identical, though we receive them through a different transport channel.

There are still some restrictions that seem necessary when looking at the agent protocol:

  • CreateContainer and StartContainer may be on different sides of a payload attestation protocol. We may want to have CreateContainer before image download, but reject StartContainer if the image does not pass attestation. We probably need to clarify the difference / order with CreateSandbox and DestroySandbox (do we really need both?)
  • RemoveContainer is now primarily a tenant operation, but it might make sense to allow "friendly shutdown", i.e. relay a host-side pod tear-down request and give in-guest processes a chance to terminate cleanly.
  • Same kind of question with SignalContainer.
  • Are PauseContainer and ResumeContainer related to host-controlled resources initially, or should we treat them as tenant operations?
  • All console (stdio) operations are tenant-owned, so simply routing them through a secure channel may be sufficient.
  • All the networking APIs may need a bit of thought. Do we need cooperation from the host here?
  • The tracing / observability APIs are tenant-owned and seem to be fine if we go through a secure channel.
  • OnlineCPUMem and MemHotPlugByProbe would go away with an immutable pod approach.
  • ReseedRandomDev is probably one we would flatly reject in a confidential computing scenario. Can someone suggest any good use for this API in other contexts, assuming we have a complete pod description initially? What about SetGuestDateTime? Any reason that shouldn't be disabled in the confidential case?
  • GetGuestDetails: This probably becomes a tenant-only API in the confidential case, and we may need to add more details, e.g. the details of the hardware + software + firmware support (e.g. do we have integrity protection).
  • CopyFile becomes tenant-only.
  • GetOOMEvent is probably tenant-only in the confidential case, although I could see a case for reporting OOM events to the host.
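Taken together, the bullets above amount to a dispatch policy inside a single agent binary: some endpoints are blocked outright in confidential mode, others are reserved for the tenant-side channel. A hedged sketch; the sets merely transcribe the discussion and are not an agreed contract:

```python
# Endpoints the bullets above suggest rejecting entirely in confidential mode.
BLOCKED_IN_CONFIDENTIAL = {
    "OnlineCPUMem",        # immutable pod: no hotplug
    "MemHotplugByProbe",   # immutable pod: no hotplug
    "ReseedRandomDev",     # host must not influence guest entropy
    "SetGuestDateTime",    # host must not influence guest time
}

# Endpoints that become tenant-only (served over the secure channel).
TENANT_ONLY_IN_CONFIDENTIAL = {"CopyFile", "GetGuestDetails", "ExecProcess"}

def dispatch(api, caller, confidential):
    """Return True if a call from `caller` ('host' or 'tenant') is accepted."""
    if not confidential:
        return True                      # legacy behavior, unchanged
    if api in BLOCKED_IN_CONFIDENTIAL:
        return False
    if api in TENANT_ONLY_IN_CONFIDENTIAL and caller != "tenant":
        return False
    return True
```

A runtime policy like this keeps a single agent binary, per the agreement above, while still letting the confidential mode present a strictly smaller host-facing surface.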

Question: could an image that is confidentiality-capable be used for the non-confidential case? That would simplify image management further, but require some additional logic at startup to decide which mode to use. Probably too risky?

Question: How would agent-ctl need to be modified to address this? It already has a flexible addressing scheme for the agent, so maybe simply accepting an https:// URL for the agent instead of unix:/// might be sufficient as far as UI is concerned?

Restrict the storage to virtio-blk

This aspect depends specifically on #1568 and related work being completed. There are reasons to believe that using virtiofs to access a host-provided filesystem will never fly with confidential containers. In particular, it communicates too much information to the host, if only through metadata. This seems unfixable while retaining file-level semantics.

Therefore, it is likely that the model in confidential mode will be:

  1. The guest mounts host-provided block devices, including for image storage and overlays.
  2. All data on such devices will be encrypted using tenant-owned keys.
  3. At mount time, the agent will check that it can decrypt the disk. If not, it will format the disk using something like LUKS and the tenant-owned keys. The idea is that the tenant controls whether the disks can even be "replayed" from one run to the next, or shared between pods. By rotating keys between boots, you ensure that the disk is erased each time and that the host cannot attempt replay attacks. By reusing keys, including between pods, you may be able to share disks that contain container images and overlays between guests owned by the same tenant.
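Item 3 can be sketched as a small mount-time decision: try to open the disk with the tenant key; on failure, reformat, which destroys prior contents. Rotating the key between boots therefore guarantees a fresh disk, while reusing the key preserves data and enables sharing. The disk is modeled as a dict here; a real implementation would drive something like cryptsetup/LUKS:

```python
def mount_encrypted(disk, tenant_key):
    """Open the disk if the tenant key matches, otherwise (re)format it.

    `disk` is a plain dict standing in for a block device. Formatting
    clears it, which models the fact that reformatting with a new LUKS
    key makes the old contents unrecoverable.
    """
    if disk.get("key") == tenant_key:
        return "opened"          # key reuse: existing data preserved
    disk.clear()                 # format: old contents unrecoverable
    disk["key"] = tenant_key     # e.g. LUKS with the tenant-owned key
    return "formatted"
```

The interesting property is that the policy (fresh disk vs. shared disk) is expressed entirely through the tenant's key management, with no cooperation needed from the host.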

In the case of QEMU, we may be able to use qemu-nbd to share encrypted disk images between pods without the host having access to the underlying data. This would make it possible to have disk usage comparable to that of regular Kata or non-Kata containers. It is not entirely clear whether this works today (i.e. can we have a cluster file system on top of LUKS, for example).

Add APIs for secure image download

The agent APIs today do not offer features to download images. This is not really discussed in #149.

Question: We may want to be able to restrict the images that can be downloaded. How?
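One possible answer, sketched below, is to pin allowed images by digest in the attested pod description and have the agent reject everything else. The allowlist and the digest value are placeholders:

```python
# Hypothetical allowlist of image digests, carried in the attested pod
# description so the host cannot alter it after measurement.
ALLOWED_DIGESTS = {
    "sha256:" + "0" * 64,  # placeholder pinned digest
}

def may_pull(image_ref):
    """Accept only digest-pinned references that appear on the allowlist."""
    if "@sha256:" not in image_ref:
        return False                  # tags are mutable: reject
    digest = image_ref.split("@", 1)[1]
    return digest in ALLOWED_DIGESTS
```

Rejecting tag-based references matters here: a tag can be repointed by whoever controls the registry, whereas a digest is self-verifying.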

Enhance building tools to create images that can be attested and encrypted

Can we entirely rely on existing encrypted container image toolsets? Probably not: we also need osbuilder / image builder changes to make sure we can generate images that can be attested. This belongs to the tenant, but the images are then stored on the host, so we need host credentials as well.

Add support for attestations

This would be based on remote attestation protocols. We need some way to specify the URL of the broker.

@c3d c3d added the needs-review Needs to be assessed by the team. label May 12, 2021

@c3d c3d changed the title RFC [Do not read yet]: Separate trust domains for tenant and host RFC: Separate trust domains for tenant and host May 12, 2021
@jimcadden (Contributor):

Thanks, @c3d! I am largely in agreement with everything you've written. I do think Immutable Pod is a bit of a misnomer: presumably, after the separation, the tenant will still be able to update containers, or deploy ephemeral containers, within a running pod, as long as the operation and the image come from a trusted source. I do agree that the host-controlled VM/sandbox should be made effectively immutable.

@bpradipt (Contributor):

@c3d thanks for the proposal. It nicely lays down the host and tenant aspects.
While going through it, I had a thought on the possibility of having the host-guest API restrictions compiled into the agent. IOW, two different builds for the agent (based on build tags): one with a restricted API set for confidential computing and another generic build.

@c3d (Member, Author) commented May 17, 2021

> While going through it, I had a thought on the possibility of having the host-guest API restrictions compiled into the agent. IOW, two different builds for the agent (based on build tags): one with a restricted API set for confidential computing and another generic build.

@bpradipt We discussed the possibility during the confidential computing use-case meeting. This is why in the proposal I wrote:

> During discussions in the Confidential Computing use cases meeting, we came to an agreement that it is probably better to have a single agent that exposes different APIs depending on the usage, rather than to configure out the irrelevant APIs at compile-time. The benefit is to simplify the delivery mechanism. For example, the image building process would have to deal with a single agent binary and not have to select the right agent based on whether the image is for confidential use or non-confidential use.

This is definitely an option though.

@bpradipt (Contributor):

> While going through it, I had a thought on the possibility of having the host-guest API restrictions compiled into the agent. IOW, two different builds for the agent (based on build tags): one with a restricted API set for confidential computing and another generic build.
>
> @bpradipt We discussed the possibility during the confidential computing use-case meeting. This is why in the proposal I wrote:
>
> > During discussions in the Confidential Computing use cases meeting, we came to an agreement that it is probably better to have a single agent that exposes different APIs depending on the usage, rather than to configure out the irrelevant APIs at compile-time. The benefit is to simplify the delivery mechanism. For example, the image building process would have to deal with a single agent binary and not have to select the right agent based on whether the image is for confidential use or non-confidential use.
>
> This is definitely an option though.

Thanks for the clarification @c3d

@c3d c3d added this to To do in Confidential containers May 20, 2021
@ariel-adam ariel-adam added area/confidential-containers Issues related to confidential containers (see also CCv0 branch) and removed needs-review Needs to be assessed by the team. labels May 25, 2021
@c3d c3d self-assigned this May 26, 2021
@c3d c3d changed the title RFC: Separate trust domains for tenant and host RFC: Separate trust realms for tenant and host May 26, 2021
@c3d (Member, Author) commented May 26, 2021

Updated "trust domains" to "trust realms" to make it clear that we are not talking about TDs (Trust Domains) as defined by TDX.

@c3d (Member, Author) commented May 27, 2021

Added a few images to make things easier to understand.

@c3d (Member, Author) commented Oct 31, 2022

Updated slide deck from the October presentation: https://docs.google.com/presentation/d/16649JpQAdDb3jh3OVKWdNks0mvcqpidssrm0pOZNKZ4/edit?usp=sharing

@c3d (Member, Author) commented Feb 27, 2023

Current categorization of endpoints maintained by @ray-valdez: https://app.box.com/file/1109515066100?s=0ybmczv3mko7ub3zfob43bpfzkbw8ms6.

Summary here:

  • 38 endpoint APIs + 1 (ListContainers) = 39
  • 12 belong to the host (including ~5 shared); the rest belong to the tenant
  • Currently implemented on the tenant side: pause, resume, exec, listcontainers

The split host/tenant APIs presentation listed 35 endpoints, plus the health service (check and version) and the image service (pullimage).

@c3d (Member, Author) commented Feb 27, 2023

Host-side endpoints (details in document linked above):

| Name | Category | Description |
|------|----------|-------------|
| AddARPNeighbors | Networking | Add an ARP neighbor (netlink.rs) |
| CreateSandbox | Initialization | Initialize the sandbox (rpc.rs, mount.rs, network.rs, ..) |
| DestroySandbox | Termination | Destroy the sandbox (rpc.rs, sandbox.rs) |
| GuestDetails | Status / Stats | Get details on guest and agent |
| MemHotplugByProbe | Initialization | Add memory via hotplug |
| OnlineCPUMem | Initialization | Add CPU via hotplug |
| UpdateInterface | Networking | Update interfaces on links |
| UpdateRoutes | Networking | Update routes on links |
| ListContainers | Initialization | Return sandbox and container IDs |

The hotplug endpoints may have to be removed in a CC context. This depends on #2637 and related work. Until then, we can get the "final" layout of the pod using annotations to avoid hotplugging.

The ListContainers endpoint is currently somewhat special, because it is required by the OCI specification. However, like PullImage, this is something that we should be able to transfer to the guest.

@c3d (Member, Author) commented Feb 27, 2023

Tenant "admin" endpoints (if we ever draw a distinction between admin and non-admin guest access):

| Name | Category | Description |
|------|----------|-------------|
| AddSwap | Initialization | Use block device for swap |
| CreateContainer | Initialization | Create a container (rpc.rs) |
| GetIPTables | Networking | Return the result of IP6TABLES_SAVE or IPTABLES |
| GetVolumeStats | Storage / Stats | Get volume capacity and inode statistics |
| ListInterfaces | Networking | Get networking interfaces (rpc.rs, netlink.rs) |
| ListRoutes | Networking | Get networking routes (rpc.rs, netlink.rs) |
| PauseContainer | Status | Pause a container |
| ResumeContainer | Status | Resume a paused container |
| SetIPTables | Networking | Set IP tables in the guest |
| StartContainer | Initialization | Start a container |
| StatsContainer | Stats | Return statistics about the container |
| UpdateContainer | Status | Update a container's resources |
| PullImage | Initialization | Pull an image from within a container |

GetVolumeStats may be also necessary for the host to be able to perform storage resource allocation.

Overall, the networking endpoints are the most problematic. ListRoutesRequest handling, for example, may depend on the network interface card (NIC) configuration. For a pass-through device, the guest has complete control, so the host may not need to be involved at all. For virtual networks, on the other hand, the host may play a role in the routing, so coordination and sequencing between host and guest might be necessary.

The statistics endpoints seen by the agent do not seem relevant to the host. However, similar values are required for proper host-side scheduling and accounting, so these endpoints will need to be effectively split.

@c3d (Member, Author) commented Feb 27, 2023

Tenant non-admin endpoints:

| Name | Category | Description |
|------|----------|-------------|
| CloseStdin | Termination | Termination due to closing of standard input |
| CopyFile | Initialization | Copy files to a specific location |
| ExecProcess | User actions | Execute a process in a container |
| GetMetrics | Stats | Return metrics for the host/guest side of a container |
| GetOOMEvent | Errors | Get out-of-memory event for a container |
| ReadStderr | Logging | Read the standard error stream from a container |
| ReadStdout | Logging | Read the standard output stream from a container |
| RemoveContainer | Termination | Remove a container |
| ReseedRandomDev | User actions | Set the random seed |
| ResizeVolume | Storage | Resize a volume (host + guest) |
| SetGuestDateTime | User actions | Set guest time (using libc) |
| SignalProcess | User actions | Send a signal to a container / all processes in the container's cgroup |
| TtyWinResize | User actions | Set the process' stdout terminal size using libc |

CopyFile currently accesses files on the host. The exact semantics to give it in a CC context remain to be defined.

GetMetrics needs to be split between host and guest, since some metrics are relevant to the host, e.g. for scheduling or billing purposes.

ReadStdout and ReadStderr might need some extra buffering compared to the normal scenario, since we don't have access until a TLS / secure connection has been established with the agent.
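That buffering could look like the sketch below: output produced before the secure channel exists is queued, then flushed once the channel is established. Illustrative only, not the agent's actual implementation:

```python
class BufferedStream:
    """Queue container output until the tenant's secure channel is up."""

    def __init__(self):
        self.pending = []      # chunks produced before the channel exists
        self.connected = False

    def write(self, chunk):
        """Return the chunks actually delivered to the tenant."""
        if not self.connected:
            self.pending.append(chunk)   # hold until connect
            return []
        return [chunk]

    def on_secure_channel_ready(self):
        """Mark the channel up and flush everything buffered so far."""
        self.connected = True
        flushed, self.pending = self.pending, []
        return flushed
```

A real implementation would also need a bound on the buffer (a container can produce output faster than the tenant connects), but the ordering guarantee is the important part.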

ReseedRandomDev has strong implications for cryptographic security. It may need to be moved to the "tenant admin" category, or disabled entirely. Some TEE platforms have special handling for crypto seeds.

ResizeVolume needs some cooperation from the host and some sequencing.

@c3d (Member, Author) commented Feb 27, 2023

Minimal set of endpoints:

  • PullImage
  • CreateContainer
  • ExecProcess (Done)
  • PauseContainer (Done)
  • RemoveContainer
  • ResumeContainer (Done)
  • StartContainer
  • ListContainers (Done)

@c3d (Member, Author) commented Feb 28, 2023

@ray-valdez Could you check that the above comments capture our latest status?

@ray-valdez:

@c3d Yes, the above comments summarize our latest status. One point to add is that the CreateContainer endpoint is also invoked by the host during sandbox creation, so before handing off workload management to the tenant side we'll need to disable host-side access to this endpoint.

ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Feb 23, 2024
This commit is an initial step towards mitigating the risk of untrustworthy host systems when running Kata Containers in the context of confidential computing. It serves to safeguard against malicious, privileged users gaining access to the vulnerable Kata control plane. In scenarios where a malicious cloud service provider or administrator might intercept or compromise commands from the Kata control plane, tamper with container configuration files, execute processes within the container, retrieve workload statistics, or obtain sensitive container workload information, this protective measure becomes crucial.

This commit addresses the following open issues: Securing the Kata Control Plane (confidential-containers/confidential-containers#53) and RFC: Separate trust realms for tenant and host (kata-containers#1834). A detailed history can be found at: https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

The commit introduces a new split API mode in the kata-agent, which partitions the kata-agent's API endpoints between **host-side** and **owner-side** controllers. When this mode is enabled, the host-side controller is restricted to managing resource allocation during startup and resource recycling at termination. In contrast, the owner-side controller allows workload owners to directly manage their deployment pod and containers. This partitioning implicitly labels the kata-agent's endpoint APIs as _host-exclusive_, _owner-exclusive_, or _shared_.

Host-exclusive and owner-exclusive APIs are assigned specifically to either the host-side or owner-side. For instance, `CreateSandbox` and `DestroySandbox` are examples of host-exclusive APIs, while `CopyFile` and `ExecProcess` are examples of owner-exclusive APIs. Shared APIs include those that must be shared to some extent between the control planes, such as `GetOOMEvent` and `GetGuestDetails`.

This commit focuses on providing a secure channel for the owner-side to access owner-exclusive and shared APIs. Future commit(s) will restrict host-side access to owner-exclusive APIs when split mode is enabled on the kata-agent, and will address the sharing of APIs between host-side and owner-side.

This commit implements the following changes:
- Introduces the split mode to the kata-agent.
- Integrates a gRPC TLS server to handle API requests from the owner-side. We refer to this as the kata-agent's API proxy server, which ensures that workload owners can establish a secure end-to-end communication channel with the kata-agent for invoking API endpoint commands.
- Utilizes the Key Broker Service (KBS) to provision secrets, i.e., cryptographic public and private key pairs. These secrets are crucial for establishing a secure communication channel between the owner-side and the API proxy server.

To enable split mode functionality, the following steps are required:

1. Configuration: Modify the `kernel_params` option in Kata's configuration.toml file to enable split mode and specify the IP address of the KBS.
- Add the following settings to the `kernel_params` option: `agent.split_api=true` and `agent.aa_kbc_params=cc_kbc::http://[IP_ADDRESS]:[PORT]`.

2. Dependency on KBS: The kata-agent relies on the KBS to provision cryptographic keys to the split API proxy server, facilitating the establishment of a secure channel.

- Generate TLS keys and certificates for the kata-agent's API proxy server and client (owner-side):

```
$ KATA_DIR="<PATH to cloned repo>"
$ pushd ${KATA_DIR}/src/agent/grpc_tls_keys
$ ./gen_key_cert.sh
```

 - Create a zip file named 'tls-keys.zip' containing the CA public key and the server’s public and private key pair

`$ zip tls-keys.zip server.pem server.key ca.pem`

  - Place this zip file in the KBS resource path '/default/tenant-keys/'. During sandbox creation, the kata-agent retrieves this file using the KBS 'get resource' API.  It's important to note that the KBS conducts a background check on the key request, verifying evidence provided by the Trusted Execution Environment (TEE). Future extensions to the KBS will automate the creation of the server’s public and private key pair for each sandbox.

`$ popd`

To exercise the API proxy server, we provide the Kata Containers agent TLS control tool (kata-agent-tls-ctl), derived from the kata-agent-ctl tool in another commit. This tool communicates over a gRPC TLS channel with the kata-agent. Similar to kata-agent-ctl, this is a low-level tool intended for advanced users. Future commit(s) will introduce a more user-friendly tool that maintains state, designed to function as a kubectl plugin for managing owners' workloads.

Examples of creating and starting a container using kata-agent-tls-ctl:

Setup environment

```
$ export guest_addr=10.89.0.28   # IP address associated with the confidential VM
$ export guest_port=50090        # port the API proxy server listens on
$ export ctl=./target/x86_64-unknown-linux-musl/release/kata-agent-tls-ctl
$ export key_dir=${KATA_DIR}/src/agent/grpc_tls_keys
$ export bundle_dir="<PATH to OCI bundle directory>"   # used by the commands below
```

Display the status of containers in the sandbox environment

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
  --server-address "ipaddr://${guest_addr}:${guest_port}" \
  -c "ListContainers"
```

Set a container ID and specify an OCI spec:

```
$ container_id=9e3d1d4750e4e20945d22c358e13c85c6b88922513bce2832c0cf403f065dc6
$ OCI_SPEC_CONFIG=${KATA_DIR}/src/tools/agent-tls-ctl/config.json
```

_Note: the next two commands require `pull_image` support in the guest!_
**Create container request**
```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
  --server-address "ipaddr://${guest_addr}:${guest_port}" \
  -c "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
```

**Start container request**

```
$ ${ctl} -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}"  \
--server-address "ipaddr://${guest_addr}:${guest_port}" \
-c "StartContainer json://{\"container_id\": \"${container_id}\"}"
```
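The two requests above can be wired into one helper. This is a dry-run sketch built only from the flags shown in this message: all values are placeholders, and by default it echoes each `kata-agent-tls-ctl` invocation instead of executing it, since actually running them requires a live API proxy server (set `DRY_RUN` to empty against a real sandbox):

```
#!/bin/sh
# Dry-run sketch: compose the CreateContainer/StartContainer requests
# shown above. DRY_RUN=echo prints each command instead of contacting
# a live kata-agent API proxy server.
set -e
ctl="${ctl:-kata-agent-tls-ctl}"
key_dir="${key_dir:-./grpc_tls_keys}"
bundle_dir="${bundle_dir:-./bundle}"
guest_addr="${guest_addr:-10.89.0.28}"
guest_port="${guest_port:-50090}"
container_id="${container_id:-demo}"
OCI_SPEC_CONFIG="${OCI_SPEC_CONFIG:-./config.json}"
DRY_RUN="${DRY_RUN:-echo}"

run_ctl() {
  # $1: extra flags (may be empty), $2: API command string
  ${DRY_RUN} "${ctl}" -l trace connect $1 \
    --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "$2"
}

run_ctl ""                 "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
run_ctl "--no-auto-values" "StartContainer json://{\"container_id\": \"${container_id}\"}"
```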
Fixes: kata-containers#1834
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 23, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 23, 2024
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Feb 23, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 23, 2024
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
This commit is an initial step towards mitigating the risk of
untrustworthy host systems when running Kata Containers in the context
of confidential computing. It serves to safeguard against malicious,
privileged users gaining access to the vulnerable Kata control
plane. This protective measure becomes crucial in scenarios where a
malicious cloud service provider or administrator might intercept or
compromise commands from the Kata control plane, tamper with container
configuration files, execute processes within the container, retrieve
workload statistics, or obtain sensitive container workload
information.

** Problem statement **

This commit addresses the following open issues:
[Securing the Kata Control Plane](confidential-containers/confidential-containers#53)
and RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

** Architectural changes **

The commit introduces a new split API mode in the kata-agent, which
partitions the kata-agent’s API endpoints between **host-side**
and **owner-side** controllers. When this mode is enabled, the
host-side controller is restricted to manage resource allocation
during startup and resource recycling at termination. In contrasts,
the owner-side controller allows workload owners to directly manage
theIR deployment pod and containers. This partitioning implicitly
labels kata-agent’s endpoint APIs as _host-exclusive_,
_owner-exclusive_, or _shared_.

Host-exclusive and owner-exclusive APIs are assigned specifically to
either the host-side or owner-side. For instance, `CreateSandbox` and
`DestroySandbox` are examples of host-exclusive APIs, while `CopyFile`
and `ExecProcess` are examples of owner-exclusive APIs. Shared APIs
include those that must be shared to some extent between the control
planes, such as `GetOOMEvent` and `GetGuestDetails`.

** Content of this commit **

This commit focuses on providing a secure channel for the owner-side
to access owner-exlusive and shared APIs. Future commit(s) will
restrict the host-side access to owner-exclusive APIs when split mode
is enabled on the kata-agent and will address the sharing of APIs
between host-side and owner-side.

This commit implements the following changes:
- Introduces the split mode to the kata-agent.
- Integrates a gRPC TLS server to handle API requests from the
  owner-side.  We refer to this as the kata-agent’s API proxy server
  which ensures that workload owners can establish a secure end-to-end
  communication channel with the kata-agent or invoking API endpoint
  commands.
- Utilizes the Key Broker Service (KBS) to provision secrets, i.e.,
  cryptographic public and private key pairs. These secrets are crucial
  for establishing a secure communication channel between the owner-side
  and the API proxy server.

** Testing **

To enable split mode functionality, the following steps are required:

1. Configuration: Modify the `kernel_params` option in the Kata's
  configuration.toml file to enable split mode and specify the IP
  address of the KBS.
- Add following settings to the `kernel_params` option:
  `agent.split_api=true` and
  `agent.aa_kbc_params=cc_kbc::http://[IP_ADDRESS]:[PORT]`.

2. Dependency on KBS: The kata-agent relies on the KBS to provision
   cryptographic keys to the split API proxy server, facilitating the
   establishment of a secure channel.

- Generate TLS keys and certificates for kata-agent’s API proxy server
  and client (owner-side)

```
$  KATA_DIR=”<PATH to cloned repo>”
$ pushd ${KATA_DIR}/src/agent/grpc_tls_keys
$ ./gen_key_cert.sh
```

 - Create a zip file named 'tls-keys.zip' containing the CA public key
   and the server’s public and private key pair

` $ zip tls-keys.zip server.pem server.key ca.pem`

  - Place this zip file in the KBS resource path
  '/default/tenant-keys/'. During sandbox creation, the kata-agent
  retrieves this file using the KBS 'get resource' API.  It's
  important to note that the KBS conducts a background check on the
  key request, verifying evidence provided by the Trusted Execution
  Environment (TEE). Future extensions to the KBS will automate the
  creation of the server’s public and private key pair for each
  sandbox.

` $ popd`

** External tools required for testing **

To  exercise the API proxy server, we provide the Kata Containers
agent TLS control tool (kata-agent-tls-ctl), derived from the
`kata-agent-ctl` tool in another commit (see `split-api-feature` branch
referenced above).

This tool communicates over a gRPC TLS channel with the kata-agent.
Similar to the kata-agent-ctl, this is a low level tool that is
intended for advanced users. Future commit(s) will introduce a more
user-friendly tool that maintains state, designed to function as a
kubectl plugin for managing owners’ workloads.

Examples of creating and starting a container using
`kata-agent-tls-ctl`:

Setup environment

```
$ export guest_addr=10.89.0.28 # IP address associated with the confidential VM
$ export guest_port=50090         # API proxy server’s port (listens on)
$ export ctl=./target/x86_64-unknown-linux-musl/release/kata-agent-tls-ctl
$ export key_dir=${KATA_DIR}/src/agent/grpc_tls_key
```

Display the status of containers in the sandbox environment

```
$ ${ctl} -l trace connect --key-dir "${key_dir}"  --bundle-dir "${bundle_dir}"  \
--server-address  "ipaddr://${guest_addr}:${guest_port}" \
 -c  "ListContainers"
```

Set a container ID and specify an OCI spec:

```
$ container_id=9e3d1d4750e4e20945d22c358e13c85c6b88922513bce2832c0cf403f065dc6
$ OCI_SPEC_CONFIG=${KATA_DIR}/src/tools/agent-tls-ctl/config.json
```
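The `config.json` referenced above is a standard OCI runtime spec. A heavily trimmed sketch of the fields such a spec carries follows; this is illustrative only — the real file in the tree sets many more fields, and the image-name annotation shown is an assumption about how a guest-pull setup would resolve the container image:

```json
{
    "ociVersion": "1.1.0",
    "process": {
        "args": ["/bin/sh"],
        "cwd": "/",
        "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"]
    },
    "root": { "path": "rootfs", "readonly": false },
    "annotations": {
        "io.kubernetes.cri.image-name": "docker.io/library/busybox:latest"
    }
}
```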

_Note: the next two commands require `pull_image` support in the guest!_
**Create container request**

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
```

**Start container request**

```
$ ${ctl} -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "StartContainer json://{\"container_id\": \"${container_id}\"}"
```

Fixes: kata-containers#1834
@ray-valdez ray-valdez linked a pull request Feb 26, 2024 that will close this issue
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Feb 26, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Feb 27, 2024
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Mar 5, 2024
This commit provides a tool for testing the kata-agent's proxy API
server, addressing the following open issues: [Securing the Kata
Control Plane](confidential-containers/confidential-containers#53) and
RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

Depends on: kata-containers#9159

Signed-off-by: Ray Valdez <rvaldez@us.ibm.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Mar 15, 2024
c3d pushed a commit to c3d/kata-containers that referenced this issue Mar 25, 2024
This commit is an initial step towards mitigating the risk of
untrustworthy host systems when running Kata Containers in the context
of confidential computing. It serves to safeguard against malicious,
privileged users gaining access to the vulnerable Kata control
plane. This protective measure becomes crucial in scenarios where a
malicious cloud service provider or administrator might intercept or
compromise commands from the Kata control plane, tamper with container
configuration files, execute processes within the container, retrieve
workload statistics, or obtain sensitive container workload
information.

** Problem statement **

This commit addresses the following open issues:
[Securing the Kata Control Plane](confidential-containers/confidential-containers#53)
and RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

** Architectural changes **

The commit introduces a new split API mode in the kata-agent, which
partitions the kata-agent’s API endpoints between **host-side**
and **owner-side** controllers. When this mode is enabled, the
host-side controller is restricted to manage resource allocation
during startup and resource recycling at termination. In contrasts,
the owner-side controller allows workload owners to directly manage
theIR deployment pod and containers. This partitioning implicitly
labels kata-agent’s endpoint APIs as _host-exclusive_,
_owner-exclusive_, or _shared_.

Host-exclusive and owner-exclusive APIs are assigned specifically to
either the host-side or owner-side. For instance, `CreateSandbox` and
`DestroySandbox` are examples of host-exclusive APIs, while `CopyFile`
and `ExecProcess` are examples of owner-exclusive APIs. Shared APIs
include those that must be shared to some extent between the control
planes, such as `GetOOMEvent` and `GetGuestDetails`.

** Content of this commit **

This commit focuses on providing a secure channel for the owner-side
to access owner-exlusive and shared APIs. Future commit(s) will
restrict the host-side access to owner-exclusive APIs when split mode
is enabled on the kata-agent and will address the sharing of APIs
between host-side and owner-side.

This commit implements the following changes:
- Introduces the split mode to the kata-agent.
- Integrates a gRPC TLS server to handle API requests from the
  owner-side.  We refer to this as the kata-agent’s API proxy server
  which ensures that workload owners can establish a secure end-to-end
  communication channel with the kata-agent or invoking API endpoint
  commands.
- Utilizes the Key Broker Service (KBS) to provision secrets, i.e.,
  cryptographic public and private key pairs. These secrets are crucial
  for establishing a secure communication channel between the owner-side
  and the API proxy server.

** Testing **

To enable split mode functionality, the following steps are required:

1. Configuration: Modify the `kernel_params` option in the Kata's
  configuration.toml file to enable split mode and specify the IP
  address of the KBS.
- Add following settings to the `kernel_params` option:
  `agent.split_api=true` and
  `agent.aa_kbc_params=cc_kbc::http://[IP_ADDRESS]:[PORT]`.

2. Dependency on KBS: The kata-agent relies on the KBS to provision
   cryptographic keys to the split API proxy server, facilitating the
   establishment of a secure channel.

- Generate TLS keys and certificates for kata-agent’s API proxy server
  and client (owner-side)

```
$  KATA_DIR=”<PATH to cloned repo>”
$ pushd ${KATA_DIR}/src/agent/grpc_tls_keys
$ ./gen_key_cert.sh
```

 - Create a zip file named 'tls-keys.zip' containing the CA public key
   and the server’s public and private key pair

` $ zip tls-keys.zip server.pem server.key ca.pem`

- Place this zip file in the KBS resource path
  '/default/tenant-keys/'. During sandbox creation, the kata-agent
  retrieves this file using the KBS 'get resource' API. It's
  important to note that the KBS conducts a background check on the
  key request, verifying evidence provided by the Trusted Execution
  Environment (TEE). Future extensions to the KBS will automate the
  creation of the server's public and private key pair for each
  sandbox.

```
$ popd
```
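
Since 'tls-keys.zip' carries the server's private key, it can help to
record a digest of the bundle so the copy later retrieved from the
KBS can be compared against it. This is an optional sketch using
standard tooling, not part of the documented flow:

```shell
# Optional: record a digest of the key bundle for later comparison.
sha256sum tls-keys.zip | tee tls-keys.zip.sha256
```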

** External tools required for testing **

To exercise the API proxy server, we provide the Kata Containers
agent TLS control tool (`kata-agent-tls-ctl`), derived from the
`kata-agent-ctl` tool in another commit (see the `split-api-feature`
branch referenced above).

This tool communicates over a gRPC TLS channel with the kata-agent.
Like `kata-agent-ctl`, it is a low-level tool intended for advanced
users. Future commit(s) will introduce a more user-friendly tool that
maintains state, designed to function as a kubectl plugin for
managing owners' workloads.

Examples of creating and starting a container using
`kata-agent-tls-ctl`:

Set up the environment:

```
$ export guest_addr=10.89.0.28  # IP address associated with the confidential VM
$ export guest_port=50090       # port the API proxy server listens on
$ export ctl=./target/x86_64-unknown-linux-musl/release/kata-agent-tls-ctl
$ export key_dir=${KATA_DIR}/src/agent/grpc_tls_keys
```
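
Before invoking the tool, a quick sanity check on these variables can
save a confusing failure later. This helper is a sketch; the function
name `check_env` is ours, not part of the tool:

```shell
# Sketch: fail fast if any of the variables used below is unset.
check_env() {
  missing=0
  for v in guest_addr guest_port ctl key_dir; do
    eval "val=\${$v:-}"
    if [ -z "$val" ]; then
      echo "error: $v is not set" >&2
      missing=1
    fi
  done
  return $missing
}
```

Run `check_env` after the exports above; it returns non-zero and
names each missing variable on stderr.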

Display the status of containers in the sandbox environment:

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "ListContainers"
```

Set a container ID and specify an OCI spec:

```
$ container_id=9e3d1d4750e4e20945d22c358e13c85c6b88922513bce2832c0cf403f065dc6
$ OCI_SPEC_CONFIG=${KATA_DIR}/src/tools/agent-tls-ctl/config.json
```

_Note: the next two commands require pull_image support in the guest!_

**Create container request**

```
$ ${ctl} -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "CreateContainer cid=${container_id} spec=file:///${OCI_SPEC_CONFIG}"
```

**Start container request**

```
$ ${ctl} -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "StartContainer json://{\"container_id\": \"${container_id}\"}"
```
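
The create and start requests above can be combined into a small
helper that simply replays the two documented invocations, reusing
the variables set earlier. This is a sketch; the function name
`run_container` is ours:

```shell
# Sketch: create a container, then start it, reusing the documented commands.
# Assumes ctl, key_dir, bundle_dir, guest_addr and guest_port are set as above.
run_container() {
  cid="$1"; spec="$2"
  "${ctl}" -l trace connect --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "CreateContainer cid=${cid} spec=file:///${spec}" || return 1
  "${ctl}" -l trace connect --no-auto-values --key-dir "${key_dir}" --bundle-dir "${bundle_dir}" \
    --server-address "ipaddr://${guest_addr}:${guest_port}" \
    -c "StartContainer json://{\"container_id\": \"${cid}\"}"
}
```

For example: `run_container "${container_id}" "${OCI_SPEC_CONFIG}"`.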

Fixes: kata-containers#1834

Signed-off-by: Ray Valdez <rvaldez@us.ibm.com>
Signed-off-by: Salman Ahmed <sahmed@ibm.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Zhongshu Gu <zgu@us.ibm.com>
Signed-off-by: Pau-Chen Cheng <pau@us.ibm.com>
ray-valdez added a commit to ray-valdez/kata-containers that referenced this issue Mar 26, 2024
This commit provides a tool for testing the kata-agent's API proxy
server, addressing the following open issues: [Securing the Kata
Control Plane](confidential-containers/confidential-containers#53) and
RFC: Separate trust realms for tenant and host kata-containers#1834. A detailed
history can be found in:
https://github.com/ray-valdez/kata-containers/tree/split-api-feature.

Signed-off-by: Ray Valdez <rvaldez@us.ibm.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Mar 27, 2024
c3d pushed a commit to ray-valdez/kata-containers that referenced this issue Mar 27, 2024