Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Introduce "sandboxer" plugin #7739

Open
abel-von opened this issue Nov 30, 2022 · 6 comments
Open

[Proposal] Introduce "sandboxer" plugin #7739

abel-von opened this issue Nov 30, 2022 · 6 comments
Labels
area/cri Container Runtime Interface (CRI) kind/feature
Milestone

Comments

@abel-von
Copy link
Contributor

abel-von commented Nov 30, 2022

What is the problem you're trying to solve

After the work of #4131, the "sandbox" becomes the first class citizen in containerd, and no more "pause container" disguising a sandbox, bravo!

Now the Sandbox Controller has two implementations:

  1. pause container, for compatibility of former version.
  2. shim sandbox controller, integrating the sandbox controller to shim process.

Except for the pause container sandbox controller, the only option to implement the sandbox api is to integrate it to shim. which makes a "fat shim".

The new Sandbox API may be the ending of containerd shim. we can require that each sandbox should expose the Task API that containerd can call to manage container lifecycle. Then for the vm based sandbox, the agent in VM can expose Task API by virtio-vsock, so that containerd can call directly to the agent in vm to start a container, we don't need an extra API conversion from host to agent as kata is currently doing. This can reduce the memory overhead and API invoking path.

Describe the solution you'd like

We can use the powerfull plugin machanism to achieve it, Here we introduce a "Sandboxer" plugin to containerd. like "Snapshotter" to snapshot, a "Sandboxer" is for lifecycle and resource management of one kind of sandbox, we can provide different kinds of sandboxers, such as hypervisor based sandboxer, cgroup/ns based sandboxer, and wasm based sandboxer that may appear in the future.

image

The shim process is not required for sandboxes other than runc(we need a process to do tty io copy, if we don't need container that use tty, like all kubernetes launched containers do, we can start runc task server in containerd and remove the shim process, see #7502).

The Sandboxer can be either a TTRPC plugin running in containerd, or a remote plugin running outside the containerd, which is only one process on each node. containerd do not need to manage the shim process lifecycle anymore, because there is no shim process anymore.

Additional context

Like what Snapshotter did, containerd will collect all "Sandboxer"s when init

   sandboxesRaw, err := ic.GetByType(plugin.SandboxPlugin)
    if err != nil {
        return nil, err
    }
    sandboxes := make(map[string]sandbox.Controller)
    for name, srv := range sandboxesRaw {
        inst, err := srv.Instance()
        if err != nil {
            return nil, err
        }
        sandboxes[name] = inst.(sandbox.Controller)
}

We may add three APIs to Sandbox Controller for update sandbox resources when start/stop/update a container:

   // AppendContainer append the container resources to the sandbox
   AppendContainer(ctx context.Context, sandboxId string, container *Container) (*Container, error)
   // UpdateContainer update the container resources in the sandbox, like updating cpu/mem resources,
   // or add/remove resources(especially io pipes) of an exec process
   UpdateContainer(ctx context.Context, sandboxId string, container *Container) (*Container, error)
   // RemoveContainer remove the container resources in the sandbox
   RemoveContainer(ctx context.Context, sandboxId string, id string) error

Here are sequence diagrams when Sandboxer plugin introduced:

  1. RunPodSandbox
    runpodsandbox
  2. When creating a container, the Sandboxer and SandboxID of Container should be set by Calling WithSandbox.
    createcontainer
  3. When starting container, before calling Tasks API to create task, check if SandboxKey of container is empty, if not, call AppendContainer of sandbox api, to attach the container resources to the sandbox, and set the returned Bundle and TaskAddress to CreateTaskRequest, TasksServer will connect to the TaskAddress directly and call Create method of Task API, with the Bundle in the CreateTaskRequest, rather than StartShim to call a shim binary.
    startcontainer
  4. The biggest challenge of Exec is to update sandbox to attach the named pipes of the exec io to the sandbox, we have to call UpdateContainer of sandbox api to do this, we give all informations of the execing process to the sandbox plugin to let it update the sandbox where container is running in.
    exec

we have a demo PR #7502

@fuweid fuweid added the area/cri Container Runtime Interface (CRI) label Dec 6, 2022
@Burning1020
Copy link
Member

Based on sandbox api, I think this proposal would use a sandboxer to manage all pods, which can reduce much memory overhead of shim.
cc @mxpv @mikebrow @samuelkarp @dmcgowan

@kzys
Copy link
Member

kzys commented Dec 23, 2022

(For others - there were some discussions on the PR too: #7502)

Regarding shims, it would be better to start from the problems of containerd's shims rather than citing other shims such as dockershim in Kubernetes. They are different.

@abel-von
Copy link
Contributor Author

abel-von commented Jan 17, 2023

We made a presensation about this proposal in the last meeting.
Here is the recoding: https://zoom.us/rec/play/OTahK-eI9F0P9XVhjV3ez0TyySgBEHQ2bCMmZlSKQe3moOlupHuS7_PcGRVBIjhRB5oRzF6b5u5-Q_zg.7OpNQ7w5ffYPf1bQ

and here is the slides: https://docs.google.com/presentation/d/1-6LXpWbQa7sw63KF1guDbHStcH4I9TgT/edit?usp=drivesdk&ouid=109476472776399967160&rtpof=true&sd=true

For anyone who is interested.

@abel-von
Copy link
Contributor Author

(For others - there were some discussions on the PR too: #7502)

Regarding shims, it would be better to start from the problems of containerd's shims rather than citing other shims such as dockershim in Kubernetes. They are different.

removed the citing of other shims

@Burning1020
Copy link
Member

@dmcgowan Could you help add a 2.0 milestone?

@arukiidou
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cri Container Runtime Interface (CRI) kind/feature
Projects
Status: Todo
Development

No branches or pull requests

6 participants