Kata Containers streaming IO #6714

Closed
bergwolf opened this issue Apr 25, 2023 · 2 comments · Fixed by #7483
Labels
feature New functionality needs-review Needs to be assessed by the team. soc Summer of Code related

Comments

@bergwolf
Member

Right now, we forward the container IO streams (stdin/stdout/stderr) wrapped in ttrpc requests, which adds unnecessary overhead and is error-prone. We have been fighting missing stdin/stdout bytes from time to time since the beginning of the project.

To solve it once and for all, let's add a streaming IO interface between the runtime and the agent, using vsock connections as the underlying data transfer channel. This can remove the runtime shim and the agent from the container IO data path.

For example, for container stdio, we can go from

containerd pipefd <-> kata runtime shim <-> vmm <-> kata agent <-> container stdio pipefd

to

containerd pipefd <-> vmm <-> container stdio pipefd

Key changes:

  • a new fd-based vsock backend implementation, which forwards arbitrary data between the guest vsock device and the host fd
  • a new streaming IO agent API that helps set up the fd connection between guest and host
  • runtime-rs/kata-agent changes to help set up the fd mapping between guest and host
  • support container stdio with the streaming IO API
  • (possibly) portforward with the streaming IO API
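
To make the proposed data path concrete, here is a minimal sketch of what an fd-based backend does conceptually: it copies bytes between a host fd and a stream connection without wrapping them in RPC messages. Everything in it is an assumption for illustration: a pipe stands in for the containerd FIFO, an `AF_UNIX` socketpair stands in for the vsock connection, and `pump` is a hypothetical name, not part of any Kata or Dragonball API.

```python
import os
import socket
import threading


def pump(src_fd: int, dst: socket.socket, bufsize: int = 4096) -> None:
    """Copy bytes from src_fd to dst until EOF, then half-close dst."""
    while True:
        chunk = os.read(src_fd, bufsize)
        if not chunk:  # EOF: every writer of the host fd has closed
            break
        dst.sendall(chunk)
    dst.shutdown(socket.SHUT_WR)


def demo() -> bytes:
    # Hypothetical stand-ins: the pipe plays the containerd stdin FIFO,
    # the socketpair plays the host/guest vsock connection.
    host_r, host_w = os.pipe()
    vmm_side, guest_side = socket.socketpair()

    worker = threading.Thread(target=pump, args=(host_r, vmm_side))
    worker.start()

    os.write(host_w, b"hello from containerd\n")
    os.close(host_w)  # the writer closes, so pump() sees EOF

    received = b""
    while True:
        chunk = guest_side.recv(4096)
        if not chunk:
            break
        received += chunk

    worker.join()
    os.close(host_r)
    vmm_side.close()
    guest_side.close()
    return received


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the forwarder never inspects or frames the data, which is what takes the shim and the agent out of the per-byte path.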
@bergwolf bergwolf added feature New functionality needs-review Needs to be assessed by the team. soc Summer of Code related labels Apr 25, 2023
@ClSlaid

ClSlaid commented May 31, 2023

Anyone taking on this?

@bergwolf
Member Author

bergwolf commented Jun 6, 2023

@ClSlaid Yes. It has been assigned to a student in CCF summer of code.

frezcirno added a commit to frezcirno/kata-containers that referenced this issue Aug 14, 2023
Fixes: kata-containers#6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Aug 14, 2023
Two TOML options, `use_passfd_io` and `passfd_listener_port`, are introduced
to enable and configure Dragonball's vsock fd passthrough IO feature.

Fixes: kata-containers#6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Aug 14, 2023
Currently in Kata Containers, every IO read/write operation requires
an RPC request from the runtime to the agent. This process involves
unnecessary data copying into/from an RPC request/response, which
introduces high overhead.

To solve this issue, this commit utilizes vsock fd passthrough, a
newly introduced feature in the Dragonball hypervisor. This feature
allows other host programs to pass a file descriptor to the Dragonball
process directly as the backend of an ordinary hybrid vsock connection.

The runtime-rs now utilizes this feature for container process IO. It
passes the stdin/stdout/stderr FIFOs from containerd to Dragonball,
eliminating the need for an RPC for each IO read/write operation.

The agent uses the vsock stream as the child process's
stdin/stdout/stderr in passfd mode, eliminating the need for a pipe
to pump data (in non-tty mode).

Fixes: kata-containers#6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
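
As an illustration of the agent-side idea in the commit message above, the following sketch hands a stream socket to a child process as its stdout, so no intermediate pipe or copy loop sits between the process and the connection. It is a sketch under stated assumptions: an `AF_UNIX` socketpair stands in for the vsock stream, and `run_with_socket_stdout` is a hypothetical helper, not agent code.

```python
import socket
import subprocess
import sys


def run_with_socket_stdout(argv):
    """Run argv with one end of a socketpair as its stdout; return its output."""
    # Assumption: the socketpair stands in for the vsock stream the agent holds.
    agent_side, child_side = socket.socketpair()
    proc = subprocess.Popen(argv, stdout=child_side.fileno())
    # Drop the parent's copy so EOF arrives once the child exits.
    child_side.close()

    output = b""
    while True:
        chunk = agent_side.recv(4096)
        if not chunk:
            break
        output += chunk

    proc.wait()
    agent_side.close()
    return output


if __name__ == "__main__":
    print(run_with_socket_stdout([sys.executable, "-c", "print('hi')"]))
```

The child writes straight into the socket, and the reader on the other end sees the bytes without any per-write request/response exchange.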
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Oct 3, 2023
Two TOML options, `use_passfd_io` and `passfd_listener_port`, are introduced
to enable and configure Dragonball's vsock fd passthrough IO feature.

This commit prepares for the vsock fd passthrough IO feature.

Fixes: kata-containers#6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Oct 3, 2023
Currently in Kata Containers, every IO read/write operation requires
an RPC request from the runtime to the agent. This process involves
copying data into/from an RPC request/response, which incurs high
overhead.

To solve this issue, this commit utilizes vsock fd passthrough, a
newly introduced feature in the Dragonball hypervisor. This feature
allows other host programs to pass a file descriptor to the Dragonball
process, directly as the backend of an ordinary hybrid vsock connection.

The runtime-rs now utilizes this feature for container process IO. It
opens the stdin/stdout/stderr FIFOs from containerd and passes them to
Dragonball, after which it no longer handles process IO, eliminating
the need for an RPC for each IO read/write operation.

In passfd IO mode, the agent uses the vsock connections as the child
process's stdin/stdout/stderr, eliminating the need for a pipe
to pump data (in non-tty mode).

Fixes: kata-containers#6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Oct 4, 2023
When one end of the connection closes, the epoll event is triggered
forever. We should close the connection and kill it.

Fixes: kata-containers#6714

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
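
The behaviour described in the commit message above can be reproduced with a level-triggered epoll set: once the peer is gone, every poll reports the same readiness until the fd is removed. This is a minimal sketch using an `AF_UNIX` socketpair in place of the real connection; `closed_peer_events` is a hypothetical name.

```python
import select
import socket


def closed_peer_events():
    """Return (first poll, second poll, poll after cleanup) for a closed peer."""
    a, b = socket.socketpair()
    ep = select.epoll()
    ep.register(a.fileno(), select.EPOLLIN)

    b.close()  # one end of the connection goes away

    # Level-triggered: the same readiness is reported on every poll.
    first = ep.poll(0.1)
    second = ep.poll(0.1)

    # The fix described above: remove the connection and close it.
    ep.unregister(a.fileno())
    a.close()
    after = ep.poll(0.1)

    ep.close()
    return first, second, after


if __name__ == "__main__":
    print(closed_peer_events())
```

Without the unregister-and-close step, an event loop built on this set would wake up continuously for a connection that can make no further progress.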
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Oct 26, 2023
On Linux, when a FIFO is opened and there are no writers, the reader
will continuously receive the HUP event. This can be problematic
when creating containers in detached mode, as the stdin FIFO writer
is closed after the container is created, resulting in this situation.

In passfd IO mode, open the stdin FIFO with O_RDWR|O_NONBLOCK to avoid
the HUP event.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
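
A small experiment shows the difference between the two open modes mentioned in the commit message above. It is a sketch run against a temporary FIFO rather than a containerd one, and the helper names are hypothetical. With `O_RDWR`, the descriptor itself counts as a writer, so the FIFO never reaches the "no writers" state.

```python
import os
import select
import tempfile


def poll_events(fd, timeout_ms=100):
    """Poll fd for readability and return the raw (fd, event) list."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return poller.poll(timeout_ms)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stdin.fifo")
        os.mkfifo(path)

        # Read-only reader: after the only writer closes, HUP is reported.
        rd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        wr = os.open(path, os.O_WRONLY)
        os.close(wr)
        ro_events = poll_events(rd)
        os.close(rd)

        # O_RDWR reader: this fd is also a writer, so no HUP is reported.
        rw = os.open(path, os.O_RDWR | os.O_NONBLOCK)
        wr = os.open(path, os.O_WRONLY)
        os.close(wr)
        rw_events = poll_events(rw)
        os.close(rw)

    ro_hup = any(event & select.POLLHUP for _, event in ro_events)
    return ro_hup, rw_events


if __name__ == "__main__":
    print(demo())
```

Opening a FIFO with `O_RDWR` is Linux behaviour; POSIX leaves it undefined, so the trick is not portable.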
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Oct 29, 2023
Partially fix some issues related to container io detach and attach.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Jan 23, 2024
Linux forbids opening an existing socket through /proc/<pid>/fd/<fd>,
which makes some images that rely on the special files /dev/stdout
(/dev/stderr) or /proc/self/fd/1 (2) fail to boot in passfd IO mode,
where the stdout/stderr of a container process is a vsock socket.

For backward compatibility, a pipe is introduced between the process
and the socket, and its write end is set as stdout/stderr of the
container process instead of the socket. The agent will do the
forwarding between the pipe and the socket.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
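
The restriction in the commit message above is easy to observe from user space. The sketch below tries to reopen a socket and a pipe through `/proc/self/fd/<fd>`; on Linux the socket attempt fails with `ENXIO` while the pipe attempt succeeds, which is why putting a pipe in front of the socket restores compatibility. `reopen_via_proc` is a hypothetical helper, and an `AF_UNIX` socketpair stands in for the vsock socket.

```python
import errno
import os
import socket


def reopen_via_proc(fd: int):
    """Open /proc/self/fd/<fd> for writing; return the errno, or None on success."""
    try:
        new_fd = os.open(f"/proc/self/fd/{fd}", os.O_WRONLY)
    except OSError as exc:
        return exc.errno
    os.close(new_fd)
    return None


def demo():
    # Assumption: the socketpair stands in for the vsock stdout socket.
    a, b = socket.socketpair()
    sock_result = reopen_via_proc(a.fileno())
    a.close()
    b.close()

    # A pipe can be reopened through /proc, unlike a socket.
    r, w = os.pipe()
    pipe_result = reopen_via_proc(w)
    os.close(r)
    os.close(w)
    return sock_result, pipe_result


if __name__ == "__main__":
    sock_result, pipe_result = demo()
    print(errno.errorcode.get(sock_result), pipe_result)
```

This is the same failure a container image hits when it does something like redirecting its logs to `/dev/stdout`, since that path resolves through `/proc/self/fd/1`.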
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Jan 31, 2024
There is a race condition in the agent's HVSOCK_STREAMS hashmap, where a
stream may be taken before it is inserted into the hashmap. This patch
adds simple retry logic to the stream consumer to alleviate this issue.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
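
The retry approach in the commit message above can be sketched as follows. This is not the agent's code: `StreamTable`, `take_with_retry`, the port key `1027`, and the attempt and delay values are all hypothetical, chosen only to show a consumer that tolerates the producer inserting slightly late.

```python
import threading
import time


class StreamTable:
    """A lock-protected map from a key (here, a port) to a pending stream."""

    def __init__(self):
        self._lock = threading.Lock()
        self._streams = {}

    def insert(self, port, stream):
        with self._lock:
            self._streams[port] = stream

    def take(self, port):
        # Remove and return the stream, or None if it is not there yet.
        with self._lock:
            return self._streams.pop(port, None)


def take_with_retry(table, port, attempts=10, delay=0.05):
    """Retry take() a bounded number of times before giving up."""
    for _ in range(attempts):
        stream = table.take(port)
        if stream is not None:
            return stream
        time.sleep(delay)
    return None


def demo():
    table = StreamTable()
    # The producer inserts 100 ms after the consumer starts looking.
    producer = threading.Timer(0.1, table.insert, args=(1027, "stream-1027"))
    producer.start()
    got = take_with_retry(table, 1027)
    producer.join()
    return got


if __name__ == "__main__":
    print(demo())
```

A bounded retry only narrows the window, which matches the commit's wording that it alleviates the race rather than removing it; a condition variable or a per-key future would close it entirely.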
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Jan 31, 2024
In passfd io mode, when not using a terminal, the stdout/stderr vsock
streams are directly used as the stdout/stderr of the child process.
These streams are non-blocking by default.

The stdout/stderr of the process should be blocking; otherwise,
the process may encounter an EAGAIN error when writing to stdout/stderr.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
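
The failure mode in the commit message above shows up as soon as a non-blocking stream's send buffer fills. The sketch below provokes it on an `AF_UNIX` socketpair (standing in for the vsock stream) and then switches the descriptor back to blocking mode, which is the state an ordinary process expects for stdout/stderr. The function names are hypothetical.

```python
import os
import socket


def bytes_sent_before_eagain(sock: socket.socket) -> int:
    """Write to a non-blocking socket until the kernel reports EAGAIN."""
    sock.setblocking(False)
    sent = 0
    try:
        while True:
            sent += sock.send(b"x" * 65536)
    except BlockingIOError:  # raised for EAGAIN / EWOULDBLOCK
        return sent


def demo():
    writer, reader = socket.socketpair()
    sent = bytes_sent_before_eagain(writer)

    # Restore blocking mode on the fd before handing it to a child process.
    os.set_blocking(writer.fileno(), True)
    blocking = os.get_blocking(writer.fileno())

    writer.close()
    reader.close()
    return sent, blocking


if __name__ == "__main__":
    print(demo())
```

Most programs do not retry on `EAGAIN` when writing to stdout, so a non-blocking descriptor tends to surface as truncated output or a write error under load.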
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Jan 31, 2024
This patch uses a biased select to avoid stdin data loss in case of
CloseStdinRequest.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
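
The idea behind a biased select is that when several branches are ready at once, they are checked in a fixed order instead of a random one. The sketch below emulates that ordering with asyncio: pending stdin data is always drained before the close signal is honoured. It is an emulation under assumptions, not the agent's implementation; `forward_stdin`, the queue, and the event are hypothetical stand-ins for the stdin stream and the CloseStdinRequest.

```python
import asyncio


async def forward_stdin(data: asyncio.Queue, close: asyncio.Event, sink: list) -> None:
    """Forward queued stdin chunks to sink, preferring data over the close signal."""
    while True:
        # Biased order: handle all pending data first.
        while not data.empty():
            sink.append(data.get_nowait())
        # Only then look at the close request.
        if close.is_set():
            return

        # Nothing is ready: wait for whichever branch fires first.
        get_task = asyncio.ensure_future(data.get())
        close_task = asyncio.ensure_future(close.wait())
        done, pending = await asyncio.wait(
            {get_task, close_task}, return_when=asyncio.FIRST_COMPLETED
        )
        for task in pending:
            task.cancel()
        if get_task in done:
            sink.append(get_task.result())


async def demo() -> list:
    data, close, sink = asyncio.Queue(), asyncio.Event(), []
    # The last chunk and the close request become ready at the same time.
    data.put_nowait(b"last bytes")
    close.set()
    await forward_stdin(data, close, sink)
    return sink


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With an unbiased choice, the close branch could win the tie and the final chunk would be dropped, which is the data loss the patch guards against.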
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Jan 31, 2024
This patch adds the O_NONBLOCK flag when opening the stdout and stderr
FIFOs to avoid blocking.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
frezcirno added a commit to frezcirno/kata-containers that referenced this issue Jan 31, 2024
Fix rustfmt and clippy warnings detected by CI.

Fixes: kata-containers#6714
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>