regression: can't kill and delete the container with shared(host) pid ns when the init process has dead #4047

Closed
lifubang opened this issue Oct 2, 2023 · 1 comment · Fixed by #4102
Labels
regression release-block This one should be resolved before draft an new release!
Milestone

Comments

@lifubang
Member

lifubang commented Oct 2, 2023

Description

After #3825 was merged, if we create a container without its own PID namespace and then exec some processes into it, we can no longer kill or delete the container once the init process has died.
I think this was introduced by commit f8ad20f.

Steps to reproduce the issue

  1. Create a container named test without a PID namespace, with sleep 20 as the entry point.
  2. runc exec -d test sleep infinity
  3. Wait 20 seconds until the init process has exited.
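
The steps above can be sketched as a shell function. This is only an illustration under several assumptions that are not part of the report: `repro` is a hypothetical helper name, its argument is an OCI bundle directory that does not yet contain a config.json and whose rootfs/ already provides `sleep`, `jq` is installed, and the commands run with enough privileges to start containers. Editing config.json with `jq` is one possible way to drop the PID namespace; the report does not say how the container was created.

```shell
#!/bin/sh
# Hypothetical helper: reproduce the report in the bundle directory given as $1.
# The body runs in a subshell so that `cd` does not affect the caller.
repro() (
    cd "$1" || exit 1

    # Generate a default config.json, then:
    #   - drop the "pid" entry from linux.namespaces (container shares the host PID ns)
    #   - make the entry point `sleep 20`
    #   - disable the terminal so detached mode needs no console socket
    runc spec
    jq '.linux.namespaces |= map(select(.type != "pid"))
        | .process.args = ["sleep", "20"]
        | .process.terminal = false' config.json > config.json.tmp &&
        mv config.json.tmp config.json || exit 1

    runc run -d test                  # step 1: start the container detached
    runc exec -d test sleep infinity  # step 2: add a second, long-lived process
    sleep 21                          # step 3: wait for the init process to exit

    # The two commands whose behaviour the report describes.
    runc kill test KILL
    runc delete -f test
)
```

Calling `repro /path/to/bundle` would then run the whole sequence; the last two commands are the ones that fail according to the report.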

Describe the results you received and expected

  1. runc kill test KILL
    received:
    ERRO[0000] container not running

expected:
It should have the same effect as runc kill -a test KILL with runc 1.1.*

  2. runc delete -f test
    received:
ERRO[0000] Failed to remove paths: map[:/sys/fs/cgroup/unified/test blkio:/sys/fs/cgroup/blkio/user.slice/test cpu:/sys/fs/cgroup/cpu,cpuacct/user.slice/test cpuacct:/sys/fs/cgroup/cpu,cpuacct/user.slice/test cpuset:/sys/fs/cgroup/cpuset/test devices:/sys/fs/cgroup/devices/user.slice/test freezer:/sys/fs/cgroup/freezer/test hugetlb:/sys/fs/cgroup/hugetlb/test memory:/sys/fs/cgroup/memory/user.slice/user-1000.slice/session-8.scope/test misc:/sys/fs/cgroup/misc/test name=systemd:/sys/fs/cgroup/systemd/user.slice/user-1000.slice/session-8.scope/test net_cls:/sys/fs/cgroup/net_cls,net_prio/test net_prio:/sys/fs/cgroup/net_cls,net_prio/test perf_event:/sys/fs/cgroup/perf_event/test pids:/sys/fs/cgroup/pids/user.slice/user-1000.slice/session-8.scope/test rdma:/sys/fs/cgroup/rdma/test]

expected:
The container can be removed successfully.

What version of runc are you using?

The main branch

Host OS information

NAME="Ubuntu"
VERSION="20.04.5 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.5 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Host kernel information

Linux acmcoder 5.15.0-84-generic #93~20.04.1-Ubuntu SMP Wed Sep 6 16:15:40 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

@lifubang lifubang changed the title regression: can't kill and delete the container without pid ns when the init process has dead regression: can't kill and delete the container with shared(hosted) pid ns when the init process has dead Oct 2, 2023
@lifubang lifubang changed the title regression: can't kill and delete the container with shared(hosted) pid ns when the init process has dead regression: can't kill and delete the container with shared(host) pid ns when the init process has dead Oct 2, 2023
@lifubang lifubang added this to the 1.2.0 milestone Oct 23, 2023
@lifubang lifubang added the release-block This one should be resolved before draft an new release! label Oct 23, 2023
@kolyshkin
Contributor

kolyshkin commented Nov 7, 2023

So, there is a third scenario that the description does not cover: runc delete (without -f).

runc 1.1 did kill all the cgroup processes for a shared pid ns container with no init, and #3825 broke it.

There are two ways to handle runc delete being called on such a container:

  1. Kill the remaining processes (the logic behind this is "container init is dead, thus the container is dead, so runc delete should remove all its bits and pieces, including the leftover processes"). This is also backward-compatible with older runc.
  2. Warn that the container cgroup is not empty (and suggest using runc kill or runc delete -f). The logic behind this is that the situation is not normal and the container state is wrong (a stopped container should not have any leftover processes), so we want the user to know about it.

I'm not sure which one is better. For backward compatibility, (1) is a good choice. Logically, I like (2) more.

If we are to implement (2), I think we should also implement runc exec --ignore-stopped for such containers (so that a user can do something about it rather than killing those processes).
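
To make the difference between the two options concrete, here is a minimal shell sketch that emulates both policies outside of runc. It is not runc's implementation: `cleanup_leftovers` is a hypothetical helper, the warning text is made up for illustration, and the only thing assumed about the container is that its cgroup directory exposes the usual `cgroup.procs` file listing one PID per line.

```shell
#!/bin/sh
# Hypothetical helper, not a runc command.
#   $1 = policy: "kill" (option 1) or "warn" (option 2)
#   $2 = cgroup directory that contains cgroup.procs
cleanup_leftovers() {
    policy=$1
    # Read the PID list; cgroupfs files report size 0, so test the content.
    pids=$(cat "$2/cgroup.procs" 2>/dev/null)

    # No leftover processes: nothing to do under either policy.
    if [ -z "$pids" ]; then
        return 0
    fi

    case $policy in
        kill)
            # Option 1: send SIGKILL to every PID still in the cgroup.
            for pid in $pids; do
                kill -KILL "$pid" 2>/dev/null || true
            done
            ;;
        warn)
            # Option 2: leave the processes alone and report the problem.
            echo "container cgroup is not empty; use 'runc kill' or 'runc delete -f'" >&2
            return 1
            ;;
        *)
            echo "unknown policy: $policy" >&2
            return 2
            ;;
    esac
}
```

With the `warn` policy the caller gets a non-zero status and the leftover processes keep running, which is what would make something like the proposed runc exec --ignore-stopped useful; with `kill` the cgroup is emptied so that it can be removed afterwards.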
