Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1875946: Pod process container does not correctly reap zombie process with shareProcessNamespace: true #27

Merged
merged 1 commit into from Sep 15, 2020

Conversation

psykulsk
Copy link
Contributor

@psykulsk psykulsk commented Aug 16, 2020

The purpose of this pull request is to add handling the SIGCHLD signal and proper reaping of the zombie processes to the pod binary.

This is to avoid exhaustion of the process table slots in some long running scenarios and to keep the behavior of pods consistent with the vanilla kubernetes distribution, when using shareProcessNamespace: true.

Why

The default configuration of the cri-o deployed by Openshift/OKD (version 4) specifies the pause_command = "/usr/bin/pod". This means that when the user spawns a pod with shareProcessNamespace: true, the pod binary will be launched as the init process in the pod's pid namespace. If one of the containers in that pod has an exec readiness probe configured, that executes some kind of command that may sometimes exceed the readiness probe's timeout, all processes launched by this command will be left as zombie/defunct processes and will never be cleaned up. That's because the /usr/bin/pod, running as the PID 1 process, does not currently handle the SIGCHLD signal and does not reap the zombie processes. Over a longer time span this may fill the process table size available inside the container namespace and cause errors like "Cannot fork" and cause other problems during the pod's main container runtime.

The pause container used in the vanilla kubernetes distribution handles the SIGCHLD and reaps the zombies as can be seen in the source code.

Steps to reproduce on OKD/Openshift 4 cluster

  1. Deploy Openshift/OKD 4 cluster
  2. Create a pod with shareProcessNamespace: true and an exec readiness probe that will timeout. Example of a pod:
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  shareProcessNamespace: true
  containers:
  - name: ubuntu
    image: ubuntu
    command:
      - /bin/tail
      - -f
      - /dev/null
    readinessProbe:
      exec:
        command:
          - /bin/sh
          - -c
          - sleep 5
      periodSeconds: 3
      timeoutSeconds: 1
  1. Do kubectl exec -it test bash, run the top command, and observe how a new zombie process appears every 3 seconds.

Steps to reproduce with Docker (or other container runtime)

  1. Run a container with the image containing the pod binary. For example, use the image from the latest release of OKD (pod image specified here).
docker run -d --name pod quay.io/openshift/okd-content@sha256:1207114d4db1bdb9431fdcc890b6813e889fdcfba94900396f0ab9ca4f0c5dbd
  1. Exec into the container and run top.
docker exec -it pod top
  1. From a different terminal, create another container in the same pid namespace.
docker run -it  --rm  --pid=container:pod ubuntu bash
  1. From the new container, run a command that will spawn a few children and then exit without waiting.
for i in {1..10}; do $(sleep 2) & done; exit
  1. Observe how new zombie processes appear in the top output.

When you repeat steps mentioned above, but with an image that contains pod binary with the added SIGCHLD handling, no zombie processes are left hanging.

References:

@openshift-ci-robot
Copy link
Contributor

Hi @psykulsk. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 16, 2020
@psykulsk
Copy link
Contributor Author

/assign @smarterclayton

@psykulsk
Copy link
Contributor Author

/cc @soltysh

@openshift-ci-robot
Copy link
Contributor

@psykulsk: GitHub didn't allow me to request PR reviews from the following users: soltysh.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @soltysh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@psykulsk
Copy link
Contributor Author

/assign @soltysh

@openshift-ci-robot
Copy link
Contributor

@psykulsk: GitHub didn't allow me to assign the following users: soltysh.

Note that only openshift members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @soltysh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@psykulsk
Copy link
Contributor Author

@smarterclayton @derekwaynecarr @soltysh could you have a look at this or assign someone else? Thanks!

@smarterclayton
Copy link
Contributor

@mrunalp can you asisgn a reviewer for this and verify consistency with kube pause.c?

@smarterclayton
Copy link
Contributor

/ok-to-test

@openshift-ci-robot openshift-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 4, 2020
@smarterclayton smarterclayton changed the title Handling of SIGCHLD and reaping zombie processes in pod.go Bug 1875946: Pod process container does not correctly reap zombie process with shareProcessNamespace: true Sep 4, 2020
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Sep 4, 2020
@openshift-ci-robot
Copy link
Contributor

@psykulsk: This pull request references Bugzilla bug 1875946, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1875946: Pod process container does not correctly reap zombie process with shareProcessNamespace: true

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Sep 4, 2020
@mrunalp
Copy link
Member

mrunalp commented Sep 4, 2020

Yeah, looking!

pod/pod.go Outdated Show resolved Hide resolved
@mrunalp
Copy link
Member

mrunalp commented Sep 10, 2020

Could you either squash or clean up the commits?

pod/pod.go Show resolved Hide resolved
@psykulsk
Copy link
Contributor Author

/retest

pod/pod.go Outdated Show resolved Hide resolved
pod/pod.go Show resolved Hide resolved
pod/pod.go Outdated Show resolved Hide resolved
pod/pod.go Show resolved Hide resolved
@psykulsk
Copy link
Contributor Author

/retest

pod/pod.go Outdated Show resolved Hide resolved
pod/pod.go Outdated Show resolved Hide resolved
pod/pod.go Outdated Show resolved Hide resolved
@psykulsk
Copy link
Contributor Author

/retest

2 similar comments
@psykulsk
Copy link
Contributor Author

/retest

@psykulsk
Copy link
Contributor Author

/retest

pod/pod.go Outdated Show resolved Hide resolved
pod/pod.go Outdated Show resolved Hide resolved
@mrunalp
Copy link
Member

mrunalp commented Sep 14, 2020

This looks fine besides 2 final nits. Thanks 👍

@mrunalp
Copy link
Member

mrunalp commented Sep 14, 2020

/lgtm

@openshift-ci-robot
Copy link
Contributor

@mrunalp: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mrunalp
Copy link
Member

mrunalp commented Sep 15, 2020

/test e2e-aws-upgrade

Copy link
Member

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tagging based on @mrunalp review
/lgtm
/approve

@openshift-ci-robot
Copy link
Contributor

@soltysh: changing LGTM is restricted to collaborators

In response to this:

Tagging based on @mrunalp review
/lgtm
/approve

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mrunalp
Copy link
Member

mrunalp commented Sep 15, 2020

@smarterclayton ptal for approve/lgtm

@smarterclayton
Copy link
Contributor

/lgtm
/approve

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 15, 2020
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp, psykulsk, smarterclayton, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 15, 2020
@openshift-merge-robot openshift-merge-robot merged commit a3f9ebe into openshift:master Sep 15, 2020
@openshift-ci-robot
Copy link
Contributor

@psykulsk: All pull requests linked via external trackers have merged:

Bugzilla bug 1875946 has been moved to the MODIFIED state.

In response to this:

Bug 1875946: Pod process container does not correctly reap zombie process with shareProcessNamespace: true

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@psykulsk psykulsk deleted the reapchildren branch September 15, 2020 18:52
@haircommander
Copy link
Member

/cherry-pick release-4.5

@openshift-cherrypick-robot

@haircommander: new pull request created: #57

In response to this:

/cherry-pick release-4.5

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants