
kubelet: respect exec probe timeouts #94115

Merged (8 commits, Nov 10, 2020)

Conversation

@andrewsykim (Member) commented Aug 20, 2020

Signed-off-by: Andrew Sy Kim <kim.andrewsy@gmail.com>

What type of PR is this?

/kind bug

What this PR does / why we need it:
This PR fixes exec timeout issues for both dockershim and containerd.

When using the kubelet dockershim, the exec timeout is not respected, which means the timeouts for exec readiness/liveness probes are not respected either. This PR ensures that the timeout passed into RunInContainer for dockershim is respected. For readiness/liveness probes, the timeout value passed into RunInContainer is derived from the probe's timeout.

For containerd, the prober ignores timeout errors from the remote runtime's ExecSync, since it expects a utilexec.ExitCodeError for any failed probe; any other error is simply ignored by the prober. This PR updates ExecSync to return a utilexec.ExitCodeError when the gRPC error from the CRI is DeadlineExceeded.

Which issue(s) this PR fixes:

Fixes #94080

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

ACTION REQUIRED: a bug was fixed in kubelet where exec probe timeouts were not respected. Ensure that pods relying on this behavior are updated to correctly handle probe timeouts.

This change in behavior may be unexpected for some clusters and can be disabled by turning off the ExecProbeTimeout feature gate. This gate will be locked and removed in future releases so that exec probe timeouts are always respected.
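For operators who need the old behavior temporarily, the gate can be disabled via the kubelet's feature-gates flag (shown here as a command-line sketch; the same key works in the kubelet config file):

```shell
# Opt out of enforced exec probe timeouts until workloads are fixed.
# The ExecProbeTimeout gate will be locked to true and removed in a future release.
kubelet --feature-gates=ExecProbeTimeout=false
```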

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added labels kind/bug, size/M, do-not-merge/release-note-label-needed, cncf-cla: yes, needs-sig, needs-priority on Aug 20, 2020
@k8s-ci-robot added labels release-note, area/kubelet, sig/node and removed do-not-merge/release-note-label-needed, needs-sig on Aug 20, 2020
@andrewsykim (Member Author) commented Aug 20, 2020

I was able to reproduce this on a GKE cluster.

Deployed a pod whose exec probe takes 10s to pass, with timeoutSeconds set to 1. On nodes running the existing kubelet, the probe passes since the timeout is not respected.

I manually updated one of the kubelets with this change; the timeout is now respected and the readiness probe fails.

$ kubectl get po -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP         NODE                                       NOMINATED NODE   READINESS GATES
slow-deployment-6989cc5777-bc92t   1/1     Running   0          4h26m   10.4.2.5   gke-cluster-1-default-pool-9c83cf52-q55l   <none>           <none>
slow-deployment-6989cc5777-cvhhz   1/1     Running   0          4h26m   10.4.0.7   gke-cluster-1-default-pool-9c83cf52-nchv   <none>           <none>
slow-deployment-6989cc5777-f4vhj   0/1     Running   0          4h26m   10.4.1.6   gke-cluster-1-default-pool-9c83cf52-n6x4   <none>           <none>

Deployment:

$ cat slow.yaml 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: slow-deployment
  labels:
    app: slow
spec:
  replicas: 3
  selector:
    matchLabels:
      app: slow
  template:
    metadata:
      labels:
        app: slow
    spec:
      containers:
      - name: slow
        image: jugosag/slow:1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
        env:
        - name: healthCheckSlowness
          value: "10"
        readinessProbe:
          exec:
            command:
            - curl
            - -f
            - localhost:8080/slow?timeInSecs=10
          initialDelaySeconds: 2
          periodSeconds: 1
          timeoutSeconds: 1

Thanks @jgoeres for the test deployment.

count++
if count == 5 {
	klog.Errorf("Exec session %s in container %s terminated but process still running!", execObj.ID, container.ID)
	break
}
andrewsykim (Member Author)

I think this count check can be removed since we now check the timeout, but I'll defer to the kubelet maintainers since I don't have context on why it was originally added.

Member

Looks like earlier we just had an infinite loop until InspectExec could tell us that the process we started had exited... the count was added to break out of it in 7748a02.

andrewsykim (Member Author)

Never mind - I think we need to keep this, since it's expected behavior for the kubelet to exit at some point with a nil error and allow the exec'd command to continue running.

andrewsykim (Member Author)

That or there should be some default timeout here, but don't want to introduce new changes here.

@dims (Member) commented Aug 20, 2020

/assign @dashpole @sjenning

@dims (Member) commented Aug 20, 2020

@andrewsykim also additional context, see #50176

@andrewsykim force-pushed the fix-dockershim-exec branch 2 times, most recently from 2bb6cf2 to 330fc4f on August 20, 2020 01:54
@SergeyKanzhelev (Member)

Is this an alternative to PR #58925?

@andrewsykim (Member Author)

/retest

@andrewsykim (Member Author)

The e2e failures look legitimate; I'll dig into them.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 20, 2020
@@ -110,28 +110,32 @@ func (*NativeExecHandler) ExecInContainer(client libdocker.Interface, container
}

ticker := time.NewTicker(2 * time.Second)
execTimeout := time.After(timeout)
andrewsykim (Member Author)

need to check for 0 timeout, since exec outside of probes will have 0 timeout

@andrewsykim andrewsykim force-pushed the fix-dockershim-exec branch 2 times, most recently from 8805bd1 to 7ee5e07 Compare August 20, 2020 16:03
@k8s-ci-robot k8s-ci-robot added release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Nov 15, 2020
Comment on lines +148 to +156

// Only limit the amount of InspectExec calls if the exec timeout was not set.
// When a timeout is not set, we stop polling the exec session after 5 attempts and allow the process to continue running.
if execTimeout == nil {
	count++
	if count == 5 {
		klog.Errorf("Exec session %s in container %s terminated but process still running!", execObj.ID, container.ID)
		return nil
	}
}
@dionysius commented Apr 19, 2021

Doesn't this mean that any probe without a timeout that takes longer than 10 seconds will automatically be treated as successful, regardless of the command's actual exit status?

Member

There have been a number of follow-ups to this PR since it landed so I suggest you take a look at what's committed on the master branch.

@dionysius

Ignore my question - it was an interpretation error on my side. On master (and probably on this commit as well) client.StartExec() is blocking, so this code block runs after the command has finished in some form. The 10 seconds is only a maximum for gathering the exit info; the code on master is slightly different but has the same effect.

@PrabhuMathi commented May 4, 2021

Facing the same issue on 1.19.7 (AKS) with containerd (1.5.0-beta) as the CRI. Could you please advise which containerd or Kubernetes patch version fixes this?

The pod is stuck in Running with 0/1 ready:

readinessProbe:
  exec:
    command:
    - sleep
    - "15"
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 2

Readiness probe errored: rpc error: code = DeadlineExceeded desc = failed to exec in container: timeout 2s exceeded: context deadline exceeded

@matthyx (Contributor) commented May 4, 2021

@PrabhuMathi I don't think this commit has been cherry-picked in 1.19 - you'll have to use 1.20 or above

@PrabhuMathi

> @PrabhuMathi I don't think this commit has been cherry-picked in 1.19 - you'll have to use 1.20 or above

@matthyx thanks for the update. I'll set the timeout in the command itself as mentioned and proceed. I also started facing readiness probe failures after migrating to containerd - any reference on this, please?

@SergeyKanzhelev (Member)

> Also started facing readiness probe failure after migrating to Containerd any ref on this please?

As far as I remember, with containerd before 1.19 the probe will time out, but the timeout will not be treated as an error. With Docker there will be no timeout at all. Is that what you are experiencing?

@PrabhuMathi

> As far as I remember, with Containerd before 1.19, probe will timeout, but it will not be treated as an error. With Docker there will be no timeout. Is it what you experience?

Yes, that's true - with Docker the same pod runs fine, but when I move it to a containerd node I hit this issue. The actual problem I'm facing is not the timeout itself, but that the readiness probe fails with the command (sleep 15).

@matthyx (Contributor) commented May 7, 2021

@PrabhuMathi can you share your podspec?

stuartpb added a commit to stuartpb/bitnami-charts that referenced this pull request Sep 16, 2021
The timeout wrapper in health checks was added in helm/charts#11355
to work around Docker/containerd not respecting timeouts in probes
(cf. kubernetes/kubernetes#58925). The upstream issue has been fixed
since Kubernetes 1.20 (kubernetes/kubernetes#94115), and this wrapper
causes degraded behavior (i.e. any failure in the wrapped command only
gets reported as "The monitored command dumped core", without details
for the specific failure), so the original behavior should be restored.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Successfully merging this pull request may close these issues.

exec-type liveness or readiness probes ignore timeout