
kubectl get should have a way to filter for advanced pods status #49387

Open · simonswine opened this issue Jul 21, 2017 · 66 comments

@simonswine (Member) commented Jul 21, 2017

What happened:

I'd like to have a simple command to check for pods that are currently not ready.

What you expected to happen:

I can see a couple of options:

  • There is some magic flag I am not aware of
  • Having a flag for kubectl get to filter the output using go/jsonpath
  • Distinguish between Pod Phase Running&Ready and Running
  • Flag to filter on ready status

How to get that currently:

kubectl get pods --all-namespaces -o json  | jq -r '.items[] | select(.status.phase != "Running" or ([ .status.conditions[] | select(.type == "Ready" and .state == false) ] | length ) == 1 ) | .metadata.namespace + "/" + .metadata.name'
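
A -o jsonpath sketch (untested) that prints each pod together with the status of its Ready condition, for grepping:

kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}'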
@simonswine (Member Author) commented Jul 21, 2017

/kind feature

@simonswine (Member Author) commented Jul 21, 2017

/sig cli

@EtienneDeneuve commented Aug 28, 2017

Same here. It sounds unbelievable that such a complex syntax is needed just to list non-running containers...

@jackzampolin commented Oct 18, 2017

Ideally I would be able to say something like:

kubectl get pods --namespace foo -l status=pending

@carlossg (Contributor) commented Nov 23, 2017

I had to make a small modification (.status == "False" instead of .state == false) to get it to work:

kubectl get pods -a --all-namespaces -o json  | jq -r '.items[] | select(.status.phase != "Running" or ([ .status.conditions[] | select(.type == "Ready" and .status == "False") ] | length ) == 1 ) | .metadata.namespace + "/" + .metadata.name'

@dixudx (Member) commented Nov 24, 2017

#50140 provides a new flag --field-selector to filter these pods now.

$ kubectl get pods --field-selector=status.phase!=Running

/close

@asarkar commented Dec 13, 2017

@dixudx

kubectl version
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T19:11:02Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"0b9efaeb34a2fc51ff8e4d34ad9bc6375459c4a4", GitTreeState:"clean", BuildDate:"2017-11-29T22:43:34Z", GoVersion:"go1.9.1", Compiler:"gc", Platform:"linux/amd64"}
kubectl get po --field-selector=status.phase==Running -l app=k8s-watcher
Error: unknown flag: --field-selector

@dixudx (Member) commented Dec 13, 2017

@asarkar --field-selector is targeted for v1.9, which is coming out soon.

@simonswine (Member Author) commented Jan 15, 2018

@dixudx thanks for the PR for the field selector, but I think this is not what I had in mind. I wanted to be able to figure out which pods have one or more containers that are not passing their readiness checks.

Given a non-ready pod (kubectl v1.9.1) showing READY 0/1:

$ kubectl get pods                                       
NAME          READY     STATUS    RESTARTS   AGE
pod-unready   0/1       Running   0          50s

This pod is still in phase Running, so I can't get it using your proposed filter:

$ kubectl get pods --field-selector=status.phase!=Running
No resources found.
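
A jq workaround that does select on the Ready condition (a sketch, not a kubectl-native answer):

kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.status.phase == "Running" and ([.status.conditions[]? | select(.type == "Ready" and .status != "True")] | length) > 0) | .metadata.namespace + "/" + .metadata.name'

which lists pods that are Running but not Ready.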

@simonswine (Member Author) commented Jan 15, 2018

/reopen

@k8s-ci-robot reopened this Jan 15, 2018

@af732p commented Jan 18, 2018

Got the same issue.
I would be glad to have something like:
kubectl get pods --field-selector=status.ready!=True

@artemyarulin commented Feb 21, 2018

Hm, can I use it for getting nested array items? Like I want to do

kubectl get pods --field-selector=status.containerStatuses.restartCount!=0

But it returns an error. I tried status.containerStatuses..restartCount, but that also doesn't work and returns the same error: Error from server (BadRequest): Unable to find "pods" that match label selector "", field selector "status.containerStatuses..restartCount==0": field label not supported: status.containerStatuses..restartCount

@jackzampolin commented Feb 21, 2018

@artemyarulin try status.containerStatuses[*].restartCount==0

@artemyarulin commented Feb 22, 2018

Thanks, I just tried with kubectl v1.9.3 / cluster v1.9.2 and it returns the same error: Error from server (BadRequest): Unable to find "pods" that match label selector "", field selector "status.containerStatuses[*].restartCount!=0": field label not supported: status.containerStatuses[*].restartCount. Am I doing something wrong? Does it work for you?
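
A jq fallback for this restart-count filter, since --field-selector doesn't support nested array fields (a sketch; any/2 requires jq 1.5+):

kubectl get pods -o json | jq -r '.items[] | select(any(.status.containerStatuses[]?; .restartCount > 0)) | .metadata.name'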


eloycoto added a commit to eloycoto/cilium that referenced this issue Mar 6, 2018
Due to a test flake I discovered that the termination helper didn't work
as expected, and status.phase does not represent it at all
(kubernetes/kubernetes#49387)

Issue:

```
vagrant@k8s1:~$ kubectl delete pod testds-w7prl
pod "testds-w7prl" deleted
vagrant@k8s1:~$ kubectl get pods
NAME               READY     STATUS        RESTARTS   AGE
netcatds-bhxv4     1/1       Running       0          5m
netcatds-zpzzl     1/1       Running       0          5m
testclient-8qx59   1/1       Running       0          1m
testclient-r9xmm   1/1       Running       0          1m
testds-fwss5       1/1       Running       0          32s
testds-w7prl       0/1       Terminating   0          1m
vagrant@k8s1:~$ kubectl get pods -o "jsonpath='{.items[*].status.phase}'"
'Running Running Running Running^C
vagrant@k8s1:~$ kubectl get pods
NAME               READY     STATUS        RESTARTS   AGE
netcatds-bhxv4     1/1       Running       0          5m
netcatds-zpzzl     1/1       Running       0          5m
testclient-8qx59   1/1       Running       0          1m
testclient-r9xmm   1/1       Running       0          1m
testds-fwss5       1/1       Running       0          40s
testds-w7prl       0/1       Terminating   0          1m
vagrant@k8s1:~$
```

Signed-off-by: Eloy Coto <eloy.coto@gmail.com>
ianvernon added a commit to cilium/cilium that referenced this issue Mar 8, 2018
@migueleliasweb commented Apr 19, 2018

Sadly, the same thing happens for v1.9.4:

What I'm trying to do here is to get all pods with a given parent uid...

$ kubectl get pod --field-selector='metadata.ownerReferences[*].uid=d83a23e1-37ba-11e8-bccf-0a5d7950f698'
Error from server (BadRequest): Unable to find "pods" that match label selector "", field selector "ownerReferences[*].uid=d83a23e1-37ba-11e8-bccf-0a5d7950f698": field label not supported: ownerReferences[*].uid

Waiting anxiously for this feature •ᴗ•

@dixudx (Member) commented Apr 19, 2018

The filter string --field-selector='metadata.ownerReferences[*].uid=d83a23e1-37ba-11e8-bccf-0a5d7950f698' is not supported, hence the error field label not supported: ownerReferences[*].uid.

For pods, only "metadata.name", "metadata.namespace", "spec.nodeName", "spec.restartPolicy", "spec.schedulerName", "status.phase", "status.podIP" and "status.nominatedNodeName" are supported as field selectors.

@migueleliasweb If you want to filter out the pod in your case, you can use jq.

$ kubectl get pod -o json | jq '.items | map(select(.metadata.ownerReferences[] | .uid=="d83a23e1-37ba-11e8-bccf-0a5d7950f698"))'

You can also use kubectl's JSONPath support.
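
For example (an untested sketch; it assumes the owning controller is the first entry in ownerReferences):

kubectl get pods -o jsonpath='{.items[?(@.metadata.ownerReferences[0].uid=="d83a23e1-37ba-11e8-bccf-0a5d7950f698")].metadata.name}'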

@migueleliasweb commented Apr 19, 2018

Thanks @dixudx. But let me understand a little bit better. If I'm running this query in a cluster with a few thousand pods:

  • Does the APIServer fetch all of them from etcd and then apply in-memory filtering?
  • Or does my kubectl receive all pods and apply the filter locally?
  • Or does the filtering occur inside etcd, so that only the filtered results are returned?

@dixudx (Member) commented Apr 20, 2018

Does the APIServer fetch all of them from etcd and then apply in-memory filtering? Or does my kubectl receive all pods and apply the filter locally? Or does the filtering occur inside etcd?

@migueleliasweb When --field-selector is used with kubectl, the filtering happens in a cache inside the apiserver. The apiserver keeps a single watch open to etcd, watching all objects of a given type without any filtering; the changes delivered from etcd are then stored in the apiserver's cache, and field-selector queries are answered from there.

For --sort-by, the sorting happens on the kubectl client side.
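
You can confirm that the filtering is server-side by issuing the request directly against the API (illustrative; the = in the fieldSelector value is URL-encoded as %3D):

kubectl get --raw '/api/v1/pods?fieldSelector=status.phase%3DRunning'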

@kvs commented Apr 26, 2018

This works great for me with kubectl get, but it would also be nice if it applied to delete and describe.

@fejta-bot commented Jul 25, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@raelga (Member) commented May 3, 2020

/remove-lifecycle stale

@alexburlton-sonocent commented May 7, 2020

Adding my voice to this - agree that it would be awesome to be able to do kubectl get pods --ready or similar. I wanted to add a step to a pipeline that would wait until all pods were ready (and fail after a timeout otherwise), and have had to rely on grep... which is brittle if the output of kubectl ever changes. I would much rather query the status of the pods more directly.
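
For the pipeline case, kubectl wait may already get most of the way there (a sketch; my-namespace is a placeholder, and the command exits non-zero on timeout, which a CI step can act on):

kubectl wait pods --all -n my-namespace --for=condition=Ready --timeout=300s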

@flickerfly commented Jul 1, 2020

As @luvkrai stated, I have to use grep to find containers in CrashLoopBackOff status. Here's how I'm filtering right now; I sort by nodeName to show whether one node is causing more problems than others. Seems like a super hard problem somehow. Funny that we can get all this output in the "STATUS" column consistently, but can't filter on that column without additional tools.

oc get pods --all-namespaces -o wide --sort-by=.spec.nodeName | grep -Ev "(Running|Completed)"

@flickerfly commented Jul 1, 2020

If I had more Golang knowledge, I think it would be clear from the code here how the "STATUS" column is built, and that might lead to a better solution.
https://github.com/kubernetes/kubectl/blob/7daf5bcdb45a24640236b361b86c056282ddcf80/pkg/describe/describe.go#L679

@alexburlton-sonocent To avoid some of the brittle aspects of grep, you could use the --no-headers --output custom-columns= options to specify which columns you want, but you may run into the same problem: the complete info shown in the STATUS column isn't dependably found in the pod definition.
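
For instance (a sketch; columns whose fields are missing show <none>):

kubectl get pods --all-namespaces --no-headers -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,PHASE:.status.phase,WAITING:.status.containerStatuses[*].state.waiting.reason'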

@kgtw commented Jul 25, 2020

Here's something I use to find all pods that are not fully running (e.g. some containers are failing):

kubectl get po --all-namespaces | gawk 'match($3, /([0-9])+\/([0-9])+/, a) {if (a[1] < a[2] && $4 != "Completed") print $0}'
NAMESPACE        NAME                               READY   STATUS      RESTARTS   AGE
blah             blah-6d46d95b96-7wsh6              2/4     Running     0          33h

@fejta-bot commented Oct 23, 2020

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@invidian (Member) commented Oct 23, 2020

/remove-lifecycle stale

@aloisio31 commented Oct 27, 2020

see here

@fejta-bot commented Jan 25, 2021

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@invidian (Member) commented Jan 25, 2021

/remove-lifecycle stale

@rdxmb commented Jan 25, 2021

/remove-lifecycle stale

@Ravikanth31 commented Feb 12, 2021

kubectl get pods | grep -v Running

Try this; it will list all the pods that are not in a Running state.

@ensonic commented Feb 19, 2021

kubectl get pods --field-selector=status.phase!=Running

is not a good solution, since one would need to be able to say

kubectl get pods --field-selector=status.phase!=(Running|Completed)

Unfortunately the set operators (in and not in) don't seem to work on the command line (and not for field selectors).
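
Field selectors can, however, be chained with commas (the requirements are ANDed), so something close should work, assuming pods shown as Completed have phase Succeeded:

kubectl get pods --field-selector=status.phase!=Running,status.phase!=Succeeded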

@browgregpa commented Mar 22, 2021

If running on Unix:
kubectl get pods --field-selector=status.phase!=Running --all-namespaces | grep -v "Completed"

@fejta-bot commented Jun 20, 2021

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@fejta-bot commented Jul 20, 2021

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@invidian (Member) commented Jul 20, 2021

/remove-lifecycle rotten

@ringerc commented Sep 21, 2021

This is a seriously confusing usability issue when getting started with k8s.

As a new user I found it very confusing that my pods showed Running, but readiness checks were failing, so the real status was "0/2". I understand why readiness is distinct from pod status, but it's definitely something kubectl could help users with.

kubectl wait pod --for=condition=Ready --all --all-namespaces will report the first pod that is not Ready and then exit, but it does not print easily machine-readable output or continue reporting other pods after it finds the first non-ready one.

kubectl get pods doesn't offer field selectors that match non-ready pods, so people resort to chaining various hacks with external tools.

@randywatson1979 commented Nov 17, 2021

In my case I have a rolling-update deployment in which, after kubectl rollout returned success, the new ReplicaSet's pod is marked Running and the old one Terminating.

Waiting for deployment "some-deployment-app" rollout to finish: 1 old replicas are pending termination...
deployment "some-deployment-app" successfully rolled out

kubectl get pods -n some-namespace --selector=app=some-app 

NAME                        READY   STATUS        RESTARTS   AGE
some-deployment-app-p5pzw   1/1     Terminating   0          8m44s
some-deployment-app-j3xwk   1/1     Running       0          91s

However, I have randomly encountered that the -o json output shows both pods as running:

.items[0].status.phase: "Running" and .items[0].status.containerStatuses[0].state.running
.items[1].status.phase: "Running" and .items[1].status.containerStatuses[0].state.running

But after one second:
.items[0].status.phase: "Running" and .items[0].status.containerStatuses[0].state.terminated
.items[1].status.phase: "Running" and .items[1].status.containerStatuses[0].state.running

Then when I use the Go template method, I get the correct results:
kubectl get pod -n some-namespace --selector=app=some-app -o go-template --template='{{range .items}}{{$metadataName := .metadata.name}}{{range .status.containerStatuses}}{{if .state.running}}{{$metadataName}}{{end}}{{end}}{{end}}'

So as you noticed, the field selector will not work, because both pods will always return status.phase: "Running".
It seems like (an assumption) there is a delay between Kubernetes marking the pod as Terminating (SIGTERM) and that state being reflected in the pod's status.
So where does the Terminating shown by kubectl get pods come from? It's certainly not from the pod specs.

This behaviour is causing confusion.

I could set a fixed delay of 3 seconds, but that is a very bad solution.
Another way is to set terminationGracePeriodSeconds: 0 for the deployment, which does remedy the issue, but it effectively disables the termination grace period (which defaults to 30 seconds). I'm not sure whether that is bad practice.
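
A sketch of that workaround as a strategic merge patch (deployment and namespace names are the ones from the example above):

kubectl patch deployment some-deployment-app -n some-namespace -p '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":0}}}}'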

Further inspection suggests that during the termination grace period (between SIGTERM and SIGKILL) the container statuses aren't updated yet. Once the container receives SIGKILL, or shuts down by itself within the grace period, the terminated container state is updated after the fact:

...
"status": {
    "containerStatuses": [
        {
            "state": {
                "terminated": {
                    "containerID": "docker://88f2d0e303bc9a253e66002523c5be662cd756206d6ba68e",
                    "exitCode": 0,
                    "finishedAt": "2021-11-17T14:43:04Z",
                    "reason": "Completed",
                    "startedAt": "2021-11-17T13:47:15Z"
                }
            }
@mnpenner commented Nov 18, 2021

You can do something like this to find pods that have a not-ready container:

kubectl get pods -oyaml | yq e '.items[] | select(.status.containerStatuses[].ready | not) | .metadata.name' -

It doesn't work so well on crons and one-off jobs, but it should give you an idea if you're looking for a more powerful query selector.

yq being https://github.com/mikefarah/yq. Alternatively you can use -ojson with jq instead.

kubectl get pods -ojson | jq '.items[] | select(.status.containerStatuses[].started | not) | [{name: .metadata.name, statuses: .status.containerStatuses}]'

Or maybe:

kubectl get pods -ojson | jq '.items[] | select(.status.containerStatuses[] | ((.ready|not) and .state.terminated.exitCode!=0)) | [{name: .metadata.name, statuses: .status.containerStatuses}]'
