
If container is not in correct state podman exec should exit with 126 #3307

Merged
merged 1 commit into containers:master on Jun 12, 2019

Conversation

rhatdan
Member

@rhatdan commented Jun 12, 2019

This way a tool can determine whether the container exists but is in the
wrong state.

Since 126 is documented as:
126 if the contained command cannot be invoked

it makes sense for podman exec to exit with this code.

Signed-off-by: Daniel J Walsh dwalsh@redhat.com

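In practice, a caller can now branch on podman exec's exit code alone. A minimal sketch in shell; the container name mycontainer is hypothetical:

# Hedged sketch: distinguish "no such container" from "wrong state"
# using only the exit codes documented above.
podman exec mycontainer /bin/true
rc=$?
case "$rc" in
  0)   echo "container is running and the command succeeded" ;;
  126) echo "container exists but is in the wrong state (e.g. stopped)" ;;
  125) echo "no such container, or another podman-level error" ;;
  *)   echo "unexpected exit code: $rc" ;;
esac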
@openshift-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rhatdan


Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot added the approved and size/XS labels Jun 12, 2019
@mbaldessari

mbaldessari commented Jun 12, 2019

I tested this and can confirm it works. Previous behaviour:

[root@controller-0 ~]# podman exec nonexisting /bin/true; echo $?
unable to exec into nonexisting: no container with name or ID nonexisting found: no such container
125
# memcached is stopped
[root@controller-0 ~]# podman exec memcached /bin/true; echo $?
cannot exec into container that is not running
125

After I patched podman with this review:

[root@undercloud-0 ~]# podman exec nonexisting /bin/true; echo $?
Error: unable to exec into nonexisting: no container with name or ID nonexisting found: no such container
125
[root@undercloud-0 ~]# podman exec memcached /bin/true; echo $?
Error: cannot exec into container that is not running: container state improper
126

Thanks again!

mbaldessari added a commit to mbaldessari/resource-agents that referenced this pull request Jun 12, 2019
With the following podman change containers/podman#3307
we are able to run podman exec and distinguish between the cases of
nonexisting and nonrunning containers. This allows us to avoid podman
inspect altogether, which is not very performant under I/O load.

Tested as follows:
1) Installed a podman version that included the GH#3307 change on all cluster nodes
2) Observed that monitoring operations kept working okay
3) Restarted rabbitmq-bundle and galera-bundle successfully
4) Stopped a container via podman; we correctly detected the monitor failure and recovered from it:
Jun 12 12:12:36 controller-0 podman(haproxy-bundle-podman-1)[403734]: ERROR: monitor cmd failed (rc=126), output: Error: cannot exec into container that is not running: container state improper
Jun 12 12:12:36 controller-0 pacemaker-execd[25744]: notice: haproxy-bundle-podman-1_monitor_60000:403609:stderr [ ocf-exit-reason:monitor cmd failed (rc=126), output: Error: cannot exec into container that is not running: container state improper ]
Jun 12 12:12:36 controller-0 pacemaker-controld[25747]: notice: controller-0-haproxy-bundle-podman-1_monitor_60000:264 [ ocf-exit-reason:monitor cmd failed (rc=126), output: Error: cannot exec into container that is not running: container state improper\n ]
Jun 12 12:12:36 controller-0 pacemaker-controld[25747]: notice: controller-0-haproxy-bundle-podman-1_monitor_60000:264 [ ocf-exit-reason:monitor cmd failed (rc=126), output: Error: cannot exec into container that is not running: container state improper\n ]
Jun 12 12:12:37 controller-0 podman(haproxy-bundle-podman-1)[403894]: INFO: f9d4944366c6484e94a1b476e76dd9a1022d19b08cb8f2526faf9afd27256bf3
Jun 12 12:12:37 controller-0 podman(haproxy-bundle-podman-1)[403961]: NOTICE: Cleaning up inactive container, haproxy-bundle-podman-1.
Jun 12 12:12:37 controller-0 podman[403965]: 2019-06-12 12:12:37.689546085 +0000 UTC m=+0.228982024 container remove f9d4944366c6484e94a1b476e76dd9a1022d19b08cb8f2526faf9afd27256bf3 (image=192.168.24.1:8787/rhosp15/openstack-haproxy:20190607.1, name=haproxy-bundle-podman-1
Jun 12 12:12:37 controller-0 podman(haproxy-bundle-podman-1)[404051]: INFO: f9d4944366c6484e94a1b476e76dd9a1022d19b08cb8f2526faf9afd27256bf3
Jun 12 12:12:37 controller-0 pacemaker-controld[25747]: notice: Result of stop operation for haproxy-bundle-podman-1 on controller-0: 0 (ok)
Jun 12 12:12:38 controller-0 podman(haproxy-bundle-podman-1)[404238]: INFO: running container haproxy-bundle-podman-1 for the first time
Jun 12 12:12:39 controller-0 podman(haproxy-bundle-podman-1)[404522]: NOTICE: Container haproxy-bundle-podman-1  started successfully
Jun 12 12:12:39 controller-0 pacemaker-controld[25747]: notice: Result of start operation for haproxy-bundle-podman-1 on controller-0: 0 (ok)

6) Stopped and removed a container and pcmk detected it correctly:
Jun 12 12:14:40 controller-0 podman(haproxy-bundle-podman-1)[414018]: ERROR: monitor cmd failed (rc=125), output: Error: unable to exec into haproxy-bundle-podman-1: no container with name or ID haproxy-bundle-podman-1 found: no such container
Jun 12 12:14:40 controller-0 pacemaker-execd[25744]: notice: haproxy-bundle-podman-1_monitor_60000:413986:stderr [ ocf-exit-reason:monitor cmd failed (rc=125), output: Error: unable to exec into haproxy-bundle-podman-1: no container with name or ID haproxy-bundle-podman-1 found: no such container ]
Jun 12 12:14:40 controller-0 pacemaker-controld[25747]: notice: controller-0-haproxy-bundle-podman-1_monitor_60000:312 [ ocf-exit-reason:monitor cmd failed (rc=125), output: Error: unable to exec into haproxy-bundle-podman-1: no container with name or ID haproxy-bundle-podman-1 found: no such container\n ]
Jun 12 12:14:40 controller-0 pacemaker-controld[25747]: notice: controller-0-haproxy-bundle-podman-1_monitor_60000:312 [ ocf-exit-reason:monitor cmd failed (rc=125), output: Error: unable to exec into haproxy-bundle-podman-1: no container with name or ID haproxy-bundle-podman-1 found: no such container\n ]
Jun 12 12:14:40 controller-0 pacemaker-controld[25747]: notice: Result of stop operation for haproxy-bundle-podman-1 on controller-0: 0 (ok)
Jun 12 12:14:41 controller-0 podman(haproxy-bundle-podman-1)[414183]: INFO: running container haproxy-bundle-podman-1 for the first time
Jun 12 12:14:42 controller-0 podman(haproxy-bundle-podman-1)[414334]: NOTICE: Container haproxy-bundle-podman-1  started successfully
Jun 12 12:14:42 controller-0 pacemaker-controld[25747]: notice: Result of start operation for haproxy-bundle-podman-1 on controller-0: 0 (ok)

7) Added 'set -x' to the RA and correctly observed that no 'podman inspect' was invoked during monitoring or start/stop operations

NB: Need to see if there is a way to make this more backwards compatible
Signed-off-by: Michele Baldessari <michele@acksyn.org>
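A minimal sketch of the approach that commit describes, mapping podman exec exit codes to OCF monitor results; the function name monitor_via_exec is hypothetical and the 126-to-not-running mapping is one plausible choice, not the actual resource-agents code:

# Assumes standard OCF return codes: 0=OCF_SUCCESS, 7=OCF_NOT_RUNNING,
# 1=OCF_ERR_GENERIC. Hypothetical helper, not the real RA implementation.
monitor_via_exec() {
    podman exec "$1" /bin/true
    case "$?" in
        0)   return 0 ;;  # exec succeeded: container is running
        125) return 7 ;;  # no such container: treat as not running
        126) return 7 ;;  # container exists but is not running
        *)   return 1 ;;  # anything else is a generic error
    esac
}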
@TomSweeneyRedHat
Member

LGTM and happy green test buttons.

@mheon
Member

mheon commented Jun 12, 2019

/lgtm

@openshift-ci-robot added the lgtm label Jun 12, 2019
@openshift-merge-robot merged commit 9faff31 into containers:master Jun 12, 2019
@rh-atomic-bot mentioned this pull request Jun 12, 2019