libpod: simplify WaitForExit() #23601
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Luap99. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
We were not able to find or create Copr project. Unless the HTTP status code above is >= 500, please check your configuration.
libpod/container_api.go
Outdated
func waitForConmonExit(conmonPID, conmonPidFd int, pollInterval time.Duration) error {
	if conmonPidFd > -1 {
		fds := []unix.PollFd{{Fd: int32(conmonPidFd), Events: unix.POLLIN}}
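For context, a minimal sketch of what the complete function might look like, assuming the pidfd comes from pidfd_open() and that -1 means pidfd support is unavailable; the fallback branch and error handling here are assumptions, not the exact PR code:

package libpod

import (
	"time"

	"golang.org/x/sys/unix"
)

// Sketch: block until the conmon process exits. A pidfd becomes readable
// (POLLIN) once the process has exited; without one, fall back to probing
// the pid with signal 0 at pollInterval.
func waitForConmonExit(conmonPID, conmonPidFd int, pollInterval time.Duration) error {
	if conmonPidFd > -1 {
		fds := []unix.PollFd{{Fd: int32(conmonPidFd), Events: unix.POLLIN}}
		for {
			if _, err := unix.Poll(fds, -1); err != nil {
				if err == unix.EINTR {
					continue // interrupted by a signal, retry
				}
				return err
			}
			return nil
		}
	}
	for {
		// kill(pid, 0) delivers no signal; it only checks for existence.
		if err := unix.Kill(conmonPID, 0); err != nil {
			if err == unix.ESRCH {
				return nil // conmon is gone
			}
			return err
		}
		time.Sleep(pollInterval)
	}
}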
Does this work on FreeBSD? Otherwise will need to move over to one of the _linux files
yes, because getConmonPidFd() is already platform specific and the BSD stub returns -1, just like in the case where it is not supported on Linux.
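To illustrate the pattern being described, a hedged sketch of such a platform split (the file layout and method shape are assumptions, not verbatim podman code):

// container_internal_linux.go
//go:build linux

package libpod

import "golang.org/x/sys/unix"

// Linux: obtain a pidfd for conmon so its exit can be polled reliably.
func (c *Container) getConmonPidFd() int {
	fd, err := unix.PidfdOpen(c.state.ConmonPID, 0)
	if err != nil {
		return -1 // old kernel, or pidfd otherwise unavailable
	}
	return fd
}

// container_internal_freebsd.go
//go:build freebsd

package libpod

// FreeBSD: no pidfd support; the stub returns -1 so callers take the
// same fallback path as a Linux kernel without pidfd support.
func (c *Container) getConmonPidFd() int {
	return -1
}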
libpod/container_api.go
Outdated
	return -1, err
}

if err := c.checkExitFile(); err != nil {
This can return nil if the exit file doesn't exist - as it might if Conmon was killed by a SIGKILL. I think we get into a bad state in that case - the container will still be marked as running.
we should not get into a bad state, as this is just the wait call; cleanup/stop and other podman processes would still be able to clean up correctly, I think.
However, there is the issue that we call GetContainerExitCode(), which would not have the current exit code, so we might return an old exit code from there instead of an error. I guess I need to handle that somehow.
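A hedged sketch of the kind of handling being discussed, assuming the checkExitFile() helper from the diff and the usual exit-code fields on the container state; the error path is an assumption, not the PR's final code:

package libpod

import "fmt"

// Sketch: once conmon is gone, refresh state from the exit file, but do
// not trust an exit code that predates this run. If conmon was SIGKILLed
// the exit file is never written, so return an error instead of a stale
// code from GetContainerExitCode().
func (c *Container) exitCodeAfterConmonExit() (int32, error) {
	if err := c.checkExitFile(); err != nil {
		return -1, err
	}
	if !c.state.Exited {
		return -1, fmt.Errorf("container %s: conmon exited but no exit file was written", c.ID())
	}
	return c.state.ExitCode, nil
}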
Well, the comment about a deadlock in kube play is true apparently... although I really do not follow the logic there.
/packit copr-build
if err := v.update(); err != nil {
	return err
}
// DANGEROUS: Do not lock here yet because we might need to remove containers first.
I'm thinking we should have a "Removing" bool in volumes that prevents further use once it's set. Would solve the container-addition-during-removal problem. But that seems like a separate PR.
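A minimal sketch of what that guard could look like; the Removing field, addContainer helper, and error text are hypothetical, not part of this PR:

// Hypothetical: a one-way Removing flag set before teardown starts, so
// concurrent attempts to attach a container fail instead of racing the
// removal.
func (v *Volume) markRemoving() {
	v.lock.Lock()
	defer v.lock.Unlock()
	v.state.Removing = true
}

func (v *Volume) addContainer(ctrID string) error {
	v.lock.Lock()
	defer v.lock.Unlock()
	if v.state.Removing {
		return fmt.Errorf("volume %s is being removed, refusing to add container %s", v.Name(), ctrID)
	}
	// ... record ctrID as a user of the volume ...
	return nil
}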
yes, and the longer I think about it the more I think we might have similar issues around networks, secrets, etc.
The code is flaky somehow; the question is, should podman wait even guarantee that the ctr is removed in this case? We just wait for exit, not removal...
So we have a race of the cleanup process against podman wait by the look of things. I think the expectation would be that podman wait on a --rm container would wait until it's gone, but... I don't think the docs say it will do more than wait for exit?
I don't understand how this even works here: we wait on the conmon pid, and conmon waits for podman container cleanup to finish, and cleanup should have removed the ctr. I wonder if it is related to the super short-lived true process, because podman wait might never see the running conmon pid.
Do we actually wait for the cleanup process as rootless? We double fork to
join the right namespace and that ought to detach us from Conmon?
Which we? Conmon waits for the result of the podman container cleanup unless it was killed, so if we wait for the conmon pid to exit (which my rewrite does), that cannot fail unless the pid was dead already; then this wait code exits early, as we are still locked at that point, so cleanup had no chance to run yet.
We meaning Conmon. I thought that, as part of rootless init/joining the rootless userns, Podman would do some forks, which could cause Conmon to think Podman had exited. On looking more, I do not believe that's the case, though.
Sure, we do that, but podman run doesn't matter here: conmon exec's podman container cleanup once the ctr process exits and then waits on the cleanup process to finish; the parent process doesn't interact with that process at all and is not relevant.
Ok so the problem was basically that ...
LGTM
/lgtm
Force-pushed from a60f611 to 9c6f5dd (Compare)
The current code did several complicated state checks that simply do not work properly on a fast-restarting container. It used a special case for --restart=always but forgot to take care of --restart=on-failure, which always hung for 20s until it ran into the timeout. The old logic also used to call CheckConmonRunning(), but it synced the state first, which means it may have checked a new conmon every time and thus missed exits. The new code is much simpler: check the conmon pid, and once it is no longer running, check the exit file and read the exit code. This is related to containers#23473, but I am not sure if this fixes it because we cannot reproduce. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Init containers are meant to exit early, before other containers are started. Thus stopping the infra container in such a case is wrong. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Now that on-failure exits right away, the test is racy, as the RestartCount is not at the value we expect while the container is still restarting in the background. As such, add a timer-based approach. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
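A sketch of that timer-based approach in the Ginkgo/Gomega style of podman's e2e tests; the container name and expected count are placeholders, not the actual test values:

// Poll RestartCount until it settles instead of asserting once, since
// the container may still be restarting in the background.
Eventually(func() string {
	inspect := podmanTest.Podman([]string{"inspect", "--format", "{{.RestartCount}}", "testctr"})
	inspect.WaitWithDefaultTimeout()
	return inspect.OutputToString()
}, "20s", "250ms").Should(Equal("2"))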
We cannot take the volume lock first and then the container locks. Other code paths always have to lock the container first and then lock the volumes, i.e. to mount/umount them. As such, locking the volume first can always result in ABBA deadlocks. To fix this, move the volume lock down after the container removal. The removal code is racy regardless of the lock, as the volume lock on create is no longer taken since commit 3cc9db8 due to another deadlock there. Fixes containers#23613 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
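A hedged sketch of the ordering the commit describes: remove dependent containers first, taking only container locks, and take the volume lock afterwards. The helper names are illustrative, not podman's actual API:

// Illustrative only: respect the global container -> volume lock order.
func (r *Runtime) removeVolumeAndContainers(v *Volume) error {
	// Phase 1: remove containers using the volume; each removal takes
	// only the container lock (and volume locks in the correct order).
	ctrs, err := r.state.VolumeContainers(v) // hypothetical lookup
	if err != nil {
		return err
	}
	for _, ctr := range ctrs {
		if err := r.removeDependentContainer(ctr); err != nil { // hypothetical
			return err
		}
	}
	// Phase 2: only now lock the volume. No container lock is held, so
	// paths that lock container-then-volume cannot deadlock against us.
	v.lock.Lock()
	defer v.lock.Unlock()
	return r.removeVolume(v) // hypothetical low-level removal
}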
removeVolume() already does the same check so we do not need it twice. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Waiting now actually makes sure to exit on the first container exit. Also note that it does not wait for --rm to have the container removed at this point. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Ok this should be good now, so much for a "simple" change.
/lgtm
Merged b282902 into containers:main
The current code did several complicated state checks that simply do not work properly on a fast-restarting container. It used a special case for --restart=always but forgot to take care of --restart=on-failure, which always hung for 20s until it ran into the timeout.
The old logic also used to call CheckConmonRunning(), but it synced the state first, which means it may have checked a new conmon every time and thus missed exits.
The new code is much simpler: check the conmon pid, and once it is no longer running, check the exit file and read the exit code.
This is related to #23473, but I am not sure if this fixes it because we cannot reproduce.
Does this PR introduce a user-facing change?