Pass back the pid of runc:[1:CHILD] so we can wait on it #1506
// Clean up the zombie parent process
firstChildProcess, err := os.FindProcess(pid.PidFirstChild)
if err != nil {
I'm not convinced that propagating the error here makes sense (especially since you might get ECHILD -- I haven't checked the Go source yet). Maybe we should do something like this:
    if firstChildProcess, err := os.FindProcess(pid.PidFirstChild); err == nil {
        if _, err := firstChildProcess.Wait(); err != nil {
            return err
        }
    }
Also, are we sure it's not possible to get PidFirstChild = -1?
os.FindProcess can't fail on Unix systems. https://godoc.org/os#FindProcess
On Unix systems, FindProcess always succeeds and returns a Process for the given pid, regardless of whether the process exists.
> Also are we sure it's not possible to get PidFirstChild = -1?
Pretty sure it is impossible, since by this point the grandchild's pid has been received. This implies that the child has already been created.
I can just change the nsenter function to differentiate between child and grandchild.
Anyways, let me know what changes you want to see.
What would process.Wait return if the process doesn't exist?
In which case the Wait is what will fail if the process doesn't exist -- so maybe something more like
    if _, err := firstChildProcess.Wait(); err != nil {
        if err != unix.ECHILD && err != unix.ESRCH {
            return err
        }
    }
Maybe?
Is the concern that runc:[1:CHILD] has already been reaped by an external reaper? Otherwise it should always exist (and wait(2) can't fail).
libcontainer/process_linux.go
if err != nil {
return err
}
if _, err := firstChildProcess.Wait(); err != nil {
As discussed IRL, basically this should either just ignore the error:
    _, _ = firstChildProcess.Wait()
Or explicitly special-case unix.ESRCH or unix.ECHILD, but I'm not convinced that makes much more sense than just ignoring the error (if you grep for Wait you'll find it).
libcontainer/process_linux.go
if err != nil {
return err
}
if _, err := firstChildProcess.Wait(); err != nil {
Same comment here.
This allows the libcontainer to automatically clean up runc:[1:CHILD] processes created as part of nsenter. Signed-off-by: Alex Fang <littlelightlittlefire@gmail.com>
0d21c21 to e92add2
Pull request updated
does this mean we can run
@c4milo Probably, I would have to test it to find out, but this was intended to solve the more general intermediate zombie issue.
Use mainstream runc again now that PR opencontainers/runc/pull/1506 is merged
Fixes #1443