Question: Does a race condition exist in the "Into the woods" implementation? #22

Closed
ajones-miovision opened this issue Mar 2, 2023 · 2 comments

Comments

@ajones-miovision

I may be wrong here, but is there a race condition in the "Into the woods" implementation?
The parent calls wait4 on the pid of the child it forked, blocking so that pid 1 is not killed until the forked process exits. Great. However, when that original child eventually dies, it is possible (depending on context switching within the Go runtime) that the wait4 in reapChild (line 61) handles that exit before the original wait4 in the main thread does. In that case I think pid 1 will never exit, because it will be waiting on a pid that has already been reaped by the reaper. Thoughts?

@ajones-miovision
Author

ajones-miovision commented Mar 2, 2023

OK, so I wrote a small C program that tries to replicate the issue, and it seems both waits return, so my assumption that one would get stuck waiting forever is false. Both return, but only one receives the exit status.
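
For readers following along, here is a minimal sketch of that kind of experiment. It is not the program mentioned above (which was not posted); the `double_wait` helper and its result struct are hypothetical names, and the two waits run back to back in one thread rather than concurrently. It assumes Linux/POSIX semantics, a default SIGCHLD disposition, and a calling process with no other children.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type: what each of the two waits observed. */
struct double_wait_result {
    int   reaper_status; /* exit code seen by the wait-for-any call, -1 if none */
    pid_t second_ret;    /* return value of the later pid-specific wait */
    int   second_errno;  /* errno of the later wait (0 if it succeeded) */
};

/* Fork a child that exits with child_exit_code, reap it with a
 * wait-for-any call (standing in for the reaper), then wait on its
 * pid again (standing in for the main thread). */
struct double_wait_result double_wait(int child_exit_code) {
    struct double_wait_result r = { -1, 0, 0 };
    int status = 0;

    pid_t pid = fork();
    if (pid < 0) {              /* fork failed: report it and bail out */
        r.second_ret = -1;
        r.second_errno = errno;
        return r;
    }
    if (pid == 0)
        _exit(child_exit_code); /* child: exit immediately */

    /* "Reaper": wait for any child, retrying if interrupted. */
    pid_t got;
    do {
        got = waitpid(-1, &status, 0);
    } while (got == -1 && errno == EINTR);
    if (got == pid && WIFEXITED(status))
        r.reaper_status = WEXITSTATUS(status);

    /* "Main thread": wait on the specific pid, which is already reaped. */
    do {
        r.second_ret = waitpid(pid, &status, 0);
    } while (r.second_ret == -1 && errno == EINTR);
    r.second_errno = (r.second_ret == -1) ? errno : 0;
    return r;
}
```

Calling `double_wait(7)` should show the first wait collecting exit code 7 and the second wait returning -1 with `errno` set to `ECHILD` rather than blocking, which matches the observation that both waits return but only one receives the exit status.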

@ramr
Owner

ramr commented Mar 4, 2023

Cool - am glad you figured it out.

Yeah, in your example, where the parent's wait[4] code executes after the child exits (note: exited but not yet reaped), the exit leaves the child process in a "waitable" state, so a subsequent wait* call returns the child's status information.

The *nix rationale is something akin to: a parent should be free to do some other work ("chores") whilst the child is playing, and come back and clean up afterwards!

Note that if the child process is never wait[ed] on, aka never cleaned up, you'll end up with a zombie process.

Also note that wait [wait{2,4} are wrappers] can return whenever the child process has changed "state". In your case that is termination, but it could well be a suspend/continue workflow, aka SIG{STOP,TSTP,CONT}. Hence the check for ECHILD if the syscall wasn't interrupted.
