This repository was archived by the owner on Mar 24, 2026. It is now read-only.
Merged
The old implementation had a race condition with wait(): the first process to exit successfully would sometimes (often?) cause wait() to return, leaving the remaining child processes orphaned as TFBReaper exited. The new implementation drops wait() in favor of simply entering an infinite loop and relying on the suite to handle process cleanup. This has the added wrinkle that TFBReaper remains as a 'defunct' process, because it has been killed forcibly and is waiting for its parent (the TFB Python suite) to reap it or exit. The result was hundreds of defunct TFBReaper processes left behind as a full benchmark neared its conclusion (each test would spawn a new, eventually defunct TFBReaper). The fix for this issue is, ironically, the same behavior that caused the original problem: have TFBReaper fork a child process and exit the parent. This causes TFBReaper's child process to become orphaned and adopted by init(1), which by design reaps defunct processes almost immediately.
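The fork-and-exit-the-parent idea described above can be sketched in Python. This is only an illustration of the general pattern, not TFBReaper's actual code; the function name `fork_and_orphan` and the pipe-based handshake are assumptions made for the demo. It shows that once the intermediate process exits, its child is no longer parented by it (on a typical Linux system the new parent is init or a subreaper).

```python
import os


def fork_and_orphan():
    """Fork an intermediate process that forks a child and exits at once.

    Returns (intermediate_pid, child_pid, child_ppid), where child_ppid is
    read by the child only after the intermediate process has been reaped.
    Hypothetical helper for illustration only.
    """
    go_r, go_w = os.pipe()    # parent -> child: "the intermediate is gone"
    out_r, out_w = os.pipe()  # child -> parent: "<pid> <ppid>"

    mid = os.fork()
    if mid == 0:
        # Intermediate process (stands in for the original reaper process).
        if os.fork() == 0:
            # Child: wait for the go-ahead, then report its current parent.
            os.close(go_w)
            os.close(out_r)
            os.read(go_r, 1)
            os.write(out_w, f"{os.getpid()} {os.getppid()}".encode())
            os._exit(0)
        # Exit immediately; this orphans the child.
        os._exit(0)

    # Original parent (stands in for the suite).
    os.close(go_r)
    os.close(out_w)
    os.waitpid(mid, 0)        # reap the intermediate, so it never lingers as defunct
    os.write(go_w, b"x")      # let the child look up its new parent
    child_pid, child_ppid = map(int, os.read(out_r, 64).split())
    os.close(go_w)
    os.close(out_r)
    return mid, child_pid, child_ppid


if __name__ == "__main__":
    mid, child, ppid = fork_and_orphan()
    print(f"intermediate={mid} child={child} child's parent is now {ppid}")
```

Because the orphaned child belongs to init (or a subreaper) rather than to the suite, the suite no longer has to reap it when it is eventually killed.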
Also fixes a couple of errant frameworks that were not cleanly exiting their start shells.
If a script fails, TFBReaper should fail out as well so that the suite doesn't hang forever waiting on a train that ain't coming.
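A minimal sketch of that fail-fast behavior, assuming the start script is launched as a subprocess. The names `run_start_script` and `reaper_main` are hypothetical and do not come from the suite.

```python
import subprocess
import sys


def run_start_script(argv):
    """Run the start script and return its exit status (hypothetical helper)."""
    return subprocess.run(argv).returncode


def reaper_main(argv):
    """Exit with a failure status as soon as the start script fails."""
    code = run_start_script(argv)
    if code != 0:
        # Fail out so that whoever is waiting on this process sees the error
        # instead of waiting forever.
        sys.exit(1)
    return code


if __name__ == "__main__":
    reaper_main([sys.executable, "-c", "raise SystemExit(0)"])
```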
Essentially, sending SIGTERM to the wrong process before the right one could result in a fork() whose new process never receives SIGTERM. The SIGKILL ends up cleaning this out anyway, but we really want to avoid relying on that if possible. Sending SIGSTOP to all the processes before SIGTERM suspends them, so none of them can call fork() in the meantime. Then, when we send SIGTERM, we should not have any rogue processes left over from unprotected fork()ing. Also dropped some unneeded sleep() calls that just slowed down each test run.
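The stop-then-terminate ordering might look roughly like the sketch below. The helper name `stop_then_terminate` is hypothetical, and the trailing SIGCONT is an assumption that the commit message does not mention: on POSIX systems a stopped process does not act on a pending SIGTERM until it is continued, so the sketch resumes each process after queueing the signal.

```python
import os
import signal
import subprocess
import sys


def stop_then_terminate(pids):
    """Freeze every process first, then terminate them (hypothetical helper).

    Returns the list of pids that were still alive when SIGSTOP was sent.
    """
    stopped = []
    # Pass 1: suspend everything so nothing can fork() during teardown.
    for pid in pids:
        try:
            os.kill(pid, signal.SIGSTOP)
            stopped.append(pid)
        except ProcessLookupError:
            pass  # already gone
    # Pass 2: queue SIGTERM, then resume so the pending signal is delivered.
    # (The SIGCONT step is an assumption, not taken from the commit message.)
    for pid in stopped:
        for sig in (signal.SIGTERM, signal.SIGCONT):
            try:
                os.kill(pid, sig)
            except ProcessLookupError:
                break
    return stopped


if __name__ == "__main__":
    proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    stop_then_terminate([proc.pid])
    proc.wait(timeout=10)
    print("exit status:", proc.returncode)
```

A SIGKILL pass for any stragglers, as the commit message describes, would follow this step as a fallback.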
Now pauses to keep from spiking CPU
This task will lead the way toward the database setup improvements.
As the name suggests - this pull request aims at removing `testrunner` as a requirement for the suite.