This repository was archived by the owner on Mar 24, 2026. It is now read-only.

Remove testrunner #2407

Merged
msmith-techempower merged 28 commits into master from remove-testrunner on Dec 15, 2016


@msmith-techempower
Member

As the name suggests, this pull request removes testrunner as a requirement for the suite.

The old implementation had a race condition with wait():
sometimes (often?) the first process to exit successfully
would cause wait() to return, and TFBReaper would then exit,
leaving the remaining child processes orphaned.

The new implementation drops wait() in favor of simply
entering an infinite loop and relying on the suite to handle
the process cleanup.

This new implementation has the added wrinkle of TFBReaper
remaining as a 'defunct' process: it had been forcibly
terminated and was waiting for its parent (the TFB Python
suite) to reap it or exit. This would result in hundreds of
defunct TFBReaper processes lingering as a full benchmark
neared conclusion (each test would spawn a
new-and-eventually-defunct TFBReaper). The fix for this
issue is actually the original problem: have TFBReaper
fork a child process and exit the parent.

This causes TFBReaper's child process to become orphaned
and adopted by init(1), which by design reaps defunct
processes almost immediately.
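
As a rough illustration of the fork-and-exit-the-parent idea described above, here is a minimal Python sketch (not the actual TFBReaper code). The name `orphaned_parent_pid` and the pipe used to report back are hypothetical, added only so the effect can be observed. Note that on some systems the adopting process may be a subreaper rather than PID 1.

```python
import os
import time


def orphaned_parent_pid(timeout=5.0):
    """Fork a stand-in 'reaper' that forks a child and exits at once.

    Returns (reaper_pid, parent pid the child sees after the reaper exits).
    """
    read_fd, write_fd = os.pipe()
    reaper_pid = os.fork()
    if reaper_pid == 0:
        # Stand-in for the reaper: fork once, then the parent half exits.
        os.close(read_fd)
        original_parent = os.getpid()
        if os.fork() > 0:
            os._exit(0)  # parent half exits; the child is now an orphan
        # Child half: wait until the kernel has re-parented this process.
        deadline = time.monotonic() + timeout
        while os.getppid() == original_parent and time.monotonic() < deadline:
            time.sleep(0.01)
        os.write(write_fd, str(os.getppid()).encode())
        os._exit(0)
    # Caller: reap the short-lived reaper so it does not linger as defunct.
    os.close(write_fd)
    os.waitpid(reaper_pid, 0)
    with os.fdopen(read_fd) as pipe:
        new_parent = int(pipe.read())
    return reaper_pid, new_parent
```

Calling `orphaned_parent_pid()` should return a second value that differs from the first, showing that the child no longer belongs to the process that forked it.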

Also fixes a couple of errant frameworks that were
not cleanly exiting their start shells.

If a script fails, TFBReaper should fail out as well
so that the suite doesn't hang forever waiting on a
train that ain't coming.
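
A minimal sketch of that fail-fast behavior, assuming the start script is run as a subprocess. The helper name `run_or_fail` is hypothetical and not taken from the suite:

```python
import subprocess


def run_or_fail(command):
    """Run a start script and exit with its status if it fails."""
    result = subprocess.run(command)
    if result.returncode != 0:
        # Propagate the failure instead of idling forever.
        raise SystemExit(result.returncode)
    return 0
```

With this shape, whatever is watching the wrapper sees a non-zero exit status right away rather than waiting on a process that will never become ready.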

Essentially, sending SIGTERM to the wrong process before
the right one could leave time for a fork() in between,
producing a rogue process that never receives SIGTERM.
The SIGKILL ends up cleaning this out anyway, but we
really want to avoid that if possible.

Sending SIGSTOP to all the processes before SIGTERM
suspends them, so none of them can call fork(). Then, when
we send SIGTERM, we should not have any rogue processes
as the result of unprotected fork()ing.

Also, dropped some unneeded sleep() calls that just
slowed down each test run.
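
The stop-then-terminate ordering described above might look roughly like this in Python. This is a sketch under assumptions, not the suite's code: `terminate_all` and `_signal_all` are hypothetical names, and the SIGCONT step is an addition of this sketch, since a stopped process does not act on a pending SIGTERM until it is resumed.

```python
import os
import signal
import time


def _signal_all(pids, sig):
    """Send sig to every pid, skipping any that have already gone."""
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass


def terminate_all(pids, grace=1.0):
    """Freeze every process, then ask all of them to terminate."""
    _signal_all(pids, signal.SIGSTOP)  # suspended processes cannot fork()
    _signal_all(pids, signal.SIGTERM)  # stays pending while stopped
    _signal_all(pids, signal.SIGCONT)  # assumption: resume so SIGTERM is handled
    time.sleep(grace)                  # give handlers a moment to finish
    _signal_all(pids, signal.SIGKILL)  # last resort for anything still alive
```

Because every process is frozen before the first SIGTERM goes out, none of them can spawn a new child partway through the sweep.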

The loop now pauses to keep from spiking the CPU.
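
For illustration only, an idle loop that sleeps between checks instead of spinning could be sketched as below. The `idle_until` helper and its stop predicate are hypothetical; the description above implies an unconditional loop, and the predicate exists here only so the sketch can terminate.

```python
import time


def idle_until(should_stop, interval=1.0):
    """Block until should_stop() is true, sleeping between checks."""
    wakeups = 0
    while not should_stop():
        time.sleep(interval)  # yield the CPU instead of busy-waiting
        wakeups += 1
    return wakeups
```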
@msmith-techempower
Member Author

This task will lead the way toward the database setup improvements.

@msmith-techempower merged commit 280fb7f into master on Dec 15, 2016
@msmith-techempower deleted the remove-testrunner branch on December 15, 2016 22:01