Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make pantsd signal integration tests more reliable by setting the timeout #7504

Merged

Conversation

Projects
None yet
3 participants
@cosmicexplorer
Copy link
Contributor

commented Apr 5, 2019

Problem

An attempt to resolve #7425. As mentioned on that ticket, the default (reasonable) timeout for waiting for pailgun subprocesses to complete after a signal (--pantsd-pailgun-quit-timeout) was 1 second. I believe the remaining issue here is that some invocations in CI would actually take more than a second to quit, which would cause nondeterministic behavior.

Solution

  • Set the timeout to 5 seconds on any test which intends for its subprocess to complete within the timeout.
  • Match more of the stderr contents in each error case to remove ambiguity.

Result

TestPantsDaemonIntegration.test_pantsd_sigterm hopefully won't continue to flake!!

cosmicexplorer added some commits Apr 5, 2019

@illicitonion
Copy link
Contributor

left a comment

Thanks!

@@ -418,8 +418,20 @@ def test_pantsd_multiple_parallel_runs(self):
self.assert_success(creator_handle.join())
self.assert_success(waiter_handle.join())

def _assert_pantsd_keyboardinterrupt_signal(self, signum, messages, regexps=[],
def _assert_pantsd_keyboardinterrupt_signal(self, signum,
single_msg=None, messages=[], regexps=[],

This comment has been minimized.

Copy link
@illicitonion

illicitonion Apr 5, 2019

Contributor

single_msg doesn't seem to pull its weight vs just having the caller wrap a string in a list... Most of the callers still use a single-element messages, but having both here opens up:

  1. Questions of whether single_msg and messages are mutually exclusive options
  2. Makes messages have a default mutable value, which can lead to errors

I'd probably ditch single_msg :)

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 6, 2019

Author Contributor

Agreed!

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 6, 2019

Author Contributor

Well, single_msg is intended to make it easier to check whether a message matches the output exactly. But this can be done with regexps anyway (actually, the whole thing can be done with regexps anyway, if we force the use of re.escape()). I will try narrowing this to just allow regexps and see if that makes things confusing (since this method is only being called in a few places in this file).

Interrupted by user:
Interrupted by user over pailgun client!
$"""],
quit_timeout=5.0)

This comment has been minimized.

Copy link
@illicitonion

illicitonion Apr 5, 2019

Contributor

Should we just bump the default value to 5.0, rather than overriding this in tests? Ideally the default value would cover the common cases we care about, and "on Travis" feels like a common case we care about...

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 6, 2019

Author Contributor

100% agree with this!

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 6, 2019

Author Contributor

Or actually, I agree with the part about bumping the default value, but I would love the default for interactive usage to be less than 5 seconds, since in the case where the pantsd-runner process is unresponsive except to a SIGKILL that leads to 5 seconds of the user not having control of their terminal. In CI in general however (not just in upstream pants) I think we absolutely want to increase this (likely to 5 seconds) since we don't have interactive users sending ctrl-c to pants.

I would love to set the default depending on whether stdin is a tty, I'll open up a separate issue to discuss that. For now I think the best of both worlds might be to increase the hardcoded default to 5 seconds, but also to immediately print an info log saying "pants is waiting seconds for the subprocess to time out" before waiting for the timeout.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Apr 6, 2019

Author Contributor

Made #7514 for the possibility of decreasing the hardcoded default if at a tty. The point about:

Ideally the default value would cover the common cases we care about, and "on Travis" feels like a common case we care about...

is great, and I'll try to remember that in the future. Thanks!

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:fix-pants-sigterm-test branch from 52c33c2 to e1d3427 Apr 6, 2019

@cosmicexplorer cosmicexplorer merged commit c443c45 into pantsbuild:master Apr 7, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@ity
Copy link
Contributor

left a comment

looks good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.