HDDS-10174. Enable Surefire process checker #6075
Merged
adoroszlai merged 2 commits into apache:master on Jan 24, 2024
This reverts commit a41dae4. Reason for revert: new Surefire fails to kill fork after surefire.timeout
aswinshakil (Member) approved these changes on Jan 23, 2024:
LGTM+1. Great find. Thanks for the patch @adoroszlai
hemantk-12 (Contributor) approved these changes on Jan 23, 2024:
Thanks for the patch @adoroszlai
LGTM.
adoroszlai (Contributor, Author):
Thanks @aswinshakil, @hemantk-12 for the review.
This was referenced Mar 16, 2024
What changes were proposed in this pull request?
The Ozone build configures a 20-minute fork timeout (ozone/pom.xml, line 286 in 73e6f90), but it is not applied, because the facility to check the fork is disabled by default.
So instead of the fork being killed after 20 minutes, any runaway test is caught only by the GitHub job timeout, which is 2.5 hours for integration tests. The problem is even worse for the flaky-test-check, which gives jobs more time (6 hours).
This change enables the process checkers.
However, the current version of Surefire doesn't seem to kill the fork after the timeout even with process checkers enabled, so we need to downgrade to 3.0.0-M5.
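For illustration, the combination described above could look roughly like the following POM fragment. This is a minimal sketch, not the actual diff: the placement inside a single `<plugin>` block, the literal `1200` (20 minutes in seconds), and the `all` checker value are assumptions. `forkedProcessTimeoutInSeconds` (user property `surefire.timeout`) and `enableProcessChecker` are standard Surefire parameters.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <!-- Version the description says is needed for the timeout to actually kill the fork -->
  <version>3.0.0-M5</version>
  <configuration>
    <!-- 20 minutes, in seconds; can also be set with -Dsurefire.timeout=1200 -->
    <forkedProcessTimeoutInSeconds>1200</forkedProcessTimeoutInSeconds>
    <!-- Assumed value: Surefire accepts ping, native, or all; the checker is off when unset -->
    <enableProcessChecker>all</enableProcessChecker>
  </configuration>
</plugin>
```

The real Ozone build may set these through properties rather than literal values, so treat the fragment only as a reading aid for the description.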
https://issues.apache.org/jira/browse/HDDS-10174
How was this patch tested?
Tested with a simple repro, run using various versions of Surefire (including the latest, 3.2.5):
3.0.0-M6 and newer run for 5 minutes.
3.0.0-M5 and earlier time out after 1 minute, as requested.
CI:
https://github.com/adoroszlai/ozone/actions/runs/7625431710