-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-3309. Add TimedOutTestsListener to surefire and add timeout to integration tests #813
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @smengcl for working on this. I tested the change by configuring very short timeout. Here are my observations:
- The
parallel-tests
profile is not enabled by default. The listener should be configured globally, outside of profiles to take effect. I think it should be added to the rootpom.xml
here:
TimedOutTestsListener
is not notified of "fork timeouts", only of individual test timeout, ie. the kind which you wanted to add in HDDS-3309. Add timeout to all integration tests #750. So I think we need both changes.
Thanks for the comment @adoroszlai . I can confirm I also played with the maven surefire listener a bit. Some observations:
|
Looks like hadoop-hdds can't locate the class in hadoop-ozone |
I started working on a similar listener (for debugging disk space usage in tests) and ran into this classpath issue. Solved it by moving these test utilities to a new module. I'll create a PR with the base change to unblock you. With that change you'll be able to add the |
Thanks. btw It looks to me the root cause of the issue is that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank @smengcl for continuing with this. Having moved TimedOutTestsListener
to a separate module, it is now possible to add it globally. Can you please update this PR by removing the unnecessary change in TimedOutTestsListener
? Your change in pom.xml
should be fine for a new CI run.
hadoop-hdds/common/src/test/java/org/apache/hadoop/test/TimedOutTestsListener.java
Outdated
Show resolved
Hide resolved
@adoroszlai I have rebased the jira and added commits similar to #750 . I have also left all ignored integration test classes untouched from the timeout change. Please take a look. :) |
This reverts commit 8dbbde83e1485ffe0ece2b40e822fdf17c466638.
Rebased again since HDDS-3602 is merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @smengcl for continuing work on this. There are few files with whitespace-only change, and several with ignored test class but timeout being added. Please let me know if you would like a list of these. Otherwise it looks good to me.
Thanks @adoroszlai for another review. I have addressed the whitespace and removed timeout from more ignored tests. |
Thanks, the update looks good. |
Thanks @elek for triggering the recheck. The latest it-client failure is caused by flakiness of But I see a green run on my dev branch: smengcl@843f1d8 Will merge shortly. |
In case of flaky tests I would suggest disabling them (with opening a Jira + saving log + commit @ignore without review) and repeat test. I think it provides better safety. |
What changes were proposed in this pull request?
Add
TimedOutTestsListener
as a listener to maven-surefire-plugin like Hadoop does. And add timeout to integration tests that are not marked with@Ignore
at the time of the patch.Also previously discussed under #750.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-3309
How was this patch tested?
The patch should make the timed out tests print out the stack trace (if any). Checks itself.
Bonus
To test if the
TimedOutTestsListener
is working, you can change the timeout of a class to an impossibly small value like 5 sec (for an integration test to initialize). Then run that integration test. Make sure the test isn't ignored or it won't run withmvn test
.Note that the timeout must be smaller than
surefire.fork.timeout
shown below otherwise the fork will still be killed with no stack trace or thread dump.https://github.com/apache/hadoop-ozone/blob/2e9bae2f8636974c5c0e11a4a3e04d9cbfbb2288/pom.xml#L228
For example, in
TestOzoneManagerRestart
I removed the@Ignore
annotation and set the timeout to5000
ms. Then ran:mvn test -e -Dtest=TestOzoneManagerRestart
Maven output:
More importantly, the
surefire-reports
folder has the detailed logs with thread dump:Example thread dump in
surefire-reports/org.apache.hadoop.ozone.om.TestOzoneManagerRestart-output.txt
: