New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-2995. Add integration test for Recon's Passive SCM state. #594
Conversation
6958920
to
9654ed8
Compare
...zone/integration-test/src/test/java/org/apache/hadoop/ozone/recon/TestReconAsPassiveScm.java
Show resolved
Hide resolved
The IT failures are related. Working on fixing them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
I am not able to reproduce this issue locally. Trying to root cause this based of PR runs. |
/retest |
0e110f8
to
3d35547
Compare
/retest |
50d1efc
to
09fe453
Compare
The test failures seem to be unrelated. I will try one more time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @avijayanhwx for working on this. I ran the new test 20 times, it had 2 problems one time each:
Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 112.145 s <<< FAILURE! - in org.apache.hadoop.ozone.recon.TestReconAsPassiveScm
testDatanodeRegistrationAndReports(org.apache.hadoop.ozone.recon.TestReconAsPassiveScm) Time elapsed: 25.52 s <<< ERROR!
picocli.CommandLine$ExecutionException: Error while calling command (org.apache.hadoop.ozone.recon.ReconServer@1c68f6b8): java.lang.NullPointerException
at picocli.CommandLine.execute(CommandLine.java:1180)
...
at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:65)
at org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:536)
at org.apache.hadoop.ozone.recon.TestReconAsPassiveScm.init(TestReconAsPassiveScm.java:72)
...
Caused by: java.lang.NullPointerException
at org.apache.hadoop.ozone.recon.ReconServer.start(ReconServer.java:118)
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 126.987 s <<< FAILURE! - in org.apache.hadoop.ozone.recon.TestReconAsPassiveScm
testDatanodeRegistrationAndReports(org.apache.hadoop.ozone.recon.TestReconAsPassiveScm) Time elapsed: 40.297 s <<< FAILURE!
java.lang.AssertionError: expected:<3> but was:<0>
...
at org.apache.hadoop.ozone.recon.TestReconAsPassiveScm.testDatanodeRegistrationAndReports(TestReconAsPassiveScm.java:117)
https://github.com/adoroszlai/hadoop-ozone/runs/474216109
Logs: https://github.com/adoroszlai/hadoop-ozone/suites/487647417/artifacts/2328018
protected Optional<Integer> reconHttpPort = Optional.empty(); | ||
protected Optional<Integer> reconDatanodePort = Optional.empty(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: please use OptionalInt
.
reconServer = new ReconServer(); | ||
reconServer.execute(new String[]{}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can call startRecon()
instead to avoid duplicate code?
LOG.error("Exception while getting new container info from SCM", | ||
ioEx); | ||
return; | ||
synchronized (getContainerManager()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it sync on containerManager
variable?
Thanks for the review @adoroszlai & @swagle. Closing this PR since it is taking longer than expected to fix the flakiness of Recon's integration tests. This patch contains other important refactoring that need not wait for the integration tests to be added. I will be creating separate JIRAs for that. |
What changes were proposed in this pull request?
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-2995
How was this patch tested?
Manually tested.