HDDS-4167. Acceptance test logs missing if SCM fails to exit safe mode #1366
Merged
elek
approved these changes
Sep 1, 2020
+1 thanks for the patch.
Nice cleanup.
Thanks @elek for reviewing and committing it.
rakeshadr
pushed a commit
to rakeshadr/hadoop-ozone
that referenced
this pull request
Sep 3, 2020
errose28
added a commit
to errose28/ozone
that referenced
this pull request
Sep 11, 2020
* master: (26 commits)
  HDDS-4167. Acceptance test logs missing if fails during cluster startup (apache#1366)
  HDDS-4121. Implement OmMetadataMangerImpl#getExpiredOpenKeys. (apache#1351)
  HDDS-3867. Extend the chunkinfo tool to display information from all nodes in the pipeline. (apache#1154)
  HDDS-4077. Incomplete OzoneFileSystem statistics (apache#1329)
  HDDS-3903. OzoneRpcClient support batch rename keys. (apache#1150)
  HDDS-4151. Skip the inputstream while offset larger than zero in s3g (apache#1354)
  HDDS-4147. Add OFS to FileSystem META-INF (apache#1352)
  HDDS-4137. Turn on the verbose mode of safe mode check on testlib (apache#1343)
  HDDS-4146. Show the ScmId and ClusterId in the scm web ui. (apache#1350)
  HDDS-4145. Bump version to 1.1.0-SNAPSHOT on master (apache#1349)
  HDDS-4109. Tests in TestOzoneFileSystem should use the existing MiniOzoneCluster (apache#1316)
  HDDS-4149. Implement OzoneFileStatus#toString (apache#1356)
  HDDS-4153. Increase default timeout in kubernetes tests (apache#1357)
  HDDS-2411. add a datanode chunk validator fo datanode chunk generator (apache#1312)
  HDDS-4140. Auto-close /pending pull requests after 21 days of inactivity (apache#1344)
  HDDS-4152. Archive container logs for kubernetes check (apache#1355)
  HDDS-4056. Convert OzoneAdmin to pluggable model (apache#1285)
  HDDS-3972. Add option to limit number of items displaying through ldb tool. (apache#1206)
  HDDS-4068. Client should not retry same OM on network connection failure (apache#1324)
  HDDS-4062. Non rack aware pipelines should not be created if multiple racks are alive. (apache#1291)
  ...
ayushtkn
pushed a commit
to ayushtkn/hadoop-ozone
that referenced
this pull request
Oct 31, 2020
* HDDS-1577. Add default pipeline placement policy implementation. (apache#1366) (cherry picked from commit b640a5f6d53830aee4b9c2a7d17bf57c987962cd)
* HDDS-1571. Create an interface for pipeline placement policy to support network topologies. (apache#1395) (cherry picked from commit 753fc6703a39154ed6013e44dbae572391748906)
* HDDS-2089: Add createPipeline CLI. (apache#1418) (cherry picked from commit 326b5acd4a63fe46821919322867f5daff30750c)
* HDDS-1569 Support creating multiple pipelines with same datanode. Contributed by Li Cheng. This closes apache#28
* HDDS-1572 Implement a Pipeline scrubber to clean up non-OPEN pipeline. (apache#237)
* Rebase Fix
* HDDS-2650 Fix createPipeline CLI. (apache#340)
* HDDS-2035 Implement datanode level CLI to reveal pipeline relation. (apache#348)
* Revert "HDDS-2650 Fix createPipeline CLI. (apache#340)" This reverts commit 7c71710.
* HDDS-2650 Fix createPipeline CLI and make it message based. (apache#370)
* HDDS-1574 Average out pipeline allocation on datanodes and add metrcs/test (apache#291)
* Resolve rebase conflict.
* HDDS-2756. Handle pipeline creation failure in different way when it exceeds pipeline limit Closes apache#401
* HDDS-2115 Add acceptance test for createPipeline CLI and datanode list CLI (apache#375)
* HDDS-2115 Add acceptance test for createPipeline CLI and datanode list CLI.
* HDDS-2772 Better management for pipeline creation limitation. (apache#410)
* HDDS-2913 Update config names and CLI for multi-raft feature. (apache#462)
* HDDS-2924. Fix Pipeline#nodeIdsHash collision issue. (apache#478)
* HDDS-2923 Add fall-back protection for rack awareness in pipeline creation. (apache#516)
* HDDS-3007 Fix CI test failure for TestSCMNodeManager. (apache#550)
Co-authored-by: Sammi Chen <sammichen@apache.org>
Co-authored-by: Xiaoyu Yao <xyao@apache.org>
What changes were proposed in this pull request?
Acceptance tests sometimes fail because SCM does not come out of safe mode. When this happens, the cluster is stopped without running any Robot tests. The rebot command that processes the test results then fails due to missing input, and the acceptance check stops abruptly without fetching docker logs or running tests in other environments.
Fix:
* Run rebot processing only if input files are available (it is safe to let the final one in test-all.sh fail)
* […] ozone-mr, which contains multiple test sub-directories, to avoid an error in find
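The idea of running rebot only when it has something to process could be sketched roughly as follows. This is not the code from the patch: the function name merge_results, the *.xml layout of the result directory, and the REBOT override variable are all assumptions made for illustration. Only the rebot options shown (--nostatusrc, -N, -d) are standard Robot Framework options.

```shell
# Hypothetical helper (not from the patch): merge Robot output files with
# rebot, but only if at least one input file exists.
merge_results() {
  local result_dir="$1"
  local f

  # Reuse the positional parameters as the list of input files.
  set --
  for f in "${result_dir}"/*.xml; do
    # An unmatched glob stays literal, so check that the entry really exists.
    if [ -e "$f" ]; then
      set -- "$@" "$f"
    fi
  done

  if [ "$#" -eq 0 ]; then
    echo "No Robot output found in ${result_dir}, skipping rebot" >&2
    return 0
  fi

  # REBOT is an assumed override so the sketch can be tried without
  # Robot Framework installed (for example REBOT=echo).
  "${REBOT:-rebot}" --nostatusrc -N acceptance -d "${result_dir}" "$@"
}
```

With a guard like this, an environment whose cluster never started simply contributes no results, and the surrounding script can still go on to collect logs and run the remaining environments.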
And some cleanup:
* Deduplicate test-all.sh and ozone-mr/test.sh by extracting functions for the shared code being fixed
* Replace set +e; ...; set -e with if ! ...; then ... (partly belongs to HDDS-4101) in the code being fixed
https://issues.apache.org/jira/browse/HDDS-4167
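To illustrate the switch from set +e; ...; set -e to if ! ...; then ..., here is a minimal sketch. The functions start_cluster, collect_logs and run_env are hypothetical stand-ins, not names from the Ozone scripts. The point is only that a command tested by if does not trigger errexit, so no toggling is needed and the follow-up steps still run.

```shell
set -e

# Hypothetical stand-ins for the real steps (assumptions for illustration).
start_cluster() { return 1; }              # simulate SCM stuck in safe mode
collect_logs()  { echo "logs collected"; }

# Before: errexit is switched off and on around the command that may fail.
#   set +e
#   start_cluster
#   rc=$?
#   set -e

# After: the failure is handled in an if condition, where errexit does not apply.
run_env() {
  local rc=0
  if ! start_cluster; then
    rc=1
    echo "cluster failed to start" >&2
  fi
  collect_logs   # still runs, so logs end up in the bundle
  return "$rc"
}
```

The if form keeps errexit enabled for the rest of the script and makes the failure path explicit, which is easier to read than paired set calls.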
How was this patch tested?
Temporarily reduced wait time for exit from safe mode to 10 seconds, causing all tests to fail early. Verified that docker logs were still added to the bundle:
https://github.com/adoroszlai/hadoop-ozone/runs/1045059176
Regular CI:
https://github.com/adoroszlai/hadoop-ozone/runs/1045057585
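For readers who want to reproduce the early-failure scenario, a bounded wait could look roughly like the sketch below. The helper wait_until and the probe scm_out_of_safe_mode are hypothetical names, not taken from the Ozone test library. Only the 10-second value comes from the description above.

```shell
# Hypothetical polling helper (an assumption, not the project's code):
# retry a probe command until it succeeds or the timeout in seconds elapses.
wait_until() {
  local timeout="$1"
  shift
  local waited=0

  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Example with a hypothetical probe, using the shortened 10-second wait:
#   wait_until 10 scm_out_of_safe_mode
```

Lowering the timeout makes every environment fail during startup, which is a quick way to check that logs are still collected on that path.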