HDDS-1561: Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove #1401

lokeshj1703 · 2019-09-04T16:36:36Z

Right now, if a pipeline is destroyed by SCM, all the containers on the pipeline are marked as quasi-closed when the datanode receives the close container command. While processing the reports for these containers, SCM marks them as closed once a majority of the nodes are available.

This is, however, not a sufficient condition when the Raft log directory is missing or corrupted, because the containers will not have all the applied transactions.
To solve this problem, we should QUASI_CLOSE the containers in the datanode as part of Ratis groupRemove. If a container is in OPEN state in the datanode without any active pipeline, it will be marked as UNHEALTHY while processing the close container command.
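
As a rough illustration of the proposal, the sketch below quasi-closes every OPEN container that belongs to a pipeline whose Ratis group is being removed. All class, enum, and method names here (`SketchState`, `SketchContainer`, `GroupRemoveSketch.onGroupRemove`) are hypothetical stand-ins for illustration, not the Ozone classes touched by this patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified container states; not the real Ozone enum.
enum SketchState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }

// Hypothetical container record holding only what the sketch needs.
final class SketchContainer {
  final long id;
  final String pipelineId;
  SketchState state;

  SketchContainer(long id, String pipelineId, SketchState state) {
    this.id = id;
    this.pipelineId = pipelineId;
    this.state = state;
  }
}

final class GroupRemoveSketch {
  /**
   * Quasi-closes every OPEN container on the removed pipeline and
   * returns the IDs of the containers whose state was changed.
   */
  static List<Long> onGroupRemove(String removedPipelineId,
      List<SketchContainer> containers) {
    List<Long> changed = new ArrayList<>();
    for (SketchContainer c : containers) {
      if (c.pipelineId.equals(removedPipelineId)
          && c.state == SketchState.OPEN) {
        c.state = SketchState.QUASI_CLOSED;
        changed.add(c.id);
      }
    }
    return changed;
  }

  public static void main(String[] args) {
    List<SketchContainer> containers = Arrays.asList(
        new SketchContainer(1, "p1", SketchState.OPEN),
        new SketchContainer(2, "p2", SketchState.OPEN),
        new SketchContainer(3, "p1", SketchState.CLOSED));
    // Only container 1 is OPEN and on pipeline p1.
    System.out.println(onGroupRemove("p1", containers)); // prints [1]
  }
}
```

Containers on other pipelines, and containers that have already left the OPEN state, are left untouched in this sketch.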

nandakumar131

Overall the patch looks good.
Added some minor comments.

nandakumar131 · 2019-09-04T17:07:32Z

.../hadoop/ozone/container/common/statemachine/commandhandler/CloseContainerCommandHandler.java

+      } else if (closeCommand.getForce()) {
        // SCM told us to force close the container.
        controller.closeContainer(containerId);
      }


Not exactly related to this patch, but this part of code has become a little bit messy.
We should be able to refactor this.

    switch (container.getContainerState()) {
    case OPEN:
      controller.markContainerForClose(containerId);
    case CLOSING:
      final HddsProtos.PipelineID pipelineID = closeCommand.getPipelineID();
      final XceiverServerSpi writeChannel = ozoneContainer.getWriteChannel();
      if (writeChannel.isExist(pipelineID)) {
        writeChannel.submitRequest(getContainerCommandRequestProto(
            datanodeDetails, containerId), pipelineID);
      } else {
        controller.markContainerUnhealthy(containerId);
      }
      break;
    case QUASI_CLOSED:
      if (closeCommand.getForce()) {
        controller.closeContainer(containerId);
        break;
      }
    case CLOSED:
    case UNHEALTHY:
    case INVALID:
      LOG.debug("Cannot close the container #{}, the container is"
          + " in {} state.", containerId, container.getContainerState());
    }
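
To make the state handling in the suggested switch easier to follow, here is a self-contained sketch that models it as a pure function returning the actions taken, in order. The enum `CloseSketchState`, the class `CloseCommandSketch`, and the action strings are hypothetical; `pipelineExists` stands in for `writeChannel.isExist(pipelineID)` and `force` for `closeCommand.getForce()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the container states named in the review.
enum CloseSketchState { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY, INVALID }

final class CloseCommandSketch {
  /** Returns the actions a close command would trigger, in order. */
  static List<String> actionsFor(CloseSketchState state,
      boolean pipelineExists, boolean force) {
    List<String> actions = new ArrayList<>();
    switch (state) {
    case OPEN:
      actions.add("markContainerForClose");
      // Intentional fall through: the container is now treated as CLOSING.
    case CLOSING:
      // With no active pipeline the close cannot go through Ratis.
      actions.add(pipelineExists
          ? "submitCloseRequest" : "markContainerUnhealthy");
      break;
    case QUASI_CLOSED:
      if (force) {
        actions.add("closeContainer");
        break;
      }
      // Intentional fall through: without force there is nothing to do.
    case CLOSED:
    case UNHEALTHY:
    case INVALID:
      actions.add("logCannotClose");
      break;
    default:
      throw new IllegalStateException("Unknown state: " + state);
    }
    return actions;
  }

  public static void main(String[] args) {
    // An OPEN container without an active pipeline ends up unhealthy.
    System.out.println(actionsFor(CloseSketchState.OPEN, false, false));
    // prints [markContainerForClose, markContainerUnhealthy]
  }
}
```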

Addressed in the 2nd commit.

nandakumar131 · 2019-09-04T17:09:44Z

...a/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java

+      }
+    }
+  }
+


If markContainerForClose fails, quasiCloseContainer will definitely fail. We can put both of the calls into same try catch.
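
A minimal sketch of this suggestion, assuming both calls can throw `IOException`: the two method names come from the comment above, but the `SketchController` interface, its signatures, and the `quasiCloseAll` helper are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical controller; the signatures are assumed for illustration.
interface SketchController {
  void markContainerForClose(long containerId) throws IOException;
  void quasiCloseContainer(long containerId) throws IOException;
}

final class QuasiCloseSketch {
  /** Tries to quasi-close each container and returns the IDs that failed. */
  static List<Long> quasiCloseAll(SketchController controller,
      List<Long> containerIds) {
    List<Long> failed = new ArrayList<>();
    for (long id : containerIds) {
      try {
        // One try block covers both steps: if marking for close fails,
        // the quasi-close is skipped and the failure is recorded once.
        controller.markContainerForClose(id);
        controller.quasiCloseContainer(id);
      } catch (IOException e) {
        failed.add(id);
      }
    }
    return failed;
  }
}
```

A failure for one container does not stop the loop, so the remaining containers are still processed.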

Addressed in the 2nd commit.

hadoop-yetus · 2019-09-04T21:57:20Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	39	Docker mode activated.
		_ Prechecks _
+1	dupname	0	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 5 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	72	Maven dependency ordering for branch
+1	mvninstall	657	trunk passed
+1	compile	396	trunk passed
+1	checkstyle	80	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	883	branch has no errors when building and testing our client artifacts.
+1	javadoc	185	trunk passed
0	spotbugs	440	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	675	trunk passed
		_ Patch Compile Tests _
0	mvndep	41	Maven dependency ordering for patch
+1	mvninstall	587	the patch passed
+1	compile	414	the patch passed
+1	javac	414	the patch passed
-0	checkstyle	42	hadoop-hdds: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0)
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	3	The patch has no ill-formed XML file.
+1	shadedclient	683	patch has no errors when building and testing our client artifacts.
+1	javadoc	178	the patch passed
+1	findbugs	662	the patch passed
		_ Other Tests _
+1	unit	259	hadoop-hdds in the patch passed.
-1	unit	187	hadoop-ozone in the patch failed.
+1	asflicense	40	The patch does not generate ASF License warnings.
		6315

Reason	Tests
Failed junit tests	hadoop.ozone.om.ratis.TestOzoneManagerRatisServer

Subsystem	Report/Notes
Docker	Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/artifact/out/Dockerfile
GITHUB PR	#1401
JIRA Issue	HDDS-1561
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml
uname	Linux 47571461d1a2 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `337e9b7`
Default Java	1.8.0_222
checkstyle	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/artifact/out/diff-checkstyle-hadoop-hdds.txt
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/testReport/
Max. process+thread count	1293 (vs. ulimit of 5500)
modules	C: hadoop-hdds hadoop-hdds/container-service hadoop-ozone hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

lokeshj1703 · 2019-09-05T10:27:18Z

@nandakumar131 Thanks for reviewing the PR! 2nd commit addresses checkstyle issues and review comments.

hadoop-yetus · 2019-09-05T12:09:00Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	40	Docker mode activated.
		_ Prechecks _
+1	dupname	0	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 5 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	65	Maven dependency ordering for branch
+1	mvninstall	580	trunk passed
+1	compile	380	trunk passed
+1	checkstyle	83	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	870	branch has no errors when building and testing our client artifacts.
+1	javadoc	175	trunk passed
0	spotbugs	419	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	615	trunk passed
		_ Patch Compile Tests _
0	mvndep	40	Maven dependency ordering for patch
+1	mvninstall	554	the patch passed
+1	compile	386	the patch passed
+1	javac	386	the patch passed
-0	checkstyle	43	hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	2	The patch has no ill-formed XML file.
+1	shadedclient	691	patch has no errors when building and testing our client artifacts.
+1	javadoc	174	the patch passed
-1	findbugs	210	hadoop-hdds generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
		_ Other Tests _
+1	unit	296	hadoop-hdds in the patch passed.
-1	unit	187	hadoop-ozone in the patch failed.
+1	asflicense	46	The patch does not generate ASF License warnings.
		6106

Reason	Tests
FindBugs	module:hadoop-hdds
	Switch statement found in org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(SCMCommand, OzoneContainer, StateContext, SCMConnectionManager) where one case falls through to the next case At CloseContainerCommandHandler.java:[lines 92-95]
Failed junit tests	hadoop.ozone.om.ratis.TestOzoneManagerRatisServer
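
The FindBugs finding above is the switch fall-through pattern (SF_SWITCH_FALLTHROUGH). One common way to clear such a finding is to move the shared step into a helper so that every case ends explicitly. The sketch below shows that restructuring on a reduced, hypothetical example (`NoFallThroughSketch`, its `State` enum, and the action strings are made up); it is not necessarily how the later commit resolved the report.

```java
final class NoFallThroughSketch {
  // Reduced, hypothetical state set; OTHER covers everything else.
  enum State { OPEN, CLOSING, OTHER }

  /** Returns a comma-separated list of the actions taken. */
  static String handle(State state, boolean pipelineExists) {
    switch (state) {
    case OPEN:
      // Explicitly run the shared step instead of falling into CLOSING.
      return "markContainerForClose," + closeViaPipeline(pipelineExists);
    case CLOSING:
      return closeViaPipeline(pipelineExists);
    default:
      return "noop";
    }
  }

  // Shared step that both OPEN and CLOSING need.
  private static String closeViaPipeline(boolean pipelineExists) {
    return pipelineExists ? "submitCloseRequest" : "markContainerUnhealthy";
  }
}
```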

Subsystem	Report/Notes
Docker	Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/Dockerfile
GITHUB PR	#1401
JIRA Issue	HDDS-1561
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml
uname	Linux ece162389836 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `172bcd8`
Default Java	1.8.0_222
checkstyle	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/diff-checkstyle-hadoop-hdds.txt
findbugs	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/new-findbugs-hadoop-hdds.html
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/testReport/
Max. process+thread count	1205 (vs. ulimit of 5500)
modules	C: hadoop-hdds hadoop-hdds/container-service hadoop-ozone hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

hadoop-yetus · 2019-09-05T15:40:47Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
0	reexec	44	Docker mode activated.
		_ Prechecks _
+1	dupname	1	No case conflicting files found.
+1	@author	0	The patch does not contain any @author tags.
+1	test4tests	0	The patch appears to include 5 new or modified test files.
		_ trunk Compile Tests _
0	mvndep	25	Maven dependency ordering for branch
+1	mvninstall	568	trunk passed
+1	compile	378	trunk passed
+1	checkstyle	82	trunk passed
+1	mvnsite	0	trunk passed
+1	shadedclient	865	branch has no errors when building and testing our client artifacts.
+1	javadoc	175	trunk passed
0	spotbugs	416	Used deprecated FindBugs config; considering switching to SpotBugs.
+1	findbugs	614	trunk passed
		_ Patch Compile Tests _
0	mvndep	39	Maven dependency ordering for patch
+1	mvninstall	538	the patch passed
+1	compile	395	the patch passed
+1	javac	395	the patch passed
+1	checkstyle	90	the patch passed
+1	mvnsite	0	the patch passed
+1	whitespace	0	The patch has no whitespace issues.
+1	xml	3	The patch has no ill-formed XML file.
+1	shadedclient	686	patch has no errors when building and testing our client artifacts.
+1	javadoc	176	the patch passed
+1	findbugs	632	the patch passed
		_ Other Tests _
+1	unit	297	hadoop-hdds in the patch passed.
-1	unit	183	hadoop-ozone in the patch failed.
+1	asflicense	48	The patch does not generate ASF License warnings.
		6033

Reason	Tests
Failed junit tests	hadoop.ozone.om.ratis.TestOzoneManagerRatisServer

Subsystem	Report/Notes
Docker	Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/artifact/out/Dockerfile
GITHUB PR	#1401
JIRA Issue	HDDS-1561
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml
uname	Linux 5b06fab0ad6b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	personality/hadoop.sh
git revision	trunk / `511df1e`
Default Java	1.8.0_222
unit	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/artifact/out/patch-unit-hadoop-ozone.txt
Test Results	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/testReport/
Max. process+thread count	1238 (vs. ulimit of 5500)
modules	C: hadoop-hdds hadoop-hdds/container-service hadoop-ozone hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: .
Console output	https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/console
versions	git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by	Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

nandakumar131 · 2019-09-06T07:45:22Z

Test failures are not related.

Mark OPEN containers as QUASI_CLOSED as part of Ratis groupRemove

6bfc5b4

lokeshj1703 added the ozone label Sep 4, 2019

lokeshj1703 requested a review from nandakumar131 September 4, 2019 16:36

lokeshj1703 self-assigned this Sep 4, 2019

lokeshj1703 requested a review from mukul1987 September 4, 2019 16:37

nandakumar131 reviewed Sep 4, 2019

Fix checkstyle issues and address review comments

4342754

Fix findbugs and checkstyle issues

c3e53b9

nandakumar131 approved these changes Sep 6, 2019

nandakumar131 merged commit 6e4cdf8 into apache:trunk Sep 6, 2019

lokeshj1703 deleted the HDDS-1561 branch September 6, 2019 08:04

amahussein pushed a commit to amahussein/hadoop that referenced this pull request Oct 29, 2019

HDDS-1561: Mark OPEN containers as QUASI_CLOSED as part of Ratis grou…

370bd00

…pRemove (apache#1401)

RogPodge pushed a commit to RogPodge/hadoop that referenced this pull request Mar 25, 2020

HDDS-1561: Mark OPEN containers as QUASI_CLOSED as part of Ratis grou…

bef84c4

…pRemove (apache#1401)
