lokeshj1703
Contributor

Right now, if a pipeline is destroyed by SCM, all the containers on the pipeline are marked as quasi-closed when the datanode receives the close container command. While processing the reports for these containers, SCM marks them as closed once a majority of the nodes are available.

This is, however, not a sufficient condition in cases where the Raft log directory is missing or corrupted, because the containers will not have all the applied transactions.
To solve this problem, we should QUASI_CLOSE the containers in the datanode as part of Ratis groupRemove. If a container is in the OPEN state in the datanode without any active pipeline, it will be marked as UNHEALTHY while processing the close container command.
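
The two rules described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the patch itself: the class `ContainerStateSketch`, its methods, and the maps are hypothetical stand-ins for the datanode's container set and pipeline registry, and the real code goes through Ratis and the container controller.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the proposed datanode behavior; not Ozone code. */
public class ContainerStateSketch {
  public enum State { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY }

  // containerId -> current state
  private final Map<Long, State> states = new HashMap<>();
  // containerId -> pipeline the container was written through
  private final Map<Long, String> pipelineOf = new HashMap<>();
  private final Set<String> activePipelines = new HashSet<>();

  public void addOpenContainer(long id, String pipeline) {
    states.put(id, State.OPEN);
    pipelineOf.put(id, pipeline);
    activePipelines.add(pipeline);
  }

  /** Rule 1: on group removal, quasi-close containers still open on that pipeline. */
  public void onGroupRemove(String pipeline) {
    activePipelines.remove(pipeline);
    for (Map.Entry<Long, String> e : pipelineOf.entrySet()) {
      State s = states.get(e.getKey());
      boolean notYetClosed = s == State.OPEN || s == State.CLOSING;
      if (e.getValue().equals(pipeline) && notYetClosed) {
        states.put(e.getKey(), State.QUASI_CLOSED);
      }
    }
  }

  /** Rule 2: a close command for an OPEN container with no active pipeline marks it UNHEALTHY. */
  public void onCloseCommand(long id) {
    boolean pipelineActive = activePipelines.contains(pipelineOf.get(id));
    if (states.get(id) == State.OPEN && !pipelineActive) {
      states.put(id, State.UNHEALTHY);
    }
  }

  /** Simulates a pipeline vanishing without groupRemove having run (assumed failure case). */
  public void dropPipelineSilently(String pipeline) {
    activePipelines.remove(pipeline);
  }

  public State stateOf(long id) {
    return states.get(id);
  }

  public static void main(String[] args) {
    ContainerStateSketch dn = new ContainerStateSketch();
    dn.addOpenContainer(1L, "pipeline-a");
    dn.addOpenContainer(2L, "pipeline-b");

    dn.onGroupRemove("pipeline-a");        // normal path
    System.out.println(dn.stateOf(1L));    // prints QUASI_CLOSED

    dn.dropPipelineSilently("pipeline-b"); // pipeline gone, container still OPEN
    dn.onCloseCommand(2L);
    System.out.println(dn.stateOf(2L));    // prints UNHEALTHY
  }
}
```

In this model, the UNHEALTHY state only arises when a container was left OPEN after its pipeline disappeared, which is the case the description singles out.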

Contributor

@nandakumar131 nandakumar131 left a comment

Overall the patch looks good.
Added some minor comments.

} else if (closeCommand.getForce()) {
// SCM told us to force close the container.
controller.closeContainer(containerId);
}
Contributor

Not exactly related to this patch, but this part of the code has become a little messy.
We should be able to refactor it along these lines:

switch (container.getContainerState()) {
 case OPEN:
   controller.markContainerForClose(containerId);
   // Intentional fall-through: once marked, handle it like a CLOSING container.
 case CLOSING:
   final HddsProtos.PipelineID pipelineID = closeCommand.getPipelineID();
   final XceiverServerSpi writeChannel = ozoneContainer.getWriteChannel();
   if (writeChannel.isExist(pipelineID)) {
     // The pipeline still exists, so close the container through it.
     writeChannel.submitRequest(getContainerCommandRequestProto(
         datanodeDetails, containerId), pipelineID);
   } else {
     // No active pipeline for this container.
     controller.markContainerUnhealthy(containerId);
   }
   break;
 case QUASI_CLOSED:
   if (closeCommand.getForce()) {
     // SCM told us to force close the container.
     controller.closeContainer(containerId);
     break;
   }
   // Intentional fall-through: without force, only log.
 case CLOSED:
 case UNHEALTHY:
 case INVALID:
   LOG.debug("Cannot close the container #{}, the container is" +
       " in {} state.", containerId, container.getContainerState());
 }
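
For comparison, the same decision table can also be written without any case that runs code and then falls through, a pattern that static analyzers such as FindBugs commonly warn about. The following is a self-contained sketch under assumptions: `CloseDecisionSketch`, `decide`, and the `Action` enum are hypothetical, and the two boolean parameters stand in for `writeChannel.isExist(pipelineID)` and `closeCommand.getForce()` from the snippet above.

```java
/** Hypothetical decision table for the close-container command; not Ozone code. */
public class CloseDecisionSketch {
  public enum State { OPEN, CLOSING, QUASI_CLOSED, CLOSED, UNHEALTHY, INVALID }

  public enum Action { SUBMIT_CLOSE_VIA_PIPELINE, MARK_UNHEALTHY, FORCE_CLOSE, LOG_ONLY }

  /**
   * Picks the action for a container in the given state.
   * For OPEN, a real handler would first mark the container for close;
   * that side effect is left out here to keep the function pure.
   */
  public static Action decide(State state, boolean pipelineExists, boolean force) {
    switch (state) {
    case OPEN:
    case CLOSING:
      // Empty case labels share one body, so no code falls through.
      return pipelineExists ? Action.SUBMIT_CLOSE_VIA_PIPELINE : Action.MARK_UNHEALTHY;
    case QUASI_CLOSED:
      return force ? Action.FORCE_CLOSE : Action.LOG_ONLY;
    default:
      // CLOSED, UNHEALTHY, INVALID: nothing to do except log.
      return Action.LOG_ONLY;
    }
  }

  public static void main(String[] args) {
    System.out.println(decide(State.OPEN, false, false));        // prints MARK_UNHEALTHY
    System.out.println(decide(State.QUASI_CLOSED, false, true)); // prints FORCE_CLOSE
  }
}
```

Returning an action instead of calling the controller directly also makes each branch easy to unit test, at the cost of one extra dispatch step in the handler.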

Contributor Author

Addressed in the 2nd commit.

}
}
}

Contributor

If markContainerForClose fails, quasiCloseContainer will definitely fail as well. We can put both calls into the same try-catch block.
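
A minimal sketch of that suggestion, assuming both calls can throw `IOException`: the `Controller` interface and the `markAndQuasiClose` helper below are hypothetical stand-ins for the controller used in the patch, written only to show the shape of a single try-catch.

```java
import java.io.IOException;

/** Hypothetical illustration of wrapping both calls in one try-catch; not Ozone code. */
public class SingleTryCatchSketch {

  /** Stand-in for the container controller; the method names mirror the review comment. */
  public interface Controller {
    void markContainerForClose(long containerId) throws IOException;

    void quasiCloseContainer(long containerId) throws IOException;
  }

  /** Runs both steps; if the first throws, the second is never attempted. */
  public static boolean markAndQuasiClose(Controller controller, long containerId) {
    try {
      controller.markContainerForClose(containerId);
      controller.quasiCloseContainer(containerId);
      return true;
    } catch (IOException e) {
      System.err.println("Failed to quasi-close container #" + containerId
          + ": " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    Controller failing = new Controller() {
      @Override
      public void markContainerForClose(long containerId) throws IOException {
        throw new IOException("simulated failure");
      }

      @Override
      public void quasiCloseContainer(long containerId) {
        System.out.println("quasiCloseContainer should not run in this demo");
      }
    };
    System.out.println(markAndQuasiClose(failing, 42L));
  }
}
```

With one block, a failure in the first call skips the second call and is reported once, instead of producing two related error paths.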

Contributor Author

Addressed in the 2nd commit.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 39 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 5 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 72 | Maven dependency ordering for branch |
| +1 | mvninstall | 657 | trunk passed |
| +1 | compile | 396 | trunk passed |
| +1 | checkstyle | 80 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 883 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 185 | trunk passed |
| 0 | spotbugs | 440 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 675 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 41 | Maven dependency ordering for patch |
| +1 | mvninstall | 587 | the patch passed |
| +1 | compile | 414 | the patch passed |
| +1 | javac | 414 | the patch passed |
| -0 | checkstyle | 42 | hadoop-hdds: The patch generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | xml | 3 | The patch has no ill-formed XML file. |
| +1 | shadedclient | 683 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 178 | the patch passed |
| +1 | findbugs | 662 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 259 | hadoop-hdds in the patch passed. |
| -1 | unit | 187 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 40 | The patch does not generate ASF License warnings. |
| | | 6315 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/artifact/out/Dockerfile |
| GITHUB PR | #1401 |
| JIRA Issue | HDDS-1561 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 47571461d1a2 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 337e9b7 |
| Default Java | 1.8.0_222 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/artifact/out/diff-checkstyle-hadoop-hdds.txt |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/testReport/ |
| Max. process+thread count | 1293 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds hadoop-hdds/container-service hadoop-ozone hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/1/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

@lokeshj1703
Contributor Author

@nandakumar131 Thanks for reviewing the PR! The 2nd commit addresses the checkstyle issues and the review comments.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 40 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 5 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 65 | Maven dependency ordering for branch |
| +1 | mvninstall | 580 | trunk passed |
| +1 | compile | 380 | trunk passed |
| +1 | checkstyle | 83 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 870 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 175 | trunk passed |
| 0 | spotbugs | 419 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 615 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 40 | Maven dependency ordering for patch |
| +1 | mvninstall | 554 | the patch passed |
| +1 | compile | 386 | the patch passed |
| +1 | javac | 386 | the patch passed |
| -0 | checkstyle | 43 | hadoop-hdds: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | xml | 2 | The patch has no ill-formed XML file. |
| +1 | shadedclient | 691 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 174 | the patch passed |
| -1 | findbugs | 210 | hadoop-hdds generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| | | | _ Other Tests _ |
| +1 | unit | 296 | hadoop-hdds in the patch passed. |
| -1 | unit | 187 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 46 | The patch does not generate ASF License warnings. |
| | | 6106 | |

| Reason | Tests |
|-------:|:------|
| FindBugs | module:hadoop-hdds |
| | Switch statement found in org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(SCMCommand, OzoneContainer, StateContext, SCMConnectionManager) where one case falls through to the next case. At CloseContainerCommandHandler.java:[lines 92-95] |
| Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/Dockerfile |
| GITHUB PR | #1401 |
| JIRA Issue | HDDS-1561 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux ece162389836 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 172bcd8 |
| Default Java | 1.8.0_222 |
| checkstyle | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/diff-checkstyle-hadoop-hdds.txt |
| findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/new-findbugs-hadoop-hdds.html |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/testReport/ |
| Max. process+thread count | 1205 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds hadoop-hdds/container-service hadoop-ozone hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/2/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| 0 | reexec | 44 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 1 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 5 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 25 | Maven dependency ordering for branch |
| +1 | mvninstall | 568 | trunk passed |
| +1 | compile | 378 | trunk passed |
| +1 | checkstyle | 82 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 865 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 175 | trunk passed |
| 0 | spotbugs | 416 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 614 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 39 | Maven dependency ordering for patch |
| +1 | mvninstall | 538 | the patch passed |
| +1 | compile | 395 | the patch passed |
| +1 | javac | 395 | the patch passed |
| +1 | checkstyle | 90 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | xml | 3 | The patch has no ill-formed XML file. |
| +1 | shadedclient | 686 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 176 | the patch passed |
| +1 | findbugs | 632 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 297 | hadoop-hdds in the patch passed. |
| -1 | unit | 183 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 48 | The patch does not generate ASF License warnings. |
| | | 6033 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/artifact/out/Dockerfile |
| GITHUB PR | #1401 |
| JIRA Issue | HDDS-1561 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 5b06fab0ad6b 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 511df1e |
| Default Java | 1.8.0_222 |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/testReport/ |
| Max. process+thread count | 1238 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds hadoop-hdds/container-service hadoop-ozone hadoop-ozone/integration-test hadoop-ozone/ozone-manager U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1401/3/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.

@nandakumar131
Contributor

The test failures are not related to this patch.

@nandakumar131 nandakumar131 merged commit 6e4cdf8 into apache:trunk Sep 6, 2019
@lokeshj1703 lokeshj1703 deleted the HDDS-1561 branch September 6, 2019 08:04
amahussein pushed a commit to amahussein/hadoop that referenced this pull request Oct 29, 2019
RogPodge pushed a commit to RogPodge/hadoop that referenced this pull request Mar 25, 2020