HBASE-23984 [Flakey Tests] TestMasterAbortAndRSGotKilled fails in tea… #1311
saintstack merged 1 commit into apache:branch-2 from
Conversation
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
// opening can not be interrupted by a close request any more.
region = HRegion.openHRegion(regionInfo, htd, rs.getWAL(regionInfo), rs.getConfiguration(),
  rs, null);
rs.postOpenDeployTasks(new PostOpenDeployContext(region, openProcId, masterSystemTime));
Yikes! Yeah, this seems better here. Good.
No...
IIRC, the design here is that postOpenDeployTasks is the PONR: if we arrive here, we cannot revert, and the only way to address the exception is to abort the region server.
The fact is, if we haven't told the master anything, it is fine for us to close the region and tell the master about the failure. But once we have already called the master with the success message, even if the RPC call fails we do not know whether the other side (the master) has received and processed the request already, so the only way is to retry forever, and if that cannot be done, the only way is to abort ourselves...
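The two branches described above can be sketched as a small decision rule. This is a hedged illustration only; the class, enum, and method names are stand-ins, not the actual HBase API.

```java
// A minimal sketch of the PONR ("point of no return") rule described above.
// All names here are illustrative stand-ins, not the actual HBase API.
class PonrSketch {
  enum Outcome { CLOSE_AND_REPORT_FAILURE, RETRY_FOREVER_OR_ABORT }

  // Decide how to handle a failure during region open, given whether we
  // have already reported success to the master.
  static Outcome onOpenFailure(boolean successReportedToMaster) {
    if (!successReportedToMaster) {
      // Master knows nothing yet: safe to close the region locally and
      // report the failure.
      return Outcome.CLOSE_AND_REPORT_FAILURE;
    }
    // Past the PONR: even if the success RPC failed, the master may have
    // received and processed it. State is ambiguous, so the only options
    // are to retry forever or abort the region server.
    return Outcome.RETRY_FOREVER_OR_ABORT;
  }
}
```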
bq. IIRC, the design here is that, postOpenDeployTasks is the PONR, if we arrive here, then we can not revert back, the only way to address the exception is to abort the region server.
Ok. That helps. Let me add the above as a comment, ensure the above happens, and get my fix in.
* <p>Expects that the close has been registered in the hosting RegionServer before
* submitting this Handler; i.e. <code>rss.getRegionsInTransitionInRS().putIfAbsent(
* this.regionInfo.getEncodedNameAsBytes(), Boolean.FALSE);</code> has been called first.
* In here when done, we do the deregister.</p>
  }
}

@Override protected void handleException(Throwable t) {
Maybe it's just because I'm new to the *Handler code, but it's not clear to me why one would handle exceptions locally vs. handle them from this handleException method. I guess it's all hooks for operating within the confines of a Runnable off on a thread pool somewhere.
Yes, inconsistently used. Here trying to keep w/ the herd.
// the master to split our logs in order to recover the data.
server.abort("Unrecoverable exception while closing region " +
  regionInfo.getRegionNameAsString() + ", still finishing close", ioe);
throw new RuntimeException(ioe);
After reading the above comment and seeing you discarded the throwing of this exception, I initially choked. But reading through the actual use of these Handlers in the ExecutorService instances hanging off of HRegionServer and HMaster, I can only conclude that the above throw was only wishful thinking. There's even a comment (emphasis mine):
Start up all services. If any of these threads gets an unhandled exception
then they just die with a logged message. This should be fine because
in general, we do not expect the master to get such unhandled exceptions
as OOMEs; it should be lightly loaded. See what HRegionServer does if
need to install an unexpected exception handler.
The author of the above comment speaks wistfully of what I can only assume is HRegionServer#uncaughtExceptionHandler. However, it doesn't appear that this is threaded down into the executor service, which means this line's throw statement is simply logged and ignored.
So yes, I think removing the throw is the right choice. It removes the false sense of handling this error condition correctly. It's really the abort that protects the content of the memstore.
Also, why is there not a named exception thrown by the memstore when it cannot flush? Seems like a useful point in that data structure's API.
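For readers new to this corner of the JDK, the mechanism being discussed above can be shown in plain Java: an UncaughtExceptionHandler only sees an exception that escapes a pool's Runnable if it is installed on the worker threads via the ThreadFactory, and only for tasks started with execute() (submit() wraps the Throwable in the returned Future instead). This is a generic sketch, not HBase code; all names are illustrative.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicReference;

// Sketch: wiring an UncaughtExceptionHandler into an ExecutorService via a
// ThreadFactory. Without this, a throw escaping a Runnable just kills the
// worker thread, and the Throwable is at best logged.
class HandlerPoolSketch {
  static final AtomicReference<Throwable> caught = new AtomicReference<>();
  static final CountDownLatch fired = new CountDownLatch(1);

  static ExecutorService newPoolWithHandler() {
    ThreadFactory tf = r -> {
      Thread t = new Thread(r, "handler-sketch");
      // The handler runs when an exception propagates out of the Runnable
      // and the worker thread dies.
      t.setUncaughtExceptionHandler((thread, e) -> {
        caught.set(e);
        fired.countDown();
      });
      return t;
    };
    return Executors.newSingleThreadExecutor(tf);
  }

  // Runs a throwing task via execute() and waits for the handler to see it.
  static Throwable runAndCatch(Runnable task) {
    ExecutorService pool = newPoolWithHandler();
    pool.execute(task); // note: execute(), not submit()
    try {
      fired.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    pool.shutdown();
    return caught.get();
  }
}
```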
rs.finishRegionProcedure(closeProcId);
LOG.info("Closed {}", regionName);
} finally {
rs.getRegionsInTransitionInRS().remove(encodedNameBytes, Boolean.FALSE);
Apache9
left a comment
Let's focus on just making the UT pass here, without changing other code.
I suggest we open a follow-on issue to discuss the abort behavior. To me, the operations in the abort method do not make sense. Maybe we just need to try our best to close the connection to zk to let the master know we are dead, and then just do a System.exit(1). For now we do lots of cleanup work and even want to flush all the regions? This is not an abort, I'd say; it is almost like a graceful shutdown...
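The proposed "real" abort could be sketched roughly as below. This is a hedged illustration of the reviewer's suggestion, not HBase code: `zkCloser` and the exit-code holder are hypothetical stand-ins so the sketch is testable; a real server would close its actual ZooKeeper connection and call System.exit(1).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a fast abort: best-effort close of the zk connection so the
// master notices we are dead quickly, then exit immediately -- no region
// flushing, no graceful cleanup.
class FastAbortSketch {
  static void abort(Runnable zkCloser, AtomicInteger exiter) {
    try {
      zkCloser.run(); // best effort only; a failure here does not matter
    } catch (RuntimeException e) {
      // Ignore: we are aborting anyway.
    }
    exiter.set(1); // stand-in for System.exit(1)
  }
}
```

The key property is that the exit happens unconditionally, whether or not the zk close succeeds.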
// Cache the close region procedure id after report region transition succeed.
rs.finishRegionProcedure(closeProcId);
LOG.info("Closed {}", regionName);
} finally {
So this is the actual fix here?
If you really want to do this to make the test pass, I suggest you add the removal in the handleException method, and add a FIXME or TODO comment saying this is just to make the test pass and should be addressed later.
bq. Let's focus on just making the UT pass here, without changing other code.

It is not just about the unit test.

bq. I suggest we open a follow on issue, to discuss the abort behavior.

You are welcome to. I'm currently just interested in landing a fix for cluster shutdown/RS aborts and concurrent assign/unassigns, which cause flakey test failures and hangs in the wild.

bq. To me, the operations in abort method do not make sense. Maybe we just need to try our best to close the connection to zk to let master know we are dead, and then just do a System.exit(1). For now we will do lots of clean up work and even want to flush all the regions? This is not a abort I'd say, it is almost like a graceful shutdown...

For new issue.
bq. If you really want to do this to let the test pass, I suggest you add the removal in the handleException method, and add a FIXME or TODO comment to say that this is just for making test pass, should be addressed later.

I can move it to handleException, np. I will NOT note that it is a UT fix only. There is an obvious hole here that holds up shutdowns, and shutdowns are not UT-only. These Handlers strike me as arbitrary regarding where stuff goes; no wonder there are holes. Let me put up another patch w/ your suggestions.
New push. Enjoys the benefit of @Apache9 feedback. The main change is restoring these Handlers to how they were (but w/ the PONR comment added) and then, in handleException, removing the entry from the RS RIT map just before the call to abort. Gets me what I want and leaves the rest of the code as it was.
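The shape of that handleException change can be sketched as below. Types and names are simplified stand-ins for illustration, not the real handler classes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch of the change described above: in handleException, deregister the
// region from the RS regions-in-transition map just before aborting, so a
// leftover entry cannot wedge region server shutdown.
class HandleExceptionSketch {
  final ConcurrentMap<String, Boolean> regionsInTransition = new ConcurrentHashMap<>();
  boolean aborted = false;

  void handleException(String encodedRegionName, Throwable t) {
    // Remove our RIT entry first; without this, the outstanding entry can
    // stop the region server from going down on cluster shutdown.
    regionsInTransition.remove(encodedRegionName);
    aborted = true; // stand-in for server.abort("...", t)
  }
}
```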
🎊 +1 overall
This message was automatically generated.
Apache9
left a comment
+1 for now.
Let's open another issue to address the shutdown issue.
// Done! Region is closed on this RS
this.rsServices.getRegionsInTransitionInRS().
  remove(this.regionInfo.getEncodedNameAsBytes(), Boolean.FALSE);
LOG.debug("Closed " + region.getRegionInfo().getRegionNameAsString());
LOG.debug("Closed {}", region.getRegionInfo().getRegionNameAsString());
💔 -1 overall
This message was automatically generated.
…rdown

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Change parameter name and add javadoc to make it more clear what the param actually is.

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
Move postOpenDeployTasks so if it fails to talk to the Master -- which can happen on cluster shutdown -- then we will do cleanup of state; without this the RS can get stuck and won't go down.

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
Add handleException so CRH looks more like UnassignRegionHandler and AssignRegionHandler around exception handling. Add a bit of doc on why CRH.

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/UnassignRegionHandler.java
Right shift most of the body of process so we can add a finally that cleans up rs.getRegionsInTransitionInRS on exception (otherwise outstanding entries can stop a RS going down on cluster shutdown)
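The UnassignRegionHandler change in that summary, right shifting the body of process() into a try so a finally can always deregister, can be sketched like this. Types and names are simplified stand-ins, not the real handler.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: wrap the body of process() in try/finally so the
// regions-in-transition entry is always removed, even when the body throws.
class UnassignSketch {
  final ConcurrentMap<String, Boolean> regionsInTransition = new ConcurrentHashMap<>();

  void process(String encodedName, Runnable body) {
    regionsInTransition.putIfAbsent(encodedName, Boolean.FALSE);
    try {
      body.run(); // close the region, report to master, etc.
    } finally {
      // Always deregister; an outstanding entry here can stop the RS
      // from going down on cluster shutdown.
      regionsInTransition.remove(encodedName, Boolean.FALSE);
    }
  }
}
```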
One of the test failures -- a perversion around Region handling in TestRegionObserverInterface -- exposed an issue w/ the CloseRegionHandler refactor trying to make it look like the other handlers around regionsInTransitionInRS handling. Fixed (and fixed the issue @Apache9 noted above). Added region name logging to this journal stuff -- otherwise it's just opaque... that's just a log change.
Thanks for the review @Apache9 . I'd filed HBASE-24015 a few days ago because it seemed plain this issue had opened a can of worms -- and that was before you showed up. You want to go more radical than the scope of HBASE-24015, so I made HBASE-24026 for the shutdown redo. Thanks.
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
Test report shows no failures. The test output got dropped because of the below. I've been running these tests locally and they seem fine. Will merge and keep an eye on it.

Post stage
…rdown (#1311)

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Change parameter name and add javadoc to make it more clear what the param actually is.

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/AssignRegionHandler.java
Move postOpenDeployTasks so if it fails to talk to the Master -- which can happen on cluster shutdown -- then we will do cleanup of state; without this the RS can get stuck and won't go down.

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
Add handleException so CRH looks more like UnassignRegionHandler and AssignRegionHandler around exception handling. Add a bit of doc on why CRH.

hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/handler/UnassignRegionHandler.java
Right shift most of the body of process so we can add a finally that cleans up rs.getRegionsInTransitionInRS on exception (otherwise outstanding entries can stop a RS going down on cluster shutdown)

Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
Signed-off-by: Duo Zhang <zhangduo@apache.org>