-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-24208 Remove RS entry from zk draining servers node after RS been stopped #1841
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left few comments, looks good overall.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
Outdated
Show resolved
Hide resolved
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
@Category({ LargeTests.class, ClientTests.class }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MediumTests
should be good here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done thx
@@ -573,6 +574,7 @@ public synchronized long expireServer(final ServerName serverName) { | |||
} | |||
|
|||
synchronized long expireServer(final ServerName serverName, boolean force) { | |||
ZKWatcher zkw = master.getZooKeeper(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can include this inline directly on L591: ZNodePaths.joinZNode(master.getZooKeeper().getZNodePaths().drainingZNode
....
Because for below 2 return statements, this just turns out to be redundant statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done thx
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@Category({ MediumTests.class, ClientTests.class }) | ||
public class TestDecommissionStopAdminApi extends TestAdminBase{ | ||
@ClassRule | ||
public static final HBaseClassTestRule CLASS_RULE = HBaseClassTestRule.forClass(TestDecommissionStopAdminApi.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line is exceeding 100 chars. Not sure why build didn't catch this as checkstyle issue. Anyways, you can bring it to 2 lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@ClassRule | ||
public static final HBaseClassTestRule CLASS_RULE = HBaseClassTestRule.forClass(TestDecommissionStopAdminApi.class); | ||
|
||
private static final Logger LOG = LoggerFactory.getLogger(TestDecommissionStopAdminApi.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ServerName serverName = serversToDecommission.get(0); | ||
ADMIN.stopRegionServer(serverName.getHostname()+":"+serverName.getPort()); | ||
sleep(1000); | ||
assertTrue(ADMIN.listDecommissionedRegionServers().isEmpty()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of relying on sleep (which usually results in flaky tests, works maybe 90% times, and 10% remains flaky), we can use Waiter.wait()
:
Something like this maybe:
assertNotEquals("Error message...xyz...", -1,
TEST_UTIL.waitFor(1000, () -> ADMIN.listDecommissionedRegionServers().isEmpty()));
Basically Waiter keeps executing the code until it has successfully returned true from the Predicate. If it can't even after timeout, it will return -1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
import org.slf4j.LoggerFactory; | ||
|
||
@Category({ MediumTests.class, ClientTests.class }) | ||
public class TestDecommissionStopAdminApi extends TestAdminBase{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, we can put testDecommissionAndStopRegionServers()
in TestAdmin2
or TestAdmin3
. They are already LargeTests and have good no of Admin tests. However, keeping this test in separate file is also good. Maybe we can rename this class to TestAdmin4
? This way, new Admin tests can be written in this class going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
import org.apache.hadoop.hbase.TableName; | ||
import org.apache.hadoop.hbase.testclassification.ClientTests; | ||
import org.apache.hadoop.hbase.testclassification.MediumTests; | ||
import org.junit.Before; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before
, TableName
and HBaseTestingUtility
are unused imports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin4.java
Outdated
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
Outdated
Show resolved
Hide resolved
List<ServerName> decommissionedRegionServers = ADMIN.listDecommissionedRegionServers(); | ||
assertTrue(decommissionedRegionServers.isEmpty()); | ||
|
||
ArrayList<ServerName> clusterRegionServers = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cant we use Admin#getRegionServers(boolean excludeDecommissionedRS) API directly here?
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAdmin4.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@gkanade Can you please check test failure locally? I am not getting NPE on master branch. However, last couple of builds have consistent failures:
|
🎊 +1 overall
This message was automatically generated. |
@virajjasani @anoopsjohn i made the fix and now verifies this works locally (the issue was one of the tests was not initializing zk instance for HM so i just put a null check around my code), this should work plz review |
+1 |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
The build is green for us (failed QA is not relevant). |
…en stopped (#1841) Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org> Signed-off-by: Viraj Jasani <vjasani@apache.org>
…en stopped (apache#1841) Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org> Signed-off-by: Viraj Jasani <vjasani@apache.org>
removed RS entry from zk draining server znode on RS stopped