HBASE-24885 STUCK RIT by hbck2 assigns #2989
The original PR: #2283
LGTM. I guess the slight change in behaviour is not an issue for backporting to branch-2.2?
💔 -1 overall
This message was automatically generated.
@wchevreuil thanks for the quick review!
This change only introduces extra checks on the master RPC API called by hbck2. Even if the assignment procedure differs between 2.2 and 2.3, these checks happen before the procedure is even triggered. I tested the change on a 2.2 cluster, and after applying this change:
P.S. I'll check the checkstyle problems soon.
🎊 +1 overall
This message was automatically generated.
Signed-off-by: Wellington Chevreuil <wchevreuil@apache.org> (cherry picked from commit ae2bbfe) Change-Id: I8a920c021843fbb2d369ca7d1c112bf131c51cd3
This PR backports HBASE-24885 to branch-2.2.
The backport Jira ticket: HBASE-25606
I put the original commit message below.
The cherry-pick was based on the commit on branch-2.3. It was quite clean, only a few trivial conflicts to resolve.
(cherry picked from commit c7e31f7)
Adds region state check on hbck2 assigns/unassigns. Returns pid of -1
if in inappropriate state, with a logged explanation which suggests
passing override if the operator wants to assign/unassign anyway. Here
is an example of what happens now if hbck2 tries an unassign and
the Region is already unassigned:
2020-08-19 11:22:06,926 INFO [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=50086] assignment.AssignmentManager(820): Failed {ENCODED => d1112e553991e938b6852f87774c91ee, NAME => 'TestHbck,zzzzz,1597861310769.d1112e553991e938b6852f87774c91ee.', STARTKEY => 'zzzzz', ENDKEY => ''} unassign, override=false; set override to by-pass state checks.
org.apache.hadoop.hbase.client.DoNotRetryRegionException: Unexpected state for state=CLOSED, location=null, table=TestHbck, region=d1112e553991e938b6852f87774c91ee
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.preTransitCheck(AssignmentManager.java:583)
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.createOneUnassignProcedure(AssignmentManager.java:812)
at org.apache.hadoop.hbase.master.MasterRpcServices.unassigns(MasterRpcServices.java:2616)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:397)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
Previously it would just create the unassign anyway. Now the operator must pass
override to queue the procedure regardless. Safer.
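To illustrate the behaviour described above, here is a minimal, self-contained sketch of the check-then-queue logic. It is not the actual AssignmentManager code: the class `OverrideCheckSketch`, the methods `submitAssign`/`submitUnassign`, the `State` enum and the sets of acceptable states are all hypothetical names and assumptions chosen for the example. Only the -1 return value, the override flag and the rejection of an unassign on a CLOSED region come from the description in this PR.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the state check; NOT the HBase API.
final class OverrideCheckSketch {
  enum State { OFFLINE, OPENING, OPEN, CLOSING, CLOSED }

  // Pid returned when no procedure was queued (as described in the PR).
  static final long NO_PROC_ID = -1L;

  // Assumption for this sketch: which states are acceptable per operation.
  private static final Set<State> OK_FOR_ASSIGN = EnumSet.of(State.OFFLINE, State.CLOSED);
  private static final Set<State> OK_FOR_UNASSIGN = EnumSet.of(State.OPEN);

  // Stand-in for the procedure executor handing out procedure ids.
  private static final AtomicLong NEXT_PID = new AtomicLong(1);

  static long submitAssign(String region, State current, boolean override) {
    return submit("assign", region, current, OK_FOR_ASSIGN, override);
  }

  static long submitUnassign(String region, State current, boolean override) {
    return submit("unassign", region, current, OK_FOR_UNASSIGN, override);
  }

  private static long submit(String op, String region, State current,
      Set<State> acceptable, boolean override) {
    // The state check runs only when override is NOT set.
    if (!override && !acceptable.contains(current)) {
      System.out.printf(
          "Failed %s of %s, state=%s, override=false; set override to by-pass state checks.%n",
          op, region, current);
      return NO_PROC_ID;
    }
    // With override, or with an acceptable state, the procedure is queued.
    return NEXT_PID.getAndIncrement();
  }

  public static void main(String[] args) {
    String region = "d1112e553991e938b6852f87774c91ee";
    System.out.println(submitUnassign(region, State.CLOSED, false)); // rejected
    System.out.println(submitUnassign(region, State.CLOSED, true));  // queued anyway
    System.out.println(submitUnassign(region, State.OPEN, false));   // queued
  }
}
```

The point of the sketch is the ordering: the check happens before any procedure is created, so a rejected request leaves nothing behind in the procedure queue.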
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
javadoc on assigns/unassigns. Minor refactor in assigns/unassigns to cater to
case where procedure may come back null (if override not set and fails state checks).
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
checkstyle cleanups.
Clarifying javadoc on how there is no state checking when bulk assigns creating/enabling
tables.
createOneAssignProcedure and createOneUnassignProcedure now handle exceptions, which
can now be thrown if override is not set and the region state is not appropriate.
Aggregation of createAssignProcedure and createUnassignProcedure instances adding in
region state check invoked if override is NOT set.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateNode.java
Change to setProcedure so it returns passed proc as result instead of void
Signed-off-by: Duo Zhang <zhangduo@apache.org>