-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-22404 Open/Close region request may be executed twice when mast… #234
Conversation
@@ -1272,6 +1279,10 @@ public RSRpcServices(HRegionServer rs) throws IOException { | |||
|
|||
closedScanners = CacheBuilder.newBuilder() | |||
.expireAfterAccess(scannerLeaseTimeoutPeriod, TimeUnit.MILLISECONDS).build(); | |||
sumbittedOpenProcedures = | |||
CacheBuilder.newBuilder().expireAfterAccess(600, TimeUnit.SECONDS).build(); | |||
sumbittedCloseProcedures = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this enough? What if the master crashed for an hour and then restart...
dafd0c1
to
037e081
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
b96bb38
to
d435f7f
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
* | ||
* After the open/close region request executed and report region transition succeed, cache it in | ||
* executed region procedures cache. See {@link #finishRegionProcedure(long)}. And the cache will | ||
* expire after 600 seconds. Because the duplicate open/close region request should not be delayed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'Because the duplicated open/close region request is not likely to be delayed...'
And maybe we should also do something at master side? When reportRegionStateTransition succeeded, let's also remove the pending operations in RSProcedureDispatcher if possible? Or maybe add some checks in splitAndResolveOperation, if a RemoteProcedure has already been marked as succeeded, just give up and do not schedule it again. Otherwise we may still got duplicate requests if RSProcedureDispatcher keep retrying more than 10 minutes? Not likely but theoretically it is possible...
d435f7f
to
e62519e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No big concerns. Just need to add more comments.
@@ -232,8 +233,9 @@ public RemoteProcedure getRemoteProcedure() { | |||
public interface RemoteProcedure<TEnv, TRemote> { | |||
/** | |||
* For building the remote operation. | |||
* May be empty if no need to send remote call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please append this 'Usually, this means the RemoteProcedure has been finished already. This is possible, as we may have already sent the procedure to RS but then the rpc connection is broken so the executeProcedures call fails, but the RS does receive the procedure and execute it and then report back, before we retry again.'
@@ -84,11 +85,12 @@ protected void deserializeStateData(ProcedureStateSerializer serializer) throws | |||
} | |||
|
|||
@Override | |||
public RemoteProcedureDispatcher.RemoteOperation remoteCallBuild(MasterProcedureEnv env, | |||
public Optional<RemoteProcedureDispatcher.RemoteOperation> remoteCallBuild(MasterProcedureEnv env, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for these procedures we should also test whether the procedure has already been finished. Anyway, should be another issue.
e62519e
to
36c4fd2
Compare
💔 -1 overall
This message was automatically generated. |
36c4fd2
to
fc75239
Compare
…er restart