-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-22699 refactor isMetaClearingException #501
Conversation
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java
Outdated
Show resolved
Hide resolved
|| cur instanceof NotServingRegionException || cur instanceof RequestTooBigException); | ||
private static boolean regionDefinitelyOnTheRegionServerException(Throwable t) { | ||
return (t instanceof RegionTooBusyException || t instanceof RpcThrottlingException | ||
|| t instanceof RetryImmediatelyException || t instanceof CallQueueTooBigException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for CallQueueTooBigException, CallDroppedException, RequestTooBigExcetion, it does not mean that the region is on the region server? We just do not want to push too much pressure on meta so we do not want to retry on these exceptions everytime, but the method name here is a bit confusing...
private static boolean regionDefinitelyOnTheRegionServerException(Throwable t) { | ||
return (t instanceof RegionTooBusyException || t instanceof RpcThrottlingException | ||
|| t instanceof RetryImmediatelyException || t instanceof CallQueueTooBigException | ||
|| t instanceof CallDroppedException || t instanceof RequestTooBigException); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for RequestTooBigException, it is a DoNotRetryIOException so it is fine to not clear meta, but for CallQueueTooBigException and CallDroppedException, what if the region is not on the region server, and the region server can not recover for a very long time? The client will stuck there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CallQueueTooBigException and CallDroppedException are exceptions which imply the server is overload, so if we clear the meta here, there will be many requests which cause the overload sending to the server where the meta is, which may cause other problems? And because those two exceptions are recoverable, the client may receive RegionMovedException etc if the region is moved after the load is back to normal level and then will clear the meta? Maybe both of them should inherit from DoNotRetryNowIOException or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But at least they do not mean the region is on the region server right? So the method name here is confusing...
And I still have concerns about do not retry on these two exceptions. Maybe we should not clear meta everytime, but as I said above, what if the region server can not recover all the time but the region has been moved elsewhere? The client will have no chance to update the meta?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the name is confusing and should be renamed.
Yes, it is really a problem that the client has no change to update the meta in that situation. The problem here is this function is just return a boolean which says Yes or No. It seems we should add more status to carry this information, like Integer or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I think we can keep the method there, but move these two exceptions out to the isMetaClearingException method to keep the old behavior. Can open another issue to discuss how to better handle these two exceptions. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, let me open another issue.
@@ -87,15 +90,15 @@ public static Throwable findException(Object exception) { | |||
} | |||
Throwable cur = (Throwable) exception; | |||
while (cur != null) { | |||
if (isSpecialException(cur)) { | |||
if (matchExceptionWeCare(cur)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: although not imp, can you keep the name same as before?
matchExceptionWeCare
doesn't seem much vocal or may be we don't need that one liner function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The isSpecialException has been removed so we need a new method. And regionDefinatelyOnRegionServer is not a good name here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree it should have meaningful name. I was thinking rather than matchExceptionWeCare
which might not be a generic method nor meaningful, is it fine to write this directly? :
if(regionDefinitelyOnTheRegionServerException(cur) || regionServerIsOverloadException(cur) || cur instanceof NotServingRegionException)
@@ -87,15 +90,15 @@ public static Throwable findException(Object exception) { | |||
} | |||
Throwable cur = (Throwable) exception; | |||
while (cur != null) { | |||
if (isSpecialException(cur)) { | |||
if (matchExceptionWeCare(cur)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree it should have meaningful name. I was thinking rather than matchExceptionWeCare
which might not be a generic method nor meaningful, is it fine to write this directly? :
if(regionDefinitelyOnTheRegionServerException(cur) || regionServerIsOverloadException(cur) || cur instanceof NotServingRegionException)
hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
hbase-client/src/main/java/org/apache/hadoop/hbase/exceptions/ClientExceptionsUtil.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
private static boolean regionLocationIsConsistentWithServerException(Throwable t) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isRegionLocationConsistentWithServerException?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find a UT function named metaCachePreservingExceptions in class TestMetaCache with similar meanings, so here I rename to isMetaCachePreservingException. What do you think about that, sir?
|| t instanceof RequestTooBigException; | ||
} | ||
|
||
private static boolean matchExceptionWeCare(Throwable t) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'v been recasting names so they are mbean style... 'is' methods have boolean returns...
Here I'd call it isExceptionWeCare but even this is a bad name. What does it mean 'WeCare'. What happens when this triggers? Can we get that into the name instead? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called in client side, and the purpose of it is that if the return value of the function is true, so we are sure about how to deal with meta cache. The function name findException is too general and make it confusing to name a detailed function inside it.I tried many times and cannot find a suitable one.
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@@ -67,7 +68,8 @@ static void updateCachedLocationOnError(HRegionLocation loc, Throwable exception | |||
LOG.debug("The actual exception when updating {} is {}", loc, | |||
cause != null ? cause.toString() : "none"); | |||
} | |||
if (cause == null || !isMetaClearingException(cause)) { | |||
if (cause == null || !isMetaClearingException(cause) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now we need to modify the code other than ClientExceptionsUtil, this means we need to create patch for branch-2? As on branch-2, I believe there are other code paths which need to handle this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I create another pull request to branch2: #578
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnhomsea You created a new PR. Does it subsume this one? Or is it complementary? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it subsumes this one, but branch-2 is different from master in some cases.
💔 -1 overall
This message was automatically generated. |
Closing. It looks like this has been superceded by #578 Shout if I have it wrong @johnhomsea . Thanks. |
This is for master and #578 is for branch-2, I think... |
So @saintstack if you think #578 is fine then this one should also be fine... Will merge it if no objections later. |
Yes |
💔 -1 overall
This message was automatically generated. |
No description provided.