New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-26809: Report client backoff time for server overloaded #4786
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Looking into the issues above. May be due to my branch being a little behind the branch-2 latest changes. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
ac21df9
to
13728dc
Compare
13728dc
to
140878e
Compare
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
I'm trying to reproduce/troubleshoot the issues above locally. I am able to successfully build with a subset of tests with the following command:
When I try to run a spotbugs check locally I get the following error
Any tips on troubleshooting my build issues are much appreciated. cc: @bbeaudreault |
Actually I think I just found the links to the results on the ci server which is helpful. Though still curious if I can run something locally to test beforehand. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Everything looks good here, but not sure what to do for existing spotbugs failures. I don't think I introduced the errors called out in the report. cc: @bbeaudreault |
Yea, the spotbugs issues look related to the recent spotbugs version upgrade. I think they will be tackled in https://issues.apache.org/jira/browse/HBASE-27373. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@bbeaudreault Should I rerun this build? It looks like there were some changes merged to fix the spotbugs warnings (https://issues.apache.org/jira/browse/HBASE-27373) |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
May retrigger this latest build... ran the failing test locally and it seemed happy.
|
Actually, doesn't look like I have permission to do this. @bbeaudreault Should/can I request access to the build server? |
Unfortunately only committers have access to the build server. I kicked off a rebuild last night. The easiest way for you to self-service this is to push a commit. Sometimes what I used to do is |
4479504
to
45ad745
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
The build looks happy. Ready for review. |
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerImpl.java
Show resolved
Hide resolved
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerFactory.java
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
hmm not sure why the build was aborted, but I can kick off another. |
@@ -65,26 +68,29 @@ public <T> RpcRetryingCaller<T> newCaller() { | |||
// is cheap as it does not require parsing a complex structure. | |||
return new RpcRetryingCallerImpl<>(connectionConf.getPauseMillis(), | |||
connectionConf.getPauseMillisForServerOverloaded(), connectionConf.getRetriesNumber(), | |||
interceptor, startLogErrorsCnt, connectionConf.getRpcTimeout()); | |||
interceptor, startLogErrorsCnt, connectionConf.getRpcTimeout(), metrics); | |||
} | |||
|
|||
public static RpcRetryingCallerFactory instantiate(Configuration configuration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sorry for the delay here but I've been meaning to look into the usages of these methods. We basically want to ensure that our metrics object is getting in there as often as possible.
So I opened up branch-2 in intellij and looked at the usages of these instantiate overloads:
instantiate(Configuration)
-- all of the callers to this method have a ClusterConnection in the call context, which we can pull our metrics from (connection.getConnectionMetrics())- `instantiate(Configuration, ServerStatisticTracker) -- all but the SecureBulkLoadClient and LoadIncrementalHFiles have similar.
So I think we should try to add ConnectionMetrics to all of the methods here, and we can pass null in for the 2 cases above.
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Co-authored-by: Briana Augenreich <baugenreich@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org>
…overloaded (apache#4786) Co-authored-by: Briana Augenreich <baugenreich@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org>
Branch 2 changes for #4729
Enable metric reporting on backoff time when a server is overloaded.