-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-27853 Add client side table metrics for rpc calls and request latency. #5228
HBASE-27853 Add client side table metrics for rpc calls and request latency. #5228
Conversation
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry it took so long to get eyes on this. Thanks for the submission. I had a few requests
@@ -346,10 +346,9 @@ public static RegionAction.Builder getRegionActionBuilderWithRegion( | |||
public static ScanRequest buildScanRequest(byte[] regionName, Scan scan, int numberOfRows, | |||
boolean closeScanner) throws IOException { | |||
ScanRequest.Builder builder = ScanRequest.newBuilder(); | |||
RegionSpecifier region = buildRegionSpecifier(RegionSpecifierType.REGION_NAME, regionName); | |||
builder.setRegion(buildRegionSpecifier(RegionSpecifierType.REGION_NAME, regionName)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the reason for these changes in this class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because the way I obtain the table name is through the region name. However, for a scan query, the region name will only be passed in the first request, and subsequent requests will no longer pass the region name but the scannerId. Therefore, this class needs to be modified, otherwise the table metrics for the scan request cannot be obtained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining.
I'm not sure we should modify the RPC protocol just for the sake of these client side metrics. It might be better to use HBaseRpcController to shuttle the table name into the onCallFinished
method (where updateRpc
is called). As an example, you could modify AsyncScanSingleRegionRpcRetryingCaller.call method:
- resetController(controller, callTimeoutNs, priority);
+ resetController(controller, callTimeoutNs, priority, loc.getRegion().getTable());
ScanRequest req = RequestConverter.buildScanRequest(scannerId, scan.getCaching(), false,
nextCallSeq, scan.isScanMetricsEnabled(), false, scan.getLimit());
final Context context = Context.current();
stub.scan(controller, req, resp -> {
try (Scope ignored = context.makeCurrent()) {
onComplete(controller, resp);
}
});
This requires a bit more changes, but at least it keeps our RPC protocol unchanged.
You'll need to add a setTableName and getTableName to HBaseRpcController and DelegatingHBaseRpcController. Then update that ConnectionUtils.resetController
method (and callers), as well as RegionServerCallable once on branch-2.
@Apache9 if you have time, do you agree with this advice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Apache9 @bbeaudreault hi, could you provide me with more assistance so that I can complete this PR? Thank you very much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something else you’d need? I’d go down the path I described, which I feel pretty good about. Just duo often has nice ideas so I figured I’d ask, but he’s probably busy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bbeaudreault Thank you for your reply! Okay, I will make modifications as you described.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bbeaudreault hi, I have made the modifications as per your request. Could you please review the code and provide some suggestions? I would greatly appreciate it!
hbase-client/src/main/java/org/apache/hadoop/hbase/client/MetricsConnection.java
Outdated
Show resolved
Hide resolved
hbase-client/src/main/java/org/apache/hadoop/hbase/client/MetricsConnection.java
Outdated
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Thanks for making those changes. There are some warnings in the pre commit hooks (checkstyle, etc). Can you fix them? I'm going to be out of office until sept 5, so probably won't be able to re-review this until then. Just fyi, I'll get back to it soon. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@bbeaudreault hi, could you continue the code review for me if you are available? Thanks! |
@@ -172,6 +172,7 @@ private CompletableFuture<OpenScannerResponse> callOpenScanner(HBaseRpcControlle | |||
} | |||
CompletableFuture<OpenScannerResponse> future = new CompletableFuture<>(); | |||
try { | |||
controller.setTableName(loc.getRegion().getTable()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this is unnecessary. This callOpenScanner
ends up getting called via AsyncSingleRequestRpcRetryingCaller, which does resetCallTimeout()
before calling the callable. So the change you made in that method should handle this case
@@ -121,7 +121,11 @@ protected final void resetCallTimeout() { | |||
} else { | |||
callTimeoutNs = rpcTimeoutNs; | |||
} | |||
resetController(controller, callTimeoutNs, priority); | |||
if (getTableName().isPresent()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be simplified to one call, with getTableName().orElse(null)
if (methodName != null) { | ||
String table; | ||
if (tableName == null || StringUtils.isEmpty(tableName.getNameAsString())) { | ||
// Fallback to get table name from region specifier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can simplify this whole method (and above calls) now. I'm not sure we need to handle fallback, since all cases should have a TableName in the controller now. So we don't need to extract the region in updateRpc above, nor do we have to to parse the tablename from the region here.
if we missed a spot, i think we'd consider that a bug and fix it. maybe you could add end-to-end tests to ensure that each of the request types updates a table metric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bbeaudreault hi, I have already modified the code according to your requirements. Could you continue the code review for me if you are available? Thanks!
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good! One more request.
Also, can you please squash and rebase this PR so that we can clear out the patch
warning in the pre-commit hook?
"Get".equals(method.getName()) || "Mutate".equals(method.getName()) | ||
|| "Scan".equals(method.getName()) || "Multi".equals(method.getName()) | ||
) { | ||
updateTableMetric(methodName.toString(), tableName, stats, e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need this, or can we simply add the call to updateTableMetric
in the appropriate switch state cases below? Just to avoid unnecessary complexity and equality checks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
5280f8c
to
c9dbf05
Compare
done. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@zhuyaogai unit test failures look related |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@bbeaudreault hi, I found that the latest unit test failures seem to be unrelated to me? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, the test failures I was referring to are outdated. The new ones look ok.
I just caught one possible bug, and we can merge after that.
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java
Show resolved
Hide resolved
c9dbf05
to
d186215
Compare
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Test failure looks unrelated. LGTM |
@zhuyaogai thanks for all the work here! Can you submit a backport PR for branch-2? I think you will need to add a bit more code for the sync client (HTable) there |
…atency. (#5228) Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org>
@bbeaudreault Okay, but it might need to be later. By the way, which branch are you referring to when you say branch-2? |
Lowercase |
…atency. (apache#5228) Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org>
Details: HBASE-27853