-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-23937 : Support Online LargeLogs similar to SlowLogs APIs #1346
Conversation
@ndimiduk Could you please review this patch? This is one of the sub-tasks that you suggested i.e. support large logs API similar to slow logs. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
* @throws IOException if a remote or network exception occurs | ||
*/ | ||
List<SlowLogRecord> getLargeLogResponses(final Set<ServerName> serverNames, | ||
final SlowLogQueryFilter largeLogQueryFilter) throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bharathv I am planning to change SlowLogRecord
to OnlineLogRecord
and SlowLogQueryFilter
to OnlineLogQueryFilter
. Will make that change before merging the PR because this class is being referred at multiple places (in thrift), so will be redundant refactor for now (to make review relevant to actual changes).
Also, the overall change is providing new API in Admin and shell command and to have 2 flags at server side (isSlowLog and isLargeLog) to fetch relevant records from server.
@@ -287,6 +287,8 @@ message SlowLogResponseRequest { | |||
optional string client_address = 3; | |||
optional string user_name = 4; | |||
optional uint32 limit = 5 [default = 10]; | |||
optional bool is_slow_log = 6 [default = false]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use an enum? OnlineLogType for ex.. Also, you don't need to have two separate RPCs if you pass this "type" as a param to the RPC call (we can potentially have other "types" in the future in which case we don't want to add a new set of RPCs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thought..why not implement this type as a LogQueryFilter? that makes it much simpler right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO having separate shell command and Admin API should be good and clean for operators rather than having "type" as a filter. I agree on implementing enum for "type" on server side including this proto file. We can have filter logic at server side but having separate API for client would be better because all filters that we want to provide to client should be related to actual data filters - client, user, region, table etc rather than type.
Most of the time, user would not use filter so we don't want them to use filter just for log type.
get_slowlog_responses '*'
and get_largelog_responses '*'
might be mostly used (without filters).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO having separate shell command and Admin API should be good and clean for operators rather than having "type" as a filter.
Umm.. shell command is fine, probably makes it easier for operators...whats the use of having a separate Admin API? All of these requests fall under the same "get_log" category, so probably good to logically group them into same API (which takes Type parameter) and avoid code bloat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we follow 1:1 shell command to Admin API right? This is to ensure operator using shell commands and Java/Thrift clients using Admin API have similar experience. No?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They still have the same experience, its just what each client does with the API. For example, there are 10 (or more) ways of creating a table from shell but only 3 corresponding create admin APIs. All these 10 map to one of these 3 APIs. Java/Thrift clients for this API build the TableDescriptor and use one of these 3 API calls. It is somewhat a similar case here. At least this is my understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah makes sense. I am not too strong of an opinion to keep 1:1 for slow and large logs. Just thought of keeping them same but yes having single API with LogQueryFilter to differentiate between slow and large logs seems better approach.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
.map(CompletableFuture::join) | ||
.flatMap(List::stream) | ||
.collect(Collectors.toList())); | ||
if (slowLogQueryFilter.getType() == null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this branch on the client? Can't the server side infer the type and respond accordingly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, both can't use same enum because at server side, we have enum in TooSlowLog.proto which we save in Queue, whereas this enum is for client side filter object. We need to compare each attribute present in client side filter with server side data.
Having this branch on client side is better compared to passing it on to server because otherwise we will need to have it in Admin.proto too to pass it over in RPC call (too many places).
|
||
enum Type { | ||
SLOW_LOG = 0; | ||
LARGE_LOG = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Helps to have a code comment about what a SLOW log is and what a LARGE log is..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -369,6 +369,9 @@ service AdminService { | |||
rpc GetSlowLogResponses(SlowLogResponseRequest) | |||
returns(SlowLogResponses); | |||
|
|||
rpc GetLargeLogResponses(SlowLogResponseRequest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question as above.. we can get rid of this one, right? (I thought that was our discussion in the PR earlier).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed about getting rid of separate Admin API and have a same endpoint. This one is for RS RPC services. Since we don't want to have multiple enums in Admin.proto, TooSlowLogs.proto, we need this separately and it's only used in RSRpcServices with minimal code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another reason why we must have this is to find out sometimes which specific Get API was slow or large in nature even between get_slow and get_large responses.
Sample response:
{
"startTime": 1585735414657,
"processingTime": 1,
"queueTime": 0,
"responseSize": 2351,
"clientAddress": "171.30.20.8:63571",
"serverClass": "HRegionServer",
"methodName": "GetLargeLogResponses",
"callDetails": "GetLargeLogResponses(org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$SlowLogResponseRequest)",
"param": "class org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$SlowLogResponseRequest",
"userName": "vjasani"
}
{
"startTime": 1585735372313,
"processingTime": 1,
"queueTime": 0,
"responseSize": 2260,
"clientAddress": "171.30.20.8:63571",
"serverClass": "HRegionServer",
"methodName": "GetSlowLogResponses",
"callDetails": "GetSlowLogResponses(org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$SlowLogResponseRequest)",
"param": "class org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$SlowLogResponseRequest",
"userName": "vjasani"
}
For this differentiation, we need to have:
rpc GetLargeLogResponses(SlowLogResponseRequest)
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
…wLogRecord -> OnlineLogRecord
@bharathv addressed all comments and renamed classes as I mentioned in initial comment. Waiting for QA before merging. If there are any specific concern, I can take it up in follow up Jira. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
…he#1346) Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
…he#1346) Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
…he#1346) Signed-off-by: Bharath Vissapragada <bharathv@apache.org>
No description provided.