HBASE-24405 : Document hbase:slowlog related operations (#1747)
Signed-off-by: ramkrish86 <ramkrishna@apache.org>
Signed-off-by: Anoop Sam John <anoopsamjohn@apache.org>
1 parent 9d9f07b · commit e1d7dc8
Showing 4 changed files with 151 additions and 0 deletions.
118 changes: 118 additions & 0 deletions
src/main/asciidoc/_chapters/slow_log_responses_from_systable.adoc
////
/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
////
[[slow_log_responses_from_systable]]
==== Get Slow/Large Response Logs from System table hbase:slowlog

The above section provides details about the Admin APIs:

* get_slowlog_responses
* get_largelog_responses
* clear_slowlog_responses
All of the above APIs access online in-memory ring buffers on individual
RegionServers and accumulate the logs from those buffers for display to the
end user. However, since the logs are held in memory, all the objects a
RegionServer holds are cleaned up when it restarts, and previous logs are lost.
What if we want to persist all these logs? What if we want to store them in
such a manner that an operator can retrieve all historical records with some
filters? For example: get all large/slow RPC logs that were triggered by
user1 and are related to region
cluster_test,cccccccc,1589635796466.aa45e1571d533f5ed0bb31cdccaaf9cf.

If we have a system table that stores such logs in (not so strictly)
increasing order of time, it can definitely help operators debug historical
events (scan, get, put, compaction, flush etc.) with detailed inputs.
The config that enables the system table to be created and all log events to
be stored in it is `hbase.regionserver.slowlog.systable.enabled`.
Its default value is `false`. If set to `true`
(note: `hbase.regionserver.slowlog.buffer.enabled` must also be `true`),
a chore running in every RegionServer persists the slow/large logs into
the table hbase:slowlog. By default the chore runs every 10 minutes; the
interval can be configured with the key `hbase.slowlog.systable.chore.duration`.
Each RegionServer stores up to 1000
(config key: `hbase.regionserver.slowlog.systable.queue.size`)
slow/large logs in an internal queue, and the chore retrieves these logs
from the queue and performs a batch insertion into hbase:slowlog.
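Putting these settings together, a minimal `hbase-site.xml` fragment to enable
persistence could look like the following sketch. The two `enabled` flags are
the ones described above; the chore interval line is optional and assumes the
duration key is expressed in milliseconds (600000 ms = the 10 minute default).

[source,xml]
----
<property>
  <name>hbase.regionserver.slowlog.buffer.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.regionserver.slowlog.systable.enabled</name>
  <value>true</value>
</property>
<!-- optional: how often the chore flushes the queue into hbase:slowlog -->
<property>
  <name>hbase.slowlog.systable.chore.duration</name>
  <value>600000</value>
</property>
----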
hbase:slowlog has a single ColumnFamily: `info`.
`info` contains multiple qualifiers, which are the same attributes present in
the `get_slowlog_responses` API response:

* info:call_details
* info:client_address
* info:method_name
* info:param
* info:processing_time
* info:queue_time
* info:region_name
* info:response_size
* info:server_class
* info:start_time
* info:type
* info:username
An example of two rows from a hbase:slowlog scan result:
[source]
----
 \x024\xC1\x03\xE9\x04\xF5@ column=info:call_details, timestamp=2020-05-16T14:58:14.211Z, value=Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)
 \x024\xC1\x03\xE9\x04\xF5@ column=info:client_address, timestamp=2020-05-16T14:58:14.211Z, value=172.20.10.2:57347
 \x024\xC1\x03\xE9\x04\xF5@ column=info:method_name, timestamp=2020-05-16T14:58:14.211Z, value=Scan
 \x024\xC1\x03\xE9\x04\xF5@ column=info:param, timestamp=2020-05-16T14:58:14.211Z, value=region { type: REGION_NAME value: "hbase:meta,,1" } scan { column { family: "info" } attribute { name: "_isolationle
 vel_" value: "\x5C000" } start_row: "cluster_test,33333333,99999999999999" stop_row: "cluster_test,," time_range { from: 0 to: 9223372036854775807 } max_versions: 1 cache_blocks
 : true max_result_size: 2097152 reversed: true caching: 10 include_stop_row: true readType: PREAD } number_of_rows: 10 close_scanner: false client_handles_partials: true client_
 handles_heartbeats: true track_scan_metrics: false
 \x024\xC1\x03\xE9\x04\xF5@ column=info:processing_time, timestamp=2020-05-16T14:58:14.211Z, value=18
 \x024\xC1\x03\xE9\x04\xF5@ column=info:queue_time, timestamp=2020-05-16T14:58:14.211Z, value=0
 \x024\xC1\x03\xE9\x04\xF5@ column=info:region_name, timestamp=2020-05-16T14:58:14.211Z, value=hbase:meta,,1
 \x024\xC1\x03\xE9\x04\xF5@ column=info:response_size, timestamp=2020-05-16T14:58:14.211Z, value=1575
 \x024\xC1\x03\xE9\x04\xF5@ column=info:server_class, timestamp=2020-05-16T14:58:14.211Z, value=HRegionServer
 \x024\xC1\x03\xE9\x04\xF5@ column=info:start_time, timestamp=2020-05-16T14:58:14.211Z, value=1589640743732
 \x024\xC1\x03\xE9\x04\xF5@ column=info:type, timestamp=2020-05-16T14:58:14.211Z, value=ALL
 \x024\xC1\x03\xE9\x04\xF5@ column=info:username, timestamp=2020-05-16T14:58:14.211Z, value=user2
 \x024\xC1\x06X\x81\xF6\xEC column=info:call_details, timestamp=2020-05-16T14:59:58.764Z, value=Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)
 \x024\xC1\x06X\x81\xF6\xEC column=info:client_address, timestamp=2020-05-16T14:59:58.764Z, value=172.20.10.2:57348
 \x024\xC1\x06X\x81\xF6\xEC column=info:method_name, timestamp=2020-05-16T14:59:58.764Z, value=Scan
 \x024\xC1\x06X\x81\xF6\xEC column=info:param, timestamp=2020-05-16T14:59:58.764Z, value=region { type: REGION_NAME value: "cluster_test,cccccccc,1589635796466.aa45e1571d533f5ed0bb31cdccaaf9cf." } scan { a
 ttribute { name: "_isolationlevel_" value: "\x5C000" } start_row: "cccccccc" time_range { from: 0 to: 9223372036854775807 } max_versions: 1 cache_blocks: true max_result_size: 2
 097152 caching: 2147483647 include_stop_row: false } number_of_rows: 2147483647 close_scanner: false client_handles_partials: true client_handles_heartbeats: true track_scan_met
 rics: false
 \x024\xC1\x06X\x81\xF6\xEC column=info:processing_time, timestamp=2020-05-16T14:59:58.764Z, value=24
 \x024\xC1\x06X\x81\xF6\xEC column=info:queue_time, timestamp=2020-05-16T14:59:58.764Z, value=0
 \x024\xC1\x06X\x81\xF6\xEC column=info:region_name, timestamp=2020-05-16T14:59:58.764Z, value=cluster_test,cccccccc,1589635796466.aa45e1571d533f5ed0bb31cdccaaf9cf.
 \x024\xC1\x06X\x81\xF6\xEC column=info:response_size, timestamp=2020-05-16T14:59:58.764Z, value=211227
 \x024\xC1\x06X\x81\xF6\xEC column=info:server_class, timestamp=2020-05-16T14:59:58.764Z, value=HRegionServer
 \x024\xC1\x06X\x81\xF6\xEC column=info:start_time, timestamp=2020-05-16T14:59:58.764Z, value=1589640743932
 \x024\xC1\x06X\x81\xF6\xEC column=info:type, timestamp=2020-05-16T14:59:58.764Z, value=ALL
 \x024\xC1\x06X\x81\xF6\xEC column=info:username, timestamp=2020-05-16T14:59:58.764Z, value=user1
----
Operators can use ColumnValueFilter to filter records based on region_name,
username, client_address etc.
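For instance, a sketch of such a filtered scan from the HBase shell; this
assumes `ColumnValueFilter` is accepted by the shell's filter-language parser
in your release (on versions where it is not, `SingleColumnValueFilter` takes
the same arguments and can be substituted):

[source]
----
scan 'hbase:slowlog', { FILTER =>
  "ColumnValueFilter('info', 'username', =, 'binary:user1')" }
----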
Time range based queries are also very useful.
Example:
[source]
----
scan 'hbase:slowlog', { TIMERANGE => [1589621394000, 1589637999999] }
----
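Filters and time ranges can also be combined in a single scan, narrowing the
result to one user's slow calls within a window. A sketch, under the same
assumption that the filter string is parseable by your shell version:

[source]
----
scan 'hbase:slowlog', { TIMERANGE => [1589621394000, 1589637999999],
  FILTER => "ColumnValueFilter('info', 'username', =, 'binary:user1')" }
----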