HBASE-25217 [Metrics] Add metrics for Call in IPC response queue#2585
HBASE-25217 [Metrics] Add metrics for Call in IPC response queue#2585Reidddddd wants to merge 4 commits intoapache:branch-1from
Conversation
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
bharathv
left a comment
There was a problem hiding this comment.
Curious whats the context for this change? Noticed something fishy there in the call queue?
Need some tests here?
| "were served from the tail of the queue"; | ||
| String NUM_CALL_RESPONSE_QUEUE_NAME = "numCallsInResponseQueue"; | ||
| String NUM_CALL_RESPONSE_QUEUE_DESC = "Number of calls in response queue."; | ||
| String NUM_SIZE_RESPONSE_QUEUE_NAME = "numSizeInResponseQueue"; |
There was a problem hiding this comment.
nit: SIZE_OF_RESPONSE_QUEUE and sizeOfResponseQueue ? num + size sounds confusing..
| String NUM_CALL_RESPONSE_QUEUE_NAME = "numCallsInResponseQueue"; | ||
| String NUM_CALL_RESPONSE_QUEUE_DESC = "Number of calls in response queue."; | ||
| String NUM_SIZE_RESPONSE_QUEUE_NAME = "numSizeInResponseQueue"; | ||
| String NUM_SIZE_RESPONSE_QUEUE_DESC = "Size in response queue."; |
| private final MutableFastCounter sentBytes; | ||
| private final MutableFastCounter receivedBytes; | ||
|
|
||
| private final MutableFastCounter numCallInResponseQueue; |
There was a problem hiding this comment.
I think MutableGaugeLong is the right type for this.. Counters are meant to be always increasing where as a guage can go up and down...
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
Outdated
Show resolved
Hide resolved
The main purpose is to gain better insight of the whole rpc process procedure. Currently, we have in request queue metrics, call processed metrics (queue time, wait time, total time etc), but we don't have metrics for calls in response queue which is the last part of the procedure. I have some clients in production who fond of doing big joins or big scans, so I suspect if any possible that responses would be accumulated in such scenarios, knowing that socket channel only processed 64KB per chunk. |
|
🎊 +1 overall
This message was automatically generated. |
Ack, thanks. One last question, lgtm otherwise We need any tests to assert some metrics state? (so that its not broken in the future inadvertently) |
hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
Outdated
Show resolved
Hide resolved
| metrics.removeCallFromResponseQueue(call.response.getRemaining()); | ||
| if (!processResponse(call)) { | ||
| connection.responseQueue.addFirst(call); | ||
| metrics.addCallToResponseQueue(call.response.getRemaining()); |
There was a problem hiding this comment.
Isn't the right way to do this something like..
before = call.response.getRemaining(); <=== 1
if (!processResponse(call)) {
after = call.response.getRemaining();
metrics.decr(after-before); <=== 2
} else {
metrics.decr(before);
}
The difference is you are not decrementing until the bytes are actually flushed, otherwise if a metrics poller polls between (1) and (2), there is some weird state in between, perhaps ^^ is more desirable for you.
|
🎊 +1 overall
This message was automatically generated. |
|
I applied it on my env. The response queue size got negative... |
| @@ -1159,6 +1159,8 @@ private long purge(long lastPurgeTime) { | |||
| } | |||
| Call call = connection.responseQueue.peekFirst(); | |||
There was a problem hiding this comment.
Oh, peekFirst doesn't remove element.
Is it possible that getRemaining returns negative? |
|
I don't know it on top of my head but mind adding a test for this :-), perhaps easier to debug with break points too if you can repro. |
No description provided.