HBASE-27048 Server side scanner time limit should account for time in queue#4562
HBASE-27048 Server side scanner time limit should account for time in queue#4562bbeaudreault merged 3 commits intoapache:masterfrom
Conversation
|
🎊 +1 overall
This message was automatically generated. |
0a49f6d to
2be4356
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
| long now = EnvironmentEdgeManager.currentTime(); | ||
| // if we can, subtract time already spent in rpc queue. otherwise we may still timeout | ||
| if (rpcCall != null) { | ||
| timeLimitDelta -= (now - rpcCall.getReceiveTime()); |
There was a problem hiding this comment.
Later we will use timeLimitDelta / 2, so I do not think we should subtract the queue time here. It does not make sense to count the remaining time and then divided by 2...
There was a problem hiding this comment.
Thanks for taking a look!
Why does that not make sense? The goal of the time limit is to return before the timeout. Dividing by 2 is not enough here. One could have a timeout of 5s and then be in the queue of 4s. In that case 5 / 2 = 2.5, but there's only 1s left.
Just subtracting the queue time alone is not enough either. In this case, 5-4 is 1s, but with clock drift or other things there could only be 900ms left. So we could still timeout.
So I think subtracting queue time and then dividing by 2 makes sense. But if you think that is too much we could divide by a smaller factor instead. We really just need some buffer, in addition to counting queue time.
There was a problem hiding this comment.
I added this PR because we've specifically seen cases where the scanner still timeouts, even with dividing by 2. This only happens when server is overloaded with high queue times, but it exacerbated load issues and we should try to respond even under heavy load.
It's especially bad if it happens on openScanner because the client will retry and the scanner will remain open. We've seen thousands of leaked scanners from this and server slows to crawl. I plan to push a separate PR later to try to close timed out scanners. But I think first we should try to account for queue times in time limit so we don't timeout at all if possible.
There was a problem hiding this comment.
I do not mean we should not count queue time, I just mean, we should not subtract it here. The queue is related to rpcTimeout, not related to scannerLeaseTimeoutPeriod right? So for me I prefer we substract it earlier, when calculate the rpcTimeout or callTimeout.
There was a problem hiding this comment.
Oh I see. I will try moving it up a few lines so it just applies to callTimeout or rpcTimeout. Thanks for clarifying.
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
I need to fix the test. But let me know if this is what you had in mind and I can get it wrapped up |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
apurtell
left a comment
There was a problem hiding this comment.
Approved with a question.
| * Default value of {@link RSRpcServices#REGION_SERVER_RPC_MINIMUM_SCAN_TIME_LIMIT_DELTA} | ||
| */ | ||
| private static final long DEFAULT_REGION_SERVER_RPC_MINIMUM_SCAN_TIME_LIMIT_DELTA = 10; | ||
| static final long DEFAULT_REGION_SERVER_RPC_MINIMUM_SCAN_TIME_LIMIT_DELTA = 10; |
There was a problem hiding this comment.
Is the configuration variable associated with this constant ever changed? It is "hbase.region.server.rpc.minimum.scan.time.limit.delta" defined in RsRpcServices. @bbeaudreault @Apache9
Do we need it? Seems like one of those knobs that nobody will ever turn.
There was a problem hiding this comment.
I agree. Only thing i could think of is maybe for like an s3-backed cluster where you know latency will be higher? 10ms may be too small and still timeout before starting. But that's really grasping.
I'm happy to remove if that's what we want. @Apache9
… queue (#4562) Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org>
… queue (#4562) Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org>
… queue (apache#4562) Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org>
… queue (#4562) Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org>
… queue (apache#4562) Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org>
… queue (apache#4562) Signed-off-by: Duo Zhang <zhangduo@apache.org> Signed-off-by: Andrew Purtell <apurtell@apache.org> (cherry picked from commit 87f2281) Change-Id: I74e34b4364f0d6d3535da7b8d08fe17d21894db0
No description provided.