-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-29168 Add configurable throttling of region moves in CacheAwareLoadBalancer. #6763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…LoadBalancer. Change-Id: I5af00b9e661c5e0a38ab5c346bd49349d8d01a74
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Change-Id: I4039f703c34e12cc43d5e728e65da8de0a89ab2a
This comment has been minimized.
This comment has been minimized.
Change-Id: I5353a787667dc7a3d38d1a9a2b60ce5ef58bc4a6
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| public class CacheAwareLoadBalancer extends StochasticLoadBalancer { | ||
| private static final Logger LOG = LoggerFactory.getLogger(CacheAwareLoadBalancer.class); | ||
|
|
||
| public static final String CACHE_RATIO_THRESHOLD = "hbase.master.balancer.stochastic.cacheRatio"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this ratio is only used for throttling, then I wonder whether we should name it more specifically
|
|
||
| public static final String CACHE_RATIO_THRESHOLD = "hbase.master.balancer.stochastic.cacheRatio"; | ||
|
|
||
| public static final String MOVE_THROTTLING = "hbase.master.balancer.move.throttlingMs"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If throttle support is only added to the cache aware balancer, then maybe this configuration name should also be specific to the cache aware balancer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Right now, it's only implemented by the cache aware balancer. But since it may be implemented by other possible balancer implementations in the future, I thought may be worth move these two to the LoadBalancer interface.
|
|
||
| public static final String MOVE_THROTTLING = "hbase.master.balancer.move.throttlingMs"; | ||
|
|
||
| public static final long MOVE_THROTTLING_DEFAULT = 60 * 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the unit is ambiguous. We should either use a more intentional type like Duration, or we should include the unit in the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decided to go with using the Duration suggestion. This is now moved to the LoadBalancer interface (see my reply in other comment above).
| float ratioThreshold = | ||
| this.configuration.getFloat(CACHE_RATIO_THRESHOLD, CACHE_RATIO_THRESHOLD_DEFAULT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a pretty hot path on a bigger, imbalanced cluster. So I wonder if we should move this conf lookup to somewhere less hot?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this and the sleepTime to the loadConf() method
Change-Id: Ie2882a3714775cd883ac2895785c3ace0977f3a9
|
Thanks for your suggestions, @rmdmattingly ! I had applied them, please review it again. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Latest UT failure isn't related. I believe it's a flakey and had submitted a separate fix for that at HBASE-29183. |
|
re-test |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
taklwu
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| conf.set("hbase.bucketcache.persistent.path", testDir + "/bucket.persistence"); | ||
| conf.setLong(CacheConfig.BUCKETCACHE_PERSIST_INTERVAL_KEY, 100); | ||
| conf.setBoolean(CacheConfig.EVICT_BLOCKS_ON_CLOSE_KEY, true); | ||
| conf.setBoolean(CACHE_BLOCKS_ON_WRITE_KEY, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] why do we need to add the cache blocks on write here? does anyone change the test behavior? I cannot find any other edits in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is not related to this throttling change, but I found out this test was flakey, with the assertion from line 104 failing sometimes. Most of the times, it will pass, as once we flush the newly added data on line 95, we'll eventually open the newly created file and run a prefetch for it. But this happens in the background, and sometimes, the prefetch might not have cached any blocks when we get to line 104 of this test. So, if we want to make sure blocks are cached at that point, we should have blockOnWrite set to true.
|
|
||
| public static final String CACHE_RATIO_THRESHOLD = | ||
| "hbase.master.balancer.stochastic.throttling.cacheRatio"; | ||
| public static final float CACHE_RATIO_THRESHOLD_DEFAULT = 0.8f; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] do you think we need a doc update about this feature ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I'll add it to the cache aware balancer doc section. Let me do it on a separate jira once this get merged.
…LoadBalancer. (#6763) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
…LoadBalancer. (apache#6763) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org> Change-Id: I1c9df00f1723246685771f69930b1e5f89db72ff
…LoadBalancer. (apache#6763) (apache#6865) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
No description provided.