-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-24418 Consolidate Normalizer implementations #1786
Conversation
I intend to backport this at least to branch-2.3. I think branch-2.2 needs some other patches before this would apply. PTAL, @saintstack @Apache9 @infraio @huaxiangsun @joshelser @mnpoonia @ddupg. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left few comments. Looks good overall with much better consolidation.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/RegionNormalizer.java
Outdated
Show resolved
Hide resolved
...rver/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Outdated
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table); | ||
List<RegionInfo> tableRegions = masterServices.getAssignmentManager() | ||
.getRegionStates() | ||
.getRegionsOfTable(table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We retrieve List<RegionInfo> tableRegions
again in computeSplitNormalizationPlans()
and computeMergeNormalizationPlans()
. It should be safe to pass this list to both methods and avoid calls to AM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, let me see if i can do something better here. I don't like how accessing these collection objects doesn't align with their use here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is all too messy for my liking. These computeSplitNormalizationPlans
and computeMergeNormalizationPlans
share a bunch of data in common, and both go rummaging around in the AM independently. TableName, Region list, RegionStates, computed average size, it could all be reused...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much do you guys care about this? The best solution i have right now is an inner class that does the lookups once, and holds state across methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inner class ends up being awkward too. Let me address the other feedback and return to this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pulled out the common state into an inner class; it's not so bad after all. PTAL.
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Outdated
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Generally, I think the default is too aggressive. Also think should purge the notion that there could be alternate normalizer impls... Save a bit on config and helps simplify.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Outdated
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Outdated
Show resolved
Hide resolved
masterServices.getAssignmentManager().getRegionStates().getRegionsOfTable(table); | ||
List<RegionInfo> tableRegions = masterServices.getAssignmentManager() | ||
.getRegionStates() | ||
.getRegionsOfTable(table); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on above
...server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitPlanFirstComparator.java
Outdated
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, nice change!
Only the test testNormalizerCannotMergeNonAdjacentRegions() needs to be fixed, as without the patch, it also passed for me.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Outdated
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Show resolved
Hide resolved
...e-server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SimpleRegionNormalizer.java
Outdated
Show resolved
Hide resolved
...server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitPlanFirstComparator.java
Outdated
Show resolved
Hide resolved
...rver/src/test/java/org/apache/hadoop/hbase/master/normalizer/TestSimpleRegionNormalizer.java
Show resolved
Hide resolved
|
@saintstack Looks liked we originally ran the normalizer on a 30min period. This was reduced to 5 min for 2.0, according to https://hbase.apache.org/book.html#upgrade2.0.changed.defaults |
Interesting. Wonder why the change was made. |
Looks like it was slipped in with https://issues.apache.org/jira/browse/HBASE-15065. There's some interesting comments as well from the original author in regard to the plan order, the move from a single global, optimized plan to the current list of actions ("plans"). |
171d1fc
to
fde0a68
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
+1 for the new diff, looks great! |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
+1 on the new udpate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops... old review I didn't push....
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
Outdated
Show resolved
Hide resolved
...server/src/main/java/org/apache/hadoop/hbase/master/normalizer/SplitPlanFirstComparator.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are follow ons here... but this is good to go I'd say.
* RegionMetrics)</li> | ||
* <li>For each region R0, if R0 is bigger than S * 2, it is kindly requested to split.</li> | ||
* <li>Otherwise, for the next region in the chain R1, if R0 + R1 is smaller then S, R0 and R1 | ||
* are kindly requested to merge.</li> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I said it already that this strikes me as too aggressive. In a follow-up we should make this sloppier? S * (2*) + (2*slop)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than slop, my recommendation is that we add the concept of a "minimum region size" for deciding when to split. I filed HBASE-24464 for that discussion.
public class SimpleRegionNormalizer extends AbstractRegionNormalizer { | ||
|
||
@InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.CONFIG) | ||
public class SimpleRegionNormalizer implements RegionNormalizer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we kept the Interface?
Simplify our Normalizer story to have just a single, configurable implementation. * fold the features of `MergeNormalizer` into `SimpleRegionNormalizer`, removing the intermediate abstract class. * configuration keys for merge-only features now share a common structure. * add configuration to selectively disable normalizer split/merge operations. * `RegionNormalizer` now extends `Configurable` instead of creating a new instance of `HBaseConfiguration` or snooping one off of other fields. * avoid the extra RPCs by using `MasterServices` instead of `MasterRpcServices`. * boost test coverage of all the various flags and feature combinations. Signed-off-by: Michael Stack <stack@apache.org> Signed-off-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: huaxiangsun <huaxiangsun@apache.org>
c21c8a7
to
8e2558a
Compare
Thanks for review everyone. Good discussion! |
Simplify our Normalizer story to have just a single, configurable
implementation.
MergeNormalizer
intoSimpleRegionNormalizer
, removing the intermediate abstract class.structure.
operations.
RegionNormalizer
now extendsConfigurable
instead of creating anew instance of
HBaseConfiguration
or snooping one off of otherfields.
MasterServices
instead ofMasterRpcServices
.combinations.