-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-27667 Normalizer can skip picking presplit regions while prepar… #5170
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@@ -246,23 +251,35 @@ public List<NormalizationPlan> computePlansForTable(final TableDescriptor tableD | |||
|
|||
/** Returns size of region in MB and if region is not found than -1 */ | |||
private long getRegionSizeMB(RegionInfo hri) { | |||
RegionMetrics regionLoad = getRegionMetrics(hri); | |||
if (regionLoad == null) return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will generate a checkstyle warning? Please use {} to wrap the block.
|
||
private long getStoreFileCount(RegionInfo hri) { | ||
RegionMetrics regionLoad = getRegionMetrics(hri); | ||
if (regionLoad == null) return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
@@ -474,6 +494,11 @@ private boolean isLargeEnoughForMerge(final NormalizerConfiguration normalizerCo | |||
return getRegionSizeMB(regionInfo) >= normalizerConfiguration.getMergeMinRegionSizeMb(ctx); | |||
} | |||
|
|||
private boolean isPresplitEmptyRegion(final NormalizerConfiguration normalizerConfiguration, | |||
final NormalizeContext ctx, final RegionInfo regionInfo) { | |||
return getStoreFileCount(regionInfo) == 0 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why getStoreFileCount == 0 means this is a pre split region?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why getStoreFileCount == 0 means this is a pre split region?
Thanks for review @Apache9. When the region empty means there are three scenarios 1) pre split region where there are no storefiles and region size is 0(in the normalization we are not considering memstore data or live updates so considering store files count 0 as one of the option to detect presplit region) 2) TTL expired for region then major compaction cleanup but create zero size store files where count is more than 0 but region size is 0 3) All the region data is deleted in this case also major compaction creates empty store files but region size is 0.
Will clear other checkstyle and spotless checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this single empty store file behaviour is a coincidence, not something that we want to design a functionality around.
Original implementation of normaliser had a long discussion about how to handle pre-split regions. As I recall, the conclusion was to use the minimum age setting as a means for an operator to preserve an empty, pre-split table. I don't particularly like this approach either, but at least it means that any forgotten empty regions will eventually converge on a more correct state.
If you want to pursue some other means, I prefer it involve explicit metadata instead of more implicit assumptions (especially about implementation details). Sorry @chrajeshbabu , but I strongly disapprove of this approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ndimiduk agree with you. Will close this PR as of now as there is no other means to detect the presplit regions from meta data and reopen in case of any best option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely there's a better approach?
…ing merge plans