-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28564 Refactor direct interactions of Reference file creations to SFT interface #5939
Conversation
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
10cb001
to
2903369
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
5cb1523
to
f2be232
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
cdd72ad
to
83c29b0
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
83c29b0
to
504506d
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Oh yeah, thanks for pointing to the original design doc. Let me give another round, I just want to see if we can minimise adding references to SFT in too many places. |
@@ -626,7 +626,7 @@ private List<Path> mergeStoreFiles(MasterProcedureEnv env, HRegionFileSystem reg | |||
// to read the hfiles. | |||
storeFileInfo.setConf(storeConfiguration); | |||
Path refFile = mergeRegionFs.mergeStoreFile(regionFs.getRegionInfo(), family, | |||
new HStoreFile(storeFileInfo, hcd.getBloomFilterType(), CacheConfig.DISABLED)); | |||
new HStoreFile(storeFileInfo, hcd.getBloomFilterType(), CacheConfig.DISABLED), tracker); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in the future we would be moving mergeStoreFile to the SFT interface itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes in future if we have split/merge as well in SFT layer based on the impl (FILE, DEFAULT) we can commit all the ref/link files at once for FSFT based implementations
ColumnFamilyDescriptor[] families = tableDescriptor.getColumnFamilies(); | ||
boolean references = false; | ||
for (ColumnFamilyDescriptor cfd : families) { | ||
StoreFileTracker sft = StoreFileTrackerFactory.create(services.getConfiguration(), | ||
tableDescriptor, ColumnFamilyDescriptorBuilder.of(cfd.getNameAsString()), regionFs); | ||
references = references || sft.hasReferences(); | ||
if (references) { | ||
break; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also be moved to HRegionFileSystem.hasReferences(), like other methods did?
Additional question: Is it guaranteed that we don't have more than one SFT instance for the same store at any point in time? Because the file based SFT impl relies on the fact it's the single instance manipulating its meta files. If we have multiple instances of SFT for the same store, it could lead to inconsistencies, where one of the instances update the meta file, than the others would be looking at an outdated state of the meta files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also be moved to HRegionFileSystem.hasReferences(), like other methods did?
When we introduce virtual links to Link/Ref files using SFT, hasReferences() abstraction also should move to SFT layer and there is only one reference left of HRegionFileSystem.hasReferences() which will be eventually removed in HBASE-28861
Is it guaranteed that we don't have more than one SFT instance for the same store at any point in time? Because the file based SFT impl relies on the fact it's the single instance manipulating its meta files
This should only be true for WRITE mode of SFT instance right? there could be multiple instances of READ only mode SFT for the same store. I have created HBASE-28863 for introducing cache primarily for READ only mode
If we have multiple instances of SFT for the same store, it could lead to inconsistencies, where one of the instances update the meta file, than the others would be looking at an outdated state of the meta files.
Is this part not already handled based on timestamp? Or Flush and Compaction both are using SFT instance from StoreEngine due to that same reason?
Also looks like at couple of places I am using non READ only mode and I was only needing READ mode there, let me quickly reiterate on that, but I guess it should be good to have that check at StoreFileTrackerFactory layer as well. thanks for pointing that out @wchevreuil
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should only be true for WRITE mode of SFT instance right? there could be multiple instances of READ only mode SFT for the same store.
How would the "READ_ONLY" SFTs know that the "WRITER" had rotated the file? Specially if its on a different process, like here? Other master based processes such as split take it for granted since the region would be closed, but I don't think it's the case here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at the FileBasedStoreFileTracker
code. Whenever compaction or flush happens, we rotate the meta file as we call FileBasedStoreFileTracker.update
, and then we also update the map of store files within this FileBasedStoreFileTracker
instance. At this point, any other existing FileBasedStoreFileTracker
instance for this same store should reload the store files, otherwise the store files map on those instances would be outdated. For SFT instances on read replicas, that's not a problem because we signal these events in the wal, so the read replicas would know they need to do a reload in their SFT instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, any other existing FileBasedStoreFileTracker instance for this same store should reload the store files, otherwise the store files map on those instances would be outdated.
load of SFT every time does a backedfile.load() again right, at that time it should know the latest manifest and load that one again, it does not return the storefiles it cached know
@@ -227,8 +228,8 @@ public long getMaxMemStoreTS() { | |||
* @param primaryReplica true if this is a store file for primary replica, otherwise false. | |||
*/ | |||
public HStoreFile(FileSystem fs, Path p, Configuration conf, CacheConfig cacheConf, | |||
BloomType cfBloomType, boolean primaryReplica) throws IOException { | |||
this(new StoreFileInfo(conf, fs, p, primaryReplica), cfBloomType, cacheConf); | |||
BloomType cfBloomType, boolean primaryReplica, StoreFileTracker sft) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than adding this extra 'sft' parameter, and change every single caller in the upper chain to now create/forward an SFT instance down here, can we simply create the SFT instance in this HStoreFile constructor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently HStoreFile is tied to Hadoop FileSystem obj, it will be more inline to send in the StoreFileTracker instance instead know?
Also then we need to send in the entire content of StoreContext to HStoreFile constructor if we want to create StoreFileTracker instance inside HStoreFile constructor.
Infact SFT instance currently doesnt have api for accessing its storeContext outside, otherwise we can replace FS object in constructor with SFT object directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about building a StoreContext and passing that to the HStoreFile constructor?
public HStoreFile(StoreFileContext context) {
...
}
This could be a follow up refactor that cleans up this and more.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we -- necessarily! -- indirect all actions that touch the region filesystem through the store file tracker we will cause it to do more work. We should have a metric for the number of times the FileBasedStoreFileTracker reads the "backedfile" and another metric for the number of times the FileBasedStoreFileTracker writes the "backedfile" , so we can compare the impact of these changes and also optimize where possible those occurrences to a minimum.
Let's do this as a followup. @gvprathyusha6
In fact if we can make that change and commit it before this one, we can track the impact of these changes. Consider it a nice to have such information for evaluating next steps, not a prerequisite for commit. We should have these changes anyway.
Also it appears there are some minor conflicts now to be resolved.
Agreed, having these metrics would be great to understand the impact, created HBASE-28907 |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
b95494f
to
6e7ff98
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Any concerns about merging this? I plan to do so this first thing tomorrow, so about 20 hours from now. |
…to SFT interface (#5939) Signed-off-by: Andrew Purtell <apurtell@apache.org>
…to SFT interface (#5939) Signed-off-by: Andrew Purtell <apurtell@apache.org> Conflicts: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java hbase-server/src/main/java/org/apache/hadoop/hbase/util/ServerRegionReplicaUtil.java hbase-server/src/test/java/org/apache/hadoop/hbase/master/janitor/TestCatalogJanitor.java hbase-server/src/test/java/org/apache/hadoop/hbase/mob/TestMobFileCache.java hbase-server/src/test/java/org/apache/hadoop/hbase/mob/TestMobStoreCompaction.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockHStoreFile.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionFileSystem.java hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStoreFile.java
HBASE-27826 introduces tracking of Link/Reference files with StoreFileTracker interface and with that, we can have the FileBasedStoreFileTracker implementation to track Link/Reference files only using .filelist file itself and not create actual link/ref files (virtual links).
To support the same, we need to refactor all the direct interactions of Ref/Link file creations to SFT layer and this changes starts off with moving the current Reference file operations/logic to StoreFileTrackerBase.
Post that we should be able to override/evolve the representation to handle Virtual Links.
Change detail
Modifies HStoreFile constructors to take SFT instance as a parameter.
Add apis in SFT to create StoreFileInfo objects and use that everywhere
Removes HRegionFileSystem.getStoreFiles(final String familyName, final boolean validate) and helper methods in Snapshot apis itself.
Adds hasReferences api in SFT, the idea is to use SFT.hasReferences() instead of HRegionFileSystem.hasReferences(), we haven't moved all file listing to SFT yet and will be handled in a different JIRA.
Adds a helper method to create StoreFileInfo for a HFile which is not Reference/HFileLink used for uts.
StoreFile.jsp fails to print data for Reference/Link files, this needs a minor refactor, which can be accommodated in the same PR or as part of another jira