Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBASE-28564 Refactor direct interactions of Reference file creations to SFT interface #5939

Merged
merged 1 commit into from
Nov 1, 2024

Conversation

gvprathyusha6
Copy link
Contributor

@gvprathyusha6 gvprathyusha6 commented May 23, 2024

HBASE-27826 introduces tracking of Link/Reference files with StoreFileTracker interface and with that, we can have the FileBasedStoreFileTracker implementation to track Link/Reference files only using .filelist file itself and not create actual link/ref files (virtual links).

To support the same, we need to refactor all the direct interactions of Ref/Link file creations to SFT layer and this changes starts off with moving the current Reference file operations/logic to StoreFileTrackerBase.
Post that we should be able to override/evolve the representation to handle Virtual Links.

Change detail

  1. Modifies HStoreFile constructors to take SFT instance as a parameter.

  2. Add apis in SFT to create StoreFileInfo objects and use that everywhere

  3. Removes HRegionFileSystem.getStoreFiles(final String familyName, final boolean validate) and helper methods in Snapshot apis itself.

  4. Adds hasReferences api in SFT, the idea is to use SFT.hasReferences() instead of HRegionFileSystem.hasReferences(), we haven't moved all file listing to SFT yet and will be handled in a different JIRA.

  5. Adds a helper method to create StoreFileInfo for a HFile which is not Reference/HFileLink used for uts.

  6. StoreFile.jsp fails to print data for Reference/Link files, this needs a minor refactor, which can be accommodated in the same PR or as part of another jira

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@gvprathyusha6 gvprathyusha6 force-pushed the HBASE-28564 branch 4 times, most recently from 10cb001 to 2903369 Compare June 19, 2024 17:16
@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@gvprathyusha6 gvprathyusha6 marked this pull request as ready for review August 7, 2024 09:00
@wchevreuil
Copy link
Contributor

Virtual links for Reference/Link files

Oh yeah, thanks for pointing to the original design doc. Let me give another round, I just want to see if we can minimise adding references to SFT in too many places.

@@ -626,7 +626,7 @@ private List<Path> mergeStoreFiles(MasterProcedureEnv env, HRegionFileSystem reg
// to read the hfiles.
storeFileInfo.setConf(storeConfiguration);
Path refFile = mergeRegionFs.mergeStoreFile(regionFs.getRegionInfo(), family,
new HStoreFile(storeFileInfo, hcd.getBloomFilterType(), CacheConfig.DISABLED));
new HStoreFile(storeFileInfo, hcd.getBloomFilterType(), CacheConfig.DISABLED), tracker);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess in the future we would be moving mergeStoreFile to the SFT interface itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes in future if we have split/merge as well in SFT layer based on the impl (FILE, DEFAULT) we can commit all the ref/link files at once for FSFT based implementations

Comment on lines +429 to +438
ColumnFamilyDescriptor[] families = tableDescriptor.getColumnFamilies();
boolean references = false;
for (ColumnFamilyDescriptor cfd : families) {
StoreFileTracker sft = StoreFileTrackerFactory.create(services.getConfiguration(),
tableDescriptor, ColumnFamilyDescriptorBuilder.of(cfd.getNameAsString()), regionFs);
references = references || sft.hasReferences();
if (references) {
break;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also be moved to HRegionFileSystem.hasReferences(), like other methods did?

Additional question: Is it guaranteed that we don't have more than one SFT instance for the same store at any point in time? Because the file based SFT impl relies on the fact it's the single instance manipulating its meta files. If we have multiple instances of SFT for the same store, it could lead to inconsistencies, where one of the instances update the meta file, than the others would be looking at an outdated state of the meta files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also be moved to HRegionFileSystem.hasReferences(), like other methods did?

When we introduce virtual links to Link/Ref files using SFT, hasReferences() abstraction also should move to SFT layer and there is only one reference left of HRegionFileSystem.hasReferences() which will be eventually removed in HBASE-28861

Is it guaranteed that we don't have more than one SFT instance for the same store at any point in time? Because the file based SFT impl relies on the fact it's the single instance manipulating its meta files

This should only be true for WRITE mode of SFT instance right? there could be multiple instances of READ only mode SFT for the same store. I have created HBASE-28863 for introducing cache primarily for READ only mode

If we have multiple instances of SFT for the same store, it could lead to inconsistencies, where one of the instances update the meta file, than the others would be looking at an outdated state of the meta files.

Is this part not already handled based on timestamp? Or Flush and Compaction both are using SFT instance from StoreEngine due to that same reason?
Also looks like at couple of places I am using non READ only mode and I was only needing READ mode there, let me quickly reiterate on that, but I guess it should be good to have that check at StoreFileTrackerFactory layer as well. thanks for pointing that out @wchevreuil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be true for WRITE mode of SFT instance right? there could be multiple instances of READ only mode SFT for the same store.

How would the "READ_ONLY" SFTs know that the "WRITER" had rotated the file? Specially if its on a different process, like here? Other master based processes such as split take it for granted since the region would be closed, but I don't think it's the case here.

Copy link
Contributor

@wchevreuil wchevreuil Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at the FileBasedStoreFileTracker code. Whenever compaction or flush happens, we rotate the meta file as we call FileBasedStoreFileTracker.update, and then we also update the map of store files within this FileBasedStoreFileTracker instance. At this point, any other existing FileBasedStoreFileTracker instance for this same store should reload the store files, otherwise the store files map on those instances would be outdated. For SFT instances on read replicas, that's not a problem because we signal these events in the wal, so the read replicas would know they need to do a reload in their SFT instances.

Copy link
Contributor Author

@gvprathyusha6 gvprathyusha6 Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, any other existing FileBasedStoreFileTracker instance for this same store should reload the store files, otherwise the store files map on those instances would be outdated.

load of SFT every time does a backedfile.load() again right, at that time it should know the latest manifest and load that one again, it does not return the storefiles it cached know

@@ -227,8 +228,8 @@ public long getMaxMemStoreTS() {
* @param primaryReplica true if this is a store file for primary replica, otherwise false.
*/
public HStoreFile(FileSystem fs, Path p, Configuration conf, CacheConfig cacheConf,
BloomType cfBloomType, boolean primaryReplica) throws IOException {
this(new StoreFileInfo(conf, fs, p, primaryReplica), cfBloomType, cacheConf);
BloomType cfBloomType, boolean primaryReplica, StoreFileTracker sft) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than adding this extra 'sft' parameter, and change every single caller in the upper chain to now create/forward an SFT instance down here, can we simply create the SFT instance in this HStoreFile constructor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently HStoreFile is tied to Hadoop FileSystem obj, it will be more inline to send in the StoreFileTracker instance instead know?
Also then we need to send in the entire content of StoreContext to HStoreFile constructor if we want to create StoreFileTracker instance inside HStoreFile constructor.
Infact SFT instance currently doesnt have api for accessing its storeContext outside, otherwise we can replace FS object in constructor with SFT object directly

Copy link
Contributor

@apurtell apurtell Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about building a StoreContext and passing that to the HStoreFile constructor?

 public HStoreFile(StoreFileContext context) {
     ...
 }

This could be a follow up refactor that cleans up this and more.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

Copy link
Contributor

@apurtell apurtell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we -- necessarily! -- indirect all actions that touch the region filesystem through the store file tracker we will cause it to do more work. We should have a metric for the number of times the FileBasedStoreFileTracker reads the "backedfile" and another metric for the number of times the FileBasedStoreFileTracker writes the "backedfile" , so we can compare the impact of these changes and also optimize where possible those occurrences to a minimum.

Let's do this as a followup. @gvprathyusha6
In fact if we can make that change and commit it before this one, we can track the impact of these changes. Consider it a nice to have such information for evaluating next steps, not a prerequisite for commit. We should have these changes anyway.

Also it appears there are some minor conflicts now to be resolved.

@gvprathyusha6
Copy link
Contributor Author

As we -- necessarily! -- indirect all actions that touch the region filesystem through the store file tracker we will cause it to do more work. We should have a metric for the number of times the FileBasedStoreFileTracker reads the "backedfile" and another metric for the number of times the FileBasedStoreFileTracker writes the "backedfile" , so we can compare the impact of these changes and also optimize where possible those occurrences to a minimum.

Let's do this as a followup. @gvprathyusha6 In fact if we can make that change and commit it before this one, we can track the impact of these changes. Consider it a nice to have such information for evaluating next steps, not a prerequisite for commit. We should have these changes anyway.

Also it appears there are some minor conflicts now to be resolved.

Agreed, having these metrics would be great to understand the impact, created HBASE-28907

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase

This comment has been minimized.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 58s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
_ master Compile Tests _
+1 💚 mvninstall 4m 45s master passed
+1 💚 compile 5m 3s master passed
+1 💚 checkstyle 0m 48s master passed
+1 💚 spotbugs 2m 44s master passed
+1 💚 spotless 1m 18s branch has no errors when running spotless:check.
_ Patch Compile Tests _
+1 💚 mvninstall 4m 45s the patch passed
+1 💚 compile 5m 9s the patch passed
+1 💚 javac 5m 8s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 1m 3s /results-checkstyle-hbase-server.txt hbase-server: The patch generated 1 new + 130 unchanged - 1 fixed = 131 total (was 131)
+1 💚 spotbugs 2m 40s the patch passed
+1 💚 hadoopcheck 15m 38s Patch does not cause any errors with Hadoop 3.3.6 3.4.0.
+1 💚 spotless 1m 14s patch has no errors when running spotless:check.
_ Other Tests _
+1 💚 asflicense 0m 23s The patch does not generate ASF License warnings.
56m 34s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5939/29/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #5939
Optional Tests dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname Linux b747ad450d68 5.4.0-195-generic #215-Ubuntu SMP Fri Aug 2 18:28:05 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 6e7ff98
Default Java Eclipse Adoptium-17.0.11+9
Max. process+thread count 83 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5939/29/console
versions git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 16s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 3m 0s master passed
+1 💚 compile 0m 57s master passed
+1 💚 javadoc 0m 30s master passed
+1 💚 shadedjars 5m 10s branch has no errors when building our shaded downstream artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 2m 53s the patch passed
+1 💚 compile 0m 56s the patch passed
+1 💚 javac 0m 56s the patch passed
-0 ⚠️ javadoc 0m 29s /results-javadoc-javadoc-hbase-server.txt hbase-server generated 6 new + 55 unchanged - 0 fixed = 61 total (was 55)
+1 💚 shadedjars 5m 9s patch has no errors when building our shaded downstream artifacts.
_ Other Tests _
+1 💚 unit 215m 50s hbase-server in the patch passed.
240m 11s
Subsystem Report/Notes
Docker ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5939/29/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR #5939
Optional Tests javac javadoc unit compile shadedjars
uname Linux cc991ae9f973 5.4.0-192-generic #212-Ubuntu SMP Fri Jul 5 09:47:39 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 6e7ff98
Default Java Eclipse Adoptium-17.0.11+9
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5939/29/testReport/
Max. process+thread count 5352 (vs. ulimit of 30000)
modules C: hbase-server U: hbase-server
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5939/29/console
versions git=2.34.1 maven=3.9.8
Powered by Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

@apurtell
Copy link
Contributor

Any concerns about merging this? I plan to do so this first thing tomorrow, so about 20 hours from now.

@apurtell apurtell merged commit 70f2fef into apache:master Nov 1, 2024
1 check passed
asfgit pushed a commit that referenced this pull request Nov 2, 2024
…to SFT interface (#5939)

Signed-off-by: Andrew Purtell <apurtell@apache.org>
asfgit pushed a commit that referenced this pull request Nov 2, 2024
…to SFT interface (#5939)

Signed-off-by: Andrew Purtell <apurtell@apache.org>

Conflicts:
	hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
	hbase-server/src/main/java/org/apache/hadoop/hbase/util/ServerRegionReplicaUtil.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/master/janitor/TestCatalogJanitor.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/mob/TestMobFileCache.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/mob/TestMobStoreCompaction.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MockHStoreFile.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegionFileSystem.java
	hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHStoreFile.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants