HBASE-25395 Introduce PersistedStoreEngine and PersistedStoreFileManager#2931
HBASE-25395 Introduce PersistedStoreEngine and PersistedStoreFileManager#2931taklwu merged 15 commits intoapache:HBASE-24749from
Conversation
- first commit and define interface StoreFilePathAccessor for reading and persisting path tracking data of StoreFileManager into hbase:storefile
- refactor and add loadInitialFiles to the base interface, such each store file manager implementation could have their own way to discovery a list of available files instead of just calling fs.getStoreFiles() - add store instance to each store file manager implementation
Fork the in-memory object in store file manager to a off-memory structure that is going to be used when region reassign, (re)open, cluster restart and etc. This improvement aims for better object store (e.g. S3) support without calling LIST to get storefiles from a given column family. * Details * Create PersistedStoreFileManager by extending DefaultStoreFileManager * Use PersistedStoreFileManager expect master region, hbase:meta, and hbase:storefile * if tracking information isn't available, always refresh the file storage for version upgrade and migration * add PersistedStoreEngine and integrate with HMaster when initializing
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
@joshelser this patch is ready for review, and if you have any person in your mind could review it, please tag them as well, thanks a lot. |
joshelser
left a comment
There was a problem hiding this comment.
I need to go through the unit tests, but I wanted to get my initial review published.
Good stuff!
| long timeout = masterServices.getConfiguration() | ||
| .getLong(HConstants.STOREFILE_TRACKING_INIT_TIMEOUT, | ||
| HConstants.DEFAULT_STOREFILE_TRACKING_INIT_TIMEOUT); | ||
| while (!isStoreFileTableAssignedAndEnabled(masterServices)) { |
There was a problem hiding this comment.
I'm having nightmares about the hbase:namespace table not getting assigned and getting into cycles where the HBase master would just crashloop when the table wasn't getting assigned.
Not something that needs to be handled right now, but we should make sure we think about this later (rather than reintroduce old problems)
There was a problem hiding this comment.
agreed. this is an optional feature but if this is enabled, the order will be master-region > hbase:meta -> hbase:storefile > any other tables, so the best way is to merge into hbase:meta (will be done in HBASE-25397 ).
the only blocker/concern is if hbase:meta will be splittable by the time we hit that task.
| void initialize(final MasterServices masterServices) throws IOException; | ||
|
|
||
| /** | ||
| * GET storefile paths from the 'included' data set |
| final String storeName) throws IOException; | ||
|
|
||
| /** | ||
| * Writes the specified updates to the tracking |
There was a problem hiding this comment.
What is a "write" in this context? Defining what the total set of files for this Region+Store is supposed to be? Not clear from javadoc.
storeFilePathUpdateargument javadoc is also a little vague (i.e. what's an "update"?)
There was a problem hiding this comment.
In my understanding StoreFilePathUpdate represents a single entity which should be persisted by the Accessor. This could happen is some cases like results of a compaction, flush, recovered HFile, etc.
Essentially list of Paths decorated with hasStoreFilesUpdate() property which I don't fully understand what it's good for.
There was a problem hiding this comment.
thanks @anmolnar and you're right on the entity of StoreFilePathUpdate, I will adjust the comments.
also I will remove the hasStoreFilesUpdate (some old code left behind)
| * Returns the families being tracked in the storefile tracking data for the given | ||
| * table/region |
There was a problem hiding this comment.
For user-regions (not meta), is there a reason that we'd only want a subset of Stores to be using tracking data?
There was a problem hiding this comment.
mmm, we want all stores of a region server to be tracked e.g. when supporting snapshot and restoring region from the tracking data, we need to get it per family , maybe just this comment is confusing , I will update it and said this will be used by snapshot feature.
for (Path familyDir: FSUtils.getFamilyDirs(fs, regionDir)) {
There was a problem hiding this comment.
at the end, I removed it and will add it back when supporting snapshot (restore snapshot) feature
| * the row key. | ||
| * @throws IOException if a remote or network exception occurs during delete | ||
| */ | ||
| void deleteStoreFilePaths(final String tableName, final String regionName, final String storeName) |
There was a problem hiding this comment.
How does this method relate to deleteRegion? Would this be used for a truncate like operation?
There was a problem hiding this comment.
I will remove this function first in this PR, but this will be used by snapshot restoreSnapshot when we remove regions in regionsToRemove and tracking data needs to be updated
| } | ||
|
|
||
| @Override | ||
| public void addCompactionResults(Collection<HStoreFile> newCompactedfiles, |
There was a problem hiding this comment.
fixed and refactored to use a abstract base class
| return storeFileComparator; | ||
| } | ||
|
|
||
| void setStorefiles(ImmutableList<HStoreFile> storefiles) { |
There was a problem hiding this comment.
refactor to use a abstract base class
| this.storefiles = storefiles; | ||
| } | ||
|
|
||
| void setCompactedfiles(ImmutableList<HStoreFile> compactedfiles) { |
There was a problem hiding this comment.
I'd challenge you to see if there's a better API we could add to DefaultStoreFileManager than this. It's a little brittle in that PersistedStoreFileManager has to be knowing when to call these setters. Maybe there's API in which there can be one method in this class which computes and sets the compactedFiles and "current" files which both this implementation and the new implementation can leverage?
There was a problem hiding this comment.
Hm. My previous review has been lost in Github.
I would rather create an abstract base class for DefaultStoreFileManager and PersistedStoreFileManager and put everything which is common in there.
There was a problem hiding this comment.
abstract base class is good idea, I will give a try as a last thing (another commit at the end) as that will take longer.
| String message = "Failed to update Path list of " + tableName + "-" + regionName + | ||
| "-" + storeName + ", on " + TableName.STOREFILE_STR + ". The new files are not " | ||
| + "persistent and will be removed from " + regionName + "," + storeName + | ||
| ".\nFailed update: " + storeFilePathUpdate; | ||
| LOG.warn(message); |
There was a problem hiding this comment.
String.format or the parameterized log message, please.
There was a problem hiding this comment.
:( sure (FYI we don't allow to use String.format because of performance concern)
| protected static String REGION_NAME = UUID.randomUUID().toString().replaceAll("-", ""); | ||
| protected static String STORE_NAME = UUID.randomUUID().toString(); | ||
| protected static List<Path> EMPTY_PATH = Collections.emptyList(); | ||
| protected static List<Path> INCLUDE_EXAMPLE_PATH = | ||
| Lists.newArrayList(new Path("hdfs://foo/bar1"), new Path("hdfs://foo/bar2")); |
| // | ||
| // this should be rarely happened on a real cluster and RS with meta table should be the last | ||
| // to be shutdown normally. | ||
| TEST_UTIL.killMiniHBaseCluster(); |
There was a problem hiding this comment.
I think I understand the reasoning. Is that not the case in the other mini-cluster test TestPersistedStoreFileManager where you just shutdown the cluster without killing it?
There was a problem hiding this comment.
sorry, we can remove this now.
it was something old from my local branch that tried to write one last record for compactedFiles to hbase:storefile table, but we have simplified the logic in this commit that only storefiles are being tracked. The test here has already flushed and should not have any new update to the table hbase:storefile when closing.
if we merge into hbase:meta (will be done in HBASE-25397) , this will definitely not a issue anymore as well.
| * @param masterServices {@link MasterServices} | ||
| * @throws IOException if failures | ||
| */ | ||
| private static void cleanup(MasterServices masterServices) throws IOException { |
There was a problem hiding this comment.
According to IntelliJ this method is not used.
| * @param masterServices instance of HMaster | ||
| * @throws IOException if Master is not running or connection has been lost | ||
| */ | ||
| void initialize(final MasterServices masterServices) throws IOException; |
There was a problem hiding this comment.
Is this method only used in tests?
There was a problem hiding this comment.
yes, but after rethinking about along with StorefileTrackingUtils.init(masterService); we can remove it.
| return this; | ||
| } | ||
|
|
||
| Builder withStorePaths(List<Path> storeFiles) { |
There was a problem hiding this comment.
fixed with using @RestrictedApi
| final String storeName) throws IOException; | ||
|
|
||
| /** | ||
| * Writes the specified updates to the tracking |
There was a problem hiding this comment.
In my understanding StoreFilePathUpdate represents a single entity which should be persisted by the Accessor. This could happen is some cases like results of a compaction, flush, recovered HFile, etc.
Essentially list of Paths decorated with hasStoreFilesUpdate() property which I don't fully understand what it's good for.
…tedApi when for test only functions
…edStoreFileManager
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
taklwu
left a comment
There was a problem hiding this comment.
should have addressed all the comments, thanks @joshelser and @anmolnar
| List<Path> paths = new ArrayList<>(); | ||
| if (data != null && data.length != 0) { | ||
| String pathString = Bytes.toString(data); | ||
| String[] pathStrings = pathString.split(LIST_SEPARATOR); |
There was a problem hiding this comment.
fixed and used StringUtils.split(pathString, LIST_SEPARATOR)
| new Get(Bytes.toBytes(getKey(tableName, regionName, storeName))); | ||
| get.addColumn(colFamilyBytes, STOREFILE_QUALIFIER); | ||
| Result result = doGet(get); | ||
| if (result == null || result.isEmpty()) { |
There was a problem hiding this comment.
from the API java doc it should be non-null and result.isEmpty() should be the indicator for no result for a given GET, I will remove the null check.
| if (connection == null) { | ||
| throw new IOException("Connection should be provided by region server " | ||
| + "and should not be null after initialized."); | ||
| } |
There was a problem hiding this comment.
fixed and removed this getConnection() function after making connection final
|
|
||
| import java.util.List; | ||
| import org.apache.commons.lang3.builder.EqualsBuilder; | ||
| import org.apache.commons.lang3.builder.HashCodeBuilder; |
There was a problem hiding this comment.
I don't think so, and I confirmed by searching the code base and the shading setting.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
I checked locally with the jdk8 environment and it passed. I checked the code of the unit test
|
|
@taklwu I'm checking your commit which introduced the abstract class Revisiting my idea about the abstract class I think we don't need that. Let's revert to your original DefaultStoreFileManager -> PersistedStoreFileManager inheritence with the following modifications:
|
…oreFileManager and DefaultStoreFileManager
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
anmolnar
left a comment
There was a problem hiding this comment.
Thanks for the changes. Lgtm.
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
thanks @anmolnar , @joshelser do you have any other comments? |
z-york
left a comment
There was a problem hiding this comment.
Minor comments, approved.
| /** | ||
| * Configuration for storefile tracking feature | ||
| */ | ||
| public static final String STOREFILE_TRACKING_PERSIST_ENABLED = | ||
| "hbase.storefile.tracking.persist.enabled"; | ||
| public static final boolean DEFAULT_STOREFILE_TRACKING_PERSIST_ENABLED = false; | ||
|
|
||
| public static final String STOREFILE_TRACKING_INIT_TIMEOUT = | ||
| "hbase.storefile.tracking.init.timeout"; | ||
| public static final long DEFAULT_STOREFILE_TRACKING_INIT_TIMEOUT = | ||
| TimeUnit.MINUTES.toMillis(5); | ||
|
|
There was a problem hiding this comment.
I think this is a bit of an anti-pattern now to put constants in HConstants, but not a major issue IMO.
| * TODO we will need a followup change to introduce in-memory temporarily file, such that further | ||
| * we can introduce a non-tracking temporarily storefiles left from a flush or compaction when | ||
| * a regionserver crashes without closing the store properly | ||
| */ |
There was a problem hiding this comment.
The in-memory tmp tracking won't touch the Accessor as nothing will be persisted. I think you can remove this.
| private static final byte[] STOREFILE_QUALIFIER = Bytes.toBytes(STOREFILE_QUALIFIER_STR); | ||
| private static final int STOREFILE_TABLE_VERSIONS = 3; | ||
|
|
||
| // TODO find a way for system table to support region split at table creation or remove this |
| // 3. After table clone and create new HFiles directly into data directory | ||
| // | ||
| // Also we don't handle the inconsistency between storefile tracking and file system, which | ||
| // will be handled by a HBCK command |
| ArrayList<StoreFileInfo> storeFiles = new ArrayList<>(); | ||
| for (Path storeFilePath : pathList) { | ||
| if (!StoreFileInfo.isValid(getRegionFs().getFileSystem().getFileStatus(storeFilePath))) { | ||
| LOG.warn("Invalid StoreFile: {}, and archiving it", storeFilePath); |
Hey Zach @z-york since your comment are minor and I will implement them in the upcoming JIRA, I will push the commit AS IS to save another around of CI build In addition, @joshelser I didn't hear back from you so I think the reviewed by @anmolnar should be good enough, and if you want to come back and ask more change, we can have it in a addendum PR/JIRA. I'm pushing this to the feature branch HBASE-24749 and continues on other PRs this week. |
…ger (#2931) * HBASE-25395 Introduce PersistedStoreEngine and PersistedStoreFileManager - Implement HTableStoreFilePathAccessor - Add loadInitialFiles to StoreFileManager Interface - refactor and add loadInitialFiles to the base interface, such each store file manager implementation could have their own way to discovery a list of available files instead of just calling fs.getStoreFiles() - add store instance to each store file manager implementation - Add PersistedStoreEngine and PersistedStoreFileManager that fork the in-memory object in store file manager to a off-memory structure that is going to be used when region reassign, (re)open, cluster restart and etc. This improvement aims for better object store (e.g. S3) support without calling LIST to get storefiles from a given column family. Reviewed-by: Andor Molnár <andor@apache.org> Signed-off-by: Zach York <zyork@apache.org>
|
fixed the commit from fd649ce to cca4cbf with adding |
…ger (#2931) * HBASE-25395 Introduce PersistedStoreEngine and PersistedStoreFileManager - Implement HTableStoreFilePathAccessor - Add loadInitialFiles to StoreFileManager Interface - refactor and add loadInitialFiles to the base interface, such each store file manager implementation could have their own way to discovery a list of available files instead of just calling fs.getStoreFiles() - add store instance to each store file manager implementation - Add PersistedStoreEngine and PersistedStoreFileManager that fork the in-memory object in store file manager to a off-memory structure that is going to be used when region reassign, (re)open, cluster restart and etc. This improvement aims for better object store (e.g. S3) support without calling LIST to get storefiles from a given column family. Reviewed-by: Andor Molnár <andor@apache.org> Signed-off-by: Zach York <zyork@apache.org>
…ger (apache#2931) * HBASE-25395 Introduce PersistedStoreEngine and PersistedStoreFileManager - Implement HTableStoreFilePathAccessor - Add loadInitialFiles to StoreFileManager Interface - refactor and add loadInitialFiles to the base interface, such each store file manager implementation could have their own way to discovery a list of available files instead of just calling fs.getStoreFiles() - add store instance to each store file manager implementation - Add PersistedStoreEngine and PersistedStoreFileManager that fork the in-memory object in store file manager to a off-memory structure that is going to be used when region reassign, (re)open, cluster restart and etc. This improvement aims for better object store (e.g. S3) support without calling LIST to get storefiles from a given column family. Reviewed-by: Andor Molnár <andor@apache.org> Signed-off-by: Zach York <zyork@apache.org>
…ger (apache#2931) * HBASE-25395 Introduce PersistedStoreEngine and PersistedStoreFileManager - Implement HTableStoreFilePathAccessor - Add loadInitialFiles to StoreFileManager Interface - refactor and add loadInitialFiles to the base interface, such each store file manager implementation could have their own way to discovery a list of available files instead of just calling fs.getStoreFiles() - add store instance to each store file manager implementation - Add PersistedStoreEngine and PersistedStoreFileManager that fork the in-memory object in store file manager to a off-memory structure that is going to be used when region reassign, (re)open, cluster restart and etc. This improvement aims for better object store (e.g. S3) support without calling LIST to get storefiles from a given column family. Reviewed-by: Andor Molnár <andor@apache.org> Signed-off-by: Zach York <zyork@apache.org>
Create a new wrapper store engine on top of the default store engine, as well as, introducing a new store file manager that keep tracking path of HFiles of user tables into a hbase self-managed system table (further will have a different tasks to merge into hbase:meta table)