[HBASE-27245] Expose the archive API to the end user #4661
alexdongli0829 wants to merge 1 commit into apache:master
🎊 +1 overall
This message was automatically generated.
   * @param tableName existing table
   * @param archive if archive the data when delete the table
   */
  public void deleteTable(TableName tableName, boolean archive) throws IOException {
Make the above method call this method so we can save some lines of code?
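A minimal sketch of that suggestion, assuming the existing single-argument `deleteTable` should keep archiving by default. The class `DeleteTableSketch`, the `String` table name, and the `calls` list are hypothetical stand-ins for illustration and are not part of the PR or the HBase API; the real method also declares `IOException`, which is omitted here for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the class touched by the PR; not the HBase API.
class DeleteTableSketch {
  // Records each call so the delegation can be observed.
  final List<String> calls = new ArrayList<>();

  /** Existing form: delegates and keeps the archive-on-delete behaviour (assumed default). */
  public void deleteTable(String tableName) {
    deleteTable(tableName, true);
  }

  /** New form: the caller chooses whether the data is archived. */
  public void deleteTable(String tableName, boolean archive) {
    // The real implementation would submit the delete here; the sketch only records it.
    calls.add(tableName + ":archive=" + archive);
  }
}
```

With this shape the delete logic lives in one method, and the older overload is a one-line wrapper.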
     * snapshot, the performance may be impacted, should evaluate the performance between directly
     * archive and snapshot scan TODO: find some any way to get if the table snapshotted or not
     */
    if (!archive) {
Why do we need this logic here? Mind explaining a bit?
@Apache9 Thanks so much for your time. The reasoning is that without archiving, none of the table's HFiles are kept in the archive, so any snapshot that references those HFiles would be broken. To avoid that, I added the check before performing the actual delete without archive. Does that make sense?
Ah, this is a problem...
I think the correct way is to still archive the files that are referenced by snapshots, and delete the unreferenced files directly? This will be a bit hard, as a snapshot can be taken at the same time.
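A rough sketch of that split, assuming the set of snapshot-referenced file names is already known. `SnapshotAwareDeleteSketch`, `Plan`, and `plan` are hypothetical names, files are represented as plain strings, and the sketch deliberately ignores the concurrency problem mentioned above: a snapshot taken while the set is being computed would not be reflected in it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the PR or the HBase API.
class SnapshotAwareDeleteSketch {
  /** Result of splitting a table's files by snapshot reference. */
  static final class Plan {
    final List<String> toArchive = new ArrayList<>();
    final List<String> toDelete = new ArrayList<>();
  }

  /**
   * Files referenced by any snapshot are still archived; the rest can be deleted directly.
   * The referenced set is an input here; computing it safely is the hard part.
   */
  static Plan plan(List<String> tableHFiles, Set<String> referencedBySnapshots) {
    Plan p = new Plan();
    for (String file : tableHFiles) {
      if (referencedBySnapshots.contains(file)) {
        p.toArchive.add(file);
      } else {
        p.toDelete.add(file);
      }
    }
    return p;
  }
}
```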
@Apache9 You are correct; this was a compromise between performance and accuracy. If we scan all the snapshot files to find out whether an HFile is referenced, that adds another performance cost. Meanwhile, for a snapshot that is in progress, we may need to take a lock from the snapshot manager to make sure no in-progress snapshot references the file, which is another concern.
So for now I have simplified the logic: return a message to the end user and let them decide whether to remove the snapshots and then delete the table without archive. Do you think that is acceptable, or had we better check the details at some cost in performance?
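The simplified behaviour described here could look roughly like the following, assuming the list of snapshots for the table is available to the caller. `NoArchiveGuardSketch`, `checkDeleteAllowed`, the exception type, and the message text are hypothetical and chosen for illustration; they are not taken from the PR.

```java
import java.util.List;

// Hypothetical guard for illustration; not part of the PR or the HBase API.
class NoArchiveGuardSketch {
  /**
   * Rejects a delete without archive while snapshots of the table exist,
   * leaving it to the user to remove the snapshots first.
   */
  static void checkDeleteAllowed(String table, boolean archive, List<String> snapshotsOfTable) {
    if (!archive && !snapshotsOfTable.isEmpty()) {
      throw new IllegalStateException("Table " + table + " has snapshots " + snapshotsOfTable
          + "; delete them first or delete the table with archive enabled");
    }
  }
}
```

This keeps the delete path cheap, at the price of refusing a table whose snapshots might not actually reference any of its current HFiles.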
https://issues.apache.org/jira/browse/HBASE-27245