HIVE-24444: compactor.Cleaner should not set state 'mark cleaned' if there are obsolete files in the FS #1716
Conversation
@pvargacl would you mind taking a look too?
conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.writeToString());
ValidReaderWriteIdList validWriteIdList = new ValidReaderWriteIdList();
Path locPath = new Path(location);
AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList,
You could pass an empty dirSnapshot to the first getAcidState call in removeFiles; that will fill up the snapshot, and you can reuse it here, so it won't make a second listing on the FS.
Good idea, thanks!
Path locPath = new Path(location);
AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, validWriteIdList,
    Ref.from(false), false, dirSnapshots);
return dir.getObsolete().size();
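The snapshot-reuse pattern agreed on above can be sketched with a toy model. This is plain Java with no Hive dependencies: the map stands in for Hive's `Map<Path, AcidUtils.HdfsDirSnapshot>`, the method names and paths are made up, and the real `getAcidState` does far more than a lookup.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the snapshot-reuse pattern: the map stands in for Hive's
// Map<Path, AcidUtils.HdfsDirSnapshot>; names and listings are made up.
public class SnapshotReuseDemo {
    static int fsListings = 0; // counts how many times the "file system" is listed

    // Stands in for AcidUtils.getHdfsDirSnapshots: one real FS listing.
    static Map<String, List<String>> takeSnapshot(String location) {
        fsListings++;
        Map<String, List<String>> snapshot = new HashMap<>();
        snapshot.put(location, List.of("base_5", "delta_6_7", "delta_8_8"));
        return snapshot;
    }

    // Stands in for getAcidState: fills an empty snapshot on first use and
    // reuses it afterwards, so only the first call touches the FS.
    static List<String> getAcidState(String location, Map<String, List<String>> dirSnapshots) {
        if (!dirSnapshots.containsKey(location)) {
            dirSnapshots.putAll(takeSnapshot(location));
        }
        return dirSnapshots.get(location);
    }

    public static void main(String[] args) {
        Map<String, List<String>> dirSnapshots = new HashMap<>(); // pass in empty
        getAcidState("/warehouse/t", dirSnapshots); // first call: lists the FS
        getAcidState("/warehouse/t", dirSnapshots); // reuse: no second listing
        System.out.println("FS listings: " + fsListings);
    }
}
```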
should we consider aborts as well?
You should not: there can be aborts with a higher txnId than the compaction. This check would see those, but the cleaner will never see them, so it would never finish its job.
Aborted directories are deleted by removeFiles() too, and there can be obsolete directories with a higher txnId than this compaction too. The point of this change is to call markCleaned() when the table is completely cleaned, not necessarily when this compaction's cleaning is done. I figured that would be better than calling markCleaned() when some obsolete or aborted file gets deleted. (see HIVE-24444 description)
I think checking the aborted transactions is very problematic upstream: if transactions come in continuously and some of them abort, the cleaner won't be able to mark any compaction cleaned. Example:
txns 1-10 come in, some of them writing; txnId=8 aborts
compaction kicks in and compacts the deltas; it will have a next_txn_id of, for example, txnId=13
other transactions come in; txnId=18 aborts
the cleaner starts when everything under 13 is done, and it can clean everything; but since txnId=14 is open it will not see txnId=18, so when it checks whether it can mark itself cleaned, it won't, because the check sees txnId=18 aborted
a second compaction kicks in, with a next_txn_id of, for example, txnId=23
other transactions come in; txnId=28 aborts
Again the cleaner cannot mark compaction 1 or compaction 2 cleaned, as neither of them sees 28, so neither can clean it. This piles up, and the Cleaner only marks them cleaned if there is finally an iteration in which no new transaction aborted.
If you check only the obsolete files, you only miss marking a compaction cleaned when you should have in the case of two overlapping compactions; but there the second compaction will get marked as cleaned, and on the next cleaner run compaction 1 will be marked as cleaned too. This can also cause problems if the delayed cleaner is enabled, but probably fewer.
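The pile-up scenario above can be played through in a small, self-contained simulation. All txn ids follow the example in the comment, and `canMarkCleaned` models only the problematic "no aborted txn visible" check; nothing here is real Hive code.

```java
import java.util.TreeSet;

// Toy simulation of the pile-up: the cleaner refuses to mark a compaction
// cleaned while ANY aborted txn is visible on the table. All txn ids follow
// the example above; everything is illustrative.
public class AbortPileupDemo {
    // The problematic check: "no aborted txn anywhere on the table".
    static boolean canMarkCleaned(TreeSet<Long> abortedTxns) {
        return abortedTxns.isEmpty();
    }

    public static void main(String[] args) {
        TreeSet<Long> aborted = new TreeSet<>();
        aborted.add(8L);    // txns 1-10 arrive, txn 8 aborts
        // compaction 1 runs (next_txn_id = 13); the cleaner removes txn 8's data
        aborted.remove(8L);
        aborted.add(18L);   // but txn 18 has aborted in the meantime
        System.out.println("compaction 1 cleaned: " + canMarkCleaned(aborted));

        // compaction 2 runs (next_txn_id = 23); the cleaner removes txn 18's data
        aborted.remove(18L);
        aborted.add(28L);   // and txn 28 aborts after that
        System.out.println("compaction 2 cleaned: " + canMarkCleaned(aborted));
        // Both print false; entries drain only in an iteration with no fresh aborts.
    }
}
```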
Okay, I see what you mean. I removed the aborted files from the total.
In general do you think checking for obsolete files is better than checking whether we removed any files?
Case 1: Assuming HIVE-23107 etc. are present in the version?
Case 2: Assuming HIVE-23107 etc. are not present in the version?
Upstream it would cause problems: for the delayed cleaner it is necessary that every compaction entry cleans only its own obsolete files. By default every compaction's cleaning is delayed by 15 minutes, so it is much more likely that, when a cleaning job finally runs, there are already other jobs waiting in the queue. All of them have a saved highestWriteId, and each will clean only up to that. You should not merge those.
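The per-entry invariant described here can be sketched roughly like this. It is a toy model: `clean`, the writeIds, and the queue semantics are all simplified stand-ins for the real delayed-cleaner logic.

```java
import java.util.Arrays;

// Toy model of the delayed-cleaner invariant: each "ready for cleaning"
// queue entry removes only obsolete deltas up to its own saved
// highestWriteId. All numbers and names here are illustrative.
public class DelayedCleanerDemo {
    // Returns the obsolete deltas that remain after one entry's cleanup.
    static long[] clean(long[] obsoleteDeltaWriteIds, long highestWriteId) {
        return Arrays.stream(obsoleteDeltaWriteIds)
                .filter(w -> w > highestWriteId) // this entry may not touch these
                .toArray();
    }

    public static void main(String[] args) {
        long[] obsolete = {3, 7, 15};   // obsolete deltas by max writeId
        obsolete = clean(obsolete, 10); // entry 1 (highestWriteId=10): removes 3, 7
        System.out.println(Arrays.toString(obsolete));
        obsolete = clean(obsolete, 20); // entry 2, queued later: removes 15
        System.out.println(Arrays.toString(obsolete));
    }
}
```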
You could merge entries above the retention period in that case, but that's another story, not relevant to this pull request.
I haven't checked, but if a folder is considered obsolete when there is a base / minor-compacted delta shadowing it, that's OK.
But I don't see the benefit of this patch for upstream (which contains HIVE-23107); am I missing something? Also it would violate the idea of the delayed cleaner, which is coupled with a specific compaction request.
No, this isn't necessary upstream; this change is for versions without HIVE-23107 etc. But I don't want to hurt upstream functionality with it.
If it's only for versions without HIVE-23107, should it be behind a feature flag or schema version check?
*/
private boolean removeFiles(String location, ValidWriteIdList writeIdList, CompactionInfo ci)
throws IOException, NoSuchObjectException, MetaException {
Path locPath = new Path(location);
FileSystem fs = locPath.getFileSystem(conf);
Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots = AcidUtils.getHdfsDirSnapshots(fs, locPath);
AcidUtils.Directory dir = AcidUtils.getAcidState(locPath.getFileSystem(conf), locPath, conf, writeIdList, Ref.from(
could be replaced with the fs variable above
What would happen if there is a long-running read-only txn plus multiple compaction attempts? Are we going to have multiple records in the queue for the same table/partition pending cleanup?
No, not with HIVE-24291.
Without HIVE-24291 (which might not be usable, for example if HMS schema changes are out of the question) we could still have a pileup of "ready for cleaning" records for the same table/partition in the queue.
Without this change (HIVE-24444) some of them might never be deleted.
The goal of this change is that, when the table does get cleaned, all of those records are deleted.
*/
private int getNumEventuallyObsoleteDirs(String location, Map<Path, AcidUtils.HdfsDirSnapshot> dirSnapshots)
throws IOException {
ValidTxnList validTxnList = new ValidReadTxnList();
Will we consider all writes as valid here? Shouldn't we limit at least by the compactor txnId?
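A rough model of the difference this question points at: with a txn list that treats everything as valid, every shadowed delta counts as obsolete; limiting at the compactor's txnId would count fewer. `countObsolete`, the delta representation, and all numbers are hypothetical simplifications of the real ValidTxnList/AcidUtils logic.

```java
import java.util.function.LongPredicate;

// Toy model of counting "eventually obsolete" directories: a delta is
// represented only by its highest txnId; countObsolete and all numbers
// are hypothetical simplifications of the real AcidUtils logic.
public class ObsoleteCountDemo {
    static long countObsolete(long[] deltaMaxTxns, long baseTxn, LongPredicate validTxn) {
        long n = 0;
        for (long t : deltaMaxTxns) {
            if (t <= baseTxn && validTxn.test(t)) { // shadowed by the base and visible
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        long[] deltas = {5, 8, 12}; // e.g. delta_1_5, delta_6_8, delta_9_12
        // A txn list that treats every txn as valid sees all three deltas:
        System.out.println(countObsolete(deltas, 12, t -> true));
        // Limited at a hypothetical compactor txnId of 8, it sees only two:
        System.out.println(countObsolete(deltas, 12, t -> t <= 8));
    }
}
```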
The compactor txnId is stored since HIVE-24291. This patch is for versions without that change, or without any recent changes to the HMS schema.
Could we limit by CQ_NEXT_TXN_ID?
Do we need this change in upstream master if HIVE-24291 is already present?
No, we don't need this change if HIVE-24291 is present. But if HIVE-24291 is present, it shouldn't hurt.
CQ_NEXT_TXN_ID is stored since HIVE-23107, which also includes HMS schema changes.
> But if HIVE-24291 is present it shouldn't hurt.

This isn't necessarily true, since, as @pvargacl noted, there could be aborts after the compaction happened.
The question here is whether this change or HIVE-24314 is worse for users if HIVE-23107 (remove MIN_HISTORY_LEVEL) etc. are not present on their version.
I will close this because HIVE-24403 is making HIVE-23107 etc. backwards compatible, so this change will not be needed.
What changes were proposed in this pull request?
Why are the changes needed?
Does this PR introduce any user-facing change?
See HIVE-24444
How was this patch tested?
Unit test