Core: Optimize check for referenced data files in BaseRowDelta #3071
aokolnychyi merged 2 commits into apache:master
| if (!referencedDataFiles.isEmpty()) { | ||
| validateDataFilesExist(base, startingSnapshotId, referencedDataFiles, !validateDeletes); | ||
| validateDataFilesExist( | ||
| base, startingSnapshotId, referencedDataFiles, !validateDeletes, conflictDetectionFilter); |
This does make the assumption that the conflict detection filter and referenced data files are related.
I think that is a correct assumption, as the conflict detection filter is our scan condition and the referenced data files are the data files that were read.
I think that's a valid assumption.
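To make the assumption discussed above concrete, here is a toy model in plain Java. It is only a sketch and does not use Iceberg's API: `RemovedFile`, `detectsConflict`, the partition names, and the file paths are all hypothetical. It shows that a filtered existence check only notices a concurrently removed file when that file falls under the filter, which holds as long as the referenced files come from the same scan the filter describes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class FilterAssumptionSketch {
  /** Hypothetical stand-in for a data file removed by a concurrent commit. */
  static final class RemovedFile {
    final String path;
    final String partition;

    RemovedFile(String path, String partition) {
      this.path = path;
      this.partition = partition;
    }
  }

  /** Reports a conflict only for removals that the partition filter matches. */
  static boolean detectsConflict(
      List<RemovedFile> removed, Set<String> referencedPaths, Predicate<String> partitionFilter) {
    for (RemovedFile file : removed) {
      if (partitionFilter.test(file.partition) && referencedPaths.contains(file.path)) {
        return true;
      }
    }
    return false;
  }

  /** Referenced file lies inside the scanned partition "a": the removal is detected. */
  static boolean relatedRemovalDetected() {
    List<RemovedFile> removed = Arrays.asList(new RemovedFile("part=a/file-1.parquet", "a"));
    Set<String> referenced = Collections.singleton("part=a/file-1.parquet");
    return detectsConflict(removed, referenced, partition -> partition.equals("a"));
  }

  /** Referenced file lies in partition "b" while the filter covers "a": the removal is missed. */
  static boolean unrelatedRemovalDetected() {
    List<RemovedFile> removed = Arrays.asList(new RemovedFile("part=b/file-2.parquet", "b"));
    Set<String> referenced = Collections.singleton("part=b/file-2.parquet");
    return detectsConflict(removed, referenced, partition -> partition.equals("a"));
  }

  public static void main(String[] args) {
    System.out.println("related removal detected: " + relatedRemovalDetected());
    System.out.println("unrelated removal detected: " + unrelatedRemovalDetected());
  }
}
```

In this model the second case returns `false`, which is the situation the assumption rules out: referenced files are expected to be files the filtered scan actually read.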
e02d97b to
8afb26d
| } | ||
| @Test | ||
| public void testValidateDataFilesExistWithConflictDetectionFilter() { |
Should we also add a test where the validation fails? It looks like this one just checks that you can do isolated operations but I think we should do a conflicting test as well.
Sure, I'll add one.
Added a negative test too.
rdblue left a comment
I think this is ready to go once @RussellSpitzer's suggestion to add a test case is addressed.
966cede to
e3b7c68
I'll merge this one to unblock subsequent PRs. I added the missing test.
Thanks for reviewing, @RussellSpitzer @rdblue!
…e#3071) This change optimizes our check for referenced data files in BaseRowDelta by pushing down the conflict detection filter. Previously, we would open manifests even though they belonged to partitions out of our interest.
Added this as it was included in 0.12.1 (made cherry-picking easier).
This PR optimizes our check for referenced data files in BaseRowDelta by pushing down the conflict detection filter. Previously, we would open manifests even when they belonged to partitions outside our interest.
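The effect of the pushdown can be sketched with a small self-contained model. This is not Iceberg's implementation: `Manifest`, `Result`, `check`, and the sample data are hypothetical names invented for illustration, and a manifest's partition summary is reduced to a plain set of partition names. The idea is that a manifest whose summary cannot match the filter is skipped without being opened, while the conflict result for files inside the filter stays the same.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class ManifestPruningSketch {
  /** Hypothetical manifest: a partition summary plus the paths it marks as removed. */
  static final class Manifest {
    final Set<String> partitions;
    final Set<String> removedPaths;

    Manifest(Set<String> partitions, Set<String> removedPaths) {
      this.partitions = partitions;
      this.removedPaths = removedPaths;
    }
  }

  /** Outcome of the check: whether a conflict was found and how many manifests were opened. */
  static final class Result {
    final boolean conflict;
    final int opened;

    Result(boolean conflict, int opened) {
      this.conflict = conflict;
      this.opened = opened;
    }
  }

  /** Opens only manifests whose summary matches the filter, then looks for removed referenced files. */
  static Result check(
      List<Manifest> manifests, Set<String> referencedPaths, Predicate<String> partitionFilter) {
    boolean conflict = false;
    int opened = 0;
    for (Manifest manifest : manifests) {
      if (manifest.partitions.stream().noneMatch(partitionFilter)) {
        continue; // pruned using the summary alone, entries are never read
      }
      opened++;
      for (String path : manifest.removedPaths) {
        if (referencedPaths.contains(path)) {
          conflict = true;
        }
      }
    }
    return new Result(conflict, opened);
  }

  /** Three single-partition manifests; only the one for partition "a" removes a file. */
  static List<Manifest> sampleManifests() {
    return Arrays.asList(
        new Manifest(Collections.singleton("a"), Collections.singleton("part=a/file-1.parquet")),
        new Manifest(Collections.singleton("b"), Collections.<String>emptySet()),
        new Manifest(Collections.singleton("c"), Collections.<String>emptySet()));
  }

  static Set<String> sampleReferenced() {
    return Collections.singleton("part=a/file-1.parquet");
  }

  public static void main(String[] args) {
    Result unfiltered = check(sampleManifests(), sampleReferenced(), partition -> true);
    Result filtered = check(sampleManifests(), sampleReferenced(), partition -> partition.equals("a"));
    System.out.println("unfiltered: conflict=" + unfiltered.conflict + ", opened=" + unfiltered.opened);
    System.out.println("filtered:   conflict=" + filtered.conflict + ", opened=" + filtered.opened);
  }
}
```

With the sample data, both runs report the conflict, but the filtered run opens one manifest instead of three.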