[core] Introduce DeletionVectorIndexFileWriter#3402
[core] Introduce DeletionVectorIndexFileWriter#3402JingsongLi merged 9 commits intoapache:masterfrom
Conversation
| private final boolean isWrittenToMulitFiles; | ||
| private final long targetSizeInBytes; | ||
|
|
||
| private boolean written = false; |
There was a problem hiding this comment.
This class looks a bit strange, why does it need variables? It only provides one method, which appears to not require any state preservation.
| written = true; | ||
| writtenSizeInBytes += currentSize; | ||
| } | ||
| result.add(writer.closeWriter()); |
There was a problem hiding this comment.
It is better to use try (resource) to avoid resource leak.
| return new SingleIndexFileWriter(fileIO, indexPathFactory.newPath()); | ||
| } | ||
|
|
||
| static class SingleIndexFileWriter { |
There was a problem hiding this comment.
private, and maybe can be an inner class instead of static.
In this way, you can inline createWriter to the SingleIndexFileWriter constructor.
c682030 to
ca79eca
Compare
| long currentSize = writer.write(entry.getKey(), entry.getValue()); | ||
|
|
||
| if (isWrittenToMulitFiles | ||
| && !writer.hasWritten() |
There was a problem hiding this comment.
!writer.hasWritten() always return true?
Maybe you should remove it.
|
|
||
| private final PathFactory indexPathFactory; | ||
| private final FileIO fileIO; | ||
| private final boolean isWrittenToMulitFiles; |
There was a problem hiding this comment.
typo: isWrittenToMultiFiles
| } | ||
| } | ||
|
|
||
| class SingleIndexFileWriter implements Closeable { |
|
|
||
| private long writtenSizeInBytes = 0L; | ||
|
|
||
| public SingleIndexFileWriter(FileIO fileIO, Path path) throws IOException { |
There was a problem hiding this comment.
public SingleIndexFileWriter() throws IOException {
this.path = indexPathFactory.newPath();
.....
}
| public DeletionVectorIndexFileWriter( | ||
| FileIO fileIO, | ||
| PathFactory pathFactory, | ||
| BucketMode bucketMode, |
There was a problem hiding this comment.
Maybe we don't need bucketMode, here we can just pass a long targetSizePerIndexFile. If it is BUCKET_UNAWARE, targetSizePerIndexFile can be Long.MAX.
Same to DeletionVectorsIndexFile.
There was a problem hiding this comment.
Fine. That's what I thought at first. But I don't think it's a little serious.
|
|
||
| public List<IndexFileMeta> write(Map<String, DeletionVector> input) throws IOException { | ||
| if (input.isEmpty()) { | ||
| return emptyIndexFile(); |
There was a problem hiding this comment.
If this is possible? Why we need to return a empty Index file?
There was a problem hiding this comment.
This empty index file is needed to overwrite the previous one when compact.
I am going to think it again when to support compact operation on deletion vector.
There was a problem hiding this comment.
here just aligned to the original logic.
There was a problem hiding this comment.
This may only be to overwrite the old deletion file, as there is no proactive message to delete the deletion file for the primary key table.
But for append table, maybe here no need to generate empty deletion file?
Purpose
to support merge and write multi deletion vector within same partition and bucket to one index file.
Linked issue: close #xxx
Tests
API and Format
Documentation