[HUDI-8073] Add hosts to storage path info and use it if present #11761
Merged
yihua merged 6 commits into apache:master on Aug 16, 2024
CTTY reviewed on Aug 14, 2024
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HiveHoodieReaderContext.java
…ReaderContext.java Co-authored-by: Shawn Chang <42792772+CTTY@users.noreply.github.com>
yihua requested changes on Aug 15, 2024
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java
       */
      public abstract ClosableIterator<T> getFileRecordIterator(
      protected abstract ClosableIterator<T> getFileRecordIterator(
          StoragePath filePath, long start, long length, Schema dataSchema, Schema requiredSchema,
Contributor
Can we keep this only? The caller should know the path to use. Otherwise, we may hide issues where path info is null.
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java
    package org.apache.hudi.storage;

    public interface StorageFile {
Contributor
The same goal can be achieved without this interface by adding the locations to StoragePathInfo.
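To illustrate the suggestion above, here is a minimal sketch of a path-info value object that carries optional host locations directly, so no separate `StorageFile` interface is needed. `PathInfoSketch` and its members are hypothetical stand-ins written for this example; they are not the actual `StoragePathInfo` API.

```java
// Hypothetical stand-in for StoragePathInfo; all names here are assumptions.
final class PathInfoSketch {
  private final String path;
  private final long length;
  private final String[] locations; // null when no hosts were reported

  PathInfoSketch(String path, long length, String[] locations) {
    this.path = path;
    this.length = length;
    // Defensive copy so callers cannot mutate the stored hosts.
    this.locations = locations == null ? null : locations.clone();
  }

  String getPath() {
    return path;
  }

  long getLength() {
    return length;
  }

  /** Returns a copy of the hosts, or null if none were recorded. */
  String[] getLocations() {
    return locations == null ? null : locations.clone();
  }

  /** True only when at least one host is available. */
  boolean hasLocations() {
    return locations != null && locations.length > 0;
  }
}
```

A caller could then check `hasLocations()` before passing host hints along, and treat a missing array as "no locality information".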
yihua
approved these changes
Aug 16, 2024
hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieReaderContext.java
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieFileGroupReader.java
            fileStatus.getModificationTime());
      }

      public static StoragePathInfo convertToStoragePathInfo(FileStatus fileStatus, String[] locations) {
Contributor
LocatedFileStatus (which extends FileStatus) stores BlockLocation[] locations. As a follow-up, see if we want to leverage that.
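One possible shape for that follow-up, sketched under assumptions: if block locations are available, their hosts could be flattened into the `String[] locations` argument that `convertToStoragePathInfo` already accepts. The `Block` class below is a hypothetical stand-in for Hadoop's `BlockLocation` (which exposes hosts through `getHosts()`), and de-duplicating hosts is a choice made for this example, not something the PR specifies.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class BlockHostsSketch {
  // Hypothetical stand-in for Hadoop's BlockLocation; only hosts are modeled.
  static final class Block {
    final String[] hosts;

    Block(String... hosts) {
      this.hosts = hosts;
    }
  }

  /** Flattens per-block hosts into one de-duplicated array, in first-seen order. */
  static String[] hostsOf(List<Block> blocks) {
    Set<String> hosts = new LinkedHashSet<>();
    for (Block block : blocks) {
      for (String host : block.hosts) {
        hosts.add(host);
      }
    }
    return hosts.toArray(new String[0]);
  }
}
```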
…odieFileGroupReader.java
Collaborator
Change Logs
FileSplit carries host information, which should be used if present.
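A minimal sketch of the "use it if present" rule, assuming the split's hosts arrive as a `String[]` (as returned by `FileSplit.getLocations()` in Hadoop). `SplitHostsSketch` and its methods are hypothetical names for illustration only; the actual reader code in this PR may be structured differently.

```java
import java.util.Optional;

final class SplitHostsSketch {
  /** Returns the split's hosts if there are any, otherwise empty. */
  static Optional<String[]> usableHosts(String[] splitHosts) {
    if (splitHosts == null || splitHosts.length == 0) {
      return Optional.empty();
    }
    return Optional.of(splitHosts.clone());
  }

  /** Describes which read path would be taken; stands in for the real reader call. */
  static String describeRead(String path, String[] splitHosts) {
    return usableHosts(splitHosts)
        .map(hosts -> "read " + path + " with hosts " + String.join(",", hosts))
        .orElse("read " + path + " without host hints");
  }
}
```

Treating a null or empty array as "absent" keeps the fallback path identical to the behavior before hosts were available.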
Impact
Possibly better performance for Hive.
Risk level (write none, low medium or high below)
low
Documentation Update
N/A
Contributor's checklist