Pluggable file I/O submodule in TableOperations #14
Allows custom implementations that share most of the behavior, but the protected constructor prevents the class from being instantiated improperly.
…perations-pluggable-io
core/src/main/java/com/netflix/iceberg/hadoop/HadoopFileIO.java
…perations-pluggable-io
@rdblue for review.
@rdblue ping for review please!
core/src/main/java/com/netflix/iceberg/hadoop/HadoopFileIO.java
core/src/main/java/com/netflix/iceberg/BaseMetastoreTableOperations.java
core/src/main/java/com/netflix/iceberg/BaseMetastoreTableOperations.java
core/src/main/java/com/netflix/iceberg/BaseMetastoreTableOperations.java
core/src/main/java/com/netflix/iceberg/hadoop/HadoopTableOperations.java
Addressed comments; this is ready for another pass of reviews.
core/src/main/java/com/netflix/iceberg/hadoop/SerializableConfiguration.java
public String metadataFileLocation(String fileName) {
  return createdMetadataFilePaths.computeIfAbsent(fileName, name -> {
    try {
      return temp.newFile(name).getAbsolutePath();
Shouldn't this still run delete() and deleteOnExit()?
I believe TemporaryFolder knows to handle cleanup of data when it's torn down.
You're right and that should take care of deleteOnExit. I thought that the delete was needed because it creates the file, but I guess not since tests are passing.
data/src/main/java/com/netflix/iceberg/data/TableScanIterable.java
Looks really close! Just minor issues right now.
Addressed all comments so far.
Merged. Thanks @mccheah! Nice work.
This adds FileIO that is returned by TableOperations and used to delete paths and to create InputFile and OutputFile instances. FileIO is Serializable so that it can be sent to tasks running in different JVMs and used for all file-related tasks for a table.
Closes #12.
Separate patch to come for leveraging this submodule in the Iceberg clients, such as the Spark DataSource.
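
For readers skimming the description above, here is a minimal sketch of the shape it describes: a Serializable FileIO that deletes paths and creates InputFile and OutputFile instances. It is not the code from this PR. The interfaces are simplified stand-ins, the method names are assumptions, and `FileIOSketch`, `LocalFileIO`, and `roundTrip` are hypothetical names used only for illustration. The local-filesystem implementation stands in for something like HadoopFileIO.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class FileIOSketch {
  // Simplified stand-ins for InputFile and OutputFile (assumed shapes).
  public interface InputFile {
    String location();
    InputStream newStream();
  }

  public interface OutputFile {
    String location();
    OutputStream create();
  }

  // Serializable so an instance can be shipped to tasks in other JVMs.
  public interface FileIO extends Serializable {
    InputFile newInputFile(String path);
    OutputFile newOutputFile(String path);
    void deleteFile(String path);
  }

  // Hypothetical implementation backed by the local file system.
  public static class LocalFileIO implements FileIO {
    private static final long serialVersionUID = 1L;

    @Override
    public InputFile newInputFile(String path) {
      return new InputFile() {
        @Override public String location() { return path; }
        @Override public InputStream newStream() {
          try {
            return Files.newInputStream(Paths.get(path));
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
        }
      };
    }

    @Override
    public OutputFile newOutputFile(String path) {
      return new OutputFile() {
        @Override public String location() { return path; }
        @Override public OutputStream create() {
          try {
            // CREATE_NEW fails if the file already exists.
            return Files.newOutputStream(Paths.get(path), StandardOpenOption.CREATE_NEW);
          } catch (IOException e) {
            throw new UncheckedIOException(e);
          }
        }
      };
    }

    @Override
    public void deleteFile(String path) {
      try {
        Files.deleteIfExists(Paths.get(path));
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  // Simulates sending a FileIO to another JVM via Java serialization.
  public static FileIO roundTrip(FileIO io) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(io);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (FileIO) in.readObject();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("fileio-sketch");
    String path = dir.resolve("metadata.json").toString();

    FileIO io = new LocalFileIO();
    try (OutputStream out = io.newOutputFile(path).create()) {
      out.write("{}".getBytes(StandardCharsets.UTF_8));
    }

    // The deserialized copy handles the remaining file work for the table.
    FileIO copy = roundTrip(io);
    try (InputStream in = copy.newInputFile(path).newStream()) {
      System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
    }
    copy.deleteFile(path);
    Files.delete(dir);
  }
}
```

The point of the sketch is that callers only ever hold a FileIO, so a table implementation can swap in different storage without changing the code that reads and writes files.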