Use the same shared FileSystem instance across calls in HadoopTableOperations #108

Closed
Wants to merge 1 commit

Conversation

@mccheah (Contributor) commented Nov 17, 2018

Closes #92.

@@ -76,6 +76,10 @@ public static HadoopInputFile fromStatus(FileStatus stat, Configuration conf) {
}
}

static HadoopInputFile fromFsPath(FileSystem fs, Path path, Configuration conf) {
Contributor review comment:
Instead of adding this, I think HadoopInputFile should make its constructor protected to allow you to create a subclass.

The reason why we use Path and Configuration is to ensure that the file system always matches the path's scheme. Adding a new factory method that allows users to supply a file system creates a path around that guarantee. That's legitimate for your use case, but you'll need to implement HadoopTableOperations anyway, so I think a safer way to solve the problem for you is to subclass HadoopInputFile and expose this method in your private subclass.
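As a rough illustration of the subclassing approach suggested above (a sketch, not code from this PR; it assumes the HadoopInputFile constructor taking a FileSystem, Path, and Configuration is made protected, and that the subclass lives in the same package):

    // Hypothetical sketch only. SharedFsInputFile and its factory method are invented
    // names; the only assumption taken from the discussion is a protected constructor.
    class SharedFsInputFile extends HadoopInputFile {

      private SharedFsInputFile(FileSystem fs, Path path, Configuration conf) {
        super(fs, path, conf); // relies on the constructor becoming protected
      }

      // Factory that reuses an already-created FileSystem rather than resolving one
      // from the path's scheme; the caller must pass a FileSystem that matches the path.
      static SharedFsInputFile fromFsPath(FileSystem fs, Path path, Configuration conf) {
        return new SharedFsInputFile(fs, path, conf);
      }
    }

This keeps the scheme-matching guarantee of the existing public factories (e.g. fromStatus) intact while still letting a custom TableOperations supply its own shared FileSystem.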

Contributor review comment (follow-up):

I should note that for the limited use in this PR I think it is okay because this is package-private. It would just be better to solve your overall use case with a custom TableOperations implementation.

private TableMetadata currentMetadata = null;
private Integer version = null;
private boolean shouldRefresh = true;

HadoopTableOperations(Path location, Configuration conf) {
this.conf = conf;
this.location = location;
this.metadataFs = Util.getFS(location, conf);
Contributor review comment:

These changes look fine to me, along with the package-private changes to HadoopInputFile and HadoopOutputFile.
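For context on how the cached metadataFs field would be reused on later calls, a minimal sketch (the metadataInputFile helper and its parameter are assumptions for illustration, not lines from this diff):

    // Hypothetical sketch: reuses the FileSystem created once in the constructor
    // (this.metadataFs = Util.getFS(location, conf)) instead of resolving a new one per call.
    private InputFile metadataInputFile(Path metadataFile) {
      // fromFsPath is the package-private factory added in this PR.
      return HadoopInputFile.fromFsPath(metadataFs, metadataFile, conf);
    }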

@rdblue (Contributor) commented Dec 7, 2018

This was migrated to apache/iceberg#14.

@rdblue closed this Dec 7, 2018
Parth-Brahmbhatt pushed a commit to Parth-Brahmbhatt/iceberg that referenced this pull request Apr 12, 2019