API for path generation in Iceberg frameworks #55
Comments
I think this API should pass in the partition tuple. Now that I think about it, the metadata location is determined by the |
Ok I'll start taking a look at this now! |
Remark: The paths module has to be a separate entity because once again, we will be instantiating it once on the driver and serializing the instance for executors. I think we can move the metadata file lookup to this module as well. |
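To make the two points above concrete (pass in the partition tuple, and keep the paths module serializable so one instance built on the driver can be shipped to executors), here is a minimal sketch. Everything in it is an assumption for illustration: `PathGenerator`, `PrefixPathGenerator`, `newDataPath`, and the bucket path are hypothetical names, not the signatures that ended up in Iceberg, and the partition tuple is modeled as a plain list of strings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

public class PathSketch {

  /** Hypothetical path-generation hook; Serializable so it can be sent to executors. */
  public interface PathGenerator extends Serializable {
    /** Returns the full location for a new data file in the given partition. */
    String newDataPath(List<String> partitionTuple, String fileName);
  }

  /** Illustrative implementation: data root, then one directory per partition value. */
  public static class PrefixPathGenerator implements PathGenerator {
    private static final long serialVersionUID = 1L;
    private final String dataRoot;

    public PrefixPathGenerator(String dataRoot) {
      this.dataRoot = dataRoot;
    }

    @Override
    public String newDataPath(List<String> partitionTuple, String fileName) {
      StringBuilder path = new StringBuilder(dataRoot);
      for (String value : partitionTuple) {
        path.append('/').append(value);
      }
      return path.append('/').append(fileName).toString();
    }
  }

  /** Simulates driver-to-executor shipping with plain Java serialization. */
  public static PathGenerator roundTrip(PathGenerator generator) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(generator);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (PathGenerator) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Path generator is not serializable", e);
    }
  }

  public static void main(String[] args) {
    // Instantiated once on the "driver" ...
    PathGenerator onDriver = new PrefixPathGenerator("s3://bucket/table/data");
    // ... and used on the "executor" after a serialization round trip.
    PathGenerator onExecutor = roundTrip(onDriver);
    System.out.println(
        onExecutor.newDataPath(Arrays.asList("date=2019-01-01", "hour=03"), "part-00000.parquet"));
  }
}
```

The round trip is only a stand-in for what the framework does when it ships the instance; the point is that the implementation holds nothing but serializable state.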
@mccheah, in that case is this something that we should add to the |
Yup, I don't have a strong opinion here, so let's bundle it with the FileIO interface. I'll propose a diff, but I anticipate wanting to fine-tune the exact method signatures as part of the PR review. |
Looks like this is done! I'll close it. |
(For completeness, linking this issue to the PR that made it to master: #87) |
In the integrations that Iceberg supports out of the box (Spark, Pig), the frameworks decide how to generate paths for written files. However, some sources would prefer to pick their own paths for new files. Some questions for designing such an API include:
FileIO API?