Spark: support read of partition metadata column when table is over 1k #10547

Merged: 9 commits, Jul 5, 2024

dramaticlly (Contributor) commented Jun 21, 2024

Adds support for queries such as
SELECT *, _partition FROM iceberg.foo.bar
when the foo.bar table has more than 1000 columns defined.

github-actions bot added the spark label Jun 21, 2024

dramaticlly (Contributor, Author) commented

Looks like the PRB failed due to the JUnit cleanup of the temp directory, as mentioned in #10569.

szehon-ho (Collaborator) left a comment

Looks good to me, thanks @dramaticlly

szehon-ho merged commit 6223708 into apache:main Jul 5, 2024
35 checks passed

szehon-ho (Collaborator) commented

Merged, thanks @dramaticlly

pan3793 (Member) commented Jul 10, 2024

> ... when foo.bar table has over 1000 columns defined

@dramaticlly Can you clarify why it was not supported before? Where does the restriction come from? I don't see the magic number 1000 in the codebase.

dramaticlly (Contributor, Author) commented Jul 10, 2024

> ... when foo.bar table has over 1000 columns defined
>
> @dramaticlly can you clarify why it was not supported before? where does the restriction come from? I don't see the magic number 1000 in the codebase

@pan3793 This only fixes the scenario where all fields are selected together with the partition metadata column on an Iceberg table with more than 1000 columns. The unit test should reproduce the problem if the fix is missing. As for the reasoning, the 1000 comes from the default field ID assigned to the inner partition struct of an Iceberg table. A more detailed analysis can be found in #9923.
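
To make the ID overlap easier to picture, here is a minimal sketch in plain Python. It does not call Iceberg or Spark, and it is not the code changed in this PR. The names `PARTITION_ID_START`, `data_field_ids`, `partition_field_ids`, and `colliding_ids` are hypothetical, and the assumption that data columns get consecutive field IDs starting at 1 is a simplification for illustration only.

```python
# Illustrative model only; not Iceberg code.
# Assumption: fields of the inner partition struct get IDs starting at 1000.
PARTITION_ID_START = 1000


def data_field_ids(num_columns):
    """Assumed simplification: data columns get IDs 1..num_columns."""
    return list(range(1, num_columns + 1))


def partition_field_ids(num_partition_fields):
    """IDs for the partition struct's fields, counted up from the start value."""
    return list(
        range(PARTITION_ID_START, PARTITION_ID_START + num_partition_fields)
    )


def colliding_ids(num_columns, num_partition_fields):
    """Return the IDs that appear in both the data columns and the partition struct."""
    data = set(data_field_ids(num_columns))
    partition = set(partition_field_ids(num_partition_fields))
    return sorted(data & partition)


if __name__ == "__main__":
    # A narrow table: no overlap between the two ID ranges.
    print(colliding_ids(999, 1))
    # A table with more than 1000 columns: ID 1000 is used twice.
    print(colliding_ids(1001, 1))
```

Under these assumptions, a projection that combines every data column with `_partition` would contain the same field ID twice once the table is wide enough, which matches the scenario described in the comment above. See #9923 for the actual analysis.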

jasonf20 pushed a commit to jasonf20/iceberg that referenced this pull request Aug 4, 2024