Core: Check for all specs in partitionsTable #7551

Merged 4 commits into apache:master on May 12, 2023

Conversation

ajantha-bhat
Member

@ajantha-bhat ajantha-bhat commented May 8, 2023

The Partitions table schema checks only the current spec to decide whether the partition column is returned. We should instead check all historical specs to see whether the table was ever partitioned, similar to the Files, Entries, and other metadata tables.

Fixes #7533
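
To make the description above concrete, here is a minimal sketch of the difference between deciding from the current spec only and deciding from every spec the table has ever had. It does not use Iceberg's classes: a spec is modeled as a plain list of partition field names, and `SpecHistoryCheck`, `showPartitionColumnCurrentOnly`, and `showPartitionColumnAllSpecs` are hypothetical names chosen for illustration, not the code changed in this PR.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SpecHistoryCheck {

  // Hypothetical stand-in for a partition spec: only the list of partition field names.
  // An empty list models an unpartitioned spec.
  public static boolean isUnpartitioned(List<String> specFields) {
    return specFields.isEmpty();
  }

  // Behavior described as the problem: only the current spec decides.
  public static boolean showPartitionColumnCurrentOnly(
      Map<Integer, List<String>> specsById, int currentSpecId) {
    return !isUnpartitioned(specsById.get(currentSpecId));
  }

  // Behavior described as the fix: any historical spec with fields is enough.
  public static boolean showPartitionColumnAllSpecs(Map<Integer, List<String>> specsById) {
    return specsById.values().stream().anyMatch(fields -> !isUnpartitioned(fields));
  }

  public static void main(String[] args) {
    Map<Integer, List<String>> specs = new TreeMap<>();
    specs.put(0, List.of("data_bucket")); // original, partitioned spec
    specs.put(1, List.of());              // evolved to unpartitioned

    System.out.println(showPartitionColumnCurrentOnly(specs, 1)); // false
    System.out.println(showPartitionColumnAllSpecs(specs));       // true
  }
}
```

In this model, a table that was once partitioned and is now unpartitioned loses its partition column under the current-only check but keeps it when all specs are considered.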

@ajantha-bhat
Member Author

cc: @szehon-ho

@github-actions github-actions bot added the core label May 8, 2023
@ajantha-bhat
Member Author

@nastra: I have addressed the comments. Thanks for the review. @szehon-ho: Would you like to take a look at this?

Contributor

@nastra nastra left a comment

LGTM

@@ -805,6 +807,47 @@ public void testPartitionSpecEvolutionRemoval() {
}
}

@Test
public void testPartitionSpecEvolutionToUnpartitioned() throws IOException {
Collaborator

@szehon-ho szehon-ho May 10, 2023

Actually, we have a test class specifically for this: TestMetadataTableScansWithPartitionEvolution. For organizational purposes, could we move it there?

Another question: does this test cover the failing case? I thought it only failed on V2, because on V1 I had seen that a removed partition field remains in the latest spec with a void transform.

Member Author

I moved it now.

> Another question: does this test cover the failing case? I thought it only failed on V2, because on V1 I had seen that a removed partition field remains in the latest spec with a void transform.

Yes, without the PR changes the test case fails only for V2, because V1 keeps a void transform in the spec, which satisfies the check. But I thought it could run for both.

BTW, I have now removed the scan code from the test case, because my intention was just to scan the spec_id column and verify that it is filled instead of null. I don't know how to achieve that here. I know how to do it in SQL or Spark code, but I am not sure about this table scan, since it returns content files instead of rows.
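
The V1 versus V2 difference discussed in this thread can be sketched as follows. This is a simplified model under stated assumptions, not Iceberg code: a partition field is a `"sourceColumn:transform"` string, `removeField` and `latestSpecHasFields` are hypothetical helpers, and the check is assumed to look only at the field count of the latest spec, as the comments above suggest.

```java
import java.util.ArrayList;
import java.util.List;

public class SpecRemovalByFormatVersion {

  // Hypothetical model of removing a partition field:
  // on V1 the field stays in the spec with a void transform, on V2 it is dropped.
  public static List<String> removeField(
      List<String> fields, String sourceColumn, int formatVersion) {
    List<String> result = new ArrayList<>();
    for (String field : fields) {
      boolean matches = field.startsWith(sourceColumn + ":");
      if (!matches) {
        result.add(field);
      } else if (formatVersion == 1) {
        result.add(sourceColumn + ":void");
      }
      // formatVersion 2: the matching field is simply omitted
    }
    return result;
  }

  // Assumed check: the latest spec counts as partitioned if it has any fields.
  public static boolean latestSpecHasFields(List<String> latestSpecFields) {
    return !latestSpecFields.isEmpty();
  }

  public static void main(String[] args) {
    List<String> original = List.of("data:bucket[16]");

    List<String> v1 = removeField(original, "data", 1);
    List<String> v2 = removeField(original, "data", 2);

    System.out.println(v1 + " -> " + latestSpecHasFields(v1)); // [data:void] -> true
    System.out.println(v2 + " -> " + latestSpecHasFields(v2)); // [] -> false
  }
}
```

Under these assumptions, a check on the latest spec alone still passes on V1 because of the leftover void field, which is why the failure would show up only on V2.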

Collaborator

Thanks.

I think the original asserts you had before moving the test worked pretty well. It would be nice to have those, but it's up to you.

Member Author

OK, added them back.

Collaborator

@szehon-ho szehon-ho left a comment

Looks great, just a nit and question on test

core/src/main/java/org/apache/iceberg/PartitionsTable.java (outdated, resolved)
Collaborator

@szehon-ho szehon-ho left a comment

Looks good to me. Maybe we can put your original test asserts back, as the test is not very interesting as it is? But up to you.

@@ -222,6 +222,14 @@ public void testPositionDeletesPartitionSpecRemoval() {
constantsMap(posDeleteTask, partitionType).get(MetadataColumns.FILE_PATH.fieldId()));
}

@Test
public void testPartitionSpecEvolutionToUnpartitioned() {
Contributor

It feels like the test name is not super clear. Maybe testPreviouslyPartitionedSpecStillShowPartitionColumnWhenEvolveToUnpartitioned?

Collaborator

@szehon-ho szehon-ho May 11, 2023

Hm, it looks a bit long :) In that case, I'd vote to bring back the original asserts from the first commit (54b95ea), so that it's a more end-to-end test and we don't have to change the name too much, as I don't think it's covered elsewhere (though I realize it's not related to this PR).

Collaborator

@szehon-ho szehon-ho left a comment

LGTM, will merge tomorrow pending any other comments.

@szehon-ho szehon-ho merged commit dbaeadf into apache:master May 12, 2023
41 checks passed
@szehon-ho
Collaborator

Merged, thanks @ajantha-bhat and @nastra for additional review!

dramaticlly pushed a commit to dramaticlly/iceberg that referenced this pull request May 23, 2023
Development

Successfully merging this pull request may close these issues.

Partitions table for currently unpartitioned V2 table should show partitions if partitioned at some point
4 participants