[SPARK-34552][SQL] ExternalCatalog listPartitions and listPartitionsByFilter calls should also restore metadata #31661
dongjoon-hyun
left a comment
MaxGekk
left a comment
I would propose inlining restorePartitionSpec() into restorePartitionMetadata() and using restorePartitionMetadata() everywhere instead of restorePartitionSpec().
Also, I would rename restorePartitionMetadata to fromMetaStorePartitionSpec, as the opposite of toMetaStorePartitionSpec, but this is optional.
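To illustrate the naming idea, here is a minimal sketch of the two functions as an inverse pair. It models only the partition spec, not stats, and it assumes the metastore direction simply lower-cases column names; the object name and the simplified signatures are hypothetical and are not Spark's actual code.

```scala
import java.util.Locale

// Illustrative stand-in; not the real HiveExternalCatalog.
object PartitionSpecRoundTrip {
  type TablePartitionSpec = Map[String, String]

  // Assumption: the metastore keeps partition column names in lower case.
  def toMetaStorePartitionSpec(spec: TablePartitionSpec): TablePartitionSpec =
    spec.map { case (k, v) => k.toLowerCase(Locale.ROOT) -> v }

  // Opposite direction: map lower-cased names back to the table's own casing.
  // Names that are not partition columns are passed through unchanged.
  def fromMetaStorePartitionSpec(
      spec: TablePartitionSpec,
      partCols: Seq[String]): TablePartitionSpec = {
    val byLowerCase = partCols.map(c => c.toLowerCase(Locale.ROOT) -> c).toMap
    spec.map { case (k, v) => byLowerCase.getOrElse(k, k) -> v }
  }

  def main(args: Array[String]): Unit = {
    val original = Map("eventDate" -> "2021-02-01", "Region" -> "eu")
    val stored = toMetaStorePartitionSpec(original)
    val restored = fromMetaStorePartitionSpec(stored, Seq("eventDate", "Region"))
    println(s"stored = $stored, restored = $restored")
  }
}
```

With a single function for the return direction, every read path would go through the same restore step.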
    val metaStoreSpec = partialSpec.map(toMetaStorePartitionSpec)
    val res = client.getPartitions(db, table, metaStoreSpec)
      .map { part => part.copy(spec = restorePartitionSpec(part.spec, partColNameMap))
    }
As you are here, could you move } up:
.map { part => part.copy(spec = restorePartitionSpec(part.spec, partColNameMap)) }
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
The ExternalCatalog call getPartition restores partition-level stats from the Hive table metadata. However, listPartitions and listPartitionsByFilter do not restore these stats, so the same partition can come back as a different CatalogTablePartition depending on which API call returned it.
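The discrepancy can be sketched with a simplified model. Everything below is illustrative: the case classes, the object and method names, and the property keys are assumptions standing in for Spark's real types, not the actual HiveExternalCatalog code.

```scala
// Simplified stand-ins for the catalog types (hypothetical names).
final case class PartitionStats(rowCount: Long, sizeInBytes: Long)

final case class Partition(
    spec: Map[String, String],
    parameters: Map[String, String],
    stats: Option[PartitionStats] = None)

object PartitionRestoreSketch {
  // Assumed property keys under which stats are kept in partition parameters.
  private val StatsPrefix = "spark.sql.statistics."
  private val NumRowsKey = StatsPrefix + "numRows"
  private val TotalSizeKey = StatsPrefix + "totalSize"

  // Restore the original casing of partition column names.
  def restoreSpec(spec: Map[String, String], partCols: Seq[String]): Map[String, String] = {
    val byLowerCase = partCols.map(c => c.toLowerCase -> c).toMap
    spec.map { case (k, v) => byLowerCase.getOrElse(k.toLowerCase, k) -> v }
  }

  // What getPartition is described as doing: restore the spec and the stats.
  def restoreMetadata(raw: Partition, partCols: Seq[String]): Partition = {
    val stats = for {
      rows <- raw.parameters.get(NumRowsKey)
      size <- raw.parameters.get(TotalSizeKey)
    } yield PartitionStats(rows.toLong, size.toLong)
    // Drop the stats properties only when they were turned into a stats value.
    val params =
      if (stats.isDefined) raw.parameters.filterNot { case (k, _) => k.startsWith(StatsPrefix) }
      else raw.parameters
    raw.copy(spec = restoreSpec(raw.spec, partCols), parameters = params, stats = stats)
  }

  // Listing as described before this change: only the spec is restored.
  def listPartitionsSpecOnly(raw: Seq[Partition], partCols: Seq[String]): Seq[Partition] =
    raw.map(p => p.copy(spec = restoreSpec(p.spec, partCols)))

  // Listing after the proposed change: same restore path as getPartition.
  def listPartitionsRestored(raw: Seq[Partition], partCols: Seq[String]): Seq[Partition] =
    raw.map(restoreMetadata(_, partCols))

  def main(args: Array[String]): Unit = {
    val raw = Partition(
      spec = Map("ds" -> "2021-02-01"),
      parameters = Map(NumRowsKey -> "10", TotalSizeKey -> "2048"))
    println(listPartitionsSpecOnly(Seq(raw), Seq("DS")).head)
    println(listPartitionsRestored(Seq(raw), Seq("DS")).head)
  }
}
```

In this model, routing the listing calls through the same restore function used for a single partition is what removes the difference between the APIs.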
Why are the changes needed?
Fix discrepancies between similar APIs.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing UTs. Ideally this should also be tested in ExternalCatalogSuite, but there are no existing tests in ExternalCatalogSuite for metastore stats. I can add tests if reviewers raise concerns.