Skip to content

Conversation

@yeshengm
Copy link
Contributor

@yeshengm yeshengm commented Feb 26, 2021

What changes were proposed in this pull request?

ExternalCatalog call getPartition restores partition-level stats from Hive table metadata. However, listPartitions and listPartitionsByFilter calls do not restore these partition stats, which leads to discrepancies between returned CatalogPartition between these API calls.

Why are the changes needed?

Fix discrepancies between similar APIs.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing UTs. Ideally this should also be tested in ExternalCatalogSuite, but there's no existing tests in ExternalCatalogSuite for metastore stats. I can add tests if reviewers raise concern.

@github-actions github-actions bot added the SQL label Feb 26, 2021
@SparkQA
Copy link

SparkQA commented Feb 26, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40082/

@SparkQA
Copy link

SparkQA commented Feb 26, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40082/

@SparkQA
Copy link

SparkQA commented Feb 26, 2021

Test build #135501 has finished for PR 31661 at commit d702295.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we can have a unit test case, @yeshengm ?

cc @MaxGekk

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would propose to inline restorePartitionSpec() into restorePartitionMetadata(), and use restorePartitionMetadata() everywhere instead of restorePartitionSpec().

Also I would rename restorePartitionMetadata to fromMetaStorePartitionSpec as the opposite function to toMetaStorePartitionSpec but this is optional.

val metaStoreSpec = partialSpec.map(toMetaStorePartitionSpec)
val res = client.getPartitions(db, table, metaStoreSpec)
.map { part => part.copy(spec = restorePartitionSpec(part.spec, partColNameMap))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you are here, could you move } up:

.map { part => part.copy(spec = restorePartitionSpec(part.spec, partColNameMap)) }

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jun 14, 2021
@github-actions github-actions bot closed this Jun 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants