[SPARK-21213][SQL][FOLLOWUP] Improve partition statistics in AnalyzePartitionCommand #23584
… of old table stats in AnalyzePartitionCommand
cc @mbasmanova @cloud-fan @gatorsmile Would you take a look if you have time? :)
@@ -110,7 +110,7 @@ case class AnalyzePartitionCommand(
       val newTotalSize = CommandUtils.calculateLocationSize(
         sessionState, tableMeta.identifier, p.storage.locationUri)
       val newRowCount = rowCounts.get(p.spec)
-      val newStats = CommandUtils.compareAndGetNewStats(tableMeta.stats, newTotalSize, newRowCount)
+      val newStats = CommandUtils.compareAndGetNewStats(p.stats, newTotalSize, newRowCount)
Is there a difference?
Yes, they are different. The stats in tableMeta are table-level stats; the stats in Partition are partition-level stats.
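To make the difference concrete, here is a minimal sketch of a "return new stats only if something changed" comparison. `Stats` and `statsIfChanged` are hypothetical stand-ins written for this illustration; they are not Spark's `CatalogStatistics` or `CommandUtils.compareAndGetNewStats`, and the sizes and row counts are made up.

```scala
// Hypothetical stand-in for a stats record; not Spark's CatalogStatistics.
final case class Stats(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

object StatsSketch {
  // Returns Some(stats) only when a freshly computed value differs from the old one.
  def statsIfChanged(
      oldStats: Option[Stats],
      newTotalSize: BigInt,
      newRowCount: Option[BigInt]): Option[Stats] = {
    val oldSize = oldStats.map(_.sizeInBytes).getOrElse(BigInt(-1))
    val oldRows = oldStats.flatMap(_.rowCount).getOrElse(BigInt(-1))
    val sizeChanged = newTotalSize >= 0 && newTotalSize != oldSize
    val rowsChanged = newRowCount.exists(n => n >= 0 && n != oldRows)
    if (!sizeChanged && !rowsChanged) None
    else Some(Stats(
      sizeInBytes = if (sizeChanged) newTotalSize else oldSize,
      rowCount = if (rowsChanged) newRowCount else None))
  }

  // Made-up numbers: the table is much larger than any single partition.
  val tableStats = Stats(BigInt(1000), Some(BigInt(50)))
  val partitionStats = Stats(BigInt(100), Some(BigInt(5)))

  // Comparing an unchanged partition against table-level stats reports a change.
  val againstTable =
    statsIfChanged(Some(tableStats), BigInt(100), Some(BigInt(5)))

  // Comparing against the partition's own stats reports no change.
  val againstPartition =
    statsIfChanged(Some(partitionStats), BigInt(100), Some(BigInt(5)))
}
```

Under these assumptions, `againstTable` is non-empty even though the partition itself did not change, while `againstPartition` is `None`.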
Also cc @wzhfy
Can one of the admins verify this patch?
We're closing this PR because it hasn't been updated in a while. If you'd like to revive this PR, please reopen it!
What changes were proposed in this pull request?
This PR proposes to improve partition statistics in AnalyzePartitionCommand:
- Keep Spark stats on the partitions returned by HiveExternalCatalog.listPartitions and HiveExternalCatalog.listPartitionsByFilter.
- Compare against partition stats instead of the old table stats in AnalyzePartitionCommand.
Thus, partitions listed in AnalyzePartitionCommand would contain Spark stats, and HiveMetaStore would not be updated if the stats have not changed.
spark/sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala
Line 87 in 6d9c54b
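The effect on metastore updates can be sketched as follows. This is an illustration under stated assumptions, not the code in this patch: `PartStats`, `Partition`, and `partitionsToUpdate` are hypothetical names, and the partition spec and numbers are made up. Only partitions whose freshly computed stats differ from the stats they were listed with are selected for an update.

```scala
// Hypothetical types for illustration; not Spark's catalog classes.
final case class PartStats(sizeInBytes: BigInt, rowCount: Option[BigInt])
final case class Partition(spec: Map[String, String], stats: Option[PartStats])

object AnalyzeSketch {
  // Returns only the partitions whose fresh stats differ from the listed ones.
  def partitionsToUpdate(
      partitions: Seq[Partition],
      fresh: Map[Map[String, String], PartStats]): Seq[Partition] =
    partitions.flatMap { p =>
      fresh.get(p.spec)
        .filterNot(s => p.stats.contains(s)) // unchanged stats are dropped
        .map(s => p.copy(stats = Some(s)))
    }

  val spec = Map("ds" -> "2019-01-01")
  val freshStats = Map(spec -> PartStats(BigInt(100), Some(BigInt(5))))

  // Listed with its stats: nothing changed, so nothing needs to be written.
  val listedWithStats =
    Seq(Partition(spec, Some(PartStats(BigInt(100), Some(BigInt(5))))))

  // Listed without stats: every analyzed partition looks changed.
  val listedWithoutStats = Seq(Partition(spec, None))
}
```

In this sketch, `partitionsToUpdate(listedWithStats, freshStats)` is empty, so a caller could skip the metastore call entirely, whereas `listedWithoutStats` always yields an update.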
How was this patch tested?
Modified existing tests.