[SPARK-17206][SQL] Support ANALYZE TABLE on analyzable temporary view #14780
@hvanhovell Based on the prior discussion, I opened a JIRA and this PR. Could you review it and let me know whether it is heading in the right direction? Thanks.
Test build #64330 has finished for PR 14780 at commit
@hvanhovell @cloud-fan Can you help review this?
Conflicts:
sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala
cc @hvanhovell @cloud-fan Can you take a look? Thanks.
What's the main benefit of analyzing a temp view? I think analyzing a table is an expensive operation. For temp views, we can't put the resulting statistics into the metastore, and they will go away when the session terminates.
@cloud-fan I had a discussion with @hvanhovell at #14729 (comment). I think the main benefit is that the CBO can work on all tables. If we can't analyze a temp view, query plans involving temp views cannot benefit from the CBO. We might not be able to correctly guess users' use cases, but I can imagine one like this: the user creates a temp view on a data source relation, as in @hvanhovell's comment. Because the data is temporary and will change, the user doesn't need it to be persisted in the metastore. But the user still needs queries involving the temp view to execute in a cost-efficient way with the CBO.
Test build #64986 has finished for PR 14780 at commit
Test build #64991 has finished for PR 14780 at commit
ping @hvanhovell @cloud-fan any more thoughts on this?
@hvanhovell Would you like to comment on this? Thanks.
@viirya this seems like a good idea. However, I want to hold off on adding this until we have finished merging all the CBO-related statistics work.
@hvanhovell ok. Thanks!
Test build #67746 has finished for PR 14780 at commit
I am closing this for now and may reopen it once all the CBO-related statistics work is merged.
What changes were proposed in this pull request?
Currently, the ANALYZE TABLE DDL command does not work on temporary views. However, for the specific kinds of temporary view that are analyzable, we can support the command, so that the CBO can work with temporary views too.
How was this patch tested?
Jenkins tests.