Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-17206][SQL] Support ANALYZE TABLE on analyzable temporary view #14780

Closed
wants to merge 4 commits into from

Conversation

viirya
Copy link
Member

@viirya viirya commented Aug 24, 2016

What changes were proposed in this pull request?

Currently ANALYZE TABLE DDL command can't work on temporary view. However, for the specified type of temporary view which is analyzable, we can support the DDL command for it. So the CBO can work with temporary view too.

How was this patch tested?

Jenkins tests.

@viirya
Copy link
Member Author

viirya commented Aug 24, 2016

@hvanhovell Based on the prior discussion, I opened a JIRA and this PR. Can you review it if it is on the right direction? Thanks.

@viirya viirya changed the title [SPARK-17206]SQL] Support ANALYZE TABLE on analyzable temporary view [SPARK-17206][SQL] Support ANALYZE TABLE on analyzable temporary view Aug 24, 2016
@SparkQA
Copy link

SparkQA commented Aug 24, 2016

Test build #64330 has finished for PR 14780 at commit cfbfefc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Aug 29, 2016

@hvanhovell @cloud-fan Can you help review this?

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala
	sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala
@viirya
Copy link
Member Author

viirya commented Sep 6, 2016

cc @hvanhovell @cloud-fan Can you take a look? Thanks.

@cloud-fan
Copy link
Contributor

cloud-fan commented Sep 6, 2016

what's the main benefit to analyze a temp view? I think table analyzing is an expensive operation, for temp views, we can't put the resulting statistics into metastore and it will go away if session terminates.

@viirya
Copy link
Member Author

viirya commented Sep 6, 2016

@cloud-fan I have a discussion with @hvanhovell at #14729 (comment).

I think the main benefit is the CBO can work on all tables. If we can't analyze on a temp view, the query plan involving temp views will not be applied on CBO.

We might not be able to correctly guess the use case of users. I try to image an use case like: If the user creates a temp view on a data source relation as in @hvanhovell's comment, because the data is temporary and will be changed, so the user doesn't need it to be persisted in metastore. But the user needs the query involving the temp view to execute in cost-efficient way with CBO.

@SparkQA
Copy link

SparkQA commented Sep 6, 2016

Test build #64986 has finished for PR 14780 at commit 29e22fe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 6, 2016

Test build #64991 has finished for PR 14780 at commit 6a3aff2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Sep 10, 2016

ping @hvanhovell @cloud-fan any more thoughts on this?

@viirya
Copy link
Member Author

viirya commented Sep 16, 2016

@hvanhovell Would you like to comment on this? Thanks.

@hvanhovell
Copy link
Contributor

@viirya this seems like a good idea. However, I want to wait with adding this until we have finished merging all the CBO related statistics stuff.

@viirya
Copy link
Member Author

viirya commented Sep 22, 2016

@hvanhovell ok. Thanks!

@SparkQA
Copy link

SparkQA commented Oct 29, 2016

Test build #67746 has finished for PR 14780 at commit 6a3aff2.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Jan 5, 2017

I close this for now and maybe reopen it when all the CBO related statistics stuff are merged.

@viirya viirya closed this Jan 5, 2017
@viirya viirya deleted the analyze-temp-table branch December 27, 2023 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants