[SPARK-32219][SQL] Add SHOW CACHED TABLES Command #29034

Closed
wants to merge 13 commits into from

ulysses-you
Contributor

What changes were proposed in this pull request?

Extend the SHOW TABLES command to support SHOW CACHED TABLES.
After this PR, we can run:

SHOW CACHED TABLES;

Why are the changes needed?

Once a table is cached in SQL mode, there is no way to list it afterwards, so we cannot tell which tables are cached.
To free part of the cache, the only option is CLEAR CACHE, which removes every cached table.
SHOW CACHED TABLES provides a way to find which tables are cached, so we can uncache just the ones we want.

Does this PR introduce any user-facing change?

Yes, a new command.

How was this patch tested?

New test.

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125308 has finished for PR 29034 at commit 5d362f6.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Jul 8, 2020

Looks useful. cc: @HyukjinKwon @viirya

@maropu
Member

maropu commented Jul 8, 2020

retest this please

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125335 has started for PR 29034 at commit 5d362f6.

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125345 has finished for PR 29034 at commit 564725a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Jul 8, 2020

retest this please

@SparkQA

SparkQA commented Jul 8, 2020

Test build #125372 has finished for PR 29034 at commit 564725a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@viirya viirya left a comment

Catalog provides an isCached(tableName: String) API. Can't we use it to find out which tables are cached?
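
For illustration, a minimal sketch of the Catalog-based approach, written against PySpark's `spark.catalog` (`listTables` and `isCached`). The helper name `cached_table_names` is hypothetical and not part of Spark, and the sketch only applies where a programmatic session is available:

```python
from typing import List, Optional


def cached_table_names(catalog, db_name: Optional[str] = None) -> List[str]:
    """Return the names of the tables in db_name that are currently cached.

    `catalog` is assumed to behave like PySpark's `spark.catalog`:
    listTables(dbName) returns entries with name/database/isTemporary,
    and isCached(tableName) returns a bool.
    """
    names = []
    for table in catalog.listTables(db_name):
        # Temporary views have no database, so refer to them by bare name.
        if table.isTemporary or not table.database:
            qualified = table.name
        else:
            qualified = f"{table.database}.{table.name}"
        if catalog.isCached(qualified):
            names.append(qualified)
    return names
```

With a live session this would be called as `cached_table_names(spark.catalog, "default")`. It issues one `isCached` call per table.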

@dilipbiswal
Contributor

dilipbiswal commented Jul 9, 2020

@maropu @viirya Also, I was thinking: if we want to make this information available from one of the SHOW commands, can't we extend SHOW TABLES to show another attribute, isCached? Do we need special syntax for it? I guess the one advantage of this command is that it acts as a filter and shows only the cached tables. But having this in one command gives a complete view of all the tables, regular, temporary and cached tables.

And we could apply a filter to see only individual table types.

spark.sql("show tables").where("isCached= true").show
spark.sql("show tables").where("isTemporary= true").show

What do you think?
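
To make the proposal concrete, here is a rough sketch of the "one complete view, then filter" idea. SHOW TABLES has no isCached column at this point, so the sketch derives that flag per table through `spark.catalog.isCached` in PySpark. `table_overview` and `names_where` are hypothetical helper names, not Spark APIs:

```python
from typing import Dict, List, Optional


def table_overview(catalog, db_name: Optional[str] = None) -> List[Dict]:
    """Build one row per table with isTemporary and isCached flags.

    `catalog` is assumed to behave like PySpark's `spark.catalog`.
    """
    rows = []
    for t in catalog.listTables(db_name):
        # Temporary views have no database, so use the bare name for them.
        if t.isTemporary or not t.database:
            name = t.name
        else:
            name = f"{t.database}.{t.name}"
        rows.append({
            "tableName": name,
            "isTemporary": t.isTemporary,
            "isCached": catalog.isCached(name),
        })
    return rows


def names_where(rows: List[Dict], flag: str) -> List[str]:
    """Filter the overview on a boolean column such as 'isCached'."""
    return [row["tableName"] for row in rows if row[flag]]
```

The overview plays the role of the extended SHOW TABLES output, and `names_where(rows, "isCached")` corresponds to the `where("isCached= true")` filter above.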

@ulysses-you
Contributor Author

@viirya We can, but in SQL mode (e.g. ThriftServer) there is no way to call that API.

@maropu
Member

maropu commented Jul 9, 2020

Yea, I think this fix is for SQL interfaces.

But having this in one command gives a complete view of all the tables, regular, temporary and cached tables.

Hm, adding a new column isCached looks fine too for that purpose. WDYT, @ulysses-you?
I feel that might be better than adding a new parser token, CACHED.

@ulysses-you
Contributor Author

It's OK to add an isCached attribute, but that is not enough, for two reasons:

  1. A database can contain hundreds of tables, and spark.sql("show tables").where("isCached= true").show cannot be used in SQL mode, so it is hard to pick out the cached tables among all of them.
  2. SHOW TABLES in SparkSQLDriver shows only the table name, without any other attribute, for compatibility with Hive, so the new attribute is useless in that scenario.

@dilipbiswal
Contributor

@ulysses-you

A database can contain hundreds of tables, and spark.sql("show tables").where("isCached= true").show cannot be used in SQL mode, so it is hard to pick out the cached tables among all of them.

OK, sounds good!

SHOW TABLES in SparkSQLDriver shows only the table name, without any other attribute, for compatibility with Hive, so the new attribute is useless in that scenario.

I was not aware of this. Thanks!

@maropu
Member

maropu commented Jul 10, 2020

retest this please

@maropu
Member

maropu commented Jul 10, 2020

Looks okay as it is.

@SparkQA

SparkQA commented Jul 10, 2020

Test build #125534 has finished for PR 29034 at commit 564725a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 10, 2020

Test build #125556 has finished for PR 29034 at commit 5138cdd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Jul 10, 2020

retest this please

@SparkQA

SparkQA commented Jul 10, 2020

Test build #125584 has finished for PR 29034 at commit 5138cdd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Jul 10, 2020

retest this please

@maropu
Member

maropu commented Jul 10, 2020

@ulysses-you Could you resolve the conflict?

@SparkQA

SparkQA commented Jul 11, 2020

Test build #125646 has finished for PR 29034 at commit 5138cdd.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 11, 2020

Test build #125655 has finished for PR 29034 at commit 14ead4e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class IsExecutorAlive(executorId: String) extends CoarseGrainedClusterMessage
  • class DisableHints(conf: SQLConf) extends RemoveAllHints(conf: SQLConf)

@SparkQA

SparkQA commented Jul 11, 2020

Test build #125664 has finished for PR 29034 at commit 856026b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ulysses-you
Contributor Author

cc @maropu @cloud-fan, thanks for the review.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Jul 20, 2020

Test build #126170 has finished for PR 29034 at commit 856026b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Aug 3, 2020

retest this please

@SparkQA

SparkQA commented Aug 3, 2020

Test build #126987 has finished for PR 29034 at commit 856026b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Member

maropu commented Aug 6, 2020

I'm okay with supporting this and have no further comments. Could anyone else check this? Also cc: @cloud-fan

@SparkQA

SparkQA commented Aug 7, 2020

Test build #127196 has finished for PR 29034 at commit 13e1c35.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 7, 2020

Test build #127195 has finished for PR 29034 at commit 3a0b049.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 7, 2020

Test build #127206 has finished for PR 29034 at commit 8262fe2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Aug 9, 2020

Test build #127238 has finished for PR 29034 at commit 929c00d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 11, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35530/

@SparkQA

SparkQA commented Nov 11, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35530/

@SparkQA

SparkQA commented Nov 11, 2020

Test build #130925 has finished for PR 29034 at commit a9b88ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 21, 2020

Test build #133162 has finished for PR 29034 at commit 63163ed.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 23, 2020

Test build #133312 has finished for PR 29034 at commit 635d3ab.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@github-actions

github-actions bot commented Apr 3, 2021

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

github-actions bot added the Stale label Apr 3, 2021
github-actions bot closed this Apr 4, 2021
7 participants