Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-32807][SQL] ThriftServer open session use direct API to init current DB #29656

Closed
wants to merge 2 commits into from

Conversation

AngersZhuuuu
Copy link
Contributor

What changes were proposed in this pull request?

When init current database, we can use direct API, don't need to call SQL

Why are the changes needed?

No

Does this PR introduce any user-facing change?

No

How was this patch tested?

Not need

@AngersZhuuuu
Copy link
Contributor Author

cc @wangyum @juliuszsompolski

@AngersZhuuuu AngersZhuuuu changed the title [SPARK-32807][SQL] ThriftServer open session slow when high concurrent when init current DB [SPARK-32807][SQL] ThriftServer open session use direct API to init current DB Sep 7, 2020
@SparkQA
Copy link

SparkQA commented Sep 7, 2020

Test build #128331 has finished for PR 29656 at commit a1cf29a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 7, 2020

Test build #128333 has finished for PR 29656 at commit 1f2fa6c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -69,7 +69,7 @@ private[hive] class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext:
setConfMap(ctx, hiveSessionState.getOverriddenConfigurations)
setConfMap(ctx, hiveSessionState.getHiveVariables)
if (sessionConf != null && sessionConf.containsKey("use:database")) {
ctx.sql(s"use ${sessionConf.get("use:database")}")
ctx.sessionState.catalog.setCurrentDatabase(sessionConf.get("use:database"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to be clear, is this just a small refactoring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to be clear, is this just a small refactoring?

In this way we can avoid a lot of process part, but seems I have miss a important problem that user may write a db name wrong such as aa.cc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add extra tests to verify the behaviour in case of failures? Is it the same error thrown? Are we skipping some checks along the way by calling the catalog functions?

@juliuszsompolski
Copy link
Contributor

This may change how this skipped processing handles errors. I would add some tests that verify such behaviour.
cc @bogdanghit @CJStuart

@AngersZhuuuu
Copy link
Contributor Author

This may change how this skipped processing handles errors. I would add some tests that verify such behaviour.
cc @bogdanghit @CJStuart

What I most want to do is make HiveClientImpl's client to be thread local var and remove lock to speed up query.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
6 participants