@AngersZhuuuu
Contributor

@AngersZhuuuu AngersZhuuuu commented Sep 9, 2019

What changes were proposed in this pull request?

For the issue described in SPARK-29022:
The Spark SQL CLI can't use a class from a jar added by SQL ADD JAR as a serde class.

When we create a table whose serde class is contained in a jar added by SQL 'ADD JAR', the table is created successfully, because HiveClientImpl.createTable is called under the withHiveState method, which adds clientLoader.classLoader to HiveClientImpl.state.getConf.classLoader.

Jars added by SQL ADD JAR are added to:

  1. sparkSession.sharedState.jarClassLoader
  2. HiveClientLoader.clientLoader.classLoader

In the current spark-sql mode, HiveClientImpl.state uses the CliSessionState created when SparkSQLCliDriver is initialized. When we select data from the table, the method HiveTableScanExec#addColumnMetadataToConf() is called and checks the serde class of the table desc:

val deserializer = tableDesc.getDeserializerClass.getConstructor().newInstance()
deserializer.initialize(hiveConf, tableDesc.getProperties)

In Spark SQL CLI mode, getDeserializer uses the classLoader of CliSessionState's hiveConf.
But when we call ADD JAR in Spark, the jar is not added to the ClassLoader of CliSessionState's conf, so a ClassNotFound error occurs.
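
The failure mode can be illustrated outside Spark with plain JVM class loaders. The following is only a minimal sketch, not code from this PR: `SerdeLookupSketch`, `resolve`, and `emptyLoader` are hypothetical names, and the empty loader merely stands in for a conf whose class loader never received the added jar.

```scala
import java.net.{URL, URLClassLoader}
import scala.util.Try

// Hypothetical illustration: class resolution depends on the loader that is asked.
object SerdeLookupSketch {
  // Resolve a class by name through an explicit class loader, without initializing it.
  def resolve(className: String, loader: ClassLoader): Try[Class[_]] =
    Try(Class.forName(className, false, loader))

  // A loader with no jars and no application parent (assumption: this mimics
  // a conf class loader that was never told about the jar from ADD JAR).
  def emptyLoader(): ClassLoader =
    new URLClassLoader(Array.empty[URL], null)
}
```

Asking such a loader for a class it has no jar for fails with ClassNotFoundException, even if some other loader in the same JVM could load it.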

So we reset the classLoader of CliSessionState's conf to sharedState.jarClassLoader once sharedState.jarClassLoader has added the jars passed by HIVEAUXJARS.
Then, when we use ADD JAR to add a jar, the jar path is also added to the ClassLoader of CliSessionState's conf.
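
The idea behind the reset can be sketched with a shared, mutable URL class loader. This is an illustration under stated assumptions rather than the actual patch: `MutableLoader`, `Conf`, `makeJarDir`, and `demo` are hypothetical stand-ins, and a temp directory with a marker file plays the role of a jar.

```scala
import java.net.{URL, URLClassLoader}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object SharedLoaderSketch {
  // Hypothetical loader that accepts new URLs at runtime.
  final class MutableLoader(parent: ClassLoader)
      extends URLClassLoader(Array.empty[URL], parent) {
    def add(url: URL): Unit = addURL(url)
  }

  // Hypothetical stand-in for a conf object that carries its own class loader.
  final class Conf(var classLoader: ClassLoader)

  // Creates a temp directory holding one marker resource; it plays the role of a jar.
  def makeJarDir(resourceName: String): Path = {
    val dir = Files.createTempDirectory("fake-jar")
    Files.write(dir.resolve(resourceName), "serde".getBytes(StandardCharsets.UTF_8))
    dir
  }

  def visible(conf: Conf, resourceName: String): Boolean =
    conf.classLoader.getResource(resourceName) != null

  // Returns (visibility before the reset, visibility after the reset).
  def demo(): (Boolean, Boolean) = {
    val shared = new MutableLoader(null)
    val conf = new Conf(new URLClassLoader(Array.empty[URL], null))

    // A jar added only to the shared loader is invisible through the conf's own loader.
    shared.add(makeJarDir("marker-a.txt").toUri.toURL)
    val before = visible(conf, "marker-a.txt")

    // After the conf points at the shared loader, later additions are visible too.
    conf.classLoader = shared
    shared.add(makeJarDir("marker-b.txt").toUri.toURL)
    val after = visible(conf, "marker-b.txt")

    (before, after)
  }
}
```

Because both sides hold the same loader instance after the reset, every later addition reaches the conf without any further copying.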

Why are the changes needed?

Fix bug

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added a UT.

@dongjoon-hyun
Member

Hi, @AngersZhuuuu . Is the JIRA issue ID correct? SPARK-29051 ?

@AngersZhuuuu AngersZhuuuu changed the title from "[WIP][SPARK-29051][SQL] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" to "[WIP][SPARK-29022][SQL] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" on Sep 9, 2019
@AngersZhuuuu
Contributor Author

> Hi, @AngersZhuuuu . Is the JIRA issue ID correct? SPARK-29051 ?

Sorry, I forgot to change it. This PR should wait for #25542

@wangyum
Member

wangyum commented Sep 13, 2019

ok to test

@SparkQA

SparkQA commented Sep 13, 2019

Test build #110556 has finished for PR 25729 at commit 9d347f6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • test("SPARK-29022 Use add jar class as serde")

@AngersZhuuuu AngersZhuuuu changed the title from "[WIP][SPARK-29022][SQL] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" to "[SPARK-29022][SQL] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" on Sep 13, 2019
@AngersZhuuuu AngersZhuuuu changed the title from "[SPARK-29022][SQL] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" to "[SPARK-29022][SQL][test-hadoop3.2] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" on Sep 13, 2019
@dongjoon-hyun
Member

Retest this please.

@dongjoon-hyun
Member

hadoop-2.7 is tested. I triggered hadoop-3.2. In a few minutes, I will trigger [test-hadoop3.2][test-java11], too.

@AngersZhuuuu
Contributor Author

> hadoop-2.7 is tested. I triggered hadoop-3.2. In a few minutes, I will trigger [test-hadoop3.2][test-java11], too.

Thanks.

@AngersZhuuuu AngersZhuuuu reopened this Sep 13, 2019
@dongjoon-hyun dongjoon-hyun changed the title from "[SPARK-29022][SQL][test-hadoop3.2] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" to "[SPARK-29022][SQL][test-hadoop3.2][test-java11] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" on Sep 13, 2019
@dongjoon-hyun
Member

@AngersZhuuuu . You need to change the title before triggering. 😄

@SparkQA

SparkQA commented Sep 13, 2019

Test build #110559 has finished for PR 25729 at commit 9d347f6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • test("SPARK-29022 Use add jar class as serde")

@AngersZhuuuu
Contributor Author

> @AngersZhuuuu . You need to change the title before triggering. 😄

I intended to change the title when the build finished. I forgot that I can just change it once you have triggered a build.
You have changed it. Maybe you should trigger the build again for hadoop3.2 & java11.

@dongjoon-hyun
Member

Retest this please.

@SparkQA

SparkQA commented Sep 13, 2019

Test build #110562 has finished for PR 25729 at commit 9d347f6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
      • test("SPARK-29022 Use add jar class as serde")

@wangyum
Member

wangyum commented Sep 16, 2019

@dongjoon-hyun @srowen Maybe we can fix this issue by another PR: #25775

@srowen
Member

srowen commented Sep 16, 2019

OK, so this should resolve the same issue? Are there arguments for or against this change vs. the other?

@AngersZhuuuu
Contributor Author

AngersZhuuuu commented Sep 16, 2019

@srowen @wangyum
I have added a new UT for this issue to cover the case of using a class from a hive.aux.jars jar as the serde.
#25775 can't solve the problem in this issue.

@dongjoon-hyun we may need to retest all cases again.

@SparkQA

SparkQA commented Sep 16, 2019

Test build #110645 has finished for PR 25729 at commit ce45394.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 17, 2019

Test build #110673 has finished for PR 25729 at commit 1d60b90.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 17, 2019

Test build #110714 has finished for PR 25729 at commit 5486f75.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 29, 2019

Test build #111553 has finished for PR 25729 at commit d070059.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 29, 2019

Test build #111554 has finished for PR 25729 at commit fcfd7af.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 29, 2019

Test build #111565 has finished for PR 25729 at commit db63cf2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// HiveThriftServer2 calls HiveUtils.newClientForExecution() to get a client for
// execution, and that method in turn runs this code path. In that case we can't
// reset ret.getConf's ClassLoader.
if (HiveUtils.isCliSessionState) {
Contributor Author

@srowen @wangyum

There is a special point here for jdk11. Do I need to add more explanation?

@SparkQA

SparkQA commented Sep 29, 2019

Test build #111568 has finished for PR 25729 at commit 918941e.

  • This patch fails PySpark pip packaging tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 30, 2019

Test build #111583 has finished for PR 25729 at commit 72d4d84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu AngersZhuuuu changed the title from "[SPARK-29022][SQL][test-hadoop3.2][test-java11] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" to "[SPARK-29022][SQL][test-hadoop3.2] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" on Sep 30, 2019
@AngersZhuuuu
Contributor Author

@dongjoon-hyun
The jdk11 test passed and I have changed the title. Thanks for triggering the build for jdk8 & hadoop3.2.

@wangyum
Member

wangyum commented Sep 30, 2019

retest this please

@SparkQA

SparkQA commented Sep 30, 2019

Test build #111598 has finished for PR 25729 at commit 72d4d84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu AngersZhuuuu changed the title from "[SPARK-29022][SQL][test-hadoop3.2] Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" to "[SPARK-29022][SQL]Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" on Sep 30, 2019
@wangyum
Member

wangyum commented Sep 30, 2019

retest this please

@SparkQA

SparkQA commented Sep 30, 2019

Test build #111599 has finished for PR 25729 at commit 72d4d84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@srowen srowen left a comment

Looks plausible. @wangyum ?

@AngersZhuuuu AngersZhuuuu changed the title from "[SPARK-29022][SQL]Fix spark 'add jar', CliSessionState's hiveConf 's classLoader ClassNotFound" to "[SPARK-29022][SQL] Reset CliSessionState.conf's ClassLoader to SharedState.jarClassLoader after initialize SparkSQLCliDriver and handle HIVEAUXJARS" on Oct 1, 2019
@AngersZhuuuu AngersZhuuuu changed the title from "[SPARK-29022][SQL] Reset CliSessionState.conf's ClassLoader to SharedState.jarClassLoader after initialize SparkSQLCliDriver and handle HIVEAUXJARS" to "[SPARK-29022][SQL]Fix SparkSQLCLI can not add jars by AddJarCommand" on Oct 1, 2019
Member

@wangyum wangyum left a comment

Thank you for pinging me, @srowen. The change LGTM.

@srowen srowen closed this in 0cf2f48 Oct 1, 2019
@srowen
Member

srowen commented Oct 1, 2019

Merged to master
