@wangyum
Member

@wangyum wangyum commented Sep 5, 2019

What changes were proposed in this pull request?

This PR moves the Hive test jars (hive-contrib-*.jar and hive-hcatalog-core-*.jar) from Maven dependencies to local files.

Why are the changes needed?

The --jars option can't be tested because hive-contrib-*.jar and hive-hcatalog-core-*.jar are already on the classpath.
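
To illustrate why such a test is inconclusive, here is a minimal Scala sketch (not part of this PR) that checks whether a class already resolves through the application classloader. If a SerDe class such as org.apache.hive.hcatalog.data.JsonSerDe is loadable before --jars is applied, a passing test says nothing about whether --jars worked. `ClasspathCheck` and `isLoadable` are hypothetical names used only for illustration.

```scala
object ClasspathCheck {
  /** True if `className` resolves through `loader`; the class is not initialized. */
  def isLoadable(className: String, loader: ClassLoader): Boolean =
    try {
      Class.forName(className, false, loader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit = {
    // Assumption: the system classloader stands in for the test JVM's classpath.
    val appLoader = ClassLoader.getSystemClassLoader
    val serde = "org.apache.hive.hcatalog.data.JsonSerDe"
    if (isLoadable(serde, appLoader)) {
      println(s"$serde is already on the classpath, so a --jars test cannot fail")
    } else {
      println(s"$serde is absent, so a --jars test is meaningful")
    }
  }
}
```

Running a check like this in the test JVM before the jar is passed would show whether the test can actually distinguish a working --jars from a broken one.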

Does this PR introduce any user-facing change?

No.

How was this patch tested?

manual test

@wangyum
Member Author

wangyum commented Sep 5, 2019

@SparkQA

SparkQA commented Sep 5, 2019

Test build #110166 has finished for PR 25690 at commit b02cc4d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class TestHiveVersion(hiveClient: HiveClient)
  • class TestHiveContext(
  • case class TestTable(name: String, commands: (() => Unit)*)
  • protected[hive] implicit class SqlCmd(sql: String)

@wangyum wangyum changed the title Revert "[SPARK-27831][SQL][TEST] Move Hive test jars to maven dependency" [test-hadoop3.2] Revert "[SPARK-27831][SQL][TEST] Move Hive test jars to maven dependency" Sep 5, 2019
@wangyum
Member Author

wangyum commented Sep 5, 2019

retest this please

@SparkQA

SparkQA commented Sep 5, 2019

Test build #110178 has finished for PR 25690 at commit b02cc4d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class TestHiveVersion(hiveClient: HiveClient)
  • class TestHiveContext(
  • case class TestTable(name: String, commands: (() => Unit)*)
  • protected[hive] implicit class SqlCmd(sql: String)


  test("Commands using SerDe provided in --jars") {
-   val jarFile = HiveTestUtils.getHiveHcatalogCoreJar.getCanonicalPath
+   val jarFile = "../hive/src/test/resources/" +
Member

Adding these JARs to the source tree has some LICENSE and NOTICE implications. The Hive NOTICE text from these JARs would have to go in NOTICE (and probably NOTICE-binary, since we publish test JARs). They'd have to be listed in LICENSE and LICENSE-binary too.

This is possible, but is it equally possible to just download these like we do with other Hive jars?

Member

@dongjoon-hyun dongjoon-hyun left a comment

Technically, this doesn't look like a complete revert of that commit. In this case, please be clear on that.
For example, this PR didn't revert the following from hive-thriftserver/pom.xml.

@dongjoon-hyun
Member

IMO, if you still depend on some code from the previous commit, this is just a clean-up follow-up.

If you use Revert, some downstreams will try to skip both commits (this one and the original one). Then it will fail.

@wangyum
Member Author

wangyum commented Sep 6, 2019

Thank you @srowen @dongjoon-hyun, I see. I will update it later.

@wangyum wangyum changed the title [test-hadoop3.2] Revert "[SPARK-27831][SQL][TEST] Move Hive test jars to maven dependency" [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to maven dependency Sep 6, 2019
@HyukjinKwon
Member

(Let's update and elaborate the PR description as well)

@SparkQA

SparkQA commented Sep 6, 2019

Test build #110216 has finished for PR 25690 at commit ca57402.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum changed the title [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to maven dependency [WIP][SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file Sep 6, 2019
@SparkQA

SparkQA commented Sep 6, 2019

Test build #110223 has finished for PR 25690 at commit 1a90f38.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Sep 6, 2019

retest this please

@SparkQA

SparkQA commented Sep 6, 2019

Test build #110228 has finished for PR 25690 at commit 1a90f38.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Sep 6, 2019

I think the approach is valid, although I know there's a strong preference to avoid putting JAR files in the source tree. This is possibly a valid case for doing so, as it's test code and more or less requires the JAR file to exist locally. That said, I wonder if we can just reuse the code that is used elsewhere to download Hive JARs here? It might need a little refactoring, but it already exists and already uses the ASF mirrors, etc.

@wangyum wangyum changed the title [WIP][SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven] Move Hive test jars to local file [WIP][SPARK-27831][FOLLOW-UP][SQL][TEST] Move Hive test jars to local file Sep 7, 2019
@SparkQA

SparkQA commented Sep 7, 2019

Test build #110276 has finished for PR 25690 at commit e18cf28.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • case class TestHiveVersion(hiveClient: HiveClient)
  • class TestHiveContext(
  • case class TestTable(name: String, commands: (() => Unit)*)
  • protected[hive] implicit class SqlCmd(sql: String)

@SparkQA

SparkQA commented Sep 7, 2019

Test build #110277 has finished for PR 25690 at commit 64c614a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Sep 7, 2019

Test build #110279 has finished for PR 25690 at commit d5d0b7f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum changed the title [WIP][SPARK-27831][FOLLOW-UP][SQL][TEST] Move Hive test jars to local file [WIP][SPARK-27831][FOLLOW-UP][SQL][TEST][test-hadoop3.2] Move Hive test jars to local file Sep 7, 2019
@wangyum
Member Author

wangyum commented Sep 24, 2019

retest this please

@SparkQA

SparkQA commented Sep 24, 2019

Test build #111294 has started for PR 25690 at commit e4ed806.

@wangyum
Member Author

wangyum commented Sep 24, 2019

retest this please

@SparkQA

SparkQA commented Sep 24, 2019

Test build #111301 has finished for PR 25690 at commit e4ed806.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Sep 24, 2019

retest this please

@SparkQA

SparkQA commented Sep 25, 2019

Test build #111314 has finished for PR 25690 at commit e4ed806.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Sep 25, 2019

@srowen @HyukjinKwon @dongjoon-hyun The test failure below is fixed by #25775. Which one should be merged first?

- SPARK-8368: includes jars passed in through --jars *** FAILED ***
...
  2019-09-24 20:55:40.8 - stderr> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:79)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.hive.execution.HiveTableScanExec.addColumnMetadataToConf(HiveTableScanExec.scala:123)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.hive.execution.HiveTableScanExec.hadoopConf$lzycompute(HiveTableScanExec.scala:101)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.hive.execution.HiveTableScanExec.hadoopConf(HiveTableScanExec.scala:98)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.hive.execution.HiveTableScanExec.hadoopReader$lzycompute(HiveTableScanExec.scala:110)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.hive.execution.HiveTableScanExec.hadoopReader(HiveTableScanExec.scala:105)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.hive.execution.HiveTableScanExec.$anonfun$doExecute$1(HiveTableScanExec.scala:188)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.util.Utils$.withDummyCallSite(Utils.scala:2488)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.hive.execution.HiveTableScanExec.doExecute(HiveTableScanExec.scala:188)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:189)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:227)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:224)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:185)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.inputRDD$lzycompute(ShuffleExchangeExec.scala:64)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.inputRDD(ShuffleExchangeExec.scala:64)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.shuffleDependency$lzycompute(ShuffleExchangeExec.scala:74)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.shuffleDependency(ShuffleExchangeExec.scala:72)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.createShuffledRDD(ShuffleExchangeExec.scala:82)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.$anonfun$doExecute$1(ShuffleExchangeExec.scala:93)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
  2019-09-24 20:55:40.8 - stderr> 	... 67 more
  2019-09-24 20:55:40.8 - stderr> Caused by: java.lang.ClassNotFoundException: org.apache.hive.hcatalog.data.JsonSerDe
  2019-09-24 20:55:40.8 - stderr> 	at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
  2019-09-24 20:55:40.8 - stderr> 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:588)
  2019-09-24 20:55:40.8 - stderr> 	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
  2019-09-24 20:55:40.8 - stderr> 	at java.base/java.lang.Class.forName0(Native Method)
  2019-09-24 20:55:40.8 - stderr> 	at java.base/java.lang.Class.forName(Class.java:398)
  2019-09-24 20:55:40.8 - stderr> 	at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:76)
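
When chasing a ClassNotFoundException like the one above, one quick check is whether the jar handed to --jars actually contains the class. The following is a minimal sketch using only the JDK, not code from this PR; `JarInspect` and `containsClass` are hypothetical names.

```scala
import java.io.File
import java.util.jar.JarFile

object JarInspect {
  /** True if the jar has a .class entry for the fully qualified `className`. */
  def containsClass(jar: File, className: String): Boolean = {
    // Class names map to jar entries by replacing dots with slashes.
    val entryName = className.replace('.', '/') + ".class"
    val jarFile = new JarFile(jar)
    try jarFile.getEntry(entryName) != null
    finally jarFile.close()
  }
}
```

For example, `JarInspect.containsClass(jar, "org.apache.hive.hcatalog.data.JsonSerDe")` on the hive-hcatalog-core jar would separate "the jar is wrong" from "the jar never reached the classloader".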

@srowen
Member

srowen commented Sep 27, 2019

@wangyum #25775 is merged

@SparkQA

SparkQA commented Sep 27, 2019

Test build #4886 has started for PR 25690 at commit e4ed806.

@wangyum
Member Author

wangyum commented Sep 27, 2019

Thank you @srowen

}

private[hive] object HiveTestJars {
private val repository = SQLConf.ADDITIONAL_REMOTE_REPOSITORIES.defaultValueString
Member Author

We can also verify that the default value is valid.
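
As a rough illustration of the download approach discussed in this thread, the sketch below builds a jar URL from a repository base using the standard Maven repository layout and copies it to a local directory. It is an assumption-laden sketch, not the code in this PR: `TestJarFetcher`, `jarUrl`, and `fetch` are hypothetical names, it uses plain JDK calls rather than any Spark utility, and the repository URL and version in the usage note are placeholders.

```scala
import java.io.File
import java.net.URL
import java.nio.file.{Files, StandardCopyOption}

object TestJarFetcher {
  /** Builds the jar URL following the standard Maven repository layout. */
  def jarUrl(repository: String, groupId: String, artifactId: String, version: String): String = {
    val base = if (repository.endsWith("/")) repository else repository + "/"
    s"$base${groupId.replace('.', '/')}/$artifactId/$version/$artifactId-$version.jar"
  }

  /** Downloads `url` into `targetDir` and returns the local file. */
  def fetch(url: String, targetDir: File): File = {
    // The local file name is the last path segment of the URL.
    val target = new File(targetDir, url.substring(url.lastIndexOf('/') + 1))
    val in = new URL(url).openStream()
    try Files.copy(in, target.toPath, StandardCopyOption.REPLACE_EXISTING)
    finally in.close()
    target
  }
}
```

A test could then call something like `TestJarFetcher.fetch(TestJarFetcher.jarUrl(repository, "org.apache.hive", "hive-contrib", version), tmpDir)` and pass the resulting path to --jars, which keeps the jar out of both the source tree and the compile-time classpath.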

@SparkQA

SparkQA commented Sep 27, 2019

Test build #111486 has finished for PR 25690 at commit 882ae45.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum
Member Author

wangyum commented Sep 27, 2019

retest this please

@SparkQA

SparkQA commented Sep 28, 2019

Test build #111502 has finished for PR 25690 at commit 882ae45.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum changed the title [SPARK-27831][FOLLOW-UP][SQL][TEST][test-maven][test-hadoop3.2][test-java11] Move Hive test jars to local file [SPARK-27831][FOLLOW-UP][SQL][TEST] Move Hive test jars to local file Sep 28, 2019
@wangyum
Member Author

wangyum commented Sep 28, 2019

retest this please

@SparkQA

SparkQA commented Sep 28, 2019

Test build #111527 has finished for PR 25690 at commit 882ae45.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangyum wangyum changed the title [SPARK-27831][FOLLOW-UP][SQL][TEST] Move Hive test jars to local file [SPARK-27831][FOLLOW-UP][SQL][TEST] Should not use maven to add Hive test jars Sep 28, 2019
@wangyum
Member Author

wangyum commented Sep 28, 2019

@srowen @dongjoon-hyun @HyukjinKwon Do you have any comments?

@wangyum wangyum closed this in 8167714 Sep 28, 2019
@wangyum
Member Author

wangyum commented Sep 28, 2019

Thank you all.

@wangyum
Member Author

wangyum commented Sep 28, 2019

Merged to master.

@wangyum wangyum deleted the SPARK-27831-revert branch September 28, 2019 23:56
7 participants