[SPARK-22756] [Build] [SparkR] Run SparkR tests if hive_thriftserver module has code changes #19944

Closed
wants to merge 2 commits into from

@gatorsmile gatorsmile commented Dec 11, 2017

What changes were proposed in this pull request?

A recent PR that changed hive_thriftserver caused a test failure in the CRAN requirements check. To some extent, the SparkR module also depends on the hive_thriftserver module, which can output log files, so we should run the SparkR tests whenever the hive_thriftserver module has code changes.

How was this patch tested?

N/A

@@ -481,7 +481,7 @@ def __hash__(self):

sparkr = Module(
name="sparkr",
-    dependencies=[hive, mllib],
+    dependencies=[hive, mllib, hive_thriftserver],
Member Author

Since SparkR already depends on hive, this PR just adds the dependency on hive_thriftserver.
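
To illustrate why adding a dependency changes which tests run, here is a minimal sketch of dependency-driven test selection. It is not the actual test-support code from the Spark repository: the `Module` class, its `dependents` field, and `modules_to_test` are hypothetical simplifications, and the assumption that hive_thriftserver itself depends on hive is made only for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so modules can live in sets
class Module:
    """Hypothetical, simplified stand-in for a build module definition."""
    name: str
    dependencies: list = field(default_factory=list)
    dependents: set = field(default_factory=set, init=False, repr=False)

    def __post_init__(self):
        # Register this module as a dependent of everything it depends on.
        for dep in self.dependencies:
            dep.dependents.add(self)


def modules_to_test(changed):
    """Return the changed modules plus every module that transitively depends on them."""
    to_test = set()
    stack = list(changed)
    while stack:
        module = stack.pop()
        if module not in to_test:
            to_test.add(module)
            stack.extend(module.dependents)
    return to_test


hive = Module("hive")
mllib = Module("mllib")
# Assumption for this sketch: hive-thriftserver depends on hive.
hive_thriftserver = Module("hive-thriftserver", dependencies=[hive])
# With the change proposed in this PR, sparkr also lists hive_thriftserver.
sparkr = Module("sparkr", dependencies=[hive, mllib, hive_thriftserver])

selected = sorted(m.name for m in modules_to_test([hive_thriftserver]))
print(selected)  # ['hive-thriftserver', 'sparkr']
```

Under this model, a change that touches only hive_thriftserver selects sparkr solely because of the new entry in its `dependencies` list.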

@SparkQA

SparkQA commented Dec 11, 2017

Test build #84735 has finished for PR 19944 at commit 1fd2d53.

  • This patch fails due to an unknown error code, 255.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 12, 2017

Test build #84736 has finished for PR 19944 at commit 6439458.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

cc @felixcheung Very weird... I can reproduce it in my local environment. Could you take a look at why SparkR failed?

Caused by: org.apache.spark.SparkException: R computation failed with
 Error : requireNamespace("e1071", quietly = TRUE) is not TRUE
    at org.apache.spark.api.r.RRunner.compute(RRunner.scala:108)
    at org.apache.spark.api.r.BaseRRDD.compute(RRDD.scala:51)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    ... 1 more

@HyukjinKwon
Member

Hm, @gatorsmile. BTW, do you happen to know how the CRAN check could fail because of the changes in the thrift server? I was double-checking to be sure, but it sounds orthogonal to me now.

The test failure above seems to be due to the e1071 package missing from your local environment.

@felixcheung
Member

felixcheung commented Dec 12, 2017 via email

@felixcheung
Member

felixcheung commented Dec 12, 2017 via email

@HyukjinKwon
Member

HyukjinKwon commented Dec 12, 2017

Yup, and it looks like it is passing fine again now (since roughly a couple(?) of hours ago).

Seems the problem was this one(?)

* checking CRAN incoming feasibility ...Error in .check_package_CRAN_incoming(pkgdir) : 
  dims [product 39] do not match the length of object [0]
Execution halted
Loading required package: methods

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Dec 12, 2017

Test build #84768 has finished for PR 19944 at commit 6439458.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

<property>
    <name>hive.server2.logging.operation.enabled</name>
    <value>true</value>
</property>
<property>
    <name>hive.server2.logging.operation.level</name>
    <value>VERBOSE</value>
</property>
<property>
    <name>hive.querylog.location</name>
    <value>/data/logs/hive/${user.name}</value>
</property>

I am afraid that the thriftserver PR could break it, because it writes to the log if hive.server2.logging.operation.enabled is set to true.
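
As a rough way to check whether a given configuration would trigger that logging, the sketch below parses properties in the format shown above and reports whether operation logging is switched on. The `<configuration>` wrapper element and the helper names `read_properties` and `operation_logging_enabled` are assumptions made for the example. The sketch treats a missing property as disabled, which is a simplification and not necessarily the server's default.

```python
import xml.etree.ElementTree as ET

# Sample configuration; the <configuration> root element is assumed here
# so that the fragment parses as a single XML document.
HIVE_SITE = """
<configuration>
  <property>
    <name>hive.server2.logging.operation.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/data/logs/hive/${user.name}</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Collect every <property> element into a name -> value dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def operation_logging_enabled(props):
    """True only if the flag is present and set to 'true' (case-insensitive)."""
    value = props.get("hive.server2.logging.operation.enabled") or ""
    return value.strip().lower() == "true"


props = read_properties(HIVE_SITE)
print(operation_logging_enabled(props))  # True
```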

@gatorsmile
Member Author

How about merging this PR first, and then letting @zouchenjun resubmit the PR?

@gatorsmile
Member Author

cc @srowen @liancheng

@gatorsmile gatorsmile closed this Dec 12, 2017