Skip to content

[SPARK-33191][YARN][TESTS] Fix PySpark test cases in YarnClusterSuite#30099

Closed
HyukjinKwon wants to merge 1 commit intoapache:masterfrom
HyukjinKwon:SPARK-33191
Closed

[SPARK-33191][YARN][TESTS] Fix PySpark test cases in YarnClusterSuite#30099
HyukjinKwon wants to merge 1 commit intoapache:masterfrom
HyukjinKwon:SPARK-33191

Conversation

@HyukjinKwon
Copy link
Member

@HyukjinKwon HyukjinKwon commented Oct 20, 2020

What changes were proposed in this pull request?

This PR proposes to fix:

org.apache.spark.deploy.yarn.YarnClusterSuite.run Python application in yarn-client mode
org.apache.spark.deploy.yarn.YarnClusterSuite.run Python application in yarn-cluster mode 
org.apache.spark.deploy.yarn.YarnClusterSuite.run Python application in yarn-cluster mode using spark.yarn.appMasterEnv to override local envvar

it currently fails as below:

20/10/16 19:20:36 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (amp-jenkins-worker-03.amp executor 1): org.apache.spark.SparkException: 
Error from python worker:
  Traceback (most recent call last):
    File "/usr/lib64/python2.6/runpy.py", line 104, in _run_module_as_main
      loader, code, fname = _get_module_details(mod_name)
    File "/usr/lib64/python2.6/runpy.py", line 79, in _get_module_details
      loader = get_loader(mod_name)
    File "/usr/lib64/python2.6/pkgutil.py", line 456, in get_loader
      return find_loader(fullname)
    File "/usr/lib64/python2.6/pkgutil.py", line 466, in find_loader
      for importer in iter_importers(fullname):
    File "/usr/lib64/python2.6/pkgutil.py", line 422, in iter_importers
      __import__(pkg)
    File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/__init__.py", line 53, in <module>
      from pyspark.rdd import RDD, RDDBarrier
    File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/rdd.py", line 34, in <module>
      from pyspark.java_gateway import local_connect_and_auth
    File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/pyspark/java_gateway.py", line 29, in <module>
      from py4j.java_gateway import java_import, JavaGateway, JavaObject, GatewayParameters
    File "/home/jenkins/workspace/SparkPullRequestBuilder@2/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 60
      PY4J_TRUE = {"yes", "y", "t", "true"}
                        ^
  SyntaxError: invalid syntax

I think this was broken when Python 2 was dropped but was not caught because this specific test does not run when there's no change in YARN codes. See also #29843 (comment)

The root cause seems like the paths are different, see #29843 (review). I think Jenkins uses a different Python executable via Anaconda and the executor side does not know where it is for some reasons.

This PR proposes to fix it just by explicitly specifying the absolute path for Python executable so the tests should pass in any environment.

Why are the changes needed?

To make tests pass.

Does this PR introduce any user-facing change?

No, dev-only.

How was this patch tested?

This issue looks specific to Jenkins. It should run the tests on Jenkins.

@HyukjinKwon HyukjinKwon marked this pull request as draft October 20, 2020 07:53
@HyukjinKwon HyukjinKwon changed the title [SPARK-33191][PYTHON][YARN][TESTS]Fix PySpark test cases in YarnClusterSuite [SPARK-33191][YARN][TESTS] Fix PySpark test cases in YarnClusterSuite Oct 20, 2020
@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@HyukjinKwon
Copy link
Member Author

Okay, it passes in Jenkins:

[info] - run Python application in yarn-client mode (18 seconds, 41 milliseconds)
[info] - run Python application in yarn-cluster mode (21 seconds, 30 milliseconds)
[info] - run Python application in yarn-cluster mode using spark.yarn.appMasterEnv to override local envvar (22 seconds, 31 milliseconds)

@HyukjinKwon HyukjinKwon marked this pull request as ready for review October 20, 2020 09:35
@HyukjinKwon
Copy link
Member Author

cc @sunchao @tgravescs @dongjoon-hyun.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA

This comment has been minimized.

@SparkQA
Copy link

SparkQA commented Oct 20, 2020

Test build #130040 has finished for PR 30099 at commit ccabb04.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

private val pythonExecutablePath = {
// To make sure to use the same Python executable.
val maybePath = TestUtils.getAbsolutePathFromExecutable("python3")
assert(maybePath.isDefined)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should have a more descriptive error message if this fails so users can easily figure out what's wrong.
It also seems like we should have a better way to specify the python executable then hardcoding to python3 in this test, but that isn't something you changed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't have python3, it will not even start to run the tests ... e.g.) see https://github.com/apache/spark/blob/master/dev/run-tests.py#L1 This is rather just a sanity check. But sure, I can add some more messages, though.

@tgravescs
Copy link
Contributor

this doesn't guarantee they use the same python version still - at least not minor version (3.5 vs 3.4) if executor node has different version then the driver node, but you think its a python2 vs python3, correct?
Sounds like you were going to investigate the build side as well, correct?

@HyukjinKwon
Copy link
Member Author

@tgravescs, but the tests run in the same node. If we specify the absolute path, it will point out the same Python version. This is a test only fix BTW.

@HyukjinKwon
Copy link
Member Author

Thanks, Tom. Let me merge this into master :-).

@HyukjinKwon
Copy link
Member Author

Merged to master.

@dongjoon-hyun
Copy link
Member

Thank you for the fix, @HyukjinKwon !
cc @sunchao

@sunchao
Copy link
Member

sunchao commented Oct 20, 2020

nice! thanks for the fix @HyukjinKwon !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants