
[SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors. #28986

Conversation

@ueshin ueshin commented Jul 2, 2020

What changes were proposed in this pull request?

This PR proposes to disallow creating SparkContext in executors, e.g., in UDFs.

Why are the changes needed?

Currently, executors can create a SparkContext, but they shouldn't be able to:

sc.range(0, 1).foreach { _ =>
  new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
}

Does this PR introduce any user-facing change?

Yes, users won't be able to create SparkContext in executors.

How was this patch tested?

Adds tests.

@ueshin ueshin changed the title [SPARK-32160][CORE] Disallow to create SparkContext in executors. [SPARK-32160][CORE][PYSPARK] Disallow to create SparkContext in executors. Jul 2, 2020
* Throws an exception if a SparkContext is about to be created in executors.
*/
private[spark] def assertOnDriver(): Unit = {
if (TaskContext.get != null) {
Contributor

Have you tested this under local mode?

Member Author

Under local mode:

scala> sc.range(0, 1).foreach { _ => new SparkContext(new SparkConf().setAppName("test").setMaster("local")) }
java.lang.IllegalStateException: SparkContext should only be created and accessed on the driver.
...

before this patch:

scala> sc.range(0, 1).foreach { _ => new SparkContext(new SparkConf().setAppName("test").setMaster("local")) }
org.apache.spark.SparkException: Only one SparkContext should be running in this JVM (see SPARK-2243). The currently running SparkContext was created at:
org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
...

Although the exception is different, it fails anyway.

I think the new error message is more reasonable.
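For reference, a sketch of the assertion under review, reconstructed from the hunk above and the error output shown in this thread (the exact body merged into SparkContext may differ):

```scala
/**
 * Throws an exception if a SparkContext is about to be created in executors.
 */
private[spark] def assertOnDriver(): Unit = {
  if (TaskContext.get != null) {
    // A non-null TaskContext means this code is running inside a task, i.e. on an executor.
    throw new IllegalStateException(
      "SparkContext should only be created and accessed on the driver.")
  }
}
```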

@jiangxb1987 (Contributor)

I agree we should not create a SparkContext on an executor. We should make sure the approach doesn't break the case where the executor runs in the same JVM as the driver.

@HyukjinKwon (Member) left a comment

LGTM otherwise

@dongjoon-hyun (Member) left a comment

+1, LGTM. Thank you, @ueshin.


SparkQA commented Jul 5, 2020

Test build #124930 has started for PR 28986 at commit aeed617.

@dongjoon-hyun (Member)

The current run seems to hit a flaky test.

[info] - SPARK-30656: minPartitions *** FAILED *** (1 minute)
[info]   The code passed to eventually never returned normally. Attempted 3865 times over 1.0001984272333333 minutes. Last failure message: isPropagated was false Partition [topic-15, 0] metadata not propagated after timeout. (KafkaTestUtils.scala:643)

@dongjoon-hyun (Member)

Retest this please.

@dongjoon-hyun (Member)

The current run fails again, at a different test case.

[info] - read Kafka transactional messages: read_committed *** FAILED *** (1 minute)
[info]   The code passed to eventually never returned normally. Attempted 3685 times over 1.0001156077 minutes. Last failure message: isPropagated was false Partition [topic-39, 0] metadata not propagated after timeout. (KafkaTestUtils.scala:643)
[info]   org.scalatest.exceptions.TestFailedDueToTimeoutException:

@dongjoon-hyun (Member)

Retest this please.


SparkQA commented Jul 6, 2020

Test build #124974 has finished for PR 28986 at commit aeed617.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jul 6, 2020

Test build #125015 has finished for PR 28986 at commit aeed617.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987 (Contributor) left a comment

LGTM too

@HyukjinKwon (Member)

retest this please


SparkQA commented Jul 6, 2020

Test build #125049 has finished for PR 28986 at commit aeed617.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

retest this please


SparkQA commented Jul 6, 2020

Test build #125083 has finished for PR 28986 at commit aeed617.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen (Member) left a comment

Looks reasonable

@HyukjinKwon (Member)

retest this please


SparkQA commented Jul 7, 2020

Test build #125240 has finished for PR 28986 at commit aeed617.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jul 7, 2020

Test build #125243 has finished for PR 28986 at commit 0faac2a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


ueshin commented Jul 7, 2020

Jenkins, retest this please.


SparkQA commented Jul 7, 2020

Test build #125263 has finished for PR 28986 at commit 0faac2a.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.


maropu commented Jul 7, 2020

retest this please


SparkQA commented Jul 8, 2020

Test build #125264 has finished for PR 28986 at commit 0faac2a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

retest this please


SparkQA commented Jul 8, 2020

Test build #125265 has finished for PR 28986 at commit 0faac2a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

retest this please


SparkQA commented Jul 8, 2020

Test build #125285 has finished for PR 28986 at commit 0faac2a.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.


SparkQA commented Jul 8, 2020

Test build #125282 has finished for PR 28986 at commit 0faac2a.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.


maropu commented Jul 8, 2020

retest this please


SparkQA commented Jul 8, 2020

Test build #125336 has finished for PR 28986 at commit 0faac2a.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

retest this please


SparkQA commented Jul 8, 2020

Test build #125342 has finished for PR 28986 at commit 0faac2a.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

retest this please


SparkQA commented Jul 8, 2020

Test build #125357 has finished for PR 28986 at commit 0faac2a.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

dongjoon-hyun added a commit that referenced this pull request Jul 8, 2020
### What changes were proposed in this pull request?

This PR aims to disable SBT `unidoc` generation testing in the Jenkins environment because it's flaky there and not used for the official documentation generation. Also, GitHub Actions has the correct test coverage for the official documentation generation.

- #28848 (comment) (amp-jenkins-worker-06)
- #28926 (comment) (amp-jenkins-worker-06)
- #28969 (comment) (amp-jenkins-worker-06)
- #28975 (comment) (amp-jenkins-worker-05)
- #28986 (comment)  (amp-jenkins-worker-05)
- #28992 (comment) (amp-jenkins-worker-06)
- #28993 (comment) (amp-jenkins-worker-05)
- #28999 (comment) (amp-jenkins-worker-04)
- #29010 (comment) (amp-jenkins-worker-03)
- #29013 (comment) (amp-jenkins-worker-04)
- #29016 (comment) (amp-jenkins-worker-05)
- #29025 (comment) (amp-jenkins-worker-04)
- #29042 (comment) (amp-jenkins-worker-03)

### Why are the changes needed?

Apache Spark `release-build.sh` generates the official document by using the following command.
- https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh#L341

```bash
PRODUCTION=1 RELEASE_VERSION="$SPARK_VERSION" jekyll build
```

And, this is executed by the following `unidoc` command for Scala/Java API doc.
- https://github.com/apache/spark/blob/master/docs/_plugins/copy_api_dirs.rb#L30

```ruby
system("build/sbt -Pkinesis-asl clean compile unidoc") || raise("Unidoc generation failed")
```

However, the PR builder disabled `Jekyll build` and instead has a different test coverage.
```python
# determine if docs were changed and if we're inside the amplab environment
# note - the below commented out until *all* Jenkins workers can get `jekyll` installed
# if "DOCS" in changed_modules and test_env == "amplab_jenkins":
#    build_spark_documentation()
```

```
Building Unidoc API Documentation
========================================================================
[info] Building Spark unidoc using SBT with these arguments:
-Phadoop-3.2 -Phive-2.3 -Pspark-ganglia-lgpl -Pkubernetes -Pmesos
-Phadoop-cloud -Phive -Phive-thriftserver -Pkinesis-asl -Pyarn unidoc
```

### Does this PR introduce _any_ user-facing change?

No. (This is used only for testing and not used in the official doc generation.)

### How was this patch tested?

Pass the Jenkins without doc generation invocation.

Closes #29017 from dongjoon-hyun/SPARK-DOC-GEN.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

ueshin commented Jul 8, 2020

Jenkins, retest this please.


SparkQA commented Jul 9, 2020

Test build #125415 has finished for PR 28986 at commit 0faac2a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member)

Merged to master, and branch-3.0.

HyukjinKwon pushed a commit that referenced this pull request Jul 9, 2020
…tors

### What changes were proposed in this pull request?

This PR proposes to disallow to create `SparkContext` in executors, e.g., in UDFs.

### Why are the changes needed?

Currently executors can create SparkContext, but shouldn't be able to create it.

```scala
sc.range(0, 1).foreach { _ =>
  new SparkContext(new SparkConf().setAppName("test").setMaster("local"))
}
```

### Does this PR introduce _any_ user-facing change?

Yes, users won't be able to create `SparkContext` in executors.

### How was this patch tested?

Adds tests.

Closes #28986 from ueshin/issues/SPARK-32160/disallow_spark_context_in_executors.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit cfecc20)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

ueshin commented Jul 9, 2020

Finally tests passed. Thanks all for triggering tests again and again!

* Throws an exception if a SparkContext is about to be created in executors.
*/
private def assertOnDriver(): Unit = {
if (TaskContext.get != null) {
Member

@ueshin could you submit a follow-up PR to add a conf? In Spark 3.0, turn it off by default; in Spark 3.1 turn it on by default? Also add it to the migration guide?

Member

Does this actually affect any legitimate use case that would otherwise work? This should be more of a fail-fast for things that will already fail.

Member

This is a user error. However, users may use a Spark library in executors where the library calls SparkContext.getOrCreate. We should still let their workloads work if they worked in previous releases.

Member

I see, but would that ever succeed, using the SparkContext that was created? I'm trying to work out what library would do this and not fail. Is it, perhaps, some boilerplate initialization code that gets executed on driver and executor, and it makes a SparkContext that is never actually used on the executor, but now it fails fast?

I get it, if that's the use case. A release note and/or hidden config to disable it might be an OK workaround.

Alternatively, if the argument is that this is not uncommon, maybe we just don't do this at all and revert it.

Contributor

BTW, had been discussing this offline with @ueshin and @HyukjinKwon - one use case that previously (and admittedly unfortunately) relied on the ability to create SparkContexts in the executors is MLflow's mlflow.pyfunc.spark_udf API (see docs & usage example), which provides a unified interface for scoring models trained in arbitrary ML frameworks (tensorflow, scikit-learn, pyspark.ml) on Spark DataFrames as a pandas UDF. Most ML frameworks (e.g. sklearn) are single-node frameworks that can operate on pandas dataframes, so applying them via pandas UDF works well. For a pyspark.ml model to be applied via pandas UDF, we need to convert the input pandas series -> spark dataframe on the executors, which requires a SparkContext on the executors.

I'll dig to see if there's a way to keep the MLflow use case working with this change (e.g. pyspark.ml models may ultimately perform inference via spark UDF, so maybe we can somehow extract the underlying UDF from the model and return that from mlflow.pyfunc.spark_udf?), but otherwise agree that we may want to revert this change.
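To make the pattern under discussion concrete, here is a minimal, illustrative Scala sketch of a library-style call to SparkContext.getOrCreate from inside a task. The object name, app names, and the local master are assumptions for the example, not code from MLflow or this PR:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GetOrCreateInTaskExample {
  def main(args: Array[String]): Unit = {
    // Driver-side context, as a normal application would create it.
    val sc = SparkContext.getOrCreate(
      new SparkConf().setAppName("driver-side").setMaster("local[2]"))

    sc.range(0, 1).foreach { _ =>
      // A library helper running inside the task lazily grabs "the" SparkContext.
      // In local mode this used to return the driver's context (same JVM);
      // with this patch it fails fast with IllegalStateException instead.
      val inner = SparkContext.getOrCreate(
        new SparkConf().setAppName("library-side").setMaster("local[2]"))
      println(inner.appName)
    }

    sc.stop()
  }
}
```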

Member Author

I submitted a PR to add configs #29278.
Could you take a look at it when you are available?

@HyukjinKwon (Member) Jul 30, 2020

I think it's a bit odd to have two separate configurations for Scala and Python. I don't think we will ever allow this in PySpark specifically in the future, even if it worked. I think we will discourage this behaviour anyway, whether it is in Scala or Python, and deprecate/remove both configurations together eventually. In which case would we disable it on the Scala side but enable it on the Python side?

Member

In which case would we disable it on the Scala side but enable it on the Python side?

Like how MLflow uses it today. It's pretty hard to make it work correctly on the Scala side, but it's working in Python today.

Member

I see, but I guess we should discourage/deprecate/remove the behaviour in the end - am I correct? It's a bit weird (to end users) to have two configurations controlling the same behaviour. And I think it's unlikely that it should be disabled in Scala but enabled in Python specifically. We could just enable it on both sides.

Member Author

I pushed a commit to use a single config.

HyukjinKwon pushed a commit that referenced this pull request Jul 31, 2020
… create SparkContext in executors

### What changes were proposed in this pull request?

This is a follow-up of #28986.
This PR adds a config to switch allow/disallow to create `SparkContext` in executors.

- `spark.driver.allowSparkContextInExecutors`

### Why are the changes needed?

Some users or libraries actually create `SparkContext` in executors.
We shouldn't break their workloads.

### Does this PR introduce _any_ user-facing change?

Yes, users will be able to create `SparkContext` in executors with the config enabled.

### How was this patch tested?

More tests are added.

Closes #29278 from ueshin/issues/SPARK-32160/add_configs.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
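A usage note on the config named in the commit message above: a hedged sketch of how a workload that relied on the old behaviour could opt back in. Setting it via SparkConf is just one option (it could equally be passed with spark-submit --conf), and the object name, app name, and master here are illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object AllowContextInExecutorsExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("legacy-executor-context") // illustrative name
      .setMaster("local[2]")                 // illustrative master
      // Config from #29278: when enabled, creating SparkContext in executors is allowed again.
      .set("spark.driver.allowSparkContextInExecutors", "true")

    val sc = new SparkContext(conf)
    // ... workload that (directly or via a library) touches SparkContext inside tasks ...
    sc.stop()
  }
}
```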