This repository was archived by the owner on May 12, 2021. It is now read-only.

Conversation

@561152 commented Oct 10, 2017

pio batchpredict --input /tmp/pio/batchpredict-input.json --output /tmp/pio/batchpredict-output.json

[WARN] [ALSModel] Product factor is not cached. Prediction could be slow.
Exception in thread "main" org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:76)
org.apache.predictionio.workflow.WorkflowContext$.apply(WorkflowContext.scala:45)
org.apache.predictionio.workflow.BatchPredict$.run(BatchPredict.scala:160)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:121)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:117)
scala.Option.map(Option.scala:146)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:117)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:115)
scala.Option.map(Option.scala:146)
org.apache.predictionio.workflow.BatchPredict$.main(BatchPredict.scala:115)
org.apache.predictionio.workflow.BatchPredict.main(BatchPredict.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$2.apply(SparkContext.scala:2278)
at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$2.apply(SparkContext.scala:2274)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.SparkContext$.assertNoOtherContextIsRunning(SparkContext.scala:2274)
at org.apache.spark.SparkContext$.markPartiallyConstructed(SparkContext.scala:2353)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:85)
at org.apache.predictionio.workflow.WorkflowContext$.apply(WorkflowContext.scala:45)
at org.apache.predictionio.workflow.BatchPredict$.run(BatchPredict.scala:183)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:121)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:117)
at scala.Option.map(Option.scala:146)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:117)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:115)
at scala.Option.map(Option.scala:146)
at org.apache.predictionio.workflow.BatchPredict$.main(BatchPredict.scala:115)
at org.apache.predictionio.workflow.BatchPredict.main(BatchPredict.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

@mars (Member) commented Oct 11, 2017

I do not understand what this PR does.

Does this pull request fix that SparkException or cause it?

Is this a problem using Spark 2.2 and pio batchpredict?

What engine template does this occur for?

@mars (Member) commented Oct 11, 2017

Also, this PR is from develop branch to master, but AFAIK this project does not use master. So, it seems to be committed directly to the mainline already.

@dszeto (Contributor) commented Oct 11, 2017

This doesn’t look like a valid PR. @561152, are you looking to report an issue? Please do so by filing a JIRA, following the instructions at http://predictionio.incubator.apache.org/community/contribute-code/#how-to-report-an-issue

@561152 (Author) commented Oct 12, 2017

@mars @dszeto Thanks. I do not know much about GitHub, so please bear with me.
I think this is a bug. I did the following:

Test one:
Versions tested: 0.12 and master
Template used: Recommendation
Local environment: HDP, Spark 2.1, Hadoop 2.7.3
Running pio batchpredict fails with: Exception in thread "main" org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:

Test two:
Modified core/src/main/scala/org/apache/predictionio/workflow/WorkflowContext.scala, changing:
val conf = new SparkConf()
to:
val conf = new SparkConf().set("spark.driver.allowMultipleContexts", "true")

Running pio batchpredict then reports the following error:
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2474df51{/metrics/json,null,AVAILABLE}
[WARN] [SparkContext] Multiple running SparkContexts detected in the same JVM!
[ERROR] [Utils] Aborting task
[ERROR] [Executor] Exception in task 0.0 in stage 0.0 (TID 0)
[WARN] [TaskSetManager] Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: This RDD lacks a SparkContext. It could happen in the following cases:
(1) RDD transformations and actions are NOT invoked by the driver, but inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
(2) When a Spark Streaming job recovers from checkpoint, this exception will be hit if a reference to an RDD not defined by the streaming job is used in DStream operations. For more information, See SPARK-13758.
at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:89)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:939)
at org.apache.spark.mllib.recommendation.MatrixFactorizationModel.recommendProducts(MatrixFactorizationModel.scala:169)
at org.example.recommendation.ALSAlgorithm$$anonfun$predict$1.apply(ALSAlgorithm.scala:85)
at org.example.recommendation.ALSAlgorithm$$anonfun$predict$1.apply(ALSAlgorithm.scala:80)
at scala.Option.map(Option.scala:146)
at org.example.recommendation.ALSAlgorithm.predict(ALSAlgorithm.scala:80)
at org.example.recommendation.ALSAlgorithm.predict(ALSAlgorithm.scala:22)
at org.apache.predictionio.controller.PAlgorithm.predictBase(PAlgorithm.scala:76)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15$$anonfun$16.apply(BatchPredict.scala:212)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15$$anonfun$16.apply(BatchPredict.scala:211)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15.apply(BatchPredict.scala:211)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15.apply(BatchPredict.scala:197)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply$mcV$sp(PairRDDFunctions.scala:1211)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1341)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1218)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

[ERROR] [TaskSetManager] Task 0 in stage 0.0 failed 1 times; aborting job

Test three:
pio deploy works, and queries via the REST API succeed.
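The "This RDD lacks a SparkContext" failure in test two can be sketched in miniature. This is a hypothetical model, not Spark's real API: like Spark's RDD, the class below keeps only a @transient reference to the context that created it, so once the object crosses a serialization boundary (as it does when shipped into a task) the reference is gone and any operation that needs it fails with the same message.

```scala
import java.io._

// Hypothetical stand-in for an RDD that belongs to the first SparkContext.
// The @transient field is dropped during serialization, mirroring how Spark
// RDDs lose their context reference inside executor tasks (SPARK-5063).
class ModelRdd(@transient private val owner: AnyRef) extends Serializable {
  def lookup(): String =
    if (owner == null)
      throw new IllegalStateException("This RDD lacks a SparkContext")
    else "ok"
}

object TransientContextDemo {
  // Serialize and deserialize, as Spark does when shipping a closure to a task.
  private def roundTrip[T <: Serializable](t: T): T = {
    val out = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(out)
    oos.writeObject(t)
    oos.close()
    new ObjectInputStream(new ByteArrayInputStream(out.toByteArray))
      .readObject().asInstanceOf[T]
  }

  def demo(): String = {
    val rdd = new ModelRdd(new Object)  // works fine on the driver side
    val onExecutor = roundTrip(rdd)     // @transient field is now null
    try onExecutor.lookup()
    catch { case e: IllegalStateException => e.getMessage }
  }
}
```

This is why setting spark.driver.allowMultipleContexts only moves the failure: the persisted model's RDDs remain bound to the first context and cannot be used from the second.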

@takezoe (Member) commented Nov 2, 2017

Is it possible to move JIRA and close this pull request?

@mars (Member) commented Nov 2, 2017

Sounds good, @takezoe, though I am still unclear about what this issue really is.

@shimamoto (Member) commented

I also agree with takezoe!

@mars BatchPredict seems to make multiple SparkContexts. The WorkflowContext.apply method creates a new instance of the SparkContext class.
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/workflow/BatchPredict.scala#L160
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/workflow/BatchPredict.scala#L183
Is it supposed to work this way?

@mars (Member) commented Nov 8, 2017

@shimamoto yes, it requires two SparkContexts because PredictionIO implicitly stops the first one after preparing deploy, so it cannot be reused for the second instance to run queries.

I don’t understand what causes this error here, as the two SparkContexts normally work correctly.
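The normal two-context flow can be sketched as follows. All names here are hypothetical; this models Spark's one-active-context-per-JVM check (assertNoOtherContextIsRunning) rather than calling the real SparkContext API. In the working case, the first context is stopped before the second is created, so the check passes.

```scala
// Minimal model of Spark's single-active-context rule (hypothetical names,
// not the real SparkContext API).
object ContextGuard {
  private var active: Option[String] = None

  // Mirrors SparkContext's assertNoOtherContextIsRunning check.
  def create(name: String): String = synchronized {
    active.foreach { existing =>
      throw new IllegalStateException(
        s"Only one SparkContext may be running in this JVM (active: $existing)")
    }
    active = Some(name)
    name
  }

  def stop(name: String): Unit = synchronized {
    if (active.contains(name)) active = None
  }
}

// The normal batch-predict flow described above: the first (training)
// context is implicitly stopped, so creating the query context succeeds.
object TwoContextFlow {
  def demo(): String = {
    val first = ContextGuard.create("train-context")
    ContextGuard.stop(first)             // PredictionIO implicitly stops it
    ContextGuard.create("query-context") // succeeds: no context is active
  }
}
```

If the stop call were skipped, the second create would throw the same "Only one SparkContext may be running in this JVM" error reported in this PR.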

@shimamoto (Member) commented

> yes, it requires two SparkContexts because PredictionIO implicitly stops the first one after preparing deploy, so it cannot be reused for the second instance to run queries.

@mars Oh, I see. But when using a persisted model, PredictionIO doesn't stop the first SparkContext.
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/controller/Engine.scala#L241-L250

I guess this is what causes the error when, for example, we use the Recommendation Engine Template, which supports loading a persisted model.
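Under that reading, the persisted-model path can be sketched with the same hypothetical guard model as above (illustrative names only, not the real API): the first context is never stopped, so constructing the second one trips the single-context check.

```scala
// Hypothetical model of the persisted-model path: the first SparkContext
// stays alive, so creating the second throws the error seen in this PR.
object PersistedModelFlow {
  private var active: Option[String] = None

  private def create(name: String): String = synchronized {
    if (active.isDefined)
      throw new IllegalStateException(
        "Only one SparkContext may be running in this JVM")
    active = Some(name)
    name
  }

  def demo(): String = {
    create("first-context")    // never stopped when the model is persisted
    try {
      create("second-context") // what BatchPredict attempts next
      "no error"
    } catch {
      case e: IllegalStateException => e.getMessage
    }
  }
}
```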

@mars (Member) commented Nov 9, 2017

@shimamoto Ah, this makes so much sense now 😲 Not sure of a solution, yet. I created a JIRA for this issue: PIO-138 Batch predict fails when using a PersistentModel. Let's continue discussion for a fix there.

@takezoe (Member) commented Nov 9, 2017

Is there a way to close this pull request other than via a commit message?

@dszeto (Contributor) commented Mar 1, 2018

@takezoe Unfortunately we can only close PRs with a commit message at this point. Gitbox may help but it has not been working for me.

@takezoe (Member) commented Mar 3, 2018

@dszeto I see. I found an empty commit used to close PRs: apache/spark@f217d7d
They make such commits periodically, so I'll close this PR the same way later.

@asfgit closed this in 5469ae4 on Mar 3, 2018