This repository was archived by the owner on May 12, 2021. It is now read-only.

Conversation

@561152 commented Oct 10, 2017

pio batchpredict --input /tmp/pio/batchpredict-input.json --output /tmp/pio/batchpredict-output.json

[WARN] [ALSModel] Product factor is not cached. Prediction could be slow.
Exception in thread "main" org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:76)
org.apache.predictionio.workflow.WorkflowContext$.apply(WorkflowContext.scala:45)
org.apache.predictionio.workflow.BatchPredict$.run(BatchPredict.scala:160)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:121)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:117)
scala.Option.map(Option.scala:146)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:117)
org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:115)
scala.Option.map(Option.scala:146)
org.apache.predictionio.workflow.BatchPredict$.main(BatchPredict.scala:115)
org.apache.predictionio.workflow.BatchPredict.main(BatchPredict.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$2.apply(SparkContext.scala:2278)
at org.apache.spark.SparkContext$$anonfun$assertNoOtherContextIsRunning$2.apply(SparkContext.scala:2274)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.SparkContext$.assertNoOtherContextIsRunning(SparkContext.scala:2274)
at org.apache.spark.SparkContext$.markPartiallyConstructed(SparkContext.scala:2353)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:85)
at org.apache.predictionio.workflow.WorkflowContext$.apply(WorkflowContext.scala:45)
at org.apache.predictionio.workflow.BatchPredict$.run(BatchPredict.scala:183)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:121)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1$$anonfun$apply$2.apply(BatchPredict.scala:117)
at scala.Option.map(Option.scala:146)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:117)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$main$1.apply(BatchPredict.scala:115)
at scala.Option.map(Option.scala:146)
at org.apache.predictionio.workflow.BatchPredict$.main(BatchPredict.scala:115)
at org.apache.predictionio.workflow.BatchPredict.main(BatchPredict.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

@mars (Member) commented Oct 11, 2017

I do not understand what this PR does.

Does this pull request fix that SparkException or cause it?

Is this a problem using Spark 2.2 and pio batchpredict?

What engine template does this occur for?

@mars (Member) commented Oct 11, 2017

Also, this PR is from develop branch to master, but AFAIK this project does not use master. So, it seems to be committed directly to the mainline already.

@dszeto (Contributor) commented Oct 11, 2017

This doesn’t look like a valid PR. @561152, are you looking to report an issue? Please do so by filing a JIRA, following the instructions at http://predictionio.incubator.apache.org/community/contribute-code/#how-to-report-an-issue

@561152 (Author) commented Oct 12, 2017

@mars @dszeto Thanks. I do not know much about GitHub, so please bear with me.
I think this is a bug. I did the following:

Test one:
Versions tested: 0.12 and master
Template used: Recommendation
Local environment: HDP, Spark 2.1, Hadoop 2.7.3
Running pio batchpredict fails with: Exception in thread "main" org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:

Test two:
Modified core/src/main/scala/org/apache/predictionio/workflow/WorkflowContext.scala, changing:
val conf = new SparkConf()
to:
val conf = new SparkConf().set("spark.driver.allowMultipleContexts", "true")

Running pio batchpredict then reports the following error:
[INFO] [ContextHandler] Started o.s.j.s.ServletContextHandler@2474df51{/metrics/json,null,AVAILABLE}
[WARN] [SparkContext] Multiple running SparkContexts detected in the same JVM!
[ERROR] [Utils] Aborting task
[ERROR] [Executor] Exception in task 0.0 in stage 0.0 (TID 0)
[WARN] [TaskSetManager] Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: This RDD lacks a SparkContext. It could happen in the following cases:
(1) RDD transformations and actions are NOT invoked by the driver, but inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
(2) When a Spark Streaming job recovers from checkpoint, this exception will be hit if a reference to an RDD not defined by the streaming job is used in DStream operations. For more information, See SPARK-13758.
at org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$sc(RDD.scala:89)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
at org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:939)
at org.apache.spark.mllib.recommendation.MatrixFactorizationModel.recommendProducts(MatrixFactorizationModel.scala:169)
at org.example.recommendation.ALSAlgorithm$$anonfun$predict$1.apply(ALSAlgorithm.scala:85)
at org.example.recommendation.ALSAlgorithm$$anonfun$predict$1.apply(ALSAlgorithm.scala:80)
at scala.Option.map(Option.scala:146)
at org.example.recommendation.ALSAlgorithm.predict(ALSAlgorithm.scala:80)
at org.example.recommendation.ALSAlgorithm.predict(ALSAlgorithm.scala:22)
at org.apache.predictionio.controller.PAlgorithm.predictBase(PAlgorithm.scala:76)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15$$anonfun$16.apply(BatchPredict.scala:212)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15$$anonfun$16.apply(BatchPredict.scala:211)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15.apply(BatchPredict.scala:211)
at org.apache.predictionio.workflow.BatchPredict$$anonfun$15.apply(BatchPredict.scala:197)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply$mcV$sp(PairRDDFunctions.scala:1211)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13$$anonfun$apply$7.apply(PairRDDFunctions.scala:1210)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1341)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1218)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1$$anonfun$13.apply(PairRDDFunctions.scala:1197)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

[ERROR] [TaskSetManager] Task 0 in stage 0.0 failed 1 times; aborting job

Test three:
pio deploy works, and queries via the REST API succeed.
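The "This RDD lacks a SparkContext" failure in test two can be sketched in miniature. This is a hypothetical model, not Spark's real API: like Spark's RDD, the class below keeps only a @transient reference to the context that created it, so once the object crosses a serialization boundary (as it does when shipped into a task) the reference is gone and any operation that needs it fails with the same message.

```scala
import java.io._

// Hypothetical stand-in for an RDD that belongs to the first SparkContext.
// The @transient field is dropped during serialization, mirroring how Spark
// RDDs lose their context reference inside executor tasks (SPARK-5063).
class ModelRdd(@transient private val owner: AnyRef) extends Serializable {
  def lookup(): String =
    if (owner == null)
      throw new IllegalStateException("This RDD lacks a SparkContext")
    else "ok"
}

object TransientContextDemo {
  // Serialize and deserialize, as Spark does when shipping a closure to a task.
  private def roundTrip[T <: Serializable](t: T): T = {
    val out = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(out)
    oos.writeObject(t)
    oos.close()
    new ObjectInputStream(new ByteArrayInputStream(out.toByteArray))
      .readObject().asInstanceOf[T]
  }

  def demo(): String = {
    val rdd = new ModelRdd(new Object)  // works fine on the driver side
    val onExecutor = roundTrip(rdd)     // @transient field is now null
    try onExecutor.lookup()
    catch { case e: IllegalStateException => e.getMessage }
  }
}
```

This is why setting spark.driver.allowMultipleContexts only moves the failure: the persisted model's RDDs remain bound to the first context and cannot be used from the second.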

@takezoe (Member) commented Nov 2, 2017

Is it possible to move JIRA and close this pull request?

@mars (Member) commented Nov 2, 2017

Sounds good, @takezoe, though I am still unclear about what this issue really is.

@shimamoto (Member) commented

I also agree with takezoe!

@mars BatchPredict seems to make multiple SparkContexts. The WorkflowContext.apply method creates a new instance of the SparkContext class.
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/workflow/BatchPredict.scala#L160
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/workflow/BatchPredict.scala#L183
Is it supposed to work this way?

@mars (Member) commented Nov 8, 2017

@shimamoto yes, it requires two SparkContexts because PredictionIO implicitly stops the first one after preparing deploy, so it cannot be reused for the second instance to run queries.

I don’t understand what causes this error here, as the two SparkContexts normally work correctly.
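The normal two-context flow can be sketched as follows. All names here are hypothetical; this models Spark's one-active-context-per-JVM check (assertNoOtherContextIsRunning) rather than calling the real SparkContext API. In the working case, the first context is stopped before the second is created, so the check passes.

```scala
// Minimal model of Spark's single-active-context rule (hypothetical names,
// not the real SparkContext API).
object ContextGuard {
  private var active: Option[String] = None

  // Mirrors SparkContext's assertNoOtherContextIsRunning check.
  def create(name: String): String = synchronized {
    active.foreach { existing =>
      throw new IllegalStateException(
        s"Only one SparkContext may be running in this JVM (active: $existing)")
    }
    active = Some(name)
    name
  }

  def stop(name: String): Unit = synchronized {
    if (active.contains(name)) active = None
  }
}

// The normal batch-predict flow described above: the first (training)
// context is implicitly stopped, so creating the query context succeeds.
object TwoContextFlow {
  def demo(): String = {
    val first = ContextGuard.create("train-context")
    ContextGuard.stop(first)             // PredictionIO implicitly stops it
    ContextGuard.create("query-context") // succeeds: no context is active
  }
}
```

If the stop call were skipped, the second create would throw the same "Only one SparkContext may be running in this JVM" error reported in this PR.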

@shimamoto (Member) commented

> yes, it requires two SparkContexts because PredictionIO implicitly stops the first one after preparing deploy, so it cannot be reused for the second instance to run queries.

@mars Oh, I see. But when using a persisted model, PredictionIO doesn't stop the first SparkContext.
https://github.com/apache/incubator-predictionio/blob/v0.12.0-incubating/core/src/main/scala/org/apache/predictionio/controller/Engine.scala#L241-L250

I guess this is what causes the error when, for example, we use the Recommendation Engine Template, which supports loading a persisted model.
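Under that reading, the persisted-model path can be sketched with the same hypothetical guard model as above (illustrative names only, not the real API): the first context is never stopped, so constructing the second one trips the single-context check.

```scala
// Hypothetical model of the persisted-model path: the first SparkContext
// stays alive, so creating the second throws the error seen in this PR.
object PersistedModelFlow {
  private var active: Option[String] = None

  private def create(name: String): String = synchronized {
    if (active.isDefined)
      throw new IllegalStateException(
        "Only one SparkContext may be running in this JVM")
    active = Some(name)
    name
  }

  def demo(): String = {
    create("first-context")    // never stopped when the model is persisted
    try {
      create("second-context") // what BatchPredict attempts next
      "no error"
    } catch {
      case e: IllegalStateException => e.getMessage
    }
  }
}
```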

@mars (Member) commented Nov 9, 2017

@shimamoto Ah, this makes so much sense now 😲 Not sure of a solution, yet. I created a JIRA for this issue: PIO-138 Batch predict fails when using a PersistentModel. Let's continue discussion for a fix there.

@takezoe (Member) commented Nov 9, 2017

Is there a way to close this pull request other than via a commit message?

@dszeto (Contributor) commented Mar 1, 2018

@takezoe Unfortunately we can only close PRs with a commit message at this point. Gitbox may help but it has not been working for me.

@takezoe (Member) commented Mar 3, 2018

@dszeto I see. I found an empty commit used to close PRs: apache/spark@f217d7d
They make such commits periodically, so I'll close this PR the same way later.

@asfgit closed this in 5469ae4 on Mar 3, 2018