(LDA)Example/LDADriver/ Job aborted due to stage failure: java.lang.ArrayIndexOutOfBoundsException: -6 #50

Open
ylqfp opened this issue May 3, 2016 · 13 comments

Comments


ylqfp commented May 3, 2016

Example/LDADriver

Job aborted due to stage failure: Task 9 in stage 28.1 failed 4 times, most recent failure: Lost task 9.3 in stage 28.1 (TID 355, cloud1014121118.wd.nm.ss.nop.ted): java.lang.ArrayIndexOutOfBoundsException: -6
at org.apache.spark.graphx2.impl.EdgePartition.dstIds(EdgePartition.scala:114)
at org.apache.spark.graphx2.impl.EdgePartition$$anon$1.next(EdgePartition.scala:341)
at org.apache.spark.graphx2.impl.EdgePartition$$anon$1.next(EdgePartition.scala:333)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at org.apache.spark.graphx2.impl.EdgePartition$$anon$1.foreach(EdgePartition.scala:333)
at org.apache.spark.graphx2.impl.RoutingTablePartition$.edgePartitionToMsgs(RoutingTablePartition.scala:58)
at org.apache.spark.graphx2.VertexRDD$$anonfun$4$$anonfun$apply$2.apply(VertexRDD.scala:359)
at org.apache.spark.graphx2.VertexRDD$$anonfun$4$$anonfun$apply$2.apply(VertexRDD.scala:359)
at scala.Function$$anonfun$tupled$1.apply(Function.scala:77)
at scala.Function$$anonfun$tupled$1.apply(Function.scala:76)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


ylqfp commented May 3, 2016

My dataset is libsvm style:
0 0:3 1:2 2:10 ...
1 0:1 1:1 2:3 ...
.....

Is it a bug or something wrong with my dataset?

Thanks!
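Since the exception reports a negative array index (-6), one plausible cause worth ruling out is a negative feature index somewhere in the input file. A minimal sketch of such a check, assuming a libsvm-style file (docs.libsvm is a placeholder name; a two-line sample is generated here purely for illustration):

```shell
# Sketch: scan a libsvm-style file for negative feature indices, one
# plausible source of ArrayIndexOutOfBoundsException: -6.
# "docs.libsvm" is a placeholder; a tiny sample is generated below.
cat > docs.libsvm <<'EOF'
0 0:3 1:2 2:10
1 0:1 -6:1 2:3
EOF
# Field 1 is the label; every later field should be index:count with
# index >= 0. "kv[1] + 0" forces a numeric comparison.
awk '{
  for (i = 2; i <= NF; i++) {
    split($i, kv, ":")
    if (kv[1] + 0 < 0) print "line " NR ": negative feature index " kv[1]
  }
}' docs.libsvm
```

A hit from this scan points at the offending input line; a clean scan suggests the problem lies elsewhere.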

@ylqfp ylqfp changed the title Job aborted due to stage failure: java.lang.ArrayIndexOutOfBoundsException: -6 (LDA)Job aborted due to stage failure: java.lang.ArrayIndexOutOfBoundsException: -6 May 3, 2016

ylqfp commented May 3, 2016

Following instructions found via Google, I enlarged the memory, but it still failed. @witgo

@ylqfp ylqfp changed the title (LDA)Job aborted due to stage failure: java.lang.ArrayIndexOutOfBoundsException: -6 (LDA)Example/LDADriver/ Job aborted due to stage failure: java.lang.ArrayIndexOutOfBoundsException: -6 May 3, 2016

witgo commented May 3, 2016

ping @bhoppi


ylqfp commented May 3, 2016

@bhoppi @hucheng Help!


bhoppi commented May 3, 2016

I can't get useful info from your log. Can you dig into your Spark log for more detail? And please share your command-line parameters.


ylqfp commented May 3, 2016

spark-submit --master yarn-client --class com.github.cloudml.zen.examples.ml.LDADriver \
  --jars ml/target/zen-ml_2.10-0.3-SNAPSHOT.jar \
  --executor-memory 6G --driver-memory 6G --num-executors 200 --executor-cores 1 \
  --conf spark.driver.maxResultSize=6G \
  --conf spark.driver.extraJavaOptions="-XX:MaxPermSize=256m -XX:+CMSClassUnloadingEnabled -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Xloggc:gc.log -XX:+HeapDumpOnOutOfMemoryError" \
  --conf spark.yarn.am.memory=2g \
  --conf spark.yarn.am.extraJavaOptions="-XX:MaxPermSize=256m -XX:+CMSClassUnloadingEnabled" \
  --conf spark.storage.memoryFraction=0.1 \
  --conf spark.yarn.executor.memoryOverhead=6666 \
  --conf spark.sql.shuffle.partitions=2000 \
  --conf spark.executor.extraJavaOptions="-XX:MaxPermSize=256m -XX:+CMSClassUnloadingEnabled -XX:MaxDirectMemorySize=2048m -Xmn100m -XX:MaxTenuringThreshold=1 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=10 -XX:+UseCompressedOops -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintHeapAtGC -XX:+PrintGCApplicationConcurrentTime -Xloggc:gc.log" \
  examples/target/zen-examples_2.10-0.3-SNAPSHOT.jar \
  -numTopics=1000 \
  -alpha=0.1 \
  -beta=0.01 \
  -alphaAS=0.01 \
  -totalIter=50 \
  -numPartitions=20 \
  -useKryo=true \
  -ignoredocid=true \
  /user/distml/ldatest/input4 \
  /user/distml/ldatest/output2


ylqfp commented May 3, 2016

Uploading log.txt…


ylqfp commented May 3, 2016

The log.txt is a little big, so I attached it in the previous post. Tell me if you cannot see the file.


bhoppi commented May 3, 2016

Sorry I can't read the log file.


ylqfp commented May 3, 2016

gclog.txt
log.txt


ylqfp commented May 3, 2016

Upload done... @bhoppi


bhoppi commented May 4, 2016

@ylqfp Can you upload the container log? I still can't get anything useful from the master log.


ylqfp commented May 7, 2016

Dear Bhoppi,
Sorry for the late response!
I used yarn logs -applicationID to get the container log, but got nothing.
Could you please tell me where to find the container log?
Thanks!
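For what it's worth, a sketch assuming a standard Hadoop/YARN setup: the flag is spelled -applicationId (lowercase trailing "d"), so -applicationID may be rejected by the CLI. The application ID used below is a placeholder, not one from this job:

```shell
# Sketch, assuming a standard Hadoop/YARN setup. The ID below is a
# placeholder -- substitute the real one shown by `yarn application -list`
# or by the ResourceManager web UI.
APP_ID='application_1462000000000_0001'
# Note the flag spelling: -applicationId, with a lowercase trailing "d".
CMD="yarn logs -applicationId ${APP_ID}"
echo "$CMD"
# Caveat: if log aggregation (yarn.log-aggregation-enable) is disabled,
# this command returns nothing even for a valid ID; the per-container
# logs then sit on each NodeManager host, under the directories
# configured by yarn.nodemanager.log-dirs.
```

If aggregation is off, fetching the logs directly from the NodeManager hosts (or the NodeManager web UI) is the usual fallback.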
