Using NetcdfFile to create SciSparkContext encounters ArrayIndexOutOfBoundsException #93

Closed
chrimiway opened this Issue Aug 2, 2016 · 4 comments


chrimiway commented Aug 2, 2016

[screenshot attached, taken 2016-08-02 at 8:37 PM]

I am pretty sure the two variables are in that .nc file.

BTW, NetcdfDFSFile works fine, except that it didn't split the input file based on minPartitions.

rahulpalamuttam commented Aug 2, 2016

Hi @chrimiway,
Can you post a gist of your stack trace (your program output and where it errors) as well as your code? You may need to specify exactly which variables you want to extract (there may be an issue with the default behavior of loading everything when no variables are given).

NetcdfDFSFile isn't intended to split a single file, but rather a collection of files under a directory.
What you pass to NetcdfDFSFile is a directory path rather than a file path, so:
hdfs://.../data/directory
instead of
hdfs://.../data/directory/some.nc
It partitions the collection of files underneath that directory.
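Roughly, the intended usage looks like the sketch below (not tested against your setup; check the SciSparkContext API for the exact method name and signature, and substitute your own directory path and variable names):

```scala
import org.dia.core.SciSparkContext

val sc = new SciSparkContext("local", "SciSpark Program")

// Point NetcdfDFSFile at the directory (placeholder path), not at a single .nc file.
// Every NetCDF file found under that directory becomes one record in the resulting RDD.
val scientificRDD = sc.NetcdfDFSFile("hdfs://.../data/directory", List("hmax"))
```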

It would help greatly if you could provide the information in a gist.
You can create one at https://gist.github.com/, post it, and link it in your response.

chrimiway commented Aug 3, 2016

Thanks for the reply. My code and stack trace are listed below; gist.github.com is somehow not available for me.

My code is simply the getting-started code:

```scala
def main(args: Array[String]) {
  val sc = new SciSparkContext("local", "SciSpark Program")

  val scientificRDD = sc.NetcdfFile("file:///media/hadoop/data/nc/1971.nc", List("hmax"), 5)
  val filteredRDD = scientificRDD.map(p => p("hmax") <= 241.0)
  val reshapedRDD = filteredRDD.map(p => p.reduceResolution(20))
  val sumAllRDD = reshapedRDD.reduce(_ + _)

  println(sumAllRDD)
}
```

16/08/02 23:37:41 INFO HadoopRDD: Input split: file:/media/hadoop/data/nc/1971.nc:0+33554432
16/08/02 23:37:41 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
16/08/02 23:37:41 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
16/08/02 23:37:41 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
16/08/02 23:37:41 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
16/08/02 23:37:41 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
hmax is not available from the URL
16/08/02 23:30:18 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.ArrayIndexOutOfBoundsException: 0
at org.dia.core.SciTensor.&lt;init&gt;(SciTensor.scala:39)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:123)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:93)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:172)
at scala.collection.AbstractIterator.reduceLeft(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1011)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1009)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/08/02 23:30:18 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, partition 1,PROCESS_LOCAL, 2136 bytes)
16/08/02 23:30:18 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
16/08/02 23:30:18 INFO HadoopRDD: Input split: file:/media/hadoop/data/nc/1971.nc:33554432+33554432
16/08/02 23:30:18 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ArrayIndexOutOfBoundsException: 0
at org.dia.core.SciTensor.&lt;init&gt;(SciTensor.scala:39)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:123)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:93)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:172)
at scala.collection.AbstractIterator.reduceLeft(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1011)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1009)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

16/08/02 23:30:18 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
16/08/02 23:30:18 INFO TaskSchedulerImpl: Cancelling stage 0
16/08/02 23:30:18 INFO Executor: Executor is trying to kill task 1.0 in stage 0.0 (TID 1)
16/08/02 23:30:18 INFO TaskSchedulerImpl: Stage 0 was cancelled
16/08/02 23:30:18 INFO DAGScheduler: ResultStage 0 (reduce at NcLocal.scala:14) failed in 0.619 s
16/08/02 23:30:18 INFO DAGScheduler: Job 0 failed: reduce at NcLocal.scala:14, took 1.060782 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ArrayIndexOutOfBoundsException: 0
at org.dia.core.SciTensor.&lt;init&gt;(SciTensor.scala:39)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:123)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:93)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:172)
at scala.collection.AbstractIterator.reduceLeft(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1011)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1009)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1952)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1025)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:1007)
at NcLocal$.main(NcLocal.scala:14)
at NcLocal.main(NcLocal.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
at org.dia.core.SciTensor.&lt;init&gt;(SciTensor.scala:39)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:123)
at org.dia.core.SciSparkContext$$anonfun$1.apply(SciSparkContext.scala:93)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:172)
at scala.collection.AbstractIterator.reduceLeft(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1011)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1009)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.SparkContext$$anonfun$36.apply(SparkContext.scala:1951)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

rahulpalamuttam commented Aug 3, 2016

Thanks @chrimiway,

NetcdfFile is intended to read a file containing a list of paths pointing to NetCDF files served by OPeNDAP servers. In your case you want to use NetcdfDFSFile.

If you have a collection of files on an OPeNDAP server, append their URLs to a file and pass that file's path to the NetcdfFile function.
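For example, something along these lines (a sketch only; urls.txt and the OPeNDAP host below are made-up placeholders):

```scala
import org.dia.core.SciSparkContext

val sc = new SciSparkContext("local", "SciSpark Program")

// urls.txt is a plain text file listing one OPeNDAP URL per line, e.g.:
//   http://opendap.example.org/dods/1971.nc
//   http://opendap.example.org/dods/1972.nc
// Pass the path of that list file (not a .nc file) to NetcdfFile.
val scientificRDD = sc.NetcdfFile("file:///media/hadoop/data/nc/urls.txt", List("hmax"), 5)
```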

chrimiway commented Aug 3, 2016

Thanks a lot, @rahulpalamuttam

chrimiway closed this Aug 4, 2016
