Collective sam file not present in the output directory. #38

Closed
tushu1232 opened this issue Feb 21, 2017 · 9 comments

@tushu1232

SparkBWA runs until the last step, but fails when running with the -r option, meaning it does not collect all the intermediate files and join them into a single SAM file.
Here are the logs from the last run, after which no SAM file is created. Is there some change because of merge pull request #37 from xubo26/master?

17/02/21 16:14:48 INFO BwaInterpreter: JMAbuin:: SparkBWA :: Returned file ::/ testcases/outputsam//SparkBWA_LP2000265-DNA_A01_1.fastq-32-SortSpark-app-20170221155756-0000-29.sam
17/02/21 16:14:48 INFO BwaInterpreter: JMAbuin:: SparkBWA :: Returned file /testcases/outputsam//SparkBWA_LP2000265-DNA_A01_1.fastq-32-SortSpark-app-20170221155756-0000-30.sam
17/02/21 16:14:48 INFO BwaInterpreter: JMAbuin:: SparkBWA :: Returned file Sidra_lookup_wrappers/testcases/outputsam//SparkBWA_LP2000265-DNA_A01_1.fastq-32-SortSpark-app-20170221155756-0000-31.sam
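For reference, here is a minimal sketch (not SparkBWA's own merge code; the class name, output file name, and paths are illustrative) of how the per-partition .sam parts in the output directory could be concatenated into a single SAM file by hand, keeping the header lines only from the first part:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MergeSamParts {
    public static void main(String[] args) throws IOException {
        Path outDir = Paths.get(args[0]);            // e.g. the output directory passed to SparkBWA
        Path merged = outDir.resolve("merged.sam");  // illustrative output name

        // Collect the per-partition parts in a stable order, skipping the merged file itself.
        List<Path> parts;
        try (Stream<Path> s = Files.list(outDir)) {
            parts = s.filter(p -> p.getFileName().toString().endsWith(".sam"))
                     .filter(p -> !p.equals(merged))
                     .sorted()
                     .collect(Collectors.toList());
        }

        try (BufferedWriter out = Files.newBufferedWriter(merged)) {
            boolean first = true;
            for (Path part : parts) {
                try (BufferedReader in = Files.newBufferedReader(part)) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // Keep SAM header (@) lines only from the first part.
                        if (!first && line.startsWith("@")) continue;
                        out.write(line);
                        out.newLine();
                    }
                }
                first = false;
            }
        }
    }
}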

@tushu1232
Author

#22

@xubo245
Contributor

xubo245 commented Feb 22, 2017

You can debug it, or add log statements to trace what is happening; focus on the files, including the temp files.
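For example, something along these lines (purely illustrative, not the actual SparkBWA classes) could be called wherever a "Returned file" is reported, so the driver log shows whether each intermediate SAM part actually exists and how large it is:

import java.io.File;
import org.apache.log4j.Logger;

// Illustrative trace helper, assuming log4j is on the classpath (as it is for Spark 1.x applications).
public final class ReturnedFileTrace {
    private static final Logger LOG = Logger.getLogger(ReturnedFileTrace.class);

    public static void trace(String returnedFileName) {
        File f = new File(returnedFileName);
        LOG.info("Returned file " + f.getAbsolutePath()
                + " exists=" + f.exists()
                + " size=" + f.length() + " bytes");
    }
}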

@tushu1232
Author

@xubo245 We are running the program in a non-HDFS environment, using IBM LSF as the scheduler rather than YARN. I am attaching the entire run log to this comment.
BWASPARKRUN.txt

@xubo245
Contributor

xubo245 commented Feb 24, 2017

Can you check the stderr log in the app-** directory of the work dir, and list the files in the tmps directory (maybe in workspace/tmps)?

I cannot see an error in your BWASPARKRUN.txt.
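If it helps, here is a rough sketch (the workspace/tmps path is an assumption; pass your actual temp directory instead) that walks the temp directory and prints every file with its size, so missing or zero-byte intermediate SAM parts stand out:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ListTmpFiles {
    public static void main(String[] args) throws IOException {
        // Assumed location of SparkBWA's temporary files; pass a different path as the first argument.
        Path tmps = Paths.get(args.length > 0 ? args[0] : "workspace/tmps");
        try (Stream<Path> walk = Files.walk(tmps)) {
            walk.filter(Files::isRegularFile).forEach(p -> {
                try {
                    System.out.printf("%10d  %s%n", Files.size(p), p);
                } catch (IOException e) {
                    System.err.println("Could not stat " + p + ": " + e.getMessage());
                }
            });
        }
    }
}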

@salimbakker

salimbakker commented Apr 21, 2017

SparkBWA generates empty SAM files:

17/04/21 08:11:45 INFO ContainerManagementProtocolProxy: Opening proxy : slave2.hdp:45454
17/04/21 08:11:48 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (slave1.hdp:41948) with ID 1
17/04/21 08:11:48 INFO BlockManagerMasterEndpoint: Registering block manager slave1.hdp:38864 with 7.0 GB RAM, BlockManagerId(1, slave1.hdp, 38864)
17/04/21 08:11:48 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (slave.hdp:50548) with ID 2
17/04/21 08:11:48 INFO BlockManagerMasterEndpoint: Registering block manager slave.hdp:43602 with 7.0 GB RAM, BlockManagerId(2, slave.hdp, 43602)
17/04/21 08:11:48 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (slave2.hdp:49614) with ID 3
17/04/21 08:11:48 INFO BlockManagerMasterEndpoint: Registering block manager slave2.hdp:46273 with 7.0 GB RAM, BlockManagerId(3, slave2.hdp, 46273)
17/04/21 08:12:14 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
17/04/21 08:12:14 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done
17/04/21 08:12:14 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: Starting BWA
17/04/21 08:12:14 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] ::Not sorting in HDFS. Timing: 47447818973648
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 341.8 KB, free 341.8 KB)
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.3 KB, free 370.2 KB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.2.86:32844 (size: 28.3 KB, free: 1140.3 MB)
17/04/21 08:12:14 INFO SparkContext: Created broadcast 0 from textFile at BwaInterpreter.java:149
17/04/21 08:12:14 INFO FileInputFormat: Total input paths to process : 1
17/04/21 08:12:14 INFO SparkContext: Starting job: zipWithIndex at BwaInterpreter.java:152
17/04/21 08:12:14 INFO DAGScheduler: Got job 0 (zipWithIndex at BwaInterpreter.java:152) with 13 output partitions
17/04/21 08:12:14 INFO DAGScheduler: Final stage: ResultStage 0 (zipWithIndex at BwaInterpreter.java:152)
17/04/21 08:12:14 INFO DAGScheduler: Parents of final stage: List()
17/04/21 08:12:14 INFO DAGScheduler: Missing parents: List()
17/04/21 08:12:14 INFO DAGScheduler: Submitting ResultStage 0 (hdfs://master.hdp:8020/SparkBWA/ERR000589_1.filt.fastq MapPartitionsRDD[1] at textFile at BwaInterpreter.java:149), which has no missing parents
17/04/21 08:12:14 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:173)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:34)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.0 KB, free 373.2 KB)
17/04/21 08:12:14 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1857.0 B, free 375.0 KB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.2.86:32844 (size: 1857.0 B, free: 1140.3 MB)
17/04/21 08:12:14 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:14 INFO DAGScheduler: Submitting 13 missing tasks from ResultStage 0 (hdfs://master.hdp:8020/SparkBWA/ERR000589_1.filt.fastq MapPartitionsRDD[1] at textFile at BwaInterpreter.java:149)
17/04/21 08:12:14 INFO YarnClusterScheduler: Adding task set 0.0 with 13 tasks
17/04/21 08:12:14 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, slave.hdp, partition 0,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:14 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, slave2.hdp, partition 1,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:14 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, slave1.hdp, partition 2,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on slave1.hdp:38864 (size: 1857.0 B, free: 7.0 GB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on slave1.hdp:38864 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on slave.hdp:43602 (size: 1857.0 B, free: 7.0 GB)
17/04/21 08:12:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on slave2.hdp:46273 (size: 1857.0 B, free: 7.0 GB)
17/04/21 08:12:15 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on slave.hdp:43602 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:15 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on slave2.hdp:46273 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:17 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, slave.hdp, partition 3,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:17 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2556 ms on slave.hdp (1/13)
17/04/21 08:12:17 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, slave1.hdp, partition 4,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:17 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 2662 ms on slave1.hdp (2/13)
17/04/21 08:12:17 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 5, slave2.hdp, partition 5,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:17 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 2904 ms on slave2.hdp (3/13)
17/04/21 08:12:18 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 6, slave.hdp, partition 6,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:18 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 1708 ms on slave.hdp (4/13)
17/04/21 08:12:18 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 7, slave2.hdp, partition 7,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:18 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 1406 ms on slave2.hdp (5/13)
17/04/21 08:12:19 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 2160 ms on slave1.hdp (6/13)
17/04/21 08:12:19 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 8, slave1.hdp, partition 8,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:20 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 9, slave.hdp, partition 9,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:20 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 1375 ms on slave.hdp (7/13)
17/04/21 08:12:20 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 10, slave2.hdp, partition 10,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:20 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 7) in 1522 ms on slave2.hdp (8/13)
17/04/21 08:12:21 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 11, slave1.hdp, partition 11,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:21 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 8) in 1652 ms on slave1.hdp (9/13)
17/04/21 08:12:21 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 12, slave.hdp, partition 12,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:21 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 9) in 1603 ms on slave.hdp (10/13)
17/04/21 08:12:21 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 11) in 877 ms on slave1.hdp (11/13)
17/04/21 08:12:22 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 10) in 1659 ms on slave2.hdp (12/13)
17/04/21 08:12:23 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 12) in 1581 ms on slave.hdp (13/13)
17/04/21 08:12:23 INFO DAGScheduler: ResultStage 0 (zipWithIndex at BwaInterpreter.java:152) finished in 8.808 s
17/04/21 08:12:23 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/04/21 08:12:23 INFO DAGScheduler: Job 0 finished: zipWithIndex at BwaInterpreter.java:152, took 8.884092 s
17/04/21 08:12:23 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:23 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobEnd(EventLoggingListener.scala:175)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 341.9 KB, free 716.9 KB)
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 28.3 KB, free 745.2 KB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.2.86:32844 (size: 28.3 KB, free: 1140.3 MB)
17/04/21 08:12:23 INFO SparkContext: Created broadcast 2 from textFile at BwaInterpreter.java:149
17/04/21 08:12:23 INFO FileInputFormat: Total input paths to process : 1
17/04/21 08:12:23 INFO SparkContext: Starting job: zipWithIndex at BwaInterpreter.java:152
17/04/21 08:12:23 INFO DAGScheduler: Got job 1 (zipWithIndex at BwaInterpreter.java:152) with 13 output partitions
17/04/21 08:12:23 INFO DAGScheduler: Final stage: ResultStage 1 (zipWithIndex at BwaInterpreter.java:152)
17/04/21 08:12:23 INFO DAGScheduler: Parents of final stage: List()
17/04/21 08:12:23 INFO DAGScheduler: Missing parents: List()
17/04/21 08:12:23 INFO DAGScheduler: Submitting ResultStage 1 (hdfs://master.hdp:8020/SparkBWA/ERR000589_2.filt.fastq MapPartitionsRDD[8] at textFile at BwaInterpreter.java:149), which has no missing parents
17/04/21 08:12:23 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:173)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:34)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 3.0 KB, free 748.2 KB)
17/04/21 08:12:23 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 1863.0 B, free 750.1 KB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.2.86:32844 (size: 1863.0 B, free: 1140.3 MB)
17/04/21 08:12:23 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:23 INFO DAGScheduler: Submitting 13 missing tasks from ResultStage 1 (hdfs://master.hdp:8020/SparkBWA/ERR000589_2.filt.fastq MapPartitionsRDD[8] at textFile at BwaInterpreter.java:149)
17/04/21 08:12:23 INFO YarnClusterScheduler: Adding task set 1.0 with 13 tasks
17/04/21 08:12:23 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 13, slave1.hdp, partition 0,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:23 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 14, slave2.hdp, partition 1,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:23 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 15, slave.hdp, partition 2,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on slave1.hdp:38864 (size: 1863.0 B, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on slave.hdp:43602 (size: 1863.0 B, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on slave2.hdp:46273 (size: 1863.0 B, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on slave1.hdp:38864 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on slave2.hdp:46273 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on slave.hdp:43602 (size: 28.3 KB, free: 7.0 GB)
17/04/21 08:12:24 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 16, slave.hdp, partition 3,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:24 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 15) in 1111 ms on slave.hdp (1/13)
17/04/21 08:12:24 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 17, slave1.hdp, partition 4,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:24 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 13) in 1233 ms on slave1.hdp (2/13)
17/04/21 08:12:24 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 18, slave2.hdp, partition 5,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:24 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 14) in 1415 ms on slave2.hdp (3/13)
17/04/21 08:12:25 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 19, slave1.hdp, partition 6,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:25 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 17) in 861 ms on slave1.hdp (4/13)
17/04/21 08:12:25 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 20, slave.hdp, partition 7,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:25 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 16) in 1128 ms on slave.hdp (5/13)
17/04/21 08:12:25 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 21, slave2.hdp, partition 8,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:25 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 18) in 1075 ms on slave2.hdp (6/13)
17/04/21 08:12:26 INFO TaskSetManager: Starting task 9.0 in stage 1.0 (TID 22, slave1.hdp, partition 9,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:26 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 19) in 852 ms on slave1.hdp (7/13)
17/04/21 08:12:26 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 23, slave.hdp, partition 10,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:26 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 20) in 969 ms on slave.hdp (8/13)
17/04/21 08:12:27 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 24, slave2.hdp, partition 11,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:27 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 21) in 1091 ms on slave2.hdp (9/13)
17/04/21 08:12:27 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 25, slave1.hdp, partition 12,NODE_LOCAL, 2156 bytes)
17/04/21 08:12:27 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 22) in 891 ms on slave1.hdp (10/13)
17/04/21 08:12:27 INFO TaskSetManager: Finished task 10.0 in stage 1.0 (TID 23) in 976 ms on slave.hdp (11/13)
17/04/21 08:12:28 INFO TaskSetManager: Finished task 12.0 in stage 1.0 (TID 25) in 861 ms on slave1.hdp (12/13)
17/04/21 08:12:28 INFO TaskSetManager: Finished task 11.0 in stage 1.0 (TID 24) in 1159 ms on slave2.hdp (13/13)
17/04/21 08:12:28 INFO DAGScheduler: ResultStage 1 (zipWithIndex at BwaInterpreter.java:152) finished in 4.737 s
17/04/21 08:12:28 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/04/21 08:12:28 INFO DAGScheduler: Job 1 finished: zipWithIndex at BwaInterpreter.java:152, took 4.746554 s
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobEnd(EventLoggingListener.scala:175)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:36)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 INFO MapPartitionsRDD: Removing RDD 6 from persistence list
17/04/21 08:12:28 INFO BlockManager: Removing RDD 6
17/04/21 08:12:28 INFO MapPartitionsRDD: Removing RDD 13 from persistence list
17/04/21 08:12:28 INFO BlockManager: Removing RDD 13
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onUnpersistRDD(EventLoggingListener.scala:186)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: No sort with partitioning
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onUnpersistRDD(EventLoggingListener.scala:186)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:50)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 21 more
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: Repartition with no sort
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: End of sorting. Timing: 47461927545800
17/04/21 08:12:28 INFO BwaInterpreter: [com.github.sparkbwa.BwaInterpreter] :: Total time: 0.23514286920000002 minutes
17/04/21 08:12:28 INFO BwaAlignmentBase: [com.github.sparkbwa.BwaPairedAlignment] :: application_1492697141087_0027 - SparkBWA_ERR000589_1.filt.fastq-32-NoSort
17/04/21 08:12:28 INFO SparkContext: Starting job: collect at BwaInterpreter.java:305
17/04/21 08:12:28 INFO DAGScheduler: Registering RDD 3 (mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO DAGScheduler: Registering RDD 10 (mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO DAGScheduler: Registering RDD 17 (repartition at BwaInterpreter.java:281)
17/04/21 08:12:28 INFO DAGScheduler: Got job 2 (collect at BwaInterpreter.java:305) with 32 output partitions
17/04/21 08:12:28 INFO DAGScheduler: Final stage: ResultStage 5 (collect at BwaInterpreter.java:305)
17/04/21 08:12:28 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 4)
17/04/21 08:12:28 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 4)
17/04/21 08:12:28 INFO DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[3] at mapToPair at BwaInterpreter.java:152), which has no missing parents
17/04/21 08:12:28 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:173)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:34)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 5.3 KB, free 755.3 KB)
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 2.8 KB, free 758.2 KB)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.2.86:32844 (size: 2.8 KB, free: 1140.3 MB)
17/04/21 08:12:28 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:28 INFO DAGScheduler: Submitting 14 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[3] at mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO YarnClusterScheduler: Adding task set 2.0 with 14 tasks
17/04/21 08:12:28 INFO DAGScheduler: Submitting ShuffleMapStage 3 (MapPartitionsRDD[10] at mapToPair at BwaInterpreter.java:152), which has no missing parents
17/04/21 08:12:28 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 26, slave1.hdp, partition 0,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:28 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 27, slave2.hdp, partition 1,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:28 INFO TaskSetManager: Starting task 2.0 in stage 2.0 (TID 28, slave.hdp, partition 2,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 5.3 KB, free 763.4 KB)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on slave1.hdp:38864 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:12:28 INFO MemoryStore: Block broadcast_5_piece0 stored as bytes in memory (estimated size 2.8 KB, free 766.3 KB)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on 192.168.2.86:32844 (size: 2.8 KB, free: 1140.3 MB)
17/04/21 08:12:28 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:1008
17/04/21 08:12:28 INFO DAGScheduler: Submitting 14 missing tasks from ShuffleMapStage 3 (MapPartitionsRDD[10] at mapToPair at BwaInterpreter.java:152)
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on slave.hdp:43602 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:12:28 INFO YarnClusterScheduler: Adding task set 3.0 with 14 tasks
17/04/21 08:12:28 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on slave2.hdp:46273 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:12:34 INFO TaskSetManager: Starting task 3.0 in stage 2.0 (TID 29, slave2.hdp, partition 3,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:34 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 27) in 6478 ms on slave2.hdp (1/14)
17/04/21 08:12:36 INFO TaskSetManager: Starting task 4.0 in stage 2.0 (TID 30, slave.hdp, partition 4,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:36 INFO TaskSetManager: Finished task 2.0 in stage 2.0 (TID 28) in 8047 ms on slave.hdp (2/14)
17/04/21 08:12:36 INFO TaskSetManager: Starting task 5.0 in stage 2.0 (TID 31, slave1.hdp, partition 5,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:36 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 26) in 8334 ms on slave1.hdp (3/14)
17/04/21 08:12:42 INFO TaskSetManager: Starting task 6.0 in stage 2.0 (TID 32, slave2.hdp, partition 6,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:42 INFO TaskSetManager: Finished task 3.0 in stage 2.0 (TID 29) in 7902 ms on slave2.hdp (4/14)
17/04/21 08:12:44 INFO TaskSetManager: Starting task 7.0 in stage 2.0 (TID 33, slave.hdp, partition 7,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:45 INFO TaskSetManager: Finished task 4.0 in stage 2.0 (TID 30) in 8723 ms on slave.hdp (5/14)
17/04/21 08:12:45 INFO TaskSetManager: Starting task 8.0 in stage 2.0 (TID 34, slave1.hdp, partition 8,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:45 INFO TaskSetManager: Finished task 5.0 in stage 2.0 (TID 31) in 8952 ms on slave1.hdp (6/14)
17/04/21 08:12:50 INFO TaskSetManager: Starting task 9.0 in stage 2.0 (TID 35, slave2.hdp, partition 9,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:50 INFO TaskSetManager: Finished task 6.0 in stage 2.0 (TID 32) in 8277 ms on slave2.hdp (7/14)
17/04/21 08:12:51 INFO TaskSetManager: Starting task 10.0 in stage 2.0 (TID 36, slave1.hdp, partition 10,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:51 INFO TaskSetManager: Finished task 8.0 in stage 2.0 (TID 34) in 6232 ms on slave1.hdp (8/14)
17/04/21 08:12:53 INFO TaskSetManager: Starting task 11.0 in stage 2.0 (TID 37, slave.hdp, partition 11,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:53 INFO TaskSetManager: Finished task 7.0 in stage 2.0 (TID 33) in 8497 ms on slave.hdp (9/14)
17/04/21 08:12:59 INFO TaskSetManager: Starting task 12.0 in stage 2.0 (TID 38, slave2.hdp, partition 12,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:59 INFO TaskSetManager: Finished task 9.0 in stage 2.0 (TID 35) in 8242 ms on slave2.hdp (10/14)
17/04/21 08:12:59 INFO TaskSetManager: Starting task 13.0 in stage 2.0 (TID 39, slave1.hdp, partition 13,NODE_LOCAL, 2255 bytes)
17/04/21 08:12:59 INFO TaskSetManager: Finished task 10.0 in stage 2.0 (TID 36) in 7685 ms on slave1.hdp (11/14)
17/04/21 08:13:01 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 40, slave.hdp, partition 0,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:01 INFO TaskSetManager: Finished task 11.0 in stage 2.0 (TID 37) in 8151 ms on slave.hdp (12/14)
17/04/21 08:13:01 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on slave.hdp:43602 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:13:03 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 41, slave1.hdp, partition 1,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:03 INFO TaskSetManager: Finished task 13.0 in stage 2.0 (TID 39) in 3730 ms on slave1.hdp (13/14)
17/04/21 08:13:03 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on slave1.hdp:38864 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:13:08 INFO TaskSetManager: Starting task 2.0 in stage 3.0 (TID 42, slave2.hdp, partition 2,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:08 INFO TaskSetManager: Finished task 12.0 in stage 2.0 (TID 38) in 9660 ms on slave2.hdp (14/14)
17/04/21 08:13:08 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool
17/04/21 08:13:08 INFO DAGScheduler: ShuffleMapStage 2 (mapToPair at BwaInterpreter.java:152) finished in 40.556 s
17/04/21 08:13:08 INFO DAGScheduler: looking for newly runnable stages
17/04/21 08:13:08 INFO DAGScheduler: running: Set(ShuffleMapStage 3)
17/04/21 08:13:08 INFO DAGScheduler: waiting: Set(ResultStage 5, ShuffleMapStage 4)
17/04/21 08:13:08 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more
17/04/21 08:13:08 INFO DAGScheduler: failed: Set()
17/04/21 08:13:08 INFO BlockManagerInfo: Added broadcast_5_piece0 in memory on slave2.hdp:46273 (size: 2.8 KB, free: 7.0 GB)
17/04/21 08:13:09 INFO TaskSetManager: Starting task 3.0 in stage 3.0 (TID 43, slave.hdp, partition 3,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:09 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 40) in 8088 ms on slave.hdp (1/14)
17/04/21 08:13:10 INFO TaskSetManager: Starting task 4.0 in stage 3.0 (TID 44, slave1.hdp, partition 4,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:10 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 41) in 7719 ms on slave1.hdp (2/14)
17/04/21 08:13:16 INFO TaskSetManager: Starting task 5.0 in stage 3.0 (TID 45, slave.hdp, partition 5,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:16 INFO TaskSetManager: Finished task 3.0 in stage 3.0 (TID 43) in 6821 ms on slave.hdp (3/14)
17/04/21 08:13:16 INFO TaskSetManager: Starting task 6.0 in stage 3.0 (TID 46, slave2.hdp, partition 6,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:16 INFO TaskSetManager: Finished task 2.0 in stage 3.0 (TID 42) in 7732 ms on slave2.hdp (4/14)
17/04/21 08:13:17 INFO TaskSetManager: Starting task 7.0 in stage 3.0 (TID 47, slave1.hdp, partition 7,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:17 INFO TaskSetManager: Finished task 4.0 in stage 3.0 (TID 44) in 6100 ms on slave1.hdp (5/14)
17/04/21 08:13:24 INFO TaskSetManager: Starting task 8.0 in stage 3.0 (TID 48, slave.hdp, partition 8,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:24 INFO TaskSetManager: Finished task 5.0 in stage 3.0 (TID 45) in 7780 ms on slave.hdp (6/14)
17/04/21 08:13:24 INFO TaskSetManager: Starting task 9.0 in stage 3.0 (TID 49, slave2.hdp, partition 9,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:24 INFO TaskSetManager: Finished task 6.0 in stage 3.0 (TID 46) in 7916 ms on slave2.hdp (7/14)
17/04/21 08:13:25 INFO TaskSetManager: Starting task 10.0 in stage 3.0 (TID 50, slave1.hdp, partition 10,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:25 INFO TaskSetManager: Finished task 7.0 in stage 3.0 (TID 47) in 8519 ms on slave1.hdp (8/14)
17/04/21 08:13:31 INFO TaskSetManager: Starting task 11.0 in stage 3.0 (TID 51, slave.hdp, partition 11,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:31 INFO TaskSetManager: Finished task 8.0 in stage 3.0 (TID 48) in 7668 ms on slave.hdp (9/14)
17/04/21 08:13:32 INFO TaskSetManager: Starting task 12.0 in stage 3.0 (TID 52, slave2.hdp, partition 12,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:32 INFO TaskSetManager: Finished task 9.0 in stage 3.0 (TID 49) in 7587 ms on slave2.hdp (10/14)
17/04/21 08:13:33 INFO TaskSetManager: Starting task 13.0 in stage 3.0 (TID 53, slave1.hdp, partition 13,NODE_LOCAL, 2255 bytes)
17/04/21 08:13:33 INFO TaskSetManager: Finished task 10.0 in stage 3.0 (TID 50) in 8168 ms on slave1.hdp (11/14)
17/04/21 08:13:37 INFO TaskSetManager: Finished task 13.0 in stage 3.0 (TID 53) in 4209 ms on slave1.hdp (12/14)
17/04/21 08:13:39 INFO TaskSetManager: Finished task 11.0 in stage 3.0 (TID 51) in 7536 ms on slave.hdp (13/14)
17/04/21 08:13:40 INFO TaskSetManager: Finished task 12.0 in stage 3.0 (TID 52) in 7997 ms on slave2.hdp (14/14)
17/04/21 08:13:40 INFO DAGScheduler: ShuffleMapStage 3 (mapToPair at BwaInterpreter.java:152) finished in 71.731 s
17/04/21 08:13:40 INFO DAGScheduler: looking for newly runnable stages
17/04/21 08:13:40 INFO YarnClusterScheduler: Removed TaskSet 3.0, whose tasks have all completed, from pool
17/04/21 08:13:40 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more
17/04/21 08:13:40 INFO DAGScheduler: running: Set()
17/04/21 08:13:40 INFO DAGScheduler: waiting: Set(ResultStage 5, ShuffleMapStage 4)
17/04/21 08:13:40 INFO DAGScheduler: failed: Set()
17/04/21 08:13:40 INFO DAGScheduler: Submitting ShuffleMapStage 4 (MapPartitionsRDD[17] at repartition at BwaInterpreter.java:281), which has no missing parents
17/04/21 08:13:40 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 8.3 KB, free 774.6 KB)
17/04/21 08:13:40 INFO MemoryStore: Block broadcast_6_piece0 stored as bytes in memory (estimated size 3.9 KB, free 778.5 KB)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on 192.168.2.86:32844 (size: 3.9 KB, free: 1140.3 MB)
17/04/21 08:13:40 INFO SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:1008
17/04/21 08:13:40 INFO DAGScheduler: Submitting 14 missing tasks from ShuffleMapStage 4 (MapPartitionsRDD[17] at repartition at BwaInterpreter.java:281)
17/04/21 08:13:40 INFO YarnClusterScheduler: Adding task set 4.0 with 14 tasks
17/04/21 08:13:40 INFO TaskSetManager: Starting task 0.0 in stage 4.0 (TID 54, slave1.hdp, partition 0,NODE_LOCAL, 2132 bytes)
17/04/21 08:13:40 INFO TaskSetManager: Starting task 1.0 in stage 4.0 (TID 55, slave2.hdp, partition 1,NODE_LOCAL, 2132 bytes)
17/04/21 08:13:40 INFO TaskSetManager: Starting task 2.0 in stage 4.0 (TID 56, slave.hdp, partition 2,NODE_LOCAL, 2132 bytes)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on slave1.hdp:38864 (size: 3.9 KB, free: 7.0 GB)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on slave2.hdp:46273 (size: 3.9 KB, free: 7.0 GB)
17/04/21 08:13:40 INFO BlockManagerInfo: Added broadcast_6_piece0 in memory on slave.hdp:43602 (size: 3.9 KB, free: 7.0 GB)
17/04/21 08:13:40 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to slave2.hdp:49614
17/04/21 08:13:40 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 2 is 207 bytes
17/04/21 08:13:40 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to slave.hdp:50548
17/04/21 08:13:40 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 2 to slave1.hdp:41948
17/04/21 08:13:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to slave.hdp:50548
17/04/21 08:13:47 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 188 bytes
17/04/21 08:13:48 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to slave1.hdp:41948
17/04/21 08:13:48 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to slave2.hdp:49614
17/04/21 08:14:04 INFO TaskSetManager: Starting task 3.0 in stage 4.0 (TID 57, slave.hdp, partition 3,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:04 INFO TaskSetManager: Finished task 2.0 in stage 4.0 (TID 56) in 24645 ms on slave.hdp (1/14)
17/04/21 08:14:06 INFO TaskSetManager: Starting task 4.0 in stage 4.0 (TID 58, slave2.hdp, partition 4,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:06 INFO TaskSetManager: Finished task 1.0 in stage 4.0 (TID 55) in 26426 ms on slave2.hdp (2/14)
17/04/21 08:14:07 INFO TaskSetManager: Starting task 5.0 in stage 4.0 (TID 59, slave1.hdp, partition 5,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:07 INFO TaskSetManager: Finished task 0.0 in stage 4.0 (TID 54) in 27388 ms on slave1.hdp (3/14)
17/04/21 08:14:32 INFO TaskSetManager: Starting task 6.0 in stage 4.0 (TID 60, slave.hdp, partition 6,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:32 INFO TaskSetManager: Finished task 3.0 in stage 4.0 (TID 57) in 27494 ms on slave.hdp (4/14)
17/04/21 08:14:38 INFO TaskSetManager: Starting task 7.0 in stage 4.0 (TID 61, slave2.hdp, partition 7,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:38 INFO TaskSetManager: Finished task 4.0 in stage 4.0 (TID 58) in 31786 ms on slave2.hdp (5/14)
17/04/21 08:14:39 INFO TaskSetManager: Starting task 8.0 in stage 4.0 (TID 62, slave1.hdp, partition 8,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:39 INFO TaskSetManager: Finished task 5.0 in stage 4.0 (TID 59) in 31780 ms on slave1.hdp (6/14)
17/04/21 08:14:57 INFO TaskSetManager: Starting task 9.0 in stage 4.0 (TID 63, slave.hdp, partition 9,NODE_LOCAL, 2132 bytes)
17/04/21 08:14:57 INFO TaskSetManager: Finished task 6.0 in stage 4.0 (TID 60) in 25588 ms on slave.hdp (7/14)
17/04/21 08:15:07 INFO TaskSetManager: Starting task 10.0 in stage 4.0 (TID 64, slave2.hdp, partition 10,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:08 INFO TaskSetManager: Finished task 7.0 in stage 4.0 (TID 61) in 29562 ms on slave2.hdp (8/14)
17/04/21 08:15:28 INFO TaskSetManager: Starting task 11.0 in stage 4.0 (TID 65, slave1.hdp, partition 11,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:28 INFO TaskSetManager: Finished task 8.0 in stage 4.0 (TID 62) in 49647 ms on slave1.hdp (9/14)
17/04/21 08:15:35 INFO TaskSetManager: Starting task 12.0 in stage 4.0 (TID 66, slave.hdp, partition 12,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:35 INFO TaskSetManager: Finished task 9.0 in stage 4.0 (TID 63) in 38007 ms on slave.hdp (10/14)
17/04/21 08:15:40 INFO TaskSetManager: Starting task 13.0 in stage 4.0 (TID 67, slave2.hdp, partition 13,NODE_LOCAL, 2132 bytes)
17/04/21 08:15:40 INFO TaskSetManager: Finished task 10.0 in stage 4.0 (TID 64) in 32524 ms on slave2.hdp (11/14)
17/04/21 08:17:17 INFO TaskSetManager: Finished task 13.0 in stage 4.0 (TID 67) in 97403 ms on slave2.hdp (12/14)
17/04/21 08:17:20 INFO TaskSetManager: Finished task 12.0 in stage 4.0 (TID 66) in 104233 ms on slave.hdp (13/14)
17/04/21 08:18:24 INFO TaskSetManager: Finished task 11.0 in stage 4.0 (TID 65) in 175387 ms on slave1.hdp (14/14)
17/04/21 08:18:24 INFO DAGScheduler: ShuffleMapStage 4 (repartition at BwaInterpreter.java:281) finished in 284.201 s
17/04/21 08:18:24 INFO YarnClusterScheduler: Removed TaskSet 4.0, whose tasks have all completed, from pool
17/04/21 08:18:24 INFO DAGScheduler: looking for newly runnable stages
17/04/21 08:18:24 INFO DAGScheduler: running: Set()
17/04/21 08:18:24 INFO DAGScheduler: waiting: Set(ResultStage 5)
17/04/21 08:18:24 INFO DAGScheduler: failed: Set()
17/04/21 08:18:24 INFO DAGScheduler: Submitting ResultStage 5 (MapPartitionsRDD[22] at mapPartitionsWithIndex at BwaInterpreter.java:304), which has no missing parents
17/04/21 08:18:24 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:150)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:150)
at org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:170)
at org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:55)
at org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:37)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(AsynchronousListenerBus.scala:80)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(AsynchronousListenerBus.scala:65)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:64)
at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1181)
at org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:63)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:818)
at org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:2037)
at org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1983)
at org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 20 more

This is my running command
/usr/bin/spark-submit --class com.github.sparkbwa.SparkBWA --master yarn-cluster --driver-memory 2g --executor-memory 10g --executor-cores 1 --verbose --num-executors 32 /home/SparkBWA/target/SparkBWA-0.2.jar -m -r -p --index hdfs://master.hdp:8020/Data/HumanBase/hg19.fa -n 32 -w "-R @rg\tID:foo\tLB:bar\tPL:illumina\tPU:illumina\tSM:ERR000589" hdfs://master.hdp:8020/SparkBWA/ERR000589_1.filt.fastq hdfs://master.hdp:8020/SparkBWA/ERR000589_2.filt.fastq hdfs://master.hdp:8020/sample/output

@xubo245
Contributor

xubo245 commented Apr 23, 2017

You can run in local mode, and the executor number should be less than the worker number.
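For illustration only, a minimal sketch of that adjustment against the command above (the elided parts are unchanged from your own invocation; local[8] and an executor count of 3, matching the three workers visible in your log, are assumptions on my side):

# hypothetical local-mode run, bypassing YARN entirely
/usr/bin/spark-submit --class com.github.sparkbwa.SparkBWA --master local[8] --driver-memory 10g /home/SparkBWA/target/SparkBWA-0.2.jar ...

# or, staying on YARN, request no more executors than you have worker nodes
/usr/bin/spark-submit --class com.github.sparkbwa.SparkBWA --master yarn-cluster --num-executors 3 --executor-cores 1 --executor-memory 10g ...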

@jmabuin
Contributor

jmabuin commented Apr 24, 2017

Just to be clear, what do you mean when you say "non-hdfs environment"?

@jmabuin
Contributor

jmabuin commented Jun 13, 2017

I was taking a look at your execution command. The index must be on local disk, not in HDFS, and it must be available on all computing nodes in your cluster.
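As a hedged sketch of what that looks like in practice (the local path /Data/HumanBase/ and the worker host names are assumptions taken from your log; a bwa index is the set of .amb, .ann, .bwt, .pac and .sa files sitting next to the .fa):

# copy the reference and its bwa index files to the same local path on every worker (hypothetical paths/hosts)
scp /Data/HumanBase/hg19.fa* slave.hdp:/Data/HumanBase/
scp /Data/HumanBase/hg19.fa* slave1.hdp:/Data/HumanBase/
scp /Data/HumanBase/hg19.fa* slave2.hdp:/Data/HumanBase/
# then point --index at that local path instead of the hdfs:// URI
... --index /Data/HumanBase/hg19.fa ...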

@jmabuin
Contributor

jmabuin commented Jun 13, 2017

We have finally been able to reproduce the empty SAM file error. It happens because bwa cannot find the index. The job appears to finish correctly in the YARN web interface, but internally the Spark executors are failing.

You can check whether this is happening to you by checking the executor logs:
yarn logs -applicationId your-app-id

You should find some kind of bwa error in the executor output.
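For example (the application id below is a placeholder, and the exact wording is an assumption; when bwa cannot find its index it usually complains that it fails to locate the index files):

yarn logs -applicationId your-app-id | grep -i -E "bwa|index|fail"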

jmabuin closed this as completed Aug 2, 2017