Recalibrate_base_Qualities #1743

Closed
Rokshan2016 opened this Issue Sep 23, 2017 · 14 comments

3 participants

Rokshan2016 commented Sep 23, 2017

Hi,

I am trying to use the -recalibrate_base_qualities option. This is my command:

./adam-submit transformAlignments hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_markduplicate_adam/aln_markduplicate.adam hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/aln2.adam -recalibrate_base_qualities

I am getting this error:

n.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.

Rokshan2016 commented Sep 24, 2017

It only works with the .sam file. If I use the .adam file, it shows that error.

Member

fnothaft commented Sep 24, 2017

Hi @Rokshan2016! Your stack trace is missing the start of the first line. Can you re-paste that line?

Rokshan2016 commented Sep 24, 2017

Hi, is it good now? Sorry, I am a newbie to genomics.

./adam-submit transformAlignments hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_markduplicate_adam/aln_markduplicate.adam hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/aln2.adam -recalibrate_base_qualities

Member

fnothaft commented Sep 24, 2017

Hi @Rokshan2016! I was talking about the error message. Specifically, the first line of the error message is truncated to n.scala:323).

Rokshan2016 commented Sep 24, 2017

Here it is:
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.

Rokshan2016 commented Sep 24, 2017

It works if I give an unaligned ADAM-format file created from the .fastq, but it does not work if I give an aligned ADAM-format file.

Member

fnothaft commented Sep 24, 2017

Hi @Rokshan2016! Is there any more to the error message you've provided? I'd expect it to include the specific exception that is thrown. From what I can trace in the message, I don't have insight as to where the exception is being thrown.

This isn't related to the problem at hand, but BQSR doesn't do anything if you provide unaligned data; it will only recalibrate base qualities if your data has been aligned.

Rokshan2016 commented Sep 24, 2017

[rokshan.jahan@ip-10-48-3-64 bin]$ ./adam-submit transformAlignments hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/Recali2.adam -recalibrate_base_qualities
Using ADAM_MAIN=org.bdgenomics.adam.cli.ADAMMain
Using SPARK_SUBMIT=/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/bin/spark-submit
17/09/23 22:11:34 INFO cli.ADAMMain: ADAM invoked with args: "transformAlignments" "hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam" "hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/Recali2.adam" "-recalibrate_base_qualities"
17/09/23 22:11:34 INFO spark.SparkContext: Running Spark version 1.6.0
17/09/23 22:11:35 INFO spark.SecurityManager: Changing view acls to: rokshan.jahan
17/09/23 22:11:35 INFO spark.SecurityManager: Changing modify acls to: rokshan.jahan
17/09/23 22:11:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rokshan.jahan); users with modify permissions: Set(rokshan.jahan)
17/09/23 22:11:35 INFO util.Utils: Successfully started service 'sparkDriver' on port 43439.
17/09/23 22:11:35 INFO slf4j.Slf4jLogger: Slf4jLogger started
17/09/23 22:11:36 INFO Remoting: Starting remoting
17/09/23 22:11:36 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.48.3.64:39653]
17/09/23 22:11:36 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem@10.48.3.64:39653]
17/09/23 22:11:36 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 39653.
17/09/23 22:11:36 INFO spark.SparkEnv: Registering MapOutputTracker
17/09/23 22:11:36 INFO spark.SparkEnv: Registering BlockManagerMaster
17/09/23 22:11:36 INFO storage.DiskBlockManager: Created local directory at /data1/tmp/blockmgr-1089712b-7f4f-451e-9a36-6722edb69a49
17/09/23 22:11:36 INFO storage.MemoryStore: MemoryStore started with capacity 530.0 MB
17/09/23 22:11:36 INFO spark.SparkEnv: Registering OutputCommitCoordinator
17/09/23 22:11:36 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
17/09/23 22:11:36 INFO ui.SparkUI: Started SparkUI at http://10.48.3.64:4040
17/09/23 22:11:36 INFO spark.SparkContext: Added JAR file:/home/rokshan.jahan/adamproject/adam/bin/../adam-assembly/target/adam-assembly_2.10-0.23.0-SNAPSHOT.jar at spark://10.48.3.64:43439/jars/adam-assembly_2.10-0.23.0-SNAPSHOT.jar with timestamp 1506219096638
17/09/23 22:11:36 INFO client.RMProxy: Connecting to ResourceManager at ip-10-48-3-5.ips.local/10.48.3.5:8032
17/09/23 22:11:37 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
17/09/23 22:11:37 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (28672 MB per container)
17/09/23 22:11:37 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
17/09/23 22:11:37 INFO yarn.Client: Setting up container launch context for our AM
17/09/23 22:11:37 INFO yarn.Client: Setting up the launch environment for our AM container
17/09/23 22:11:37 INFO yarn.Client: Preparing resources for our AM container
17/09/23 22:11:38 INFO yarn.YarnSparkHadoopUtil: getting token for namenode: hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/.sparkStaging/application_1506021801259_1567
17/09/23 22:11:38 INFO hdfs.DFSClient: Created token for rokshan.jahan: HDFS_DELEGATION_TOKEN owner=rokshan.jahan@IPS.LOCAL, renewer=yarn, realUser=, issueDate=1506219098209, maxDate=1506823898209, sequenceNumber=92917, masterKeyId=759 on 10.48.3.5:8020
17/09/23 22:11:39 INFO hive.metastore: Trying to connect to metastore with URI thrift://ip-10-48-3-5.ips.local:9083
17/09/23 22:11:39 INFO hive.metastore: Opened a connection to metastore, current connections: 1
17/09/23 22:11:39 INFO hive.metastore: Connected to metastore.
17/09/23 22:11:39 INFO hive.metastore: Closed a connection to metastore, current connections: 0
17/09/23 22:11:39 INFO yarn.Client: Uploading resource file:/data1/tmp/spark-756bda7b-3039-4204-9c45-00f1ed423a35/__spark_conf__7564382871247080826.zip -> hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/.sparkStaging/application_1506021801259_1567/__spark_conf__7564382871247080826.zip
17/09/23 22:11:40 INFO spark.SecurityManager: Changing view acls to: rokshan.jahan
17/09/23 22:11:40 INFO spark.SecurityManager: Changing modify acls to: rokshan.jahan
17/09/23 22:11:40 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rokshan.jahan); users with modify permissions: Set(rokshan.jahan)
17/09/23 22:11:40 INFO yarn.Client: Submitting application 1567 to ResourceManager
17/09/23 22:11:40 INFO impl.YarnClientImpl: Submitted application application_1506021801259_1567
17/09/23 22:11:41 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:41 INFO yarn.Client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.users.rokshan_dot_jahan
start time: 1506219100129
final status: UNDEFINED
tracking URL: http://ip-10-48-3-5.ips.local:8088/proxy/application_1506021801259_1567/
user: rokshan.jahan
17/09/23 22:11:42 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:43 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:44 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
17/09/23 22:11:44 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ip-10-48-3-5.ips.local, PROXY_URI_BASES -> http://ip-10-48-3-5.ips.local:8088/proxy/application_1506021801259_1567), /proxy/application_1506021801259_1567
17/09/23 22:11:44 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
17/09/23 22:11:44 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:45 INFO yarn.Client: Application report for application_1506021801259_1567 (state: RUNNING)
17/09/23 22:11:45 INFO yarn.Client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: N/A
ApplicationMaster host: 10.48.3.65
ApplicationMaster RPC port: 0
queue: root.users.rokshan_dot_jahan
start time: 1506219100129
final status: UNDEFINED
tracking URL: http://ip-10-48-3-5.ips.local:8088/proxy/application_1506021801259_1567/
user: rokshan.jahan
17/09/23 22:11:45 INFO cluster.YarnClientSchedulerBackend: Application application_1506021801259_1567 has started running.
17/09/23 22:11:45 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 33559.
17/09/23 22:11:45 INFO netty.NettyBlockTransferService: Server created on 33559
17/09/23 22:11:45 INFO storage.BlockManager: external shuffle service port = 7337
17/09/23 22:11:45 INFO storage.BlockManagerMaster: Trying to register BlockManager
17/09/23 22:11:45 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.48.3.64:33559 with 530.0 MB RAM, BlockManagerId(driver, 10.48.3.64, 33559)
17/09/23 22:11:45 INFO storage.BlockManagerMaster: Registered BlockManager
17/09/23 22:11:45 INFO scheduler.EventLoggingListener: Logging events to hdfs://ip-10-48-3-5.ips.local:8020/user/spark/applicationHistory/application_1506021801259_1567
17/09/23 22:11:45 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
17/09/23 22:11:45 INFO rdd.ADAMContext: Loading hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam as Parquet of AlignmentRecords.
17/09/23 22:11:45 INFO rdd.ADAMContext: Reading the ADAM file at hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam to create RDD
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 302.5 KB, free 529.7 MB)
17/09/23 22:11:46 INFO serialization.ADAMKryoRegistrator: Did not find Spark internal class. This is expected for Spark 1.
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.0 KB, free 529.7 MB)
17/09/23 22:11:46 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.48.3.64:33559 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:46 INFO spark.SparkContext: Created broadcast 0 from newAPIHadoopFile at ADAMContext.scala:1257
17/09/23 22:11:46 INFO cli.TransformAlignments: Recalibrating base qualities
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 128.0 B, free 529.7 MB)
17/09/23 22:11:46 INFO serialization.ADAMKryoRegistrator: Did not find Spark internal class. This is expected for Spark 1.
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 32.0 B, free 529.7 MB)
17/09/23 22:11:46 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.48.3.64:33559 (size: 32.0 B, free: 530.0 MB)
17/09/23 22:11:46 INFO spark.SparkContext: Created broadcast 1 from broadcast at TransformAlignments.scala:272
17/09/23 22:11:46 INFO hdfs.DFSClient: Created token for rokshan.jahan: HDFS_DELEGATION_TOKEN owner=rokshan.jahan@IPS.LOCAL, renewer=yarn, realUser=, issueDate=1506219106860, maxDate=1506823906860, sequenceNumber=92918, masterKeyId=759 on 10.48.3.5:8020
17/09/23 22:11:46 INFO security.TokenCache: Got dt for hdfs://ip-10-48-3-5.ips.local:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 10.48.3.5:8020, Ident: (token for rokshan.jahan: HDFS_DELEGATION_TOKEN owner=rokshan.jahan@IPS.LOCAL, renewer=yarn, realUser=, issueDate=1506219106860, maxDate=1506823906860, sequenceNumber=92918, masterKeyId=759)
17/09/23 22:11:46 INFO input.FileInputFormat: Total input paths to process : 5
17/09/23 22:11:46 INFO spark.SparkContext: Starting job: reduceByKeyLocally at BaseQualityRecalibration.scala:94
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Got job 0 (reduceByKeyLocally at BaseQualityRecalibration.scala:94) with 5 output partitions
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduceByKeyLocally at BaseQualityRecalibration.scala:94)
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Parents of final stage: List()
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Missing parents: List()
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[5] at reduceByKeyLocally at BaseQualityRecalibration.scala:94), which has no missing parents
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 4.7 KB, free 529.7 MB)
17/09/23 22:11:46 INFO serialization.ADAMKryoRegistrator: Did not find Spark internal class. This is expected for Spark 1.
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.5 KB, free 529.7 MB)
17/09/23 22:11:46 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.48.3.64:33559 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:46 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
17/09/23 22:11:47 INFO scheduler.DAGScheduler: Submitting 5 missing tasks from ResultStage 0 (MapPartitionsRDD[5] at reduceByKeyLocally at BaseQualityRecalibration.scala:94)
17/09/23 22:11:47 INFO cluster.YarnScheduler: Adding task set 0.0 with 5 tasks
17/09/23 22:11:48 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 1)
17/09/23 22:11:49 INFO spark.ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 3)
17/09/23 22:11:50 INFO spark.ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 5)
17/09/23 22:11:51 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-65.ips.local:43752) with ID 3
17/09/23 22:11:51 INFO spark.ExecutorAllocationManager: New executor 3 has registered (new total is 1)
17/09/23 22:11:51 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:51 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-65.ips.local:36529 with 530.0 MB RAM, BlockManagerId(3, ip-10-48-3-65.ips.local, 36529)
17/09/23 22:11:52 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-10-48-3-65.ips.local:36529 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:52 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-12.ips.local:59610) with ID 1
17/09/23 22:11:52 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:52 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 2)
17/09/23 22:11:52 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-12.ips.local:33231 with 530.0 MB RAM, BlockManagerId(1, ip-10-48-3-12.ips.local, 33231)
17/09/23 22:11:52 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-65.ips.local:43766) with ID 5
17/09/23 22:11:52 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, ip-10-48-3-65.ips.local, executor 5, partition 2, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:52 INFO spark.ExecutorAllocationManager: New executor 5 has registered (new total is 3)
17/09/23 22:11:52 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-65.ips.local:33969 with 530.0 MB RAM, BlockManagerId(5, ip-10-48-3-65.ips.local, 33969)
17/09/23 22:11:53 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-10-48-3-12.ips.local:33231 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:53 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-10-48-3-65.ips.local:33969 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:53 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-48-3-65.ips.local:36529 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:54 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-48-3-12.ips.local:33231 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:54 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-48-3-65.ips.local:33969 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:55 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-64.ips.local:57132) with ID 4
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, ip-10-48-3-64.ips.local, executor 4, partition 3, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO spark.ExecutorAllocationManager: New executor 4 has registered (new total is 4)
17/09/23 22:11:55 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-48-3-65.ips.local:36529 (size: 32.0 B, free: 530.0 MB)
17/09/23 22:11:55 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-64.ips.local:40355 with 530.0 MB RAM, BlockManagerId(4, ip-10-48-3-64.ips.local, 40355)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, ip-10-48-3-65.ips.local, executor 3, partition 4, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

17/09/23 22:11:55 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-48-3-12.ips.local:33231 (size: 32.0 B, free: 530.0 MB)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 0.0 (TID 5, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 4.0 in stage 0.0 (TID 4) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 1]
17/09/23 22:11:55 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-64.ips.local:57134) with ID 2
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 4.1 in stage 0.0 (TID 6, ip-10-48-3-64.ips.local, executor 2, partition 4, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 2]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 1.1 in stage 0.0 (TID 7, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 5)
17/09/23 22:11:55 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-64.ips.local:33124 with 530.0 MB RAM, BlockManagerId(2, ip-10-48-3-64.ips.local, 33124)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 0.0 (TID 5) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 3]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 0.0 (TID 8, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.1 in stage 0.0 (TID 7) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 4]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 1.2 in stage 0.0 (TID 9, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 0.2 in stage 0.0 (TID 8) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 5]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.2 in stage 0.0 (TID 9) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 6]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 1.3 in stage 0.0 (TID 11, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 0.0 (TID 10) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 7]
17/09/23 22:11:55 ERROR scheduler.TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.3 in stage 0.0 (TID 11) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 8]
17/09/23 22:11:55 INFO cluster.YarnScheduler: Cancelling stage 0
17/09/23 22:11:55 INFO cluster.YarnScheduler: Stage 0 was cancelled
17/09/23 22:11:55 INFO scheduler.DAGScheduler: ResultStage 0 (reduceByKeyLocally at BaseQualityRecalibration.scala:94) failed in 8.749 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
17/09/23 22:11:55 WARN spark.ExecutorAllocationManager: No stages are running, but numRunningTasks != 0
17/09/23 22:11:55 INFO scheduler.DAGScheduler: Job 0 failed: reduceByKeyLocally at BaseQualityRecalibration.scala:94, took 9.032743 s
Command body threw exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
17/09/23 22:11:55 INFO cli.TransformAlignments: Overall Duration: 21.3 secs
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1433)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1421)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1420)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1420)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1644)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1603)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1592)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1840)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1960)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1025)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:1007)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1.apply(PairRDDFunctions.scala:363)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1.apply(PairRDDFunctions.scala:339)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.PairRDDFunctions.reduceByKeyLocally(PairRDDFunctions.scala:339)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration.&lt;init&gt;(BaseQualityRecalibration.scala:94)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.apply(BaseQualityRecalibration.scala:346)
at org.bdgenomics.adam.rdd.read.AlignmentRecordRDD$$anonfun$recalibrateBaseQualities$1.apply(AlignmentRecordRDD.scala:980)
at org.bdgenomics.adam.rdd.read.AlignmentRecordRDD$$anonfun$recalibrateBaseQualities$1.apply(AlignmentRecordRDD.scala:980)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.AlignmentRecordRDD.recalibrateBaseQualities(AlignmentRecordRDD.scala:979)
at org.bdgenomics.adam.cli.TransformAlignments.maybeRecalibrate(TransformAlignments.scala:276)
at org.bdgenomics.adam.cli.TransformAlignments.apply(TransformAlignments.scala:391)
at org.bdgenomics.adam.cli.TransformAlignments.run(TransformAlignments.scala:548)
at org.bdgenomics.utils.cli.BDGSparkCommand$class.run(BDGCommand.scala:55)
at org.bdgenomics.adam.cli.TransformAlignments.run(TransformAlignments.scala:138)
at org.bdgenomics.adam.cli.ADAMMain.apply(ADAMMain.scala:126)
at org.bdgenomics.adam.cli.ADAMMain$.main(ADAMMain.scala:65)
at org.bdgenomics.adam.cli.ADAMMain.main(ADAMMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Sep 23, 2017 10:11:46 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 5
17/09/23 22:11:55 INFO spark.SparkContext: Invoking stop() from shutdown hook
17/09/23 22:11:56 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2, ip-10-48-3-65.ips.local, executor 5): TaskKilled (killed intentionally)
17/09/23 22:11:56 INFO ui.SparkUI: Stopped Spark web UI at http://10.48.3.64:4040
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Stopped
17/09/23 22:11:56 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/09/23 22:11:56 ERROR server.TransportRequestHandler: Error sending result StreamResponse{streamId=/jars/adam-assembly_2.10-0.23.0-SNAPSHOT.jar, byteCount=61114166, body=FileSegmentManagedBuffer{file=/home/rokshan.jahan/adamproject/adam/bin/../adam-assembly/target/adam-assembly_2.10-0.23.0-SNAPSHOT.jar, offset=0, length=61114166}} to /10.48.3.64:57146; closing connection
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at org.apache.spark.network.buffer.LazyFileRegion.transferTo(LazyFileRegion.java:96)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:98)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:254)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:237)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:281)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:761)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.forceFlush(AbstractNioChannel.java:317)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:519)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
17/09/23 22:11:56 INFO storage.MemoryStore: MemoryStore cleared
17/09/23 22:11:56 INFO storage.BlockManager: BlockManager stopped
17/09/23 22:11:56 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
17/09/23 22:11:56 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/09/23 22:11:56 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/09/23 22:11:56 INFO spark.SparkContext: Successfully stopped SparkContext
17/09/23 22:11:56 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
17/09/23 22:11:56 INFO util.ShutdownHookManager: Shutdown hook called
17/09/23 22:11:56 INFO util.ShutdownHookManager: Deleting directory /data1/tmp/spark-756bda7b-3039-4204-9c45-00f1ed423a35

Rokshan2016 commented Sep 24, 2017

[rokshan.jahan@ip-10-48-3-64 bin]$ ./adam-submit transformAlignments hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/Recali2.adam -recalibrate_base_qualities
Using ADAM_MAIN=org.bdgenomics.adam.cli.ADAMMain
Using SPARK_SUBMIT=/opt/cloudera/parcels/CDH-5.10.1-1.cdh5.10.1.p0.10/bin/spark-submit
17/09/23 22:11:34 INFO cli.ADAMMain: ADAM invoked with args: "transformAlignments" "hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam" "hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/Recali2.adam" "-recalibrate_base_qualities"
17/09/23 22:11:34 INFO spark.SparkContext: Running Spark version 1.6.0
17/09/23 22:11:35 INFO spark.SecurityManager: Changing view acls to: rokshan.jahan
17/09/23 22:11:35 INFO spark.SecurityManager: Changing modify acls to: rokshan.jahan
17/09/23 22:11:35 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rokshan.jahan); users with modify permissions: Set(rokshan.jahan)
17/09/23 22:11:35 INFO util.Utils: Successfully started service 'sparkDriver' on port 43439.
17/09/23 22:11:35 INFO slf4j.Slf4jLogger: Slf4jLogger started
17/09/23 22:11:36 INFO Remoting: Starting remoting
17/09/23 22:11:36 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.48.3.64:39653]
17/09/23 22:11:36 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriverActorSystem@10.48.3.64:39653]
17/09/23 22:11:36 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 39653.
17/09/23 22:11:36 INFO spark.SparkEnv: Registering MapOutputTracker
17/09/23 22:11:36 INFO spark.SparkEnv: Registering BlockManagerMaster
17/09/23 22:11:36 INFO storage.DiskBlockManager: Created local directory at /data1/tmp/blockmgr-1089712b-7f4f-451e-9a36-6722edb69a49
17/09/23 22:11:36 INFO storage.MemoryStore: MemoryStore started with capacity 530.0 MB
17/09/23 22:11:36 INFO spark.SparkEnv: Registering OutputCommitCoordinator
17/09/23 22:11:36 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
17/09/23 22:11:36 INFO ui.SparkUI: Started SparkUI at http://10.48.3.64:4040
17/09/23 22:11:36 INFO spark.SparkContext: Added JAR file:/home/rokshan.jahan/adamproject/adam/bin/../adam-assembly/target/adam-assembly_2.10-0.23.0-SNAPSHOT.jar at spark://10.48.3.64:43439/jars/adam-assembly_2.10-0.23.0-SNAPSHOT.jar with timestamp 1506219096638
17/09/23 22:11:36 INFO client.RMProxy: Connecting to ResourceManager at ip-10-48-3-5.ips.local/10.48.3.5:8032
17/09/23 22:11:37 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
17/09/23 22:11:37 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (28672 MB per container)
17/09/23 22:11:37 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
17/09/23 22:11:37 INFO yarn.Client: Setting up container launch context for our AM
17/09/23 22:11:37 INFO yarn.Client: Setting up the launch environment for our AM container
17/09/23 22:11:37 INFO yarn.Client: Preparing resources for our AM container
17/09/23 22:11:38 INFO yarn.YarnSparkHadoopUtil: getting token for namenode: hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/.sparkStaging/application_1506021801259_1567
17/09/23 22:11:38 INFO hdfs.DFSClient: Created token for rokshan.jahan: HDFS_DELEGATION_TOKEN owner=rokshan.jahan@IPS.LOCAL, renewer=yarn, realUser=, issueDate=1506219098209, maxDate=1506823898209, sequenceNumber=92917, masterKeyId=759 on 10.48.3.5:8020
17/09/23 22:11:39 INFO hive.metastore: Trying to connect to metastore with URI thrift://ip-10-48-3-5.ips.local:9083
17/09/23 22:11:39 INFO hive.metastore: Opened a connection to metastore, current connections: 1
17/09/23 22:11:39 INFO hive.metastore: Connected to metastore.
17/09/23 22:11:39 INFO hive.metastore: Closed a connection to metastore, current connections: 0
17/09/23 22:11:39 INFO yarn.Client: Uploading resource file:/data1/tmp/spark-756bda7b-3039-4204-9c45-00f1ed423a35/__spark_conf__7564382871247080826.zip -> hdfs://ip-10-48-3-5.ips.local:8020/user/rokshan.jahan/.sparkStaging/application_1506021801259_1567/__spark_conf__7564382871247080826.zip
17/09/23 22:11:40 INFO spark.SecurityManager: Changing view acls to: rokshan.jahan
17/09/23 22:11:40 INFO spark.SecurityManager: Changing modify acls to: rokshan.jahan
17/09/23 22:11:40 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rokshan.jahan); users with modify permissions: Set(rokshan.jahan)
17/09/23 22:11:40 INFO yarn.Client: Submitting application 1567 to ResourceManager
17/09/23 22:11:40 INFO impl.YarnClientImpl: Submitted application application_1506021801259_1567
17/09/23 22:11:41 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:41 INFO yarn.Client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: N/A
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.users.rokshan_dot_jahan
start time: 1506219100129
final status: UNDEFINED
tracking URL: http://ip-10-48-3-5.ips.local:8088/proxy/application_1506021801259_1567/
user: rokshan.jahan
17/09/23 22:11:42 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:43 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:44 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
17/09/23 22:11:44 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ip-10-48-3-5.ips.local, PROXY_URI_BASES -> http://ip-10-48-3-5.ips.local:8088/proxy/application_1506021801259_1567), /proxy/application_1506021801259_1567
17/09/23 22:11:44 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
17/09/23 22:11:44 INFO yarn.Client: Application report for application_1506021801259_1567 (state: ACCEPTED)
17/09/23 22:11:45 INFO yarn.Client: Application report for application_1506021801259_1567 (state: RUNNING)
17/09/23 22:11:45 INFO yarn.Client:
client token: Token { kind: YARN_CLIENT_TOKEN, service: }
diagnostics: N/A
ApplicationMaster host: 10.48.3.65
ApplicationMaster RPC port: 0
queue: root.users.rokshan_dot_jahan
start time: 1506219100129
final status: UNDEFINED
tracking URL: http://ip-10-48-3-5.ips.local:8088/proxy/application_1506021801259_1567/
user: rokshan.jahan
17/09/23 22:11:45 INFO cluster.YarnClientSchedulerBackend: Application application_1506021801259_1567 has started running.
17/09/23 22:11:45 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 33559.
17/09/23 22:11:45 INFO netty.NettyBlockTransferService: Server created on 33559
17/09/23 22:11:45 INFO storage.BlockManager: external shuffle service port = 7337
17/09/23 22:11:45 INFO storage.BlockManagerMaster: Trying to register BlockManager
17/09/23 22:11:45 INFO storage.BlockManagerMasterEndpoint: Registering block manager 10.48.3.64:33559 with 530.0 MB RAM, BlockManagerId(driver, 10.48.3.64, 33559)
17/09/23 22:11:45 INFO storage.BlockManagerMaster: Registered BlockManager
17/09/23 22:11:45 INFO scheduler.EventLoggingListener: Logging events to hdfs://ip-10-48-3-5.ips.local:8020/user/spark/applicationHistory/application_1506021801259_1567
17/09/23 22:11:45 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
17/09/23 22:11:45 INFO rdd.ADAMContext: Loading hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam as Parquet of AlignmentRecords.
17/09/23 22:11:45 INFO rdd.ADAMContext: Reading the ADAM file at hdfs://ip-10-48-3-5.ips.local:8020/genomics/wes_data/wes_data_aligned_adam/aln.adam to create RDD
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 302.5 KB, free 529.7 MB)
17/09/23 22:11:46 INFO serialization.ADAMKryoRegistrator: Did not find Spark internal class. This is expected for Spark 1.
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 28.0 KB, free 529.7 MB)
17/09/23 22:11:46 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.48.3.64:33559 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:46 INFO spark.SparkContext: Created broadcast 0 from newAPIHadoopFile at ADAMContext.scala:1257
17/09/23 22:11:46 INFO cli.TransformAlignments: Recalibrating base qualities
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 128.0 B, free 529.7 MB)
17/09/23 22:11:46 INFO serialization.ADAMKryoRegistrator: Did not find Spark internal class. This is expected for Spark 1.
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 32.0 B, free 529.7 MB)
17/09/23 22:11:46 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.48.3.64:33559 (size: 32.0 B, free: 530.0 MB)
17/09/23 22:11:46 INFO spark.SparkContext: Created broadcast 1 from broadcast at TransformAlignments.scala:272
17/09/23 22:11:46 INFO hdfs.DFSClient: Created token for rokshan.jahan: HDFS_DELEGATION_TOKEN owner=rokshan.jahan@IPS.LOCAL, renewer=yarn, realUser=, issueDate=1506219106860, maxDate=1506823906860, sequenceNumber=92918, masterKeyId=759 on 10.48.3.5:8020
17/09/23 22:11:46 INFO security.TokenCache: Got dt for hdfs://ip-10-48-3-5.ips.local:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 10.48.3.5:8020, Ident: (token for rokshan.jahan: HDFS_DELEGATION_TOKEN owner=rokshan.jahan@IPS.LOCAL, renewer=yarn, realUser=, issueDate=1506219106860, maxDate=1506823906860, sequenceNumber=92918, masterKeyId=759)
17/09/23 22:11:46 INFO input.FileInputFormat: Total input paths to process : 5
17/09/23 22:11:46 INFO spark.SparkContext: Starting job: reduceByKeyLocally at BaseQualityRecalibration.scala:94
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Got job 0 (reduceByKeyLocally at BaseQualityRecalibration.scala:94) with 5 output partitions
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduceByKeyLocally at BaseQualityRecalibration.scala:94)
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Parents of final stage: List()
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Missing parents: List()
17/09/23 22:11:46 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[5] at reduceByKeyLocally at BaseQualityRecalibration.scala:94), which has no missing parents
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 4.7 KB, free 529.7 MB)
17/09/23 22:11:46 INFO serialization.ADAMKryoRegistrator: Did not find Spark internal class. This is expected for Spark 1.
17/09/23 22:11:46 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.5 KB, free 529.7 MB)
17/09/23 22:11:46 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.48.3.64:33559 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:46 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
17/09/23 22:11:47 INFO scheduler.DAGScheduler: Submitting 5 missing tasks from ResultStage 0 (MapPartitionsRDD[5] at reduceByKeyLocally at BaseQualityRecalibration.scala:94)
17/09/23 22:11:47 INFO cluster.YarnScheduler: Adding task set 0.0 with 5 tasks
17/09/23 22:11:48 INFO spark.ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 1)
17/09/23 22:11:49 INFO spark.ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 3)
17/09/23 22:11:50 INFO spark.ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 5)
17/09/23 22:11:51 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-65.ips.local:43752) with ID 3
17/09/23 22:11:51 INFO spark.ExecutorAllocationManager: New executor 3 has registered (new total is 1)
17/09/23 22:11:51 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:51 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-65.ips.local:36529 with 530.0 MB RAM, BlockManagerId(3, ip-10-48-3-65.ips.local, 36529)
17/09/23 22:11:52 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-10-48-3-65.ips.local:36529 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:52 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-12.ips.local:59610) with ID 1
17/09/23 22:11:52 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:52 INFO spark.ExecutorAllocationManager: New executor 1 has registered (new total is 2)
17/09/23 22:11:52 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-12.ips.local:33231 with 530.0 MB RAM, BlockManagerId(1, ip-10-48-3-12.ips.local, 33231)
17/09/23 22:11:52 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-65.ips.local:43766) with ID 5
17/09/23 22:11:52 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, ip-10-48-3-65.ips.local, executor 5, partition 2, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:52 INFO spark.ExecutorAllocationManager: New executor 5 has registered (new total is 3)
17/09/23 22:11:52 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-65.ips.local:33969 with 530.0 MB RAM, BlockManagerId(5, ip-10-48-3-65.ips.local, 33969)
17/09/23 22:11:53 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-10-48-3-12.ips.local:33231 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:53 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on ip-10-48-3-65.ips.local:33969 (size: 2.5 KB, free: 530.0 MB)
17/09/23 22:11:53 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-48-3-65.ips.local:36529 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:54 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-48-3-12.ips.local:33231 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:54 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on ip-10-48-3-65.ips.local:33969 (size: 28.0 KB, free: 530.0 MB)
17/09/23 22:11:55 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-64.ips.local:57132) with ID 4
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, ip-10-48-3-64.ips.local, executor 4, partition 3, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO spark.ExecutorAllocationManager: New executor 4 has registered (new total is 4)
17/09/23 22:11:55 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-48-3-65.ips.local:36529 (size: 32.0 B, free: 530.0 MB)
17/09/23 22:11:55 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-64.ips.local:40355 with 530.0 MB RAM, BlockManagerId(4, ip-10-48-3-64.ips.local, 40355)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 4.0 in stage 0.0 (TID 4, ip-10-48-3-65.ips.local, executor 3, partition 4, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

17/09/23 22:11:55 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on ip-10-48-3-12.ips.local:33231 (size: 32.0 B, free: 530.0 MB)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 0.0 (TID 5, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 4.0 in stage 0.0 (TID 4) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 1]
17/09/23 22:11:55 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ip-10-48-3-64.ips.local:57134) with ID 2
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 4.1 in stage 0.0 (TID 6, ip-10-48-3-64.ips.local, executor 2, partition 4, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 2]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 1.1 in stage 0.0 (TID 7, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO spark.ExecutorAllocationManager: New executor 2 has registered (new total is 5)
17/09/23 22:11:55 INFO storage.BlockManagerMasterEndpoint: Registering block manager ip-10-48-3-64.ips.local:33124 with 530.0 MB RAM, BlockManagerId(2, ip-10-48-3-64.ips.local, 33124)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 0.0 (TID 5) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 3]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 0.0 (TID 8, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.1 in stage 0.0 (TID 7) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 4]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 1.2 in stage 0.0 (TID 9, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 0.2 in stage 0.0 (TID 8) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 5]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3, partition 0, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.2 in stage 0.0 (TID 9) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 6]
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Starting task 1.3 in stage 0.0 (TID 11, ip-10-48-3-12.ips.local, executor 1, partition 1, NODE_LOCAL, 2319 bytes)
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 0.0 (TID 10) on ip-10-48-3-65.ips.local, executor 3: java.util.NoSuchElementException (key not found: null) [duplicate 7]
17/09/23 22:11:55 ERROR scheduler.TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
17/09/23 22:11:55 INFO scheduler.TaskSetManager: Lost task 1.3 in stage 0.0 (TID 11) on ip-10-48-3-12.ips.local, executor 1: java.util.NoSuchElementException (key not found: null) [duplicate 8]
17/09/23 22:11:55 INFO cluster.YarnScheduler: Cancelling stage 0
17/09/23 22:11:55 INFO cluster.YarnScheduler: Stage 0 was cancelled
17/09/23 22:11:55 INFO scheduler.DAGScheduler: ResultStage 0 (reduceByKeyLocally at BaseQualityRecalibration.scala:94) failed in 8.749 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
[same worker stack trace as above]

Driver stacktrace:
17/09/23 22:11:55 WARN spark.ExecutorAllocationManager: No stages are running, but numRunningTasks != 0
17/09/23 22:11:55 INFO scheduler.DAGScheduler: Job 0 failed: reduceByKeyLocally at BaseQualityRecalibration.scala:94, took 9.032743 s
Command body threw exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
[same worker stack trace as above]

Driver stacktrace:
17/09/23 22:11:55 INFO cli.TransformAlignments: Overall Duration: 21.3 secs
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 10, ip-10-48-3-65.ips.local, executor 3): java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1433)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1421)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1420)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1420)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1644)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1603)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1592)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1840)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1960)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1025)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:1007)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1.apply(PairRDDFunctions.scala:363)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1.apply(PairRDDFunctions.scala:339)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.PairRDDFunctions.reduceByKeyLocally(PairRDDFunctions.scala:339)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration.(BaseQualityRecalibration.scala:94)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.apply(BaseQualityRecalibration.scala:346)
at org.bdgenomics.adam.rdd.read.AlignmentRecordRDD$$anonfun$recalibrateBaseQualities$1.apply(AlignmentRecordRDD.scala:980)
at org.bdgenomics.adam.rdd.read.AlignmentRecordRDD$$anonfun$recalibrateBaseQualities$1.apply(AlignmentRecordRDD.scala:980)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.AlignmentRecordRDD.recalibrateBaseQualities(AlignmentRecordRDD.scala:979)
at org.bdgenomics.adam.cli.TransformAlignments.maybeRecalibrate(TransformAlignments.scala:276)
at org.bdgenomics.adam.cli.TransformAlignments.apply(TransformAlignments.scala:391)
at org.bdgenomics.adam.cli.TransformAlignments.run(TransformAlignments.scala:548)
at org.bdgenomics.utils.cli.BDGSparkCommand$class.run(BDGCommand.scala:55)
at org.bdgenomics.adam.cli.TransformAlignments.run(TransformAlignments.scala:138)
at org.bdgenomics.adam.cli.ADAMMain.apply(ADAMMain.scala:126)
at org.bdgenomics.adam.cli.ADAMMain$.main(ADAMMain.scala:65)
at org.bdgenomics.adam.cli.ADAMMain.main(ADAMMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.NoSuchElementException: key not found: null
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:58)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:58)
at org.bdgenomics.adam.models.RecordGroupDictionary.getIndex(RecordGroupDictionary.scala:123)
at org.bdgenomics.adam.rdd.read.recalibration.CovariateSpace$.apply(CovariateSpace.scala:82)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1$$anonfun$apply$1.apply(BaseQualityRecalibration.scala:300)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:299)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$observe$1.apply(BaseQualityRecalibration.scala:295)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.observe(BaseQualityRecalibration.scala:295)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:323)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe$1.apply(BaseQualityRecalibration.scala:320)
at scala.Option.fold(Option.scala:157)
at org.apache.spark.rdd.Timer.time(Timer.scala:48)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$.org$bdgenomics$adam$rdd$read$recalibration$BaseQualityRecalibration$$observe(BaseQualityRecalibration.scala:320)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:71)
at org.bdgenomics.adam.rdd.read.recalibration.BaseQualityRecalibration$$anonfun$2.apply(BaseQualityRecalibration.scala:69)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:348)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKeyLocally$1$$anonfun$3.apply(PairRDDFunctions.scala:346)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:242)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Sep 23, 2017 10:11:46 PM INFO: org.apache.parquet.hadoop.ParquetInputFormat: Total input paths to process : 5
17/09/23 22:11:55 INFO spark.SparkContext: Invoking stop() from shutdown hook
17/09/23 22:11:56 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2, ip-10-48-3-65.ips.local, executor 5): TaskKilled (killed intentionally)
17/09/23 22:11:56 INFO ui.SparkUI: Stopped Spark web UI at http://10.48.3.64:4040
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
17/09/23 22:11:56 INFO cluster.YarnClientSchedulerBackend: Stopped
17/09/23 22:11:56 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/09/23 22:11:56 ERROR server.TransportRequestHandler: Error sending result StreamResponse{streamId=/jars/adam-assembly_2.10-0.23.0-SNAPSHOT.jar, byteCount=61114166, body=FileSegmentManagedBuffer{file=/home/rokshan.jahan/adamproject/adam/bin/../adam-assembly/target/adam-assembly_2.10-0.23.0-SNAPSHOT.jar, offset=0, length=61114166}} to /10.48.3.64:57146; closing connection
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at org.apache.spark.network.buffer.LazyFileRegion.transferTo(LazyFileRegion.java:96)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:98)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:254)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:237)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:281)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:761)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.forceFlush(AbstractNioChannel.java:317)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:519)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:748)
17/09/23 22:11:56 INFO storage.MemoryStore: MemoryStore cleared
17/09/23 22:11:56 INFO storage.BlockManager: BlockManager stopped
17/09/23 22:11:56 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
17/09/23 22:11:56 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/09/23 22:11:56 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
17/09/23 22:11:56 INFO spark.SparkContext: Successfully stopped SparkContext
17/09/23 22:11:56 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
17/09/23 22:11:56 INFO util.ShutdownHookManager: Shutdown hook called
17/09/23 22:11:56 INFO util.ShutdownHookManager: Deleting directory /data1/tmp/spark-756bda7b-3039-4204-9c45-00f1ed423a35

@Rokshan2016

Rokshan2016 commented Sep 24, 2017

This is the whole error message.

I am using BWA for alignment.

@fnothaft

Member

fnothaft commented Sep 24, 2017

Hi @Rokshan2016! ADAM relies on read group information being attached; you need to specify the @RG line when you run BWA. This thread may have the instructions you need. You can also add this in ADAM; I am not at my desk right now, but can send instructions tomorrow.
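
For reference, a read group can be attached at alignment time with `bwa mem`'s `-R` option (a real bwa flag). The reference, FASTQ paths, and sample/library IDs below are placeholders; substitute your own:

```shell
# Placeholder paths and IDs -- substitute your own sample/library names.
# -R supplies the @RG header line; the \t escapes become literal tabs in the header.
RG='@RG\tID:sample1\tSM:sample1\tLB:lib1\tPL:ILLUMINA'
# Guarded so the snippet is a no-op where bwa or the inputs are absent.
if command -v bwa >/dev/null 2>&1 && [ -e ref.fa ]; then
  bwa mem -R "$RG" ref.fa reads_1.fq reads_2.fq > aln_with_rg.sam
fi
```

The resulting SAM then carries an `@RG` header line, so reads loaded into ADAM have read group metadata attached and the `RecordGroupDictionary` lookup in BQSR has something to resolve.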

@Rokshan2016

Rokshan2016 commented Sep 24, 2017

Ok, I will try that. And also, if you could please send me the instructions, that would be really helpful!

Thank you!

@Rokshan2016

Rokshan2016 commented Sep 24, 2017

Hi,
Is there any alignment tool that does not require specifying @RG?

Thanks

@fnothaft

Member

fnothaft commented Sep 25, 2017

Hi @Rokshan2016! Most alignment tools don't require the @RG line to be specified, but if you do not have read group descriptions attached to your data, this will cause some of the downstream analyses (like BQSR and duplicate marking) to fail.

And also, if you could please send me the instructions, that would be really helpful!

What you should be able to do is:

import org.bdgenomics.adam.models.RecordGroup
import org.bdgenomics.formats.avro.AlignmentRecord

val sampleId = "your_sample_id_here"
val reads = sc.loadAlignments("path/to/my/reads.adam")

reads.addRecordGroup(RecordGroup(sampleId, sampleId))
  .transform(_.map(r => {
    AlignmentRecord.newBuilder(r)
      .setRecordGroupSample(sampleId)
      .setRecordGroupName(sampleId)
      .build
  }))
  .saveAsParquet("path/to/my/updated_reads.adam")

You should be able to run this from the adam-shell.
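
Once the updated reads are saved back out to SAM/BAM, a quick sanity check is to grep the header for `@RG` lines (`samtools view -H` is a standard samtools command; the file name below is a placeholder):

```shell
# Placeholder output path; run after converting the updated reads back to BAM/SAM.
# Guarded so the snippet is a no-op where samtools or the file is absent.
if command -v samtools >/dev/null 2>&1 && [ -e updated_reads.bam ]; then
  samtools view -H updated_reads.bam | grep '^@RG' || echo "no @RG lines found"
fi
```

If the header shows no `@RG` lines, BQSR and duplicate marking will hit the same `RecordGroupDictionary` failure as above.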

@fnothaft fnothaft added wontfix and removed wontfix labels Oct 7, 2017

@fnothaft

Member

fnothaft commented Oct 7, 2017

I don't think there's any more action here, so I'm closing this ticket. Please reopen if needed.

@fnothaft fnothaft closed this Oct 7, 2017

@heuermh heuermh added this to the 0.23.0 milestone Dec 7, 2017

@heuermh heuermh added this to Completed in Release 0.23.0 Jan 4, 2018
