Writing to a BAM file with adamSAMSave consistently fails #721

Closed
danvk opened this Issue Jul 1, 2015 · 3 comments

danvk commented Jul 1, 2015

I'm running this code on a YARN cluster. It's trying to filter a BAM file down to just those alignments that are either on chr22 or have a mate on chr22.

override def run(args: Arguments, sc: SparkContext): Unit = {
  val filterContig = args.filterContig
  val alignments = sc.loadAlignments(args.reads)
  val matchingAlignments = alignments.filter(matchesContig(_, filterContig))
  matchingAlignments.persist()
  println("Found " + matchingAlignments.count() + " alignments with one pair in " + filterContig)
  matchingAlignments.coalesce(10).adamSAMSave(args.outputPath, asSam = false)
}
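
The `matchesContig` helper is not shown in this issue. As a rough sketch of what such a predicate might look like, the snippet below uses a simplified stand-in record type (`SimpleRead`, a hypothetical name, not ADAM's actual record class) with optional contig names for the read and its mate. The field names are assumptions made for illustration only.

```scala
// Hypothetical stand-in for an ADAM alignment record; the real record type
// and its field names are not shown in this issue.
object ContigFilter {
  final case class SimpleRead(contig: Option[String], mateContig: Option[String])

  // True if the read itself, or its mate, is aligned to `contig`.
  // Unmapped reads and mates are represented as None and never match.
  def matchesContig(read: SimpleRead, contig: String): Boolean =
    read.contig.exists(_ == contig) || read.mateContig.exists(_ == contig)
}
```

With this definition, a read on contig 1 whose mate is on contig 22 is kept when filtering for "22", while a pair that is entirely unmapped or entirely on other contigs is dropped.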

I'm consistently getting this error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 5 in stage 5.0 failed 4 times, most recent failure: Lost task 5.3 in stage 5.0 (TID 706, demeter-csmaz08-10.demeter.hpc.mssm.edu): java.lang.AssertionError: assertion failed: Cannot return header if not attached.

My command line is this:

spark-submit --master yarn --deploy-mode client --executor-memory 16g --driver-memory 10g --num-executors 1000 --executor-cores 1 --driver-java-options "-Dyarn.resourcemanager.am.max-attempts=1 -Dlog4j.configuration=scripts/log4j.properties" --class org.hammerlab.guacamole.Guacamole --verbose target/guacamole-with-dependencies-0.0.1-SNAPSHOT.jar structural-variant --reads hdfs:///datasets/dream/data/synthetic-challenge-4/synthetic.challenge.set4.tumour.bam --filter-contig 22 --out hdfs:///user/vanded03/synth4.tumor.chr22+mate.bam

(The input is from the DREAM challenge.)

Would this be expected to work? cc @ryan-williams

Contributor

arahuja commented Jul 6, 2015

I was seeing the same issue in #676 - which was apparently fixed, but I haven't checked since.

danvk commented Jul 6, 2015

@arahuja I believe that issue was specifically when you used .coalesce(1). I ran out of memory when I tried that, so I'm using .coalesce(10) and running into this issue.

Member

ryan-williams commented May 18, 2016

Closing as a ~dupe of #676
