[ADAM-646] Special case reads with '*' quality during BQSR. #647

Merged
massie merged 2 commits into bigdatagenomics:master from fnothaft:allow-asterisk-bqsr on Apr 9, 2015

fnothaft commented Apr 9, 2015

Resolves #646. Allows the creation of DecadentReads with * quality scores. These reads are then not observed or corrected during BQSR.

massie commented Apr 9, 2015

This looks good, but should we just set qualityString to null if it's * and then just do a null check?

fnothaft commented Apr 9, 2015

Are you suggesting to do that check in the SAM/BAM<->ADAM converters?

AmplabJenkins Apr 9, 2015

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/670/

Build result: FAILURE

GitHub pull request #647 of commit f6ce721 automatically merged.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/647/merge^{commit} # timeout=10
 > git branch -a --contains d7e55c115cfc9f4de7289144d2506ea006bf3237 # timeout=10
 > git rev-parse remotes/origin/pr/647/merge^{commit} # timeout=10
Checking out Revision d7e55c115cfc9f4de7289144d2506ea006bf3237 (origin/pr/647/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d7e55c115cfc9f4de7289144d2506ea006bf3237
First time build. Skipping changelog.
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
Triggering ADAM-prb » 1.0.4,centos
ADAM-prb » 2.2.0,centos completed with result SUCCESS
ADAM-prb » 2.3.0,centos completed with result FAILURE
ADAM-prb » 1.0.4,centos completed with result SUCCESS
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

massie commented Apr 9, 2015

Yes, when we convert from BAM to ADAM, simply set the qualityString to null if it's *. It will be more compact (we save two bytes for each read) and doesn't require any string comparisons (albeit the string isn't very long :)).
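
The null-on-conversion idea described above could be sketched in Scala roughly as follows. This is only an illustration under stated assumptions: `QualityNormalizer`, `normalizeQual` and `hasQualities` are hypothetical names and are not taken from ADAM's converter code.

```scala
// Illustrative sketch only; these names are hypothetical, not ADAM's converter API.
object QualityNormalizer {
  // SAM writes a lone "*" in the QUAL column when base qualities are not stored.
  val MissingQual: String = "*"

  /** On the SAM/BAM -> ADAM path: store null instead of the "*" placeholder. */
  def normalizeQual(samQual: String): String =
    if (samQual == null || samQual == MissingQual) null else samQual

  /** Later stages can then skip reads without qualities using a plain null check. */
  def hasQualities(qual: String): Boolean = qual != null
}
```

With this shape, the string comparison against "*" happens once at conversion time, and anything downstream only needs the null check.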

fnothaft commented Apr 9, 2015

That's a good idea. Let me refactor that.

fnothaft commented Apr 9, 2015

Updated with the null-on-conversion change.

AmplabJenkins Apr 9, 2015

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/671/

Build result: FAILURE

GitHub pull request #647 of commit 608d5f7 automatically merged.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/647/merge^{commit} # timeout=10
 > git branch -a --contains 356a6d6711a5a558e7df6c94cd0b427d82a69d58 # timeout=10
 > git rev-parse remotes/origin/pr/647/merge^{commit} # timeout=10
Checking out Revision 356a6d6711a5a558e7df6c94cd0b427d82a69d58 (origin/pr/647/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 356a6d6711a5a558e7df6c94cd0b427d82a69d58
First time build. Skipping changelog.
Triggering ADAM-prb » 2.2.0,centos
Triggering ADAM-prb » 2.3.0,centos
Triggering ADAM-prb » 1.0.4,centos
ADAM-prb » 2.2.0,centos completed with result SUCCESS
ADAM-prb » 2.3.0,centos completed with result FAILURE
ADAM-prb » 1.0.4,centos completed with result SUCCESS
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

fnothaft commented Apr 9, 2015

Jenkins, retest this please.

Looks like some issue with a JAR not being pulled down in the Hadoop 2.2 build.

...main/scala/org/bdgenomics/adam/converters/AlignmentRecordConverter.scala (outdated)
@@ -79,7 +79,7 @@ class AlignmentRecordConverter extends Serializable {
     // set canonically necessary fields
     builder.setReadName(adamRecord.getReadName.toString)
     builder.setReadString(adamRecord.getSequence)
-    builder.setBaseQualityString(adamRecord.getQual)
+    Option(adamRecord.getQual).fold(builder.setBaseQualityString("*"))(s => builder.setBaseQualityString(s))

@massie

massie Apr 9, 2015

Member

Simple pattern matching here would prevent double-setting the qualityString (unless I'm misreading this) and prevent allocating objects we won't use.
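
A pattern-matching version of that line might look like the sketch below. It is written against a stand-in builder so that it is self-contained: `FakeRecordBuilder`, `QualToSam` and `writeQual` are hypothetical names for illustration, not the classes used in AlignmentRecordConverter.

```scala
// Stand-in for the record builder used in the converter; hypothetical, for illustration only.
final class FakeRecordBuilder {
  private var qual: String = null
  var setCalls: Int = 0 // counts how often the quality string was set

  def setBaseQualityString(q: String): FakeRecordBuilder = {
    qual = q
    setCalls += 1
    this
  }

  def baseQualityString: String = qual
}

object QualToSam {
  /** Write stored qualities back out, using "*" when none were stored (null). */
  def writeQual(builder: FakeRecordBuilder, adamQual: String): FakeRecordBuilder =
    adamQual match {
      case null => builder.setBaseQualityString("*")
      case q    => builder.setBaseQualityString(q)
    }
}
```

Exactly one branch runs per record and no `Option` wrapper is allocated. For what it is worth, `Option.fold` takes its first argument by name, so the `fold` form should not set the field twice either; the match mainly avoids the extra allocation and reads more directly.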

@fnothaft

fnothaft Apr 9, 2015

Member

Ah, sure.

@fnothaft

fnothaft Apr 9, 2015

Member

Fixed.

massie commented Apr 9, 2015

Other than one nit, this looks good to me.

AmplabJenkins Apr 9, 2015

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/672/
Test PASSed.

massie added a commit that referenced this pull request Apr 9, 2015

Merge pull request #647 from fnothaft/allow-asterisk-bqsr
[ADAM-646] Special case reads with '*' quality during BQSR.

massie merged commit 4c615da into bigdatagenomics:master on Apr 9, 2015

1 check passed

default Merged build finished.

massie commented Apr 9, 2015

Thanks, Frank!

Jaeki commented Apr 10, 2015

Thanks, Frank!
I tested it on my cluster with a small data set, and it works well.

fnothaft commented Apr 10, 2015

Great! Glad to hear it @Jaeki!

fnothaft deleted the fnothaft:allow-asterisk-bqsr branch on Apr 10, 2015

Jaeki commented Apr 11, 2015

@fnothaft Could you check the following case?
I got a similar error when running BQSR. The input SAM file was aligned with SNAP (NA12878).

15/04/11 12:35:38 INFO DAGScheduler: Job 1 failed: aggregate at BaseQualityRecalibration.scala:84, took 3.648340 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 45 in stage 1.0 failed 4 times, most recent failure: Lost task 45.3 in stage 1.0 (TID 111, node-120): java.lang.IllegalArgumentException: Error "requirement failed" while constructing DecadentRead from Read({"contig": {"contigName": "chrY", "contigLength": 59373566, "contigMD5": null, "referenceURL": null, "assembly": null, "species": null}, "start": 13833306, "oldPosition": null, "end": 13833411, "mapq": 23, "readName": "ERR032977_24245808", "sequence": "AAATGGAACGAAGTGGAATCGAGTGGAATGGAATCGAATGGAGTGAAATGGAATGGAATGGACGCGAAAGAATGGACTGGAACAAAATGAAATCGAACGGT", "qual": "CCCCCCCCCCCCCCCCCBCCCCCDCCCCCCCDCCCC@DCCCBCBCBBBCCCABCCCBDBCCDCCBCABBC?@@A@BABBBDBD@D<8;BB8?:@@d@B>>1", "cigar": "69M1D35=", "oldCigar": null, "basesTrimmedFromStart": 0, "basesTrimmedFromEnd": 0, "readPaired": true, "properPair": true, "readMapped": true, "mateMapped": true, "firstOfPair": false, "secondOfPair": true, "failedVendorQualityChecks": false, "duplicateRead": false, "readNegativeStrand": false, "mateNegativeStrand": true, "primaryAlignment": true, "secondaryAlignment": false, "supplementaryAlignment": false, "mismatchingPositions": null, "origQual": null, "attributes": "PU:Z:pu\tSM:Z:sm\tNM:i:14\tPL:Z:Illumina\tRG:Z:FASTQ\tPG:Z:SNAP\tLB:Z:lb", "recordGroupName": "FASTQ", "recordGroupSequencingCenter": null, "recordGroupDescription": null, "recordGroupRunDateEpoch": null, "recordGroupFlowOrder": null, "recordGroupKeySequence": null, "recordGroupLibrary": "lb", "recordGroupPredictedMedianInsertSize": null, "recordGroupPlatform": "Illumina", "recordGroupPlatformUnit": "pu", "recordGroupSample": "sm", "mateAlignmentStart": 13869382, "mateAlignmentEnd": null, "mateContig": {"contigName": "chrY", "contigLength": 59373566, "contigMD5": null, "referenceURL": null, "assembly": null, "species": null}})
at org.bdgenomics.adam.rich.DecadentRead$.apply(DecadentRead.scala:40)
at org.bdgenomics.adam.rich.DecadentRead$.apply(DecadentRead.scala:32)
at org.bdgenomics.adam.rich.DecadentRead$$anonfun$cloy$1.apply(DecadentRead.scala:50)
at org.bdgenomics.adam.rich.DecadentRead$$anonfun$cloy$1.apply(DecadentRead.scala:50)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:144)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.aggregate(TraversableOnce.scala:201)
at scala.collection.AbstractIterator.aggregate(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$22.apply(RDD.scala:901)
at org.apache.spark.rdd.RDD$$anonfun$22.apply(RDD.scala:901)
at org.apache.spark.SparkContext$$anonfun$29.apply(SparkContext.scala:1355)
at org.apache.spark.SparkContext$$anonfun$29.apply(SparkContext.scala:1355)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:221)
at org.bdgenomics.adam.rich.DecadentRead.<init>(DecadentRead.scala:71)
at org.bdgenomics.adam.rich.DecadentRead$.apply(DecadentRead.scala:36)
... 23 more

massie commented Apr 11, 2015

The exact line of the error is shown in this stack trace.

at org.bdgenomics.adam.rich.DecadentRead.<init>(DecadentRead.scala:71)

Your sequence is 101 bases long (wc -c reports 102 because it also counts the newline that echo appends)...

$ echo "AAATGGAACGAAGTGGAATCGAGTGGAATGGAATCGAATGGAGTGAAATGGAATGGAATGGACGCGAAAGAATGGACTGGAACAAAATGAAATCGAACGGT" | wc -c
102

... but the difference between your reference start and end position is 13833411 - 13833306 = 105. The cigar string is 69M1D35=, which agrees with that reference span: 69+1+35 = 105. Since the 1D deletion consumes no read bases, the cigar expects 69+35 = 104 bases in the sequence.

The sequence is missing 3 bases.
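
The length comparison above can be reproduced with a small Scala sketch, assuming standard SAM CIGAR semantics (M, I, S, = and X consume read bases; M, D, N, = and X consume reference positions). `CigarLengths` and its methods are hypothetical names, and this is not the check performed in DecadentRead.scala. Two details matter when redoing the arithmetic by hand: `wc -c` also counts the newline that `echo` appends, and a D operation adds to the reference span but not to the read length.

```scala
// Illustrative sketch; names are hypothetical and not part of ADAM.
object CigarLengths {
  // One CIGAR element: a decimal length followed by a single operator character.
  private val Element = """(\d+)([MIDNSHP=X])""".r

  private val ConsumesRead = Set('M', 'I', 'S', '=', 'X')
  private val ConsumesReference = Set('M', 'D', 'N', '=', 'X')

  /** Parse a CIGAR string into (length, operator) pairs. */
  def parse(cigar: String): List[(Int, Char)] =
    Element.findAllMatchIn(cigar).map(m => (m.group(1).toInt, m.group(2).head)).toList

  /** Number of bases the CIGAR expects in the stored read sequence. */
  def readLength(cigar: String): Int =
    parse(cigar).collect { case (n, op) if ConsumesRead(op) => n }.sum

  /** Number of reference positions the alignment covers. */
  def referenceLength(cigar: String): Int =
    parse(cigar).collect { case (n, op) if ConsumesReference(op) => n }.sum
}
```

For the read in the report, `CigarLengths.readLength("69M1D35=")` gives 104 and `CigarLengths.referenceLength("69M1D35=")` gives 105, while the stored sequence has 101 characters, so the 3-base gap remains.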

Jaeki commented Apr 13, 2015

@massie Thank you for your comment. Do you think the 3 missing bases come from SNAP? Here is what I did: I downloaded the NA12878 reads from the web, aligned them with SNAP, converted the SAM file to ADAM, and then ran BQSR with ADAM.
