Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to handle overlapped read clipping in MergeBamAlignment when clipping already present #709

Conversation

tfenne
Copy link
Collaborator

@tfenne tfenne commented Dec 12, 2016

Description

Fixed the clipping of overlapping reads which is currently broken in the case where the reads already contain soft-clipping. If the reads contain softclipping at their 3' ends the overlapping read clipping should add to that soft-clipping in order to stop the reads alignment's extending past each other's 5' ends. Prior to the change, the existing soft-clipping was ignored, and hence the result was incorrect.

Checklist (never delete this)

Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.

Content

  • Added or modified tests to cover changes and any new functionality
  • Edited the README / documentation (if applicable)
  • All tests passing on Travis

Review

  • Final thumbs-up from reviewer
  • Rebase, squash and reword as applicable

@tfenne tfenne requested a review from nh13 December 12, 2016 23:16
Copy link
Collaborator

@nh13 nh13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a few minor comments. I think other folks should be aware of this bug and it's fix.

@@ -590,11 +592,10 @@ private void transferAlignmentInfoToPairedRead(final SAMRecord firstUnaligned, f
* Checks to see whether the ends of the reads overlap and soft clips reads
* them if necessary.
*/
protected void clipForOverlappingReads(final SAMRecord read1, final SAMRecord read2) {
protected static void clipForOverlappingReads(final SAMRecord read1, final SAMRecord read2) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while we're here, can this be private?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it needs to be protected or package level for the tests to be able to access it.

int clipped = 0;
while (iterator.hasNext()) {
final CigarElement elem = iterator.next();
if (elem.getOperator() != CigarOperator.SOFT_CLIP) break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this will do the wrong thing if we have hard clips (ex. 120M12S8H), namely break too early. I think that's the only one though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch. I've fixed that. The only remaining issue is that it will lose the hard-clipping due to the issue here:
https://github.com/samtools/htsjdk/blob/master/src/main/java/htsjdk/samtools/util/CigarUtil.java#L46

The cigar will be correctly soft-clipped, but the hard-clip will get lost. I'm not inclined to try and fix that at the moment unless you think it's critical.

@nh13
Copy link
Collaborator

nh13 commented Dec 12, 2016

Original bug report: #708

@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 71.096% when pulling 37db695 on tfenne:tf_fix_clip_overlapping_reads_for_existing_clipping into 3dcadca on broadinstitute:master.

@nh13
Copy link
Collaborator

nh13 commented Dec 12, 2016

I can confirm this fixes the test case on #708

@tfenne
Copy link
Collaborator Author

tfenne commented Dec 13, 2016

@nh13 Second commit pushed to handle the hard-clipping. It's not ideal though, so do you want to take a second look?

@coveralls
Copy link

Coverage Status

Coverage increased (+0.02%) to 71.096% when pulling 535937c on tfenne:tf_fix_clip_overlapping_reads_for_existing_clipping into 3dcadca on broadinstitute:master.

if (elem.getOperator() != CigarOperator.SOFT_CLIP) break;
clipped = elem.getLength();
if (elem.getOperator() != CigarOperator.SOFT_CLIP && elem.getOperator() != CigarOperator.HARD_CLIP) break;
if (elem.getOperator() == CigarOperator.SOFT_CLIP) clipped = elem.getLength();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine that we will lose the hard clip, since methods in CigarUtil also note this. Perhaps add a TODO here and where this function is used that it doesn't handle hard clips?

@nh13
Copy link
Collaborator

nh13 commented Dec 15, 2016

@tfenne one minor comment about adding a note about losing the hard clips and then merge at will.

@nh13 nh13 assigned tfenne and unassigned nh13 Dec 15, 2016
@tfenne
Copy link
Collaborator Author

tfenne commented Dec 15, 2016

@ktibbett Do you want to take a peek at this at all since it affects MergeBamAlignment? It is a critical fix for anyone who is using UMIs inline in the template reads. Without this, if you have reads that are already soft clipped (because of adapters being identified), then the clipping back of reads that run off the insert doesn't work correctly. @nh13 and I would really like to get this merged ASAP but figure you should at least be aware of it.

@tfenne tfenne merged commit 79cb6d9 into broadinstitute:master Dec 19, 2016
@tfenne tfenne deleted the tf_fix_clip_overlapping_reads_for_existing_clipping branch December 19, 2016 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants