AlignmentRecord.mateAlignmentEnd never set #1290

Closed
ryan-williams opened this Issue Nov 24, 2016 · 3 comments

ryan-williams commented Nov 24, 2016

Gotcha I just ran into: AlignmentRecord.mateAlignmentEnd is never set in ADAM.

That field doesn't exist on SAMRecords, and afaict mate's end is not inferable in e.g. SAMRecordConverter where the mate-start is set.

My use-case was mimicking samtools view's region-filtering behavior where unmapped reads will be included if their mate is mapped to an included region.

Having dug into it further, samtools view only counts unmapped reads as existing at the one-locus position indicated by mateAlignmentStart.

I have some code that I may turn into a PR. It constructs a ReferenceRegion representing an unmapped AlignmentRecord's mate, setting the end to one more than the start position, to be consistent with the samtools view behavior described above.
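To make the idea concrete, here is a minimal sketch in Python rather than ADAM's Scala. Everything in it is a hypothetical stand-in: Region plays the role of a half-open ReferenceRegion, Read is a trimmed-down AlignmentRecord, and mate_region and keep are illustrative helper names, not ADAM or samtools APIs. It only models the behavior as I understand it from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """Half-open interval [start, end) on a contig (stand-in for ReferenceRegion)."""
    contig: str
    start: int
    end: int

    def overlaps(self, other: "Region") -> bool:
        # Two half-open intervals overlap when each starts before the other ends.
        return (self.contig == other.contig
                and self.start < other.end
                and other.start < self.end)


@dataclass(frozen=True)
class Read:
    """Hypothetical, trimmed-down stand-in for AlignmentRecord."""
    mapped: bool
    contig: Optional[str] = None
    start: Optional[int] = None
    end: Optional[int] = None
    mate_mapped: bool = False
    mate_contig: Optional[str] = None
    mate_start: Optional[int] = None  # there is deliberately no mate_end


def mate_region(read: Read) -> Optional[Region]:
    """One-locus region at the mate's start, or None if the mate is not placed."""
    if not read.mate_mapped or read.mate_contig is None or read.mate_start is None:
        return None
    # The mate's true end is unknown, so use start + 1 (a single locus).
    return Region(read.mate_contig, read.mate_start, read.mate_start + 1)


def keep(read: Read, query: Region) -> bool:
    """Region filter: mapped reads by their own span, unmapped reads by mate locus."""
    if read.mapped:
        return Region(read.contig, read.start, read.end).overlaps(query)
    region = mate_region(read)
    return region is not None and region.overlaps(query)
```

With a query of Region("chr1", 100, 200), an unmapped read whose mate starts at 150 is kept, while one whose mate starts at 200 is not, because the query interval is half-open.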

I am considering whether calling .setMateAlignmentEnd(mateStart + 1) in SAMRecordConverter (on unmapped reads only?) makes sense. It's technically not a correct value for the mate's end, so I think it's better to leave that decision higher in the stack, e.g. to code that explicitly builds a ReferenceRegion for an unmapped read's mate, as I described above.

In that case, there may be nothing actionable here, though it might be worth documenting somewhere that mateAlignmentEnd is never set. Alternatively, the field could be removed, or at least documented as having a different set/null contract than the similarly named mateAlignmentStart and mateContigName.

fnothaft commented Nov 24, 2016

Yeah, I ran into this several months back in bigdatagenomics/quinine#38 as well. I think your ReferenceRegion addition makes sense. WRT SAMRecordConverter, my preference would be to nix the mateAlignmentEnd field upstream in bdg-formats.

heuermh commented Nov 24, 2016

nice catch! +1 to removing mateAlignmentEnd field

ryan-williams referenced this issue in bigdatagenomics/bdg-formats Nov 24, 2016: Remove AlignmentRecord.mateAlignmentEnd #115 (Closed)
