New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ReferenceRegion from ADAMRecord #226

Merged
merged 1 commit into from Apr 28, 2014

Conversation

Projects
None yet
3 participants
@arahuja
Contributor

arahuja commented Apr 23, 2014

ReferenceRegion from ADAMRecord currently treats the end as inclusive and therefore creates a region that is one base too long.

@AmplabJenkins

This comment has been minimized.

Show comment
Hide comment
@AmplabJenkins

AmplabJenkins Apr 23, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/300/

AmplabJenkins commented Apr 23, 2014

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/ADAM-prb/300/

@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Apr 24, 2014

Member

@arahuja Unless I misunderstand, the end of a reference region is supposed to be inclusive; this is correct behavior.

Member

fnothaft commented Apr 24, 2014

@arahuja Unless I misunderstand, the end of a reference region is supposed to be inclusive; this is correct behavior.

@arahuja

This comment has been minimized.

Show comment
Hide comment
@arahuja

arahuja Apr 24, 2014

Contributor

To clarify there are two "end" fields, one in RichADAMRecord and one in ReferenceRegion.

For ReferenceRegion - the docs state:

  • @param start The 0-based residue-coordinate for the start of the region
  • @param end The 0-based residue-coordinate for the first residue after the start
  •        which is <i>not</i> in the region -- i.e. [start, end) define a 0-based
    
  •        half-open interval.
    

RichADAMRecord end is also exclusive and (start, end) also define a 0-based half-open interval. Currently when you create a ReferenceRegion from a RichADAMRecord, it sets region.start = read.start and region.end = (read.end + 1). Since both are already exclusive, half-open intervals the start and end should be equal.

On other hand, the documentation on the apply(RichADAMRecord) method is somewhat conflicting:

  • @return Region corresponding to inclusive region of read alignment, if read is mapped.
Contributor

arahuja commented Apr 24, 2014

To clarify there are two "end" fields, one in RichADAMRecord and one in ReferenceRegion.

For ReferenceRegion - the docs state:

  • @param start The 0-based residue-coordinate for the start of the region
  • @param end The 0-based residue-coordinate for the first residue after the start
  •        which is <i>not</i> in the region -- i.e. [start, end) define a 0-based
    
  •        half-open interval.
    

RichADAMRecord end is also exclusive and (start, end) also define a 0-based half-open interval. Currently when you create a ReferenceRegion from a RichADAMRecord, it sets region.start = read.start and region.end = (read.end + 1). Since both are already exclusive, half-open intervals the start and end should be equal.

On other hand, the documentation on the apply(RichADAMRecord) method is somewhat conflicting:

  • @return Region corresponding to inclusive region of read alignment, if read is mapped.
@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Apr 24, 2014

Member

Ah! I see. Just to confirm my understanding then, you're saying that for the read:

Start: 0
Cigar: 6M
Read: ACATAG
Pos:  012345

Wrapping this as a RichADAMRecord, and calling end, would return 5 + 1 = 6, and that the current code to go RichADAMRecord --> ReferenceRegion then in turn (incorrectly) bumps 6 + 1 = 7.

Is my understanding of this correct?

Member

fnothaft commented Apr 24, 2014

Ah! I see. Just to confirm my understanding then, you're saying that for the read:

Start: 0
Cigar: 6M
Read: ACATAG
Pos:  012345

Wrapping this as a RichADAMRecord, and calling end, would return 5 + 1 = 6, and that the current code to go RichADAMRecord --> ReferenceRegion then in turn (incorrectly) bumps 6 + 1 = 7.

Is my understanding of this correct?

@arahuja

This comment has been minimized.

Show comment
Hide comment
@arahuja

arahuja Apr 28, 2014

Contributor

Yes @fnothaft, that seems to be the bug (or alternatively read.end is incorrectly exclusive)

Contributor

arahuja commented Apr 28, 2014

Yes @fnothaft, that seems to be the bug (or alternatively read.end is incorrectly exclusive)

fnothaft added a commit that referenced this pull request Apr 28, 2014

Merge pull request #226 from hammerlab/inclusive-end
Fix ReferenceRegion from ADAMRecord

@fnothaft fnothaft merged commit bc7a51e into bigdatagenomics:master Apr 28, 2014

1 check passed

default Merged build finished.
Details
@fnothaft

This comment has been minimized.

Show comment
Hide comment
@fnothaft

fnothaft Apr 28, 2014

Member

Great; just wanted to confirm that I was reading the code correctly! Thanks @arahuja! I've merged this.

Member

fnothaft commented Apr 28, 2014

Great; just wanted to confirm that I was reading the code correctly! Thanks @arahuja! I've merged this.

@arahuja arahuja deleted the hammerlab:inclusive-end branch Apr 28, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment