
Hotfix for various TagPileup bugs #77

Merged: WilliamKMLai merged 4 commits into master from hotfix on Jan 3, 2022


owlang (Collaborator) commented Dec 24, 2021

This pull request fixes minor shifts in the tag pileup that occur under specific conditions: a 1 bp shift on the antisense strand, a strand-specific shift correction that is no longer needed, and the way insert size is determined (#75 and #76).

Bug discovered with test data showing 1bp shift downstream in ticket #75

getUnclippedEnd() returns a 1-indexed inclusive coordinate, which must be decremented to become 0-indexed before it is assigned to the FivePrime mark variable. The assignment occurs in two places (the paired-end and non-paired-end code blocks).
Removing the code block that shifted the BED coordinates, apparently as some sort of strand-specific correction, restores the correct composite (checked with BED coordinates on different strands and with both sense and antisense composites). I suspect the removed block was compensating for code that is no longer in place.
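
The index conversion described above can be sketched in isolation. This is a minimal illustration using plain integers rather than htsjdk objects; the class and method names are hypothetical, and treating the forward-strand start the same way is an assumption not stated in this pull request.

```java
// Hypothetical sketch, not the ScriptManager code: plain ints stand in for
// the values that SAMRecord.getUnclippedStart()/getUnclippedEnd() would return.
final class FivePrimeSketch {

    /** Converts a 1-based inclusive coordinate to a 0-based coordinate. */
    static int toZeroBased(int oneBasedInclusive) {
        return oneBasedInclusive - 1;
    }

    /**
     * Returns the 0-based 5' position of a read.
     * On the negative strand the 5' end is the unclipped end; on the positive
     * strand it is the unclipped start (assumed to need the same conversion).
     */
    static int fivePrime(int unclippedStart, int unclippedEnd, boolean negativeStrand) {
        return toZeroBased(negativeStrand ? unclippedEnd : unclippedStart);
    }

    public static void main(String[] args) {
        // A read spanning 1-based positions 101..150.
        System.out.println(fivePrime(101, 150, true));  // 149
        System.out.println(fivePrime(101, 150, false)); // 100
    }
}
```

Without the decrement, the negative-strand mark would land at 150, one base downstream of the read's last base in 0-based coordinates, which matches the kind of 1 bp shift reported in #75.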

Relates to issue #75
The rationale is outlined in issue #76.

PileupExtract is updated for pileups that require proper pairs.

The midpoint is now computed by taking the leftmost coordinate of the insert (getAlignmentStart() or getMateAlignmentStart(), depending on whether R1 or R2 is the leftmost read) and adding half of the insert size.
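
As a rough sketch of that midpoint calculation, assuming plain integers in place of the htsjdk getters and using Math.min as a stand-in for the R1/R2 leftmost check (the class and method names here are hypothetical):

```java
// Hypothetical sketch: the three arguments mirror what getAlignmentStart(),
// getMateAlignmentStart() and getInferredInsertSize() would return for a read.
final class InsertMidpointSketch {

    /**
     * Leftmost coordinate of the insert plus half of the insert size.
     * Integer division floors the half for odd insert sizes.
     * Coordinates are returned in the same indexing as the inputs.
     */
    static int insertMidpoint(int alignmentStart, int mateAlignmentStart, int inferredInsertSize) {
        int leftmost = Math.min(alignmentStart, mateAlignmentStart);
        return leftmost + Math.abs(inferredInsertSize) / 2;
    }

    public static void main(String[] args) {
        // The same pair seen from either mate gives the same midpoint.
        System.out.println(insertMidpoint(100, 250, 200));  // 200
        System.out.println(insertMidpoint(250, 100, -200)); // 200
    }
}
```

Taking the absolute value matters because the inferred insert size is signed: it is positive for the leftmost read of a pair and negative for the rightmost one.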

Note the correction for when the BED interval has an even length and is on the negative strand. Although the calculation uses floor integer division, this correction keeps the distance from the 5' end of the BED interval consistent between BED intervals. Odd-length intervals have consistent 5' distances regardless of direction.
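
The arithmetic behind the even-length case can be checked with a small sketch. It assumes 0-based, half-open BED intervals and a one-base adjustment toward the start on the negative strand; the exact form of the correction in this pull request is not shown here, so the direction of the adjustment and all names are illustrative assumptions.

```java
// Hypothetical sketch of one way to keep 5' distances consistent.
final class BedCenterSketch {

    /** Center of a 0-based half-open interval [start, end), with the assumed correction. */
    static int center(int start, int end, boolean negativeStrand) {
        int length = end - start;
        int c = start + length / 2;      // floor integer division
        if (negativeStrand && length % 2 == 0) {
            c -= 1;                      // assumed adjustment for even-length, negative-strand intervals
        }
        return c;
    }

    /** Distance from the interval's 5' base to its center. */
    static int distanceFromFivePrime(int start, int end, boolean negativeStrand) {
        int c = center(start, end, negativeStrand);
        // 5' base is `start` on the positive strand and `end - 1` on the negative strand.
        return negativeStrand ? (end - 1) - c : c - start;
    }

    public static void main(String[] args) {
        System.out.println(distanceFromFivePrime(100, 110, false)); // 5
        System.out.println(distanceFromFivePrime(100, 110, true));  // 5
        System.out.println(distanceFromFivePrime(100, 111, false)); // 5
        System.out.println(distanceFromFivePrime(100, 111, true));  // 5
    }
}
```

Without the adjustment, the even-length negative-strand case would give a distance of 4 instead of 5, while the odd-length interval gives 5 on both strands either way.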

The insert size filter now uses the built-in SAMRecord method getInferredInsertSize(), checking whether its absolute value falls outside the limits specified by the PileupParameters object.

I also switched the filter to use a continue statement instead of setting FivePrime to an invalid position, which saves a little downstream computation.
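
A minimal sketch of this filtering pattern follows. The limit names minInsert and maxInsert are hypothetical stand-ins for whatever PileupParameters exposes, inclusive bounds are an assumption, and plain integers replace the values returned by getInferredInsertSize().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the ScriptManager code.
final class InsertFilterSketch {

    /** True when the absolute insert size lies within [minInsert, maxInsert] (assumed inclusive). */
    static boolean passes(int inferredInsertSize, int minInsert, int maxInsert) {
        int size = Math.abs(inferredInsertSize);
        return size >= minInsert && size <= maxInsert;
    }

    /** Keeps only the insert sizes that pass the filter, skipping the rest early. */
    static List<Integer> keep(int[] inferredInsertSizes, int minInsert, int maxInsert) {
        List<Integer> kept = new ArrayList<>();
        for (int size : inferredInsertSizes) {
            if (!passes(size, minInsert, maxInsert)) {
                continue; // skip the record instead of carrying an invalid position forward
            }
            kept.add(size);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keep(new int[] {-150, 90, 400, 200}, 100, 300)); // [-150, 200]
    }
}
```

Skipping with continue means none of the later per-record work runs for a filtered read, which is where the small saving comes from.
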
Invert the if statement in TagPileup that parses BED coordinates so that unexpected strand characters default to the positive strand ("+").
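
The effect of inverting the condition can be sketched as follows: test for the negative strand explicitly and let everything else fall through to positive. The class and method names are hypothetical and this is not the TagPileup code itself.

```java
// Hypothetical sketch of defaulting unexpected strand values to "+".
final class StrandSketch {

    /** Only an exact "-" is treated as the negative strand. */
    static boolean isNegativeStrand(String strandField) {
        return "-".equals(strandField); // null-safe: a null field is treated as positive
    }

    /** Normalizes any strand field to "+" or "-". */
    static String normalize(String strandField) {
        return isNegativeStrand(strandField) ? "-" : "+";
    }

    public static void main(String[] args) {
        System.out.println(normalize("-")); // -
        System.out.println(normalize("+")); // +
        System.out.println(normalize(".")); // +
    }
}
```

Testing for "+" first would instead send values such as "." down the negative-strand path, which is the behavior the inversion avoids.
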
owlang requested a review from WilliamKMLai on December 24, 2021 13:10
WilliamKMLai merged commit c65f746 into master on Jan 3, 2022
WilliamKMLai deleted the hotfix branch on January 3, 2022 15:35