TLEN clarification to SAM spec #23
pulling from samtools/hts-specs
States clearly that TLEN should be a length (i.e. pos.right-pos.left+1) rather than a subtraction (i.e. pos.right-pos.left) -- this was somewhat ambiguous before.
Makes it clear that the sign character is prepended to the [...]
States that TLEN may be set to 0 whenever the information is unavailable, and gives examples as to when this would be the case.
Adds explicit statement that soft-clipped bases don't count as mapped bases.
I think this explanation is clearer. I was thinking that TLEN could be described as an "offset" rather than a length (because lengths aren't negative), but it isn't really an offset because of the +1. Rather than saying "The rightmost segment should be negated", this should be something like "For the rightmost segment, this length is negated". Actually, I would suggest explaining it like this: "In any case where multiple segments from the same template are mapped, the template may be said to have an implied length from the leftmost mapped position to the rightmost mapped position (i.e., not including soft-clipped bases), calculated as pos.right - pos.left + 1. For the leftmost segment, {\sf TLEN} is this implied length, and for the rightmost, it is the length prepended by a negative ("-") sign." The parts of the revision that refer to middle segments sound like they might have substantive implications for the standard (rather than being merely explanatory).
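To make the proposed wording concrete, here is a minimal Python sketch of the implied-length calculation. It is only an illustration, not code from the spec or from any library: the helper names `reference_end` and `implied_length` are hypothetical, and segments are assumed to be given as (POS, CIGAR) pairs with 1-based POS. Soft-clipped bases (CIGAR `S`) do not consume reference positions, so they drop out of the calculation automatically.

```python
import re

# One CIGAR element: a count followed by an operation character.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference positions; S, H, I and P do not.
_REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """Return the 1-based, inclusive rightmost mapped reference position."""
    ref_len = sum(int(n) for n, op in _CIGAR_OP.findall(cigar)
                  if op in _REF_CONSUMING)
    return pos + ref_len - 1


def implied_length(segments):
    """Implied template length: pos.right - pos.left + 1.

    `segments` is a list of (pos, cigar) pairs for the mapped segments
    of one template (a hypothetical input format for this sketch).
    """
    left = min(pos for pos, _ in segments)
    right = max(reference_end(pos, cigar) for pos, cigar in segments)
    return right - left + 1


# Two segments; the first starts with 5 soft-clipped bases.
print(implied_length([(100, "5S45M"), (200, "50M")]))
```

In this example the first segment covers reference positions 100 to 144 and the second 200 to 249, so the implied length is 249 - 100 + 1; plain subtraction would give one less.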
Changes the description of how TLEN is calculated for the rightmost position to clarify that it is multiplication by -1 (rather than "negation", which could potentially be confused with strand flipping). Changes the requirement for treating reads as leftmost or rightmost such that all reads that share the leftmost position are treated as leftmost and all reads that share the rightmost position are treated as rightmost.
The pull request has been updated based on comments from Arlin and Bob. I agree with Arlin that the portions dealing with what to do with multiple leftmost and rightmost segments could be considered substantive changes, as the current spec says "the sign of segments in the middle is undefined" whereas this version now explicitly says that all segments sharing the leftmost position should be treated as leftmost (therefore having a positive TLEN) while all segments sharing the rightmost position should be treated as rightmost (therefore having a negative TLEN).
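The sign rule described in this revision could be sketched roughly as follows. This is an illustration under stated assumptions, not the spec's wording: `assign_tlen` is a hypothetical helper, segments are given as (start, end) pairs of 1-based inclusive mapped positions, "sharing the rightmost position" is read as sharing the rightmost mapped end, a segment that matches both ends is treated as leftmost, and segments strictly in the middle are returned as `None` because the discussion here does not define their sign. Returning 0 for a single-segment template follows the SAM spec's existing convention.

```python
def assign_tlen(spans):
    """Return one TLEN value per segment of a template.

    `spans` is a list of (start, end) pairs, 1-based and inclusive,
    excluding soft-clipped bases (hypothetical input format).
    """
    # With fewer than two mapped segments there is no implied length.
    if len(spans) < 2:
        return [0] * len(spans)

    left = min(start for start, _ in spans)
    right = max(end for _, end in spans)
    length = right - left + 1

    tlens = []
    for start, end in spans:
        if start == left:
            # Every segment sharing the leftmost position is "leftmost".
            tlens.append(length)
        elif end == right:
            # Every segment sharing the rightmost position is "rightmost":
            # the length multiplied by -1.
            tlens.append(-1 * length)
        else:
            # Middle segment: sign not covered by this sketch.
            tlens.append(None)
    return tlens


print(assign_tlen([(100, 144), (100, 160), (150, 180), (200, 249)]))
```

Here the two segments starting at position 100 both receive the positive length, the segment ending at 249 receives the negative length, and the segment in between is left undefined.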
Superseded by #366.