strand: XS tag is an aligner specified field and intention can clash (patch included). #473

Closed
wants to merge 1 commit into
from

Conversation

Projects
None yet
3 participants
@keiranmraine
Contributor

keiranmraine commented May 1, 2014

It appears that the strand function was originally coded against XS
as 'sequence from strand', probably for RNA or some such.

BWA-mem defines XS as 'suboptimal alignment score' which is numeric
resulting in incorrect read colouring.

I've attempted to preserve original functionallity, however this may
give unexpected results should XS be used in a different textual way
by another tool.

Technically X?, Y?, Z? tags should not be baked into any BAM rendering components, see BAM specification.

Note that tags starting with X',Y' and `Z' or tags containing lowercase letters
in either position are reserved for local use and will not be formally de fined in any future version of
this speci cation.

XS tag is an aligner specified field and intention can clash.
It appears that the strand function was originally coded against XS
as 'sequence from strand', probably for RNA or some such.

BWA-mem defines XS as 'suboptimal alignment score' which is numeric
resulting in incorrect read colouring.

I've attempted to preserve original functionallity, however this may
give unexpected results should XS be used in a different textual way
by another tool.

@keiranmraine keiranmraine changed the title from strand: XS tag is an aligner specified field and intention can clash. to strand: XS tag is an aligner specified field and intention can clash (patch included). May 8, 2014

@keiranmraine

This comment has been minimized.

Show comment
Hide comment
@keiranmraine

keiranmraine May 15, 2014

Contributor

@cmdcolin, is there any chance you could review this please so that it makes into a release. I thought it would have made it into 1.11.4 (or a comment raised against it as to why not).

Contributor

keiranmraine commented May 15, 2014

@cmdcolin, is there any chance you could review this please so that it makes into a release. I thought it would have made it into 1.11.4 (or a comment raised against it as to why not).

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin May 15, 2014

Contributor

Hey I'm sorry I didn't get to review this before the release. it seems like a perfectly fine fix but i just wasn't able to test it (i know i know...should have done that). we're a little short staffed but i really appreciate your input. I would like to continue speeding along with jbrowse development and i'll try and get this included.

Contributor

cmdcolin commented May 15, 2014

Hey I'm sorry I didn't get to review this before the release. it seems like a perfectly fine fix but i just wasn't able to test it (i know i know...should have done that). we're a little short staffed but i really appreciate your input. I would like to continue speeding along with jbrowse development and i'll try and get this included.

@@ -131,7 +131,7 @@ var Feature = Util.fastDeclare(
},
strand: function() {
var xs = this._get('xs');
- return xs ? ( xs == '-' ? -1 : 1 ) :
+ return xs && isNaN(xs) ? ( xs == '-' ? -1 : 1 ) :

This comment has been minimized.

@keiranmraine

keiranmraine May 15, 2014

Contributor

I'd prefer to completely delete the reference to the XS tag to improve performance but as it could be critical for someone else...

@keiranmraine

keiranmraine May 15, 2014

Contributor

I'd prefer to completely delete the reference to the XS tag to improve performance but as it could be critical for someone else...

@keiranmraine

This comment has been minimized.

Show comment
Hide comment
@keiranmraine

keiranmraine May 15, 2014

Contributor

No problem, FYI I'm happy to test things I have experience with. Robert always said I had lots of funky edge-cases.

Contributor

keiranmraine commented May 15, 2014

No problem, FYI I'm happy to test things I have experience with. Robert always said I had lots of funky edge-cases.

@cmdcolin cmdcolin added this to the 1.11.5 milestone May 22, 2014

@tomoakin

This comment has been minimized.

Show comment
Hide comment
@tomoakin

tomoakin Jun 2, 2014

I have made an essentially duplicate pull request #485 (not realizing this request). In that request I deleted reference to XS in the strand function. Looking at the XS tag is not good for stranded RNA-seq data.

tomoakin commented Jun 2, 2014

I have made an essentially duplicate pull request #485 (not realizing this request). In that request I deleted reference to XS in the strand function. Looking at the XS tag is not good for stranded RNA-seq data.

@keiranmraine

This comment has been minimized.

Show comment
Hide comment
@keiranmraine

keiranmraine Jun 2, 2014

Contributor

@tomoakin, I'd take your version over mine for performance reasons, but I thought I should attempted to preserve the current 'functionality'.

Contributor

keiranmraine commented Jun 2, 2014

@tomoakin, I'd take your version over mine for performance reasons, but I thought I should attempted to preserve the current 'functionality'.

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin Jun 12, 2014

Contributor

I think we do need to keep parsing XS tags as it appears to be used by some aligners so this approach seems to be the most ideal

Contributor

cmdcolin commented Jun 12, 2014

I think we do need to keep parsing XS tags as it appears to be used by some aligners so this approach seems to be the most ideal

@tomoakin

This comment has been minimized.

Show comment
Hide comment
@tomoakin

tomoakin Jun 13, 2014

Color according to XS is useful if the RNA-seq was performed in a non-stranded manner.
The XS field is set based on the splice site information by Tophat and when the read does not
contain the intron XS is not set. When the original RNA-seq library is constructed from double stranded cDNA, that is the only information that suggest the strandedness of the transcript. In such case, however, it is only confusing to color the other reads that do not have XS tag according to the FLAG, which is relative to genome and just random when it is constructed in non-stranded fashion.

So, I would suggest not to mix strand from XS and from FLAG in a single track with the same color.

The aligner could do other nice stuff based on the construction method and read number information.

The current code

       var xs = this._get('xs');
       return xs ? ( xs == '-' ? -1 : 1 ) :
              this._get('seq_reverse_complemented') ? -1 :  1;

mixes the value of XS and the FLAG.
If you maintain the function to look the XS flag that should test only the XS flag and return 0 when the value is unknown.
The 0 could be rendered in a neutral color say gray.
The code here might look like:

       var xs = this._get('xs');
       return xs && isNaN(xs) ? ( xs == '-' ? -1 : 1 ) : 0;

But surely, you need to check the caller to handle the new return value.
I also would be happy if you make the color-scheme selectable per track, whether the track should be colored according to XS or to FLAG.

Color according to XS is useful if the RNA-seq was performed in a non-stranded manner.
The XS field is set based on the splice site information by Tophat and when the read does not
contain the intron XS is not set. When the original RNA-seq library is constructed from double stranded cDNA, that is the only information that suggest the strandedness of the transcript. In such case, however, it is only confusing to color the other reads that do not have XS tag according to the FLAG, which is relative to genome and just random when it is constructed in non-stranded fashion.

So, I would suggest not to mix strand from XS and from FLAG in a single track with the same color.

The aligner could do other nice stuff based on the construction method and read number information.

The current code

       var xs = this._get('xs');
       return xs ? ( xs == '-' ? -1 : 1 ) :
              this._get('seq_reverse_complemented') ? -1 :  1;

mixes the value of XS and the FLAG.
If you maintain the function to look the XS flag that should test only the XS flag and return 0 when the value is unknown.
The 0 could be rendered in a neutral color say gray.
The code here might look like:

       var xs = this._get('xs');
       return xs && isNaN(xs) ? ( xs == '-' ? -1 : 1 ) : 0;

But surely, you need to check the caller to handle the new return value.
I also would be happy if you make the color-scheme selectable per track, whether the track should be colored according to XS or to FLAG.

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin Jul 21, 2014

Contributor

@tomoakin I see your suggestion for having the XS flag also return 0 when XS does not exist instead of defaulting to seq_reverse_complemented. Is this needed?

Contributor

cmdcolin commented Jul 21, 2014

@tomoakin I see your suggestion for having the XS flag also return 0 when XS does not exist instead of defaulting to seq_reverse_complemented. Is this needed?

@tomoakin

This comment has been minimized.

Show comment
Hide comment
@tomoakin

tomoakin Jul 21, 2014

When some specific reads have strand information via XS tag and others don't, I think it is better to color only those specific reads according to the strand and keep the rest as neutral (gray) if possible.

When some specific reads have strand information via XS tag and others don't, I think it is better to color only those specific reads according to the strand and keep the rest as neutral (gray) if possible.

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin Jul 21, 2014

Contributor

I guess you are right. There are many reads that, having no splice information, are given no XS tag. Therefore, we may want to just "not color" them instead of defaulting to the regular strand.

Contributor

cmdcolin commented Jul 21, 2014

I guess you are right. There are many reads that, having no splice information, are given no XS tag. Therefore, we may want to just "not color" them instead of defaulting to the regular strand.

@keiranmraine

This comment has been minimized.

Show comment
Hide comment
@keiranmraine

keiranmraine Jul 21, 2014

Contributor

It really would be safer for the use of the XS field for read colouring to be handled at the track configuration level, is this complicated?

Contributor

keiranmraine commented Jul 21, 2014

It really would be safer for the use of the XS field for read colouring to be handled at the track configuration level, is this complicated?

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin Jul 29, 2014

Contributor

I have been thinking that a configuration variable would be good. I made a demo implementation of how this might look using a track configuration variable called "useXS" that can be applied to a BAM track. Code here: https://github.com/GMOD/jbrowse/tree/xs_config

Screenshot with XS flags for both directions

screenshot-genomes missouri edu 8080 2014-07-29 17-25-10

One concern is that my demo no longer is going to mix the XS and seq_reverse_complemented to compute strand. Another concern is what we should set useXS as by default? I had it set to true on my branch, but I think it should probably be false in practice.

-Colin

Contributor

cmdcolin commented Jul 29, 2014

I have been thinking that a configuration variable would be good. I made a demo implementation of how this might look using a track configuration variable called "useXS" that can be applied to a BAM track. Code here: https://github.com/GMOD/jbrowse/tree/xs_config

Screenshot with XS flags for both directions

screenshot-genomes missouri edu 8080 2014-07-29 17-25-10

One concern is that my demo no longer is going to mix the XS and seq_reverse_complemented to compute strand. Another concern is what we should set useXS as by default? I had it set to true on my branch, but I think it should probably be false in practice.

-Colin

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin Jul 30, 2014

Contributor

Actually, after reading the discussion at #433, we may want to change this code to reflect the color choice at the View level instead of the Store level

Contributor

cmdcolin commented Jul 30, 2014

Actually, after reading the discussion at #433, we may want to change this code to reflect the color choice at the View level instead of the Store level

@keiranmraine

This comment has been minimized.

Show comment
Hide comment
@keiranmraine

keiranmraine Jul 31, 2014

Contributor

Hi Colin, that looks great. From my perspective I don't think the mixing of XS and flag-strand should be allowed, I only applied it in my original patch in an attempt to preserve existing behaviour. For this reason the default for useXS may want to be true (although as this is actually a non standard field for strand is should technically be false).

Contributor

keiranmraine commented Jul 31, 2014

Hi Colin, that looks great. From my perspective I don't think the mixing of XS and flag-strand should be allowed, I only applied it in my original patch in an attempt to preserve existing behaviour. For this reason the default for useXS may want to be true (although as this is actually a non standard field for strand is should technically be false).

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin Aug 1, 2014

Contributor

Hi Kieran,
I actually am not sure if we should preserve the original behavior, and that's why I am thinking that useXS: false would be a better default behavior. Having it as true does not look like it did before (it has gray alignments indicating no strand according to XS).

Anyways, I have added a new branch with proposals for having both the useXS and useReversedTemplate flags for the color function.

Comparison
https://github.com/GMOD/jbrowse/compare/xs_config

Contributor

cmdcolin commented Aug 1, 2014

Hi Kieran,
I actually am not sure if we should preserve the original behavior, and that's why I am thinking that useXS: false would be a better default behavior. Having it as true does not look like it did before (it has gray alignments indicating no strand according to XS).

Anyways, I have added a new branch with proposals for having both the useXS and useReversedTemplate flags for the color function.

Comparison
https://github.com/GMOD/jbrowse/compare/xs_config

@keiranmraine

This comment has been minimized.

Show comment
Hide comment
@keiranmraine

keiranmraine Aug 29, 2014

Contributor

Hi Colin,

I've just had a chance to look at the xs_config branch and it's working as expected. I'm going to push this version out to a select few of our users.

Contributor

keiranmraine commented Aug 29, 2014

Hi Colin,

I've just had a chance to look at the xs_config branch and it's working as expected. I'm going to push this version out to a select few of our users.

@cmdcolin cmdcolin modified the milestones: 1.11.5, 1.11.6 Sep 4, 2014

@keiranmraine

This comment has been minimized.

Show comment
Hide comment
@keiranmraine

keiranmraine Sep 5, 2014

Contributor

FYI, our users are reporting good things with the read colouring working as expected for standard paired end sequencing (not using XS).

Contributor

keiranmraine commented Sep 5, 2014

FYI, our users are reporting good things with the read colouring working as expected for standard paired end sequencing (not using XS).

@cmdcolin

This comment has been minimized.

Show comment
Hide comment
@cmdcolin

cmdcolin Feb 7, 2015

Contributor

Fixed by merging xs_config

Contributor

cmdcolin commented Feb 7, 2015

Fixed by merging xs_config

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment