This repository has been archived by the owner on Oct 2, 2018. It is now read-only.

RG lines are created where they should not exist #36

Open
jkbonfield opened this issue Apr 17, 2015 · 1 comment

@jkbonfield

htslib/test/xx#rg.sam contains reads with two different RG tags and also some reads with no RG tag at all. The untagged reads are assigned an RG tag regardless, which changes the contents of the SAM file on a SAM->CRAM->SAM round trip.

This happens with both versions 2.1 and 3.0.

Edit: this is an encoder issue. If I decode the file created by Cramtools with scramble, it has the extra RG tags too. If I decode the file created by scramble with cramtools, it does not (that direction works).

Before:

@HD     VN:1.4  SO:coordinate
@SQ     SN:xx   LN:20   AS:?    SP:?    UR:?    M5:bbf4de6d8497a119dda6e074521643dc
@RG     ID:x1   SM:x1
@RG     ID:x2   SM:x2   LB:x    PG:foo:bar      PI:1111
@PG     ID:emacs        PN:emacs        VN:23.1.1
@CO     also test
@CO     other   headers
a1      16      xx      1       1       10M     *       0       0       AAAAAAAAAA      **********      RG:Z:x1
b1      16      xx      1       1       10M     *       0       0       AAAAAAAAAA      **********      RG:Z:x2
c1      16      xx      1       1       10M     *       0       0       AAAAAAAAAA      **********
a2      16      xx      11      1       10M     *       0       0       TTTTTTTTTT      **********      RG:Z:x1
b2      16      xx      11      1       10M     *       0       0       TTTTTTTTTT      **********      RG:Z:x2
c2      16      xx      11      1       10M     *       0       0       TTTTTTTTTT      **********

After:

@HD     VN:1.4  SO:coordinate
@SQ     SN:xx   LN:20   AS:?    UR:?    M5:bbf4de6d8497a119dda6e074521643dc     SP:?
@RG     ID:x1   SM:x1
@RG     ID:x2   LB:x    PI:1111 SM:x2   PG:foo:bar
@PG     ID:emacs        PN:emacs        VN:23.1.1
@PG     ID:0    PN:cramtools    VN:2.1-b268     CL:java /nfs/users/nfs_j/jkb/work/cram/cramtools/cramtools-2.1.jar cram -R xx.fa -I xx#rg.sam -O _tmp.cram -n -Q --capture-all-tags
@PG     ID:1    PN:cramtools    VN:2.1-b268     CL:java /nfs/users/nfs_j/jkb/work/cram/cramtools/cramtools-2.1.jar bam -R xx.fa -I _tmp.cram -O _tmp.sam
@CO     also test
@CO     other   headers
a1      16      xx      1       1       10M     *       0       0       AAAAAAAAAA      **********      RG:Z:x1
b1      16      xx      1       1       10M     *       0       0       AAAAAAAAAA      **********      RG:Z:x2
c1      16      xx      1       1       10M     *       0       0       AAAAAAAAAA      **********      RG:Z:x2
a2      16      xx      11      1       10M     *       0       0       TTTTTTTTTT      **********      RG:Z:x1
b2      16      xx      11      1       10M     *       0       0       TTTTTTTTTT      **********      RG:Z:x2
c2      16      xx      11      1       10M     *       0       0       TTTTTTTTTT      **********      RG:Z:x2
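
The difference between the two listings can be checked mechanically. The following is only a minimal sketch, not part of cramtools or htslib: `read_groups` and `changed_read_groups` are hypothetical helper names, and the records are split on any whitespace because the listings above are pasted with spaces (a real SAM file is tab-separated).

```python
def read_groups(sam_text):
    """Map read name -> RG tag value (None when the read carries no RG tag)."""
    groups = {}
    for line in sam_text.splitlines():
        # Skip blank lines and header lines such as @HD, @SQ, @RG.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.split()
        # The first 11 columns are mandatory; optional tags follow.
        tags = fields[11:]
        rg = next((t[len("RG:Z:"):] for t in tags if t.startswith("RG:Z:")), None)
        groups[fields[0]] = rg
    return groups


def changed_read_groups(before_text, after_text):
    """Return {read name: (RG before, RG after)} for reads whose RG differs."""
    before = read_groups(before_text)
    after = read_groups(after_text)
    return {
        name: (rg, after.get(name))
        for name, rg in before.items()
        if rg != after.get(name)
    }


# Records copied from the "Before" and "After" listings in this issue.
BEFORE = """\
a1 16 xx 1 1 10M * 0 0 AAAAAAAAAA ********** RG:Z:x1
b1 16 xx 1 1 10M * 0 0 AAAAAAAAAA ********** RG:Z:x2
c1 16 xx 1 1 10M * 0 0 AAAAAAAAAA **********
a2 16 xx 11 1 10M * 0 0 TTTTTTTTTT ********** RG:Z:x1
b2 16 xx 11 1 10M * 0 0 TTTTTTTTTT ********** RG:Z:x2
c2 16 xx 11 1 10M * 0 0 TTTTTTTTTT **********
"""

AFTER = """\
a1 16 xx 1 1 10M * 0 0 AAAAAAAAAA ********** RG:Z:x1
b1 16 xx 1 1 10M * 0 0 AAAAAAAAAA ********** RG:Z:x2
c1 16 xx 1 1 10M * 0 0 AAAAAAAAAA ********** RG:Z:x2
a2 16 xx 11 1 10M * 0 0 TTTTTTTTTT ********** RG:Z:x1
b2 16 xx 11 1 10M * 0 0 TTTTTTTTTT ********** RG:Z:x2
c2 16 xx 11 1 10M * 0 0 TTTTTTTTTT ********** RG:Z:x2
"""

if __name__ == "__main__":
    for name, (old, new) in sorted(changed_read_groups(BEFORE, AFTER).items()):
        print(f"{name}: RG {old!r} -> {new!r}")
```

Only the reads that had no RG tag in the input (c1 and c2) are reported; the tagged reads survive the round trip unchanged.
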
@vadimzalunin
Contributor

Fixed in eb7e09e.

The beta codec was used as a dull default option for int data series, but it didn't know about the offset, so it cut important bits off when given negative values. This is a bad one because it could silently mess values up 😱
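
To illustrate the failure mode described above, here is a minimal sketch of a beta-style (fixed-width binary) encoding with an offset. It is not the cramtools code: the function names and the `OFFSET` and `NBITS` values are assumptions chosen for the demonstration. As far as I understand the CRAM specification, -1 is the value used in the read group series for "no read group", which is why negative values matter here.

```python
def beta_encode(value, offset, nbits):
    """Store value + offset as an unsigned integer that fits in nbits bits."""
    shifted = value + offset
    if not 0 <= shifted < (1 << nbits):
        raise ValueError(f"{value} + {offset} does not fit in {nbits} bits")
    return shifted


def beta_encode_ignoring_offset(value, nbits):
    """Buggy variant: keeps only the low nbits bits of the raw value.

    For a negative value the sign information is cut off silently.
    """
    return value & ((1 << nbits) - 1)


def beta_decode(stored, offset):
    """Recover the original value by removing the offset."""
    return stored - offset


# Assumed parameters: values range over -1..1, so offset 1 and 2 bits suffice.
OFFSET = 1
NBITS = 2

if __name__ == "__main__":
    for value in (-1, 0, 1):
        good = beta_decode(beta_encode(value, OFFSET, NBITS), OFFSET)
        # The buggy encoder never applied an offset, so the decoder uses 0.
        bad = beta_decode(beta_encode_ignoring_offset(value, NBITS), 0)
        print(f"value={value:2d}  with offset -> {good:2d}  without offset -> {bad:2d}")
```

With these assumed parameters the non-negative values round-trip either way, while -1 comes back as 3 when the offset is ignored. No error is raised, which matches the "silently" part of the comment; the exact wrong value produced in a real file depends on the actual bit width and offset used by the encoder.
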
