samtools rmdup removes a single read from a pair when two pairs are present with the same position for all four reads
i.e. input file
WPHISEQ06:99:C6L3PANXX:5:1101:1124:56798_1:N:0: 163 chr2 175312986 42 34M = 175312986 -34
WPHISEQ06:99:C6L3PANXX:5:1101:13415:91660_1:N:0: 163 chr2 175312986 23 34M = 175312986 -34
WPHISEQ06:99:C6L3PANXX:5:1101:1124:56798_1:N:0: 83 chr2 175312986 42 34M = 175312986 34
WPHISEQ06:99:C6L3PANXX:5:1101:13415:91660_1:N:0: 83 chr2 175312986 23 34M = 175312986 34
output after samtools rmdup
WPHISEQ06:99:C6L3PANXX:5:1101:1124:56798_1:N:0: 163 chr2 175312986 42 34M = 175312986 -34
WPHISEQ06:99:C6L3PANXX:5:1101:13415:91660_1:N:0: 163 chr2 175312986 23 34M = 175312986 -34
WPHISEQ06:99:C6L3PANXX:5:1101:13415:91660_1:N:0: 83 chr2 175312986 23 34M = 175312986 34
samtools git version 1.2-242-g4d56437
htslib git version 1.2.1-256-ga356746
Example file to reproduce error
test.sam.txt
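To make the symptom easier to spot in larger files, here is a minimal Python sketch that lists paired reads whose mate is missing from a set of SAM records. It is not part of samtools; the function name `find_orphaned_reads` and the variable names are hypothetical, and it only inspects the QNAME and FLAG columns (paired = 0x1, first in pair = 0x40, secondary/supplementary = 0x900, as defined in the SAM specification). The records used are the three output lines reported above.

```python
from collections import defaultdict


def find_orphaned_reads(sam_lines):
    """Return sorted QNAMEs of paired reads whose mate is absent.

    Hypothetical helper for illustration only; it looks at QNAME and FLAG
    and ignores header lines, unpaired reads, and secondary/supplementary
    alignments.
    """
    ends_seen = defaultdict(set)  # QNAME -> {"first", "second"}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.split()  # SAM is tab-delimited; split() also accepts spaces
        qname, flag = fields[0], int(fields[1])
        if not flag & 0x1:
            continue  # read is not part of a pair
        if flag & 0x900:
            continue  # secondary or supplementary alignment
        ends_seen[qname].add("first" if flag & 0x40 else "second")
    return sorted(q for q, ends in ends_seen.items() if len(ends) < 2)


# The three records reported as the rmdup output in this issue.
rmdup_output = [
    "WPHISEQ06:99:C6L3PANXX:5:1101:1124:56798_1:N:0: 163 chr2 175312986 42 34M = 175312986 -34",
    "WPHISEQ06:99:C6L3PANXX:5:1101:13415:91660_1:N:0: 163 chr2 175312986 23 34M = 175312986 -34",
    "WPHISEQ06:99:C6L3PANXX:5:1101:13415:91660_1:N:0: 83 chr2 175312986 23 34M = 175312986 34",
]

orphans = find_orphaned_reads(rmdup_output)
print(orphans)
# prints ['WPHISEQ06:99:C6L3PANXX:5:1101:1124:56798_1:N:0:']
```

Run against the four input records instead, the same check should return an empty list, since both pairs are complete there.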
The rmdup logic assumes that the QNAME fields are identical for duplicates. After manually changing QNAME for the second pair in the example:
./samtools view -b /tmp/Issue497.sam |./samtools sort -n -o - - |./samtools rmdup - - |./samtools view -h - |tail
[bam_rmdup_core] processing reference chr2...
[bam_rmdup_core] 1 / 2 = 0.5000 in library ' '
@SQ SN:chrUn_KI270757v1 LN:71251
@SQ SN:chrUn_GL000214v1 LN:137718
@SQ SN:chrUn_KI270742v1 LN:186739
@SQ SN:chrUn_GL000216v2 LN:176608
@SQ SN:chrUn_GL000218v1 LN:161147
@SQ SN:chrX LN:156040895
@SQ SN:chrY LN:57227415
@SQ SN:chrY_KI270740v1_random LN:37240
WPHISEQ06:99:C6L3PANXX:5:1101:1124:56798_1:N:0: 163 chr2 175312986 23 34M = 175312986 -34 ATACAAAAATTTACCGCTTTACTAATAATCCACT ;?B1DFEGGFGGG11?/<=<FGGGGGGGGGGGGG
WPHISEQ06:99:C6L3PANXX:5:1101:1124:56798_1:N:0: 83 chr2 175312986 23 34M = 175312986 34 ATACAAAAATTTACCGCTTGACTAATAATCCACT GGGGB:1GGF=E=;/1GF=1BEF;1>11GB@A3B
Confirmed that the provided test file still exhibits this issue with current samtools/htslib ("1.3-5-g664cc5f (using htslib 1.3-5-gdf4a80e)").
As rmdup is deprecated and nobody has touched this for two years, I am going to close this issue.