A bug: pysam.pileup will not work if all the reads per base qualities are 10 (+) #1002

Closed
dolittle007 opened this issue Apr 1, 2021 · 3 comments

Comments

@dolittle007

An interesting bug:
if the per-base qualities of a read are all 10 (+++++++++), pysam.pileup does not work properly.

@jmarshall
Member

In what way did it not work properly? Please add an example program that demonstrates the problem.

@dolittle007
Author

dolittle007 commented Apr 1, 2021

I attached a small SAM file and code to demonstrate this weird bug.
I've tested it with pysam 0.16.0.1 and 0.15.3; both have this bug.
I also tried different steppers ('all' and 'nofilter') in pysam.pileup, but the bug is still there.

@HD	VN:1.6	SO:coordinate
@SQ	SN:chr1	LN:248956422
@RG	ID:XX	SM:hs	LB:ga	PL:PacBio
@PG	ID:minimap2	PN:minimap2	VN:2.17-r941	CL:minimap2 -t 8 -R @RG\tID:XX\tSM:hs\tLB:ga\tPL:PacBio --MD -ax splice:hq -uf --secondary=no hg38.fa XX.fastq
ccs	16	chr1	169691400	60	10M	*	0	0	ATCAAATTTG	++++++++++	RG:Z:XX	NM:i:0	ms:i:1705	AS:i:1581	nn:i:0	ts:A:+	tp:A:P	cm:i:544	s1:i:1604	s2:i:0	de:f:0	MD:Z:1705	rl:i:0

############################

import pysam

# Open the attached SAM file and print every read in every pileup column.
sam = pysam.AlignmentFile('example.sam', 'r')
for column in sam.pileup():
    for pileupread in column.pileups:
        print(pileupread)

The output is empty.

But after changing the read's quality string to "+++++~++++" (keeping everything else unchanged), the output is as expected:

ccs	16	0	169691399	60	10M	-1	-1	10	ATCAAATTTG	array('B', [10, 10, 93, 10, 10, 10, 10, 10, 10, 10])	[('RG', 'XX'), ('NM', 0), ('ms', 1705), ('AS', 1581), ('nn', 0), ('ts', '+'), ('tp', 'P'), ('cm', 544), ('s1', 1604), ('s2', 0), ('de', 0.0), ('MD', '1705'), ('rl', 0)]	2	0	1313935984	0	0	0	0

example.sam

Please fix it.
Thank you.

jmarshall added a commit that referenced this issue Apr 2, 2021
@jmarshall
Member

pileup() can omit bases with low quality scores. By default it omits bases with quality below 13, which means all of your quality-10 bases (+) are dropped. If you lower the quality threshold, e.g. via sam.pileup(min_base_quality=5), you will see all these bases in the output.
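
To make the arithmetic concrete, here is a minimal pure-Python sketch of the filtering described above. It is an illustration only, not pysam's actual implementation: the helper names phred33_to_scores and bases_kept are hypothetical, and the only things taken from this thread are the default threshold of 13 and the min_base_quality parameter name. SAM quality strings use Phred+33 encoding, so "+" decodes to 10 and "~" to 93.

```python
def phred33_to_scores(quality_string):
    """Decode a SAM QUAL string (Phred+33 ASCII) into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def bases_kept(sequence, quality_string, min_base_quality=13):
    """Return (base, score) pairs whose score meets the threshold.

    Hypothetical helper that mimics the effect of a base-quality
    cutoff; it does not call pysam.
    """
    scores = phred33_to_scores(quality_string)
    return [(b, q) for b, q in zip(sequence, scores) if q >= min_base_quality]


seq = "ATCAAATTTG"
all_plus = "+" * 10

print(phred33_to_scores(all_plus))                         # every score is 10
print(len(bases_kept(seq, all_plus)))                      # 0 bases survive the default cutoff
print(len(bases_kept(seq, all_plus, min_base_quality=5)))  # all 10 bases survive
print(bases_kept(seq, "+++++~++++"))                       # only the '~' base survives
```

With every base at quality 10, nothing passes a cutoff of 13, which matches the empty output reported above. Changing a single character to "~" leaves one base above the cutoff, so the read shows up in the column at that position.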
