Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bamCoverage --extendReads, defaultFragmentLength can be 0 causing program to fail #1249

Open
Alexwestbrook opened this issue Aug 30, 2023 · 0 comments

Comments

@Alexwestbrook
Copy link

Alexwestbrook commented Aug 30, 2023

python version 3.8
deepTools version 3.5.1

I wanted to compute the coverage of a paired bam file, and got an error about fragment start being greater than fragment end. After some investigation, I realized the problem came from the defaultFragmentLength being 0, causing unproperly aligned reads to have identical fragment start and end. The problem can be easily solved by supplying a value to extendReads.
However I believe it would be better if the program throws an error as soon as this problem is encountered, it would be easier to spot and it would save time, because in my case bamCoverage kept running for awhile despite showing the error and eventually crashed.
I would also add that the reason the defaultFragmentLength was 0, is because the estimated median was 0. I scanned all the pairs in my file and had only 7% of templates lengths of 0. However the get_read_and_fragment_length function from deeptools showed up to the qtile60 as 0. This could be due to the reference genome I used (YPH499) which is not assembled to the chromosome level and has many very small scaffolds

Command that produces the issue:
bamCoverage -p 16 -b NG-34039_167_4_2_lib713577_10294_1_101bp_max250.sorted.bam -o NG-34039_167_4_2_lib713577_10294_1_101bp_max250.bw --binSize 1 --extendReads

Output:
bamFilesList: ['NG-34039_167_4_2_lib713577_10294_1_101bp_max250.sorted.bam']
binLength: 1
numberOfSamples: None
blackListFileName: None
skipZeroOverZero: False
bed_and_bin: False
genomeChunkSize: None
defaultFragmentLength: 0
numberOfProcessors: 16
verbose: False
region: 167_7_4kbrf:1:9169:1
bedFile: None
minMappingQuality: None
ignoreDuplicates: False
chrsToSkip: []
stepSize: 1
center_read: False
samFlag_include: None
samFlag_exclude: None
minFragmentLength: 0
maxFragmentLength: 0
zerosToNans: False
smoothLength: None
save_data: False
out_file_for_raw_data: None
maxPairedFragmentLength: 0
Traceback (most recent call last):
File "/home/alex/anaconda3/envs/tf2.5/bin/bamCoverage", line 12, in
main(args)
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/bamCoverage.py", line 256, in main
wr.run(writeBedGraph.scaleCoverage, func_args, args.outFileName,
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/writeBedGraph.py", line 145, in run
res = mapReduce.mapReduce([func_to_call, func_args],
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/mapReduce.py", line 146, in mapReduce
res = list(map(func, TASKS))
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/writeBedGraph.py", line 27, in writeBedGraph_wrapper
return WriteBedGraph.writeBedGraph_worker(*args)
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/writeBedGraph.py", line 231, in writeBedGraph_worker
coverage, _ = self.count_reads_in_region(chrom, start, end)
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/countReadsPerBin.py", line 501, in count_reads_in_region
tcov = self.get_coverage_of_region(bam, chrom, trans)
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/countReadsPerBin.py", line 674, in get_coverage_of_region
position_blocks = fragmentFromRead_func(read)
File "/home/alex/anaconda3/envs/tf2.5/lib/python3.8/site-packages/deeptools/countReadsPerBin.py", line 902, in get_fragment_from_read
assert fragmentStart < fragmentEnd, "fragment start greater than fragment"
AssertionError: fragment start greater than fragmentend for read A01619:208:H7K7CDSX5:1:2148:11731:2988

And also python code to compute proportion of template lengths of 0:
bam_file = 'NG-34039_167_4_2_lib713577_10294_1_101bp_max250.sorted.bam'
with pysam.AlignmentFile(bam_file, 'rb') as f:
\t tlens = []
\t for read in f.fetch():
\t\t if read.is_read1:
\t\t\t tlens.append(abs(read.template_length))
print(np.sum(np.array(tlens) <= 0) / len(tlens))

And python code to get fragment length from the deeptools function:
from deeptools.getFragmentAndReadSize import get_read_and_fragment_length
print(get_read_and_fragment_length(bam_file, numberOfProcessors=1))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant