Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

samtools coverage for WES using bed file #1662

Open
amyhouseman opened this issue May 30, 2022 · 11 comments
Open

samtools coverage for WES using bed file #1662

amyhouseman opened this issue May 30, 2022 · 11 comments
Assignees

Comments

@amyhouseman
Copy link

Hello,

I was wondering if it's possible to add both bam and bed files as input for samtools coverage?

Thanks,
Amy

@whitwham
Copy link
Contributor

I am not sure I understand your request. Could you expand on what you want and, more importantly, why you want it?

Understanding your problem would help us know how to fix the issue.

@amyhouseman
Copy link
Author

Hello,

I have whole exome sequences, I have aligned them to hg38. I want to work out the % coverage across the reference genome at the points that the bed file states were sequenced. I may be wrong but samtools coverage seems to provide these results in your example samtools coverage -r chr1:1M-12M input.bam

However, rather than just a part of the chromosome I would like to do this using the regions of the exome capture stated in the bed file.

I may be misunderstanding the use of samtools coverage, so apologies if that is the case. I have used samtools depth to work out the average depth but would like some sort of coverage across the parts of the whole exome regions to go alongside.

Sorry if I am wrong, and if this doesn't make sense. I am still learning. Thank you for your reply! Amy

@whitwham
Copy link
Contributor

Now I understand. You are right, samtools coverage only accepts one region where samtools depth can also accept bed files.

It should be possible to add a bed file option to samtools coverage.

@jkbonfield
Copy link
Contributor

There's also samtools bedcov which is explicitly designed for that scenario, but the output format is a little different. Does that suffice?

@amyhouseman
Copy link
Author

That's great, I'll have a go at bedcov, thank you!

@amyhouseman
Copy link
Author

amyhouseman commented Jun 6, 2022

Hello, bedcov worked well - is there a way to get some similar stats that samtools coverage outputs like:

Number of reads:
Covered bases:
Percent covered:
Mean coverage:
Mean baseQ:
Mean mapQ:

from the bedcov output?
and also, what the columns of the bedcov file mean.

Thanks,
Amy

@amyhouseman amyhouseman reopened this Jun 7, 2022
@nicolo-tellini
Copy link

Hello,

I think that what @amyhouseman proposed could be a really great improvement over running the command window by window.

Thanks for your availability
Nicolò

@amyhouseman
Copy link
Author

I actually ended up using qualimap for this rather than bedcov, it gave me more stats and output results that were useful

@codebuzer
Copy link

Hello Everyone,
I am trying get the gene wise coverage for that I use both samtool(command line) and pysam (python wrapper), everything is good like when I align the results of both the approach covered bases and covered percent are same but a problem occur when covered bases are more than this (end - start + 1 ), and because of that out covered percentage increased even more than 100% (while using pysam) but for the same region samtool using command line gives different like less than 100% or 100% but not more than 100%. I wanted to know how samtool handle these type of situation when at the time of alignment number of covered bases are more than the number of region itself. I am new to this so it might be possible that my question could be wrong.
I will be thankful if anyone here will explain the algorithmic approach of samtool to calculate the coverage especially when covered bases are more than point of the interested region.

@nicolo-tellini
Copy link

Hi,

is there any advance on that?

thanks

@whitwham
Copy link
Contributor

It is on my to-do but I have not got around to doing anything about it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants