Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coverage error #593

Closed
joys8998 opened this issue Apr 14, 2021 · 5 comments · Fixed by #598
Closed

Coverage error #593

joys8998 opened this issue Apr 14, 2021 · 5 comments · Fixed by #598
Labels

Comments

@joys8998
Copy link

Hi there, I'm executing the cnvkit pipeline by running each command (not batch mode) on this dataset (in particular samples HG002, HG003, HG004) from this paper with this target file Agilent v7 rgion file.

ref= hg38
blacklist=hg38-blacklist.v2.bed.gz

Specifically I run
cnvkit.py target {input.target} --annotate {input.assembly-gtf} -a {params.avg_size} --split -o {output.target} --short-names

cnvkit.py access {input.ref_fa} -o {output.accessibility} -x {input.blacklist}

cnvkit.py antitarget {input.target} -g {input.accessibility} -o {output.antitarget} -a {params.avg_size}

cnvkit.py autobin {input.bam_samples} -t {input.target} -g {input.accessibility} -m {params.method}

cnvkit.py coverage {input.bam_sample} {input.target} -p {threads} -o {output.sample_target} -c -q 10 && \ cnvkit.py coverage {input.bam_sample} {input.antitarget} -p {threads} -o {output.sample_antitarget} -c -q 10

And Here I get the error in Coverage step. According to the log in coverage.py file when it tryes to calculate log2 depth it crashes. In the file it's written that log2 depth is math.log(depth,2) if depth else NULL_LOG2_COVERAGE so
when depth is 0 NULL_LOG2_COVERAGE is taken
when depth is > 0 depth is taken
but when depth is < 0 an error is thrown math domain error

I have thus 2 questions:

  1. how can it be possible that a sample bam file has depth < 0 on target file created with also this bam file (aka autobin step)?
  2. is it possible to include an exception to handle negative depth values?

Thank you very much

@tetedange13
Copy link
Contributor

tetedange13 commented Apr 16, 2021

Hi @joys8998 !

Not an author of CNVkit, but could you precise what is exactly your Python error?
=> Is it a 'math domain error' ?
(maybe you could post the complete error raised by Python ?)

Have you checked that your really have a negative depth value?
=> For example adding a print where the error occurs
=> Or better, something like that:

try:
    math.log(depth,2) if depth else NULL_LOG2_COVERAGE
except math domain error:
    print(depth) ; sys.exit()

And another question: I see you are using coverage -c parameter that causes CNVkit to calculate coverage using a different algorithm (see the 1st paragraph of coverage subcmd in the doc)
=> Do you have a particular reason?
=> Not sure if it will change something, but maybe try without '-c' ?

Another hint: not sure if this is a good idea on WES data, but maybe try to add the --drop-low-coverage flag ?

Hope this helps.
Have a nice day.
Felix.

@joys8998
Copy link
Author

Hi @tetedange13

Thank you for your reply! Yes that's what I did and I noticed that 1) I get negative depth values and I don't know why since the target is created on top of bam coverage and 2) It's not easy to reproduce, it happens only with WES on germline mode with those samples in particular (HG002, HG003 from GIAB google).

@tskir
Copy link
Collaborator

tskir commented Apr 24, 2021

Hi @joys8998! Thank you for a very detailed bug report. I've reproduced the issue and submitted the fix here: #598.

Also, as @tetedange13 mentioned, running cnvkit.py coverage with the -c flag invokes a non-standard coverage calculation algorithm. Unless you have a specific reason to use it, I would probably advise to stick to the standard algorithm in most situations (that is, without the -c).

@etal
Copy link
Owner

etal commented Apr 24, 2021

Thanks for this report, and I agree with @tskir on both points. The -c option is there to support internal algorithm development, and the default coverage command's behavior without -c is best for normal usage.

The negative depth result would be a quirk in the original arithmetic -- maybe a gapped alignment where the gap spans the target bin boundaries. The fix in #598 should avoid the problem.

@joys8998
Copy link
Author

Dear @etal and @tskir ,

Thanks for your reply, I'll remove -c option!
:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants