snpEff Error for Manta: end before start. #1205

Closed
bwubb opened this issue Jan 29, 2016 · 14 comments

Comments

@bwubb

bwubb commented Jan 29, 2016

Something in a Manta.vcf.gz broke snpEff.

[2016-01-27T11:30Z] ChrPosStats.sample(78): Error counting samples on chromosome 'GL000210.1'. Position '27712' => count[277]  (count.length: 277, factor: 100, chrLength: 27682).
[2016-01-27T11:30Z] Error: Error while processing VCF entry (line 10156) :
[2016-01-27T11:30Z]     GL000220.1  1   MantaBND:231425:0:1:2:0:0:0 G   ]GL000205.1:122316]G    .   PASS    SVTYPE=BND;MATEID=MantaBND:231425:0:1:2:0:0:1;IMPRECISE;CIPOS=-114,115;BND_DEPTH=100;MATE_BND_DEPTH=222;AC=0;AN=0   PR  68,33
[2016-01-27T11:30Z] java.lang.RuntimeException: Interval error: end before start.
[2016-01-27T11:30Z]     Class        : Marker
[2016-01-27T11:30Z]     Start        : 0
[2016-01-27T11:30Z]     End          : -1
[2016-01-27T11:30Z]     ID           :
[2016-01-27T11:30Z]     Parent class : Chromosome
[2016-01-27T11:30Z]     Parent       : GL000220.1   0-161801 CHROMOSOME 'GL000220.1'
[2016-01-27T11:30Z] java.lang.RuntimeException: Interval error: end before start.
[2016-01-27T11:30Z]     Class        : Marker
[2016-01-27T11:30Z]     Start        : 0
[2016-01-27T11:30Z]     End          : -1
[2016-01-27T11:30Z]     ID           :
[2016-01-27T11:30Z]     Parent class : Chromosome
[2016-01-27T11:30Z]     Parent       : GL000220.1   0-161801 CHROMOSOME 'GL000220.1'
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.interval.Interval.<init>(Interval.java:38)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.interval.Marker.<init>(Marker.java:37)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.HgvsDna.isDuplication(HgvsDna.java:124)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.HgvsDna.toString(HgvsDna.java:377)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.VariantEffect.getHgvsDna(VariantEffect.java:531)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.vcf.VcfEffect.set(VcfEffect.java:1027)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.vcf.VcfEffect.<init>(VcfEffect.java:145)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.outputFormatter.VcfOutputFormatter.addInfo(VcfOutputFormatter.java:97)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.outputFormatter.VcfOutputFormatter.toString(VcfOutputFormatter.java:285)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.outputFormatter.OutputFormatter.endSection(OutputFormatter.java:112)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.outputFormatter.VcfOutputFormatter.endSection(VcfOutputFormatter.java:229)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.outputFormatter.OutputFormatter.printSection(OutputFormatter.java:145)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.annotate(SnpEffCmdEff.java:276)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.annotateVcf(SnpEffCmdEff.java:443)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.annotate(SnpEffCmdEff.java:139)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:969)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:921)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEff.run(SnpEff.java:883)
[2016-01-27T11:30Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEff.main(SnpEff.java:128)
[2016-01-27T11:30Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
    _do_run(cmd, checks, log_stdout)
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /home/bwubb/bcbio-tools/bin/snpEff -Xms750m -Xmx30720m -Djava.io.tmpdir=/home/bwubb/NGS/WGS_PCC-PGL/data/PP011-F1/work/structural/PP011-F1/manta/tx/tmp0Nsptl/tmp eff -dataDir /home/bwubb/bcbio-nextgen/genomes/Hsapiens/GRCh37/snpeff -cancer -noLog -i vcf -o vcf -s /home/bwubb/NGS/WGS_PCC-PGL/data/PP011-F1/work/structural/PP011-F1/manta/tumorSV-PP011-F1-effects-stats.html GRCh37.75 /home/bwubb/NGS/WGS_PCC-PGL/data/PP011-F1/work/structural/PP011-F1/manta/tumorSV-PP011-F1.vcf.gz | /home/bwubb/tabix-0.2.6/bgzip -c > /home/bwubb/NGS/WGS_PCC-PGL/data/PP011-F1/work/structural/PP011-F1/manta/tx/tmp0Nsptl/tumorSV-PP011-F1-effects.vcf.gz
ChrPosStats.sample(78): Error counting samples on chromosome 'GL000210.1'. Position '27712' => count[277]  (count.length: 277, factor: 100, chrLength: 27682).
Error: Error while processing VCF entry (line 10156) :
...

I am not quite sure why or how this variant ends up with Start=0 and End=-1, but the same error appears a few times before the process is killed. I believe this occurred in only 1 of 17 samples. Thanks.

-bwubb

@chapmanb
Member

Brad;
Sorry about the problem. I'm not sure what is going on here -- it seems like something in the Manta output triggers a boundary condition in snpEff. From the output, I thought the cause was that the variant starts at the first position (1), but I couldn't reproduce the error by trying just that. Would you be able to send me the offending variant so we can try to reproduce it?

zgrep ^# /home/bwubb/NGS/WGS_PCC-PGL/data/PP011-F1/work/structural/PP011-F1/manta/tumorSV-PP011-F1.vcf.gz > snpeff-problem.vcf
zgrep 'MantaBND:231425:0:1:2:0:0:' /home/bwubb/NGS/WGS_PCC-PGL/data/PP011-F1/work/structural/PP011-F1/manta/tumorSV-PP011-F1.vcf.gz >> snpeff-problem.vcf

Thanks much for reporting the issue and hopefully we can get it squared away to keep your analysis going.
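
For anyone without zgrep at hand, the same extraction can be sketched in Python. This is a minimal sketch, not part of bcbio: the function name `extract_records` and the output file name are illustrative assumptions. It keeps the header lines plus any record whose ID column starts with the given prefix, which picks up both mates of the breakend pair. Note that it matches on the ID column only, whereas the zgrep commands match the pattern anywhere in the line.

```python
import gzip


def extract_records(vcf_gz_path, id_prefix):
    """Return header lines plus records whose ID column starts with id_prefix.

    Hypothetical helper for building a small reproducible VCF; bgzip output
    is gzip-compatible, so the standard gzip module can read it.
    """
    kept = []
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                # Keep every header line so the subset stays a valid VCF.
                kept.append(line)
                continue
            fields = line.rstrip("\n").split("\t")
            # Column 3 (index 2) of a VCF record is the ID.
            if len(fields) > 2 and fields[2].startswith(id_prefix):
                kept.append(line)
    return kept


def write_subset(vcf_gz_path, id_prefix, out_path="snpeff-problem.vcf"):
    """Write the extracted lines to an uncompressed VCF file."""
    with open(out_path, "w") as out:
        out.writelines(extract_records(vcf_gz_path, id_prefix))
```

Calling `write_subset` with the path to the Manta VCF and the prefix `MantaBND:231425:0:1:2:0:0:` should produce a file comparable to the one the zgrep commands create.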

@bwubb
Author

bwubb commented Feb 2, 2016

gist ok?
https://gist.github.com/bwubb/9c48d81b0d02bc96b48b

or I can attach here or email directly.

chapmanb added a commit to bioconda/bioconda-recipes that referenced this issue Feb 4, 2016
@chapmanb
Member

chapmanb commented Feb 4, 2016

Brad;
Thanks for posting that gist, that helped to reproduce and isolate the problem. It looks like this was a bug in snpEff 4.1l, which has been fixed in the latest release 4.2. I updated the snpEff dependency, which will also require re-preparing your local snpEff databases to match the version change. If you update with:

bcbio_nextgen.py upgrade -u development --tools --data

it should annotate these calls cleanly. Sorry about all the issues and hope this gets your run finished cleanly.

@chapmanb chapmanb closed this as completed Feb 4, 2016
@bwubb
Author

bwubb commented Feb 8, 2016

Perhaps this is an unrelated issue, but I cannot upgrade successfully. First there was an exit code 4 from a bash script, but I was able to run that manually. Now there is a 'perl: symbol lookup error'. I am contacting my HPC admin as well to check if this is their fault somehow.

2016-02-08 12:04:40 (0.00 B/s) - “homo_sapiens_vep_83_GRCh37.tar.gz” saved [5047961760]

Traceback (most recent call last):
  File "/home/bwubb/bcbio-tools/bin/bcbio_nextgen.py", line 207, in <module>
    install.upgrade_bcbio(kwargs["args"])
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/install.py", line 89, in upgrade_bcbio
    upgrade_bcbio_data(args, REMOTES)
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/install.py", line 261, in upgrade_bcbio_data
    _upgrade_vep_data(s["fabricrc_overrides"]["galaxy_home"], tooldir)
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/install.py", line 297, in _upgrade_vep_data
    effects.prep_vep_cache(dbkey, ref_file, tooldir)
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/variation/effects.py", line 102, in prep_vep_cache
    do.run("%s && %s" % (perl_exports, " ".join(cmd)), "Prepare VEP directory for %s" % ensembl_name)
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
    _do_run(cmd, checks, log_stdout)
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; export PATH=/home/bwubb/bcbio-nextgen/anaconda/bin:$PATH && export PERL5LIB=/home/bwubb/bcbio-tools/lib/perl5:$PERL5LIB && /home/bwubb/bcbio-tools/bin/vep_install.pl -a c -s homo_sapiens_vep_83_GRCh37 -c /home/bwubb/bcbio-nextgen/genomes/Hsapiens/GRCh37/vep -u /home/bwubb/bcbio-nextgen/genomes/Hsapiens/GRCh37/vep/homo_sapiens/txtmp
perl: symbol lookup error: /home/bwubb/bcbio-tools/lib/perl5/x86_64-linux-thread-multi/auto/Time/HiRes/HiRes.so: undefined symbol: Perl_Istack_sp_ptr
' returned non-zero exit status 127

@chapmanb
Member

chapmanb commented Feb 9, 2016

Brad;
Sorry about the problems. We've migrated to a threaded version of Perl in bioconda/bcbio and it looks like you might have gotten caught with a non-threaded runtime. Could you try this workaround to see if it fixes it?

/home/bwubb/bcbio-nextgen/anaconda/bin/conda uninstall perl perl-threaded
bcbio_nextgen.py upgrade --tools

If so, I'll push this fix to the install to prevent the problem in the future. Thanks again for all the patience getting this going.

chapmanb added a commit to chapmanb/cloudbiolinux that referenced this issue Feb 10, 2016
Avoids potential conflicts between perl and perl-threaded
(bcbio/bcbio-nextgen#1205).

Add in impute2 to standard bcbio install.
@bwubb
Author

bwubb commented Feb 26, 2016

So it turns out I am still having issues with the OP, or something very similar.

[2016-02-25T18:58Z] snpEff effects : PP236-F1
[2016-02-25T19:02Z] Error: Error while processing VCF entry (line 5166) :
[2016-02-25T19:02Z]     GL000241.1  1   MantaBND:194083:0:1:0:5:0:0 G   ]GL000219.1:59505]G .   MaxDepth    SVTYPE=BND;MATEID=MantaBND:194083:0:1:0:5:0:1;CIPOS=0,10;HOMLEN=10;HOMSEQ=ATCTGAAGAC;BND_DEPTH=101;MATE_BND_DEPTH=247;AC=0;AN=0 PR:SR   0,0:0,0
[2016-02-25T19:02Z] java.lang.RuntimeException: Interval error: end before start.
[2016-02-25T19:02Z]     Class        : Marker
[2016-02-25T19:02Z]     Start        : 0
[2016-02-25T19:02Z]     End          : -1
[2016-02-25T19:02Z]     ID           :
[2016-02-25T19:02Z]     Parent class : Chromosome
[2016-02-25T19:02Z]     Parent       : GL000241.1   0-42151 CHROMOSOME 'GL000241.1'
[2016-02-25T19:02Z] java.lang.RuntimeException: Interval error: end before start.
[2016-02-25T19:02Z]     Class        : Marker
[2016-02-25T19:02Z]     Start        : 0
[2016-02-25T19:02Z]     End          : -1
[2016-02-25T19:02Z]     ID           :
[2016-02-25T19:02Z]     Parent class : Chromosome
[2016-02-25T19:02Z]     Parent       : GL000241.1   0-42151 CHROMOSOME 'GL000241.1'
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.interval.Interval.<init>(Interval.java:38)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.interval.Marker.<init>(Marker.java:37)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.HgvsDna.isDuplication(HgvsDna.java:117)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.HgvsDna.toString(HgvsDna.java:394)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.VariantEffect.getHgvsDna(VariantEffect.java:544)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.vcf.VcfEffect.set(VcfEffect.java:1043)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.vcf.VcfEffect.<init>(VcfEffect.java:147)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.outputFormatter.VcfOutputFormatter.addInfo(VcfOutputFormatter.java:97)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.outputFormatter.VcfOutputFormatter.toString(VcfOutputFormatter.java:285)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.outputFormatter.OutputFormatter.endSection(OutputFormatter.java:112)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.outputFormatter.VcfOutputFormatter.endSection(VcfOutputFormatter.java:229)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.outputFormatter.OutputFormatter.printSection(OutputFormatter.java:145)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.annotate(SnpEffCmdEff.java:278)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.annotateVcf(SnpEffCmdEff.java:446)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.annotate(SnpEffCmdEff.java:138)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:984)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:939)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEff.run(SnpEff.java:978)
[2016-02-25T19:02Z]     at ca.mcgill.mcb.pcingola.snpEffect.commandLine.SnpEff.main(SnpEff.java:136)
[2016-02-25T19:02Z] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
    _do_run(cmd, checks, log_stdout)
  File "/home/bwubb/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /home/bwubb/bcbio-tools/bin/snpEff -Xms750m -Xmx30720m -Djava.io.tmpdir=/home/bwubb/NGS/WGS_PCC-PGL/data/PP236-F1/work/structural/PP236-F1/manta/tx/tmp7FpKxi/tmp eff -dataDir /home/bwubb/bcbio-nextgen/genomes/Hsapiens/GRCh37/snpeff -cancer -noLog -i vcf -o vcf -s /home/bwubb/NGS/WGS_PCC-PGL/data/PP236-F1/work/structural/PP236-F1/manta/tumorSV-PP236-F1-effects-stats.html GRCh37.75 /home/bwubb/NGS/WGS_PCC-PGL/data/PP236-F1/work/structural/PP236-F1/manta/tumorSV-PP236-F1.vcf.gz | /home/bwubb/tabix-0.2.6/bgzip -c > /home/bwubb/NGS/WGS_PCC-PGL/data/PP236-F1/work/structural/PP236-F1/manta/tx/tmp7FpKxi/tumorSV-PP236-F1-effects.vcf.gz

That is how it starts. There are a few of those and then finally an exit code of 255. I have done all of the upgrades and unsets in this thread, and the other issues seem to have been resolved, but this one still remains. Thanks.
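
Both failing records quoted in this thread are breakend (SVTYPE=BND) calls at POS 1 on an unplaced contig. The following is a minimal diagnostic sketch, assuming that combination is what matters (the thread does not confirm this as the root cause). The function name `bnd_at_first_base` is hypothetical. It lists the IDs of such records so they can be checked or set aside before annotation.

```python
def bnd_at_first_base(vcf_lines):
    """Return IDs of records with POS == 1 and SVTYPE=BND in the INFO column.

    Assumption: these two properties, shared by the failing records in this
    thread, are a useful way to find similar calls. Not a confirmed cause.
    """
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            # Not a complete VCF record (needs at least 8 fixed columns).
            continue
        # Parse INFO into a dict; flag entries such as IMPRECISE map to "".
        info = dict(item.partition("=")[::2] for item in fields[7].split(";"))
        if fields[1] == "1" and info.get("SVTYPE") == "BND":
            hits.append(fields[2])
    return hits
```

The function accepts any iterable of lines, so it can be fed an open file handle from `gzip.open(path, "rt")` or a list of strings.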

@chapmanb
Member

Brad;
Sorry about the continued problems. I'd hoped the upgrade to 4.2 would resolve the issues but it looks like there are still other problems. If we can generate a reproducible case again we can report upstream to the snpEff team to resolve. Could you do:

zgrep ^# /home/bwubb/NGS/WGS_PCC-PGL/data/PP236-F1/work/structural/PP236-F1/manta/tumorSV-PP236-F1.vcf.gz > snpeff-problem.vcf
zgrep 'MantaBND:194083:0:1:0:5:0' /home/bwubb/NGS/WGS_PCC-PGL/data/PP236-F1/work/structural/PP236-F1/manta/tumorSV-PP236-F1.vcf.gz >> snpeff-problem.vcf

and post the output as a gist again? We can try to put together a minimal example case from that for reporting. Thanks for the help debugging.

@bwubb
Author

bwubb commented Mar 2, 2016

https://gist.github.com/bwubb/8b425c7a4bba244ea2d7

Much appreciated. Let me know what more I can do, I have a few samples from several projects that have a similar issue so I can always produce more debug data.

@chapmanb
Member

chapmanb commented Mar 7, 2016

Brad;
Thanks for the reproducible case. I was able to isolate the problem to HGVS parsing and temporarily disabled this for structural variant calls in bcbio to avoid the problem. I also reported the problem to Pablo (pcingola/SnpEff#128). If you upgrade to the development version it should work cleanly for you now. Thanks again for all the help and patience.

joemphilips pushed a commit to joemphilips/cloudbiolinux that referenced this issue May 17, 2016
Avoids potential conflicts between perl and perl-threaded
(bcbio/bcbio-nextgen#1205).

Add in impute2 to standard bcbio install.
@rmande

rmande commented Jul 8, 2016

I'm running into the same issue even though I'm running snpEff 4.2.

@chapmanb
Member

chapmanb commented Jul 9, 2016

Sorry about the issue. We're planning to move to snpEff 4.3 with a new release next week so hopefully that will resolve the underlying issue. If you want to fix it sooner you could do:

bcbio_conda install -c bioconda snpeff=4.3
bcbio_nextgen.py upgrade --data

which will move to snpEff 4.3 and then update the snpEff database files to match. Hope that fixes the problem for you.

@rmande

rmande commented Aug 19, 2016

So is snpEff 4.3 available? I downloaded the latest SnpEff and tried annotating again, but I got the same error message. I could email my vcf to you and you could take a look at it; maybe that would help solve the problem. Please let me know.

@rmande

rmande commented Aug 19, 2016

Brad- I will email the vcf to the email listed on your GitHub profile. Thanks again for your help.

@chapmanb
Member

Rohit -- we do install snpEff 4.3 with bcbio now which has resolved the majority of these errors in all of our testing. If you do:

snpEff -version

does it report 4.3? If so, we can dig more into the cause of your issue with your VCF and try to come up with a reproducible error case. My suggestion would be to minimize your VCF to the lines that are breaking snpEff based on the error message and then post as a gist (http://gist.github.com/). Hope this helps.
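
One way to find the lines to keep when minimizing the VCF is to read them out of the snpEff log. The sketch below assumes the log layout shown earlier in this thread: an `Error while processing VCF entry (line N)` message followed immediately by the offending record, each optionally prefixed with a bracketed timestamp. The function name `failing_records` is hypothetical.

```python
import re

# Matches the snpEff message that announces a failing VCF line number.
ENTRY_RE = re.compile(r"Error while processing VCF entry \(line (\d+)\)")
# Matches a leading "[timestamp] " prefix as added by the bcbio log.
PREFIX_RE = re.compile(r"^\[[^\]]*\]\s*")


def failing_records(log_lines):
    """Return (vcf_line_number, record_id) pairs reported in a snpEff log.

    Assumption: the record is printed on the line right after the error
    message, as in the logs quoted in this thread.
    """
    lines = list(log_lines)
    found = []
    for i, line in enumerate(lines[:-1]):
        match = ENTRY_RE.search(line)
        if not match:
            continue
        # Strip the timestamp, then split on whitespace; ID is column 3.
        record = PREFIX_RE.sub("", lines[i + 1]).split()
        if len(record) >= 3:
            found.append((int(match.group(1)), record[2]))
    return found
```

The returned IDs can then be used as search patterns to pull the matching records, plus the header, into a small VCF for a gist.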
