Scalpel InDel calling support #428

Closed
mjafin opened this issue May 22, 2014 · 65 comments
Comments

@mjafin
Contributor

mjafin commented May 22, 2014

Looks like vcf support has been added to Scalpel recently: http://sourceforge.net/p/scalpel/code/ci/master/tree/

Opening this ticket while I'm looking into testing Scalpel and integrating it within bcbio, bear with me

@lbeltrame
Contributor

Quite interesting. It looks like it might be another good candidate for somatic variant calling.

@mjafin
Contributor Author

mjafin commented May 22, 2014

My thoughts exactly. Could implement it as a separate tool or bunch it with MuTect.

@lbeltrame
Contributor

Given that MuTect doesn't handle indels and Scalpel fits exactly that niche, IMO it looks sensible to bundle both together.

Brad, what do you think about it?

@chapmanb
Member

Miika;
Thanks for heading this up, Scalpel support would be very welcome. Please let me know how I can help as you get into it.

A focus on somatic calling to start with makes a lot of sense, but it would be great if you could make this general so we could inject it into different tools as an indel supplement. A lot of somatic callers focus on SNPs so I'd imagine it would be generally useful (as long as it benchmarks well). Thanks again.

@mjafin
Contributor Author

mjafin commented May 22, 2014

cool, I'll look into this in the next few weeks

@mjafin
Contributor Author

mjafin commented May 25, 2014

OK so I've started work on Scalpel integration here:
https://github.com/mjafin/bcbio-nextgen/commit/3d6ff8172c6ae4779e70725d4ad3331738a272f9

I reckon we can make calls to scalpel.py from other callers like mutect.py fairly easily and combine the variants as I'm currently doing for mutect+SID.

I've logged a few tickets about Scalpel here https://sourceforge.net/p/scalpel/bugs/3/, https://sourceforge.net/p/scalpel/tickets/2/ and https://sourceforge.net/p/scalpel/tickets/1/ but in the meantime I fixed some of the show stoppers myself in https://github.com/mjafin/scalpel

I've also been testing Scalpel on real data, and in single sample mode it catches indels well. However, in somatic mode it always produces empty vcf output. My test cases are very obvious indels that are clearly there in the tumour sample and not in the normal. I've checked coverage is good in both, and in single sample mode Scalpel produces the indel for the tumour sample OK. I'll see if I can spot something obvious in the code.

@chapmanb
Member

Miika -- this is great, thank you for doing this work. Definitely keep us up to date on the bugs and tumor/normal results. I'll be happy to integrate this and scalpel installation whenever you think things are in a useful state. I'm excited to test this out.

@mjafin
Contributor Author

mjafin commented May 26, 2014

@chapmanb looking to get this working in the next 24h or so. Is there any chance you could introduce an insertion and deletion in the cancer test data? I think it only currently has SNPs (maybe I'm wrong).

@chapmanb
Member

Miika;
Good idea. I've been wanting to update this with data from the ICGC-TCGA Dream challenge. Synthetic dataset 3 looks like it would be good for our purposes:

https://www.synapse.org/#!Synapse:syn312572/wiki/62018

I'll download and give this a go on subsetting for a more useful test case. Unfortunately I don't believe they've released a truth VCF but we could bug them for this now that the challenge is complete.

@lbeltrame
Contributor

On Monday 26 May 2014 11:32:11, Brad Chapman wrote:

I'll download and give this a go on subsetting for a more useful test case.
Unfortunately I don't believe they've released a truth VCF but we could bug
them for this now that the challenge is complete.

Agreed! The more test datasets we have for this, the better.
Miika: do you think it would be OK for me to pull from your github clone, or
should I wait? I have a few datasets I could use to test it with.

@mjafin
Contributor Author

mjafin commented May 26, 2014

@chapmanb @lbeltrame Great ideas both - I know one of my colleagues is partaking in the Dream challenge and has some data but I don't know if I'm allowed to pass it on. Let me know what you think.

@lbeltrame would be awesome if you could test it out! You need to clone https://github.com/mjafin/scalpel and run make in the root dir, then set the root dir into your PATH. In terms of bcbio you'd need to pull this https://github.com/mjafin/bcbio-nextgen/tree/scalpel. I'll do a pull request with bcbio master soon, but there are some rough corners still:

  • In mutect, it always uses scalpel and ignores SomaticIndelDetector. I want to make this slightly better somehow
  • There's no way of calling scalpel independently as of yet; scalpel only works as part of mutect (either in single or paired mode though)

If you see any bugs lemme know!

Edit:

  • For scalpel to work it needs to be able to see samtools in path. Even though this hack speeds scalpel up quite a bit (avoids loading full genome into memory for every call), it's still somewhat slow compared to the other tools

Edit2:

  • In paired mode scalpel has this validation step, which expands the search radius in terms of bp around the putative InDels. I noticed this effectively leads to missing InDels in short exons, such as the EGFR exon 19 deletion (that AZ has a vested interest in). I therefore pick the InDels from the pre-validation step.

@mjafin
Contributor Author

mjafin commented May 26, 2014

Oh, and scalpel supports min_allele_fraction.

@mjafin
Contributor Author

mjafin commented May 26, 2014

So the way I'm pulling the scalpel calls together is by combining inferred somatic and common InDels, and setting the FILTER field for the common InDels to REJECT (just made a commit here https://github.com/mjafin/bcbio-nextgen/commit/3b8aea914e0700bad7c3b67e4ea9199fa7ba5d32). Another way would be to have an INFO field value for somatic or common I suppose. In tumor-only mode I'm just reporting all discovered indels (obviously not optimal).
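
As an illustration only, the combining step described above could look roughly like the sketch below. It is not the code from the linked commit: the function name `merge_indel_calls` and the example records are hypothetical, and it assumes plain tab-separated VCF data lines with FILTER in the seventh column.

```python
def merge_indel_calls(somatic, common):
    """Merge somatic and common indel records, marking the common ones REJECT.

    Both arguments are lists of tab-separated VCF data lines (no headers).
    """
    merged = []
    for line in somatic:
        merged.append(line.rstrip("\n").split("\t"))
    for line in common:
        fields = line.rstrip("\n").split("\t")
        fields[6] = "REJECT"  # column 7 of a VCF data line is FILTER
        merged.append(fields)
    # Order by chromosome name, then by numeric position
    merged.sort(key=lambda f: (f[0], int(f[1])))
    return ["\t".join(f) for f in merged]


# Hypothetical example records
somatic = ["22\t200\t.\tAT\tA\t.\tPASS\t."]
common = ["22\t100\t.\tG\tGC\t.\tPASS\t."]
result = merge_indel_calls(somatic, common)
```

Keeping the common indels in the output with a REJECT filter, rather than dropping them, leaves them available for later inspection.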

Anyways, @chapmanb would you like me to do a pull request of my scalpel branch or would you be able to run the basic tests by cloning my scalpel branch?

chapmanb added a commit that referenced this issue May 27, 2014
@chapmanb
Member

Miika;
Thanks for all this. Understood about the rough edges, but I'm happy to have the basic support merged in so long as it doesn't break existing pipelines. This should make it easier for others to test and get us to a useful set of calls.

I added better tumor/normal test cases using chr22 data from the ICGC-TCGA dream challenge BAMs that includes indels. I'm not sure if it has a somatic specific indel (VarScan and FreeBayes disagree on one potential) but it definitely has more complex regions for assessment so can hopefully test Scalpel better.

Thanks again for all the work.

@lbeltrame
Contributor

somatic and common InDels, and setting the FILTER field for the common

How do you define "common"?

199fa7ba5d32). Another way would be to have an INFO field value for somatic

I think you should add SOMATIC to putative positive somatic indels. See my
code in freebayes.py to get an idea on how to do so. The reasoning is that by
doing so, we'll get also these calls flagged correctly in GEMINI if we use it,
and that makes selecting somatic variants trivial.

(VarScan 2 needs more adjustments as I'd love to flag calls that are germline,
because IIRC it adds SOMATIC also to those - but I'll get to look at it
eventually).
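
A minimal sketch of the SOMATIC flag idea, for illustration: this is not the code in freebayes.py, and the helper name `flag_somatic` and the header description text are assumptions. It appends a `SOMATIC` flag to the INFO column (column 8) of a VCF data line.

```python
# Header line that would accompany the flag; the description text is illustrative
SOMATIC_HEADER = '##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Somatic event">'


def flag_somatic(vcf_line):
    """Return the VCF data line with a SOMATIC flag added to INFO (column 8)."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]
    if info in (".", ""):
        fields[7] = "SOMATIC"  # INFO was empty, so the flag becomes the only entry
    elif "SOMATIC" not in info.split(";"):
        fields[7] = info + ";SOMATIC"
    return "\t".join(fields)


flagged = flag_somatic("22\t100\t.\tG\tGC\t.\tPASS\tDP=30")
```

The check against existing entries keeps the function idempotent, so running it twice does not duplicate the flag.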

@mjafin
Contributor Author

mjafin commented May 27, 2014

@lbeltrame By common I mean InDels that are both in the tumour and normal, and have the same inferred zygosity.
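
To make that definition concrete, here is a rough sketch (not the actual implementation). The function name `split_somatic_common` and the dictionary layout are hypothetical. It also assumes that an indel seen in both samples with different zygosity is treated as somatic, which the definition above does not specify.

```python
def split_somatic_common(tumor, normal):
    """Split tumour indels into (somatic, common) lists.

    tumor and normal map (chrom, pos, ref, alt) to an inferred zygosity string.
    """
    somatic, common = [], []
    for key, zygosity in sorted(tumor.items()):
        if normal.get(key) == zygosity:
            common.append(key)  # present in both with the same zygosity
        else:
            somatic.append(key)  # tumour-only, or zygosity differs (assumption)
    return somatic, common


# Hypothetical example calls
tumor = {("22", 100, "G", "GC"): "het", ("22", 200, "AT", "A"): "het"}
normal = {("22", 100, "G", "GC"): "het"}
somatic_calls, common_calls = split_somatic_common(tumor, normal)
```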

@chapmanb Big thanks - how do I trigger bcbio to download/run the chr22 test data?

@lbeltrame
Contributor

@lbeltrame By common I mean InDels that are both in the tumour and normal,
and have the same inferred zygosity.

Ok, thanks for the explanation: indeed marking REJECT would be probably the
best choice at the moment.

I'm hoping to get scalpel to run on my data later this week, no promises though (have to attend a conference the next weekend).

@mjafin
Contributor Author

mjafin commented May 27, 2014

@lbeltrame I'll look into the SOMATIC info field when I get the chance, thanks for the pointer.

I ran FreeBayes previously for the ERP002442 study sample 10-497-T. However, for whatever reason, the VT (somatic, LOH, germline) tag is only present in 2 out of the 600 indels. Scalpel is identifying 300 indels it thinks are somatic.

@mjafin mjafin mentioned this issue May 27, 2014
@lbeltrame
Contributor

I ran FreeBayes previously for the ERP002442 study sample 10-497-T. However,
for whatever reason, the VT (somatic, LOH, germline) tag is only present in
2 out of the 600 indels. Scalpel is identifying 300 indels it thinks are

Are you running FreeBayes using bcbio-nextgen? Look for SOMATIC tags if so, I
made the code strip the original VT tags.

@mjafin
Contributor Author

mjafin commented May 27, 2014

Are you running FreeBayes using bcbio-nextgen? Look for SOMATIC tags if so, I
made the code strip the original VT tags.

Ah, I must be looking at a run finished quite some time ago then. Will need to rerun FreeBayes. Or better yet, identify a proper dataset like the DREAM one.

@chapmanb
Member

Miika;
The chr22 subset is now part of the test data so will download automatically if you do:

cd bcbio-nextgen/tests
run_tests.sh cancer

It'll be in tests/data/tcga_benchmark.

@mjafin
Contributor Author

mjafin commented May 27, 2014

Thanks @chapmanb, the output definitely contains (scalpel) indels now!

@mjafin
Contributor Author

mjafin commented Jun 4, 2014

Closing, as per #432, thanks!

@mjafin mjafin closed this as completed Jun 4, 2014
@mjafin
Contributor Author

mjafin commented Jun 4, 2014

Actually, I don't think scalpel is producing any InDels at all. Looking at the logs:

sh: /apps/bcbio-nextgen/latest-devel/rhel6-x64/Cellar/scalpel/0.1.1-beta-2014-05-27/Microassembler/Microassembler: No such file or directory

Looks like the Microassembler part hasn't compiled.

@mjafin mjafin reopened this Jun 4, 2014
@mjafin
Contributor Author

mjafin commented Jun 5, 2014

As an aside, I ran the NA12878 exome benchmark and Scalpel isn't doing too well (even when it's working! Used my own compiled version). HaplotypeCaller is catching some 6400 indels, FreeBayes around 6k and scalpel 3600 indels using default settings. Disappointing, to be honest.

@lbeltrame
Contributor

HaplotypeCaller is catching some 6400 indels, FreeBayes around 6k and
scalpel 3600 indels using default settings. Disappointing, to be honest.

Did you have any go at somatic indels, e.g. Scalpel compared to the Somatic
Indel Detector?

@lbeltrame
Contributor

This is slightly off-topic, but I'm getting these errors with VarScan:
..but only when running in the queue. Seems to be running fine in local
single core mode. Odd.

Hm, fishy. It looks however like some other code path. VarScan 2 ran fine last
time I did it, but I admit I didn't have time to keep up with bcbio-nextgen's
development (what you get for being almost the only bioinfo person in the
whole institution...).

@mjafin
Contributor Author

mjafin commented Jun 5, 2014

Well, varscan is officially also pants :)
(Worse than SID in InDels)

caller  SNP/TP  InD/TP  SNP/FN InD/FN  SNP/FP   InD/FP
varscan 837     407     21     461     4256     930
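
For reference, the counts in the table translate into sensitivity and precision as follows. This is a small worked calculation, assuming the usual definitions TP/(TP+FN) and TP/(TP+FP); the variable names are illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of true variants that were called."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of calls that are true variants."""
    return tp / (tp + fp)


# Counts taken from the VarScan row above
varscan = {
    "SNP": {"tp": 837, "fn": 21, "fp": 4256},
    "InDel": {"tp": 407, "fn": 461, "fp": 930},
}

summary = {
    kind: (
        round(sensitivity(c["tp"], c["fn"]), 3),
        round(precision(c["tp"], c["fp"]), 3),
    )
    for kind, c in varscan.items()
}
# summary == {"SNP": (0.976, 0.164), "InDel": (0.469, 0.304)}
```

So SNP sensitivity is high but most SNP calls are false positives, while for indels fewer than half of the true events are recovered.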

@lbeltrame
Contributor

I assume this includes the strand bias filter? (If you ran it through bcbio-nextgen then yes, it was included.)
Grim results indeed...

@chapmanb
Member

chapmanb commented Jun 5, 2014

Miika;
Thanks for this. That matches what I found for VarScan in germline calling, although I was holding out hope that paired calling might be better:

http://bcbio.wordpress.com/2013/02/06/an-automated-ensemble-method-for-combining-and-evaluating-genomic-variants-from-multiple-callers/

From these results it seems like the best calling approach right out of the box is MuTect plus an indel caller. If we can tweak and improve scalpel that would be ideal. Pindel is also an option. I also have faith we can tweak FreeBayes calls to improve them, based on success with germline calling. It would be ideal to have a couple of choices, and I'd also like to have a non-licensed option.

Thanks again, that's a huge help to focus future effort. Much appreciated.

@mjafin
Contributor Author

mjafin commented Jun 5, 2014

@chapmanb Which LCR bed file would you recommend pulling? (I'm not using bcbio to manage my genomes)

@chapmanb
Member

chapmanb commented Jun 5, 2014

Miika;
Here is the download URL and post-processing steps to get a bgzipped indexed BED file. This also includes grep magic to convert to hg19 if you prefer that over GRCh37:

https://github.com/chapmanb/cloudbiolinux/blob/master/cloudbio/biodata/dbsnp.py#L170

@lbeltrame
Contributor

I wonder if, given these results, we can split out some bullet points of things that can be done from here? So we can, potentially, split the work ahead.

@mjafin
Contributor Author

mjafin commented Jun 5, 2014

@chapmanb Thanks I'll give that a try!

@lbeltrame

  • REJECT non-somatic variants in varscan/FreeBayes(/SID)
  • Try and improve FreeBayes false positives
  • Ensemble calling
  • ...

The chr19 Dream data is synthetic so we could possibly use that as a basic benchmark data set. I took the BAM files, used samtools to extract the reads into fastq (discarding singletons) and started from there.

@mjafin
Contributor Author

mjafin commented Jun 5, 2014

Just a quick update, excluding the low complexity regions reduced the false positive InDel calls for Scalpel (1058) and VarDict (717) to around 350. Although I think Scalpel was advertised in the paper to be immune to repetitive regions..?
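
For anyone wanting to reproduce the LCR exclusion outside bcbio, a rough Python sketch follows. The function names are hypothetical, it assumes the BED intervals do not overlap, and it only tests the variant start position, so an indel that begins just before a region is kept.

```python
import bisect


def load_bed(lines):
    """Parse BED lines into {chrom: sorted list of (start, end)}."""
    regions = {}
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        regions.setdefault(chrom, []).append((int(start), int(end)))
    for intervals in regions.values():
        intervals.sort()
    return regions


def in_regions(regions, chrom, pos):
    """True if a 1-based VCF position lies in a 0-based, half-open BED interval."""
    intervals = regions.get(chrom, [])
    zero_based = pos - 1
    # Index of the last interval whose start is <= the position
    idx = bisect.bisect_right(intervals, (zero_based, float("inf"))) - 1
    return idx >= 0 and intervals[idx][0] <= zero_based < intervals[idx][1]


def drop_lcr_calls(vcf_lines, regions):
    """Keep header lines and calls whose start position is outside the regions."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):
            kept.append(line)
            continue
        chrom, pos = line.split("\t")[:2]
        if not in_regions(regions, chrom, int(pos)):
            kept.append(line)
    return kept


# Hypothetical example: one LCR interval and two calls
lcr = load_bed(["22\t10\t20\n"])
calls = ["##fileformat=VCFv4.1", "22\t11\t.\tAT\tA\t.\tPASS\t.", "22\t21\t.\tG\tGC\t.\tPASS\t."]
filtered = drop_lcr_calls(calls, lcr)
```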

@mjafin
Contributor Author

mjafin commented Jun 6, 2014

OK, another quick update: with the latest fixes to FreeBayes and excluding the low complexity regions, its indel false positive count has dropped to 185! The SNP FP count is still relatively high but it's something to work from.

I'll close the ticket for now.

@mjafin mjafin closed this as completed Jun 6, 2014
chapmanb added a commit that referenced this issue Jun 9, 2014
…428). This check already handled as part of VarScan calling. Move test to samtools, where it is needed. This will ened reworking/testing with samtools 0.2.0 calling
@chapmanb
Member

chapmanb commented Jun 9, 2014

Miika;
Thanks for the LCR update. It's great that removing them helped avoid issues. My takeaway from looking at them is that they should either be removed or annotated since they're going to be a large cause of false positives, both from calling and comparison. Longer term, incorporating lobSTR (http://lobstr.teamerlich.org/), or further evaluating scalpel or other realigners in these regions, might let us do better, but this is a bigger research project.

I also fixed the VarScan scaling issue you reported to hopefully help with future evaluations at scale. Thanks again.

kern3020 pushed a commit to kern3020/bcbio-nextgen that referenced this issue Jun 10, 2014
…cbio#428). This check already handled as part of VarScan calling. Move test to samtools, where it is needed. This will ened reworking/testing with samtools 0.2.0 calling
@chapmanb
Member

Miika;
There is a new pre-print from the Cold Spring Harbor folks about reducing false positive indel calls, primarily using Scalpel. Beyond avoiding LCRs, especially A/T repeats, and protocol suggestions they have some filters for Scalpel high quality variants based on CHI2 and ALTCOV (Materials and Methods in 'Classifications of INDEL with calling quality based on the validation data of sample K8101' ):

http://biorxiv.org/content/biorxiv/early/2014/06/10/006148.full.pdf

The caller might already apply some of these as part of the standard process, but it's an interesting read in addition to the practical stuff. The 60x coverage for WGS indel detection is a good reference to have.

@mjafin
Contributor Author

mjafin commented Jun 11, 2014

Thanks Brad, will definitely give that a thorough read! I think their focus is on lower coverage data and I suspect (based on personal communication with the authors) that high(er) coverage regions can make the microassembler not converge. One of the authors' suggestions was to downsample the data, but I don't know how practical that would be.

In any case, I'm worried that with such a low sensitivity for the NA12878 sample (only 3k+) it really doesn't matter what the specificity is - the sensitivity is unacceptable. If you have the chance to run scalpel for any NA12878 it would be great, just to be sure I haven't botched something..

@mjafin
Contributor Author

mjafin commented Jun 12, 2014

Just another quick update, I managed to run ensemble calling from bcbio.variation on the combo of MuTect (SNPs only), FreeBayes and VarDict. I had to manually remove any variants that were REJECTed and also the 'normal' sample column.

The SNP TP rate was up there with FreeBayes (857 vs. 856) and the FP rate was marginally higher than that of MuTect (102->106). For InDels, there were only 707 TP but only 14 FP!

Edit. I'll need to rerun all of this, with all the latest updates to all the tools, at some point and update the tables.

@lbeltrame
Contributor

The SNP TP rate was up there with FreeBayes (857 vs. 856) and the FP rate
was marginally higher than that of MuTect (102->106). For InDels, there
were only 707 TP but only 14 FP!

Impressive results! Can you share the parameters you used for bcbio.variation?
My biggest issue with it was that I was not sure what to use.

(And Brad, I wonder if all the findings / results from these investigations
could land in a blog post / guide for somatic calls)

@mjafin
Contributor Author

mjafin commented Jun 12, 2014

@lbeltrame I used pretty standard settings:

ensemble:
  classifier-params:
    type: svm
  classifiers:
    balance:
    - AD
    - FS
    - Entropy
    calling:
    - ReadPosEndDist
    - PL
    - PLratio
    - Entropy
    - NBQ
  format-filters:
  - DP < 4
  trusted-pct: 0.5
intervals: /ngs/oncology/analysis/external/icgc/dream/chr19_test/tumor-paired/work/align/tumor/2_2014-06-04_tumor-paired-sort-callable.bed
names:
- freebayes
- mutect
- vardict
prep-inputs: false

However I suspect the output indels might be the intersect set of freebayes and vardict as I don't think any of the above annotation will be in the calls.. but then again I'm still trying to understand how the ensemble calling actually works.

This is still very much 'live' work and requires quite a bit of manual intervention (plus I don't know where the chr19 DREAM data could be pulled from without username/password). Our guys also haven't made the vardict code publicly available yet. I'm happy to write something up though once things settle down a bit.

@lbeltrame
Contributor

On Thursday 12 June 2014 07:14:39, Miika Ahdesmaki wrote:

code publicly available yet. I'm happy to write something up though once
things settle down a bit.

Thanks. It goes without saying, but of course I'm going to help on that too.

While I don't have many data sets available, we're starting to do some
validations on varying allelic fractions data sets and we plan on validating
the low fraction variants with droplet digital PCR, so we could
(technically) be able to detect mutations in the 1% range (of course, with
impure samples like tumors it's not that easy...).

To add to the discussion and provide some data from my experience on targeted
resequencing:

  • So far MuTect is the best on the SNP calling front even for targeted
    resequencing: most of our successful validations come from it;
  • VarScan suffers from insane strand bias as I've mentioned, and I've seen it
    call SNPs as somatic while they were in fact germline (pyrograms don't lie ;)
    and only then I realized that the same locus had been REJECTed by MuTect
  • For indels, we're doing some rounds of validation soon at varying fractions,
    so I'll make sure to let you know how it goes
  • You can actually validate mutations called with fractions lower than 5%,
    however you need highly sensitive methods and be prepared to deal with a lot
    of noise
  • Predicted fractions less than 10% are hard to validate even with
    pyrosequencing

We've validated a MuTect called SNP with digital PCR with a predicted fraction
of 3% (observed ~1%) and approximately 1000X in coverage (although with a lot
of effort). VarScan has by default a lower limit of 10% IIRC.

@chapmanb
Member

Luca and Miika;
Thanks for all this. This is great progress. I hope after we finish this structural variant calling work we'll have more time to help push this forward as well.

It's great to hear about the Ensemble results, and I suspect it's probably what Miika suggested: the variants called by two reliable callers. My longer term thinking for Ensemble calling is to reduce it to something simpler and rely on these types of heuristics. The SVM can recover a small fraction of additional variants but I'm not sure all the time spent and tuning is worth the gain. A simpler approach should be able to speed things up and make the overall process much cleaner. Thanks again.
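
The "called by two reliable callers" heuristic can be sketched in a few lines of Python. This is an illustration of the idea only, not how bcbio.variation implements Ensemble calling; the function name `consensus_calls`, the `min_callers` parameter and the example calls are hypothetical.

```python
from collections import Counter


def consensus_calls(callsets, min_callers=2):
    """Return variants reported by at least min_callers callers.

    callsets maps a caller name to an iterable of (chrom, pos, ref, alt) tuples.
    """
    support = Counter()
    for calls in callsets.values():
        support.update(set(calls))  # set() so each caller votes once per variant
    return sorted(key for key, n in support.items() if n >= min_callers)


# Hypothetical example calls
callsets = {
    "freebayes": [("19", 100, "G", "GC"), ("19", 300, "AT", "A")],
    "vardict": [("19", 100, "G", "GC"), ("19", 500, "C", "CTT")],
    "mutect": [("19", 700, "A", "T")],
}
agreed = consensus_calls(callsets)
```

With these inputs only the indel at position 100 is kept, since it is the only variant seen by two callers.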

@mjafin
Contributor Author

mjafin commented Jun 17, 2014

There is a new pre-print from the Cold Spring Harbor folks about reducing false positive indel calls, primarily using Scalpel. Beyond avoiding LCRs, especially A/T repeats, and protocol suggestions they have some filters for Scalpel high quality variants based on CHI2 and ALTCOV (Materials and Methods in 'Classifications of INDEL with calling quality based on the validation data of sample K8101' )

We could possibly filter variants if CHI2 is higher than 10.8 as per the manuscript - I saw some performance improvement from this in the Dream data (but could be just overfitting). ALTCOV within the Scalpel vcf files is a bit of a misnomer, as it actually refers to the coverage of non-REF, non-ALT alternative alleles (third parties if you will).
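
A hedged sketch of such a filter is below. It assumes CHI2 is exposed as a `CHI2=<value>` entry in the INFO column of the Scalpel VCF, and the helper names `parse_info` and `passes_chi2` are hypothetical. Calls without a CHI2 value are kept.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict of its key=value entries."""
    parsed = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            parsed[key] = value
    return parsed


def passes_chi2(vcf_line, max_chi2=10.8):
    """True unless the call has a CHI2 score higher than max_chi2."""
    info = parse_info(vcf_line.rstrip("\n").split("\t")[7])
    if "CHI2" not in info:
        return True  # no score to judge by, so keep the call
    return float(info["CHI2"]) <= max_chi2


# Hypothetical example records
good = "19\t100\t.\tG\tGC\t.\tPASS\tCHI2=2.5;ALTCOV=0"
bad = "19\t300\t.\tAT\tA\t.\tPASS\tCHI2=15.1;ALTCOV=3"
kept = [line for line in (good, bad) if passes_chi2(line)]
```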

@chapmanb
Member

chapmanb commented Sep 1, 2014

To follow up on Miika's initial evaluation work we now have an automated whole genome evaluation for cancer data using DREAM challenge synthetic dataset 3 (https://www.synapse.org/#!Synapse:syn312572/wiki/62018):

https://github.com/chapmanb/bcbio-nextgen/blob/master/config/examples/cancer-dream-syn3-getdata.sh
https://github.com/chapmanb/bcbio-nextgen/blob/master/config/examples/cancer-dream-syn3.yaml

Here are results for MuTect, VarScan, VarDict and FreeBayes with the current development version:

[Figure: comparison of MuTect, VarScan, VarDict and FreeBayes results]

This confirms all the observations above and gives us a practical dataset we can iterate and improve on. The plans are to try and improve filtering to help with false positives then begin testing Ensemble calling and other approaches to get a high quality final callset. It's exciting to see cancer calling continue to improve -- thanks to everyone for their help so far with this.

@lbeltrame
Contributor

Absolutely terrific! Thanks to everyone involved!


Luca Beltrame, Ph.D.
Translational Genomics Unit
Department of Oncology
Istituto di Ricerche Farmacologiche "Mario Negri" IRCCS

@zhaoming159753

Hi mjafin ,
I have a question: can Scalpel be used on WGS (not WES) tumor/normal samples to detect somatic indels?

@chapmanb
Member

chapmanb commented Oct 9, 2015

It does work on WGS samples as well, but is quite slow since Scalpel is primarily developed on exomes. We use it in cases where we need indels, but also recommend the VarDict variant caller included in bcbio which calls indels as well as Scalpel and runs much quicker. Hope this helps.

@lbeltrame
Contributor

lbeltrame commented Oct 9, 2015 via email

@chapmanb
Member

chapmanb commented Oct 9, 2015

Luca -- VarDict calls indels as part of the standard calling. You don't use it as indelcaller but as its own separate caller under variantcaller. We're still working on a post with the latest filtering results, but it does as well as MuTect on SNPs and equivalent or better than Scalpel on indels, so it is a good all-around caller:

https://github.com/bcbio/bcbio.github.io/blob/master/_posts/2015-10-05-vardict-filtering.md
