Flexible Genotyping of Polyploids using Next Generation Sequencing Data
Switch branches/tags
Nothing to show
Clone or download

README.md

updog

Build Status Appveyor Build status codecov License: GPL v3 CRAN_Status_Badge

Updog provides a suite of methods for genotyping polyploids from next-generation sequencing (NGS) data. It does this while accounting for many common features of NGS data: allelic bias, overdispersion, sequencing error, and (possibly) outlying observations. It is named updog for “Using Parental Data for Offspring Genotyping” because we originally developed the method for full-sib populations, but it works now for more general populations. The method is described in detail here and here.

The main function is flexdog, which provides many options for the distribution of the genotypes in your sample.

Also provided are:

  • An experimental function mupdog, which allows for correlation between the individuals’ genotypes while jointly estimating the genotypes of the individuals at all provided SNPs. The implementation uses a variational approximation. This is designed for samples where the individuals share a complex relatedness structure (e.g. siblings, cousins, uncles, half-siblings, etc). Right now there are no guarantees about this function’s performance.
  • Functions to simulate genotypes (rgeno) and read-counts (rflexdog). These support all of the models available in flexdog.
  • Functions to evaluate oracle genotyping performance: oracle_joint, oracle_mis, oracle_mis_vec, and oracle_cor. We mean “oracle” in the sense that we assume that the entire data generation process is known (i.e. the genotype distribution, sequencing error rate, allelic bias, and overdispersion are all known). These are good approximations when there are a lot of individuals (but not necessarily large read-depth).

The original updog package is now named updogAlpha and may be found here.

See also ebg, fitPoly, and TET, and polyRAD. Our best “competitor” is probably fitPoly.

See NEWS for the latest updates on the package.

Vignettes

I’ve included many vignettes in updog, which you can access online here.

Bug Reports

If you find a bug or want an enhancement, please submit an issue here.

Installation

You can install updog from CRAN in the usual way:

install.packages("updog")

You can install the current (unstable) version of updog from Github with:

# install.packages("devtools")
devtools::install_github("dcgerard/updog")

CVXR

If you want to use the use_cvxr = TRUE option in flexdog (not generally recommended), you will need to install the CVXR package. Before I could install CVXR in Ubuntu, I had to run in the terminal

sudo apt-get install libmpfr-dev

and then run in R

install.packages("Rmpfr")

How to Cite

Please cite

Gerard, D., Ferrão, L. F. V., Garcia, A. A. F., & Stephens, M. (2018). Genotyping Polyploids from Messy Sequencing Data. Genetics, 210(3), 789-807. doi: 10.1534/genetics.118.301468.

Or, using BibTex:

@article {gerard2018genotyping,
    author = {Gerard, David and Ferr{\~a}o, Luis Felipe Ventorim and Garcia, Antonio Augusto Franco and Stephens, Matthew},
    title = {Genotyping Polyploids from Messy Sequencing Data},
    volume = {210},
    number = {3},
    pages = {789--807},
    year = {2018},
    doi = {10.1534/genetics.118.301468},
    publisher = {Genetics},
    issn = {0016-6731},
    URL = {https://doi.org/10.1534/genetics.118.301468},
    journal = {Genetics}
}

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.