P2C2M.SNAPP uses posterior predictive checks to identify violations of the multispecies coalescent model as implemented in the 'SNAPP' phylogeny estimation program. It was designed to be as accurate and user-friendly as possible.
To install P2C2M.SNAPP, first download the gzipped tarball (.tar.gz) and then install it from source in R with:
install.packages("path/to/P2C2M.SNAPP_1.0.0.tar.gz", repos = NULL, type = "source")
P2C2M.SNAPP requires R (>= 3.5.0) and the following R packages: ape (>= 5.3), ggplot2 (>= 3.2.0), graphics (>= 3.5.0), grDevices (>= 3.5.0), gsubfn (>= 0.7), ggtree (>= 1.14.6; not on CRAN, see below), KRIS (>= 1.1.1), stats (>= 3.5.0), utils (>= 3.5.0)
The easiest way to install ggtree is with code from the Bioconductor website:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ggtree")
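Before installing P2C2M.SNAPP itself, it can help to confirm that the non-base packages listed above are already present. The following is a minimal sketch using only base R; the variable names (`required`, `installed`, `missing_pkgs`) are illustrative and not part of the package, and the check does not verify the minimum versions.

```r
# Non-base packages from the dependency list above
# (graphics, grDevices, stats, and utils ship with R itself).
required <- c("ape", "ggplot2", "gsubfn", "ggtree", "KRIS")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs to be installed.
missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed dependencies are installed.")
}
```

Any package reported as missing can then be installed from CRAN with `install.packages()`, or from Bioconductor in the case of ggtree.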
fastsimcoal2 (Excoffier et al., 2013; available at [http://cmpg.unibe.ch/software/fastsimcoal2/]) is also required. See the fastsimcoal2 website if you encounter errors, as they may be due to the version of gcc on your computer. If you see a dyld error, it can be fixed by using the code at https://gist.github.com/jonchang/46f24dd460bab840ba69a24190fe11f8. Thanks to Jonathan Chang and Tara Pelletier for figuring out how to fix this issue.
Mac users need to have XQuartz (available at [https://www.xquartz.org]) installed for fastsimcoal2 to run.
More installation information can be found in the tutorial or vignette.
The P2C2M.SNAPP package contains a single function, run.p2c2m.snapp. It requires as input the tree, log, and .xml files from a 'SNAPP' analysis as well as a simple metadata text file. With run_mode = 1, P2C2M.SNAPP will extract parameter values from the input files and simulate posterior predictive datasets. Users must then analyze these posterior predictive datasets with 'SNAPP'. We recommend performing these analyses with a computer cluster, as 'SNAPP' is quite computationally demanding. After analyzing the posterior predictive datasets with 'SNAPP', using run_mode = 2 will calculate and compare summary statistics between the empirical and the posterior predictive datasets to identify possible model violations in the empirical dataset. A full tutorial is available from the P2C2M GitHub page [https://github.com/P2C2M].
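The comparison made in the second step can be illustrated with a generic posterior predictive check. The sketch below is not the package's implementation: the numbers are placeholders rather than SNAPP output, the variable names are hypothetical, and both the two-tailed p-value and the 0.05 cutoff are assumptions made for illustration.

```r
# Illustrative only: placeholder values, not results from SNAPP or P2C2M.SNAPP.
set.seed(1)

# One summary-statistic value per posterior predictive dataset (simulated here).
sim_stats <- rnorm(100, mean = 0.20, sd = 0.03)

# The same summary statistic computed on the empirical dataset (made-up value).
emp_stat <- 0.31

# Proportion of posterior predictive values in each tail relative to the
# empirical value.
p_upper <- mean(sim_stats >= emp_stat)
p_lower <- mean(sim_stats <= emp_stat)

# Two-tailed posterior predictive p-value, capped at 1.
p_two_tailed <- min(1, 2 * min(p_upper, p_lower))

# Assumed cutoff: a small p-value means the model rarely reproduces the
# empirical statistic, which suggests a possible model violation.
violation <- p_two_tailed < 0.05
```

When the empirical value falls far outside the distribution of posterior predictive values, the p-value is small and the dataset is flagged; when it falls near the center, the model is considered an adequate fit for that statistic.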
Please cite P2C2M.SNAPP as: Duckett, D.J., Pelletier, T.A., and Carstens, B.C. 2020. Identifying model violations under the Multispecies Coalescent model with P2C2M.SNAPP. PeerJ 8:e8271 https://doi.org/10.7717/peerj.8271.