Dev #28

Merged
merged 11 commits into from
Jul 14, 2021
3 changes: 2 additions & 1 deletion .Rbuildignore
@@ -1,5 +1,4 @@
^docs$
^_pkgdown\.yml$
^.*\.Rproj$
^\.Rproj\.user$
^cran-comments\.md$
@@ -12,3 +11,5 @@
^doc$
^Meta$
^CRAN-RELEASE$
pkgdown
.travis.yml
3 changes: 2 additions & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: PhenotypeSimulator
Title: Flexible Phenotype Simulation from Different Genetic and Noise Models
Version: 0.3.3
Version: 0.3.4
Authors@R: c(
person("Hannah", "Meyer", email = "hannah.v.meyer@gmail.com", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-4564-0899")),
@@ -33,6 +33,7 @@ LinkingTo:
Imports:
methods,
optparse,
Hmisc,
R.utils,
mvtnorm,
snpStats,
2 changes: 1 addition & 1 deletion INDEX.Rmd
@@ -53,7 +53,7 @@ knitr::include_graphics("docs/simulatedPhenotypes.png")

## <i class="fa fa-rocket" aria-hidden="true"></i> Installation

The current github version of *PhenotypeSimulator* is 0.3.3 and can be
The current github version of *PhenotypeSimulator* is 0.3.4 and can be
installed via:
```{r, eval=FALSE}
library(devtools)
2 changes: 1 addition & 1 deletion INDEX.md
@@ -43,7 +43,7 @@ customised.

## <i class="fa fa-rocket" aria-hidden="true"></i> Installation

The current github version of *PhenotypeSimulator* is 0.3.3 and can be
The current github version of *PhenotypeSimulator* is 0.3.4 and can be
installed via:

``` r
15 changes: 15 additions & 0 deletions NEWS.md
@@ -1,3 +1,18 @@
# PhenotypeSimulator 0.3.4
## Minor changes
1. Fixed missing --genotypefile flag [issue 27](https://github.com/HannahVMeyer/PhenotypeSimulator/issues/27)
2. Update vignettes with new location for impute files and commands to get the
CEU samples ([issue 24](https://github.com/HannahVMeyer/PhenotypeSimulator/issues/24),
thanks to @zfuller5280 for the suggestion!)
3. Standardise genotypes on row with major alleles [issue 21](https://github.com/HannahVMeyer/PhenotypeSimulator/issues/21).
Thank you for the detailed bug report by @alanw1!
4. Add option to impute missing genotypes to standardise genotype function;
otherwise, if genotypes are missing, function will fail
[issue 17](https://github.com/HannahVMeyer/PhenotypeSimulator/issues/17)
5. Fix function description and passing of SNP IDs in readStandardGenotypes with
delim option [issue 25](https://github.com/HannahVMeyer/PhenotypeSimulator/issues/25),
thanks @BSchmidt1.

# PhenotypeSimulator 0.3.3
## Minor changes
1. Fixed bug that failed to return causal SNP name when only one SNP was chosen
1 change: 1 addition & 0 deletions R/commandlineFunctions.R
@@ -524,6 +524,7 @@ simulatePhenotypes <- function() {
header=args$header,
sampleID=args$sampleID,
phenoID=args$phenoID,
genotypefile = args$genotypefile,
genoFilePrefix=args$genoFilePrefix,
genoFileSuffix=args$genoFileSuffix,
SNPfrequencies=SNPfrequencies,
43 changes: 32 additions & 11 deletions R/genotypeFunctions.R
@@ -30,8 +30,15 @@ getAlleleFrequencies <- function(snp) {
#' Genotypes are standardised as described in Yang et al:
#' snp_standardised = (snp - 2 * ref_allele_freq)/
#' sqrt(2 * ref_allele_freq * alt_allele_freq).
#'
#' Missing genotypes can be mean-imputed and rounded to nearest integer
#' before standardisation. If genotypes contain missing values and impute is set
#' to FALSE, \code{standardiseGenotypes} will return an error.
#'
#' @param geno [N x NrSNP] Matrix/dataframe of genotypes [integer]/[double].
#' @param impute [logical] Indicating if missing genotypes should be imputed; if
#' set FALSE and data contains missing values, \code{standardiseGenotypes} will
#' return an error.
#' @return [N x NrSNP] Matrix of standardised genotypes [double].
#' @seealso \code{\link{getAlleleFrequencies}}
#' @export
@@ -40,11 +47,19 @@ getAlleleFrequencies <- function(snp) {
#' @examples
#' geno <- cbind(rbinom(2000, 2, 0.3), rbinom(2000, 2, 0.4),rbinom(2000, 2, 0.5))
#' geno_sd <- standardiseGenotypes(geno)
standardiseGenotypes <- function(geno) {
standardiseGenotypes <- function(geno, impute=FALSE) {
if (any(is.na(geno)) && !impute) {
# trailing spaces keep the concatenated message fragments from running together
stop("Missing genotypes found and impute=FALSE, cannot standardise ",
"genotypes; remove missing genotypes or set impute=TRUE for mean ",
"imputation of genotypes")
}
if (any(is.na(geno)) && impute) {
geno <- round(apply(as.matrix(geno), 2, Hmisc::impute, fun=mean))
}
allele_freq <- sapply(data.frame(geno), getAlleleFrequencies)
var_geno <- sqrt(2*allele_freq[1,]*allele_freq[2,])
var_geno[var_geno == 0] <- 1
geno_mean <- sweep(geno, 2, 2*allele_freq[1,], "-")
geno_mean <- sweep(geno, 2, 2*allele_freq[2,], "-")
geno_sd <- sweep(geno_mean, 2, var_geno, "/")
return (geno_sd)
}
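
To illustrate the behaviour this hunk adds, here is a minimal base-R sketch of mean imputation followed by the standardisation formula quoted in the roxygen comment. It is not the package's implementation: `impute_and_standardise` is a hypothetical helper, it avoids the `Hmisc` dependency, and it assumes genotypes are coded as 0/1/2 counts of a single allele, so the allele frequency is taken as half the column mean.

```r
# Hypothetical helper (not part of PhenotypeSimulator): mean-impute missing
# genotypes, then standardise each SNP as (snp - 2p) / sqrt(2p(1 - p)).
impute_and_standardise <- function(geno) {
  geno <- as.matrix(geno)
  # Replace each missing genotype with the rounded mean of its SNP (column).
  geno <- apply(geno, 2, function(snp) {
    snp[is.na(snp)] <- round(mean(snp, na.rm = TRUE))
    snp
  })
  # Assumption: entries count one allele, so its frequency is colMeans / 2.
  freq <- colMeans(geno) / 2
  sd_geno <- sqrt(2 * freq * (1 - freq))
  sd_geno[sd_geno == 0] <- 1  # monomorphic SNPs: avoid division by zero
  centred <- sweep(geno, 2, 2 * freq, "-")
  sweep(centred, 2, sd_geno, "/")
}

# Usage: two simulated SNPs with two missing calls in the first one.
set.seed(1)
geno <- cbind(rbinom(200, 2, 0.3), rbinom(200, 2, 0.5))
geno[c(3, 10), 1] <- NA
geno_sd <- impute_and_standardise(geno)
```

Rounding the imputed value keeps the matrix on the 0/1/2 scale, as the roxygen text describes; dropping the `round()` would instead impute fractional dosages.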
@@ -177,12 +192,12 @@ simulateGenotypes <- function(N, NrSNP=5000, frequencies=c(0.1, 0.2, 0.4),
#' with minor allele first. The remaining columns are the mean genotypes of
#' different individuals – numbers between 0 and 2 that represent the
#' (posterior) mean genotype, or dosage of the minor allele.
#' \item delim: a [delimiter]-delimited file of [NrSNPs x NrSamples] genotypes
#' with the snpIDs in the first column and the sampleIDs in the first row and
#' genotypes encoded as numbers between 0 and 2 representing the (posterior)
#' mean genotype, or dosage of the minor allele. Can be user-genotypes or
#' genotypes simulated with forward-time algorithms such as simupop
#' (\url{http://simupop.sourceforge.net/Main/HomePage}) or MetaSim
#' \item delim: a [delimiter]-delimited file of [(NrSNPs+1) x (NrSamples+1)]
#' genotypes with the snpIDs in the first column and the sampleIDs in the first
#' row and genotypes encoded as numbers between 0 and 2 representing the
#' (posterior) mean genotype, or dosage of the minor allele. Can be
#' user-genotypes or genotypes simulated with forward-time algorithms such as
#' simupop (\url{http://simupop.sourceforge.net/Main/HomePage}) or MetaSim
#' (\url{project.org/web/packages/rmetasim/vignettes/CreatingLandscapes.html}),
#' that allow for user-specified output formats.
#' }}
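
As a rough illustration of the `delim` layout described above, the sketch below writes a tiny file with SNP IDs in the first column and sample IDs in the header row, then reads it back and transposes it to a samples x SNPs matrix. It uses base R `read.csv` rather than `data.table::fread` to stay self-contained, and the file contents and IDs (`rs1`, `ID_1`, and so on) are made up for the example.

```r
# Write a small example file in the assumed delim layout:
# header row = sample IDs, first column = SNP IDs.
tmp <- tempfile(fileext = ".csv")
example <- data.frame(snp = c("rs1", "rs2", "rs3"),
                      ID_1 = c(0, 1, 2),
                      ID_2 = c(1, 1, 0))
write.csv(example, tmp, row.names = FALSE, quote = FALSE)

# Read with an explicit header so the sample IDs are not parsed as data.
data <- read.csv(tmp, header = TRUE, check.names = FALSE,
                 stringsAsFactors = FALSE)
id_snps <- data[, 1]                   # select by position, not by name
genotypes <- t(as.matrix(data[, -1]))  # samples in rows, SNPs in columns
colnames(genotypes) <- id_snps
unlink(tmp)
```

Selecting the first column by position avoids relying on an auto-generated column name such as `V1`, which is only present when the file is read without a header.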
@@ -218,6 +233,12 @@ simulateGenotypes <- function(N, NrSNP=5000, frequencies=c(0.1, 0.2, 0.4),
#' filename_plink <- gsub("\\.bed", "", filename_plink)
#' data_plink <- readStandardGenotypes(N=100, filename=filename_plink,
#' format="plink")
#'
#' filename_delim <- system.file("extdata/genotypes/",
#' "genotypes_chr22.csv",
#' package = "PhenotypeSimulator")
#' data_delim <- readStandardGenotypes(N=50, filename=filename_delim,
#' format="delim")
readStandardGenotypes <- function(N, filename, format = NULL,
verbose=TRUE, sampleID = "ID_",
snpID = "SNP_", delimiter = ",") {
@@ -300,11 +321,11 @@ readStandardGenotypes <- function(N, filename, format = NULL,
}
id_samples <- paste(sampleID, 1:N, "_", gsub(":", "", data$V1), sep="")
id_snps <- paste(snpID, 0:(ncol(genotypes) -1), sep="")
format_files = NULL
format_files <- NULL
} else if (format == "delim") {
data <- data.table::fread(filename, data.table=FALSE,
data <- data.table::fread(filename, data.table=FALSE, header=TRUE,
sep=delimiter)
id_snps <- data$V1
id_snps <- data[,1]
genotypes <- t(data[,-1])
colnames(genotypes) <- id_snps
if (N > nrow(genotypes)) {
2 changes: 1 addition & 1 deletion README.Rmd
@@ -51,7 +51,7 @@ be customised.
Full documentation of **PhenotypeSimulator** is available at
http://HannahVMeyer.github.io/PhenotypeSimulator/.

The current github version of *PhenotypeSimulator* is 0.3.3 and can be
The current github version of *PhenotypeSimulator* is 0.3.4 and can be
installed via

```{r gh-installation, eval = FALSE}
2 changes: 1 addition & 1 deletion README.md
@@ -44,7 +44,7 @@ customised.
Full documentation of **PhenotypeSimulator** is available at
<http://HannahVMeyer.github.io/PhenotypeSimulator/>.

The current github version of *PhenotypeSimulator* is 0.3.3 and can be
The current github version of *PhenotypeSimulator* is 0.3.4 and can be
installed via

``` r
182 changes: 182 additions & 0 deletions docs/404.html
