% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/scan_multi_onechr.R
\name{scan_multi_oneqtl}
\alias{scan_multi_oneqtl}
\title{Perform multivariate, one-QTL model fitting for markers on all chromosomes}
\usage{
scan_multi_oneqtl(
probs_list,
pheno,
kinship_list = NULL,
addcovar = NULL,
cores = 1
)
}
\arguments{
\item{probs_list}{a list of arrays of founder allele probabilities}
\item{pheno}{a matrix of phenotypes}
\item{kinship_list}{a list of kinship matrices, one for each chromosome}
\item{addcovar}{a matrix, n subjects by c additive covariates}
\item{cores}{number of cores for parallelization via parallel::mclapply()}
}
\value{
a tibble with d + 1 columns. The first d columns identify the genetic data used in the design matrix by listing the marker ids; the last column is the log10 likelihood.
}
\description{
The function first discards individuals with one or more missing phenotypes or missing covariates.
It then infers the variance components Vg and Ve, both of which
are d by d covariance matrices (d is the number of phenotypes), using an expectation-maximization (EM) algorithm as
implemented in the \pkg{gemma2} R package, an R implementation of the
GEMMA algorithm for multivariate variance component estimation (Zhou and Stephens 2014, Nature Methods).
Note that variance components are fitted on a model that uses the d-variate phenotype
but contains no genetic information. This model does, however,
use the specified covariates (after dropping dependent columns
in the covariates matrix).
These inferred covariance matrices, \eqn{\hat{Vg}} and \eqn{\hat{Ve}},
are then used in subsequent model fitting via
generalized least squares.
Generalized least squares model fitting is applied to every marker on
every chromosome.
For a single marker, we fit the model:
\deqn{vec(Y) = X vec(B) + vec(G) + vec(E)}
where
\deqn{G \sim MN(0, K, \hat{Vg})}
and
\deqn{E \sim MN(0, I, \hat{Ve}).}
Here \eqn{MN} denotes the matrix-variate
normal distribution with three parameters: mean matrix, covariance among rows, and
covariance among columns. \eqn{vec} denotes the vectorization operation, i.e., stacking by columns.
\eqn{K} is a kinship matrix, typically calculated by leave-one-chromosome-out methods.
\eqn{Y} is the n by d phenotype matrix. \eqn{B} is the f by d matrix of founder allele effects.
\eqn{X} is a block-diagonal nd by fd matrix consisting of
d blocks each of dimension n by f. Each n by f block (on the diagonal) contains a matrix of
founder allele probabilities for the n subjects at a single marker. The off-diagonal blocks
have only zero entries.
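
As a sketch of what this model implies (assuming only the standard correspondence between the matrix-variate normal and the multivariate normal; this is not taken from the package source), stacking by columns gives the nd by nd covariance matrix of \eqn{vec(Y)}:
\deqn{\hat{\Sigma} = \hat{Vg} \otimes K + \hat{Ve} \otimes I}
where \eqn{\otimes} denotes the Kronecker product and \eqn{I} is the n by n identity matrix.
The usual generalized least squares estimate of \eqn{vec(B)} then takes the form
\deqn{(X^T \hat{\Sigma}^{-1} X)^{-1} X^T \hat{\Sigma}^{-1} vec(Y).}
The symbol \eqn{\hat{\Sigma}} is introduced here for illustration only.
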
The log-likelihood is returned for each model. The returned object is a tibble with
d + 1 columns. The first d columns specify the markers used in the corresponding model fit, while
the last column specifies the log-likelihood value at that d-tuple of markers.
}
\examples{
# simulate phenotypes and founder allele probabilities
n <- 50
pheno <- matrix(rnorm(2 * n), ncol = 2)
rownames(pheno) <- paste0("s", 1:n)
colnames(pheno) <- paste0("tr", 1:2)
probs <- array(dim = c(n, 2, 5))
probs[ , 1, ] <- rbinom(n * 5, size = 1, prob = 0.2)
probs[ , 2, ] <- 1 - probs[ , 1, ]
rownames(probs) <- paste0("s", 1:n)
colnames(probs) <- LETTERS[1:2]
dimnames(probs)[[3]] <- paste0("m", 1:5)
scan_multi_oneqtl(probs_list = list(probs, probs), pheno = pheno, cores = 1)
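# Illustrative sketch only (not the package's internal code): a base-R
# generalized least squares fit at marker m1, following the model in the description.
# K, Vg, and Ve below are made-up values assumed for illustration;
# scan_multi_oneqtl() estimates Vg and Ve itself and takes kinship matrices via kinship_list.
K <- diag(n)    # assumed kinship matrix (unrelated subjects)
Vg <- diag(2)   # assumed d by d genetic covariance matrix
Ve <- diag(2)   # assumed d by d error covariance matrix
# nd by nd covariance matrix of vec(Y)
Sigma <- kronecker(Vg, K) + kronecker(Ve, diag(n))
# block-diagonal nd by fd design matrix with one n by f block per trait
X <- kronecker(diag(2), probs[ , , 1])
# solve the GLS normal equations without forming the inverse of Sigma explicitly
Sigma_inv_X <- solve(Sigma, X)
B_hat <- solve(crossprod(X, Sigma_inv_X), crossprod(Sigma_inv_X, as.vector(pheno)))
# reshape into an f by d matrix of estimated founder allele effects
matrix(B_hat, ncol = 2, dimnames = list(colnames(probs), colnames(pheno)))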
}
\references{
Knott SA, Haley CS (2000) Multitrait
least squares for quantitative trait loci detection.
Genetics 156: 899-911.

Jiang C, Zeng ZB (1995) Multiple trait analysis
of genetic mapping for quantitative trait loci.
Genetics 140: 1111-1127.

Zhou X, Stephens M (2014) Efficient multivariate linear
mixed model algorithms for genome-wide association studies.
Nature Methods 11: 407-409.

Broman KW, Gatti DM, Simecek P, Furlotte NA, Prins P, Sen S, Yandell BS, Churchill GA (2019)
R/qtl2: software for mapping quantitative trait loci with high-dimensional data and
multi-parent populations. Genetics 211: 495-502. \url{https://www.genetics.org/content/211/2/495}
}