add document using documenter.jl
reworkhow committed Apr 20, 2018
1 parent ff7b3f5 commit a599b48
Showing 28 changed files with 283 additions and 10 deletions.
16 changes: 7 additions & 9 deletions .gitignore
@@ -1,13 +1,11 @@
*.jl.cov
*.jl.mem
.DS_Store
*.jl.cov
*.jl.*.cov
*.jl.mem
*checkpoint.ipynb
*.ipynb_checkpoints/*
*.ipynb_checkpoints
*.DS_Store
*.jl.cov
*.jl.*.cov
*.jl.mem
*checkpoint.ipynb
*ipynb_checkpoints/*
*ipynb_checkpoints


docs/build/
docs/site/
5 changes: 4 additions & 1 deletion .travis.yml
@@ -10,8 +10,11 @@ matrix:
- julia: nightly
notifications:
email: false
after_success:
- julia -e 'Pkg.add("Documenter")'
- julia -e 'cd(Pkg.dir("JWAS")); include(joinpath("docs", "make.jl"))'
# uncomment the following lines to override the default test script
#script:
# - if [[ -a .git/shallow ]]; then git fetch --unshallow; fi
# - julia -e 'ENV["PYTHON"] = ""; Pkg.clone("PyPlot"); Pkg.build("PyPlot")' #for linux
# - julia -e 'ENV["PYTHON"] = ""; Pkg.clone("PyPlot"); Pkg.build("PyPlot")' #for linux
# - julia --check-bounds=yes -e 'Pkg.clone(pwd()); Pkg.build("JWAS"); Pkg.test("JWAS"; coverage=true)'
62 changes: 62 additions & 0 deletions docs/make.jl
@@ -0,0 +1,62 @@
using Documenter, JWAS

makedocs(
# modules = [Documenter],
doctest=false,
clean =true,
format = :html,
# assets = ["assets/favicon.ico"],
sitename = "JWAS.jl",
authors = "Hao Cheng, Rohan Fernando, Dorian Garrick and contributors.",
pages = [
"Home" => "index.md",
"Some Theory" => "theory/theory.md",
"Manual" => Any[
"Guide" => "man/guide.md",
"man/examples.md",
"Contributing" => "examples/genomicBLMM.md",
],
"Examples" => Any[
"Linear Mixed Model (conventional)" => "examples/conventionalBLMM.md",
"Linear Additive Genetic Model" => "examples/LinearAdditiveGeneticModel.md",
"Linear Mixed Model (Genomic data)" => "examples/genomicBLMM.md",
],
"Library" => Any[
"Public" => "lib/public.md",
"Internals" => "lib/internals.md",
# hide("Internals" => "lib/internals.md", Any[
# "lib/internals/anchors.md",
# "lib/internals/builder.md",
# "lib/internals/cross-references.md",
# "lib/internals/docchecks.md",
# "lib/internals/docsystem.md",
# "lib/internals/documenter.md",
# "lib/internals/documents.md",
# "lib/internals/dom.md",
# "lib/internals/expanders.md",
# "lib/internals/formats.md",
# "lib/internals/generator.md",
# "lib/internals/mdflatten.md",
# "lib/internals/selectors.md",
# "lib/internals/textdiff.md",
# "lib/internals/utilities.md",
# "lib/internals/walkers.md",
# "lib/internals/writers.md",
#])
]
],
# Use clean URLs, unless built as a "local" build
#html_prettyurls = !("local" in ARGS),
#html_canonical = "https://juliadocs.github.io/Documenter.jl/stable/",
)

deploydocs(
repo = "github.com/reworkhow/JWAS.jl.git",
target = "build",
deps = nothing,
make = nothing,
julia = "0.6",
osname = "osx"
)
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes
File renamed without changes.
File renamed without changes.
File renamed without changes.
Binary file added docs/src/assets/BLMM.png
Binary file added docs/src/assets/JWAS.png
Binary file added docs/src/assets/logo.png
7 changes: 7 additions & 0 deletions docs/src/examples/LinearAdditiveGeneticModel.md
@@ -0,0 +1,7 @@
# Bayesian Linear Additive Genetic Model


## Univariate Linear Additive Genetic Model


## Multivariate Linear Additive Genetic Model
6 changes: 6 additions & 0 deletions docs/src/examples/conventionalBLMM.md
@@ -0,0 +1,6 @@
# Bayesian Linear Mixed Models

## Univariate Linear Mixed Model (conventional)


## Multivariate Linear Mixed Model (conventional)
6 changes: 6 additions & 0 deletions docs/src/examples/genomicBLMM.md
@@ -0,0 +1,6 @@
# Bayesian Linear Mixed Models (Genomic Data)

## Univariate Linear Mixed Model (Genomic data)


## Multivariate Linear Mixed Model (Genomic data)
19 changes: 19 additions & 0 deletions docs/src/index.md
@@ -0,0 +1,19 @@

![JWAS](assets/JWAS.png)

JWAS is a well-documented software platform based on Julia and an interactive Jupyter notebook for analyses of general
univariate and multivariate Bayesian mixed effects models. These models are especially useful for, but not limited to,
routine single-trait and multi-trait genomic prediction and genome-wide association studies using either complete or incomplete
genomic data ("single-step" methods). Currently, JWAS provides a broad scope of analyses, e.g., a wide collection of Bayesian
methods for whole-genome analyses, including shrinkage estimation and variable selection methods. The features of JWAS include:

* No limitations on fixed effects (e.g. herd-year, age, sex)
* Random effects other than markers (e.g. litter, pen)
* Random effects using pedigree information
* Random permanent environmental effects
* Single-trait analyses
* Multi-trait analyses
* Use of genomic information
  * Complete genomic data
  * Incomplete genomic data
* Correlated residuals
1 change: 1 addition & 0 deletions docs/src/lib/internals.md
@@ -0,0 +1 @@
# internals
1 change: 1 addition & 0 deletions docs/src/lib/public.md
@@ -0,0 +1 @@
# Some Theory in JWAS
24 changes: 24 additions & 0 deletions docs/src/man/examples.md
@@ -0,0 +1,24 @@
# example

# Get Started


```@contents
```

## example 1
```@docs
runMCMC
```

- link to [JWAS.jl Documentation](@ref)
- link to [`add_genotypes`](@ref)

## example 2

```@docs
add_genotypes
add_markers
```

### Test these examples
3 changes: 3 additions & 0 deletions docs/src/man/guide.md
@@ -0,0 +1,3 @@
# guide

abc
143 changes: 143 additions & 0 deletions docs/src/theory/theory.md
@@ -0,0 +1,143 @@
# Some Theory in JWAS

## A Table for Bayesian Linear Mixed Models (BLMM)

![BLMM](../assets/BLMM.png)



## Models

### Complete Genomic Data
The general form of the multivariate (univariate) mixed effects model for individual $i$ from $n$ individuals with complete genomic data in JWAS is

$$
\mathbf{y}_{i}
=\sum_{j=1}^{p_{\beta}}X_{ij}\boldsymbol{\beta}_{j}+\sum_{k=1}^{p_{u}}Z_{ik}\mathbf{u}_{k}
+\sum_{l=1}^{p}M_{il}\boldsymbol{\alpha}_{l}+\mathbf{e}_{i},\qquad(1)
$$

where $\mathbf{y}_{i}$ is a vector of phenotypes of $t$ traits for individual $i$; $X_{ij}$ is the incidence matrix covariate corresponding to the $j$th fixed effect for individual $i$; $\boldsymbol{\beta}_{j}$ is a vector of the $j$th fixed effects for the $t$ traits; $Z_{ik}$ is the incidence matrix covariate corresponding to the $k$th random effect for individual $i$; $\mathbf{u}_{k}$ is a vector of the $k$th random effects for the $t$ traits; $M_{il}$ is the genotype covariate at locus $l$ for individual $i$; $p$ is the number of genotyped loci (each coded as 0, 1, 2); $\boldsymbol{\alpha}_{l}$ is a vector of allele substitution effects (marker effects) of the $t$ traits for locus $l$; and $\mathbf{e}_{i}$ is the vector of random residual effects of the $t$ traits for individual $i$. In the JWAS implementation of this model, missing phenotypes are imputed at each iteration of MCMC (Sorensen and Gianola) so that all individuals have observations for all traits. Note that when the number of traits $t=1$, the general form above simplifies to the single-trait mixed effects model, and all vectors of effects in equation (1) become scalars.
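
To make the single-trait case ($t=1$) of equation (1) concrete, the following is a minimal sketch that evaluates the right-hand side for three individuals at once. All matrices and effect values are made-up illustrations, not JWAS data structures or JWAS API calls.

```julia
# Toy single-trait (t = 1) version of equation (1); all numbers are made up.
X = [1.0 0.0; 1.0 1.0; 1.0 0.0]               # n x p_beta covariates for fixed effects
Z = [1.0 0.0; 0.0 1.0; 1.0 0.0]               # n x p_u incidence for random effects
M = [0.0 1.0 2.0; 1.0 1.0 0.0; 2.0 0.0 1.0]   # n x p genotypes coded 0, 1, 2

beta  = [10.0, 0.5]          # fixed effects
u     = [0.3, -0.3]          # random effects (e.g. litter)
alpha = [0.2, 0.0, -0.1]     # marker effects
e     = [0.05, -0.02, 0.01]  # residuals

# Row i of each matrix-vector product is one of the sums in equation (1).
y = X * beta + Z * u + M * alpha + e
```

Each element of `y` is the phenotype of one individual; with $t>1$ the scalars in `beta`, `u`, `alpha` and `e` would become vectors of length $t$.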

### Incomplete Genomic Data

The general form of the multivariate (univariate) mixed effects model with incomplete genomic data ("single-step" methods) for non-genotyped individuals is

```math
\mathbf{y}_{i}
=\sum_{j=1}^{p_{\beta}}X_{ij}\boldsymbol{\beta}_{j}+\sum_{k=1}^{p_{u}}Z_{ik}\mathbf{u}_{k}+
\sum_{l=1}^{p}\hat{M}_{il}\boldsymbol{\alpha}_{l}+\sum_{m=1}^{p_{\epsilon}}Z_{n[i,m]}\boldsymbol{\epsilon}_{m}+\mathbf{e}_{i},\qquad(2)
```

where $\mathbf{y}_{i}$ is a vector of phenotypes of $t$ traits for non-genotyped individual $i$; $\hat{M}_{il}$ is the imputed genotype covariate at locus $l$ for non-genotyped individual $i$; $Z_{n[i,m]}$ is the incidence matrix covariate corresponding to the $m$th imputation residual for individual $i$; and $\boldsymbol{\epsilon}_{m}$ is the $m$th vector of imputation residuals. $W_{im}$ is the incidence matrix covariate corresponding to the $m$th random effect for individual $i$. The vector of imputation residuals, $\boldsymbol{\epsilon}=\begin{bmatrix}\boldsymbol{\epsilon}_{1}^{T} & \boldsymbol{\epsilon}_{2}^{T} & \ldots & \boldsymbol{\epsilon}_{p_{\epsilon}}^{T}\end{bmatrix}^{T}$, is a priori assumed to be $N\left(0,(\mathbf{A}_{nn}-\mathbf{A}_{ng}\mathbf{A}_{gg}^{-1}\mathbf{A}_{gn})\otimes\mathbf{G}_{g}\right)$, where $\mathbf{A}_{nn}$ is the partition of the numerator relationship matrix $\mathbf{A}$ that corresponds to non-genotyped individuals, $\mathbf{A}_{ng}$ and its transpose $\mathbf{A}_{gn}$ are the partitions of $\mathbf{A}$ corresponding to relationships between non-genotyped and genotyped individuals, $\mathbf{A}_{gg}$ is the partition of $\mathbf{A}$ that corresponds to genotyped individuals, and $\mathbf{G}_{g}$ is the additive genetic covariance matrix. All the other variables are the same as in equation (1).
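
The prior covariance of the imputation residuals can be checked by hand on a very small pedigree. The sketch below assumes two unrelated, genotyped parents and one non-genotyped offspring; the matrix `A`, the index sets and the covariance `Gg` are illustrative values, not output of JWAS.

```julia
using LinearAlgebra

# Toy numerator relationship matrix A for two unrelated parents (1, 2)
# and their non-inbred offspring (3); values are illustrative.
A = [1.0 0.0 0.5;
     0.0 1.0 0.5;
     0.5 0.5 1.0]

g = [1, 2]   # genotyped individuals (assumed)
n = [3]      # non-genotyped individuals (assumed)

Ann, Ang, Agg = A[n, n], A[n, g], A[g, g]

# Relationship part of the prior covariance: A_nn - A_ng * inv(A_gg) * A_gn.
C = Ann - Ang * (Agg \ Ang')

# Hypothetical additive genetic covariance matrix G_g for t = 2 traits.
Gg = [1.0 0.3;
      0.3 2.0]

# Prior covariance of the imputation residuals of the non-genotyped individual.
V = kron(C, Gg)
```

In this toy pedigree `C` is the 1 x 1 matrix containing 0.5, so `V` is half of `Gg`: the part of the offspring's genetic value that cannot be predicted from its genotyped parents.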

### Priors

#### Priors for effects other than markers

The fixed effects are assigned flat priors. The vector of random effects, $\mathbf{u}=\begin{bmatrix}\mathbf{u}_{1}^{T} & \mathbf{u}_{2}^{T} & \ldots & \mathbf{u}_{p_{u}}^{T}\end{bmatrix}^{T}$, is a priori
assumed to be $N\left(0,\mathbf{A}\otimes\mathbf{G}\right)$ with various options for $\mathbf{A}$. For example, $\mathbf{A}$ could be an identity matrix if the $\mathbf{u}_{k}$ are assumed to be independently and
identically distributed. $\mathbf{A}$ can be the numerator relationship matrix when $\mathbf{u}$ is a vector of polygenic effects and $\mathbf{G}$
represents the additive-genetic variance not explained by molecular markers. Note that $\mathbf{u}$ can also be a concatenation of vectors
of different types of random effects, such as litter, pen, polygenic and maternal effects. The residual vectors $\mathbf{e}_{i}$ are a
priori assumed to be independently and identically distributed multivariate normal vectors with null mean and
covariance matrix $\mathbf{R}$, which in turn is a priori assumed to have an inverse Wishart prior distribution, $W_{t}^{-1}\left(\mathbf{S}_{e},\nu_{e}\right)$. Note
that when the number of traits $t=1$, the priors for $\mathbf{G}$ and $\mathbf{R}$ in single-trait analyses are scaled inverse chi-square
distributions.
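
As a rough illustration of the $N\left(0,\mathbf{A}\otimes\mathbf{G}\right)$ prior, the sketch below builds the Kronecker product for a toy relationship matrix and draws one vector of random effects. The values of `A` and `G` and the use of a Cholesky factor for sampling are assumptions made for this example, not a description of how JWAS samples internally.

```julia
using LinearAlgebra, Random

# Toy relationship matrix A for three levels of one random effect
# (two unrelated parents and their offspring); values are illustrative.
A = [1.0 0.0 0.5;
     0.0 1.0 0.5;
     0.5 0.5 1.0]
# Hypothetical covariance matrix G among t = 2 traits.
G = [1.0 0.3;
     0.3 2.0]
t = size(G, 1)

# Var(u) = kron(A, G) when u stacks the t-vectors u_1, u_2, ... level by level.
V = kron(A, G)

# One draw u ~ N(0, kron(A, G)) via the lower Cholesky factor of V.
rng = MersenneTwister(1)
L = cholesky(Symmetric(V)).L
u = L * randn(rng, size(V, 1))

# Column k of U is u_k, the effects of level k on the t traits.
U = reshape(u, t, :)
```

The block of `V` that links levels 1 and 3 equals `A[1, 3] * G`, which shows how $\mathbf{A}$ scales the trait covariance $\mathbf{G}$ between related levels.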

#### Priors for marker effects

##### single-trait BayesA

The prior assumption is that marker effects have identical
and independent univariate-t distributions, each with a null mean,
scale parameter $S^2_{\alpha}$ and $\nu_{\alpha}$ degrees of freedom.
This is equivalent to assuming that the marker effect at locus $i$ has a univariate normal distribution
with null mean and unknown, locus-specific variance $\sigma^2_i$,
which in turn is assigned a scaled inverse chi-square prior with scale
parameter $S^2_{\alpha}$ and $\nu_{\alpha}$ degrees of freedom.
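
The hierarchical form of the BayesA prior can be sketched as a two-step draw: first a locus-specific variance, then the marker effect given that variance. The helper names `rand_scaled_inv_chisq` and `rand_bayesA_effect` and the hyperparameter values are hypothetical, and the degrees of freedom are restricted to an integer so that the chi-square draw can be built from standard normals.

```julia
using Random

# Draw from a scaled inverse chi-square distribution with nu degrees of
# freedom and scale S2, as nu * S2 / chi2_nu. nu is an integer here so the
# chi-square draw can be formed as a sum of nu squared standard normals.
rand_scaled_inv_chisq(rng, nu::Integer, S2) = nu * S2 / sum(abs2, randn(rng, nu))

# One marker effect under the hierarchical form of the BayesA prior.
function rand_bayesA_effect(rng, nu, S2)
    sigma2 = rand_scaled_inv_chisq(rng, nu, S2)  # locus-specific variance
    return sqrt(sigma2) * randn(rng)             # alpha | sigma2 ~ N(0, sigma2)
end

rng = MersenneTwister(2018)
nu_alpha, S2_alpha = 4, 0.01   # illustrative hyperparameters
alpha = [rand_bayesA_effect(rng, nu_alpha, S2_alpha) for _ in 1:1000]
```

Marginally, each element of `alpha` is a draw from the scaled t distribution described above, which is why most draws are small while a few are comparatively large.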

##### single-trait BayesB

In BayesB, the prior assumption is that marker effects have identical
and independent mixture distributions, where each has a point mass at
zero with probability $\pi$ and a univariate-t distribution with
probability $1-\pi$ having a null mean, scale parameter $S^2_{\alpha}$
and $\nu_{\alpha}$ degrees of freedom. Thus, BayesA is a special case of BayesB
with $\pi=0$. Further, as in BayesA, the t-distribution in BayesB is
equivalent to a univariate normal with null mean and unknown,
locus-specific variance, which in turn is assigned a scaled inverse chi-square
prior with scale parameter $S^2_{\alpha}$ and $\nu_{\alpha}$ degrees
of freedom. *(A fast and efficient Gibbs sampler was implemented for BayesB in JWAS.)*

##### single-trait BayesC and BayesC$\pi$

In BayesC, the prior assumption is that marker effects have identical
and independent mixture distributions, where each has a point mass at
zero with probability $\pi$ and a univariate-normal distribution with
probability $1-\pi$ having a null mean and variance
$\sigma^2_{\alpha}$, which in turn has a scaled inverse chi-square
prior with scale parameter $S^2_{\alpha}$ and $\nu_{\alpha}$ degrees
of freedom. In addition to the above assumptions, in BayesC$\pi$, $\pi$ is treated
as unknown with a uniform prior.
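
The point-mass mixture can be sketched by drawing an indicator for each marker and then a normal effect where the indicator is one. In this sketch the common variance `sigma2` and the probability `pi0` are held fixed at made-up values, whereas in the model they have the priors described above; `rand_bayesC_effects` is a hypothetical helper, not a JWAS function.

```julia
using Random

# Draw p marker effects from the BayesC prior for a fixed common variance
# sigma2: zero with probability pi0, N(0, sigma2) with probability 1 - pi0.
function rand_bayesC_effects(rng, p, pi0, sigma2)
    nonzero = rand(rng, p) .>= pi0            # true with probability 1 - pi0
    effects = sqrt(sigma2) .* randn(rng, p)   # candidate normal effects
    return ifelse.(nonzero, effects, 0.0)
end

rng = MersenneTwister(1)
alpha = rand_bayesC_effects(rng, 10_000, 0.9, 0.05)

# Proportion of markers with exactly zero effect; close to pi0 for large p.
prop_zero = count(iszero, alpha) / length(alpha)
```

Replacing the normal draw with a draw that uses a locus-specific variance would give the corresponding sketch for the BayesB prior.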

##### multiple-trait BayesABC

In multi-trait BayesC$\Pi$, the prior for $\alpha_{lk}$, the marker effect of trait $k$ for locus $l$, is a mixture with a point mass at zero and a
univariate normal distribution conditional on $\sigma_{k}^{2}$:

```math
\begin{align*}
\alpha_{lk}\mid\pi_{k},\sigma_{k}^{2} & \begin{cases}
\sim N\left(0,\,\sigma_{k}^{2}\right) & \text{with probability}\;(1-\pi_{k})\\
=0 & \text{with probability}\;\pi_{k}
\end{cases}
\end{align*}
```
and the covariance between effects for traits $k$ and $k'$ at the same locus, i.e., $\alpha_{lk}$ and $\alpha_{lk^{'}}$ is

```math
\begin{align*}
\operatorname{cov}\left(\alpha_{lk},\alpha_{lk^{'}}\mid\sigma_{kk^{'}}\right)=\begin{cases}
\sigma_{kk^{'}} & \text{if both }\alpha_{lk}\neq0\text{ and }\alpha_{lk^{'}}\neq0\\
0 & \text{otherwise}
\end{cases}.
\end{align*}
```

The vector of marker effects
at a particular locus $\boldsymbol{\alpha}_{l}$ is written as
$\boldsymbol{\alpha}_{l}=\boldsymbol{D}_{l}\boldsymbol{\beta}_{l}$,
where $\boldsymbol{D}_{l}$ is a diagonal matrix with elements $diag\left(\boldsymbol{D}_{l}\right)=\boldsymbol{\delta}_{l}=\left(\delta_{l1},\delta_{l2},\delta_{l3}\ldots\delta_{lt}\right)$,
where $\delta_{lk}$ is an indicator variable indicating whether the marker effect of locus
$l$ for trait $k$ is zero or non-zero, and the vector
$\boldsymbol{\beta}_{l}$ follows a multivariate normal distribution
with null mean and covariance matrix $\boldsymbol{G}$. The covariance matrix $\boldsymbol{G}$ is *a priori* assumed to follow
an inverse Wishart distribution, $W_{t}^{-1}\left(\mathbf{S}_{\beta},\nu_{\beta}\right)$.
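
A small numerical sketch of the decomposition $\boldsymbol{\alpha}_{l}=\boldsymbol{D}_{l}\boldsymbol{\beta}_{l}$ for $t=3$ traits follows. The indicator vector and the values in `beta_l` are hypothetical; in the model `beta_l` would be a draw from the multivariate normal prior.

```julia
using LinearAlgebra

delta_l = [1, 0, 1]           # indicators for t = 3 traits (illustrative)
beta_l  = [0.4, -0.2, 0.1]    # hypothetical value of beta_l

# D_l is the diagonal matrix with delta_l on its diagonal.
D_l = Diagonal(delta_l)

# The effect for trait 2 is set to zero; traits 1 and 3 keep their values.
alpha_l = D_l * beta_l
```

Writing the effects this way keeps the continuous part `beta_l` separate from the zero/non-zero pattern `delta_l`, so each can be given its own prior.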

In the most general case, any marker effect might be zero for any possible combination
of $t$ traits, resulting in $2^{t}$ possible combinations of $\boldsymbol{\delta}_{l}$. For example, in a $t=2$ trait model, there are $2^{t}=4$ combinations
for $\boldsymbol{\delta}_{l}$: $(0,\,0)$, $(0,\,1)$, $(1,\,0)$, $(1,\,1)$. Suppose in general we use numerical labels "1", "2", $\ldots$, "$l$" for the $2^{t}$ possible
outcomes for $\boldsymbol{\delta}_{l}$; then the prior for $\boldsymbol{\delta}_{l}$ is a categorical distribution

```math
\begin{align*}
& p\left(\boldsymbol{\delta}_{l}=``i"\right)\\
= & \Pi_{1}I\left(\boldsymbol{\delta}_{l}=``1"\right)+\Pi_{2}I\left(\boldsymbol{\delta}_{l}=``2"\right)+...+\Pi_{l}I\left(\boldsymbol{\delta}_{l}=``l"\right),
\end{align*}
```

where $\sum_{i=1}^{l}\Pi_{i}=1$ with $\Pi_{i}$ being the prior probability that the vector $\boldsymbol{\delta}_{l}$ corresponds to the vector labelled $"i"$. A Dirichlet distribution with all parameters equal to one, i.e., a uniform distribution, can be used for the prior for
$\boldsymbol{\Pi}=\left(\Pi_{1},\Pi_{2},...,\Pi_{l}\right)$.
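
The categorical prior on $\boldsymbol{\delta}_{l}$ can be sketched by enumerating the $2^{t}$ indicator vectors, drawing $\boldsymbol{\Pi}$ from a Dirichlet distribution with all parameters equal to one, and then sampling one label. The construction of the Dirichlet draw from normalized exponential variables and all variable names are choices made for this example; the order in which the combinations are enumerated is arbitrary.

```julia
using Random

t = 2
# All 2^t indicator vectors for delta_l, as tuples of zeros and ones.
combos = vec(collect(Iterators.product(fill(0:1, t)...)))

# Dirichlet(1, ..., 1) draw: normalize independent standard exponentials.
rng = MersenneTwister(1)
w = randexp(rng, length(combos))
Pi = w ./ sum(w)

# Sample a label from the categorical prior with probabilities Pi
# (fall back to the last label if rounding leaves cumsum just below 1).
label = something(findfirst(cumsum(Pi) .>= rand(rng)), length(combos))
delta_l = combos[label]
```

With $t=2$ this gives four labelled outcomes, and `Pi[i]` plays the role of $\Pi_{i}$, the prior probability of the outcome labelled "i".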

The difference
in the multi-trait BayesB method is that the prior for $\boldsymbol{\beta}_{l}$
is a multivariate t distribution, rather than a multivariate normal distribution. This is equivalent to assuming $\boldsymbol{\beta}_{l}$ has a multivariate normal distribution with null mean and locus-specific covariance matrix $\boldsymbol{G}_{l}$, which is assigned an inverse
Wishart prior, $W_{t}^{-1}\left(\mathbf{S}_{\beta},\nu_{\beta}\right)$. The multi-trait BayesA method is a special case of
the multi-trait BayesB method where $\boldsymbol{\delta}_{l}$ is always a vector of ones.

> #### references
> * Meuwissen T, Hayes B, Goddard M. Prediction of total genetic value using genome-wide dense marker maps. Genetics, 157:1819–1829, 2001.
> * Fernando R, Garrick D. Bayesian methods applied to GWAS. Methods Mol Biol. 2013;1019:237–274.
> * Cheng H, Garrick D, Fernando R. A fast and efficient Gibbs sampler for BayesB in whole-genome analyses. Genet Sel Evol, 2015, 47:80.
> * Fernando R, Dekkers J, Garrick D. A class of Bayesian methods to combine large numbers of genotyped and non-genotyped animals for whole-genome analyses. Genet Sel Evol, 2014, 46:50.
> * Cheng H, Kizilkaya K, Zeng J, Garrick D, Fernando R. Genomic Prediction from Multiple-trait Bayesian Regression Methods using Mixture Priors. Genetics. 2018
