
<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Single-step Bayesian Regression (Incomplete Genomic Data)</font> 
</div>

In [1]:
include("/home/ubuntu/work/Github/JWAS.jl/src/JWAS.jl")

JWAS

<button type="button" class="btn btn-lg btn-primary">Step 1: Load Packages</button> 

In [2]:
using JWAS,JWAS.Datasets,DataFrames,CSV

<button type="button" class="btn btn-lg btn-primary">Step 2: Read data</button> 

In [3]:
phenofile  = Datasets.dataset("example","phenotypes_ssbr.txt")
pedfile    = Datasets.dataset("example","pedigree.txt")
genofile   = Datasets.dataset("example","genotypes.txt")

phenotypes = CSV.read(phenofile,delim = ',',header=true)
pedigree   = get_pedigree(pedfile,separator=",",header=true);

coding pedigree... 100%|████████████████████████████████| Time: 0:00:00

Finished!


In [4]:
head(phenotypes)

Row  ID  y1     y2    y3     x1   x2  x3  dam
1    a1  -0.06  3.58  -1.18  0.9  2   m   0
2    a2  -0.6   4.9   0.88   0.3  1   f   0
3    a3  -2.07  3.19  0.73   0.7  2   f   0
4    a4  -2.63  6.97  -0.83  0.6  1   m   a2
5    a5  2.31   3.5   -1.52  0.4  2   m   a2
6    a6  0.93   4.87  -0.01  5.0  2   f   a3


<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Single-trait Single-step Bayesian Regression (Incomplete Genomic Data)</font> 
</div>

<button type="button" class="btn btn-lg btn-primary">Step 3: Build Model Equations</button> 

In [5]:
model_equation1 = "y1 = intercept + x1*x3 + x2 + x3 + ID + dam";

In [6]:
R      = 1.0
model1 = build_model(model_equation1,R);

<button type="button" class="btn btn-lg btn-primary">Step 4: Set Factors or Covariates</button> 

In [7]:
set_covariate(model1,"x1");

<button type="button" class="btn btn-lg btn-primary">Step 5: Set Random or Fixed Effects</button> 

In [8]:
G1 = 1.0
G2 = eye(2)
set_random(model1,"x2",G1);
set_random(model1,"ID dam",pedigree,G2);
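
The `eye` function used for the covariance matrices in this notebook was removed in Julia 1.0. If these cells are rerun on a newer Julia, one possible replacement is sketched below; it assumes Julia 0.7 or later, where the identity object `I` lives in the `LinearAlgebra` standard library. Whether a given JWAS release accepts these matrices unchanged has not been checked here.

```julia
using LinearAlgebra  # provides the identity object I (Julia 0.7 and later)

# Dense identity matrices equivalent to eye(2) and eye(3) in this notebook
G2 = Matrix{Float64}(I, 2, 2)   # e.g. the covariance matrix passed for "ID dam"
R  = Matrix{Float64}(I, 3, 3)   # e.g. the residual covariance matrix for three traits
```
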

<button type="button" class="btn btn-lg btn-primary">Step 6: Use Genomic Information</button> 

In [9]:
G3 = 1.0
add_genotypes(model1,genofile,G3,separator=',');

5 markers on 7 individuals were added.


In [10]:
JWAS.outputEBV(model1,["a1","a2","a3"]);

Estimated breeding values and prediction error variances will be included in the output.


<button type="button" class="btn btn-lg btn-primary">Step 7: Run Analysis</button> 

In [11]:
outputMCMCsamples(model1,"x2")
out1=runMCMC(model1,phenotypes,methods="BayesC",estimatePi=true,single_step_analysis=true,pedigree=pedigree,chain_length=5000,output_samples_frequency=100);


The prior for marker effects variance is calculated from 
the genetic variance and π. The prior for the marker effects variance 
is: 0.492462



A Linear Mixed Model was build using model equations:

y1 = intercept + x1*x3 + x2 + x3 + ID + dam

Model Information:

Term            C/F          F/R            nLevels
intercept       factor       fixed                1
x1*x3           interaction  fixed                2
x2              factor       random               2
x3              factor       fixed                2
ID              factor       random              12
dam             factor       random              12
ϵ               factor       random               5
J               covariate    fixed                1

MCMC Information:

methods                                      BayesC
chain_length                                   5000
burnin                                            0
estimatePi                                     true
starting_value

running MCMC for BayesC...100%|█████████████████████████| Time: 0:00:01


<button type="button" class="btn btn-lg btn-primary">Check Results</button> 

In [12]:
out1["Posterior mean of Pi"]

0.5019453755046704
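
The run log for `In [11]` states that the prior for the marker effects variance is calculated from the genetic variance and π. A minimal sketch of one common scaling follows. It assumes the marker variance equals the genetic variance divided by (1 - π) times the sum of 2p(1 - p) over markers, where p is the allele frequency of each marker. The genotype matrix, the starting value of π, and all variable names are made up for illustration. This is not taken from the JWAS source, and a single-step analysis may treat non-genotyped individuals differently.

```julia
# Hypothetical 0/1/2 genotype matrix: 4 individuals x 3 markers (made-up data)
M = [0.0 1.0 2.0;
     1.0 1.0 0.0;
     2.0 0.0 1.0;
     1.0 2.0 1.0]

genetic_variance = 1.0   # plays the role of G3 in the single-trait cells
pi_prior         = 0.0   # assumed starting value of π (an assumption, not read from JWAS)

# Allele frequency per marker: column sum divided by twice the number of individuals
p = [sum(M[:, j]) / (2 * size(M, 1)) for j in 1:size(M, 2)]

# Sum over markers of 2p(1 - p)
sum2pq = sum(2 .* p .* (1 .- p))

# Assumed scaling from genetic variance to marker effects variance
marker_variance = genetic_variance / ((1 - pi_prior) * sum2pq)
println(marker_variance)
```
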

In [13]:
out1

Dict{Any,Any} with 7 entries:
  "Posterior mean of polyg… => [3.19318 -0.449993; -0.449993 2.27363]
  "EBV_y1"                  => Any["a1" -1.29728; "a2" -0.780137; "a3" -2.84001]
  "Posterior mean of marke… => Any["m1" -0.0130229; "m2" -0.122099; … ; "m4" -0…
  "Posterior mean of resid… => 2.98433
  "Posterior mean of marke… => 0.527822
  "Posterior mean of locat… => 37×4 DataFrames.DataFrame…
  "Posterior mean of Pi"    => 0.501945

In [14]:
res=out1["Posterior mean of location parameters"]

Row  Trait  Effect     Level      Estimate
1    1      intercept  intercept  20.9864
2    1      x1*x3      x1 * m     -3.39436
3    1      x1*x3      x1 * f     0.571842
4    1      x2         2          0.128036
5    1      x2         1          -0.127236
6    1      x3         m          -18.0363
7    1      x3         f          -21.0072
8    1      ID         a12        0.50457
9    1      ID         a10        -0.163578
10   1      ID         a11        0.0868416



<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Multi-trait Single-step Bayesian Regression (Incomplete Genomic Data)</font> 
</div>

<button type="button" class="btn btn-lg btn-primary">Step 3: Build Model Equations</button> 

In [16]:
model_equation2 ="y1 = intercept + x1 + x3 + ID + dam
                  y2 = intercept + x1 + x2 + x3 + ID
                  y3 = intercept + x1 + x1*x3 + x2 + ID";

In [17]:
R      = eye(3)
model2 = build_model(model_equation2,R);

<button type="button" class="btn btn-lg btn-primary">Step 4: Set Factors or Covariates</button> 

In [18]:
set_covariate(model2,"x1");

<button type="button" class="btn btn-lg btn-primary">Step 5: Set Random or Fixed Effects</button> 

In [19]:
G1 = eye(2)
G2 = eye(4)
set_random(model2,"x2",G1);
set_random(model2,"ID dam",pedigree,G2);

INFO: x2 is not found in model equation 1.
INFO: dam is not found in model equation 2.
INFO: dam is not found in model equation 3.

<button type="button" class="btn btn-lg btn-primary">Step 6: Use Genomic Information</button> 

In [20]:
G3 = eye(3)
add_genotypes(model2,genofile,G3,separator=',');

5 markers on 7 individuals were added.


<button type="button" class="btn btn-lg btn-primary">Step 7: Run Analysis</button> 

In [21]:
out2=runMCMC(model2,phenotypes,methods="BayesC",estimatePi=true,single_step_analysis=true,pedigree=pedigree,chain_length=5000,output_samples_frequency=100);

INFO: Pi (Π) is not provided.
INFO: Pi (Π) is generated assuming all markers have effects on all traits.


The prior for marker effects covariance matrix is calculated from 
genetic covariance matrix and Π. The prior for the marker effects 
covariance matrix is: 

 0.492462  0.0       0.0     
 0.0       0.492462  0.0     
 0.0       0.0       0.492462


A Linear Mixed Model was build using model equations:

y1 = intercept + x1 + x3 + ID + dam
y2 = intercept + x1 + x2 + x3 + ID
y3 = intercept + x1 + x1*x3 + x2 + ID

Model Information:

Term            C/F          F/R            nLevels
intercept       factor       fixed                1
x1              covariate    fixed                1
x3              factor       fixed                2
ID              factor       random              12
dam             factor       random              12
x2              factor       random               2
x1*x3           interaction  fixed                2
ϵ               factor       random               5
J               covariate    fixed                1

MCMC Information:

methods

running MCMC for BayesC...100%|█████████████████████████| Time: 0:00:05


<button type="button" class="btn btn-lg btn-primary">Check Results</button> 

In [23]:
keys(out2)

Base.KeyIterator for a Dict{Any,Any} with 7 entries. Keys:
  "Posterior mean of polygenic effects covariance matrix"
  "Model frequency"
  "Posterior mean of residual covariance matrix"
  "Posterior mean of marker effects"
  "Posterior mean of marker effects covariance matrix"
  "Posterior mean of location parameters"
  "Posterior mean of Pi"

In [24]:
out2["Posterior mean of Pi"]

Dict{Array{Float64,1},Float64} with 8 entries:
  [1.0, 0.0, 1.0] => 0.125355
  [0.0, 0.0, 1.0] => 0.123946
  [0.0, 1.0, 1.0] => 0.121144
  [1.0, 1.0, 0.0] => 0.126313
  [0.0, 0.0, 0.0] => 0.129351
  [0.0, 1.0, 0.0] => 0.123645
  [1.0, 0.0, 0.0] => 0.128658
  [1.0, 1.0, 1.0] => 0.121586
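
In the multi-trait run, the posterior mean of Π is a dictionary keyed by a vector with one entry per trait. The sketch below collapses it into one marginal probability per trait by summing the entries whose key has a 1.0 in that trait's position. Reading 1.0 as "the marker has an effect on this trait" is an assumption, chosen to match the `INFO` message that Π is generated assuming all markers have effects on all traits. The values are copied from the output above, and the variable names are illustrative.

```julia
# Posterior means copied from out2["Posterior mean of Pi"] above
Pi = Dict(
    [1.0, 0.0, 1.0] => 0.125355,
    [0.0, 0.0, 1.0] => 0.123946,
    [0.0, 1.0, 1.0] => 0.121144,
    [1.0, 1.0, 0.0] => 0.126313,
    [0.0, 0.0, 0.0] => 0.129351,
    [0.0, 1.0, 0.0] => 0.123645,
    [1.0, 0.0, 0.0] => 0.128658,
    [1.0, 1.0, 1.0] => 0.121586,
)

ntraits  = 3
marginal = zeros(ntraits)   # marginal[t]: total probability of keys with 1.0 at position t

for (delta, prob) in Pi
    for t in 1:ntraits
        if delta[t] == 1.0
            marginal[t] += prob
        end
    end
end

println(marginal)
```

Because the eight probabilities are nearly equal, each marginal comes out close to 0.5, comparable to the single-trait posterior mean of π reported for `out1`.
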

In [25]:
out2["Posterior mean of marker effects"][1]

5×2 Array{Any,2}:
 "m1"  -0.0391361
 "m2"  -0.241005 
 "m3"   0.212377 
 "m4"  -0.113705 
 "m5"   0.0928107

In [26]:
out2["Posterior mean of marker effects"]

3-element Array{Any,1}:
 Any["m1" -0.0391361; "m2" -0.241005; … ; "m4" -0.113705; "m5" 0.0928107]
 Any["m1" 0.0346502; "m2" 0.00247991; … ; "m4" 0.0775843; "m5" 0.0825744]
 Any["m1" -0.0815575; "m2" 0.059716; … ; "m4" 0.0522826; "m5" 0.00345809]

In [27]:
out1["Posterior mean of marker effects"]

5×2 Array{Any,2}:
 "m1"  -0.0130229
 "m2"  -0.122099 
 "m3"   0.190064 
 "m4"  -0.123359 
 "m5"  -0.0262301

In [29]:
out2["EBV_y1"]

LoadError: KeyError: key "EBV_y1" not found
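
The single-trait cells called `JWAS.outputEBV(model1,["a1","a2","a3"])` before `runMCMC`, and `out1` contains `"EBV_y1"`. No such call appears for `model2`, which plausibly explains why the key is missing from `out2`. Calling `JWAS.outputEBV` on `model2` before `runMCMC` may add the EBV entries, but that is an untested assumption. Independently of JWAS, a lookup can be guarded so that a missing key does not raise an error. The sketch below uses a stand-in dictionary with placeholder values rather than the real `out2`.

```julia
# Stand-in for out2: two keys copied from the keys(out2) listing, values are placeholders
out = Dict{Any,Any}(
    "Posterior mean of Pi" => nothing,
    "Model frequency"      => nothing,
)

key = "EBV_y1"

# get returns the supplied default (here nothing) instead of throwing a KeyError
ebv = get(out, key, nothing)

if ebv === nothing
    println("No entry named ", key, "; available keys: ", join(collect(keys(out)), ", "))
end
```
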