
<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Bayesian Linear Mixed Models (Genomic Data)</font> 
</div>

<button type="button" class="btn btn-lg btn-primary">Step 1: Load Packages</button> 

In [1]:
using JWAS,JWAS.Datasets,DataFrames,CSV



<button type="button" class="btn btn-lg btn-primary">Step 2: Read Data</button> 

In [2]:
# Paths to the example files bundled with JWAS
phenofile  = Datasets.dataset("example","phenotypes.txt")
pedfile    = Datasets.dataset("example","pedigree.txt")
genofile   = Datasets.dataset("example","genotypes.txt")

# "NA" entries in the phenotype file are read as missing
phenotypes = CSV.read(phenofile,delim = ',',header=true,missingstrings=["NA"])
pedigree   = get_pedigree(pedfile,separator=",",header=true);

The delimiter in pedigree.txt is ','.
Finished!


In [3]:
first(phenotypes,5)

| Row | ID | y1 | y2 | y3 | x1 | x2 | x3 | dam |
|---|---|---|---|---|---|---|---|---|
| | String | Float64⍰ | Float64⍰ | Float64⍰ | Float64 | Float64 | String | String |
| 1 | a1 | -0.06 | 3.58 | -1.18 | 0.9 | 2.0 | m | 0 |
| 2 | a3 | -2.07 | 3.19 | missing | 0.7 | 2.0 | f | 0 |
| 3 | a4 | -2.63 | 6.97 | -0.83 | 0.6 | 1.0 | m | a2 |
| 4 | a5 | 2.31 | missing | -1.52 | 0.4 | 2.0 | m | a2 |
| 5 | a6 | 0.93 | 4.87 | -0.01 | 5.0 | 2.0 | f | a3 |


<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Univariate Linear Mixed Model (Genomic data)</font> 
</div>

<button type="button" class="btn btn-lg btn-primary">Step 3: Build Model Equations</button> 

In [4]:
model_equation1  ="y1 = intercept + x1*x3 + x2 + x3 + ID + dam";

In [5]:
R      = 1.0
model1 = build_model(model_equation1,R);

<button type="button" class="btn btn-lg btn-primary">Step 4: Set Factors or Covariates</button> 

In [6]:
set_covariate(model1,"x1");

<button type="button" class="btn btn-lg btn-primary">Step 5: Set Random or Fixed Effects</button> 

In [7]:
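# Variance of the random effect x2
G1 = 1.0
# Covariance matrix of the pedigree-based random effects ID and dam
G2 = [1.0 0.5
      0.5 1.0]
set_random(model1,"x2",G1);
set_random(model1,"ID dam",pedigree,G2);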
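
The 2×2 matrix `G2` above is supplied as the covariance of the two pedigree-based random effects named in `"ID dam"`. A covariance matrix should be symmetric and positive definite, and the minimal sketch below checks both properties with Julia's `LinearAlgebra` standard library before a matrix is handed to `set_random`. The helper name `is_valid_covariance` is hypothetical and is not part of JWAS.

```julia
using LinearAlgebra

# Hypothetical helper (not a JWAS function): true when A is symmetric
# and positive definite, i.e. usable as a covariance matrix.
is_valid_covariance(A::AbstractMatrix) = issymmetric(A) && isposdef(A)

# The matrix used in this step
G2 = [1.0 0.5
      0.5 1.0]

# A symmetric matrix that is not positive definite (eigenvalues -1 and 3)
bad = [1.0 2.0
       2.0 1.0]

println(is_valid_covariance(G2))
println(is_valid_covariance(bad))
```

The first call prints `true` and the second prints `false`, so a check like this can catch a mistyped off-diagonal element before the analysis starts.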

<button type="button" class="btn btn-lg btn-primary">Step 6: Use Genomic Information</button> 

In [8]:
# Genetic variance attributed to the markers
G3 = 1.0
add_genotypes(model1,genofile,G3,separator=',');

The delimiter in genotypes.txt is ','.
The header (marker IDs) is provided in genotypes.txt.
5 markers on 7 individuals were added.
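
The scalar `G3` passed to `add_genotypes` is a genetic variance, and the run log in Step 7 reports that the prior for the marker effects variance is calculated from this genetic variance and π. A conversion commonly used in genomic prediction, which this sketch assumes rather than takes from the JWAS source, divides the genetic variance by (1 − π) Σ 2p(1 − p), where p is the allele frequency of each marker and π is the prior probability that a marker has no effect. The function name `marker_variance_prior`, the argument `prob_zero`, and the toy genotype matrix are all hypothetical.

```julia
using Statistics

# Hypothetical helper (not a JWAS function).
# M                : individuals × markers genotype matrix coded 0/1/2
# genetic_variance : genetic variance attributed to the markers
# prob_zero        : assumed prior probability that a marker has no effect (π)
function marker_variance_prior(M::AbstractMatrix, genetic_variance::Real, prob_zero::Real)
    p = vec(mean(M, dims=1)) ./ 2           # allele frequency of each marker
    sum2pq = sum(2 .* p .* (1 .- p))        # Σ 2p(1 - p) over markers
    return genetic_variance / ((1 - prob_zero) * sum2pq)
end

# Made-up genotypes for 4 individuals and 3 markers (not the JWAS example data)
M_toy = [0 1 2
         1 1 0
         2 0 1
         1 2 1]

println(marker_variance_prior(M_toy, 1.0, 0.0))
println(marker_variance_prior(M_toy, 1.0, 0.5))
```

If this assumption holds for the example data, the mean of 0.492462 reported in the log for a genetic variance of 1.0 would correspond to (1 − π) Σ 2p(1 − p) of roughly 2.03 across the five markers.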


<button type="button" class="btn btn-lg btn-primary">Step 7: Run Analysis</button> 

In [9]:
# Without the outputEBV line, EBVs for all genotyped individuals are returned by default.
outputEBV(model1,["a1","a2","a3"]);
out1 = runMCMC(model1, phenotypes,
               methods="BayesC", estimatePi=true,
               chain_length=5000, output_samples_frequency=100,
               output_heritability=false);

Testing individuals are not a subset of genotyped individuals (complete genomic data,non-single-step). Only output EBV for tesing individuals with genotypes.
Checking phenotypes...
Individual IDs (strings) are provided in the first column of the phenotypic data.
Phenotypes for all traits included in the model for individual a7 in the row 6 are missing. This record is deleted.

The prior for marker effects variance is calculated from the genetic variance and π.
The mean of the prior for the marker effects variance is: 0.492462

A Linear Mixed Model was build using model equations:

y1 = intercept + x1*x3 + x2 + x3 + ID + dam

Model Information:

Term            C/F          F/R            nLevels
intercept       factor       fixed                1
x1*x3           interaction  fixed                2
x2              factor       random               2
x3              factor       fixed                2
ID              fac

running MCMC for BayesC...100%|█████████████████████████| Time: 0:00:01

The version of Julia and Platform in use:

Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)


[0m[1mThe analysis has finished. Results are saved in the returned [22m[0m[1mvariable and text files. MCMC samples are saved in text files.[22m




<button type="button" class="btn btn-lg btn-primary">Check Results</button> 

In [10]:
keys(out1)

Base.KeySet for a Dict{Any,Any} with 7 entries. Keys:
  "marker effects"
  "EBV_y1"
  "Pi"
  "location parameters"
  "residual variance"
  "polygenic effects covariance matrix"
  "marker effects variance"

In [11]:
out1["EBV_y1"]

| Row | ID | EBV | PEV |
|---|---|---|---|
| | Any | Any | Any |
| 1 | a1 | 0.439279 | 12.5976 |
| 2 | a3 | -0.679551 | 14.5684 |


In [12]:
out1["Pi"]

| Row | π | Estimate | Std_Error |
|---|---|---|---|
| | Any | Any | Any |
| 1 | π | 0.502187 | 0.28722 |


<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Multivariate Linear Mixed Model (Genomic data)</font> 
</div>

<button type="button" class="btn btn-lg btn-primary">Step 3: Build Model Equations</button> 

In [13]:
model_equation2 ="y1 = intercept + x1 + x3 + ID + dam
                  y2 = intercept + x1 + x2 + x3 + ID
                  y3 = intercept + x1 + x1*x3 + x2 + ID";
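
In the multi-trait model each trait has its own equation, and a term does not have to appear in every equation. The number of equations that contain a random term determines the size of the covariance matrix supplied for it in Step 5: `x2` appears in two equations (a 2×2 `G1`), while `ID` and `dam` together appear four times (a 4×4 `G2`). The sketch below counts these occurrences directly from the equation string. The helper `count_equations_with` is hypothetical and is not part of JWAS, and the sketch makes no claim about the order in which JWAS arranges the rows of `G2`.

```julia
# Model equations copied from this notebook
equations = "y1 = intercept + x1 + x3 + ID + dam
             y2 = intercept + x1 + x2 + x3 + ID
             y3 = intercept + x1 + x1*x3 + x2 + ID"

# Hypothetical helper (not a JWAS function): number of equations whose
# right-hand side contains `term` as a stand-alone term.
function count_equations_with(equations::AbstractString, term::AbstractString)
    n = 0
    for line in split(equations, '\n')
        rhs = split(line, '=')[2]              # text after the '=' sign
        terms = strip.(split(rhs, '+'))        # individual model terms
        n += (term in terms)
    end
    return n
end

n_x2  = count_equations_with(equations, "x2")
n_ped = count_equations_with(equations, "ID") + count_equations_with(equations, "dam")

println("x2 covariance matrix: ", n_x2, "×", n_x2)
println("ID/dam covariance matrix: ", n_ped, "×", n_ped)
```

Because the interaction `x1*x3` is kept as a single term, it is not counted as an occurrence of `x1` or `x3`.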

In [14]:
R      = [1.0 0.5 0.5
          0.5 1.0 0.5
          0.5 0.5 1.0]
model2 = build_model(model_equation2,R);

<button type="button" class="btn btn-lg btn-primary">Step 4: Set Factors or Covariates</button> 

In [15]:
set_covariate(model2,"x1");

<button type="button" class="btn btn-lg btn-primary">Step 5: Set Random or Fixed Effects</button> 

In [16]:
# Covariance of the random effect x2 across the two equations that include it (y2 and y3)
G1 = [1.0 0.5
      0.5 1.0]
# Covariance of the pedigree-based effects: ID in y1, y2, y3 and dam in y1
G2 = [1.0 0.5 0.5 0.5
      0.5 1.0 0.5 0.5
      0.5 0.5 1.0 0.5
      0.5 0.5 0.5 1.0]
set_random(model2,"x2",G1);
set_random(model2,"ID dam",pedigree,G2);

x2 is not found in model equation 1.
dam is not found in model equation 2.
dam is not found in model equation 3.


<button type="button" class="btn btn-lg btn-primary">Step 6: Use Genomic Information</button> 

In [17]:
# Genetic covariance matrix among the three traits attributed to the markers
G3 = [1.0 0.5 0.5
      0.5 1.0 0.5
      0.5 0.5 1.0]
add_genotypes(model2,genofile,G3,separator=',');

The delimiter in genotypes.txt is ','.
The header (marker IDs) is provided in genotypes.txt.
5 markers on 7 individuals were added.
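
The 3×3 matrix `G3` is the genetic covariance among the three traits. The run log in Step 7 prints the mean of the prior for the marker effects covariance matrix, with 0.492462 on the diagonal and 0.246231 off the diagonal. Comparing the two shows that the printed matrix is `G3` multiplied by the scalar 0.492462, the same value the log reports for the univariate model:

$$
0.492462 \times
\begin{bmatrix}
1.0 & 0.5 & 0.5 \\
0.5 & 1.0 & 0.5 \\
0.5 & 0.5 & 1.0
\end{bmatrix}
=
\begin{bmatrix}
0.492462 & 0.246231 & 0.246231 \\
0.246231 & 0.492462 & 0.246231 \\
0.246231 & 0.246231 & 0.492462
\end{bmatrix}
$$

One plausible reading, stated here as an assumption rather than taken from the JWAS source, is that the scalar equals $1 / \sum_j 2p_j(1-p_j)$, where $p_j$ is the allele frequency of marker $j$, because the default Π assumes all markers have effects on all traits.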


<button type="button" class="btn btn-lg btn-primary">Step 7: Run Analysis</button> 

In [18]:
# Without the outputEBV line, EBVs for all genotyped individuals are returned by default.
outputEBV(model2,["a1","a2","a3"]);
out2 = runMCMC(model2, phenotypes,
               methods="BayesC", estimatePi=true,
               chain_length=5000, output_samples_frequency=100,
               output_heritability=false);

Testing individuals are not a subset of genotyped individuals (complete genomic data,non-single-step). Only output EBV for tesing individuals with genotypes.
Checking phenotypes...
Individual IDs (strings) are provided in the first column of the phenotypic data.
Phenotypes for all traits included in the model for individual a7 in the row 6 are missing. This record is deleted.

Pi (Π) is not provided.
Pi (Π) is generated assuming all markers have effects on all traits.

The prior for marker effects covariance matrix is calculated from genetic covariance matrix and Π.
The mean of the prior for the marker effects covariance matrix is:
 0.492462  0.246231  0.246231
 0.246231  0.492462  0.246231
 0.246231  0.246231  0.492462

A Linear Mixed Model was build using model equations:

y1 = intercept + x1 + x3 + ID + dam
y2 = intercept + x1 + x2 + x3 + ID
y3 = intercept + x1 + x1*x3 + x2 + ID

Model Information:

Term    

running MCMC for BayesC...100%|█████████████████████████| Time: 0:00:03

The version of Julia and Platform in use:

Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

The analysis has finished. Results are saved in the returned variable and text files. MCMC samples are saved in text files.




<button type="button" class="btn btn-lg btn-primary">Check Results</button> 

In [19]:
keys(out2)

Base.KeySet for a Dict{Any,Any} with 9 entries. Keys:
  "marker effects"
  "EBV_y2"
  "EBV_y1"
  "Pi"
  "location parameters"
  "residual variance"
  "polygenic effects covariance matrix"
  "EBV_y3"
  "marker effects variance"

In [20]:
out2["Pi"]

| Row | π | Estimate | Std_Error |
|---|---|---|---|
| | Any | Any | Any |
| 1 | [1.0, 1.0, 0.0] | 0.130527 | missing |
| 2 | [0.0, 0.0, 0.0] | 0.118317 | missing |
| 3 | [1.0, 0.0, 0.0] | 0.151488 | missing |
| 4 | [0.0, 1.0, 1.0] | 0.113567 | missing |
| 5 | [1.0, 0.0, 1.0] | 0.134396 | missing |
| 6 | [0.0, 0.0, 1.0] | 0.110633 | missing |
| 7 | [1.0, 1.0, 1.0] | 0.12848 | missing |
| 8 | [0.0, 1.0, 0.0] | 0.112592 | missing |
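
The eight rows above cover every combination of a three-element 0/1 indicator vector, one element per trait. Assuming that 1.0 marks a trait on which a marker has a nonzero effect (an interpretation not stated in this output), the estimates form a probability distribution over the combinations, and summing the rows that share a 1.0 in a given position gives a per-trait inclusion probability. The sketch below uses the printed estimates directly; the variable names are illustrative only.

```julia
# Estimates copied from the out2["Pi"] table:
# indicator vector (one entry per trait) => estimated probability
pi_estimates = Dict(
    [1.0, 1.0, 0.0] => 0.130527,
    [0.0, 0.0, 0.0] => 0.118317,
    [1.0, 0.0, 0.0] => 0.151488,
    [0.0, 1.0, 1.0] => 0.113567,
    [1.0, 0.0, 1.0] => 0.134396,
    [0.0, 0.0, 1.0] => 0.110633,
    [1.0, 1.0, 1.0] => 0.12848,
    [0.0, 1.0, 0.0] => 0.112592,
)

# The probabilities of all combinations should add up to one
total = sum(values(pi_estimates))

# Per-trait inclusion probability: sum over combinations with a 1.0 for trait t
marginal = [sum(p for (ind, p) in pi_estimates if ind[t] == 1.0) for t in 1:3]

println(round(total, digits=6))
println(round.(marginal, digits=6))
```

The eight estimates sum to 1.0 up to rounding, which is consistent with reading them as probabilities of mutually exclusive combinations.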


In [21]:
out2["EBV_y2"]

| Row | ID | EBV | PEV |
|---|---|---|---|
| | Any | Any | Any |
| 1 | a1 | -0.567273 | 0.797028 |
| 2 | a3 | 0.0249603 | 1.28429 |
