
<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Multivariate Linear Mixed Effects Model with Genomic Data</font> 
</div>

In [1]:
using DataFrames,JWAS
using JWAS: Datasets,misc



<button type="button" class="btn btn-lg btn-primary">Data</button> 

In [2]:
phenofile = Datasets.dataset("testMT","phenotype.txt")
genofile  = Datasets.dataset("testMT","genotype.txt")
pedfile   = Datasets.dataset("testMT","pedigree.txt");

### Phenotypes

In [3]:
;cat $phenofile

Animal,BW,CW,age,sex
S1,100.0,10.0,8,M
D1,50.0,12.9,7,F
O1,150.0,13.0,3,M
O3,40.0,5.0,4,F


### Genotypes

In [4]:
;cat $genofile

Animal,x1,x2,x3,x4,x5
S1,1.0,0.0,1.0,1.0,1.0
D1,2.0,0.0,2.0,2.0,1.0
O1,1.0,2.0,0.0,1.0,0.0
O3,0.0,0.0,2.0,1.0,1.0


### Pedigree

In [5]:
;cat $pedfile

S1 0 0
D1 0 0
O1 S1 D1
O2 S1 D1
O3 S1 D1


In [6]:
data=readtable(phenofile)

Unnamed: 0,Animal,BW,CW,age,sex
1,S1,100.0,10.0,8,M
2,D1,50.0,12.9,7,F
3,O1,150.0,13.0,3,M
4,O3,40.0,5.0,4,F


<div class="span5 alert alert-info">
 <font size="5" face="Georgia">I. Multiple Traits Analyses with Marker Information</font><br> 
</div>

<button type="button" class="btn btn-lg btn-primary">Build Model</button> 

### Genetic covariance matrix and residual covariance matrix

In [7]:
R      = [10.0 2.0
           2.0 1.0]
G      = [20.0 1.0
           1.0 2.0];

In [8]:
model_equations = "BW = intercept + age + sex;
                   CW = intercept + age + sex";

In [9]:
model1 = build_model(model_equations,R);

In [10]:
set_covariate(model1,"age");

In [11]:
add_markers(model1,genofile,G,separator=',',header=true);

5 markers on 4 individuals were added.


<button type="button" class="btn btn-lg btn-primary">Run Model</button> 

In [12]:
# Prior probabilities that a marker affects (BW, CW): both, BW only, CW only, neither
Pi=Dict([1.0; 1.0]=>0.7,[1.0; 0.0]=>0.1,[0.0; 1.0]=>0.1,[0.0; 0.0]=>0.1)
out = runMCMC(model1,data,Pi=Pi,chain_length=5000,methods="BayesC",
              estimatePi=true,output_samples_frequency=5);

Priors for marker effects covariance matrix were calculated from genetic covariance matrix and π.
Marker effects covariance matrix is 
[10.958904 0.626223
 0.626223 1.09589].


MCMC Information:
methods                                      BayesC
chain_length                                   5000
estimatePi                                     true
constraint                                    false
missing_phenotypes                            false
starting_value                                false
output_samples_frequency                          5
printout_frequency                             5001
update_priors_frequency                           0

Degree of freedom for hyper-parameters:
residual variances:                           4.000
iid random effect variances:                  4.000
polygenic effect variances:                   4.000
marker effect variances:                      4.000



running MCMC for BayesC...100%|█████████████████████████| Time: 0:00:02
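
The marker effects covariance matrix printed in the output above can be checked by hand. The following is a minimal sketch, assuming that each element `G[i,j]` is divided by the sum of `2p(1-p)` over markers multiplied by the prior probability that a marker affects both trait `i` and trait `j`. This scaling is inferred from the printed numbers, not taken from the JWAS source, and `marker_cov_prior` is a hypothetical helper rather than a JWAS function.

```julia
# Genotypes copied from genotype.txt (rows: S1, D1, O1, O3; columns: x1..x5)
M = [1.0 0.0 1.0 1.0 1.0
     2.0 0.0 2.0 2.0 1.0
     1.0 2.0 0.0 1.0 0.0
     0.0 0.0 2.0 1.0 1.0]

# Genetic covariance matrix and prior inclusion probabilities from the cells above
G  = [20.0 1.0
       1.0 2.0]
Pi = Dict([1.0, 1.0] => 0.7, [1.0, 0.0] => 0.1, [0.0, 1.0] => 0.1, [0.0, 0.0] => 0.1)

# Hypothetical helper (assumption, not part of JWAS)
function marker_cov_prior(M, G, Pi)
    nind, nmarkers = size(M)
    # Sum of 2p(1-p) over markers, with p the allele frequency of each marker
    sum2pq = 0.0
    for k in 1:nmarkers
        p = sum(M[:, k]) / (2 * nind)
        sum2pq += 2 * p * (1 - p)
    end
    ntraits = size(G, 1)
    V = zeros(ntraits, ntraits)
    for i in 1:ntraits, j in 1:ntraits
        # Prior probability that a marker has an effect on both trait i and trait j
        pij = 0.0
        for (delta, prob) in Pi
            if delta[i] == 1.0 && delta[j] == 1.0
                pij += prob
            end
        end
        V[i, j] = G[i, j] / (sum2pq * pij)
    end
    return V
end

V = marker_cov_prior(M, G, Pi)
println(V)
```

With these inputs the result agrees with the printed matrix (10.958904, 0.626223, 1.09589) to the displayed precision.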


<button type="button" class="btn btn-lg btn-primary">Check Results</button> 

In [13]:
keys(out)

Base.KeyIterator for a Dict{Any,Any} with 7 entries. Keys:
  "Model frequency"
  "Posterior mean of residual covariance matrix"
  "Posterior mean of marker effects"
  "Posterior mean of marker effects covariance matrix"
  "MCMC samples for residual covariance matrix"
  "Posterior mean of location parameters"
  "Posterior mean of Pi"

In [14]:
file1="MCMC_samples_for_marker_effects_BW.txt"
file2="MCMC_samples_for_marker_effects_CW.txt";

In [15]:
get_breeding_values(model1,file1,file2)

2-element Array{Any,1}:
 4×3 DataFrames.DataFrame
│ Row │ ID   │ EBV      │ PEV     │
├─────┼──────┼──────────┼─────────┤
│ 1   │ "S1" │ -1.13724 │ 8.07753 │
│ 2   │ "D1" │ 8.04833  │ 86.4629 │
│ 3   │ "O1" │ 3.37534  │ 70.1204 │
│ 4   │ "O3" │ -10.2864 │ 70.7876 │            
 4×3 DataFrames.DataFrame
│ Row │ ID   │ EBV       │ PEV      │
├─────┼──────┼───────────┼──────────┤
│ 1   │ "S1" │ -0.137831 │ 0.538611 │
│ 2   │ "D1" │ 2.29012   │ 6.84982  │
│ 3   │ "O1" │ 0.163889  │ 4.43166  │
│ 4   │ "O3" │ -2.31618  │ 4.63616  │

In [16]:
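# MCMC samples of the additive genetic covariance matrix, computed from the stored marker effect samples
samples4G=get_additive_genetic_variances(model1,file1,file2);

# MCMC samples of the residual covariance matrix
samples4R=out["MCMC samples for residual covariance matrix"];

# Heritability of each trait for every MCMC sample
samples4h2=get_heritability(reformat(samples4G),reformat(samples4R));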

In [17]:
writedlm("out3.txt",samples4G)

In [18]:
samples_genetic_correlation=get_correlations(reformat(samples4G));
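
For a single pair of covariance matrices, heritability and genetic correlation can be worked out directly. The sketch below uses the textbook definitions (heritability as genetic variance over genetic plus residual variance, correlation as covariance over the product of standard deviations) applied to the prior `G` and `R` from In [7]. The helper names are hypothetical, and it is an assumption that `get_heritability` and `get_correlations` apply the same formulas to each MCMC sample.

```julia
# Prior covariance matrices from In [7]
G = [20.0 1.0
      1.0 2.0]
R = [10.0 2.0
      2.0 1.0]

# Hypothetical helpers (assumptions, not JWAS functions)
# Heritability of trait i: genetic variance / (genetic variance + residual variance)
heritability(G, R) = [G[i, i] / (G[i, i] + R[i, i]) for i in 1:size(G, 1)]

# Genetic correlation between traits i and j
genetic_correlation(G, i, j) = G[i, j] / sqrt(G[i, i] * G[j, j])

h2 = heritability(G, R)
rg = genetic_correlation(G, 1, 2)
println("heritability (BW, CW): ", h2)
println("genetic correlation:   ", rg)
```

Applying the same two functions to every sampled matrix would give one heritability vector and one correlation per MCMC sample.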

In [19]:
writedlm("out.G.txt",reformat(reformat(samples4G)))

In [20]:
# Note: this rebinds `out`, which previously held the MCMC results Dict from runMCMC
out=readdlm("out.G.txt")

4x1000 Array{Float64,2}:
  42.8867   36.6135    50.638   173.056   …  127.965    12.5632     6.30952 
 -13.6981    0.115981   7.1709   12.3879       9.47146   0.0818334  1.91655 
 -13.6981    0.115981   7.1709   12.3879       9.47146   0.0818334  1.91655 
   4.53719  10.889      8.0778    8.1298       1.22728   0.574829   0.632083

In [21]:
typeof(samples4h2)

Array{Array{Float64,1},1}

In [31]:
reformat(out)

1000-element Array{Array{Float64,2},1}:
 2x2 Array{Float64,2}:
  42.8867  -13.6981 
 -13.6981    4.53719    
 2x2 Array{Float64,2}:
 36.6135     0.115981
  0.115981  10.889     
 2x2 Array{Float64,2}:
 50.638   7.1709
  7.1709  8.0778            
 2x2 Array{Float64,2}:
 173.056   12.3879
  12.3879   8.1298        
 2x2 Array{Float64,2}:
 95.6842  2.7784 
  2.7784  3.56561          
 2x2 Array{Float64,2}:
 126.866   4.9117 
   4.9117  1.96004        
 2x2 Array{Float64,2}:
 166.642  0.0
   0.0    0.0                  
 2x2 Array{Float64,2}:
 215.404    9.27001 
   9.27001  0.468837    
 2x2 Array{Float64,2}:
 172.508   10.1742  
  10.1742   0.660939    
 2x2 Array{Float64,2}:
 77.2814  0.0
  0.0     0.0                  
 2x2 Array{Float64,2}:
 151.585   10.9064 
  10.9064   0.80257      
 2x2 Array{Float64,2}:
 160.008   20.9497 
  20.9497   3.00534      
 2x2 Array{Float64,2}:
 202.058   26.1819 
  26.1819   4.26451      
 ⋮                                                             
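
The output above suggests that calling `reformat` on a matrix turns each column into one square covariance matrix. A minimal sketch of that reshaping, assuming column-major stacking, is given below. `columns_to_matrices` is a hypothetical stand-in and not the JWAS implementation. The input uses the first two columns of the matrix read from `out.G.txt`.

```julia
# First two columns of the 4x1000 matrix shown earlier
A = [ 42.8867   36.6135
     -13.6981    0.115981
     -13.6981    0.115981
       4.53719  10.889]

# Hypothetical stand-in for `reformat` applied to a matrix (assumption)
function columns_to_matrices(A)
    n = round(Int, sqrt(size(A, 1)))     # each column holds an n x n matrix
    return [reshape(A[:, k], n, n) for k in 1:size(A, 2)]
end

mats = columns_to_matrices(A)
println(mats[1])
println(mats[2])
```

Because the covariance matrices are symmetric, the result does not depend on whether the columns were stacked by row or by column.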

In [23]:
out10=reformat(samples4h2)

2x1000 Array{Float64,2}:
 0.228612  0.20322  0.211682  0.648543  …  0.649724  0.173675   0.207354
 0.355741  0.4927   0.425803  0.676871     0.12075   0.0906189  0.157265

In [28]:
reformat(out,2)

1000-element Array{Array{Float64,2},1}:
 2x2 Array{Float64,2}:
  42.8867  -13.6981 
 -13.6981    4.53719    
 2x2 Array{Float64,2}:
 36.6135     0.115981
  0.115981  10.889     
 2x2 Array{Float64,2}:
 50.638   7.1709
  7.1709  8.0778            
 2x2 Array{Float64,2}:
 173.056   12.3879
  12.3879   8.1298        
 2x2 Array{Float64,2}:
 95.6842  2.7784 
  2.7784  3.56561          
 2x2 Array{Float64,2}:
 126.866   4.9117 
   4.9117  1.96004        
 2x2 Array{Float64,2}:
 166.642  0.0
   0.0    0.0                  
 2x2 Array{Float64,2}:
 215.404    9.27001 
   9.27001  0.468837    
 2x2 Array{Float64,2}:
 172.508   10.1742  
  10.1742   0.660939    
 2x2 Array{Float64,2}:
 77.2814  0.0
  0.0     0.0                  
 2x2 Array{Float64,2}:
 151.585   10.9064 
  10.9064   0.80257      
 2x2 Array{Float64,2}:
 160.008   20.9497 
  20.9497   3.00534      
 2x2 Array{Float64,2}:
 202.058   26.1819 
  26.1819   4.26451      
 ⋮                                                             

In [57]:
# Convert an array of sample vectors into a matrix with one column per sample
function reformat2(G::Array{Array{Float64,1},1})
    Gnew = zeros(length(G[1]),length(G))
    for i in 1:length(G)
        Gnew[:,i]=vec(G[i])
    end
    Gnew
end

reformat2 (generic function with 2 methods)

In [58]:
reformat2(samples4h2)

2x1000 Array{Float64,2}:
 0.212407  0.240839  0.0911501  0.557378  …  0.362104   0.0637727  0.0537529
 0.481242  0.66608   0.297106   0.66303      0.0397157  0.0488209  0.14121  

In [23]:
# genetic covariance between trait 1 and trait 2 (element [1,2] of each sampled matrix)
report(reformat(samples4G),index=[1,2]);

Summary Stats:
Mean:         30.352774
Minimum:      -13.698141
1st Quartile: 0.192949
Median:       21.851885
3rd Quartile: 51.839597
Maximum:      194.005344
nothing


<div class="span5 alert alert-info">
 <font size="5" face="Georgia">II. Multiple Traits Analyses with Marker Effects and Polygenic Effects</font><br> 
</div>

<button type="button" class="btn btn-lg btn-primary">Build Model</button> 

In [23]:
model_equations = "BW = intercept + age + sex + Animal;
                   CW = intercept + age + sex + Animal";
model2          = build_model(model_equations,R);

set_covariate(model2,"age");

Get pedigree information from a file.

In [24]:
ped=get_pedigree(pedfile);

Finished!


In [25]:
# Assign 10% of the genetic covariance to the polygenic (pedigree-based) effects
GA = G*0.1
set_random(model2,"Animal",ped,GA)
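
The polygenic effect is tied to the pedigree through the additive relationships among animals. As an illustration only, the sketch below builds the numerator relationship matrix for the five animals in `pedigree.txt` with the standard tabular method. `relationship_matrix` and `ped_records` are hypothetical names, parents are assumed to be listed before their offspring, and JWAS may handle the pedigree differently internally (for example by working with the inverse directly).

```julia
# Pedigree from pedigree.txt as (animal, sire, dam); "0" marks an unknown parent
ped_records = [("S1", "0", "0"), ("D1", "0", "0"),
               ("O1", "S1", "D1"), ("O2", "S1", "D1"), ("O3", "S1", "D1")]

# Hypothetical helper (assumption, not a JWAS function): tabular method
function relationship_matrix(ped_records)
    n = length(ped_records)
    index = Dict{String,Int}()
    for (k, rec) in enumerate(ped_records)
        index[rec[1]] = k
    end
    A = zeros(n, n)
    for i in 1:n
        s = get(index, ped_records[i][2], 0)   # 0 if the sire is unknown
        d = get(index, ped_records[i][3], 0)   # 0 if the dam is unknown
        for j in 1:(i - 1)
            # Relationship with an earlier animal: average over the known parents
            a = 0.0
            if s > 0
                a += 0.5 * A[j, s]
            end
            if d > 0
                a += 0.5 * A[j, d]
            end
            A[i, j] = a
            A[j, i] = a
        end
        # Diagonal: 1 plus half the relationship between the parents (inbreeding)
        A[i, i] = 1.0 + ((s > 0 && d > 0) ? 0.5 * A[s, d] : 0.0)
    end
    return A
end

A = relationship_matrix(ped_records)
println(A)
```

Here the two founders are unrelated, each offspring has relationship 0.5 with each parent, and the full sibs O1, O2 and O3 have relationship 0.5 with one another.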

In [26]:
# Assign the remaining 90% of the genetic covariance to the markers
GM = G*0.9
add_markers(model2,genofile,GM,separator=',',header=true);

5 markers on 4 individuals were added.


<button type="button" class="btn btn-lg btn-primary">Run Model</button> 

In [27]:
Pi=Dict([1.0; 1.0]=>0.25,[1.0;0.0]=>0.25,[0.0,1.0]=>0.25,[0.0; 0.0]=>0.25)
out2=runMCMC(model2,data,Pi=Pi,chain_length=5000,methods="BayesC");

Priors for marker effects covariance matrix were calculated from genetic covariance matrix and π.
Marker effects covariance matrix is 
[15.780822 1.578082
 1.578082 1.578082].


MCMC Information:
methods                                      BayesC
chain_length                                   5000
estimatePi                                    false
constraint                                    false
missing_phenotypes                            false
starting_value                                false
output_samples_frequency                          0
printout_frequency                             5001
update_priors_frequency                           0

Degree of freedom for hyper-parameters:
residual variances:                           4.000
iid random effect variances:                  4.000
polygenic effect variances:                   4.000
marker effect variances:                      4.000



running MCMC for BayesC...100%|█████████████████████████| Time: 0:00:01


<button type="button" class="btn btn-lg btn-primary">Check Results</button> 

In [28]:
keys(out2)

Base.KeyIterator for a Dict{Any,Any} with 6 entries. Keys:
  "Posterior mean of polygenic effects covariance matrix"
  "Model frequency"
  "Posterior mean of residual covariance matrix"
  "Posterior mean of marker effects"
  "Posterior mean of marker effects covariance matrix"
  "Posterior mean of location parameters"

In [29]:
out2["Posterior mean of polygenic effects covariance matrix"]

2x2 Array{Float64,2}:
 2.03556   0.109594
 0.109594  0.195886