
<button type="button" class="btn btn-lg btn-primary">Step 1: Load Packages</button> 

In [1]:
using JWAS,JWAS.Datasets,DataFrames,CSV

<button type="button" class="btn btn-lg btn-primary">Step 2: Read data</button> 

In [2]:
phenofile  = Datasets.dataset("example","phenotypes.txt")
pedfile    = Datasets.dataset("example","pedigree.txt")
genofile   = Datasets.dataset("example","genotypes.txt")

phenotypes = CSV.read(phenofile,delim = ',',header=true,missingstrings=["NA"])
pedigree   = get_pedigree(pedfile,separator=",",header=true);

The delimiter in pedigree.txt is ','.
Finished!


In [3]:
first(phenotypes,5)

Row,ID,y1,y2,y3,x1,x2,x3,dam
,String,Float64,Float64⍰,Float64⍰,Float64,Int64,String,String
1,a1,-0.06,3.58,-1.18,0.9,2,m,0
2,a3,-2.07,3.19,missing,0.7,2,f,0
3,a4,-2.63,6.97,-0.83,0.6,1,m,a2
4,a5,2.31,missing,-1.52,0.4,2,m,a2
5,a6,0.93,4.87,-0.01,5.0,2,f,a3


<div class="span5 alert alert-success">
 <font size="5" face="Georgia">Univariate Linear Mixed Model (Genomic data)</font> 
</div>

<button type="button" class="btn btn-lg btn-primary">Step 3: Build Model Equations</button> 

In [4]:
model_equation1  ="y1 = intercept + x1*x3 + x2 + x3 + ID + dam";

In [5]:
R      = 1.0   # prior value for the residual variance
model1 = build_model(model_equation1,R);

<button type="button" class="btn btn-lg btn-primary">Step 4: Set Factors or Covariates</button> 

In [6]:
set_covariate(model1,"x1");

<button type="button" class="btn btn-lg btn-primary">Step 5: Set Random or Fixed Effects</button> 

In [7]:
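G1 = 1.0          # prior variance for the random effect x2
G2 = [1.0 0.5
      0.5 1.0]    # prior covariance matrix for the pedigree-based effects ID and dam
set_random(model1,"x2",G1);
set_random(model1,"ID dam",pedigree,G2);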

<button type="button" class="btn btn-lg btn-primary">Step 6: Use Genomic Information</button> 

In [8]:
G3 = 1.0   # prior genetic variance explained by the markers
add_genotypes(model1,genofile,G3,separator=',');

The delimiter in genotypes.txt is ','.
The header (marker IDs) is provided in genotypes.txt.




5 markers on 7 individuals were added.


<button type="button" class="btn btn-lg btn-primary">Step 7: Run Analysis</button> 

In [9]:
outputMCMCsamples(model1,"x2")   # also save MCMC samples for the x2 effects
out1=runMCMC(model1,phenotypes,methods="BayesC",estimatePi=true,chain_length=5000,output_samples_frequency=100);


The prior for marker effects variance is calculated from the genetic variance and π.
The mean of the prior for the marker effects variance is: 0.492462
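
The message above says the prior is derived from the genetic variance and π but does not show the formula. A relationship commonly used for this in Bayesian variable-selection models is given below as a sketch. It is an assumption here, not something confirmed by the output. The symbols are also assumptions: $\sigma_g^2$ is the genetic variance (`G3`), $\pi$ is the probability that a marker has zero effect, and $p_j$ is the allele frequency of marker $j$.

```latex
\sigma_\alpha^2 \;=\; \frac{\sigma_g^2}{(1-\pi)\,\sum_{j} 2\,p_j\,(1-p_j)}
```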


[0m[1mA Linear Mixed Model was build using model equations:[22m

y1 = intercept + x1*x3 + x2 + x3 + ID + dam

[0m[1mModel Information:[22m

Term            C/F          F/R            nLevels
intercept       factor       fixed                1
x1*x3           interaction  fixed                2
x2              factor       random               2
x3              factor       fixed                2
ID              factor       random              12
dam             factor       random              12

[0m[1mMCMC Information:[22m

methods                                      BayesC
chain_length                                   5000
burnin                                            0
estimatePi                                     true
estimateScale                                 false
starting_value                            

running MCMC for BayesC...100%|█████████████████████████| Time: 0:00:01


# GWAS

In [12]:
?GWAS

search: [0m[1mG[22m[0m[1mW[22m[0m[1mA[22m[0m[1mS[22m



```
GWAS(marker_effects_file;header=false)
```

Compute the model frequency for each marker (the probability the marker is included in the model) using samples of marker effects stored in **marker_effects_file**.

---

```
GWAS(marker_effects_file,model;header=true,window_size="1 Mb",threshold=0.001)
```

Run genomic window-based GWAS without marker locations.

  * MCMC samples of marker effects are stored in **marker_effects_file** with delimiter ','.
  * **window_size** is either a constant (the same number of markers in every window) or an array giving the number of markers in each window
  * **model** is either the model::MME used in analysis or the genotypic covariate matrix M::Array
  * File format:

---

```
GWAS(marker_effects_file,map_file,model;header=true,window_size="1 Mb",threshold=0.001)
```

Run genomic window-based GWAS.

  * MCMC samples of marker effects are stored in **marker_effects_file** with delimiter ','.
  * **model** is either the model::MME used in analysis or the genotypic covariate matrix M::Array
  * **map_file** has the (sorted) marker position information with delimiter ','.
  * File format:

```
markerID,chromosome,position
m1,1,16977
m2,1,434311
m3,1,1025513
m4,2,70350
m5,2,101135
```


In [17]:
marker_effects_file="MCMC_samples_marker_effects_y1.txt"

"MCMC_samples_marker_effects_y1.txt"

In [18]:
GWAS(marker_effects_file,header=true) 

5×2 Array{Any,2}:
 "m1"  0.62
 "m2"  0.62
 "m3"  0.6 
 "m4"  0.7 
 "m5"  0.6 
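
The docstring describes the model frequency as the probability that a marker is included in the model. A minimal sketch of that idea, assuming inclusion means a nonzero sampled effect, is shown below. The `samples` matrix is made up for illustration and is not read from `MCMC_samples_marker_effects_y1.txt`.

```julia
using Statistics  # provides mean

# Hypothetical MCMC samples: rows are saved iterations, columns are markers.
# A zero means the marker was excluded from the model in that iteration.
samples = [0.0  0.3  0.0
           0.1  0.2  0.0
           0.0  0.4  0.5
           0.2  0.0  0.0]

# Model frequency: fraction of saved samples with a nonzero effect, per marker.
model_frequency = vec(mean(samples .!= 0.0, dims=1))

println(model_frequency)
```

If the run above saved 50 samples (`chain_length=5000` with `output_samples_frequency=100`), frequencies computed this way would be multiples of 0.02, which fits the values 0.62, 0.6 and 0.7 in the output.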

In [20]:
map_file = Datasets.dataset("example","map.txt")
GWAS(marker_effects_file,map_file,model1,header=true,window_size="1 Mb",threshold=0.001)

Row,window,chr,wStart,wEnd,start_SNP,end_SNP,numSNP,prGenVar,WPPA
,Int64,String,Int64,Int64,Int64,Int64,Int64,Float64,Float64
1,1,1,0,1000000,16977,434311,2,42.36,0.82
2,3,2,0,1000000,70350,101135,2,36.57,0.82
3,2,1,1000000,2000000,1025513,1025513,1,26.98,0.58
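
The `wStart`, `wEnd` and `numSNP` columns can be reproduced from the map positions alone. The sketch below assumes windows are non-overlapping 1 Mb intervals starting at position 0 on each chromosome. It uses the five markers from the docstring example and is not JWAS's own implementation; it does not compute `prGenVar` or `WPPA`.

```julia
# Map from the docstring example: (markerID, chromosome, position in bp).
markers = [("m1", "1", 16977), ("m2", "1", 434311), ("m3", "1", 1025513),
           ("m4", "2", 70350), ("m5", "2", 101135)]

window_size = 1_000_000  # "1 Mb" expressed in base pairs

# Count markers per (chromosome, window start) key.
counts = Dict{Tuple{String,Int},Int}()
for (_, chr, pos) in markers
    wstart = div(pos, window_size) * window_size  # start of the window containing pos
    key = (chr, wstart)
    counts[key] = get(counts, key, 0) + 1
end

# Print the windows in chromosome and position order.
for (chr, wstart) in sort(collect(keys(counts)))
    println("chr ", chr, ": [", wstart, ", ", wstart + window_size, ") has ",
            counts[(chr, wstart)], " SNP(s)")
end
```

This gives two markers in the first window of chromosome 1, one in its second window, and two in the first window of chromosome 2, matching `numSNP` in the table above.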
