# PolrGWAS.jl

PolrGWAS.jl is a Julia package for performing genome-wide association studies (GWAS) of ordered categorical phenotypes. Install the package with
```julia
Pkg.clone("git@github.com:Hua-Zhou/PolrGWAS.git")
```

## Basic usage

Suppose the covariates and the phenotype are available in a CSV file `covariate.txt`. The variable `trait` is the ordered categorical phenotype, coded as integers 1 to 4. We want to include the variable `sex` as a covariate in the GWAS.

In [1]:
;head -20 ../data/covariate.txt

famid,perid,faid,moid,sex,trait
2431,NA19916,0,0,1,4
2424,NA19835,0,0,2,4
2469,NA20282,0,0,2,4
2368,NA19703,0,0,1,3
2425,NA19901,0,0,2,3
2427,NA19908,0,0,1,4
2430,NA19914,0,0,2,4
2470,NA20287,0,0,2,1
2436,NA19713,0,0,2,3
2426,NA19904,0,0,1,1
2431,NA19917,0,0,2,1
2436,NA19982,0,0,1,2
2487,NA20340,0,0,1,4
2427,NA19909,0,0,2,4
2424,NA19834,0,0,1,4
2480,NA20317,0,0,2,4
2418,NA19818,0,0,1,1
2490,NA20346,0,0,1,2
2433,NA19921,0,0,2,4
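
Before fitting, it can help to confirm that `trait` really takes only the values 1 to 4. The following is a minimal sketch in plain Julia (written for Julia 1.x, no packages); `category_counts` is a hypothetical helper and is not part of PolrGWAS.jl.

```julia
# Hypothetical helper (not part of PolrGWAS.jl): count how many samples fall in
# each category of an integer-coded column of a comma-separated file.
function category_counts(path, column)
    counts = Dict{Int,Int}()
    open(path) do io
        # The first line holds the column names.
        header = split(strip(readline(io)), ',')
        col = findfirst(isequal(column), header)
        col === nothing && error("column $column not found")
        for line in eachline(io)
            fields = split(strip(line), ',')
            length(fields) == length(header) || continue  # skip blank or malformed lines
            level = parse(Int, fields[col])
            counts[level] = get(counts, level, 0) + 1
        end
    end
    counts
end
```

Calling `category_counts("../data/covariate.txt", "trait")` returns a dictionary mapping each observed category to its sample count, so any unexpected code shows up as an extra key.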


Genotype data are available as binary PLINK files.

In [2]:
;ls -l ../data/hapmap3."*"

-rw-r--r--  1 huazhou  staff  1128171 Jun 19  2017 ../data/hapmap3.bed
-rw-r--r--  1 huazhou  staff   388672 Jun 19  2017 ../data/hapmap3.bim
-rw-r--r--  1 huazhou  staff     7136 Jun 19  2017 ../data/hapmap3.fam
-rw-r--r--  1 huazhou  staff   332960 Jun 19  2017 ../data/hapmap3.map


The following command performs a GWAS using proportional odds logistic regression.

In [12]:
using PolrGWAS

@time polrgwas(@formula(trait ~ 0 + sex), "../data/covariate.txt", "../data/hapmap3";
    covartype = [String, String, String, String, Float64, Int])

StatsModels.DataFrameRegressionModel{PolrModels.PolrModel{Int64,Float64,GLM.LogitLink},Array{Float64,2}}

Formula: trait ~ +sex

Coefficients:
      Estimate Std.Error  t value Pr(>|t|)
θ1    -1.48554  0.358711 -4.14131    <1e-4
θ2   -0.569325  0.340645 -1.67132   0.0956
θ3    0.429823  0.339263  1.26693   0.2061
β1    0.424691   0.21391  1.98537   0.0480


[1m[36mINFO: [39m[22m[36mv1.0 BED file detected
[39m

  0.659454 seconds (10.19 M allocations: 183.621 MiB, 9.26% gc time)
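
For reference, the coefficient table above can be read against the proportional odds model. The form below is a sketch that assumes the common parameterization (the one used by R's `MASS::polr`); the sign convention for the slope is an assumption and is not confirmed by the output shown here. With four ordered categories there are three intercepts θ1 ≤ θ2 ≤ θ3 and one slope β1 for `sex`:

```math
\Pr(\text{trait}_i \le j \mid \text{sex}_i)
  = \frac{1}{1 + \exp\{-(\theta_j - \beta_1 \, \text{sex}_i)\}},
  \qquad j = 1, 2, 3.
```

Under this convention a positive β1 shifts probability toward the higher trait categories, and the same β1 applies at every cut point j, which is the proportional odds assumption.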


The command writes two files:  
* `polrgwas.nullmodel.txt` lists the estimated regression model.  

In [13]:
;cat polrgwas.nullmodel.txt

StatsModels.DataFrameRegressionModel{PolrModels.PolrModel{Int64,Float64,GLM.LogitLink},Array{Float64,2}}

Formula: trait ~ +sex

Coefficients:
      Estimate Std.Error  t value Pr(>|t|)
θ1    -1.48554  0.358711 -4.14131    <1e-4
θ2   -0.569325  0.340645 -1.67132   0.0956
θ3    0.429823  0.339263  1.26693   0.2061
β1    0.424691   0.21391  1.98537   0.0480


* `polrgwas.scoretest.txt` lists the SNPs and their p-values.

In [10]:
;head polrgwas.scoretest.txt

chr,pos,snpid,maf,pval
1,554484,rs10458597,0.0,1.0
1,758311,rs12562034,0.07763975155279501,0.0018866769073824054
1,967643,rs2710875,0.32407407407407407,2.5147324918949065e-5
1,1168108,rs11260566,0.19158878504672894,9.890569210468016e-6
1,1375074,rs1312568,0.441358024691358,0.00832247910423294
1,1588771,rs35154105,0.0,1.0
1,1789051,rs16824508,0.00462962962962965,0.5281305094274626
1,1990452,rs2678939,0.4537037037037037,0.2999869793592329
1,2194615,rs7553178,0.22685185185185186,0.16443985731509447
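
A typical next step is to pull out the SNPs with the smallest p-values. The following is a minimal sketch in plain Julia (written for Julia 1.x, no packages) that assumes the `chr,pos,snpid,maf,pval` header shown above; `significant_snps` and its default threshold `alpha` are hypothetical and not part of PolrGWAS.jl.

```julia
# Hypothetical helper (not part of PolrGWAS.jl): collect the SNPs whose score
# test p-value falls below `alpha`, sorted with the smallest p-value first.
function significant_snps(path; alpha = 1e-4)
    hits = Tuple{String,Float64}[]
    open(path) do io
        # Locate the columns by name rather than by position.
        header = split(strip(readline(io)), ',')
        snpcol = findfirst(isequal("snpid"), header)
        pcol = findfirst(isequal("pval"), header)
        (snpcol === nothing || pcol === nothing) && error("expected snpid and pval columns")
        for line in eachline(io)
            fields = split(strip(line), ',')
            length(fields) == length(header) || continue  # skip blank or malformed lines
            p = parse(Float64, fields[pcol])
            p < alpha && push!(hits, (String(fields[snpcol]), p))
        end
    end
    sort!(hits, by = last)
end
```

For example, `significant_snps("polrgwas.scoretest.txt"; alpha = 5e-8)` applies a conventional genome-wide significance threshold; the appropriate cutoff depends on the study and is left to the user.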


## Input files

## Output files