# Some Julia Commands

### Rohan L. Fernando

### February 2016

https://github.com/rohanLuigi/WGAKenya

Some useful Julia commands are introduced here in the context of simulating genomic data.

### Making a matrix A

In [56]:
A = [2 3
     3 4]

2x2 Array{Int64,2}:
 2  3
 3  4

### Making a matrix B

In [25]:
B = [2 5
     2 3]

2x2 Array{Int64,2}:
 2  5
 2  3

### Matrix addition and subtraction

In [26]:
C = A + B 

2x2 Array{Int64,2}:
 4  8
 5  7

In [27]:
C

2x2 Array{Int64,2}:
 4  8
 5  7

In [31]:
A - B 

2x2 Array{Int64,2}:
 0  -2
 1   1

### Multiplying two matrices

In [28]:
A*B

2x2 Array{Int64,2}:
 10  19
 14  27
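In Julia, `*` between two matrices is the matrix product, which is easy to confuse with the elementwise product written `.*`. The following is a minimal sketch of the difference, assuming Julia 1.x; the names `matProd` and `elemProd` are illustrative and do not appear in the original notebook.

```julia
using LinearAlgebra  # standard library; loaded explicitly to be safe

A = [2 3
     3 4]
B = [2 5
     2 3]

matProd  = A * B    # matrix product: rows of A times columns of B
elemProd = A .* B   # elementwise product: A[i,j] * B[i,j]
```

Here `matProd` is `[10 19; 14 27]`, while `elemProd` is `[4 15; 6 12]`.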

### Transpose of a matrix

In [29]:
A'

2x2 Array{Int64,2}:
 2  3
 3  4

Note that this A is symmetric, so A' is equal to A.

### Computing A'A

In [58]:
A'A

2x2 Array{Int64,2}:
 13  18
 18  25

In [59]:
5A

2x2 Array{Int64,2}:
 10  15
 15  20

### Matrix inverse

In [64]:
iA = inv(A)

2x2 Array{Float64,2}:
 -4.0   3.0
  3.0  -2.0

In [35]:
iA*A

2x2 Array{Float64,2}:
 1.0          0.0
 2.22045e-16  1.0

In [36]:
round(ans,3)

2x2 Array{Float64,2}:
 1.0  0.0
 0.0  1.0

Note that "ans" contains the result of the previous command.
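The tiny off-diagonal value `2.22045e-16` is floating-point round-off, not a sign of a wrong inverse. As a hedged sketch, assuming Julia 1.x (where rounding a whole matrix is written with broadcasting and the `digits` keyword), the check can be done with a tolerance instead of by eye. The names `isIdentity` and `rounded` are illustrative.

```julia
using LinearAlgebra

A  = [2 3
      3 4]
iA = inv(A)

# Round-off can leave entries such as 2.2e-16, so compare with the
# identity matrix using a tolerance (≈ is isapprox) rather than ==.
isIdentity = iA * A ≈ Matrix{Float64}(I, 2, 2)

# Round every element to 3 decimals; the dot broadcasts `round` over the matrix.
rounded = round.(iA * A, digits=3)
```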

### Solving linear equations:

Consider the system of linear equations:
$$
\mathbf{Ax} = \mathbf{y}
$$
where $\mathbf{A}$ is a matrix of known coefficients, $\mathbf{y}$ is a vector of known values, and $\mathbf{x}$ is the vector of unknowns.

In [67]:
y = [1; 2.5]
sol = iA*y

2-element Array{Float64,1}:
  3.5
 -2.0

In [69]:
y = [1
     2.5]

2-element Array{Float64,1}:
 1.0
 2.5

### Check solution

In [68]:
[A*sol y]

2x2 Array{Float64,2}:
 1.0  1.0
 2.5  2.5

### Different $\mathbf{y}$

In [40]:
y = [1; 3.5]
x = iA*y
A*x

2-element Array{Float64,1}:
 1.0
 3.5

### More efficient method to obtain solution

In [42]:
x = A\y
A*x

2-element Array{Float64,1}:
 1.0
 3.5
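The backslash operator is generally preferred because it solves the system through a factorization of `A` and never forms the inverse explicitly. A minimal sketch, assuming Julia 1.x and its `LinearAlgebra` standard library, also shows how one factorization can be reused for several right-hand sides. The names `y1`, `y2`, `x1`, `x2` and `F` are illustrative.

```julia
using LinearAlgebra

A  = [2.0 3.0
      3.0 4.0]
y1 = [1.0, 2.5]
y2 = [1.0, 3.5]

# `A \ y` solves A*x = y through a factorization; inv(A) is never formed.
x1 = A \ y1

# When several right-hand sides share the same A, the LU factorization
# can be computed once and reused.
F  = lu(A)
x2 = F \ y2
```

For a 2x2 system the difference is negligible, but for large systems avoiding the explicit inverse usually saves time and reduces round-off.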

## Julia Packages 
* [List of registered Julia packages](http://pkg.julialang.org) 
* We will use the [Distributions package](http://distributionsjl.readthedocs.org/en) to simulate data. 

## Simulation of data using Julia

* The Distributions package can be added to your system with the command:

In [None]:
#Pkg.add("Distributions")

* This needs to be done only once. 

* To access the functions in the Distributions package, however, the "using" command has to be invoked:


In [70]:
using Distributions

## Simulate matrix of "genotype" covariates

In [84]:
nRows = 10
nCols = 5
dist  = Binomial(2,0.5)
X     = rand(dist,nRows,nCols)

10x5 Array{Int64,2}:
 1  2  0  2  2
 0  1  1  1  0
 1  2  0  0  0
 0  2  1  0  1
 1  1  0  0  1
 1  1  0  1  1
 0  0  1  0  2
 2  0  0  1  0
 0  0  1  0  0
 1  1  2  1  0

In [89]:
rand(Binomial(5,0.1),5,2)

5x2 Array{Int64,2}:
 0  1
 0  0
 0  0
 0  0
 1  0
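One way to read `Binomial(2,0.5)`, stated here as an assumption, is as the number of copies (0, 1 or 2) of one allele at a locus when the allele frequency is 0.5. The sketch below, assuming Julia 1.x with the Distributions package installed, draws a larger matrix and checks that the column means are close to the expected value 2p. The seed and the names `p`, `G`, `expected` and `colMeans` are illustrative.

```julia
using Distributions, Random, Statistics

Random.seed!(1234)            # arbitrary seed, only to make the draw repeatable
p     = 0.5                   # allele frequency, same value as in the cell above
dist  = Binomial(2, p)        # possible values: 0, 1 or 2
G     = rand(dist, 1000, 5)   # 1000 rows (individuals) by 5 columns (loci)

expected = mean(dist)         # theoretical mean, 2p
colMeans = mean(G, dims=1)    # observed mean of each column
```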

## Add a column of ones for intercept

### Command to make a column of ones

In [19]:
ones(nRows,1)

10x1 Array{Float64,2}:
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

### Concatenate two matrices

In [90]:
X = [ones(nRows,1) X]

10x6 Array{Float64,2}:
 1.0  1.0  2.0  0.0  2.0  2.0
 1.0  0.0  1.0  1.0  1.0  0.0
 1.0  1.0  2.0  0.0  0.0  0.0
 1.0  0.0  2.0  1.0  0.0  1.0
 1.0  1.0  1.0  0.0  0.0  1.0
 1.0  1.0  1.0  0.0  1.0  1.0
 1.0  0.0  0.0  1.0  0.0  2.0
 1.0  2.0  0.0  0.0  1.0  0.0
 1.0  0.0  0.0  1.0  0.0  0.0
 1.0  1.0  1.0  2.0  1.0  0.0
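The concatenated matrix is `Float64` even though the genotypes were integers, because `ones(nRows,1)` creates floating-point ones and the integer columns are promoted to match. A small sketch, assuming Julia 1.x and using made-up values, shows horizontal and vertical concatenation and how to keep integers if that is wanted. The names `X0`, `Xf`, `Xi` and `Xv` are illustrative.

```julia
X0 = [1 2
      0 1
      2 0]                    # small integer matrix with made-up values
n  = size(X0, 1)

Xf = [ones(n, 1) X0]          # ones() defaults to Float64, so the result is Float64
Xi = hcat(ones(Int, n), X0)   # integer ones keep the element type Int
Xv = vcat(X0, [1 1])          # vertical concatenation appends a row
```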

## Commands to access elements of a matrix

### Submatrix of X:

In [20]:
X[2:5,3:6]

4x4 Array{Float64,2}:
 1.0  1.0  1.0  0.0
 2.0  0.0  0.0  0.0
 2.0  1.0  0.0  1.0
 1.0  0.0  0.0  1.0

In [44]:
X[:,3]

10-element Array{Float64,1}:
 2.0
 1.0
 2.0
 2.0
 1.0
 1.0
 0.0
 0.0
 0.0
 1.0

In [45]:
X[2,:]

1x6 Array{Float64,2}:
 1.0  0.0  1.0  1.0  1.0  0.0
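The output above shows `X[2,:]` as a 1x6 matrix, which is how the Julia version used for this notebook behaved. As a hedged note, assuming Julia 1.x, indexing with a single row number returns a plain vector, and a range index is needed to keep the row shape. The small matrix and the names below are illustrative.

```julia
X = [1.0 0.0 2.0
     1.0 1.0 0.0]     # small made-up matrix

rowVec = X[2, :]      # in Julia 1.x: a 3-element Vector
rowMat = X[2:2, :]    # a range index keeps the 1x3 matrix shape
colVec = X[:, 3]      # a column is also returned as a Vector
elem   = X[1, 3]      # a single element, here a Float64
```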

### Get element (i,j) of X

In [22]:
X[3,4]

0.0

In [23]:
X[4,4]

1.0

### Dimensions of X

In [50]:
size(X)

(10,6)

In [51]:
rows = size(X,1)

10

In [52]:
cols = size(X,2)

6

In [53]:
rows,cols = size(X)

(10,6)

In [91]:
rows

10

## Simulate phenotypic values

We want to simulate phenotypic values distributed as:
$$
\mathbf{y} = \mathbf{Xb} + \mathbf{e},
$$
where $\mathbf{X}$ is a matrix of observed SNP covariates, $\mathbf{b}$ is a vector of effects that are iid Normal random variables with mean zero and variance 0.25, and $\mathbf{e}$ is a vector of residuals that are iid Normal random variables with mean zero and variance 1.0.

## Simulate matrix of "genotype" covariates

In [1]:
using Distributions
nRows = 10
nCols = 5
dist  = Binomial(2,0.5)
X     = rand(dist,nRows,nCols)

10x5 Array{Int64,2}:
 0  0  1  0  0
 1  2  0  1  1
 1  2  1  1  0
 2  0  0  1  1
 2  1  0  2  2
 1  1  1  2  0
 1  1  1  0  0
 0  1  0  2  2
 1  2  1  0  2
 1  2  2  2  2

In [2]:
X     = [ones(nRows,1) X]

10x6 Array{Float64,2}:
 1.0  0.0  0.0  1.0  0.0  0.0
 1.0  1.0  2.0  0.0  1.0  1.0
 1.0  1.0  2.0  1.0  1.0  0.0
 1.0  2.0  0.0  0.0  1.0  1.0
 1.0  2.0  1.0  0.0  2.0  2.0
 1.0  1.0  1.0  1.0  2.0  0.0
 1.0  1.0  1.0  1.0  0.0  0.0
 1.0  0.0  1.0  0.0  2.0  2.0
 1.0  1.0  2.0  1.0  0.0  2.0
 1.0  1.0  2.0  2.0  2.0  2.0

In [3]:
a = [1 2
     3 4]

2x2 Array{Int64,2}:
 1  2
 3  4

## Simulate effects from normal distribution

In [7]:
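nRowsX, nColsX = size(X)
# named bMean and bStd to avoid clashing with the functions mean and std
bMean = 0.0
bStd  = sqrt(0.25)   # Normal() takes the standard deviation, not the variance
b = rand(Normal(bMean,bStd),nColsX)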

5-element Array{Float64,1}:
 -0.0863846
 -0.249416 
 -0.842788 
 -0.224963 
  0.182321 
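The model states the variance of the effects (0.25), but `Normal` in the Distributions package is parameterized by the mean and the standard deviation, which is why the square root is taken. A minimal check, assuming Julia 1.x with Distributions installed; the names `effectVar` and `d` are illustrative.

```julia
using Distributions

effectVar = 0.25                    # variance of the effects, as in the model
d = Normal(0.0, sqrt(effectVar))    # second argument is the standard deviation

sd = std(d)   # standard deviation of the distribution
v  = var(d)   # variance of the distribution
```

Passing 0.25 directly as the second argument would instead give effects with variance 0.0625.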

## Simulate residuals from normal distribution

In [19]:
resStd = 1.0
e = rand(Normal(0,resStd),nRowsX)

10-element Array{Float64,1}:
  0.455793 
 -0.942984 
  0.0824349
  1.0199   
  0.156871 
  2.42324  
 -0.517451 
 -0.830487 
 -0.209511 
  1.04214  

In [20]:
y = X*b + e

10-element Array{Float64,1}:
 -0.386995
 -1.57084 
 -1.57053 
  0.804491
 -0.350599
  0.794725
 -1.69604 
 -1.16519 
 -1.27287 
 -1.31393 

## Function to simulate data


In [32]:
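using Distributions
# Simulate nObs phenotypes from nLoci SNP covariates plus an intercept.
function simDat(nObs,nLoci,bMean,bStd,resStd)
    # intercept column followed by genotype covariates coded 0, 1, 2
    X = [ones(nObs,1) rand(Binomial(2,0.5),(nObs,nLoci))]
    # one effect per column of X, including the intercept
    b = rand(Normal(bMean,bStd),size(X,2))
    # phenotypes: X*b plus Normal residuals with standard deviation resStd
    y = X*b + rand(Normal(0.0, resStd),nObs)
    return (y,X,b)
end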

simDat (generic function with 1 method)

## Use of function simDat to simulate data

In [33]:
nObs     = 10
nLoci    = 5
bMean    = 0.0
bStd     = 0.5
resStd   = 1.0
res = simDat(nObs,nLoci,bMean,bStd,resStd)
y = res[1]
X = res[2]
b = res[3]
# alternatively
y,X,b = simDat(nObs,nLoci,bMean,bStd,resStd);
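Because `simDat` draws random numbers, every call returns different data, so outputs will not match from one run to the next. A minimal sketch of making a simulation repeatable, assuming Julia 1.x and its `Random` standard library, is shown below. The seed value 2016 is arbitrary, and the function is repeated only so that the block is self-contained.

```julia
using Distributions, Random

function simDat(nObs, nLoci, bMean, bStd, resStd)
    X = [ones(nObs, 1) rand(Binomial(2, 0.5), (nObs, nLoci))]
    b = rand(Normal(bMean, bStd), size(X, 2))
    y = X*b + rand(Normal(0.0, resStd), nObs)
    return (y, X, b)
end

Random.seed!(2016)                          # arbitrary seed
y1, X1, b1 = simDat(10, 5, 0.0, 0.5, 1.0)   # tuple unpacked into three names

Random.seed!(2016)                          # same seed, so the same draws
y2, X2, b2 = simDat(10, 5, 0.0, 0.5, 1.0)
```

With the seed reset before each call, `y1 == y2`, `X1 == X2` and `b1 == b2` all hold.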

In [25]:
y

10-element Array{Float64,1}:
  3.05282 
  1.30472 
 -1.26533 
  1.82812 
 -0.566031
  0.58242 
  0.178012
  0.174397
 -0.982405
 -0.463178

In [26]:
X

10x6 Array{Float64,2}:
 1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  0.0  2.0  1.0  2.0
 1.0  1.0  1.0  2.0  0.0  1.0
 1.0  1.0  1.0  1.0  0.0  1.0
 1.0  0.0  2.0  1.0  0.0  0.0
 1.0  2.0  0.0  1.0  1.0  1.0
 1.0  2.0  2.0  2.0  2.0  0.0
 1.0  1.0  0.0  0.0  0.0  1.0
 1.0  1.0  1.0  1.0  1.0  1.0
 1.0  1.0  1.0  2.0  2.0  1.0

In [27]:
b

6-element Array{Float64,1}:
 -0.179943 
  0.0209893
 -0.369453 
 -0.371694 
  0.57233  
  0.466209 