# Julia: A modern programming language for Statistics and Operational Research

## By Jamie Fairbrother

“Julia is a high-level, high-performance dynamic programming
language for technical computing, with syntax that is familiar
to users of other technical computing environments.”

julialang.org

# Why Julia for Statistics and Operational Research?

- Speed comparable to C for non-vectorizable/parallelisable code
- Intuitive and easy-to-read syntax
- Packages which make it ideal as a statistics and operational research language
- Open source

## Speed

![Benchmarks](benchmarks.svg "Speed benchmarks")

# Language Basics

## Statements

In [None]:
x = 1
y = 2 + x
z = y / 2
π

In [None]:
r = 2.3
area, circ = π*r^2, 2π*r  # Python-style tuple assignment

## Vectors and Matrices
Julia uses syntax similar to Matlab

In [None]:
A = [[1.0 2.3 3.0];
     [2.0 3.0 0.0];
     [0.0 1.0 0.0]];
b = [1.0, 2.0, 3.0];

In [None]:
A*b   # Matrix vector multiplication
A\b   # Solve A*x = b (same result as inv(A) * b)

Element-wise operations:

In [None]:
a, b = [1,2,3], [3,2,1]
a .* b # Element-wise operations
a ./ b

Vector/matrix construction:

In [None]:
eye(3);        # Identity 3 x 3 matrix
fill(1.0, (3,3)) # 3 x 3 matrix filled with ones
C = rand(3,3) # Random 3 x 3 matrix

Indexing:

In [None]:
A[1,1]   # Get individual element
b[2:end] # Subvector
C[2:3,:] # Submatrix

Transposing:

In [None]:
A'

In [None]:
C'C

## Control-flow and Looping
Julia uses Matlab-style blocks terminated by `end`

In [None]:
for i in 1:20
    if i % 5 == 1
        print("\n\"$i\" ")
    else
        print("$i ")
    end
end

## Functions
Functions can be defined in a Matlab-style block. Note that function arguments can be annotated by types.

In [None]:
function normpdf(x::Float64)
    return exp(-x^2 / 2) / sqrt(2π)  # Standard normal density
end
normpdf(0.0)

Functions can also be created in a single line:

In [None]:
normcdf(x) = quadgk(normpdf, -Inf, x)[1]
normcdf(1.0)
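
For reference, the standard normal density and distribution function that these two functions are meant to compute can be written as follows (the symbols $\varphi$ and $\Phi$ are notation assumed here, not taken from the slides):

$$
\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad \Phi(x) = \int_{-\infty}^{x} \varphi(t)\, dt
$$

`quadgk` approximates the integral numerically and returns the estimate together with an error bound, which is why `normcdf` takes the first element of the result.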

# Documentation

Function documentation is accessible inside a Julia session:

In [None]:
?rand

In [None]:
rand(1:100, (3,3))

## Package Management

The functionality of Julia can be extended through the use of packages.

In [None]:
using Gadfly;

Julia has a package management system which allows the easy installation, removal and updating of packages

```julia
Pkg.add("Gadfly")   # Install Gadfly package
Pkg.rm("RDatasets") # Remove RDatasets package
Pkg.update()        # Update all installed packages
```

# Advanced Features

## Just-in-time (JIT) compilation

Julia compiles a function to native machine code the first time it is run

In [None]:
A = rand(100,100);

In [None]:
@time svd(A); # Time the singular value decomposition of a matrix

Later calls reuse the compiled code, so they run faster than the first call

In [None]:
@time svd(A);
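
The same effect can be seen with a user-defined function. The following is a minimal sketch, assuming a hypothetical helper `sumsq` that is not part of the original slides; the actual timings depend on the machine and the Julia version, so none are shown.

```julia
# Hypothetical helper used only for this illustration
function sumsq(v)
    total = 0.0
    for x in v
        total += x^2
    end
    return total
end

v = rand(10^6)
t_first  = @elapsed sumsq(v)  # First call: includes compilation time
t_second = @elapsed sumsq(v)  # Second call: reuses the compiled code
println("First call:  ", t_first, " s")
println("Second call: ", t_second, " s")
```

`@elapsed` returns the elapsed time in seconds as a number, which makes it easier to compare runs than the printed output of `@time`.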

## Parallelisation

Julia has built-in support for parallelisation

In [None]:
# Parallelisation of a "for" loop
a = SharedArray(Float64,10)
@parallel for i=1:10
    a[i] = sin(2π*rand())
end;

In [None]:
t = @spawn rand(100000); # Run the computation on an available worker process
x = rand(100000);        # Do something else...
x + fetch(t);            # Wait for the result and use it

## Macros
Julia makes extensive use of "macros", which are commands used to generate code. They are often used to provide a more convenient way of carrying out a task.

All macros are prefixed with `@`, e.g. `@time`, `@spawn` etc.
To see what a macro does, we can use the `macroexpand` function:

In [None]:
macroexpand(:( @time rand(100000)))
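
Macros can also be user-defined. The following is a minimal sketch, assuming a hypothetical macro `@twice` invented for this illustration: it receives an expression and generates code that evaluates it two times.

```julia
# Hypothetical macro for illustration: generates code that runs `ex` twice
macro twice(ex)
    return quote
        $(esc(ex))  # esc() makes the expression refer to the caller's variables
        $(esc(ex))
    end
end

counter = 0
@twice counter += 1  # Expands to two copies of `counter += 1`
println(counter)     # prints 2
```

The macro operates on the expression `counter += 1` itself, before it is evaluated, which is what distinguishes a macro from an ordinary function.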

# Other advanced features

- Multiple dispatch (function overloading)
- Directly calling C code
- Efficient user defined types
- Unicode support
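
Two of these features can be sketched in a few lines. The function name `combine` is hypothetical and chosen only for this illustration; the `ccall` line assumes a platform where the C library function `strlen` is visible to the Julia process.

```julia
# Multiple dispatch: the method is chosen from the types of ALL arguments.
# `combine` is a hypothetical function used only for this illustration.
combine(a::Number, b::Number) = a + b
combine(a::AbstractString, b::AbstractString) = string(a, " ", b)
combine(a::AbstractString, b::Number) = string(a, " x", b)

println(combine(1, 2))            # Uses the (Number, Number) method
println(combine("stats", "OR"))   # Uses the (String, String) method
println(combine("item", 3))       # Uses the (String, Number) method

# Calling C directly: strlen from the C standard library
len = Int(ccall(:strlen, Csize_t, (Cstring,), "hello"))
println(len)
```

No wrapper code or compilation step is needed for the C call; the argument and return types are declared in the `ccall` itself.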

# Packages

## Plotting
The main plotting packages for Julia are Gadfly and PyPlot

In [None]:
using Gadfly # Similar interface to ggplot in R

x = linspace(-1, 1, 100)
y = 0.5*exp(-x.^2/2)
plot(x=x, y=y, Geom.line)

# Distributions

Distributions.jl implements many univariate and multivariate distributions, from which one can sample and calculate various properties.

In [None]:
using Distributions

norm = Normal(3, 2) # Define a Normal distribution with mean 3 and std. dev. 2

In [None]:
pdf(norm, 1:0.05:5)
mean(norm)
quantile(norm, 0.95)

In [None]:
rand(norm, 5)

# Optimization

Julia has interfaces for many popular solvers including GLPK, CLP, CBC, CPLEX, Gurobi, Mosek, Ipopt, NLOpt.

The JuMP package provides a convenient way of modelling problems:

In [None]:
using JuMP
using Gurobi

# Maximization problem
m = Model(solver=GurobiSolver())
@variable(m, x[1:5], Bin)

In [None]:
profit = [ 5, 3, 2, 7, 4 ]
weight = [ 2, 8, 4, 2, 5 ]
capacity = 10

# Objective: maximize profit
@objective(m, Max, dot(profit, x))

In [None]:
# Constraint: total weight carried must not exceed capacity
@constraint(m, dot(weight, x) <= capacity)

In [None]:
# Solve problem using MIP solver
status = solve(m)

println("Objective is: ", getobjectivevalue(m))
println("Solution is:")
for i = 1:5
    print("x[$i] = ", getvalue(x[i]))
    println(", p[$i]/w[$i] = ", profit[i]/weight[i])
end
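
Because this knapsack instance has only five items, the solver's answer can be cross-checked by enumerating all 32 possible selections. The following is a minimal sketch in plain Julia, with no solver required; `brute_force_knapsack` is a hypothetical helper and not part of JuMP.

```julia
# Hypothetical helper, not part of JuMP: try every 0/1 assignment
function brute_force_knapsack(profit, weight, capacity)
    n = length(profit)
    best_value = 0
    best_x = zeros(Int, n)
    for mask in 0:(2^n - 1)
        # Bit i-1 of `mask` says whether item i is packed
        x = [(mask >> (i - 1)) & 1 for i in 1:n]
        if sum(weight .* x) <= capacity
            value = sum(profit .* x)
            if value > best_value
                best_value = value
                best_x = x
            end
        end
    end
    return best_value, best_x
end

profit = [5, 3, 2, 7, 4]
weight = [2, 8, 4, 2, 5]
capacity = 10

value, x_best = brute_force_knapsack(profit, weight, capacity)
println("Best profit: ", value)    # Optimal value is 16
println("Items packed: ", x_best)  # Items 1, 4 and 5 (total weight 9)
```

Enumeration is only practical for very small problems, since the number of selections doubles with every extra item; this is why a MIP solver is used for anything larger.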

JuMP can also:
- Model SOCP, SDP and other non-linear problems
- Extract dual/slack information
- Modify problems and use warm starts (when the solver supports these)
- Define callbacks with certain solvers

# Other Packages

- Operational Research: Optim.jl, Convex.jl, Graphs.jl, Queueing.jl, TextAnalysis.jl
- Statistics: DataFrames.jl, TimeSeries.jl, Lora.jl (MCMC), Clustering.jl, GLM.jl, SVM.jl
- Maths: Calculus.jl (symbolic differentiation), Combinatorics.jl, ODE.jl
- Other: Interact.jl (interactive plots)

Certain functionality isn't available?... Write your own package!
STOR-i has published two packages:
- GaussianProcesses.jl
- Changepoints.jl

# Getting Started

# Environments
- Terminal
- Jupyter notebook
- Emacs or Vim
- Juno

![Juno screenshot](juno_screenshot.png)

# Julia Installation

Installers available for Windows, OS X, and all major Linux distributions

Detailed installation instructions can be found at http://julialang.org/downloads/platform.html


# More information

- Interactive tutorials at http://juliabox.org
- Full language reference available at http://julialang.org
- Limited places at STOR-i Julia workshop 9th-10th June

# Conclusions
- Julia is a high-level programming language with speeds comparable to C
- Julia has a very intuitive syntax, borrowing from other languages like Python or R
- A plethora of packages for stats and OR already available

# Limitations

- Julia is under continuing development and the language is changing
- JIT means package load times can be long, but this is improving with the introduction of package caching
- Package base not as well-developed as R or Matlab, but quickly growing