# Why Julia?

_"We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled._

_(Did we mention it should be as fast as C?)"_

Basically they want the moon on a stick. They're pretty close to getting it.

## This tutorial

This is a whirlwind tour of Julia. We'll cover:
- Data types
- Basic control flow
- Functions 
- Linear algebra
- Introspection
- Interoperability
- Parallelisation

## Data types

The default floating-point type is double precision, `Float64`. Julia also supports `Float32`, `Int64`, etc.

In [1]:
one(Float64)

1.0

In [2]:
rand(Float32)

0.2556975f0

In [3]:
zero(Int64)

0

Dividing with `/` always returns a floating-point result, even for two integers, so the return type does not depend on the values. This keeps programs type-stable.

In [52]:
1/3

0.3333333333333333
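To see what the different division and promotion rules hand back, here is a minimal sketch using only Base functions (the variable names are just for illustration):

```julia
# `/` on two integers always yields a Float64, whatever the values are.
quotient = 1 / 3
@show typeof(quotient)

# `div` keeps integer inputs as integers by truncating the result.
truncated = div(7, 2)
@show truncated typeof(truncated)

# Mixing an Int64 with a Float32 promotes to the floating-point type.
mixed = 1 + 2.5f0
@show typeof(mixed)
```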

__________________________________________________________________________________________________________________

# Fun things, tres nerd.

<img src="imgs/220px-Pocketp.gif" width="150" align="left">

### Rational numbers

In [7]:
r = 3//7 * 7//5

3//5

In [22]:
typeof(r)

Rational{Int64}
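Why bother with rationals? A small sketch of the difference between exact and floating-point arithmetic. It assumes Julia 1.x, where the accessors are called `numerator` and `denominator`:

```julia
# Rational arithmetic is exact: no rounding happens at any step.
exact = 1//10 + 2//10
@show exact == 3//10

# The same sum in Float64 picks up a rounding error.
approx = 0.1 + 0.2
@show approx == 0.3

# Rationals are stored in lowest terms and convert to floats on request.
half = 2//4
@show numerator(half) denominator(half) float(half)
```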

In [43]:
R = 1 .// collect(1:5)

5-element Array{Rational{Int64},1}:
 1//1
 1//2
 1//3
 1//4
 1//5

In [48]:
M= R*R'

5×5 Array{Rational{Int64},2}:
 1//1  1//2   1//3   1//4   1//5 
 1//2  1//4   1//6   1//8   1//10
 1//3  1//6   1//9   1//12  1//15
 1//4  1//8   1//12  1//16  1//20
 1//5  1//10  1//15  1//20  1//25

In [49]:
lufactM = lu(M)

(
Rational{Int64}[1//1 0//1 … 0//1 0//1; 1//2 1//1 … 0//1 0//1; … ; 1//4 0//1 … 1//1 0//1; 1//5 0//1 … 0//1 1//1],

Rational{Int64}[1//1 1//2 … 1//4 1//5; 0//1 0//1 … 0//1 0//1; … ; 0//1 0//1 … 0//1 0//1; 0//1 0//1 … 0//1 0//1],

[1,2,3,4,5])

In [51]:
lufactM[2]

5×5 Array{Rational{Int64},2}:
 1//1  1//2  1//3  1//4  1//5
 0//1  0//1  0//1  0//1  0//1
 0//1  0//1  0//1  0//1  0//1
 0//1  0//1  0//1  0//1  0//1
 0//1  0//1  0//1  0//1  0//1

__________________________________________________________________________________________________________________

### High precision data types

Let's try and evaluate $200! = 200 \times 199 \times 198 \times \cdots$


In [21]:
factorial(200) # whoops: the result is far too big for an Int64

LoadError: LoadError: OverflowError()
while loading In[21], in expression starting on line 1

In [10]:
n = 200*one(BigInt)

200

In [11]:
typeof(n)

BigInt

In [14]:
bigfact = factorial(n)

788657867364790503552363213932185062295135977687173263294742533244359449963403342920304284011984623904177212138919638830257642790242637105061926624952829931113462857270763317237396988943922445621451664240254033291864131227428294853277524242407573903240321257405579568660226031904170324062351700858796178922222789623703897374720000000000000000000000000000000000000000000000000

In [67]:
log(bigfact) - lgamma(n+1)  

0.000000000000000000000000000000000000000000000000000000000000000000000000000000

Suppose we have a huge $x = \frac{1}{10^{-500}}$. Now obvs $\frac{x}{x}=1$.

In [311]:
x_Float64 = 1.0 / (10.0^-500)  # whoops: 10.0^-500 underflows to 0.0, so this is Inf
x_Float64/x_Float64

NaN

In [313]:
x_BigFloat = 1.0 / ((10one(BigFloat))^-500)
x_BigFloat/x_BigFloat

1.000000000000000000000000000000000000000000000000000000000000000000000000000000

Seems silly, but this stuff can be super annoying in numerical programming. There's a speed penalty, but maybe you want to optimise your code _after_ you get it working. And/or assure yourself the idea itself isn't rubbish, which it probably is, if you are me.
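The overflow and underflow cases above can be reproduced in a few lines. This is a minimal sketch assuming Julia 1.x, using `big(...)` as a shorthand for promoting to `BigInt` or `BigFloat`:

```julia
# Int64 factorial overflows past 20!, and Julia raises an error rather than wrapping.
overflowed = try
    factorial(21)
    false
catch err
    err isa OverflowError
end

# Promoting the argument to BigInt switches to arbitrary precision.
exact = factorial(big(21))

# Float64 underflows to zero long before 10^-500; BigFloat has a far wider exponent range.
tiny_f64 = 10.0^-500
tiny_big = big(10.0)^-500
ratio = tiny_big / tiny_big
println((overflowed, exact, tiny_f64, ratio))
```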

__________________________________________________________________________________________________________________

# Serious computing, much science

<img src="imgs/muchdog.jpeg" width="300" align="left">

## Linear algebra

Matrix operations are default: 
- multiplication *
- arithmetic +, -
- solve \

Arrays are stored in column-major order, so `M[:,1]` is quicker than `M[1,:]`.
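To get a feel for what column-major storage means for loops, here is a rough sketch (assuming Julia 1.x for `axes`; the function names are made up for illustration). Timings depend on the machine, so none are quoted here:

```julia
# Sum a matrix with the inner loop running down each column (contiguous in memory).
function sum_by_columns(M)
    total = zero(eltype(M))
    for j in axes(M, 2), i in axes(M, 1)   # i varies fastest
        total += M[i, j]
    end
    return total
end

# Same sum, but the inner loop strides across a row.
function sum_by_rows(M)
    total = zero(eltype(M))
    for i in axes(M, 1), j in axes(M, 2)   # j varies fastest
        total += M[i, j]
    end
    return total
end

M = rand(1000, 1000)
sum_by_columns(M); sum_by_rows(M)          # warm up so compilation is not timed
t_cols = @elapsed sum_by_columns(M)
t_rows = @elapsed sum_by_rows(M)
println("columns: ", t_cols, " s, rows: ", t_rows, " s")
```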

In [315]:
A = randn(5,5);
B = A' * A

5×5 Array{Float64,2}:
  7.44719    2.32404   -0.515429  -3.83385    0.581364
  2.32404   13.9877    -0.245746  -1.23764   -3.19265 
 -0.515429  -0.245746   3.44641   -3.69492   -1.0904  
 -3.83385   -1.23764   -3.69492    9.02937    0.525965
  0.581364  -3.19265   -1.0904     0.525965   1.33075 

For elementwise operations, add a . before the operator.

In [316]:
H = A .* A

5×5 Array{Float64,2}:
 1.31509   0.36906  0.15812   0.734707  0.410736
 0.30145   1.11984  2.32395   0.871804  0.103828
 4.55906   3.23344  0.478224  4.93482   0.15735 
 0.981026  5.85091  0.153583  0.358542  0.522184
 0.290561  3.41446  0.332528  2.12949   0.136651
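On a matrix small enough to check by hand, the difference between `*` and `.*` looks like this (a minimal sketch; the dot syntax also works on ordinary function calls):

```julia
A = [1 2; 3 4]
matrix_product = A * A      # rows times columns
elementwise = A .* A        # each entry multiplied by itself

# The dot also applies to function calls: sqrt.(A) maps sqrt over every entry.
roots = sqrt.(A)
println(matrix_product, " ", elementwise)
```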

Julia ships with BLAS and SuiteSparse out of the box.

In [317]:
C = chol(B)

5×5 UpperTriangular{Float64,Array{Float64,2}}:
 2.72895  0.851624  -0.188874   -1.40488     0.213035
  ⋅       3.64176   -0.0233118  -0.0113168  -0.926494
  ⋅        ⋅         1.84667    -2.14469    -0.580374
  ⋅        ⋅          ⋅          1.56712    -0.274357
  ⋅        ⋅          ⋅           ⋅          0.121934

In [318]:
l  = randn(5)
C \ l

5-element Array{Float64,1}:
  0.538762
 -1.04438 
 -2.38441 
 -0.605252
 -2.79532 
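The cells above use the older `chol` name. On Julia 1.x the equivalent lives in the `LinearAlgebra` standard library as `cholesky`. A hedged sketch of factorising once and then solving the full system `B x = b`, with `I` added here purely as a guard against a badly conditioned random draw:

```julia
using LinearAlgebra   # standard library on Julia 1.x

A = randn(5, 5)
B = A' * A + I                 # adding I guards against a near-singular draw
F = cholesky(Symmetric(B))     # F.U is the upper-triangular factor, B ≈ F.U' * F.U
b = randn(5)
x = F \ b                      # two triangular solves using the stored factor
println(norm(B * x - b))
```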

### Sparse matrices

The point here is that Julia has a dedicated sparse matrix type with its own factorisations (via CHOLMOD from SuiteSparse), so sparse problems stay sparse.

In [319]:
S = sprand(100,100,0.1);
S[1:2,:]

2×100 sparse matrix with 23 Float64 nonzero entries:
	[1  ,   5]  =  0.599002
	[2  ,   5]  =  0.955839
	[2  ,  32]  =  0.93147
	[1  ,  36]  =  0.211836
	[1  ,  39]  =  0.110356
	[1  ,  45]  =  0.559659
	[1  ,  50]  =  0.960968
	[2  ,  52]  =  0.533508
	[1  ,  56]  =  0.88769
	[2  ,  59]  =  0.575355
	⋮
	[1  ,  80]  =  0.706075
	[2  ,  80]  =  0.732738
	[1  ,  87]  =  0.117349
	[1  ,  88]  =  0.712092
	[2  ,  88]  =  0.683166
	[2  ,  91]  =  0.151015
	[1  ,  94]  =  0.815216
	[2  ,  95]  =  0.775434
	[2  ,  96]  =  0.823778
	[1  ,  97]  =  0.25485
	[2  ,  98]  =  0.388228

In [320]:
R = S'*S;

In [321]:
cholR = cholfact(R)

Base.SparseArrays.CHOLMOD.Factor{Float64}
type:          LLt
method: supernodal
maxnnz:          0
nnz:          4784


In [322]:
v = randn(size(R, 1));

In [323]:
@elapsed cholR\v

5.9378e-5

In [366]:
@elapsed R\v

0.000459807
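The timing gap above comes from reusing a stored factorisation instead of refactorising on every solve. A sketch of the same idea on a recent Julia 1.x, where `cholfact` is spelled `cholesky` and sparse arrays live in the `SparseArrays` standard library. The `+ I` shift is an addition here, to make sure the matrix is positive definite:

```julia
using SparseArrays, LinearAlgebra

S = sprand(100, 100, 0.1)
R = S' * S + I                 # shifted by I so R is safely positive definite
F = cholesky(Symmetric(R))     # sparse Cholesky, computed once
v = randn(100)
x_factor = F \ v               # reuses the stored factorisation
x_direct = R \ v               # has to factorise again internally
println(norm(x_factor - x_direct))
```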

## Control Flow

In [236]:
for i in 1:length(v)
    if i > 2
        v[i] = sqrt(i)
    elseif i==1
        v[i] = -1
    else
        v[i] = 0
    end
end

In [237]:
println(v)

[-1.0,0.0,1.73205,2.0,2.23607,2.44949,2.64575,2.82843,3.0,3.16228,3.31662,3.4641,3.60555,3.74166,3.87298,4.0,4.12311,4.24264,4.3589,4.47214,4.58258,4.69042,4.79583,4.89898,5.0,5.09902,5.19615,5.2915,5.38516,5.47723,5.56776,5.65685,5.74456,5.83095,5.91608,6.0,6.08276,6.16441,6.245,6.32456,6.40312,6.48074,6.55744,6.63325,6.7082,6.78233,6.85565,6.9282,7.0,7.07107,7.14143,7.2111,7.28011,7.34847,7.4162,7.48331,7.54983,7.61577,7.68115,7.74597,7.81025,7.87401,7.93725,8.0,8.06226,8.12404,8.18535,8.24621,8.30662,8.3666,8.42615,8.48528,8.544,8.60233,8.66025,8.7178,8.77496,8.83176,8.88819,8.94427,9.0,9.05539,9.11043,9.16515,9.21954,9.27362,9.32738,9.38083,9.43398,9.48683,9.53939,9.59166,9.64365,9.69536,9.74679,9.79796,9.84886,9.89949,9.94987,10.0]


## Functions

In [238]:
function dostuff(N)
    v = zeros(N)
    for i in 1:N
        v[i] = sqrt(i)
    end
    return v
end

dostuff (generic function with 1 method)

In [239]:
dostuff(5)

5-element Array{Float64,1}:
 1.0    
 1.41421
 1.73205
 2.0    
 2.23607

In [240]:
sum(dostuff(5))

8.382332347441762

In [241]:
@elapsed dostuff(500000)

0.002693991
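Explicit loops are fast in Julia, but the same function can be written more compactly. A small sketch of three equivalent spellings (the function names are illustrative only):

```julia
# Explicit loop, in the style of `dostuff`.
function sqrt_loop(N)
    v = zeros(N)
    for i in 1:N
        v[i] = sqrt(i)
    end
    return v
end

# The same result as a comprehension and as a broadcast over the range.
sqrt_comprehension(N) = [sqrt(i) for i in 1:N]
sqrt_broadcast(N) = sqrt.(1:N)

println(sqrt_loop(3))
```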

Introspect on your code with macros, including *inter alia*:
 - `@code_warntype`: shows the types inferred for each variable in your function, great for performance tweaking.
 - `@code_native`: shows the machine code, in case you don't believe it's actually a compiled language...

In [242]:
@code_warntype dostuff(3)

Variables:
  #self#::#dostuff
  N::Int64
  v::Array{Float64,1}
  #temp#::Int64
  i::Int64

Body:
  begin 
      v::Array{Float64,1} = $(Expr(:invoke, LambdaInfo for fill!(::Array{Float64,1}, ::Float64), :(Base.fill!), :((Core.ccall)(:jl_alloc_array_1d,(Core.apply_type)(Core.Array,Float64,1)::Type{Array{Float64,1}},(Core.svec)(Core.Any,Core.Int)::SimpleVector,Array{Float64,1},0,N,0)::Array{Float64,1}), :((Base.box)(Float64,(Base.sitofp)(Float64,0))))) # line 3:
      SSAValue(4) = (Base.select_value)((Base.sle_int)(1,N::Int64)::Bool,N::Int64,(Base.box)(Int64,(Base.sub_int)(1,1)))::Int64
      #temp#::Int64 = 1
      5: 
      unless (Base.box)(Base.Bool,(Base.not_int)((#temp#::Int64 === (Base.box)(Int64,(Base.add_int)(SSAValue(4),1)))::Bool)) goto 16
      SSAValue(5) = #temp#::Int64
      SSAValue(6) = (Base.box)(Int64,(Base.add_int)(#temp#::Int64,1))
      i::Int64 = SSAValue(5)
      #temp#::Int64 = SSAValue(6) # line 4:
      SSAValue(2) = (Base.Math.box)(Base.Math.Float64,(Base.Mat
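What `@code_warntype` is looking for is functions whose return type depends on the values of their arguments. A minimal sketch of a stable and an unstable function, assuming Julia 1.x and using the unexported `Base.return_types` to query inference directly (the two functions are invented for illustration):

```julia
# Always returns a Float64 for an Int argument: type-stable.
stable_half(x) = x / 2

# Returns an Int on one branch and a Float64 on the other: type-unstable.
unstable_clip(x) = x > 0 ? x : 0.0

stable_type = Base.return_types(stable_half, (Int,))[1]
unstable_type = Base.return_types(unstable_clip, (Int,))[1]
println(stable_type, " vs ", unstable_type)
```

A non-concrete inferred type such as a `Union` is the kind of thing `@code_warntype` highlights.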

## Probability distributions

In [246]:
using Distributions

LoadError: LoadError: ArgumentError: Module Distributions not found in current path.
Run `Pkg.add("Distributions")` to install the Distributions package.
while loading In[246], in expression starting on line 1

In [247]:
F = Distributions.Laplace(0,1)

LoadError: LoadError: UndefVarError: Distributions not defined
while loading In[247], in expression starting on line 1

In [248]:
mean(F)

LoadError: LoadError: UndefVarError: F not defined
while loading In[248], in expression starting on line 1

In [249]:
var(F)

LoadError: LoadError: UndefVarError: F not defined
while loading In[249], in expression starting on line 1

In [250]:
cdf(F, 1.65)

LoadError: LoadError: UndefVarError: cdf not defined
while loading In[250], in expression starting on line 1

In [251]:
rand(F, 2, 5)

LoadError: LoadError: UndefVarError: F not defined
while loading In[251], in expression starting on line 1
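The cells above fail only because the `Distributions` package was not installed in this session. As a dependency-free sketch of the quantities those calls ask for, here are the closed-form mean, variance and CDF of a Laplace distribution, plus inverse-CDF sampling. `LaplaceSketch` and the `laplace_*` helpers are hypothetical names for illustration, not the `Distributions` API, and the `struct` keyword assumes Julia 0.6 or later:

```julia
# Hypothetical stand-in for a Laplace distribution with location μ and scale b.
struct LaplaceSketch
    μ::Float64
    b::Float64
end

laplace_mean(d::LaplaceSketch) = d.μ
laplace_var(d::LaplaceSketch) = 2 * d.b^2

# Closed-form CDF: 0.5*exp(z) below the location, 1 - 0.5*exp(-z) above it.
function laplace_cdf(d::LaplaceSketch, x)
    z = (x - d.μ) / d.b
    return z < 0 ? 0.5 * exp(z) : 1 - 0.5 * exp(-z)
end

# Inverse-CDF sampling from a uniform draw on [-0.5, 0.5).
function laplace_rand(d::LaplaceSketch)
    u = rand() - 0.5
    return d.μ - d.b * sign(u) * log(1 - 2 * abs(u))
end

F = LaplaceSketch(0, 1)
println((laplace_mean(F), laplace_var(F), laplace_cdf(F, 1.65)))
```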

## Metaprogramming

<img src="imgs/dwarffortress.png" width="500" align="left">

Code is data, and can be operated on and changed by other code. 


Before they get executed, your instructions are parsed. We can hang on to the parsed-but-not-yet-executed instructions with `Expr` objects.

In [259]:
somecode = :(M*M)

:(M * M)

In [260]:
typeof(somecode)

Expr

We can evaluate the instruction at a later point.

In [263]:
eval(somecode)

5×5 Array{Rational{Int64},2}:
 12650869//3600   12650869//7200   …  12650869//14400  12650869//18000
 12650869//7200   12650869//14400     12650869//28800  12650869//36000
 12650869//10800  12650869//21600     12650869//43200  12650869//54000
 12650869//14400  12650869//28800     12650869//57600  12650869//72000
 12650869//18000  12650869//36000     12650869//72000  12650869//90000

In [268]:
M = (1//13)M 

5×5 Array{Rational{Int64},2}:
 49//371293   49//742586   49//1113879  49//1485172  49//1856465
 49//742586   49//1485172  49//2227758  49//2970344  49//3712930
 49//1113879  49//2227758  49//3341637  49//4455516  49//5569395
 49//1485172  49//2970344  49//4455516  49//5940688  49//7425860
 49//1856465  49//3712930  49//5569395  49//7425860  49//9282325

In [270]:
eval(somecode)

5×5 Array{Rational{Int64},2}:
 12650869//496290570656400   …  12650869//2481452853282000 
 12650869//992581141312800      12650869//4962905706564000 
 12650869//1488871711969200     12650869//7444358559846000 
 12650869//1985162282625600     12650869//9925811413128000 
 12650869//2481452853282000     12650869//12407264266410000
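Because an `Expr` is ordinary data, it can be inspected, edited and built before anything runs. A small sketch with the values spliced in via `$` so the expression carries its own data (variable names are illustrative):

```julia
x, y = 3, 4
ex = :($x + $y * 2)          # the values of x and y are spliced into the expression
@show ex.head
@show ex.args[1]

before = eval(ex)            # evaluates 3 + 4 * 2

# Swap the outer operator, then evaluate the edited expression.
ex.args[1] = :-
after = eval(ex)             # evaluates 3 - 4 * 2

# Expressions can also be assembled programmatically.
built = Expr(:call, :*, x, y)
println((before, after, eval(built)))
```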

We can even alter the behaviour of not-yet-executed code with `macro`s.

In [271]:
fieldnames(typeof(somecode))

3-element Array{Symbol,1}:
 :head
 :args
 :typ 

In [272]:
somecode.args

3-element Array{Any,1}:
 :*
 :M
 :M

For example, we could switch out matrix multiplication for elementwise multiplication.

In [273]:
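macro corruptmatmult(arg)
   # Swap matrix multiplication for elementwise multiplication.
   if arg.args[1] == :*
       arg.args[1] = :.*
   end
   return esc(arg)  # esc resolves names such as M in the caller's scope
end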



@corruptmatmult (macro with 1 method)

In [274]:
M*M

5×5 Array{Rational{Int64},2}:
 12650869//496290570656400   …  12650869//2481452853282000 
 12650869//992581141312800      12650869//4962905706564000 
 12650869//1488871711969200     12650869//7444358559846000 
 12650869//1985162282625600     12650869//9925811413128000 
 12650869//2481452853282000     12650869//12407264266410000

In [275]:
@corruptmatmult M*M

5×5 Array{Rational{Int64},2}:
 2401//137858491849   2401//551433967396    …  2401//3446462296225 
 2401//551433967396   2401//2205735869584      2401//13785849184900
 2401//1240726426641  2401//4962905706564      2401//31018160666025
 2401//2205735869584  2401//8822943478336      2401//55143396739600
 2401//3446462296225  2401//13785849184900     2401//86161557405625
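The macro above only looks at the top-level call. A slightly more general sketch, assuming Julia 1.x, walks the whole expression tree and swaps every `*` it finds. `dotify` and `@elementwise` are made-up names for illustration:

```julia
# Recursively replace every call to `*` with the elementwise `.*`.
dotify(x) = x                                  # symbols and literals pass through
function dotify(ex::Expr)
    args = Any[dotify(a) for a in ex.args]
    if ex.head == :call && args[1] == :*
        args[1] = :.*
    end
    return Expr(ex.head, args...)
end

macro elementwise(ex)
    return esc(dotify(ex))                     # esc: resolve names at the call site
end

P = [1 2; 3 4]
result = @elementwise P * P + P                # runs P .* P + P
println(result)
```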

# Parallel 

<img src="imgs/spam.jpg" width="500" align="left">

Were you just thinking, "Sure, but why would I be interested in metaprogramming?" Fret no longer. 

Let's grab some more processors.

In [276]:
rmprocs(workers()); pids = addprocs(4);

In [277]:
workers()'

1×4 Array{Int64,2}:
 22  23  24  25

In [278]:
@fetchfrom pids[2] myid()

23

In [279]:
[@fetchfrom pid randn() for pid in pids]'

1×4 Array{Float64,2}:
 1.3692  -0.0573273  0.529083  -0.752713

In [280]:
@everywhere n = myid()

In [281]:
(myid(), pids[2])

(1,23)

In [282]:
@fetchfrom pids[2] n

1

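Huh? `@fetchfrom` resolved `n` on the calling process (id 1) rather than on the worker. We want to retrieve the value we previously assigned to the symbol `:n` on the second processor, so we send the symbol and evaluate it over there.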

In [283]:
@fetchfrom pids[2] eval(:n)

23

So we need metaprogramming to do parallel computing conveniently.
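The same trick can be sketched on Julia 1.x, where the parallel primitives live in the `Distributed` standard library. `fetch_global` is a hypothetical helper written for this example (it is not part of `ClusterUtils`), and the sketch assumes the machine can spawn two local worker processes:

```julia
using Distributed

pids = addprocs(2)
@everywhere using Distributed
@everywhere n = myid()                 # each process stores its own id in its own `n`

# Hypothetical helper: evaluate a symbol in Main on the remote process.
fetch_global(pid, name::Symbol) = remotecall_fetch(Core.eval, pid, Main, name)

remote_values = [fetch_global(pid, :n) for pid in pids]
println(remote_values == pids)
rmprocs(pids)
```

Shipping the symbol rather than the variable is what forces the lookup to happen on the worker.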

I made a small package, `ClusterUtils`, to facilitate this.

In [284]:
using ClusterUtils;

In [285]:
results = reap(pids, :(myid()*[1,2,3]))

Dict{Int64,Any} with 4 entries:
  23 => [23,46,69]
  25 => [25,50,75]
  22 => [22,44,66]
  24 => [24,48,72]

In [286]:
reduce(hcat, values(results))

3×4 Array{Int64,2}:
 23  25  22  24
 46  50  44  48
 69  75  66  72

In [287]:
reap(pids, :(x = randn(3)));

In [288]:
reap(pids, :(pi*x))

Dict{Int64,Any} with 4 entries:
  23 => [-3.6407,-0.510301,-1.42506]
  25 => [0.337096,-0.386836,0.646038]
  22 => [2.8661,-1.07549,3.84077]
  24 => [-0.577819,2.27804,0.165013]

## Interoperability


<img src="imgs/allyour.png" width="400" align="left">

### shell

In [376]:
println(readstring(`cat /proc/meminfo`)[1:200] * "...")

MemTotal:       16389716 kB
MemFree:         7230676 kB
Buffers:          389688 kB
Cached:          3303624 kB
SwapCached:            0 kB
Active:          6160128 kB
Inactive:        1816868 kB
Acti...
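On Julia 1.x, `readstring(cmd)` is spelled `read(cmd, String)`. A minimal sketch, assuming a Unix-like system with `echo` and `tr` on the path:

```julia
# Backticks build a command object; read runs it and captures stdout as a String.
greeting = read(`echo hello from the shell`, String)

# pipeline chains commands together without going through a shell.
shouted = read(pipeline(`echo hello`, `tr a-z A-Z`), String)
print(greeting, shouted)
```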


In [383]:
;git log | head

commit 77100da6673305ed6d4aa66ab1c96f1c2eb4742b
Author: Matthew Pearce <mcp50@cam.ac.uk>
Date:   Mon Nov 7 16:19:12 2016 +0000

    consolidating.

commit b9cfbb54de20950f6e9a001b280bed0c97bc9aee
Author: Matthew Pearce <mcp50@cam.ac.uk>
Date:   Fri Nov 4 13:37:23 2016 +0000



### python

We can use Python modules from Julia via the `PyCall` package.

In [351]:
using PyCall
@pyimport sklearn.svm as svm

In [353]:
model = svm.SVC()

PyObject SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

In [358]:
X = randn(5,5)
y = randn(5)
z = randn(5)

5-element Array{Float64,1}:
 -0.530013 
 -0.0234911
 -0.524552 
  0.465881 
  0.639702 

In [357]:
model[:fit](X,y)

PyObject SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
  kernel='rbf', max_iter=-1, probability=False, random_state=None,
  shrinking=True, tol=0.001, verbose=False)

In [362]:
model[:predict](z)

1-element Array{Float64,1}:
 -0.113438

### C

We can use C and Fortran libraries from Julia.

In [402]:
ccall((:clock, "libc"), Int32, ())

39000000

In [407]:
gsl_acosh = (x::Float64)->ccall((:gsl_acosh, "libgsl"), Float64, (Float64, ), x)

(::#129) (generic function with 1 method)

In [409]:
gsl_acosh(2.0)

1.3169578969248166
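For functions in the C standard library, no library name is needed because libc is already loaded into the Julia process. A small sketch (`c_abs` is an illustrative wrapper name), assuming a platform where `strlen` and `abs` are visible this way, as on Linux and macOS:

```julia
# strlen from the C standard library; Cstring passes a NUL-terminated string.
len = ccall(:strlen, Csize_t, (Cstring,), "moon on a stick")

# Wrapping the call in a function gives it an ordinary Julia interface.
c_abs(x::Integer) = ccall(:abs, Cint, (Cint,), x)

println((len, c_abs(-7)))
```

The argument types in the tuple tell Julia how to convert each value before the call, so getting them right is the caller's job.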