# Re-invitation to Julia

Let's do a blitz review of the fundamentals of Julia. This is suitable as a very quick introduction for experienced programmers, or as a refresher before the next step for people with some Julia knowledge.

For a more leisurely discussion, see my *Invitation to Julia* tutorial from JuliaCon 2015: https://www.youtube.com/watch?v=gQ1y5NUD_RI

### Financial support

Financial support is acknowledged from DGAPA-UNAM (Mexico) PAPIME grant PE-107114, DGAPA-UNAM PAPIIT grant IN-117214, and from a CONACYT-Mexico sabbatical fellowship. The author thanks Alan Edelman and the Julia group at MIT for hospitality during his sabbatical visit.

## Variables

In [None]:
a = 2
a, typeof(a)

In [None]:
b = 3.5
b, typeof(b)

In [None]:
c = a + b

What does this do?

In [None]:
@which a + b

What is `+` actually?

In [None]:
+

In [None]:
methods(+)

We see that Julia calls specialised versions of a function, called **methods**. `+` and all other "operators" are just functions with (possibly) many methods. The act of choosing which method to use based on the *types* of all the arguments passed into the function is called **multiple dispatch**, one of the fundamental features of Julia.

We can follow down the chain of what is going on by looking at the code, or preferably using the new debugger integration in the Juno IDE:  `@step 2 + 3.5`
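To make methods and multiple dispatch concrete, here is a minimal sketch. The function `describe` is hypothetical and invented for this illustration; it is not part of Julia. Each definition adds one method, and Julia picks the most specific method that matches the types of *all* the arguments.

```julia
# `describe` is a hypothetical function used only for illustration.
describe(x::Integer) = "an integer"
describe(x::AbstractFloat) = "a floating-point number"
describe(x, y) = "two arguments"                        # generic fallback
describe(x::Integer, y::AbstractFloat) = "an integer and a float"

describe(2)         # "an integer"
describe(3.5)       # "a floating-point number"
describe(2, 3.5)    # "an integer and a float" (most specific match)
describe(3.5, 2)    # "two arguments" (only the fallback matches)

length(methods(describe))   # 4: one function, four methods
```

Note that the order of the argument types matters: swapping the two arguments selects a different method.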

## When is Julia fast?

One of Julia's strengths is its blinding speed, which is perhaps unique among high-level languages.
However, it is easy enough to write *slow* Julia code! The point is that making slow code fast often requires following only a few rules.

Suppose we define a function

In [None]:
f(x, y) = x * y

In [None]:
f(3, 4)

In [None]:
f(3.5, 4.5)

Julia allows us to see various stages in the compilation process:

In [None]:
@code_lowered f(3, 4)

In [None]:
@code_typed f(3, 4)

or better

In [None]:
@code_warntype f(3, 4)

In [None]:
@code_llvm f(3, 4)

In [None]:
@code_llvm f(3.5, 4)

In [None]:
@code_native f(3, 4)

In [None]:
@code_native f(3.5, 4.5)

In [None]:
@code_native f(3, 4.5)

We see that Julia **generates highly-efficient code that is specialised on input type**. This is naturally true only **if Julia is able to correctly infer the type of every variable**.

For example:

In [None]:
function g(x)
    a = 1   # a starts off life as an integer
    a += x  # it may change type here
    return a
end
    

In [None]:
@code_warntype g(3)

In [None]:
@code_typed g(3)

In [None]:
@code_typed g(3.5)

In [None]:
@code_warntype g(3.5)

In [None]:
@code_native g(3.5)

In [None]:
@code_native g(3)

The code is efficient **only** when the function is **type-stable**, i.e. when no variable changes type during the function. In the future, more sophisticated compiler optimizations should be able to handle simple cases like this one.
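One possible way to make a function like `g` type-stable is to initialise the accumulator with a value of the same type as the argument. The sketch below uses a hypothetical name, `g_stable`, and assumes that `x` is a number, so that `one(x)` (the multiplicative identity in the type of `x`) is defined.

```julia
# `g_stable` is a hypothetical, type-stable variant of `g`.
function g_stable(x)
    a = one(x)   # 1 in the same type as x, so `a` never changes type
    a += x
    return a
end

g_stable(3)      # 4
g_stable(3.5)    # 4.5
```

Running `@code_warntype g_stable(3.5)` is a way to check that `a` now has a single concrete type throughout the function.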

## Arrays

Arrays are another fundamental building block of Julia, and there is much sophisticated array functionality.

The simplest way to create an array of a given type and size is with `zeros`, for example a vector:

In [None]:
v = zeros(Int, 3)  # element type; size 3

or a matrix:

In [None]:
M = zeros(3, 3)  # default element type is Float64 

We see that `Array` is a type with two **type parameters**: the element type and the **dimension** of the array. Note that the size of the array is **not** one of the type parameters, but it is available with

In [None]:
size(M)

This returns an object of type

In [None]:
typeof(size(M))
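The two type parameters and the size can also be queried individually. The following is a small sketch using only standard functions; the second matrix, `zeros(5, 7)`, is just an arbitrary example of a different size.

```julia
M = zeros(3, 3)

typeof(M)    # Array{Float64,2}, shown as Matrix{Float64} on recent Julia versions
eltype(M)    # Float64: the first type parameter
ndims(M)     # 2: the second type parameter
size(M)      # (3, 3): a tuple, not part of the type

# Matrices of different sizes therefore share the same type:
typeof(M) == typeof(zeros(5, 7))   # true
```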

`Vector`s (i.e. 1-dimensional `Array`s) may be extended using `push!`:

In [None]:
push!(v, 1)

Higher-dimensional arrays have a fixed size. Instead you can push to a `Vector` of `Vector`s:

In [None]:
v = Vector{Int}[]
push!(v, [3, 4])
push!(v, [5, 6])

Note that in Julia v0.4, we can write this directly as

In [None]:
v = Vector{Int}[ [3,4], [5,6] ]

by specifying the element type in front of the array literal. In Julia v0.5, this may be written simply as

    v = [ [3,4], [5,6] ]

## Performance: don't program in global scope

Coming from other languages, it is natural to write code that looks like the following example simulation of a simple random walker with position `pos`:

In [None]:
@time begin 
pos = 0
numsteps = 10^4
numwalkers = 10^4

final_square_positions = Int[]

for i in 1:numwalkers
    for j in 1:numsteps
        pos += ifelse(rand() < 0.5, -1, +1)
    end
    push!(final_square_positions, pos^2)
end
   
println("Mean square displacement = ", mean(final_square_positions))
end

We have wrapped the code in a `begin...end` block, and timed it with the `@time` macro. (Note that there is also the `@elapsed` macro, which returns the time in seconds, `@timev` for verbose output, and `@timed` for returning detailed information.)

Is 20 seconds for this calculation slow? We cannot know that without having a comparison code from a compiled language such as C or Fortran. **However**, we see that there are a huge number of allocations, which such a simple code should never have.

This is an immediate warning sign that Julia is unable to correctly infer the type of some object. In this case, it is because we are working in global scope, where any variable may be rebound to a value of a different type at any time, so the compiler cannot assume a fixed type. (Nonetheless, it is a future goal of Julia to reduce this effect.)

The solution is extremely simple: just wrap the code in a function. While we are at it, we should take the opportunity to make the constants into parameters of the function. We can also refactor to separate out a single walker into a separate function:

In [None]:
rand(Bool)

In [None]:
function random_walker(numsteps)
    pos = 0
        
    for j in 1:numsteps
        pos += ifelse(rand() < 0.5, -1, +1)
    end
    
    return pos
end


function mean_square_disp(numwalkers, numsteps)
    
    final_square_positions = Int[]

    for i in 1:numwalkers
        final_pos = random_walker(numsteps)
        push!(final_square_positions, final_pos^2)
    end

    return mean(final_square_positions)
end


**Before** we do any timing, we first must ensure that the functions are compiled, by running them once with small values of the parameters:

In [None]:
mean_square_disp(1, 1)

Now we can immediately do the "production run":

In [None]:
@time mean_square_disp(10^4, 10^4)

This is almost a 50-times speedup, and should be competitive with an implementation in C or Fortran.
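The reason for the warm-up call can be seen by timing the same call twice with `@elapsed`. This is a sketch only: `sum_of_squares` is a hypothetical stand-in for any function being timed, and the actual numbers depend on the machine and the Julia version, so none are shown.

```julia
# `sum_of_squares` is a hypothetical example function.
function sum_of_squares(n)
    total = 0
    for i in 1:n
        total += i^2
    end
    return total
end

first_call  = @elapsed sum_of_squares(10)   # typically includes compilation time
second_call = @elapsed sum_of_squares(10)   # runs the already-compiled code

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

The first timing is usually much larger than the second, which is why the production run should be timed only after the function has been called once.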

## Global variables

Global variables should generally be avoided, but sometimes they are a necessary evil. In this case, they should **always** be declared `const`, in which case the compiler can infer their type and create fast code.

We can obtain a typed but mutable variable in one of two ways: by placing it inside an array, or by placing it inside an object of a user-defined type. In both cases, either the array or the object must be `const` for speed.

In [None]:
const my_number = 3.14159
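The two ways of obtaining a typed but mutable global can be sketched as follows. All the names here (`counter`, `Settings`, `settings`, `bump!`) are hypothetical, and the code assumes the Julia 1.x keyword `mutable struct`; in Julia v0.4 and v0.5 the corresponding keyword is `type`.

```julia
# Option 1: a const one-element array. The binding is fixed, the contents are mutable.
const counter = [0]

# Option 2: a const object of a user-defined mutable type.
mutable struct Settings      # written `type Settings ... end` in Julia v0.4/v0.5
    tolerance::Float64
end
const settings = Settings(1e-6)

# Hypothetical function that updates both globals.
function bump!()
    counter[1] += 1             # element type is known to be Int
    settings.tolerance /= 2     # field type is known to be Float64
    return counter[1]
end

bump!()   # 1
bump!()   # 2
```

In both cases the compiler knows the type of the container, and hence of its contents, so functions that use these globals can still be compiled to fast code.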