## Introduction to Julia, part 2

## Matrices and vectors

### Dimensions

In [1]:
x = randn(5, 3)

5×3 Matrix{Float64}:
  0.348931  -0.528734   0.0841266
 -0.17331    0.568161   0.333482
 -2.3021     1.34533   -1.28093
 -0.941703  -0.314709   0.391469
 -0.416806  -0.151277   0.411174

In [2]:
size(x)

(5, 3)

In [3]:
size(x, 1) # nrow() in R

5

In [4]:
size(x, 2) # ncol() in R

3

In [5]:
# total number of elements
length(x)

15

In [6]:
# 5 x 5 matrix of random Normal(0, 1)
x = randn(5, 5)

5×5 Matrix{Float64}:
 -2.07092     -0.681146  -0.412797   0.980523   0.384961
  0.376336    -0.656307   0.875751  -1.51705    0.312636
 -0.00692474  -2.51886   -0.408322  -0.632404   0.40076
 -0.114022    -0.188734   1.20963   -0.469076  -0.893117
  0.163478     0.489247   1.13166    1.24018   -0.681309

In [7]:
# first column
x[:, 1]

5-element Vector{Float64}:
 -2.070923144091901
  0.3763357770991123
 -0.006924739792610375
 -0.11402229835681592
  0.16347814306900302

In [8]:
# first row
x[1, :]

5-element Vector{Float64}:
 -2.070923144091901
 -0.6811455534338193
 -0.41279677488265926
  0.9805234720605761
  0.3849614911746269

In [9]:
# sub-array
x[1:2, 2:3]

2×2 Matrix{Float64}:
 -0.681146  -0.412797
 -0.656307   0.875751

In [10]:
# getting a subset of a matrix creates a copy, but you can also create "views"
z = view(x, 1:2, 2:3)

2×2 view(::Matrix{Float64}, 1:2, 2:3) with eltype Float64:
 -0.681146  -0.412797
 -0.656307   0.875751

In [11]:
# same as
@views z = x[1:2, 2:3]

2×2 view(::Matrix{Float64}, 1:2, 2:3) with eltype Float64:
 -0.681146  -0.412797
 -0.656307   0.875751

In [12]:
# change in z (view) changes x as well
z[2, 2] = 0.0
x

5×5 Matrix{Float64}:
 -2.07092     -0.681146  -0.412797   0.980523   0.384961
  0.376336    -0.656307   0.0       -1.51705    0.312636
 -0.00692474  -2.51886   -0.408322  -0.632404   0.40076
 -0.114022    -0.188734   1.20963   -0.469076  -0.893117
  0.163478     0.489247   1.13166    1.24018   -0.681309
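To contrast the two behaviours side by side, here is a minimal sketch on a small array. The names `A`, `c`, and `v` are illustrative only and are not part of the session above.

```julia
A = zeros(3, 3)

c = A[1:2, 1:2]         # indexing with ranges allocates a copy
v = view(A, 1:2, 1:2)   # a view shares memory with A

c[1, 1] = 1.0           # modifies only the copy
v[2, 2] = 2.0           # writes through to A

copy_left_A_alone = (A[1, 1] == 0.0)
view_changed_A = (A[2, 2] == 2.0)
```

Views avoid an allocation, which tends to matter when slicing large arrays inside loops.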

In [13]:
# y points to same data as x
y = x

5×5 Matrix{Float64}:
 -2.07092     -0.681146  -0.412797   0.980523   0.384961
  0.376336    -0.656307   0.0       -1.51705    0.312636
 -0.00692474  -2.51886   -0.408322  -0.632404   0.40076
 -0.114022    -0.188734   1.20963   -0.469076  -0.893117
  0.163478     0.489247   1.13166    1.24018   -0.681309

In [14]:
# x and y point to same data
pointer(x), pointer(y)

(Ptr{Float64} @0x000000012d3a17f0, Ptr{Float64} @0x000000012d3a17f0)

In [15]:
# changing y also changes x
y[:, 1] .= 0  # Dot broadcasting: "vectorization" in Julia. More below
x

5×5 Matrix{Float64}:
 0.0  -0.681146  -0.412797   0.980523   0.384961
 0.0  -0.656307   0.0       -1.51705    0.312636
 0.0  -2.51886   -0.408322  -0.632404   0.40076
 0.0  -0.188734   1.20963   -0.469076  -0.893117
 0.0   0.489247   1.13166    1.24018   -0.681309

In [16]:
# create a new copy of data
z = copy(x)

5×5 Matrix{Float64}:
 0.0  -0.681146  -0.412797   0.980523   0.384961
 0.0  -0.656307   0.0       -1.51705    0.312636
 0.0  -2.51886   -0.408322  -0.632404   0.40076
 0.0  -0.188734   1.20963   -0.469076  -0.893117
 0.0   0.489247   1.13166    1.24018   -0.681309

In [17]:
pointer(z), pointer(x) # they should be different now

(Ptr{Float64} @0x000000011553c170, Ptr{Float64} @0x000000012d3a17f0)

In [18]:
a = 1.0 # Float64
b = a
b

1.0

In [19]:
a = 2.0
b

1.0

### What's the difference between `(x, y)` and `(a, b)`?

* In Julia, everything is an object (see **Types** below). But there are *mutable* and *immutable* objects.

* In *assignment* of the form `x = ...`, the LHS is a variable name. Assignment changes which object the variable `x` refers to (called a *variable binding*).

* After the statement `b = a`, both names are bound to the same object, so any *mutation* of that object through `a` would also be visible through `b`. However, the value bound to `a` is `1.0`, an immutable value.

* You can't mutate an immutable object. The next statement `a = 2.0` does *not* mutate the value bound to `a` (`1.0`), but creates a new immutable object `2.0` and re-binds it to variable `a`.

* Binding of `b` to the previous object (`1.0`) is not affected. Hence there's no way to tell if it was copied or referenced.

In [20]:
# guess what will happen
x = randn(5, 5)
y

5×5 Matrix{Float64}:
 0.0  -0.681146  -0.412797   0.980523   0.384961
 0.0  -0.656307   0.0       -1.51705    0.312636
 0.0  -2.51886   -0.408322  -0.632404   0.40076
 0.0  -0.188734   1.20963   -0.469076  -0.893117
 0.0   0.489247   1.13166    1.24018   -0.681309

* On the other hand, `Array` is a mutable object.

* `y[:, 1] .= 0` is *not* an assignment, but a *mutation*.

* `x = x .+ 0.1` is an assignment, whereas `x .+= 0.1` is a mutation.

In [21]:
y = x

5×5 Matrix{Float64}:
 -0.889612  0.214323   0.483664   0.487662  -0.133111
 -0.392122  1.52542   -2.80152   -0.523334   2.14402
 -1.33792   1.60519   -0.659322  -1.63398    0.889781
 -0.187051  0.108699  -0.766897  -0.494125  -0.371807
  1.2724    0.126705  -0.567921  -0.56473   -1.31453

In [22]:
x .+= 0.1
y

5×5 Matrix{Float64}:
 -0.789612   0.314323   0.583664   0.587662  -0.0331112
 -0.292122   1.62542   -2.70152   -0.423334   2.24402
 -1.23792    1.70519   -0.559322  -1.53398    0.989781
 -0.0870513  0.208699  -0.666897  -0.394125  -0.271807
  1.3724     0.226705  -0.467921  -0.46473   -1.21453

In [23]:
(pointer(x), pointer(y))

(Ptr{Float64} @0x000000012dfb4b90, Ptr{Float64} @0x000000012dfb4b90)
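The distinction between mutation and assignment can also be checked with the identity operator `===`, which tests whether two names refer to the same object. The following is a small sketch with illustrative names `u` and `w`:

```julia
u = [1.0, 2.0, 3.0]
w = u                  # w and u are bound to the same array

u .+= 0.1              # mutation: the shared array is updated in place
same_after_mutation = (u === w)

u = u .+ 0.1           # assignment: u is re-bound to a newly allocated array
same_after_rebinding = (u === w)
```

After the in-place update both names still see the same data; after the re-binding, `w` keeps the old array while `u` refers to a new one.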

### Concatenate matrices

In [24]:
# 1-by-3 array
[1 2 3]

1×3 Matrix{Int64}:
 1  2  3

In [25]:
# 3-element vector (a 1-dimensional array, treated as a column)
[1, 2, 3]

3-element Vector{Int64}:
 1
 2
 3

In [26]:
# multiple assignment by tuple
x, y, z = randn(5, 3), randn(5, 2), randn(3, 5)

([-0.7143372812431211 -0.9031985143650438 0.4080043411526138; 0.1489510408314187 0.855778058910371 1.3193685669088948; … ; 1.3721194401008279 0.5501700235065544 0.6858979866529997; -0.06755408454207981 1.5115987190507878 -0.6737721203722761], [-0.7768601145609387 -1.5339370433424926; 0.37139980420259905 -0.5667662204238973; … ; -0.2842798633115931 0.021425892900743523; 0.08428785390958803 -0.9123884535958271], [0.1721891496997717 -1.0943085678398357 … 1.0773379033065544 1.4547580798032131; -1.3622006823485868 0.7349564764635432 … -1.51823835512586 -1.3301226623755715; 0.9849893450046917 -0.08562875143010264 … -0.8298711149052155 -1.4397055913699262])

In [27]:
[x y] # 5-by-5 matrix

5×5 Matrix{Float64}:
 -0.714337   -0.903199   0.408004   -0.77686    -1.53394
  0.148951    0.855778   1.31937     0.3714     -0.566766
  0.115503   -1.66322    0.0273898   0.0816024   0.670265
  1.37212     0.55017    0.685898   -0.28428     0.0214259
 -0.0675541   1.5116    -0.673772    0.0842879  -0.912388

In [28]:
[x y; z] # 8-by-5 matrix

8×5 Matrix{Float64}:
 -0.714337   -0.903199    0.408004   -0.77686    -1.53394
  0.148951    0.855778    1.31937     0.3714     -0.566766
  0.115503   -1.66322     0.0273898   0.0816024   0.670265
  1.37212     0.55017     0.685898   -0.28428     0.0214259
 -0.0675541   1.5116     -0.673772    0.0842879  -0.912388
  0.172189   -1.09431     1.65145     1.07734     1.45476
 -1.3622      0.734956    1.13909    -1.51824    -1.33012
  0.984989   -0.0856288   1.3253     -0.829871   -1.43971
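The bracket syntax corresponds to the functions `hcat` and `vcat`. A minimal sketch with small integer matrices (the names `a`, `b`, and `c` are illustrative):

```julia
a = [1 2; 3 4]        # 2×2
b = [5 6; 7 8]        # 2×2
c = [9 10 11 12]      # 1×4

side_by_side = hcat(a, b)            # same as [a b]
stacked = vcat(hcat(a, b), c)        # same as [a b; c]

matches_bracket_syntax = (side_by_side == [a b]) && (stacked == [a b; c])
```

The dimensions must be compatible: blocks in the same row need equal row counts, and stacked rows need equal column counts.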

### Dot operation

In Julia, any function `f(x)` can be applied elementwise to an array `X` with the "dot call" syntax `f.(X)`.

In [29]:
x = randn(5, 3)

5×3 Matrix{Float64}:
  0.0602105  -0.869301  -0.677393
  0.985046   -1.0834    -0.810695
 -1.77535     0.511164   0.138725
  0.736197   -0.604366   0.912871
  0.899441   -0.51086   -0.761734

In [30]:
y = ones(5, 3)

5×3 Matrix{Float64}:
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0
 1.0  1.0  1.0

In [31]:
x .* y # same as x * y in R

5×3 Matrix{Float64}:
  0.0602105  -0.869301  -0.677393
  0.985046   -1.0834    -0.810695
 -1.77535     0.511164   0.138725
  0.736197   -0.604366   0.912871
  0.899441   -0.51086   -0.761734

In [32]:
x .^ (-2) # same as x^(-2) in R

5×3 Matrix{Float64}:
 275.839     1.3233     2.17931
   1.03059   0.851968   1.52155
   0.317271  3.82718   51.9626
   1.84506   2.73779    1.2
   1.2361    3.83174    1.72343

In [33]:
sin.(x) # same as sin(x) in R

5×3 Matrix{Float64}:
  0.0601741  -0.763878  -0.626764
  0.833297   -0.883555  -0.724766
 -0.979151    0.489193   0.13828
  0.671475   -0.568241   0.791263
  0.782979   -0.488928  -0.690177
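Dot operations also *broadcast* arrays of compatible shapes against each other, and the `@.` macro adds a dot to every call in an expression. The sketch below uses a small illustrative matrix `m`; it is not part of the session above.

```julia
m = [1.0 2.0; 3.0 4.0]

# column means as a 1×2 matrix
col_means = sum(m, dims=1) ./ size(m, 1)

# the 1×2 row is broadcast across both rows of m
centered = m .- col_means

# @. turns every operation in the expression into a dot operation
identity_check = @. sin(m)^2 + cos(m)^2
```

Chained dot operations are fused into a single loop, so the last line does not allocate intermediate arrays for `sin.(m)` and `cos.(m)`.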

### Basic linear algebra

In [34]:
x = randn(5)

5-element Vector{Float64}:
  0.7981320376316344
  0.055046774731860114
  0.9171678411680859
 -0.8011908611441682
  0.3098228505206232

In [35]:
using LinearAlgebra
# vector L2 norm
norm(x)

1.4896773947606878

In [36]:
# same as
sqrt(sum(abs2, x))

1.4896773947606878

In [37]:
y = randn(5) # another vector
# dot product
dot(x, y) # x' * y

0.26864208446859245

In [38]:
# same as
x'y

0.26864208446859245

In [39]:
x, y = randn(5, 3), randn(3, 2)
# matrix multiplication, same as %*% in R
x * y

5×2 Matrix{Float64}:
 -0.534878   3.78335
  1.21531   -0.223997
  0.424466   2.13443
 -0.120563   1.03372
  1.10818    1.97533

In [40]:
x = randn(3, 3)

3×3 Matrix{Float64}:
 0.106923  0.131998  -0.441654
 1.2994    0.194997  -1.64046
 1.06757   0.696026   1.16625

In [41]:
# conjugate transpose
x'

3×3 adjoint(::Matrix{Float64}) with eltype Float64:
  0.106923   1.2994    1.06757
  0.131998   0.194997  0.696026
 -0.441654  -1.64046   1.16625

In [42]:
b = rand(3)
x'b # same as x' * b

3-element Vector{Float64}:
  1.5819757080338568
  0.5290551230423759
 -1.346523752738078

In [43]:
# trace
tr(x)

1.4681727122042554

In [44]:
# determinant
det(x)

-0.592299872473627

In [45]:
# rank
rank(x)

3
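As a quick sanity check of the functions above, the following sketch uses small hand-picked inputs (illustrative names `v`, `w`, `A`) whose results can be verified by hand:

```julia
using LinearAlgebra

v = [3.0, 4.0]
w = [1.0, 2.0]
A = [2.0 1.0; 1.0 3.0]

vector_norm = norm(v)              # sqrt(3^2 + 4^2)
inner = dot(v, w)                  # 3*1 + 4*2
trace_matches = (tr(A) == sum(diag(A)))
determinant = det(A)               # 2*3 - 1*1, up to floating-point rounding
full_rank = (rank(A) == 2)
```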

### Sparse matrices

In [46]:
using SparseArrays

# 10-by-10 sparse matrix with density 0.1 (each entry is nonzero with probability 0.1)
X = sprandn(10, 10, 0.1)

10×10 SparseMatrixCSC{Float64, Int64} with 13 stored entries:
   ⋅       ⋅    ⋅         ⋅        …   ⋅          ⋅        0.229898   ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅        -0.339338   ⋅         ⋅ 
 -0.4508   ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅        …   ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅        -0.369283   ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅           0.324188  -0.390965   ⋅         ⋅ 
   ⋅       ⋅   0.455017  0.761732      ⋅          ⋅         ⋅         ⋅ 

Question: why do we use `SparseArrays`?

Answer: for efficient use of memory and computation, since only the nonzero entries are stored and operated on.
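To get a rough sense of the memory savings, the sketch below builds a 1000-by-1000 matrix with three nonzero entries and compares its size with that of the dense equivalent. The matrix `S` is made up for illustration, and `Base.summarysize` is used here as an approximate measure of the bytes reachable from an object.

```julia
using SparseArrays

# sparse(I, J, V, m, n): entries V[k] at positions (I[k], J[k]) of an m×n matrix
S = sparse([1, 500, 1000], [1, 2, 1000], [1.0, -2.0, 3.0], 1000, 1000)

stored = nnz(S)                              # number of stored entries
sparse_bytes = Base.summarysize(S)
dense_bytes = Base.summarysize(Matrix(S))    # 10^6 Float64 values
```

The dense copy needs about 8 MB regardless of how many entries are zero, while the sparse storage grows with the number of stored entries and the number of columns.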

In [47]:
# convert to dense matrix; be cautious when dealing with big data
Xfull = convert(Matrix{Float64}, X)

10×10 Matrix{Float64}:
  0.0     0.0  0.0       0.0       …  0.0        0.0       0.229898  0.0
  0.0     0.0  0.0       0.0          0.0        0.0       0.0       0.0
  0.0     0.0  0.0       0.0          0.0       -0.339338  0.0       0.0
 -0.4508  0.0  0.0       0.0          0.0        0.0       0.0       0.0
  0.0     0.0  0.0       0.0          0.0        0.0       0.0       0.0
  0.0     0.0  0.0       0.0       …  0.0        0.0       0.0       0.0
  0.0     0.0  0.0       0.0          0.0       -0.369283  0.0       0.0
  0.0     0.0  0.0       0.0          0.0        0.0       0.0       0.0
  0.0     0.0  0.0       0.0          0.324188  -0.390965  0.0       0.0
  0.0     0.0  0.455017  0.761732     0.0        0.0       0.0       0.0

In [48]:
# convert a dense matrix to sparse matrix
sparse(Xfull)

10×10 SparseMatrixCSC{Float64, Int64} with 13 stored entries:
   ⋅       ⋅    ⋅         ⋅        …   ⋅          ⋅        0.229898   ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅        -0.339338   ⋅         ⋅ 
 -0.4508   ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅        …   ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅        -0.369283   ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅            ⋅          ⋅         ⋅         ⋅ 
   ⋅       ⋅    ⋅         ⋅           0.324188  -0.390965   ⋅         ⋅ 
   ⋅       ⋅   0.455017  0.761732      ⋅          ⋅         ⋅         ⋅ 

In [49]:
# syntax for sparse linear algebra is the same as dense linear algebra
β = ones(10)
X * β

10-element Vector{Float64}:
  0.06567162789773778
  0.0
  2.123154679438356
 -0.4508000965582072
  0.0
  0.0
 -0.3692830581540265
  0.0
 -3.8142971773623637
  1.2167490544568902

In [50]:
# many functions apply to sparse matrices as well
sum(X)

-1.2288049702816137

### Control flow and loops

* if-elseif-else-end

```julia
if condition1
    # do something
elseif condition2
    # do something
else
    # do something
end
```

* `for` loop

```julia
for i in 1:10
    println(i)
end
```

* Nested `for` loop

```julia
for i in 1:10
    for j in 1:5
        println(i * j)
    end
end
```

* Same as

```julia
for i in 1:10, j in 1:5
    println(i * j)
end
```

* Exit loop

```julia
for i in 1:10
    # do something
    if condition1
        break # exit the loop, skipping all remaining iterations
    end
end
```

* Exit iteration

```julia
for i in 1:10
    # do something
    if condition1
        continue # skip to next iteration
    end
end
```
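The templates above can be combined into one runnable sketch. The function name `sum_odds_until` and its logic are hypothetical, chosen only to exercise `for`, `if`, `continue`, and `break` together.

```julia
# Add up odd numbers 1, 3, 5, ... for as long as the running total stays within `limit`.
function sum_odds_until(limit)
    total = 0
    for i in 1:100
        if iseven(i)
            continue          # skip even numbers
        end
        if total + i > limit
            break             # stop before exceeding the limit
        end
        total += i
    end
    return total
end

result = sum_odds_until(20)   # 1 + 3 + 5 + 7 = 16; adding 9 would exceed 20
```

Wrapping the loop in a function sidesteps the scoping rules for assigning to global variables inside top-level loops.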

### Functions

* Function definition
```julia
function func(req1, req2; key1=dflt1, key2=dflt2)
    # do stuff
    return out1, out2, out3
end
```
    - **Required arguments** (`req1`, `req2`) are separated by commas and matched by position.
    - **Keyword arguments** (`key1`, `key2`) follow the semicolon in the signature and are passed by name; here each one is optional because it has a default value.
    - The **semicolon** is needed in the definition but is not required in the function call, e.g., `func(a, b, key1=v1)`.
    - The **return** statement is optional (the value of the last expression is the return value, like R).
    - Multiple outputs can be returned as a tuple, e.g., `return out1, out2, out3`.
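A concrete instance of the template may help. The function `scale_shift` below is a made-up example with two positional arguments, two keyword arguments, and two return values.

```julia
function scale_shift(x, y; scale=1.0, shift=0.0)
    s = scale * (x + y) + shift
    return s, s^2             # returned as a tuple
end

a, b = scale_shift(1, 2)                  # both keywords take their defaults
c, d = scale_shift(1, 2; scale=2.0)       # override one keyword
e, _ = scale_shift(1, 2, shift=1.0)       # the semicolon is optional at the call site
```

Destructuring on the left-hand side unpacks the returned tuple; `_` discards a value that is not needed.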
    
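* In Julia, function arguments are passed by sharing: the function receives a reference to the same object rather than a copy. This is often loosely described as [**passed by reference**](https://en.wikipedia.org/wiki/Evaluation_strategy#Call_by_reference), in contrast to R and Matlab (which behave like pass by value).
    - Implication: mutable arguments (e.g., arrays) can be **modified** inside the function, and the caller sees the change.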
    
* By convention, a function name ending with `!` indicates that the function mutates at least one of its arguments, typically the first.
```julia
sort!(x) # vs sort(x)
```

* There is a subtle binding issue in functions (see the discussion of `(x, y)` versus `(a, b)` above): re-binding an argument name inside a function does not affect the caller's variable. See the "I passed an argument `x` to a function, modified it inside that function, but on the outside, the variable `x` is still unchanged. Why?" section of https://docs.julialang.org/en/v1/manual/faq/
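The difference between mutating an argument and re-binding its name can be sketched with two small functions. Both `zero_first!` and `rebind` are hypothetical names used only for this illustration.

```julia
# Mutates the array the caller passed in.
function zero_first!(v)
    v[1] = 0
    return v
end

# Re-binds the local name `v` to a new array; the caller's array is untouched.
function rebind(v)
    v = [0, 0, 0]
    return v
end

a = [1, 2, 3]
zero_first!(a)     # a is modified in place

b = [1, 2, 3]
rebind(b)          # b keeps its original contents
```

Only the first function earns the `!` suffix, since only it changes data visible to the caller.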

* Anonymous functions, e.g., `x -> x^2`, are commonly used with collection functions or in list comprehensions.
```julia
map(x -> x^2, y) # square each element in y
```

* Functions can be nested:

```julia
function outerfunction()
    # do some outer stuff
    function innerfunction()
        # do inner stuff
        # can access prior outer definitions
    end
    # do more outer stuff
end
```
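One common use of a nested function is a closure that remembers state from the enclosing call. The sketch below is illustrative; `make_counter` and `increment` are made-up names.

```julia
function make_counter()
    n = 0                      # local to this call of make_counter
    function increment()
        n += 1                 # the inner function can read and update n
        return n
    end
    return increment           # hand the inner function back to the caller
end

counter = make_counter()
first_call = counter()
second_call = counter()

other = make_counter()         # a separate counter with its own n
other_first = other()
```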

* Functions can be vectorized using the "dot call" syntax:

In [51]:
function myfunc(x)
    return sin(x^2)
end

x = randn(5, 3)
myfunc.(x)

5×3 Matrix{Float64}:
  0.54054     0.236534   6.06964e-5
 -0.00839683  0.204126   0.411981
  0.322984    0.0661039  0.422811
 -0.289115    0.778884   0.861485
  0.0259346   0.999434   0.000134913

* **Collection functions** (think of these as the series of `apply` functions in R).

    Apply a function to each element of a collection:
    
```julia
map(f, coll) # or
map(coll) do elem
    # do stuff with elem
    # the value of the last expression (or an explicit `return`) is the result for elem
end
```

In [52]:
map(x -> sin(x^2), x) # same as above

5×3 Matrix{Float64}:
  0.54054     0.236534   6.06964e-5
 -0.00839683  0.204126   0.411981
  0.322984    0.0661039  0.422811
 -0.289115    0.778884   0.861485
  0.0259346   0.999434   0.000134913

In [53]:
map(x) do elem    # long version of above
    elem = elem^2
    return sin(elem)
end

5×3 Matrix{Float64}:
  0.54054     0.236534   6.06964e-5
 -0.00839683  0.204126   0.411981
  0.322984    0.0661039  0.422811
 -0.289115    0.778884   0.861485
  0.0259346   0.999434   0.000134913

In [54]:
# Mapreduce
mapreduce(x -> sin(x^2), +, x) # mapreduce(mapper, reducer, data)

4.573500345814576

In [55]:
# same as
sum(x -> sin(x^2), x)

4.573500345814576
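The reducer does not have to be `+`. As a small sketch on a hand-picked vector (the name `vals` is illustrative), the same pattern computes a maximum absolute value or a sum of squares:

```julia
vals = [-3.0, 1.0, 2.0]

largest_abs = mapreduce(abs, max, vals)      # apply abs, then reduce with max
sum_squares = mapreduce(v -> v^2, +, vals)   # apply squaring, then reduce with +

# equivalent convenience forms
same_max = maximum(abs, vals)
same_sum = sum(abs2, vals)
```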

* **List comprehension**

In [56]:
[sin(2i + j) for i in 1:5, j in 1:3] # similar to Python

5×3 Matrix{Float64}:
  0.14112   -0.756802  -0.958924
 -0.958924  -0.279415   0.656987
  0.656987   0.989358   0.412118
  0.412118  -0.544021  -0.99999
 -0.99999   -0.536573   0.420167
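Comprehensions also accept a filter, and dropping the brackets gives a generator that can be passed to a reduction without building an intermediate array. A brief sketch:

```julia
# keep only even i, then square
evens_squared = [i^2 for i in 1:10 if iseven(i)]

# generator expression: no temporary array is allocated
total = sum(i^2 for i in 1:10)
```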

### Type system

* Every value in Julia has a type; a variable is just a name bound to a value.

* When thinking about types, think about sets.

* Everything is a subtype of the abstract type `Any`.

* An abstract type defines a set of types
    - Consider types in Julia that are a `Number`: See https://en.wikibooks.org/wiki/Introducing_Julia/Types

* We can explore type hierarchy with `typeof()`, `supertype()` and `subtypes()`.

In [57]:
typeof(1.0), typeof(1)

(Float64, Int64)

In [58]:
supertype(Float64)

AbstractFloat

In [59]:
subtypes(AbstractFloat)

4-element Vector{Any}:
 BigFloat
 Float16
 Float32
 Float64

In [60]:
# Is Float64 a subtype of AbstractFloat?
Float64 <: AbstractFloat

true
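Repeated calls to `supertype()` walk up the hierarchy until `Any` is reached. The helper `supertype_chain` below is a hypothetical function written for this illustration, not part of Julia.

```julia
# Collect T and all of its ancestors, ending at Any.
function supertype_chain(T::Type)
    chain = Type[T]
    while T !== Any
        T = supertype(T)
        push!(chain, T)
    end
    return chain
end

float_chain = supertype_chain(Float64)
int_is_number = Int64 <: Number       # subtyping is transitive
```

For `Float64` the chain passes through `AbstractFloat`, `Real`, and `Number` before ending at `Any`.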

In [61]:
# On a 64-bit machine, Int == Int64
Int == Int64

true

In [62]:
# convert to Float64
convert(Float64, 1)

1.0

In [63]:
# same as
Float64(1)

1.0

In [64]:
# Float32 vector
x = randn(Float32, 5)

5-element Vector{Float32}:
 -0.02471695
 -0.9282092
 -0.15298422
  1.5959787
 -0.59951437

In [65]:
# convert to Float64
convert(Array{Float64}, x)

5-element Vector{Float64}:
 -0.024716949090361595
 -0.9282091856002808
 -0.15298421680927277
  1.5959787368774414
 -0.599514365196228

In [66]:
# same as
Float64.(x)

5-element Vector{Float64}:
 -0.024716949090361595
 -0.9282091856002808
 -0.15298421680927277
  1.5959787368774414
 -0.599514365196228

In [67]:
convert(Int, 1.5) # throws InexactError; use round(Int, 1.5) instead

LoadError: InexactError: Int64(1.5)

In [68]:
round(Int, 1.5)

2
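`round` is one of several ways to turn a floating-point number into an integer; which one is appropriate depends on the intended rounding direction. A short sketch:

```julia
down = floor(Int, 1.5)        # toward -Inf
up = ceil(Int, 1.5)           # toward +Inf
toward_zero = trunc(Int, -1.5)
ties_to_even = round(Int, 2.5)   # the default rounding mode sends ties to the even neighbor

exact = convert(Int, 2.0)     # fine: 2.0 is exactly representable as an Int

# convert refuses to silently drop the fractional part
failed_with_inexact = try
    convert(Int, 1.5)
    false
catch err
    err isa InexactError
end
```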

## Multiple dispatch

* [Multiple dispatch](https://en.wikipedia.org/wiki/Multiple_dispatch) is a feature of some programming languages in which a function or method can be dynamically dispatched on the run time (dynamic) type or, in the more general case, some other attribute of more than one of its arguments.

* Multiple dispatch lies at the core of Julia's design. It allows built-in and user-defined functions to be overloaded for different combinations of argument types.

* In Julia, methods belong to functions, called **generic functions**.

* Let's consider a simple "doubling" function:

In [69]:
g(x) = x + x

g (generic function with 1 method)

In [70]:
g(1.5)

3.0

This definition is too broad, since some things, e.g., strings, can't be added.

In [71]:
g("hello, world")

LoadError: MethodError: no method matching +(::String, ::String)
Closest candidates are:
  +(::Any, ::Any, ::Any, ::Any...) at operators.jl:560

* This definition is correct but too restrictive, since any `Number` can be added.

In [72]:
g(x::Float64) = x + x

g (generic function with 2 methods)

* This definition will automatically work on the entire type tree above!

In [73]:
g(x::Number) = x + x

g (generic function with 3 methods)

This is a lot nicer than

```julia
function g(x)
    if isa(x, Number)
        return x + x
    else
        throw(ArgumentError("x should be a number"))
    end
end
```

* The `methods(func)` function displays all methods defined for `func`.

In [74]:
methods(g)

* When a function has multiple methods, Julia calls the most specific method whose signature matches the argument types, falling back to a broader signature only when no narrower one applies.

* `@which func(x)` macro tells which method is being used for argument signature `x`.

In [75]:
# an Int64 input
@which g(1)

In [76]:
@which g(1.0)

In [77]:
# a Vector{Float64} input
@which g(randn(5))
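The examples above dispatch on a single argument. To illustrate dispatch on the types of *several* arguments, here is a minimal sketch with a made-up function `describe`; the method chosen depends on the combination of both argument types.

```julia
describe(x, y) = "something else"                       # fallback for any two values
describe(x::Number, y::Number) = "two numbers"
describe(x::Integer, y::Integer) = "two integers"
describe(x::Integer, y::AbstractFloat) = "an integer and a float"

r1 = describe(1, 2)        # both Int64: the most specific match is (Integer, Integer)
r2 = describe(1, 2.0)      # (Integer, AbstractFloat)
r3 = describe(1.0, 2)      # only (Number, Number) and the fallback apply
r4 = describe("a", 1)      # only the fallback applies
```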

* R also makes use of generic functions and multiple dispatch, but its dispatch mechanism is not as heavily optimized.

## Just-in-time compilation (JIT)

* Julia's efficiency results from its ability to infer the types of **all** variables within a function and then call the LLVM compiler to generate optimized machine code at run time.

Consider the `g` (doubling) function defined earlier. This function will work on **any** type which has a method for `+`.

In [78]:
g(2), g(2.0)

(4, 4.0)

**Step 1**: Parse Julia code into an [abstract syntax tree (AST)](https://en.wikipedia.org/wiki/Abstract_syntax_tree) and lower it to a simpler intermediate form, which `@code_lowered` displays.

In [79]:
@code_lowered g(2)

CodeInfo(
1 ─ %1 = x + x
└──      return %1
)

**Step 2**: Type inference according to input type

In [80]:
@code_warntype g(2)

Variables
  #self#::Core.Const(g)
  x::Int64

Body::Int64
1 ─ %1 = (x + x)::Int64
└──      return %1


In [81]:
@code_warntype g(2.0)

Variables
  #self#::Core.Const(g)
  x::Float64

Body::Float64
1 ─ %1 = (x + x)::Float64
└──      return %1


**Step 3**: Compile into **LLVM intermediate representation (IR)** (loosely comparable to the R bytecode generated by the compiler package).

In [82]:
@code_llvm g(2)

;  @ In[73]:1 within `g'
define i64 @julia_g_4154(i64 signext %0) {
top:
; ┌ @ int.jl:87 within `+'
   %1 = shl i64 %0, 1
; └
  ret i64 %1
}


In [83]:
@code_llvm g(2.0)

;  @ In[72]:1 within `g'
define double @julia_g_4179(double %0) {
top:
; ┌ @ float.jl:326 within `+'
   %1 = fadd double %0, %0
; └
  ret double %1
}


We didn't provide a type annotation. But different LLVM code gets generated depending on the argument type!

In R or Python, `g(2)` and `g(2.0)` would use the same code for both.

In Julia, `g(2)` and `g(2.0)` dispatch to optimized code for `Int64` and `Float64`, respectively.

For integer input `x`, the LLVM compiler is smart enough to recognize that `x + x` amounts to shifting `x` left by 1 bit (`shl`), an operation that is at least as cheap as integer addition.
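The specialization can also be observed without reading LLVM output. The sketch below defines a fresh doubling function `double` (a hypothetical name, to avoid touching `g`) and asks the compiler which return type it infers for each argument type. It assumes that `Base.return_types`, an unexported reflection helper, is available in the Julia version in use.

```julia
double(x) = x + x      # no type annotation, as with the first definition of g

# inferred return type for each argument type
inferred_for_int = first(Base.return_types(double, (Int,)))
inferred_for_float = first(Base.return_types(double, (Float64,)))

# the runtime results agree with the inferred types
runtime_int = typeof(double(2))
runtime_float = typeof(double(2.0))
```

Because the return type is fully determined by the argument type, the compiler can emit specialized machine code for each case.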

**Step 4**: Lowest level is the **assembly code**, which is machine dependent.

In [84]:
@code_native g(2)

	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[73]:1 within `g'
; │┌ @ int.jl:87 within `+'
	leaq	(%rdi,%rdi), %rax
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └


The first instruction (`leaq`) adds the content of the general-purpose 64-bit register RDI (a register is a small piece of memory inside the CPU) to itself and loads the result into another register, RAX. The addition here is integer arithmetic.

In [85]:
@code_native g(2.0)

	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ In[72]:1 within `g'
; │┌ @ float.jl:326 within `+'
	vaddsd	%xmm0, %xmm0, %xmm0
; │└
	retq
	nopw	%cs:(%rax,%rax)
; └


The first instruction (`vaddsd`) adds the content of the 128-bit register XMM0 to itself and writes the result back into XMM0. The addition here is floating-point arithmetic. `vaddsd` belongs to the AVX "single instruction, multiple data" (SIMD) instruction set, although this particular instruction operates on a single double-precision value in the low 64 bits of the register.

In [86]:
run(`which /usr/local/bin/R`)

/usr/local/bin/R


Process(`which /usr/local/bin/R`, ProcessExited(0))

## Acknowledgment

This lecture note has evolved from [Dr. Hua Zhou](http://hua-zhou.github.io)'s Spring 2019 Statistical Computing course notes, available at <http://hua-zhou.github.io/teaching/biostatm280-2019spring/index.html>.