# Introduction to Julia

[Julia](https://julialang.org/) is a fast, dynamic, reproducible, composable, general and open source programming language that can be downloaded [here](https://julialang.org/downloads/). The full documentation of the current release (v1.7) is available [here](https://docs.julialang.org/en/v1/). This tutorial goes over the basics of Julia and covers several topics on which Julia's speed is based:

1. use/add/remove/develop package
2. basic linear algebra
3. just-in-time (JIT) compiler
4. multiple dispatch

## Package management

In Julia, packages are loaded via `using` (similar to how packages are loaded in Python via `import`). We typically load all the packages we want to use at the start of the file:

In [1]:
using LinearAlgebra, Test, BenchmarkTools

To add/remove a package, we can follow [Pkg.jl](https://github.com/JuliaLang/Pkg.jl) and run

In [2]:
using Pkg
Pkg.add("IterativeSolvers")

    Updating registry at `~/.julia/registries/General`
    Updating git-repo `https://github.com/JuliaRegistries/General.git`
    Updating registry at `~/.julia/registries/SLIMregistryJL`
    Updating git-repo `https://github.com/slimgroup/SLIMregistryJL.git`
   Resolving package versions...
    Updating `~/.julia/environments/v1.7/Project.toml`
  [42fd0dbc] ~ IterativeSolvers v0.9.2 `https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl.git#master` ⇒ v0.9.2
    Updating `~/.julia/environments/v1.7/Manifest.toml`
  [42fd0dbc] ~ IterativeSolvers v0.9.2 `https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl.git#master` ⇒ v0.9.2
Precompiling project...
  ✓ IterativeSolvers
  ✓ JOLI
  ✓ GenSPGL
  ✓ JUDI
  ✓ SetIntersectionProjection

In [3]:
Pkg.rm("IterativeSolvers")

    Updating `~/.julia/environments/v1.7/Project.toml`
  [42fd0dbc] - IterativeSolvers v0.9.2
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`


Note that we can also install a Julia package at a specific version, or from a GitHub repository

In [4]:
Pkg.add(name="IterativeSolvers", version="0.9.2")

   Resolving package versions...
    Updating `~/.julia/environments/v1.7/Project.toml`
  [42fd0dbc] + IterativeSolvers v0.9.2
  No Changes to `~/.julia/environments/v1.7/Manifest.toml`


In [5]:
Pkg.add(url="https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl.git", rev="master")

    Updating git-repo `https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl.git`
   Resolving package versions...
    Updating `~/.julia/environments/v1.7/Project.toml`
  [42fd0dbc] ~ IterativeSolvers v0.9.2 ⇒ v0.9.2 `https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl.git#master`
    Updating `~/.julia/environments/v1.7/Manifest.toml`
  [42fd0dbc] ~ IterativeSolvers v0.9.2 ⇒ v0.9.2 `https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl.git#master`
Precompiling project...
  ✓ IterativeSolvers
  ✓ JOLI
  ✓ GenSPGL
  ✓ JUDI
  ✓ SetIntersectionProjection
  ✓ ImageGather
  ✓ MECurvelets
  ✓ InvertibleNetworks
  8 dependencies successfully precompiled in 22 seconds (356 already precompiled)


We can also do these in Julia's interactive command-line REPL (read-eval-print loop), where typing `]` switches to the package mode, via
```julia
] add IterativeSolvers
] rm IterativeSolvers
] add https://github.com/JuliaLinearAlgebra/IterativeSolvers.jl.git
```
etc.
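The outputs above show that `Pkg.add` and `Pkg.rm` modify the `Project.toml` and `Manifest.toml` of the active environment (here the default `v1.7` one). As a minimal sketch, assuming Julia 1.5 or newer for the `temp` keyword, one way to experiment without touching the default environment is to activate a temporary one first:

```julia
using Pkg

# Activate a fresh, temporary environment (the `temp` keyword needs Julia 1.5 or newer).
Pkg.activate(temp=true)

# Path of the Project.toml that `Pkg.add` / `Pkg.rm` would now modify.
println(Base.active_project())

# List the packages in the active environment (a new one starts out empty).
Pkg.status()
```

The temporary environment is discarded when the Julia session ends, so packages added to it do not show up in the default environment.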

## Linear algebra

The syntax for creating multidimensional arrays is similar to MATLAB's. We use square brackets `[]` to set up an array, spaces to separate the entries within a row, and a semicolon `;` to start a new row.

In [6]:
A = [1 2; 3 4]

2×2 Matrix{Int64}:
 1  2
 3  4

Note that multidimensional arrays are stored in a column-major order, meaning that they fill up one column at a time. We can verify this by vectorizing `A` via `vec` or `[:]`

In [7]:
A[:]

4-element Vector{Int64}:
 1
 3
 2
 4

In [8]:
vec(A)

4-element Vector{Int64}:
 1
 3
 2
 4
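Column-major storage also determines how a single (linear) index and `reshape` behave, and which loop order visits memory contiguously. The following is a small sketch; the helper `colmajor_sum` is a hypothetical name used only for illustration.

```julia
A = [1 2; 3 4]

# A single (linear) index walks down the columns first.
A[2]            # 3, the entry in row 2, column 1
A[3]            # 2, the entry in row 1, column 2

# `reshape` fills the new array column by column as well:
# the columns of B are (1, 2), (3, 4) and (5, 6).
B = reshape(1:6, 2, 3)

# Hypothetical helper: the row index `i` is the innermost (fastest) loop,
# so consecutive iterations touch neighbouring memory locations.
function colmajor_sum(M)
    s = zero(eltype(M))
    for j in axes(M, 2), i in axes(M, 1)
        s += M[i, j]
    end
    return s
end

colmajor_sum(A)  # 10
```

For large arrays, looping with the row index innermost is usually the faster order because of this memory layout.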

It is also simple to create a vector using square brackets with comma-separated entries, and Julia treats a 1D vector as a column vector.

In [9]:
x = [1, 2]

2-element Vector{Int64}:
 1
 2
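The difference between commas and spaces inside the brackets is easy to miss, so here is a short sketch contrasting the two (the variable `r` is introduced only for this illustration):

```julia
using LinearAlgebra

x = [1, 2]        # commas: a 2-element Vector, treated as a column
r = [1 2]         # spaces: a 1×2 Matrix, i.e. a row

size(x)           # (2,)
size(r)           # (1, 2)

# Row times column is a scalar (the inner product).
x' * x            # 5

# Column times row is the 2×2 outer product.
x * x'            # [1 2; 2 4]
```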

There are built-in basic linear algebraic operations, such as

In [10]:
inv(A)

2×2 Matrix{Float64}:
 -2.0   1.0
  1.5  -0.5

In [11]:
A * x

2-element Vector{Int64}:
  5
 11

In [12]:
A \ x

2-element Vector{Float64}:
 0.0
 0.5

In [13]:
det(A)

-2.0
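As a quick consistency check of the operations above, the sketch below verifies that `A \ x` solves the linear system `A * y = x` and agrees with multiplying by the explicit inverse. It reuses the small `A` and `x` defined earlier in this section.

```julia
using LinearAlgebra

A = [1 2; 3 4]
x = [1, 2]

# Solve A * y = x directly, without forming inv(A).
y = A \ x

A * y ≈ x                                 # true
y ≈ inv(A) * x                            # true
inv(A) * A ≈ Matrix{Float64}(I, 2, 2)     # true, up to floating-point rounding
```

The comparisons use `≈` (`isapprox`) rather than `==` because the results are floating-point numbers.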

## Just-in-time (JIT) compilation

Julia code is just-in-time compiled, i.e., every statement runs using compiled functions, which are either compiled right before they are first used or reused from a cached compilation. This is different from, e.g., C, which is compiled ahead of time, before the program is executed. Because of JIT compilation, the first run of a function or operation typically takes longer, as it needs to be compiled first. An example is shown below.

In [14]:
A = randn(Float32, 10^4, 10^4);
b = randn(Float32, 10^4);

In [15]:
# first run
@time A * b;

  0.077812 seconds (290.82 k allocations: 16.171 MiB, 80.19% compilation time)


In [16]:
# second run
@time A * b;

  0.017405 seconds (2 allocations: 39.109 KiB)
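The same first-call overhead applies to user-defined functions, and it is paid again for each new combination of argument types. The sketch below uses a hypothetical helper `sumsq`; the timings depend on the machine, so no expected values are given.

```julia
# Hypothetical helper, used only for this illustration.
sumsq(v) = sum(abs2, v)

v = randn(Float32, 10^6)
w = randn(Float64, 10^6)

t_first  = @elapsed sumsq(v)   # includes compiling sumsq for Vector{Float32}
t_second = @elapsed sumsq(v)   # reuses the already compiled code

# A new argument type triggers a new specialization, so this call
# includes compilation time again.
t_newtype = @elapsed sumsq(w)

println((t_first, t_second, t_newtype))
```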


The first execution of the Float32 matrix-vector product `A * b` takes more time and more memory allocations, because this is the first time such an operation runs in the current notebook and it has to be compiled. To get a robust estimate of the execution time and memory usage, we can use the `@benchmark` macro from the [BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl.git) package.

In [17]:
@benchmark A * b

BenchmarkTools.Trial: 174 samples with 1 evaluation.
 Range (min … max):  19.071 ms … 39.675 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     28.190 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   28.708 ms ±  2.208 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

## Multiple dispatch

A central design choice that makes Julia fast is type stability through specialization via multiple dispatch. A single function in Julia can take different types of input, and Julia compiles a specialized version for each combination of input types. This specialization brings large performance gains. Let's first go over a simple example, namely scalar multiplication, and use the macro `@which` to check which method Julia selects for the given inputs (the related macro `@code_llvm` prints the LLVM intermediate representation, a portable assembly-like language, that the call is compiled to).

In [18]:
@which 1 * 2

In this example, we want to compute the product of 2 integers. Julia dispatches this call to a method of `*` that is specialized for integer inputs.

In [19]:
@which 1.0 * 2.0

If we multiply 2 floating-point numbers, a different (specialized) method is used, one designed specifically for the type `Float64` (the equivalent of `double` in C).

In [20]:
@which 1.0 * 2

From here, we can see that Julia has no method of `*` specialized for a `Float64` and an `Int64`. Instead, it falls back on the generic method defined for two `Number` arguments, which promotes `1.0` and `2` to a common type before performing the multiplication. Mixing types like this is one of the cases we want to avoid during large-scale computations: we usually want to make sure that the types of the inputs and outputs of our functions are consistent.
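The method lookup discussed here can also be done programmatically. As a small sketch, `which` from Base is the function form of `@which` and takes the argument types as a tuple; the variable names below are introduced only for this illustration.

```julia
# Look up the method that would be called for each combination of argument types.
m_int   = which(*, (Int, Int))
m_float = which(*, (Float64, Float64))
m_mixed = which(*, (Float64, Int))

m_int == m_float          # false: two different methods
m_mixed == m_float        # false: the mixed case uses yet another method

# Promotion converts both arguments to a common type.
promote(1.0, 2)           # (1.0, 2.0)
typeof(1.0 * 2)           # Float64
```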

This example demonstrates that the same function (multiplication `*` in this case) is executed via different compiled code blocks that are specialized for different kinds of input. This feature brings large performance gains and can make Julia run almost as fast as C/Fortran (which are statically typed). With that being said, we should also take advantage of this specialization when implementing our own functions, i.e., dispatch a function on multiple types of input.
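To make the notion of type stability concrete, here is a minimal sketch with two hypothetical functions, `unstable_relu` and `stable_relu`, which are not part of the original tutorial. It uses `Base.return_types`, an internal reflection helper, and `@inferred` from the `Test` standard library to inspect what the compiler can infer.

```julia
using Test

# Type-unstable: for a Float64 input this returns either a Float64 (x)
# or an Int (the literal 0), depending on the value of x.
unstable_relu(x) = x > 0 ? x : 0

# Type-stable: zero(x) has the same type as x, so the return type
# depends only on the input type.
stable_relu(x) = x > 0 ? x : zero(x)

# Inferred return types for a Float64 argument.
Base.return_types(unstable_relu, (Float64,))   # a Union of Float64 and Int64
Base.return_types(stable_relu, (Float64,))     # Float64 only

# `@inferred` returns the value when the inferred return type matches
# the actual one, and throws an error otherwise.
@inferred stable_relu(-1.5)    # 0.0
```

Calling `@inferred unstable_relu(-1.5)` would throw instead, which makes `@inferred` a convenient check to include in a test suite.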

Let's show a simple example: we can define a `foo` function that works for different kinds of input. Let's first build it to take `String` inputs.

In [21]:
foo(x::String, y::String) = println("My inputs x and y are both strings!")

foo (generic function with 1 method)

In [22]:
foo("hello", "hi!")

My inputs x and y are both strings!


Notice that `foo` is now a generic function with 1 method (i.e., one that takes 2 `String`s and prints a sentence). At this point, `foo` won't work for numbers.

In [23]:
foo(3, 4) # doesn't work here

LoadError: MethodError: no method matching foo(::Int64, ::Int64)

To get `foo` to work on integer (Int) inputs, let's add `::Int` onto our input arguments when we declare `foo`.

In [24]:
foo(x::Int, y::Int) = println("My inputs x and y are both integers!")

foo (generic function with 2 methods)

In [25]:
foo(3, 4) # works now

My inputs x and y are both integers!


Now `foo(::Int, ::Int)` works as well. Notice that it says `foo (generic function with 2 methods)`. This means `foo` is a generic function that contains 2 methods: the first one works for 2 strings and the second one works for 2 integers. In other words, the behavior of `foo` on 2 strings isn't overwritten when we define how to run `foo` on another type of input. Instead, we just added another method to the generic function `foo`.

In [26]:
foo("hello", "hi!")

My inputs x and y are both strings!
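The way methods accumulate on one generic function can be inspected with `methods` and `applicable` from Base. The sketch below uses a hypothetical function `describe` that mirrors `foo` but returns strings instead of printing, and adds an untyped fallback method as one possible way to handle all remaining type combinations.

```julia
# Hypothetical function mirroring `foo`.
describe(x::String, y::String) = "two strings"
describe(x::Int, y::Int)       = "two integers"

# So far, only type combinations that have a method are accepted.
applicable(describe, 3, "hi")      # false: there is no (Int, String) method

# An untyped method acts as a fallback for every other combination.
describe(x, y) = "something else"

describe("a", "b")          # "two strings" (the most specific method wins)
describe(1, 2)              # "two integers"
describe(1, "b")            # "something else"
length(methods(describe))   # 3
```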


Next, let's look at an illustrative example of how we should write code in Julia. Suppose we want to build a function that computes the 2-norm of a scalar/vector/matrix (while we choose to ignore the existing norm functionality). A naive way to implement this would be

In [27]:
function my2norm(x)
    if isa(x, Number)
        return abs(x)
    elseif isa(x, Vector)
        return sqrt(sum(x.^2))
    elseif isa(x, Matrix)
        return maximum(svdvals(x))
    else
        println("my function is not defined for this kind of input")
        return -1
    end
end

my2norm (generic function with 1 method)

In the above code block, we defined a function `my2norm` that uses an if-elseif chain to find the type of `x` and perform the corresponding operation (absolute value for a scalar, Euclidean norm for a vector, and largest singular value for a matrix).

However, there is a smarter and cleaner way to write this function, which provides exactly the same functionality.

In [28]:
mysmart2norm(x::Number) = abs(x)
mysmart2norm(x::Vector) = sqrt(sum(x.^2))
mysmart2norm(x::Matrix) = maximum(svdvals(x))
mysmart2norm(x) = begin println("my function is not defined for this kind of input"); return -1; end

mysmart2norm (generic function with 4 methods)

In [29]:
x = randn(10^3, 10^3);
@test my2norm(x) == mysmart2norm(x)

Test Passed
  Expression: my2norm(x) == mysmart2norm(x)
   Evaluated: 62.88199067068611 == 62.88199067068611
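As a further sanity check, the hand-written methods can be compared against the built-in functions in `LinearAlgebra`. This is a sketch that repeats the three typed methods so it runs on its own; note that for a matrix the matching built-in is `opnorm`, since `norm` of a matrix is the Frobenius norm.

```julia
using LinearAlgebra, Test

# Methods from the cell above, repeated so this block is self-contained.
mysmart2norm(x::Number) = abs(x)
mysmart2norm(x::Vector) = sqrt(sum(x.^2))
mysmart2norm(x::Matrix) = maximum(svdvals(x))

v = randn(100)
M = randn(50, 50)

@test mysmart2norm(-3.0) == 3.0
@test mysmart2norm(v) ≈ norm(v)       # vector 2-norm
@test mysmart2norm(M) ≈ opnorm(M)     # operator 2-norm: largest singular value
```

The comparisons use `≈` because the built-in routines may accumulate floating-point rounding differently from the hand-written versions.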

This `mysmart2norm` leads to cleaner and more maintainable code: when a new type comes along (e.g. an abstract type defined by the user), we don't need to dig into the very long `my2norm` to add another `elseif` branch. A simple `mysmart2norm(x::NewType)` will do the work. For example, check [here](https://github.com/slimgroup/JUDI.jl/blob/master/src/TimeModeling/Types/abstract.jl#L99) to see how [JUDI.jl](https://github.com/slimgroup/JUDI.jl) defines `norm` for `judiMultiSourceVector`, an abstract type defined for seismic data.
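To illustrate adding a method for a new type, here is a minimal sketch with a hypothetical user-defined type `Point2D` (it is not part of the tutorial or of JUDI.jl). The existing methods are repeated so the block runs on its own; none of them needs to be changed.

```julia
using LinearAlgebra

# Existing methods, repeated so this block is self-contained.
mysmart2norm(x::Number) = abs(x)
mysmart2norm(x::Vector) = sqrt(sum(x.^2))
mysmart2norm(x::Matrix) = maximum(svdvals(x))

# Hypothetical user-defined type.
struct Point2D
    x::Float64
    y::Float64
end

# Supporting the new type only requires one additional method.
mysmart2norm(p::Point2D) = sqrt(p.x^2 + p.y^2)

mysmart2norm(Point2D(3.0, 4.0))   # 5.0
```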