In [None]:
# # Check multithreading config:
Base.Threads.nthreads()

In [None]:
# # Instantiate package environment for this notebook
# using Pkg; pkg"instantiate"

In [None]:
# # Check active package versions:
# using Pkg; pkg"status"

<h1 style="text-align: center;">
    <span style="display: block; text-align: center;">
        Introduction to
    </span>
    <span style="display: block; text-align: center;">
        <img alt="Julia" src="images/logos/julia-logo.svg" style="height: 2em; display: inline-block; margin: 1em;"/>
    </span>
</h1>

<div style="text-align: center;">
    <p style="text-align: center; display: inline-block; vertical-align: middle;">
        Oliver Schulz<br>
        <small>
            Max Planck Institute for Physics <br/>
            <a href="mailto:oschulz@mpp.mpg.de" target="_blank">oschulz@mpp.mpg.de</a>
        </small>
    </p>
    <p style="text-align: center; display: inline-block; vertical-align: middle;">
        <img src="images/logos/mpg-logo.svg" style="height: 5em; display: inline-block; vertical-align: middle; margin: 1em;"/>
        <img src="images/logos/mpp-logo.svg" style="height: 5em; display: inline-block; vertical-align: middle; margin: 1em;"/>
    </p>
</div>

<p style="text-align: center;">
     November 2021
</p>

## Course material

**Git-clone or download the course material:**

## https://github.com/oschulz/julia-course

## Why Julia?

### Science needs code - but how to write it?

* Choice of programming language(s) matters!

* Need to balance:
    * Learning time
    * Productivity
    * Performance

* Usually involves compromises

### Programming Language Options

* C++:
    * Pro: Very fast (in expert hands)
    * Pro: Really cool new concepts (even literally) in C++11/14/17/...
    * Con: Complex, takes long time to learn and much longer to master
    * Con: Straightforward tasks often result in lengthy code
    * Con: No automatic memory management or memory safety (segmentation faults)  
    * Con: No universal package management
    * Con: Composability isn't great

### Programming Language Options

* Python:
    * Pro: Broad user base, popular first programming language
    * Pro: Easy to learn, good standard library
    * Con: Can't write time-critical loops in Python,  
      workarounds like Numba/Cython have
      [many limitations](http://www.stochasticlifestyle.com/why-numba-and-cython-are-not-substitutes-for-julia/),  
      don't compose well
    * Con: Language itself fairly primitive, not very expressive
    * Con: Duck-Typing necessitates lots of test code
    * Con: No effective multi-threading
    * Con: Composability isn't great

### What else is there?

* Fortran:
    * Pro: Math can be really fast
    * Con: Old language, few modern concepts
    * Con: Shrinking user base
    * Con: Composability isn't great
    * Do you *really* want to ...?


* Scala, Go, Kotlin etc.:
    * Pro: Lots of individual strengths
    * Con: Math either fast *or* generic *or* complicated
    * Con: Calling C, Fortran or Python code often difficult
    * Con: Composability isn't great

### The 97 and the 3 Percent

> We should forget about small efficiencies, say about 97% of the time: *premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%*.

Donald E. Knuth

* Some programming languages (e.g. Python) are great for the 97% - but can't make the 3% fast.
* Some other languages (e.g. C/C++, Fortran) can handle the 3% - but make the 97% complicated.

### The Two-language Problem

* Common approach nowadays:  
  Write time critical code in C/C++, rest in Python

* Pro: End-user can code comfortably in Python, with good performance

* Con: Complexity of C/C++ **plus** complexity of Python

* Con: Need proficiency in **two** languages, barrier that prevents  
  non-expert users from contributing to important parts of code

* Con: Limits generic implementation of algorithms

* Con: Severely limits metaprogramming, automatic differentiation, etc.

## The Expression Problem

> The expression problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).

Philip Wadler

* In other words: The capability to add both new subtypes and new functionality for a type defined in a package you don't own
* Object oriented languages typically can't do this  
  (Ruby has a dirty way, Scala a clean workaround)
* If you have programming experience, you have felt this, even if you didn't name it
* Result: Packages tend not to compose well
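A minimal sketch of how this plays out in Julia, assuming a hypothetical `Shape` hierarchy. In practice the first part would live in someone else's package and the rest in your own code (extending `area` from another module would then be written as `ThatPackage.area(...) = ...`, where `ThatPackage` is a placeholder name):

```julia
# Pretend this part lives in a package you don't own:
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

area(s::Circle) = π * s.r^2

# Your own code adds a new case (a new subtype) ...
struct Square <: Shape
    a::Float64
end

area(s::Square) = s.a^2

# ... and a new function over all cases, without touching the code above:
perimeter(s::Circle) = 2π * s.r
perimeter(s::Square) = 4 * s.a

# Generic code written against `area` works with old and new types alike:
total_area(shapes) = sum(area, shapes)

total_area([Circle(1.0), Square(2.0)])
```

Both directions of extension (new types, new functions) are plain method definitions, which is why multiple dispatch is usually credited with solving the problem.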


### We were looking for a language ...

* as fast as C/C++/Fortran
* as easy to learn and productive as Python
* with a solution for the expression problem
* with first class math support (vectors, matrices, etc.)
* with true functional programming
* with great Fortran/C/C++/Python integration
* with true metaprogramming (like Lisp or Scala)
* good at parallel and distributed programming
* suitable for interactive, small and large applications

### Julia

* Designed for scientific/technical computing
* Originated at MIT, first public version 2012
* Covers the whole wish-list
* Clear focus on user productivity and software quality
* Rapid growth of user base and software packages
* Current version: Julia v1.7

### Julia Language Properties

* Fast: Just-ahead-of-time (JAOT) compilation to native CPU and GPU code
* Multiple-dispatch (more powerful than object-oriented):  
  solves the expression problem
* Dynamically typed
* Very powerful type system, types are first-class values
* Functional programming and metaprogramming
* First-class math support (like Fortran or Matlab)
* ...

### Julia Language Properties, cont.

* ...
* Local and distributed code execution
* State-of-the-art multi-threading: parallel code  
  can call parallel code that can call parallel code, ...,  
  without oversubscribing threads
* Software package management:  
  Trivial to create and install packages
* Excellent REPL (console)
* Easy to call Fortran, C/C++ and Python code

### Julia large-scale use case examples

* Celeste: Variational Bayesian inference for astronomical images (doi:10.1214/19-AOAS1258), 1.54 petaflops using 1.3 million threads on 9,300 Knights Landing (KNL) nodes on Cori at NERSC

* Clima: Full-earth climate simulation, https://clima.caltech.edu, large team, uses everything from MPI to GPUs

* ...


### When (not) to use Julia

* *Do* use Julia for computations, visualization, data processing ... pretty much anything scientific/technical

* *Do not* use Julia for scripts that will only run for a second (code gen overhead), use Python or shell scripts

* *Do not* use Julia for non-computing web apps, etc. (*at least not yet*), use Go or Node.js

## Julia 101

### Verbs and nouns - functions and types

* Julia is not Java: Verbs aren't owned by nouns

* Julia has: types, functions and methods

* Methods belong to *functions*, not to types!
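A small sketch of what "methods belong to functions" means in practice. The types `Rock` and `Paper` and the function `beats` are made up for illustration:

```julia
struct Rock end
struct Paper end

# One function with several methods. None of them is "owned" by Rock or
# by Paper; which one runs depends on the types of *all* arguments.
beats(::Paper, ::Rock) = true
beats(::Rock, ::Paper) = false
beats(a, b) = false              # generic fallback for any other combination

beats(Paper(), Rock())           # true
beats(Rock(), Rock())            # false, via the fallback
length(methods(beats))           # 3
```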

### Functions

Short one-liner function:

In [None]:
f(x) = x^2

In [None]:
f(3)

Function that needs more than one line:

In [None]:
function f(x)
    # ... something ...
    x^2
end

is equivalent to

In [None]:
function f(x)
    # ... something ...
    return x^2
end

**Note:** `return` is optional, and often not used explicitly. Last expression in a function, block, etc. is automatically returned (like in Mathematica).

### Types

An abstract type, must be empty:

```julia
abstract type MySuperType end
```

An immutable type, value of `i` can't change:

```julia
struct MySubType <: MySuperType
    i::Int
end
```

A mutable type, value of `i` can change:

```julia
mutable struct MyMutableSubType <: MySuperType
    i::Int
end
```
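A short sketch of the difference in use, with hypothetical counter types following the same pattern as above:

```julia
abstract type AbstractCounter end

struct FrozenCounter <: AbstractCounter
    n::Int
end

mutable struct LiveCounter <: AbstractCounter
    n::Int
end

live = LiveCounter(1)
live.n += 1                  # fine: fields of a mutable struct can be reassigned

frozen = FrozenCounter(1)
# frozen.n += 1              # would throw an error: immutable struct
bumped = FrozenCounter(frozen.n + 1)   # create a new value instead

(live.n, bumped.n)           # (2, 2)
```

Immutable structs are often the better default: the compiler can usually store them inline and without heap allocation.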

### Type parameters

Julia has a powerful parametric type system, based on set theory:

```julia
struct MyRealArray{T<:Real,N} <: AbstractArray{T,N}
    # ...
end
```

defines an array type with real-valued elements.

```julia
foo(A::AbstractArray{<:Real}) = do_something_with(A)
```

is equivalent to

```julia
bar(A::AbstractArray{T,N}) where {T<:Real,N} = do_something_with(A)
```

defines a function covariant in the element type of `A`. Can also be contravariant:

```julia
baz(A::AbstractArray{>:Real}) = do_something_with(A)
```
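The reason for writing `<:Real` in these signatures: Julia's parametric types are invariant, so `Vector{Int}` is not a subtype of `AbstractVector{Real}`. A small sketch, using a hypothetical function `sum_sq`:

```julia
sum_sq(A::AbstractArray{<:Real}) = sum(abs2, A)

Vector{Int} <: AbstractVector{<:Real}    # true
Vector{Int} <: AbstractVector{Real}      # false: type parameters are invariant
Vector{Any} <: AbstractVector{>:Real}    # true: Any is a supertype of Real

sum_sq([1, 2, 3])                        # 14
applicable(sum_sq, ["a", "b"])           # false: no method for string arrays
```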

### Type aliases and union types

Type aliases are just const values:

```julia
const Abstract2DArray{T} = AbstractArray{T,2}
rand(2, 2) isa Abstract2DArray  # true
```

Union types are unions of sets of types:

```julia
const RealVecOrMat{T<:Real} = Union{AbstractArray{T,1}, AbstractArray{T,2}}
```

is the union of the 1D and 2D array types with real-valued elements.
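Such aliases can be used directly for dispatch. A minimal sketch, with a separately named alias and a hypothetical function `describe`:

```julia
const RealVecOrMatSketch{T<:Real} = Union{AbstractArray{T,1}, AbstractArray{T,2}}

describe(::RealVecOrMatSketch) = "real vector or matrix"
describe(::Any) = "something else"

describe([1, 2, 3])         # "real vector or matrix"
describe(rand(2, 2))        # "real vector or matrix"
describe(rand(2, 2, 2))     # "something else": 3D array
describe(["a", "b"])        # "something else": elements are not real numbers
```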

### Syntax: Variables

```julia
# Global variables:
const a = 42
b = 24

function foo(x)
    # Local variables:

    c = a * x
    d = b * x # Avoid, type of b can change!
    #...
end
```
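A sketch of why the non-const global should be avoided, using hypothetical names. `Base.return_types` is an introspection helper (not an exported function) that shows what the compiler can infer:

```julia
const factor_const = 42      # const global: its type is known to the compiler
factor_plain = 24            # plain global: may be rebound to any type later

scaled_const(x) = factor_const * x
scaled_plain(x) = factor_plain * x      # compiler has to assume `Any` here
scaled_arg(x, factor) = factor * x      # usually the best option: pass it in

# Inferred return types for Int arguments:
Base.return_types(scaled_const, (Int,))      # Int
Base.return_types(scaled_plain, (Int,))      # Any
Base.return_types(scaled_arg, (Int, Int))    # Int
```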

### Loops

For loop:

```julia
for i in something_iterable
    # ...
end
```

`something_iterable` can be a range, an array, anything that implements the Julia [iterator API](https://docs.julialang.org/en/v1/manual/interfaces/).

While loop:

```julia
while condition
    # do something
end
```

### Control flow

If-else, evaluate only one branch:

```julia
if condition
    # do something
elseif condition
    # do something else
else
    # or something different
end
```

Ternary operator, evaluate only one branch:

```julia
condition ? result_if_true : result_if_false
```

`ifelse`, evaluate both results but return only one:

```julia
ifelse(condition, result_if_true, result_if_false)
```
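The difference in evaluation can be made visible with a helper that records when it is called (`pick` and `evaluated` are hypothetical names for this sketch):

```julia
const evaluated = String[]

# Records its label, then returns the value:
pick(label, value) = (push!(evaluated, label); value)

true ? pick("a", 1) : pick("b", 2)           # only "a" is evaluated
ifelse(true, pick("c", 3), pick("d", 4))     # "c" and "d" are both evaluated

evaluated                                    # ["a", "c", "d"]
```

Since `ifelse` is an ordinary function without branching, it can be the better choice in tight, vectorizable loops when both results are cheap to compute.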

## Blocks and scoping

Begin/end-block:

```julia
begin
    # *Not* a new scope in here
    # ...
end
```

Let-block:

```julia
b = 24

let my_b = b
    # New scope in here.
    # If b is bound to new value, my_b won't change.
    # ...
end
```
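A small sketch of both behaviours, with hypothetical variable names:

```julia
setting = 24

get_setting = let captured = setting
    () -> captured        # closure sees `captured`, not the global `setting`
end

setting = 100             # rebinding the global afterwards ...
get_setting()             # ... does not affect the captured value: still 24

begin
    from_block = 1        # no new scope: this is an ordinary global variable
end
from_block                # 1
```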

## Arrays

Vectors:

```julia
v = [1, 2, 3]

v = rand(5)
```

Matrices:

```julia
A = [1 2; 3 4]

A = rand(4, 5)
```

* Column-major memory layout (first index varies fastest)!

* Almost anything array-like is subtype of `AbstractArray`.

### Array indexing

Get `i`-th element of vector `v`:

```julia
v[i]
```

Most higher-dimensional array types support both cartesian and linear indexing (linear indexing is usually faster for plain `Array`s):

```julia
A[i, j]
A[lin_idx] 
```

Use `eachindex(A)` to get indices of best type for given `A` (usually linear).


In Julia, anything array-like can usually be an index as well:

```julia
A[2:3, [1, 4, 5]]
```
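The indexing styles above, applied to a small concrete matrix (the names `M` and `sum_all` are just for this sketch):

```julia
M = collect(reshape(1:12, 3, 4))   # 3×4 Matrix{Int}, filled column by column

M[2, 3]           # cartesian: row 2, column 3, gives 8
M[8]              # linear: same element, counting down the columns
M[2:3, [1, 4]]    # range and vector as indices, gives [2 11; 3 12]

# Generic loop over all elements, using the index type that suits `A` best:
function sum_all(A)
    s = zero(eltype(A))
    for i in eachindex(A)
        s += A[i]
    end
    return s
end

sum_all(M)        # 78
```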

### Array comprehension and generators

Returns an array:

```julia
[f(x) for x in some_collection]
```

Returns an iterable generator:

```julia
(f(x) for x in some_collection)
```
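A concrete sketch of both forms:

```julia
squares = [x^2 for x in 1:5]            # Vector{Int}: [1, 4, 9, 16, 25]

lazy_squares = (x^2 for x in 1:5)       # Base.Generator, nothing computed yet
sum(lazy_squares)                       # 55, without allocating an array

evens = [x for x in 1:10 if iseven(x)]  # with a filter: [2, 4, 6, 8, 10]
```

Generators are handy as arguments to reductions like `sum` or `maximum`, since no intermediate array is created.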

### Hello World (and more) in Julia

In [None]:
println("Hello, World!")

Let's define a function

In [None]:
f(x, y) = x * y
f(20, 2.1)

Multiplication is also defined for vectors, so this works, too:

In [None]:
f(4.2, [1, 2, 3, 4])

### Let's Look Under the Hood

In [None]:
@code_llvm debuginfo=:none f(20, 2.1)

In [None]:
@code_native debuginfo=:none f(20, 2.1)

### Multiple Dispatch

In [None]:
foo(x::Integer, y::Number) = x * y
foo(x::Integer, y::AbstractString) = join(fill(y, x))

In [None]:
foo(3, 4)

In [None]:
foo(3, "abc")

In [None]:
foo(4.5, 3)

### Functional Programming

In [None]:
A = rand(10)

In [None]:
idxs = findall(x -> 0.2 < x < 0.6, A)

In [None]:
A[idxs]

Even types are first-class values:

In [None]:
mytype = Number

In [None]:
subtypes(mytype)

Julia type hierarchy extends all the way down to primitive types:

In [None]:
Float64 <: AbstractFloat <: Real <: Number <: Any
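Since types are ordinary values, the hierarchy can be walked programmatically. A minimal sketch with a hypothetical helper `supertype_chain`:

```julia
# Collect a type and all of its supertypes, up to `Any`:
function supertype_chain(T::Type)
    chain = Type[T]
    while T !== Any
        T = supertype(T)
        push!(chain, T)
    end
    return chain
end

supertype_chain(Float64)   # [Float64, AbstractFloat, Real, Number, Any]
```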

### Broadcasting

In [None]:
A = [1.1, 2.2, 3.3]
B = [4.4, 5.5, 6.6]
broadcast((x, y) -> (x + y)^2, A, B)

Shorter broadcast syntax:

In [None]:
(A .+ B) .^ 2

#### Loop Fusion and SIMD Vectorization

In [None]:
foo(X, Y) = (X .+ Y) .^ 2
@code_llvm raw=false debuginfo=:none foo(A, B)

In [None]:
@code_native debuginfo=:none foo(A, B)
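Loop fusion means that all dotted operations in one expression are combined into a single loop, without temporary arrays. A sketch that also writes the result in place into a preallocated output (the names `X`, `Y`, `out` and `fused!` are made up here):

```julia
X = [1.1, 2.2, 3.3]
Y = [4.4, 5.5, 6.6]
out = similar(X)

# One fused loop, result stored directly in `out`:
out .= (X .+ Y) .^ 2

# The `@.` macro adds the dots to every operation and to the assignment:
@. out = (X + Y)^2

# Same thing as a reusable in-place function:
fused!(out, X, Y) = (out .= (X .+ Y) .^ 2; out)
fused!(out, X, Y)
```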

### Package management

* Julia probably has the best package management to date

* Press "]" to enter package management console

* Typically `add PACKAGE_NAME` is sufficient, can also do `add PACKAGE_NAME@VERSION`

* To get an unreleased version, use `add PACKAGE_NAME#BRANCH_NAME`

* Easy to start modifying a package via `dev PACKAGE_NAME`

* Multiple package versions can be installed, selection via [Pkg.jl environments](https://julialang.github.io/Pkg.jl/v1/environments).

* Also useful: `julia> using Pkg; pkg"<Pkg console command>"`

### Package creation

* A Julia package needs:

    * A "Project.toml" file
    * A "src/PackageName.jl" file

* That's it: Push to GitHub, and package is installable via `add PACKAGE_URL`

* Use [Documenter.jl](https://github.com/JuliaDocs/Documenter.jl) to document your package

* To enable `add PACKAGE_NAME`, package must be [registered](https://github.com/JuliaRegistries/Registrator.jl), there are [some rules](https://github.com/JuliaRegistries/RegistryCI.jl#automatic-merging-guidelines)

* Use [PkgTemplates.jl](https://github.com/invenia/PkgTemplates.jl) to generate new package with CI config (Travis, Appveyor, ...), docs generation, etc.
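For a first look at the minimal layout, `Pkg.generate` creates a bare-bones package skeleton. A sketch that generates one in a temporary directory, with the hypothetical package name `MyExamplePkg`:

```julia
using Pkg

pkg_path = joinpath(mktempdir(), "MyExamplePkg")   # hypothetical package name
Pkg.generate(pkg_path)

# The two required files are now in place:
isfile(joinpath(pkg_path, "Project.toml"))          # true
isfile(joinpath(pkg_path, "src", "MyExamplePkg.jl"))  # true
```

For anything beyond a quick experiment, PkgTemplates.jl is the more complete option, since it also sets up tests, CI and docs.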

### No free lunch

* Package loading and code-gen can sometimes take a long time, 
  but mitigations available:

* [Revise.jl](https://github.com/timholy/Revise.jl): Hot code-reloading at runtime

* [PackageCompiler.jl](https://github.com/JuliaLang/PackageCompiler.jl): Ahead-of-time compilation, producing custom Julia system images

### Performance tips

* Read the [official Julia performance tips](https://docs.julialang.org/en/v1/manual/performance-tips/)!
* Do *not* access (non-const) global variables from time-critical code
* Type-stable code is fast code. Use [`@code_warntype`](https://docs.julialang.org/en/v1/manual/performance-tips/#man-code-warntype-1) and [`Test.@inferred`](https://docs.julialang.org/en/v1/stdlib/Test/#Test.@inferred) to check!
* In some situations, closures [can be troublesome](https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured-1), using `let` can help the compiler
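A small sketch of what type instability looks like and how `Test.@inferred` catches it. The two `relu_*` functions are hypothetical examples:

```julia
using Test

relu_unstable(x) = x > 0 ? x : 0          # returns typeof(x) or Int
relu_stable(x) = x > 0 ? x : zero(x)      # always returns typeof(x)

@inferred relu_stable(-1.5)               # passes and returns 0.0

# For a Float64 argument the unstable version is inferred as
# Union{Float64, Int64}, so `@inferred` throws an error:
unstable_detected = try
    @inferred relu_unstable(-1.5)
    false
catch
    true
end
```

Using `zero(x)` (or `one(x)`, `oftype(x, ...)`) instead of literal constants is a common way to keep generic code type-stable.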

This is efficient (not runtime reflection):

In [None]:
half_dynrange(T::Type{<:Number}) = (Int(typemax(T)) - Int(typemin(T))) / 2
half_dynrange(Int16)

In [None]:
@code_llvm half_dynrange(Int16)

### SIMD

Demo

### Shared-memory parallelism

* Julia has native multithreading support

* Simple cases: Use `@threads` macro

* Since Julia v1.3: Cache-efficient [composable multi-threaded parallelism](https://julialang.org/blog/2019/07/multithreading/)
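A minimal sketch of both styles, assuming Julia was started with more than one thread (e.g. `julia -t 4`); with a single thread the code still runs, just serially. The function names are made up for illustration:

```julia
using Base.Threads

# Simple case: parallel loop, each iteration writes to its own slot.
function threaded_squares(xs)
    out = similar(xs)
    @threads for i in eachindex(xs)
        out[i] = xs[i]^2
    end
    return out
end

# Composable style: spawn a task for one half, compute the other half here.
function spawn_sum(xs)
    mid = length(xs) ÷ 2
    left = Threads.@spawn sum(@view xs[1:mid])
    right = sum(@view xs[mid+1:end])
    return fetch(left) + right
end

threaded_squares([1.0, 2.0, 3.0])
spawn_sum(collect(1:100))
```

Functions using `@spawn` can be called from other threaded code; the scheduler distributes the tasks over the available threads.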

### Processes, Clusters, MPI

* Julia brings a full API for remote processes and compute clusters

* Native support for local processes and remote processes via SSH

* MPI support via [MPI.jl](https://github.com/JuliaParallel/MPI.jl) and [MPIClusterManagers.jl](https://github.com/JuliaParallel/MPIClusterManagers.jl)
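A minimal sketch using the `Distributed` standard library. The `addprocs` call is commented out so that the example also runs in a single process; `slow_square` is a hypothetical stand-in for real work:

```julia
using Distributed

# addprocs(2)                        # optionally start two local worker processes

# Make the function known on all processes:
@everywhere slow_square(x) = x^2

# Distribute the calls over the available processes and collect the results:
pmap(slow_square, 1:4)               # [1, 4, 9, 16]
```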

## Benchmarking and profiling, digging deeper

Demo

## Docs and help

* [Official Julia docs](https://docs.julialang.org/en/v1/)

* [Julia Cheat Sheet](https://juliadocs.github.io/Julia-Cheat-Sheet/)

* [ThinkJulia](https://benlauwens.github.io/ThinkJulia.jl/latest/book.html)

* https://julialang.org/learning/

* [Julia Discourse](https://discourse.julialang.org/)

* [Julia Slack](https://slackinvite.julialang.org/)

* [Julia Gitter](https://gitter.im/JuliaLang/julia)

* [Julia on Youtube](https://www.youtube.com/user/JuliaLanguage)

* [JuliaCon 2020](https://juliacon.org/)

## Statistics

Demo

## Visualization/Plotting: Plots, Makie, plotting recipes

#### Let's Make a Plot

In [None]:
using Plots
xs = -π:0.01:π   # avoid naming this `range`, which would shadow `Base.range`
plot(xs, sin.(xs) + rand(length(xs)))

#### Histograms are easy, too

In [None]:
using Distributions
dist = Normal(0.0, 5.0)

In [None]:
stephist(rand(dist, 10000))

## Automatic differentiation

Let's define a simple neural network layer and loss function and auto-differentiate through it.

In [None]:
struct DenseLayer{M<:AbstractMatrix{<:Real},V<:AbstractVector{<:Real},F<:Function} <: Function
    A::M
    b::V
    f::F
end 

(l::DenseLayer)(x::AbstractVector{<:Real}) = (l.f).(l.A * x + l.b)

f_loss(y) = sum(y .^ 2);

In [None]:
mylayer = DenseLayer(rand(5,5), rand(5), x -> ifelse(x > zero(x), x, zero(x)));

In [None]:
x = rand(5)
mylayer(x)

In [None]:
f_loss(mylayer(x))

In [None]:
using Zygote
g = Zygote.gradient((mylayer, x) -> f_loss(mylayer(x)), mylayer, x)
g[1].A

In [None]:
g[1].b

## Calling code written in other languages, REPL modes

### Shell REPL mode

* The Julia shell is multi-language

* Press ";" for a system shell

### PyCall, RCall

Calling Python from Julia is easy, can even use inline Python code:

```julia
using PyCall
numpy = pyimport("numpy")
numpy.zeros(5) isa Array

A = rand(5)
py"""type($A)""" isa PyObject
```
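Calling C needs no extra package at all. A minimal sketch, assuming a platform whose C runtime exposes `strlen`:

```julia
# Call the C standard library's `strlen` directly:
n = ccall(:strlen, Csize_t, (Cstring,), "Hello, C!")

Int(n)   # 9
```

The arguments to `ccall` are the function name, the return type, a tuple of argument types, and then the actual arguments.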

## An incomplete tour of the Julia package ecosystem

### Math

* [ApproxFun.jl](https://github.com/JuliaApproximation/ApproxFun.jl): Powerful function approximations

* [FFTW.jl](https://github.com/JuliaMath/FFTW.jl): Fast Fourier transforms via [FFTW](http://www.fftw.org/)

* [DifferentialEquations.jl](https://github.com/JuliaDiffEq/DifferentialEquations.jl): A suite for numerically solving differential equations

* ...

### Optimization

* [JuMP.jl](https://github.com/JuliaOpt/JuMP.jl): Modeling language for Mathematical Optimization

* [NLopt.jl](https://github.com/JuliaOpt/NLopt.jl): Optimization via [NLopt](https://github.com/stevengj/nlopt)

* [Optim.jl](https://github.com/JuliaNLSolvers/Optim.jl): Julia native nonlinear optimization

### TypedTables and DataFrames

* [Tables.jl](https://github.com/JuliaData/Tables.jl): Abstract API for tabular data

* [DataFrames.jl](https://github.com/JuliaData/DataFrames.jl): Python/R-like dataframes

* [TypedTables.jl](https://github.com/JuliaData/TypedTables.jl): Type-stable tables

* [Query.jl](https://github.com/queryverse/Query.jl): LINQ-inspired data query and transformation

### Plotting and Visualization

* [IJulia.jl](https://github.com/JuliaLang/IJulia.jl): Julia Jupyter kernel

* [Images.jl](https://github.com/JuliaImages/Images.jl): Image processing

* [PyPlot.jl](https://github.com/JuliaPy/PyPlot.jl): Use matplotlib/PyPlot from Julia

* [Makie.jl](https://github.com/JuliaPlots/Makie.jl): Hardware-accelerated plotting

* [Plots.jl](https://github.com/JuliaPlots/Plots.jl): Plotting with generic recipes and multiple backends

### Statistics

* [Distributions.jl](https://github.com/JuliaStats/Distributions.jl): Probability distributions and associated functions

* [StatsBase.jl](https://github.com/JuliaStats/StatsBase.jl): Statistics, histograms, etc.

* Many specialized packages

### Automatic Differentiation

* [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl): Forward-mode automatic differentiation
* [Zygote.jl](https://github.com/FluxML/Zygote.jl): Source-level reverse-mode automatic differentiation
* [Enzyme.jl](https://github.com/wsmoses/Enzyme.jl): LLVM-level reverse-mode automatic differentiation
* Several other packages available ([ReverseDiff.jl](https://github.com/JuliaDiff/ReverseDiff.jl), [Nabla.jl](https://github.com/invenia/Nabla.jl), [Yota.jl](https://github.com/dfdx/Yota.jl), ...)
* Exciting developments to come with new Julia compiler features

### Bayesian analysis and probabilistic programming


* [BAT.jl](https://github.com/bat/BAT.jl): Bayesian analysis toolkit

* [Gen.jl](https://github.com/probcomp/Gen): General-Purpose Probabilistic Programming System

* [Mamba.jl](https://github.com/brian-j-smith/Mamba.jl): MCMC for Bayesian analysis

* [Turing.jl](https://github.com/TuringLang/Turing.jl): Probabilistic machine learning and Bayesian statistics

* ...

### Machine learning

* [Flux.jl](https://github.com/FluxML/Flux.jl): Julia native deep learning library

* [Knet.jl](https://github.com/denizyuret/Knet.jl): Koc University deep learning framework

* [MXNet.jl](https://github.com/dmlc/MXNet.jl): [MXNet](https://mxnet.apache.org/) Julia API

* ...

### Calling code in other languages

* [Cxx.jl](https://github.com/JuliaInterop/Cxx.jl): Call C++ from Julia

* [PyCall.jl](https://github.com/JuliaPy/PyCall.jl): Call Python from Julia

* [RCall.jl](https://github.com/JuliaInterop/RCall.jl): Call R from Julia

* ...

### Efficient memory layout

* [ArraysOfArrays.jl](https://github.com/oschulz/ArraysOfArrays.jl): Duality of flat and nested arrays

* [StructArrays.jl](https://github.com/JuliaArrays/StructArrays.jl), [TypedTables.jl](https://github.com/JuliaData/TypedTables.jl): AoS and SoA duality

* [ValueShapes.jl](https://github.com/oschulz/ValueShapes.jl): Duality of flat and nested structures

* ...

### GPU Programming

* [AMDGPU.jl](https://github.com/JuliaGPU/AMDGPU.jl): Julia on AMD GPUs (WIP)

* [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl): Julia on NVIDIA GPUs

* [GPUArrays.jl](https://github.com/JuliaGPU/GPUArrays.jl): Generic GPU programming API

* [XLA.jl](https://github.com/JuliaTPU/XLA.jl): Julia on Google TPUs

* Maybe more architectures in the future?

### IDEs

* [julia-vscode](https://github.com/julia-vscode/julia-vscode): Visual Studio Code based Julia IDE

* [Juno](https://junolab.org/): Atom based Julia IDE (now deprecated in favor of VS Code)

## Final Remarks

* Julia is productive, fast and fun - give it a chance!

* Multiple dispatch opens up powerful ways of combining code