 https://juliaacademy.com/
 
 ## Advanced Topics

### Multiple Dispatch

In [1]:
methods(+)

In [2]:
@which 3+3

In [3]:
@which 3.0+3.0

In [4]:
@which 3 + 3.0

In [6]:
import Base: +
+(x::String, y::String) = string(x,y)
"hello " + "world!"

"hello world!"

In [7]:
@which "hello " + "world!"

In [8]:
foo(x,y) = println("duck-typed foo!")
foo(x::Int, y::Float64) = println("foo with an integer and float")
foo(x::Float64, y::Float64) = println("foo with two floats!")
foo(x::Int, y::Int) = println("foo with two integers")

foo (generic function with 4 methods)

In [9]:
foo(1,1)

foo with two integers


In [10]:
foo(1.,1.)

foo with two floats!


In [11]:
foo(1,1.0)

foo with an integer and float


In [13]:
foo(true,false)

duck-typed foo!


In [None]:
using Test


foo(x::String, y::String) = println("My inputs x and y are both strings!")

# ------------------------------------------------------------------------------------------
# We see here that in order to restrict the type of `x` and `y` to `String`s, we just follow
# the input argument name by a double colon and the type name `String`.
#
# Now we'll see that `foo` works on `String`s and doesn't work on other input argument
# types.
# ------------------------------------------------------------------------------------------

foo("hello", "hi!")

#foo(3, 4)
#With only the `String` method defined, this call throws a `MethodError`.

# ------------------------------------------------------------------------------------------
# To get `foo` to work on integer (`Int`) inputs, let's tack `::Int` onto our input
# arguments when we declare `foo`.
# ------------------------------------------------------------------------------------------

foo(x::Int, y::Int) = println("My inputs x and y are both integers!")

foo(3, 4)

# ------------------------------------------------------------------------------------------
# Now `foo` works on integers! But look, `foo` also still works when `x` and `y` are
# strings!
# ------------------------------------------------------------------------------------------

foo("hello", "hi!")

# ------------------------------------------------------------------------------------------
# This is starting to get to the heart of multiple dispatch. When we declared
#
# ```julia
# foo(x::Int, y::Int) = println("My inputs x and y are both integers!")
# ```
# we didn't overwrite or replace
# ```julia
# foo(x::String, y::String)
# ```
# Instead, we just added an additional ***method*** to the ***generic function*** called
# `foo`.
#
# A ***generic function*** is the abstract concept associated with a particular operation.
#
# For example, the generic function `+` represents the concept of addition.
#
# A ***method*** is a specific implementation of a generic function for *particular argument
# types*.
#
# For example, `+` has methods that accept floating point numbers, integers, matrices, etc.
#
# We can use the `methods` function to see how many methods there are for `foo`.
# ------------------------------------------------------------------------------------------

methods(foo)
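
The distinction between a generic function and its methods can be checked directly by counting methods before and after a new definition. This is a minimal sketch; the function name `area` is hypothetical and used only for illustration.

```julia
# Hypothetical generic function `area`, used only for illustration.
area(r::Real) = pi * r^2            # circle, given a radius
area(w::Real, h::Real) = w * h      # rectangle, given width and height

n_before = length(methods(area))

# Defining another signature adds a method; the existing ones are untouched.
area(s::String) = error("cannot compute the area of a string: $s")

n_after = length(methods(area))
println("methods before: $n_before, after: $n_after")
println(area(2, 3))                 # the rectangle method still works
```

Each definition with a new signature increases the method count by one, while a definition with an existing signature would replace that method instead.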

# ------------------------------------------------------------------------------------------
# Aside: how many methods do you think there are for addition?
# ------------------------------------------------------------------------------------------

methods(+)

# ------------------------------------------------------------------------------------------
# So, we now can call `foo` on integers or strings. When you call `foo` on a particular set
# of arguments, Julia looks at the types of all the inputs and dispatches the most specific
# matching method. *This* is multiple dispatch.
#
# Multiple dispatch makes our code generic and fast. Our code can be generic and flexible
# because we can write code in terms of abstract operations such as addition and
# multiplication, rather than in terms of specific implementations. At the same time, our
# code runs quickly because Julia is able to call efficient methods for the relevant types.
#
# To see which method is being dispatched when we call a generic function, we can use the
# @which macro:
# ------------------------------------------------------------------------------------------

#@which foo(3, 4)
#Code gives an error in Repl.it but should work locally. 
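
One common reason `@which` fails outside the REPL or a notebook is that the macro lives in the `InteractiveUtils` standard library, which is only loaded automatically in interactive sessions. Whether that is the cause of the error mentioned above is an assumption. The sketch below loads it explicitly; the function `kind` is hypothetical and defined here so the block is self-contained.

```julia
using InteractiveUtils  # provides @which outside interactive sessions

# Hypothetical function `kind`, defined here so the block is self-contained.
kind(x::Int, y::Int) = "two integers"
kind(x::String, y::String) = "two strings"

m = @which kind(3, 4)   # returns a Method object describing the chosen method
println(m)
println(m.sig)          # the signature of the method that was selected
```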

# ------------------------------------------------------------------------------------------
# Let's see what happens when we use `@which` with the addition operator!
# ------------------------------------------------------------------------------------------

#@which 3.0 + 3.0
#Code gives an error in Repl.it but should work locally. 

# ------------------------------------------------------------------------------------------
# And we can continue to add other methods to our generic function `foo`. Let's add one that
# takes the ***abstract type*** `Number`, which includes subtypes such as `Int`, `Float64`,
# and other objects you would think of as numbers:
# ------------------------------------------------------------------------------------------

foo(x::Number, y::Number) = println("My inputs x and y are both numbers!")

# ------------------------------------------------------------------------------------------
# This method for `foo` will work on, for example, floating point numbers:
# ------------------------------------------------------------------------------------------

foo(3.0, 4.0)
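
The subtype relationships behind these dispatch results can be inspected with the `<:` operator. The sketch below uses a hypothetical one-argument function `classify` to show that the most specific applicable method wins, and why a `Bool` does not match a method annotated with `Int`.

```julia
# Type relationships that drive dispatch.
println(Int <: Number)        # true
println(Float64 <: Number)    # true
println(Bool <: Integer)      # true: Bool is an integer type...
println(Bool <: Int)          # false: ...but it is not Int

# Hypothetical function `classify` with methods of different specificity.
classify(x::Int) = "Int"
classify(x::Number) = "Number"
classify(x) = "anything"

results = [classify(1), classify(1.0), classify(true), classify("hi")]
println(results)
```

Here `classify(true)` reaches the `Number` method, since `Bool` is a `Number` but not an `Int`; without a `Number` method it would fall through to the untyped one.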

# ------------------------------------------------------------------------------------------
# We can also add a fallback, duck-typed method for `foo` that takes inputs of any type:
# ------------------------------------------------------------------------------------------

foo(x, y) = println("I accept inputs of any type!")

# ------------------------------------------------------------------------------------------
# Given the methods we've already written for `foo` so far, this method will be called
# whenever the two arguments are not both numbers and not both strings:
# ------------------------------------------------------------------------------------------

v = rand(3)
foo(v, v)
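
To make the fallback rule concrete, here is a small sketch with a hypothetical function `pair` that has the same shape as `foo` (a `String` method, a `Number` method, and an untyped fallback). It returns strings instead of printing so the results are easy to compare.

```julia
# Hypothetical two-argument function `pair`, mirroring the shape of `foo`.
pair(x::String, y::String) = "strings"
pair(x::Number, y::Number) = "numbers"
pair(x, y) = "fallback"

v = rand(3)
println(pair(v, v))        # a Vector{Float64} is not a Number
println(pair(1, "one"))    # mixed arguments match only the fallback
println(pair(1, 2.5))      # Int and Float64 are both Numbers
println(pair("a", "b"))    # both strings
```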

# ------------------------------------------------------------------------------------------
# ### Exercises
#
# #### 9.1
#
# Extend the function `foo`, adding a method that takes only one input argument, which is of
# type `Bool`, and returns "foo with one boolean!"
# ------------------------------------------------------------------------------------------



# ------------------------------------------------------------------------------------------
# #### 9.2
#
# Check that the method being dispatched when you execute
# ```julia
# foo(true)
# ```
# is the one you wrote.
# ------------------------------------------------------------------------------------------



#@assert foo(true) == "foo with one boolean!"
#Code gives an error in Repl.it but should work locally. 


## Fast af boi!


# Julia is fast

Very often, benchmarks are used to compare languages.  These benchmarks can lead to long
discussions, first as to exactly what is being benchmarked and secondly what explains the
differences.  These simple questions can sometimes get more complicated than you at first
might imagine.

(This material began life as a wonderful lecture by Steven Johnson at MIT:
https://github.com/stevengj/18S096/blob/master/lectures/lecture1/Boxes-and-registers.ipynb.)
------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
# Outline of this notebook

- Define the sum function
- Implementations & benchmarking of sum in...
    - C (hand-written)
    - C (hand-written with -ffast-math)
    - python (built-in)
    - python (numpy)
    - python (hand-written)
    - Julia (built-in)
    - Julia (hand-written)
    - Julia (hand-written with SIMD)

- Summary of benchmarks
------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
# `sum`: An easy enough function to understand
------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
Consider the  **sum** function `sum(a)`, which computes
$$
\mathrm{sum}(a) = \sum_{i=1}^n a_i,
$$
where $n$ is the length of `a`.

In [None]:
a = rand(10^7) # 1D vector of random numbers, uniform on [0,1)

sum(a)

------------------------------------------------------------------------------------------
The expected result is about 0.5 * 10^7, since the mean of each entry is 0.5.
------------------------------------------------------------------------------------------
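
As a rough sanity check on that expectation: each entry of a uniform sample on [0, 1) has variance 1/12, so the sum of n independent entries has standard deviation sqrt(n / 12). The sketch below compares the observed sum with the expected value in those units; the variable names are illustrative.

```julia
n = 10^7
a = rand(n)                      # uniform on [0, 1)

expected = 0.5 * n               # n times the mean of one entry
sigma = sqrt(n / 12)             # each entry has variance 1/12

deviation = abs(sum(a) - expected)
println("sum = $(sum(a)), expected ≈ $expected")
println("deviation in standard deviations: $(deviation / sigma)")
```

For n = 10^7 the standard deviation is roughly 913, so the sum should typically land within a few thousand of 5 * 10^6.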

------------------------------------------------------------------------------------------
Benchmarking a few ways in a few languages
------------------------------------------------------------------------------------------


In [None]:
@time sum(a)

In [None]:
@time sum(a)

In [None]:
@time sum(a)

In [None]:
------------------------------------------------------------------------------------------
The `@time` macro can yield noisy results, so it's not our best choice for benchmarking!

Luckily, Julia has a `BenchmarkTools.jl` package to make benchmarking easy and accurate:
------------------------------------------------------------------------------------------
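
The noise can be seen without any extra packages by timing the same call repeatedly with `@elapsed`. This is only a rough sketch of the idea, not a substitute for `BenchmarkTools.jl`; taking the minimum over repeated runs is the same summary this notebook uses later via `minimum(c_bench.times)`.

```julia
a = rand(10^6)

sum(a)  # run once first so compilation is not included in the timings

# Time the same call several times; the spread shows the noise.
timings = [@elapsed(sum(a)) for _ in 1:20]

println("slowest: $(maximum(timings)) s")
println("fastest: $(minimum(timings)) s")
```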

In [None]:
using Pkg
Pkg.add("BenchmarkTools")

using BenchmarkTools  

In [None]:
------------------------------------------------------------------------------------------
#  1. The C language

# C is often considered the gold standard: difficult on the human, nice for the machine.
# Getting within a factor of 2 of C is often satisfying. Nonetheless, even within C, there
# are many kinds of optimizations possible that a naive C writer may or may not get the
# advantage of.

# The current author does not speak C, so he does not read the cell below, but is happy to
# know that you can put C code in a Julia session, compile it, and run it. Note that the
# `"""` wrap a multi-line string.
# ------------------------------------------------------------------------------------------

using Libdl
C_code = """
#include <stddef.h>
double c_sum(size_t n, double *X) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += X[i];
    }
    return s;
}
"""

const Clib = tempname()   # make a temporary file


# compile to a shared library by piping C_code to gcc
# (works only if you have gcc installed):

open(`gcc -fPIC -O3 -msse3 -xc -shared -o $(Clib * "." * Libdl.dlext) -`, "w") do f
    print(f, C_code) 
end

# define a Julia function that calls the C function:

In [None]:
c_sum(X::Array{Float64}) = ccall(("c_sum", Clib), Float64, (Csize_t, Ptr{Float64}), length(X), X)
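
The `ccall` above has four parts: the function (here a `(name, library)` tuple), the return type, a tuple of argument types, and then the arguments themselves. A smaller sketch of the same pattern, assuming a platform where the C standard library is already loaded into the Julia process (typical on Linux and macOS), calls `strlen`:

```julia
# ccall(function, return type, (argument types...,), arguments...)
# Assumes the C runtime's `strlen` is visible in the current process.
n = ccall(:strlen, Csize_t, (Cstring,), "hello")
println(n)
```

Note the trailing comma in `(Cstring,)`: the argument types must be a tuple, even when there is only one argument.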

In [None]:
c_sum(a)

In [None]:
c_sum(a) ≈ sum(a) # type \approx and then <TAB> to get the ≈ symbol

In [None]:
c_sum(a) - sum(a)  

In [None]:
# ≈ is an alias for the `isapprox` function; type ?isapprox to see its documentation
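
A short illustration of why `≈` is the right comparison for floating-point results. The `rtol` keyword of `isapprox` sets the relative tolerance; the values below are arbitrary examples.

```julia
x = 0.1 + 0.2
println(x == 0.3)                          # false: rounding error in the last bit
println(x ≈ 0.3)                           # true: equal within the default tolerance
println(isapprox(x, 0.3))                  # same comparison, spelled out
println(isapprox(1.0, 1.001; rtol=1e-2))   # true: a looser, explicit tolerance
println(isapprox(1.0, 1.001))              # false under the default tolerance
```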

------------------------------------------------------------------------------------------
We can now benchmark the C code directly from Julia:
------------------------------------------------------------------------------------------

In [None]:
c_bench = @benchmark c_sum($a)

println("C: Fastest time was $(minimum(c_bench.times) / 1e6) msec")

d = Dict()  # a "dictionary", i.e. an associative array
d["C"] = minimum(c_bench.times) / 1e6  # in milliseconds
d


In [None]:
Pkg.add("Plots")
using Plots
gr()

In [None]:
using Statistics # bring in statistical support for standard deviations
t = c_bench.times / 1e6 # times in milliseconds
m, σ = minimum(t), std(t)

In [None]:
histogram(t, bins=500,
    xlim=(m - 0.01, m + σ),
    xlabel="milliseconds", ylabel="count", label="")

------------------------------------------------------------------------------------------
# 2. C with -ffast-math

If we allow C to re-arrange the floating point operations, then it'll vectorize with SIMD
(single instruction, multiple data) instructions.
------------------------------------------------------------------------------------------
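
Re-arranging needs explicit permission because floating-point addition is not associative: changing the order of the additions can change the last bits of the result. A minimal sketch in Julia (the variable names are illustrative):

```julia
left  = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
println(left == right)   # false: the grouping changes the rounding
println(left ≈ right)    # true: the difference is only in the last bits

# The same effect on a larger scale: summing in two different orders.
a = rand(10^6)
forward  = foldl(+, a)
backward = foldl(+, reverse(a))
println(forward - backward)   # usually a tiny nonzero difference
```

A compiler that vectorizes the loop adds up several partial sums in parallel, which is exactly this kind of reordering.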

In [None]:
const Clib_fastmath = tempname()   # make a temporary file

# The same as above but with a -ffast-math flag added
open(`gcc -fPIC -O3 -msse3 -xc -shared -ffast-math -o $(Clib_fastmath * "." * Libdl.dlext) -`, "w") do f
    print(f, C_code) 
end

In [None]:
# define a Julia function that calls the C function:
c_sum_fastmath(X::Array{Float64}) = ccall(("c_sum", Clib_fastmath), Float64, (Csize_t, Ptr{Float64}), length(X), X)

c_fastmath_bench = @benchmark $c_sum_fastmath($a)

d["C -ffast-math"] = minimum(c_fastmath_bench.times) / 1e6  # in milliseconds

In [None]:
------------------------------------------------------------------------------------------
# 3. Python's built-in `sum`
------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
The `PyCall` package provides a Julia interface to Python:
------------------------------------------------------------------------------------------

In [None]:
using Pkg; Pkg.add("PyCall")
using PyCall

# get the Python built-in "sum" function:
pysum = pybuiltin("sum")

In [None]:
pysum(a)

In [None]:
pysum(a) ≈ sum(a)

In [None]:
py_list_bench = @benchmark $pysum($a)

In [None]:
d["Python built-in"] = minimum(py_list_bench.times) / 1e6
d

------------------------------------------------------------------------------------------
# 4. Python: `numpy`

## Takes advantage of hardware "SIMD", but only works when it works.

`numpy` is an optimized C library, callable from Python.
It may be installed within Julia as follows:
------------------------------------------------------------------------------------------

In [None]:
using Pkg; Pkg.add("Conda")
using Conda

Conda.add("numpy")

numpy_sum = pyimport("numpy").sum  # the older ["sum"] indexing form is deprecated in recent PyCall

py_numpy_bench = @benchmark $numpy_sum($a)

In [None]:
numpy_sum(a)

In [None]:
numpy_sum(a) ≈ sum(a)

In [None]:
d["Python numpy"] = minimum(py_numpy_bench.times) / 1e6
d

------------------------------------------------------------------------------------------
# 5. Python, hand-written
------------------------------------------------------------------------------------------

In [None]:
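# define a Python function inside Julia with PyCall's py"""...""" string macro:
py"""
def py_sum(A):
    s = 0.0
    for a in A:
        s += a
    return s
"""

# look up the Python function object by name:
sum_py = py"py_sum"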

py_hand = @benchmark $sum_py($a)


In [None]:
sum_py(a)

In [None]:
sum_py(a) ≈ sum(a)

In [None]:
d["Python hand-written"] = minimum(py_hand.times) / 1e6
d

------------------------------------------------------------------------------------------
# 6. Julia (built-in)

## Written directly in Julia, not in C!
------------------------------------------------------------------------------------------


In [None]:
@which sum(a)

In [None]:
j_bench = @benchmark sum($a)

In [None]:
d["Julia built-in"] = minimum(j_bench.times) / 1e6
d

------------------------------------------------------------------------------------------
# 7. Julia (hand-written)
------------------------------------------------------------------------------------------

In [None]:
function mysum(A)   
    s = 0.0 # s = zero(eltype(A))
    for a in A
        s += a
    end
    s
end

j_bench_hand = @benchmark mysum($a)

In [None]:
d["Julia hand-written"] = minimum(j_bench_hand.times) / 1e6
d
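
The comment in `mysum` hints at an alternative initial value, `zero(eltype(A))`. With `s = 0.0` the result is always a `Float64`; starting from the element type's zero keeps the accumulator in the same type as the data. A sketch of that variant, under the hypothetical name `mysum_generic`:

```julia
# Hypothetical variant of `mysum` whose accumulator matches the element type.
function mysum_generic(A)
    s = zero(eltype(A))
    for a in A
        s += a
    end
    return s
end

println(mysum_generic([1, 2, 3]))        # Int in, Int out
println(mysum_generic([1.5, 2.5]))       # Float64 in, Float64 out
println(mysum_generic(Float32[1, 2]))    # Float32 in, Float32 out
```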

------------------------------------------------------------------------------------------
# 8. Julia (hand-written w. simd)
------------------------------------------------------------------------------------------

In [None]:
function mysum_simd(A)   
    s = 0.0 # s = zero(eltype(A))
    @simd for a in A
        s += a
    end
    s
end

j_bench_hand_simd = @benchmark mysum_simd($a)

In [None]:
mysum_simd(a)

In [None]:
d["Julia hand-written simd"] = minimum(j_bench_hand_simd.times) / 1e6
d
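
Because `@simd` allows the additions to be reordered, the result may differ from a strictly sequential loop in the last few bits. The sketch below compares the two; `sum_plain` and `sum_simd` are hypothetical names chosen so the block is self-contained.

```julia
function sum_plain(A)
    s = zero(eltype(A))
    for x in A
        s += x
    end
    return s
end

function sum_simd(A)
    s = zero(eltype(A))
    @simd for x in A     # grants permission to reorder the additions
        s += x
    end
    return s
end

a = rand(10^6)
println(sum_plain(a) - sum_simd(a))   # often a tiny nonzero difference
println(sum_plain(a) ≈ sum_simd(a))
```

This is why the checks in this notebook use `≈` rather than `==` when comparing implementations.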

In [None]:
# ------------------------------------------------------------------------------------------
# # Summary
# ------------------------------------------------------------------------------------------

for (key, value) in sort(collect(d), by=last)
    println(rpad(key, 25, "."), lpad(round(value; digits=1), 6, "."))
end
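
To see what the summary loop does without running the benchmarks, the same code can be applied to a small dictionary. The keys and values below are placeholders, not measured timings.

```julia
# Placeholder values only, NOT measured timings.
demo = Dict("method A" => 12.34, "method B" => 0.5, "method C" => 3.0)

# collect turns the Dict into a Vector of key => value pairs;
# by=last sorts those pairs by their value, smallest first.
ranked = sort(collect(demo), by=last)

for (key, value) in ranked
    println(rpad(key, 25, "."), lpad(round(value; digits=1), 6, "."))
end
```

`rpad` and `lpad` pad each field with dots to a fixed width, so the names and numbers line up in two columns.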