# Session 3: Types, type inference and stability

### OBJECTIVE: Demonstrate the dynamic programming features of Julia.

#### KR1: Shown or demonstrated the hierarchy of Julia’s type hierarchy using the command subtypes(). Start from Number and use subtypes() to explore from  down to . Use supertype() to determine the  abstract type.

The root type is `Any`. Calling `subtypes` on it shows 513 direct subtypes in this session (the count depends on which packages are loaded).

Interestingly, `Any` is a subtype of `Any`.

In [1]:
subtypes(Any)

513-element Vector{Any}:
 AbstractArray
 AbstractChannel
 AbstractChar
 AbstractDict
 AbstractDisplay
 AbstractMatch
 AbstractPattern
 AbstractSet
 AbstractString
 Any
 Base.AbstractBroadcasted
 Base.AbstractCartesianIndex
 Base.AbstractCmd
 ⋮
 Tuple
 Type
 TypeVar
 UndefInitializer
 Val
 Vararg
 VecElement
 VersionNumber
 WeakRef
 ZMQ.Context
 ZMQ.Socket
 ZMQ._Message

So it makes sense that `Any` is also a supertype of `Any`:

In [2]:
supertype(Any)

Any

I want to show the hierarchy starting from type `Number`. The output of `subtypes` is a vector, so I'll write a function that prints every branch below `Number`.

The input to that function will be a type such as `Any` or `Number`, and the type of such a type is `DataType`:

In [3]:
typeof(Number)

DataType

`DataType` is not a direct subtype of `Any` (it does not appear in `subtypes(Any)`), and its own type is `DataType`.

It is a direct subtype of `Type`, which in turn is a direct subtype of `Any`.

In [4]:
DataType in subtypes(Any)

false

In [5]:
typeof(DataType)

DataType

In [6]:
supertype(DataType)

Type{T}

In [7]:
subtypes(DataType)

Type[]

In [8]:
subtypes(Type)

4-element Vector{Any}:
 Core.TypeofBottom
 DataType
 Union
 UnionAll

In [9]:
supertype(Type)

Any

In [10]:
length(subtypes(DataType))

0
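
The difference between a direct subtype and a subtype in general can be checked with the `<:` operator. A minimal sketch, assuming it may be run outside the notebook (hence the explicit `using InteractiveUtils`, which is the standard library that provides `subtypes`):

```julia
using InteractiveUtils  # provides subtypes(); loaded automatically in the REPL and in notebooks

# `<:` tests the subtype relation transitively
@show DataType <: Type
@show DataType <: Any

# subtypes() lists only direct children, so DataType is absent from subtypes(Any)
@show DataType in subtypes(Type)
@show DataType in subtypes(Any)
```

So `DataType` is still a subtype of `Any`; it just sits one level further down, below `Type`.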

Now we want to explore the type hierarchy of `Number`. I made a function to make this a bit easier.

In [12]:
supertype(Number)

Any

In [13]:
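function get_hierarchy(x; tabs::Integer=0)
    # Print this type, indented by its depth in the tree
    println(repeat("    ", tabs), "└--", x)

    # Recurse into each direct subtype; leaf types have none, so the loop is skipped
    for st in subtypes(x)
        get_hierarchy(st, tabs=tabs+1)
    end
end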

get_hierarchy (generic function with 1 method)

In [14]:
get_hierarchy(Number)

└--Number
    └--Complex
    └--Real
        └--AbstractFloat
            └--BigFloat
            └--Float16
            └--Float32
            └--Float64
        └--AbstractIrrational
            └--Irrational
        └--Integer
            └--Bool
            └--Signed
                └--BigInt
                └--Int128
                └--Int16
                └--Int32
                └--Int64
                └--Int8
            └--Unsigned
                └--UInt128
                └--UInt16
                └--UInt32
                └--UInt64
                └--UInt8
        └--Rational


#### KR2: Implemented and used at least one own composite type via struct. Generate two more versions that are mutable type and type-parametrized of the custom-built type.

One thing I commonly do is generate Gaussians, so I'll make a composite type for that. The simplest way to describe one is by its mean (μ) and standard deviation (σ). I'll add `N`, pretending it's a number of particles, just so the struct has an `Int` field that sort of makes sense.

In [15]:
struct gaussian
    μ::Float64
    σ::Float64
    N::Int
end

In [16]:
typeof(gaussian)

DataType

In [17]:
g = gaussian(0.0, 1.0, 1000)

gaussian(0.0, 1.0, 1000)

In [18]:
typeof(g)

gaussian

In [19]:
print("μ is $(g.μ), σ is $(g.σ), N is $(g.N)")

μ is 0.0, σ is 1.0, N is 1000

Let's see if I can change the value once it's set. Since this isn't mutable, I'm assuming that I can't.

In [20]:
g.μ = 0.5

LoadError: setfield! immutable struct of type gaussian cannot be changed

Now I want to try to create a mutable struct

In [21]:
mutable struct gaussian_mutable
    μ::Float64
    σ::Float64
    N::Int
end

It seems `typeof` makes no distinction between mutable and immutable structs: both are `DataType`.

In [22]:
typeof(gaussian_mutable)

DataType
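
Although `typeof` reports `DataType` in both cases, mutability can still be queried on an instance with `ismutable`. A short sketch, assuming Julia 1.5 or later (where `ismutable` is available); the struct names here are hypothetical stand-ins for `gaussian` and `gaussian_mutable`:

```julia
# Hypothetical stand-ins for the immutable and mutable structs in this notebook
struct FixedPair
    a::Int
    b::Int
end

mutable struct LoosePair
    a::Int
    b::Int
end

@show typeof(FixedPair) typeof(LoosePair)  # both are DataType
@show ismutable(FixedPair(1, 2))           # instance of an immutable struct
@show ismutable(LoosePair(1, 2))           # instance of a mutable struct
```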

In [23]:
g_m = gaussian_mutable(0.0, 1.0, 1000)

gaussian_mutable(0.0, 1.0, 1000)

In [24]:
typeof(g_m)

gaussian_mutable

With a mutable structure, I can change the value!

In [25]:
g_m.μ

0.0

In [26]:
g_m.μ = 0.5

0.5

In [27]:
g_m

gaussian_mutable(0.5, 1.0, 1000)

Next, I need to show a type-parametrized struct.

In [28]:
struct gaussian_parametrized{T}
    μ::T
    σ::T
    N::Int
end

Interestingly, when it's parametrized, its type is `UnionAll`

In [29]:
typeof(gaussian_parametrized)

UnionAll

Here we see that the type depends on the inputs for each parameter

In [30]:
gaussian_parametrized(0.0, 1.0, 1000)

gaussian_parametrized{Float64}(0.0, 1.0, 1000)

In [31]:
gaussian_parametrized(0, 1, 1000)

gaussian_parametrized{Int64}(0, 1, 1000)

I wondered what would happen if the two fields parametrized by `T` were given values of different types. Result: it doesn't work!

In [32]:
gaussian_parametrized(0.0, 1, 1000)

LoadError: MethodError: no method matching gaussian_parametrized(::Float64, ::Int64, ::Int64)
Closest candidates are:
  gaussian_parametrized(::T, ::T, ::Int64) where T at In[28]:2

In [33]:
g_p = gaussian_parametrized(0, 1, 1000)

gaussian_parametrized{Int64}(0, 1, 1000)
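
The call with mixed `Float64` and `Int64` arguments fails only because `T` cannot be deduced. Two ways around that, sketched here on a hypothetical copy of the struct (`GaussianSketch` is my own name): state `T` explicitly, so the arguments are converted, or promote the arguments to a common type first.

```julia
# Hypothetical copy of gaussian_parametrized, renamed so the sketch is self-contained
struct GaussianSketch{T}
    μ::T
    σ::T
    N::Int
end

# Option 1: give T explicitly; the default constructor converts each argument to T
a = GaussianSketch{Float64}(0.0, 1, 1000)

# Option 2: promote the mixed arguments to a common type before constructing
b = GaussianSketch(promote(0.0, 1)..., 1000)

@show a b
```

Both calls should produce a `GaussianSketch{Float64}` with `σ` stored as `1.0`.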

I didn't declare this as a mutable struct so value shouldn't be changeable.

In [34]:
g_p.μ = 0.0

LoadError: setfield! immutable struct of type gaussian_parametrized cannot be changed

Lastly, let's make a parametrized, mutable struct.

In [35]:
mutable struct gaussian_parametrized_mutable{T}
    μ::T
    σ::T
    N::Int
end

In [36]:
g_pm = gaussian_parametrized_mutable(0, 1, 1000)

gaussian_parametrized_mutable{Int64}(0, 1, 1000)

In [37]:
g_pm.μ = 1;
g_pm

gaussian_parametrized_mutable{Int64}(1, 1, 1000)

It works! Assigned values are converted to the type `T` fixed at construction, so a value that cannot be represented exactly as that type is rejected:

In [38]:
g_pm.μ = 0.5

LoadError: InexactError: Int64(0.5)
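
The `InexactError` comes from the conversion that happens on field assignment, not from a strict type check. A minimal sketch on a hypothetical one-field struct (`Cell` is my own name) shows that a float with an exact integer value is accepted and stored as an integer:

```julia
# Hypothetical mutable, parametrized struct used only for this illustration
mutable struct Cell{T}
    value::T
end

c = Cell(1)      # T is inferred as Int from the argument
c.value = 2.0    # 2.0 converts exactly to an Int, so this succeeds
@show c.value typeof(c.value)

# 0.5 has no exact Int representation, so the conversion throws
rejected = try
    c.value = 0.5
    false
catch err
    err isa InexactError
end
@show rejected
```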

#### KR3: Demonstrated type inference in Julia. Generator expressions may be used for this.

Following the example in the textbook, we can show type inference:

Julia knows when it's either an integer or a float:

In [39]:
[x for x in 1:5]

5-element Vector{Int64}:
 1
 2
 3
 4
 5

In [40]:
[x for x in 1.0:5.0]

5-element Vector{Float64}:
 1.0
 2.0
 3.0
 4.0
 5.0

When the endpoints mix an integer and a float, the range is promoted to `Float64` (which makes sense):

In [41]:
[x for x in 1.0:5]

5-element Vector{Float64}:
 1.0
 2.0
 3.0
 4.0
 5.0

In [42]:
[x for x in 1:5.0]

5-element Vector{Float64}:
 1.0
 2.0
 3.0
 4.0
 5.0

It's also interesting that the inferred element type follows the expression inside the comprehension:

In [43]:
[x + 1.0 for x in 1:5]

5-element Vector{Float64}:
 2.0
 3.0
 4.0
 5.0
 6.0

In [44]:
[x + 1 for x in 1:5]

5-element Vector{Int64}:
 2
 3
 4
 5
 6

In [45]:
[x + 1 for x in 1.0:5]

5-element Vector{Float64}:
 2.0
 3.0
 4.0
 5.0
 6.0
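
The element types above can also be checked directly with `eltype`, and the same expressions can be written as lazy generator expressions by swapping the square brackets for parentheses. A small sketch with variable names of my own choosing:

```julia
ints   = [x^2 for x in 1:5]     # integer arithmetic keeps integer elements
floats = [x / 2 for x in 1:5]   # `/` on integers yields floats
lazy   = (x^2 for x in 1:5)     # generator expression: nothing is computed yet

@show eltype(ints)
@show eltype(floats)
@show typeof(lazy) <: Base.Generator
@show sum(lazy)                 # consumes the generator without building an array
```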

#### KR4: Created a function with inherent type-instability. Create a version of the function with fixed  issues.

This is similar to the `pos` expression in the book, where

$ pos(x) = \begin{cases}
    x, & \text{if } x > 0, \\
    0, & \text{otherwise}.
  \end{cases}
$


I learned about the ternary operator from https://docs.julialang.org/en/v1/manual/control-flow/, which makes the function more compact.

In [46]:
pos_unstable(x) = x > 0 ? x : 0

pos_unstable (generic function with 1 method)

According to the textbook, a function is type-unstable when the type of its output depends on the value of its input and not just on the input's type.

Consider two float inputs, one positive and one negative. Because of how the function is written, any input less than or equal to 0 returns the literal `0`, which is an `Int64`, regardless of the input's type.

In [47]:
typeof(42.0)

Float64

In [48]:
typeof(-42.0)

Float64

In [49]:
pos_unstable(42.0)

42.0

In [50]:
typeof(pos_unstable(42.0))

Float64

In [51]:
pos_unstable(-42.0)

0

In [52]:
typeof(pos_unstable(-42.0))

Int64

We can fix this by returning a zero of the same type as the input, using `zero(x)`.

In [53]:
pos_stable(x) = x > 0 ? x : zero(x)

pos_stable (generic function with 1 method)

In [54]:
typeof(pos_stable(42.0))

Float64

In [55]:
typeof(pos_stable(-42.0))

Float64

In [56]:
typeof(pos_unstable(-42))

Int64
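
To see that the fix holds beyond `Float64`, the two functions can be compared over several non-positive inputs of different numeric types. A short sketch that redefines both functions so it runs on its own:

```julia
pos_unstable(x) = x > 0 ? x : 0
pos_stable(x)   = x > 0 ? x : zero(x)

# Non-positive inputs of several numeric types: Int, Float64, Float32, Rational
for x in (-42, -42.0, -42f0, -1//2)
    println(typeof(x), " => unstable: ", typeof(pos_unstable(x)),
            ", stable: ", typeof(pos_stable(x)))
end
```

For every input the stable version returns a value of the input's own type, while the unstable version falls back to the type of the literal `0`.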

#### KR5: Demonstration of how @code_warntype can be useful in detecting type-instability .

When a `Float64` input can produce either a `Float64` or an `Int64` output, `@code_warntype` highlights the return type `Body::Union{Float64, Int64}` in red. The warning depends only on the input type, so it appears for both positive and negative values.

In [57]:
@code_warntype pos_unstable(42.0)

Variables
  #self#::Core.Const(pos_unstable)
  x::Float64

Body::Union{Float64, Int64}
1 ─ %1 = (x > 0)::Bool
└──      goto #3 if not %1
2 ─      return x
3 ─      return 0


In [58]:
@code_warntype pos_unstable(-42.0)

Variables
  #self#::Core.Const(pos_unstable)
  x::Float64

Body::Union{Float64, Int64}
1 ─ %1 = (x > 0)::Bool
└──      goto #3 if not %1
2 ─      return x
3 ─      return 0


For the same function with an `Int64` input there is no red warning: both branches return an `Int64`, so the function is type-stable for that input type.

In [59]:
@code_warntype pos_unstable(42)

Variables
  #self#::Core.Const(pos_unstable)
  x::Int64

Body::Int64
1 ─ %1 = (x > 0)::Bool
└──      goto #3 if not %1
2 ─      return x
3 ─      return 0


In [60]:
@code_warntype pos_unstable(-42)

Variables
  #self#::Core.Const(pos_unstable)
  x::Int64

Body::Int64
1 ─ %1 = (x > 0)::Bool
└──      goto #3 if not %1
2 ─      return x
3 ─      return 0


When the function has no type instability, `@code_warntype` shows no warning, and the inferred return type matches the input type whatever the input's value.

In [61]:
@code_warntype pos_stable(-42.0)

Variables
  #self#::Core.Const(pos_stable)
  x::Float64

Body::Float64
1 ─ %1 = (x > 0)::Bool
└──      goto #3 if not %1
2 ─      return x
3 ─ %4 = Main.zero(x)::Core.Const(0.0)
└──      return %4


In [62]:
@code_warntype pos_stable(-42)

Variables
  #self#::Core.Const(pos_stable)
  x::Int64

Body::Int64
1 ─ %1 = (x > 0)::Bool
└──      goto #3 if not %1
2 ─      return x
3 ─ %4 = Main.zero(x)::Core.Const(0)
└──      return %4


In [63]:
@code_warntype pos_stable(42.0)

Variables
  #self#::Core.Const(pos_stable)
  x::Float64

Body::Float64
1 ─ %1 = (x > 0)::Bool
└──      goto #3 if not %1
2 ─      return x
3 ─ %4 = Main.zero(x)::Core.Const(0.0)
└──      return %4
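
`@code_warntype` needs a human to read its output. For an automated check, the `Test` standard library offers `@inferred`, which throws an error when the actual return type differs from the inferred one. A minimal sketch that redefines both functions so it is self-contained:

```julia
using Test  # standard library; provides @inferred and @test_throws

pos_unstable(x) = x > 0 ? x : 0
pos_stable(x)   = x > 0 ? x : zero(x)

# Passes: the inferred return type (Float64) matches the actual one
@inferred pos_stable(-42.0)

# Also passes: for Int inputs both branches of pos_unstable return an Int
@inferred pos_unstable(-42)

# Throws: for a Float64 input, inference can only narrow the result to a Union
@test_throws ErrorException @inferred pos_unstable(-42.0)
```

This mirrors what the outputs above show: the instability only appears for input types whose zero is not the `Int` literal `0`.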



#### KR6: Demonstration of how Arrays containing ambiguous/abstract types often results to slow execution of codes. The BenchmarkTools may be useful in this part.

In [65]:
using BenchmarkTools

We know that `Number` is a more abstract type and `Int64` is a more specific type so we can compare two arrays with different declared types.

In [66]:
abstract_ = Number[1,2,3,4]

4-element Vector{Number}:
 1
 2
 3
 4

In [67]:
concrete = Int64[1,2,3,4]

4-element Vector{Int64}:
 1
 2
 3
 4
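
Before timing anything, it can help to confirm which of the two arrays has a concrete element type. A short sketch using `isconcretetype` and `isbitstype` from Base, with the same two arrays redefined so it runs on its own:

```julia
abstract_ = Number[1, 2, 3, 4]
concrete  = Int64[1, 2, 3, 4]

# A concrete element type lets the compiler know the memory layout of every element
@show isconcretetype(eltype(abstract_))
@show isconcretetype(eltype(concrete))

# Int64 is a plain-bits type, so the values can be stored inline in the array
@show isbitstype(eltype(concrete))

# Both arrays hold the same values, so the sums agree
@show sum(abstract_) == sum(concrete)
```

With `Vector{Number}` each element's type has to be looked up at run time, which is the overhead the benchmarks measure.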

The simplest test is to sum the contents of both arrays. Even with only four elements each, the benchmarks show the concretely typed array is roughly 3.7x faster (median 24.9 ns versus 92.7 ns).

In [68]:
@benchmark sum(abstract_)

BenchmarkTools.Trial: 10000 samples with 957 evaluations.
 Range (min … max):  92.258 ns … 220.446 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     92.698 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   92.951 ns ±   3.206 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

In [69]:
@benchmark sum(concrete)

BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min … max):  24.889 ns … 114.379 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     24.917 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   25.060 ns ±   1.458 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

I also wanted to test whether specifying field types speeds up constructing a struct.

In [70]:
struct point
    x::Float64
    y::Float64
end

In [71]:
@benchmark point(rand(), rand())

BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  14.729 ns … 167.199 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     15.615 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   16.105 ns ±   2.395 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

In [72]:
struct point_
    x
    y
end

In [73]:
@benchmark point_(rand(), rand())

BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min … max):  25.085 ns …  2.209 μs  ┊ GC (min … max): 0.00% … 98.47%
 Time  (median):     28.275 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   29.929 ns ± 49.758 ns  ┊ GC (mean ± σ):  4.37% ±  2.60%

Based on this, even with the same input types, declaring `Float64` for the struct fields gives roughly a 1.8x speedup (median 15.6 ns versus 28.3 ns) and avoids the garbage-collection time seen in the untyped version.
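
The difference can be traced to what the compiler knows about each field. A sketch that inspects the field types with `fieldtypes` and `isbitstype`, using hypothetical struct names of my own that mirror `point` and `point_`, plus a parametrized variant as one possible way to keep flexibility without untyped fields:

```julia
# Hypothetical names mirroring `point` and `point_` from the cells above
struct PointTyped
    x::Float64
    y::Float64
end

struct PointUntyped
    x
    y
end

# Assumed alternative: a type parameter keeps the fields concrete for each T
struct PointParam{T<:Real}
    x::T
    y::T
end

@show fieldtypes(PointTyped)     # concrete field types
@show fieldtypes(PointUntyped)   # fields without annotations default to Any
@show isbitstype(PointTyped)
@show isbitstype(PointUntyped)
@show isbitstype(PointParam{Float64})
```

Untyped fields are `Any`, so their values are stored as references to boxed objects, which is consistent with the garbage-collection activity in the second benchmark.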