## Data Types, Methods, and Introspection

## Data Types

Every data type is a first-class citizen. Types live in a tree, which can be interrogated using the `subtypes` function.
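To make "first class" concrete, here is a minimal sketch showing that a type can be bound to a variable, passed to a function, and inspected like any other value. The names `T`, `x`, and `v` are illustrative only.

```julia
T = Float64        # a type bound to an ordinary variable
x = T(3)           # the variable can be used as a constructor
v = zeros(T, 2)    # the type can be passed as a function argument

typeof(x) === T    # true
typeof(T)          # DataType: a type is itself a value with a type
```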

Abstract types have subtypes:

In [62]:
subtypes(Number)

2-element Vector{Any}:
 Complex
 Real

In [63]:
subtypes(Real)

4-element Vector{Any}:
 AbstractFloat
 AbstractIrrational
 Integer
 Rational

Concrete data types don't have subtypes:

In [55]:
subtypes(Int64)

Type[]

For example, all numeric data types in Julia form this tree:
![Datatype tree for Julia Number abstract type](https://upload.wikimedia.org/wikipedia/commons/4/40/Type-hierarchy-for-julia-numbers.png)
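The tree can also be walked upward, from a concrete type to the root. The sketch below uses `supertype`, `isabstracttype`, and `isconcretetype` from Base; `supertype_chain` is a hypothetical helper written for this example, not part of Julia.

```julia
# Collect the chain of supertypes from `T` up to the root type `Any`.
function supertype_chain(T::Type)
    chain = Type[T]
    while T !== Any
        T = supertype(T)
        push!(chain, T)
    end
    return chain
end

chain = supertype_chain(Int64)   # Int64, Signed, Integer, Real, Number, Any

isconcretetype(Int64)   # true: values can have this type
isabstracttype(Real)    # true: only a node in the tree
Int64 <: Number         # true: the subtype operator tests ancestry
```

Note that outside the REPL or a notebook, `subtypes` requires `using InteractiveUtils`, whereas `supertype` is always available.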

While type annotations are not strictly _necessary_, they are helpful for:
1. helping the compiler optimize code
2. providing meaningful error messages

Let's define `fib_1` without type annotations and call it on a string:

In [1]:
function fib_1(n)
    if n <= 2
        return 1
    end

    fib_1(n - 1) + fib_1(n - 2)
end

fib_1 (generic function with 1 method)

In [65]:
fib_1("32.")

LoadError: MethodError: no method matching isless(::String, ::Int64)
Closest candidates are:
  isless(::AbstractFloat, ::Real) at /home/linuxbrew/.linuxbrew/Cellar/julia/1.7.2/share/julia/base/operators.jl:186
  isless(::AbstractString, ::AbstractString) at /home/linuxbrew/.linuxbrew/Cellar/julia/1.7.2/share/julia/base/strings/basic.jl:344
  isless(::Real, ::Real) at /home/linuxbrew/.linuxbrew/Cellar/julia/1.7.2/share/julia/base/operators.jl:430
  ...

In [66]:
function fib_2(n::Number)
    n <= 2 && return 1
    fib_2(n - 1) + fib_2(n - 2)
end

fib_2 (generic function with 1 method)

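Annotating the argument as `n::Number` limits the inputs to numeric types (both `Int` and `Float64` are subtypes of the abstract type `Number`), so the error now names `fib_2` itself rather than an internal call to `isless`: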

In [67]:
fib_2("32.")

LoadError: MethodError: no method matching fib_2(::String)
Closest candidates are:
  fib_2(::Number) at In[66]:1
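As a variation on the same idea, the annotation can be tightened further. The sketch below assumes we only want integer inputs; `fib_typed` is a hypothetical name, and the `try`/`catch` is only there to show that the rejection is an ordinary `MethodError`.

```julia
# Same recursion as above, but restricted to integer arguments.
function fib_typed(n::Integer)
    n <= 2 && return 1
    return fib_typed(n - 1) + fib_typed(n - 2)
end

result = fib_typed(10)   # 55

# A string is rejected by dispatch before the function body runs.
err = try
    fib_typed("32.")
catch e
    e
end

err isa MethodError      # true
```

With `n::Integer`, a call such as `fib_typed(10.0)` is rejected in the same way, which `n::Number` would have allowed.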

## Methods

Think of _functions_ as ideas. The concrete implementations of a function are its _methods_.

For example, take the idea "something that doubles just the part of the number in front of the decimal point", so that `double_int(10) = 20` and `double_int(10.1) = 20.1`. We can implement this with one method per input type:
1. If the input is an integer, double it.
2. If the input is a floating-point value, compute the integer part, double it, and add back the original fractional remainder:

In [40]:
function double_int(x::Int)
    return 2*x
end

function double_int(x::AbstractFloat)
    y = floor(Int, x)
    r = x - y
    return 2*y + r
end

double_int (generic function with 2 methods)

In [30]:
double_int(10)

20

In [31]:
double_int(10.1)

20.1

We can list the methods for a function using the `methods` function:

In [32]:
methods(double_int)
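Dispatch can also be queried programmatically. The sketch below repeats the two `double_int` methods so that it runs on its own, then uses `methods`, `which`, and `hasmethod` from Base; the variable names `m_f32`, `m_f64`, and `m_int` are illustrative only.

```julia
# The two methods from above, repeated so this block is self-contained.
double_int(x::Int) = 2 * x

function double_int(x::AbstractFloat)
    y = floor(Int, x)
    return 2 * y + (x - y)
end

n_methods = length(methods(double_int))   # 2

# `which` returns the method that dispatch would select for the given argument types.
m_f32 = which(double_int, (Float32,))
m_f64 = which(double_int, (Float64,))
m_int = which(double_int, (Int,))

m_f32 === m_f64   # true: both floats use the AbstractFloat method
m_f32 === m_int   # false: integers use the other method

hasmethod(double_int, Tuple{String})   # false: no method accepts a String
```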

## Introspection

We may also inspect the details of the code using code introspection: https://docs.julialang.org/en/v1/devdocs/reflection/#Reflection-and-introspection

The `@code_lowered` macro gives us a (still somewhat abstract) idea of what Julia actually _does_.

In [33]:
@code_lowered double_int(2)

CodeInfo(
1 ─ %1 = 2 * x
└──      return %1
)

This picks up the method for `x` as an integer, and similarly we can see what Julia does when `x` is a float:

In [34]:
@code_lowered double_int(2.1)

CodeInfo(
1 ─      y = Main.floor(Main.Int, x)
│        r = x - y
│   %3 = 2 * y
│   %4 = %3 + r
└──      return %4
)

And `@code_llvm` shows the LLVM IR:

In [41]:
@code_llvm double_int(2)

;  @ In[40]:1 within `double_int`
define i64 @julia_double_int_2028(i64 signext %0) #0 {
top:
;  @ In[40]:2 within `double_int`
; ┌ @ int.jl:88 within `*`
   %1 = shl i64 %0, 1
; └
  ret i64 %1
}


We can see that Julia generates _different_ LLVM IR depending on the data types:

In [42]:
@code_llvm double_int(2.1)

;  @ In[40]:5 within `double_int`
define double @julia_double_int_2030(double %0) #0 {
top:
  %1 = alloca [3 x {}*], align 8
  %gcframe4 = alloca [3 x {}*], align 16
  %gcframe4.sub = getelementptr inbounds [3 x {}*], [3 x {}*]* %gcframe4, i64 0, i64 0
  %.sub = getelementptr inbounds [3 x {}*], [3 x {}*]*
[output truncated]

Julia compiles separate, specialized machine code for each combination of input types. For more information, see:
https://docs.julialang.org/en/v1/manual/integers-and-floating-point-numbers/#Integers-and-Floating-Point-Numbers and https://docs.julialang.org/en/v1/manual/types/
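The same information is available as ordinary function calls, which is handy in scripts. The sketch below is one possible approach: it repeats the `double_int` methods so it runs on its own, asks the compiler for the inferred return type with `Base.return_types`, and captures the LLVM IR as a string with `code_llvm` from the `InteractiveUtils` standard library. The variable names are illustrative only.

```julia
using InteractiveUtils   # provides code_llvm and the @code_* macros outside the REPL

double_int(x::Int) = 2 * x

function double_int(x::AbstractFloat)
    y = floor(Int, x)
    return 2 * y + (x - y)
end

# Inferred return type for each argument type.
rt_int   = Base.return_types(double_int, (Int,))       # one entry: Int
rt_float = Base.return_types(double_int, (Float64,))   # one entry: Float64

# Capture the IR as a String instead of printing it.
ir_int = sprint(code_llvm, double_int, (Int,))
```

The exact IR text depends on the Julia version and platform, so it is best treated as something to read rather than to compare verbatim.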

## Performance Benchmarking and Type Stability

Here is why it pays to be explicit about data types: whenever a variable's type "morphs" into another (for example, when dividing an integer with `/` produces a float), the compiler has to generate extra code to accommodate the type instability. It boils down to having to treat otherwise simple variables as more complex objects.

For example:

In [44]:
function t1(n)
    s = 1
    for i in 1:n
        s /= rand()  ## WARNING: unstable type!
    end
    s
end

t1 (generic function with 1 method)

In [45]:
function t2(n)
    s = 1.      ## Stable type
    for i in 1:n
        s /= rand()
    end
    s
end

t2 (generic function with 1 method)

In `t1`, `s` starts out as an integer but becomes a float after the first division, so the compiler can't assign `s` a single concrete type ahead of time!

Let's see how this can affect runtime:

In [46]:
using BenchmarkTools

In [47]:
@benchmark t1(10)

BenchmarkTools.Trial: 10000 samples with 989 evaluations.
 Range (min … max):  45.681 ns … 98.027 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     46.075 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   46.836 ns ±  3.737 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

[histogram of sample times omitted]

In [48]:
@benchmark t2(10)

BenchmarkTools.Trial: 10000 samples with 996 evaluations.
 Range (min … max):  22.531 ns … 41.823 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     22.626 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   22.768 ns ±  1.208 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

[histogram of sample times omitted]

The `@code_warntype` macro shows how stable the inferred data types are:

In [None]:
@code_warntype t1(10)

MethodInstance for t1(::Int64)
  from t1(n) in Main at In[44]:1
Arguments
  #self#::Core.Const(t1)
  n::Int64
Locals
  @_3::Union{Nothing, Tuple{Int64, Int64}}
  s::Union{Float64, Int64}
  i::Int64
Body::Union{Float64, Int64}
1 ─       (s = 1)
│   %2  = (1:n)::Core.PartialStruct(UnitRange{Int64}, Any[Core.Const(1), Int64])
│         (@_3 = Base.iterate(%2))
│   %4  = (@_3 === nothing)::Bool
│   %5  = Base.not_int(%4)::Bool
└──       goto #4 if not %5
2 ┄ %7  = @_3::Tuple{Int64, Int64}
│         (i = Core.getfield(%7, 1))
│   %9  = Core.getfield(%7, 2)::Int64
│   %10 = s::Union{Float64, Int64}
│   %11 = Main.rand()::Float64
│         (s = %10 / %11)
│         (@_3 = Base.iterate(%2, %9))
[output truncated]

The `Union{Float64, Int64}` data type is a red flag: at this point in the code, `s` may be either a `Float64` or an `Int64`, so the generated code has to handle both cases.

In [None]:
@code_warntype t2(10)

MethodInstance for t2(::Int64)
  from t2(n) in Main at In[45]:1
Arguments
  #self#::Core.Const(t2)
  n::Int64
Locals
  @_3::Union{Nothing, Tuple{Int64, Int64}}
  s::Float64
  i::Int64
Body::Float64
1 ─       (s = 1.0)
│   %2  = (1:n)::Core.PartialStruct(UnitRange{Int64}, Any[Core.Const(1), Int64])
│         (@_3 = Base.iterate(%2))
│   %4  = (@_3 === nothing)::Bool
│   %5  = Base.not_int(%4)::Bool
└──       goto #4 if not %5
2 ┄ %7  = @_3::Tuple{Int64, Int64}
│         (i = Core.getfield(%7, 1))
│   %9  = Core.getfield(%7, 2)::Int64
│   %10 = s::Float64
│   %11 = Main.rand()::Float64
│         (s = %10 / %11)
│         (@_3 = Base.iterate(%2, %9))
│   %14 = (@_3 === nothing)::Bool
│   %15 = B
[output truncated]

The function `t2` is type stable: no variable changes its data type as the function runs.
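The same check can be automated instead of read off by eye. The sketch below is one possible approach, not the only one: it repeats `t1` and `t2` so it runs on its own, asks for the inferred return type with `Base.return_types`, and uses `@inferred` from the `Test` standard library. The names `rt1` and `rt2` are illustrative only.

```julia
using Test   # provides @inferred

function t1(n)
    s = 1            # starts as Int, becomes Float64 inside the loop
    for i in 1:n
        s /= rand()
    end
    s
end

function t2(n)
    s = 1.0          # Float64 throughout
    for i in 1:n
        s /= rand()
    end
    s
end

rt1 = only(Base.return_types(t1, (Int,)))   # Union{Float64, Int64}
rt2 = only(Base.return_types(t2, (Int,)))   # Float64

isconcretetype(rt1)   # false: the unstable version
isconcretetype(rt2)   # true: the stable version

# @inferred returns the value when the inferred and actual types agree,
# and throws an error otherwise (as it would for `@inferred t1(10)`).
@inferred t2(10)
```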