# Custom Types

## Defining data types

We can define types (i.e. data structures) ourselves using the `struct` keyword.

It is a convention that type names are capitalized and [camel cased](https://en.wikipedia.org/wiki/Camel_case).

(Note that types cannot be redefined - you have to restart your Julia session to change a type definition.)

In [None]:
struct MyType end

To create an object of type `MyType` we have to call a [constructor](https://docs.julialang.org/en/latest/manual/constructors/). Loosely speaking, a constructor is a function that creates new objects.

Julia automatically creates a trivial constructor for us, which has the same name as the type.

In [None]:
methods(MyType)

In [None]:
m = MyType()

In [None]:
typeof(m)

In [None]:
m isa MyType

Since our `MyType` contains no data (it is a so-called *singleton type*), we can basically only use it for dispatch.
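To make "use it for dispatch" concrete, here is a minimal sketch. The types `Fast` and `Safe` and the function `strategy` are hypothetical names chosen only for this illustration.

```julia
# Hypothetical singleton types that act as "tags"
struct Fast end
struct Safe end

# One method per tag; dispatch picks the method from the argument's type
strategy(::Fast) = "skip the checks"
strategy(::Safe) = "run all checks"

strategy(Fast())      # "skip the checks"
Fast() === Fast()     # true: a singleton type has exactly one instance
```

The object carries no data, so all the information sits in its type.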

Most of the time, we'll want a self-defined type to hold some data. For this, we need *fields*.

In [None]:
struct A
    x::Int64
end

In [None]:
A() # errors: there is no zero-argument constructor

The default constructor always expects values for all fields.

In [None]:
A(3)
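If we want to call a constructor with fewer arguments, we can add a method ourselves. The following is only a sketch with a hypothetical type `Point2`. Constructors are ordinary functions, so the extra method is defined like any other.

```julia
# Hypothetical two-field type, used only for illustration
struct Point2
    x::Int64
    y::Int64
end

# Additional ("outer") constructor that supplies a default for `y`
Point2(x) = Point2(x, 0)

p = Point2(3)
(p.x, p.y)    # (3, 0)
```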

In [None]:
a = A(3)

In [None]:
# a.<TAB>
a.x

Note that types defined with `struct` are **immutable**, that is, the values of their fields cannot be changed.

In [None]:
a.x = 2 # errors: fields of an immutable struct cannot be reassigned

In [None]:
mutable struct B
    x::Int64
end

In [None]:
b = B(3)

In [None]:
b.x

In [None]:
b.x = 4

In [None]:
b.x

Note, however, that **immutability is not recursive**.

In [None]:
struct C
    x::Vector{Int64}
end

In [None]:
c = C([1, 2, 3])

In [None]:
c.x

In [None]:
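c.x = [3,4,5] # errors: the field cannot be bound to a new vector

In [None]:
c.x[1] = 3 # works: the vector stored in the field is itself mutable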

In [None]:
c.x

In [None]:
c.x .= [3,4,5] # dot to perform the assignment element-wise
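A consequence worth seeing once: the struct only stores a reference to the vector, so changes made through another variable are visible inside the struct. This is a small sketch with a hypothetical type `Holder`.

```julia
# Hypothetical immutable type holding a mutable vector
struct Holder
    values::Vector{Int64}
end

v = [1, 2, 3]
h = Holder(v)       # no copy is made: `h.values` and `v` are the same array

v[1] = 10           # mutate the array through the outer variable
h.values[1]         # 10
h.values === v      # true
```

If independent storage is needed, we can pass `copy(v)` to the constructor instead.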

Abstract types are just as easy to define using the keyword `abstract type`.

In [None]:
abstract type MyAbstractType end

Since abstract types don't have fields, they only (informally) define interfaces and can be used for dispatch.

In [None]:
struct MyConcreteType <: MyAbstractType # subtype
    somefield::String
end

In [None]:
c = MyConcreteType("test")

In [None]:
c isa MyAbstractType

In [None]:
supertype(MyConcreteType)

In [None]:
subtypes(MyAbstractType)
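To illustrate how an abstract type can serve as an informal interface, consider the following sketch. All names (`Shape`, `Circle`, `Square`, `area`, `describe`, `total_area`) are hypothetical. Each concrete subtype implements `area`, and generic code is written once against `Shape`.

```julia
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

struct Square <: Shape
    side::Float64
end

# The informal interface: every Shape is expected to implement `area`
area(c::Circle) = pi * c.r^2
area(s::Square) = s.side^2

# Generic code that works for any subtype of Shape
describe(s::Shape) = string(nameof(typeof(s)), " with area ", area(s))
total_area(shapes) = sum(area, shapes)

describe(Square(2.0))                     # "Square with area 4.0"
total_area([Circle(1.0), Square(2.0)])    # pi + 4
```

Nothing enforces the interface. A subtype that forgets to define `area` only fails with a `MethodError` once `area` is called on it.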

## Example: Diagonal Matrix

In [None]:
struct DiagonalMat
    diag::Vector{Float64}
end

In [None]:
DiagonalMat([1.2,4.3,5.0])

### Arithmetic

In [None]:
import Base: +, -, *, /

+(Da::DiagonalMat, Db::DiagonalMat) = DiagonalMat(Da.diag + Db.diag)
-(Da::DiagonalMat, Db::DiagonalMat) = DiagonalMat(Da.diag - Db.diag)
*(Da::DiagonalMat, Db::DiagonalMat) = DiagonalMat(Da.diag .* Db.diag)
/(Da::DiagonalMat, Db::DiagonalMat) = DiagonalMat(Da.diag ./ Db.diag)

In [None]:
D1 = DiagonalMat([1,2,3])
D2 = DiagonalMat([2.4,1.9,5.7])

In [None]:
D1 + D2

In [None]:
D1 - D2

In [None]:
D1 * D2

In [None]:
D1 / D2

Arithmetic involving other types:

In [None]:
# Number
*(x::Number, D::DiagonalMat) = DiagonalMat(x * D.diag)
*(D::DiagonalMat, x::Number) = DiagonalMat(D.diag * x)
/(D::DiagonalMat, x::Number) = DiagonalMat(D.diag / x)

# Vector
*(D::DiagonalMat, V::AbstractVector) = D.diag .* V

In [None]:
D1 * 2

In [None]:
D1 * rand(3)

Note that some functions already work for our `DiagonalMat`:

In [None]:
sum([D1, D2])
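The reason is that `sum` is written generically: for a non-empty collection it only needs `+` to be defined for the elements. The same effect can be sketched with a hypothetical type `Cents`.

```julia
# Hypothetical type; the only thing we teach Julia about it is `+`
struct Cents
    n::Int
end

Base.:+(a::Cents, b::Cents) = Cents(a.n + b.n)

total = sum([Cents(150), Cents(250), Cents(100)])
total.n    # 500
```

Note that summing an empty collection would additionally require a `zero` method, which this sketch does not define.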

### Parameterization

In [None]:
DiagonalMat([1,2,3]) # implicit conversion to Vector{Float64}

In [None]:
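DiagonalMat([1+3im, 4-2im, im]) # errors: complex entries cannot be converted to Float64

In [None]:
DiagonalMat(["Why", "not", "support", "strings?"]) # errors: strings cannot be converted to Float64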

We can easily relax our type definition to allow all sorts of internal value types.

In [None]:
struct DiagonalMatParam{T, V<:AbstractVector{T}}
    diag::V
end

# copied from above
import Base: +, -, *, /
+(Da::DiagonalMatParam, Db::DiagonalMatParam) = DiagonalMatParam(Da.diag + Db.diag)
-(Da::DiagonalMatParam, Db::DiagonalMatParam) = DiagonalMatParam(Da.diag - Db.diag)
*(Da::DiagonalMatParam, Db::DiagonalMatParam) = DiagonalMatParam(Da.diag .* Db.diag)
/(Da::DiagonalMatParam, Db::DiagonalMatParam) = DiagonalMatParam(Da.diag ./ Db.diag)
# Number
*(x::Number, D::DiagonalMatParam) = DiagonalMatParam(x * D.diag)
*(D::DiagonalMatParam, x::Number) = DiagonalMatParam(D.diag * x)
/(D::DiagonalMatParam, x::Number) = DiagonalMatParam(D.diag / x)
# Vector
*(D::DiagonalMatParam, V::AbstractVector) = D.diag .* V

In [None]:
DiagonalMatParam([1+3im, 4-2im, im])

In [None]:
DiagonalMatParam(["This ", "just "]) * DiagonalMatParam(["should", "work!"])
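The two type parameters are inferred from the argument, and methods can read them from the type signature. Here is a minimal sketch with a hypothetical type `Wrapper` that mirrors the parameter structure used above. The helper functions `elementtype` and `storagetype` are also made up for this illustration.

```julia
# Same parameter structure as above: T is the element type, V the storage type
struct Wrapper{T, V<:AbstractVector{T}}
    data::V
end

# Methods can extract the parameters with `where`
elementtype(::Wrapper{T}) where {T} = T
storagetype(::Wrapper{T, V}) where {T, V} = V

w_int = Wrapper([1, 2, 3])
w_rng = Wrapper(1.0:0.5:2.0)    # ranges are AbstractVectors, too

elementtype(w_int)    # Int64 (on a 64-bit system)
storagetype(w_int)    # Vector{Int64} (on a 64-bit system)
elementtype(w_rng)    # Float64
```

Because the storage type is a parameter, the field stays concretely typed for every choice of vector, which is what keeps such a wrapper fast.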

### `AbstractArray` interface

Let's integrate our diagonal matrix into Julia's type hierarchy by subtyping `AbstractMatrix`. Of course, we should then also implement the [`AbstractArray` interface](https://docs.julialang.org/en/latest/manual/interfaces/#man-interface-array-1)!

In [None]:
struct DiagonalMatrix{T, V<:AbstractVector{T}} <: AbstractMatrix{T}
    diag::V
end

In [None]:
# implement AbstractArray interface
Base.size(D::DiagonalMatrix) = (length(D.diag), length(D.diag))
function Base.getindex(D::DiagonalMatrix{T,V}, i::Int, j::Int) where {T,V}
    if i == j
        r = D.diag[i]
    else
        r = zero(T)
    end
    r
end
function Base.setindex!(D::DiagonalMatrix, v, i::Int, j::Int)
    if i == j
        D.diag[i] = v
    else
        throw(ArgumentError("cannot set off-diagonal entry ($i, $j)"))
    end
    return v
end

In [None]:
D = DiagonalMatrix([1,2,3])

Note how it's automagically pretty printed!

In [None]:
D * D

In [None]:
D + D

In [None]:
D - D

In [None]:
D / D

Basic arithmetic **just works!** What about broadcasting and more complicated functions?

In [None]:
sin.(D)

In [None]:
sum([D, D, D])

In [None]:
using LinearAlgebra
eigen(D)
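The same recipe works for other array-like types. As a sketch, a hypothetical `ConstantVector` (not part of Julia) only defines `size` and `getindex` and then inherits generic functionality such as `sum`, broadcasting and `collect`.

```julia
# Hypothetical vector whose entries are all equal; stores one value and a length
struct ConstantVector{T} <: AbstractVector{T}
    value::T
    len::Int
end

Base.size(v::ConstantVector) = (v.len,)

function Base.getindex(v::ConstantVector, i::Int)
    checkbounds(v, i)    # relies on `size` to validate the index
    return v.value
end

v = ConstantVector(2.5, 4)
sum(v)        # 10.0
v .+ 1        # generic broadcasting returns an ordinary Vector
collect(v)    # materialize as a Vector{Float64}
```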

It is still advantageous to define fast versions that utilize the special diagonal structure:

In [None]:
@which D + D

In [None]:
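+(Da::DiagonalMatrix, Db::DiagonalMatrix) = DiagonalMatrix(Da.diag + Db.diag)
*(Da::DiagonalMatrix, Db::DiagonalMatrix) = DiagonalMatrix(Da.diag .* Db.diag)
# scaling by a number should also preserve the diagonal structure
*(x::Number, D::DiagonalMatrix) = DiagonalMatrix(x * D.diag)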

In [None]:
@which D + D

An important thing to note is that **user defined types are just as good as built-in types**!

There is nothing special about built-in types. In fact, [they are implemented in precisely the same way](https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/diagonal.jl#L5)!

Let us quickly confirm that our `DiagonalMatrix` type does not come with any performance overhead by benchmarking it in a simple function.

# Benchmarking with `BenchmarkTools.jl`

In [None]:
using BenchmarkTools

In [None]:
operation(x) = x + 2*x

In [None]:
x = rand(2,2)
@time operation.(x)

In [None]:
function f()
    x = rand(2,2)
    @time operation.(x)
end

In [None]:
f()

We should wrap benchmarks into functions!

Fortunately, there are tools that do this for us. In addition, they also collect some statistics by running the benchmark multiple times.

In [None]:
@benchmark operation.(x)

Typically we don't need all this information. Just use `@btime` instead of `@time`!

In [None]:
@btime operation.(x);

However, we still have to take some care to avoid accessing global variables.

In [None]:
@btime operation.($x); # interpolate the value of x into the expression to avoid overhead of globals

More information: [BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl/blob/master/doc/manual.md).
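Why globals matter can also be sketched without any packages. The function names below are hypothetical, and `@elapsed` from `Base` gives only a rough single measurement, so the numbers will vary between runs and machines.

```julia
data = rand(1000)    # non-constant global: its type could change at any time

# Reads the global directly, so the compiler cannot assume the type of `data`
function sum_global()
    s = 0.0
    for v in data
        s += v
    end
    return s
end

# Receives the array as an argument, so the element type is known when compiling
function sum_arg(xs)
    s = 0.0
    for v in xs
        s += v
    end
    return s
end

# Call each function once so that compilation is not part of the measurement
sum_global(); sum_arg(data)

t_global = @elapsed sum_global()
t_arg    = @elapsed sum_arg(data)
println("global: ", t_global, " s, argument: ", t_arg, " s")
```

Both functions compute the same sum. The version that reads the global is typically slower, and that overhead is what interpolating with `$` removes from a benchmark.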

Finally, we can check the performance of our custom `DiagonalMatrix` type.

In [None]:
using LinearAlgebra
x = rand(100);
Djl = Diagonal(x)
D = DiagonalMatrix(x)
@btime operation($Djl); # interpolate the globals, as above
@btime operation($D);

# Core messages of this Notebook

* **User defined types are as good as built-in types.**
* There are `mutable struct`s and immutable `struct`s, and immutability is not recursive.
* We can easily **extend `Base` functions** for our types to implement arithmetic and such.
* **Subtyping an existing interface** can give lots of functionality for free.
* We should always benchmark our code with **BenchmarkTools.jl's @btime and @benchmark**.

# Exercise: One-hot vector

[One-hot encoding](https://en.wikipedia.org/wiki/One-hot) is useful in machine learning, as we'll see later.

It simply means that among a group of bits (all either 0 or 1) only one is hot (1) while all others are cold (0), for example:

`v = [0, 0, 0, 0, 0, 1, 0, 0, 0]`

### Task

1. Think about what information an implementation of a one-hot vector actually has to store.
2. Define a `OneHot` type which represents a vector with only a single hot (i.e. `== 1`) bit.
3. Extend all the necessary `Base` functions such that the following computation works for a matrix `A` and a vector of `OneHot` vectors `vs` (i.e. `vs isa Vector{OneHot}`).

    ```julia
    function innersum(A, vs)
        t = zero(eltype(A)) # generic!
        for v in vs
            y = A*v
            for i in 1:length(vs[1])
                t += v[i] * y[i]
            end
        end
        return t
    end

    A = rand(3,3)
    vs = [rand(3) for _ = 1:10] # This should be replaced by a `Vector{OneHot}`

    innersum(A, vs)

    ```

4. Benchmark the speed of `innersum` when `vs` is a vector of `OneHot` vectors or a vector of `Vector{Float64}`s, respectively.
 * Do you observe a speed up?


5. Now, define a `OneHotVector` type which is identical to `OneHot` but is declared to be a subtype of `AbstractVector{Bool}` and extend only the functions `Base.getindex(v::OneHotVector, i::Int)` and `Base.size(v::OneHotVector)`.
 * Here, the function `size` should return a `Tuple{Int64}` indicating the length of the vector, i.e. `(3,)` for a one-hot vector of length 3 (see the [AbstractArray interface](https://docs.julialang.org/en/latest/manual/interfaces/#man-interface-array-1) for more information)
 

6. Try to create a single `OneHotVector` and try to run the `innersum` function using the new `OneHotVector` type.
 * What changes do you observe?
 * Do you have to implement any further methods?