# Exercise: One-Hot Vector

[One-hot encoding](https://en.wikipedia.org/wiki/One-hot): In a group of bits (each either `0` or `1`), exactly one is hot (`1`) while all others are cold (`0`).

Example: `v = [0, 0, 0, 0, 0, 1, 0, 0, 0]`

One-hot encoding is used in machine learning, in particular in classification tasks with a finite set of categories (think of [MNIST](https://en.wikipedia.org/wiki/MNIST_database), where each image shows a digit from 0 to 9).
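
As a small illustration, a digit label could be turned into a dense one-hot vector roughly as follows. The helper name `dense_onehot` is made up for this sketch and is not part of the exercise; the sketch assumes that digit `d` maps to index `d + 1`, since Julia arrays are 1-based.

```julia
# Hypothetical helper (not part of the exercise): build a dense one-hot
# Vector{Bool} of length `nclasses` with a single `true` at index `hot`.
dense_onehot(hot::Int, nclasses::Int) = [i == hot for i in 1:nclasses]

digit = 5                          # class label, e.g. an MNIST digit
v = dense_onehot(digit + 1, 10)    # assumption: digits 0–9 map to indices 1–10

count(v)       # number of hot bits
findfirst(v)   # index of the hot bit
```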

### Task

1. Think about what information an implementation of a one-hot vector actually has to store.
2. Using `struct`, define a `OneHot` type which represents a vector with only a single hot (i.e. `== 1`) bit.
3. Extend all the necessary `Base` functions such that the following computation works for a matrix `A` and a vector of `OneHot` vectors `vs` (i.e. `vs isa Vector{OneHot}`).

    ```julia
    function innersum(A, vs)
        t = zero(eltype(A))
        for v in vs
            y = A*v
            for i in 1:length(vs[1])
                t += v[i] * y[i]
            end
        end
        return t
    end

    A = rand(3,3)
    vs = [rand(3) for i in 1:10] # This should be replaced by a `Vector{OneHot}`

    innersum(A, vs)

    ```

4. Benchmark `innersum` when called with a vector of `OneHot` vectors (e.g. `vs = [OneHot(3, rand(1:3)) for i in 1:10]`) and when called with a vector of ordinary `Vector{Float64}`s.
    - What do you observe?


5. Now, define a `OneHotVector` type which is identical to `OneHot` but is declared to be a subtype of `AbstractVector{Bool}` and extend only the functions `Base.getindex(v::OneHotVector, i::Int)` and `Base.size(v::OneHotVector)`.
    - Here, the function `size` should return a `Tuple{Int64}` indicating the length of the vector, i.e. `(3,)` for a one-hot vector of length 3.
 

6. Try to create a single `OneHotVector` and try to run the `innersum` function using the new `OneHotVector` type.
    - What changes do you observe?
    - Do you have to implement any further methods?

In [17]:
function innersum(A, vs)
     t = zero(eltype(A))
     for v in vs
         y = A*v
         for i in 1:length(vs[1])
             t += v[i] * y[i]
         end
     end
     return t
 end

innersum (generic function with 1 method)

### Your solution:

In [2]:
#...

In [3]:
struct OneHot
    len::Int
    ind::Int
end
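
To see why two integers are enough, the following sketch compares the memory footprint of such a struct with that of a dense vector. `DemoOneHot` is a stand-in name chosen for this sketch so that it does not clash with `OneHot` above.

```julia
# Stand-in for the `OneHot` struct above (hypothetical name).
struct DemoOneHot
    len::Int   # total number of bits
    ind::Int   # position of the single hot bit
end

compact = DemoOneHot(1000, 42)
dense   = [i == 42 for i in 1:1000]   # Vector{Bool} storing all 1000 bits

sizeof(compact)   # two Ints, independent of `len`
sizeof(dense)     # one byte per element
```

The compact representation stays the same size no matter how long the vector is, whereas the dense one grows linearly with its length.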

In [4]:
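# Element access: only the entry at position `ind` is hot
Base.getindex(v::OneHot, i::Int) = v.ind == i
Base.length(v::OneHot) = v.len
Base.size(v::OneHot) = (v.len,)
import Base: *
# Multiplying a matrix by a one-hot vector selects a single column
*(A::AbstractMatrix, v::OneHot) = A[:, v.ind]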

* (generic function with 234 methods)
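
The `*` method above relies on the fact that multiplying a matrix by a one-hot vector picks out a single column. A quick check of that identity with a dense vector might look like this (the names `M` and `e2` are illustrative only):

```julia
M = [1.0 2.0 3.0;
     4.0 5.0 6.0;
     7.0 8.0 9.0]
e2 = [0.0, 1.0, 0.0]      # dense one-hot vector, hot bit at index 2

M * e2 == M[:, 2]         # the full matrix-vector product equals column 2
```

The specialized method therefore replaces a full matrix-vector product with a single column copy.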

In [5]:
t = OneHot(3,2); A=rand(3,3); vs = [t,t,t]

3-element Vector{OneHot}:
 OneHot(3, 2)
 OneHot(3, 2)
 OneHot(3, 2)

In [6]:
innersum(A,vs)

In [7]:
vs = [OneHot(3, rand(1:3)) for i in 1:10]
A = rand(3,10)

3×10 Matrix{Float64}:
 0.766233  0.289479    0.318935  0.809471  …  0.932064  0.553563    0.241418
 0.788245  0.00195445  0.123022  0.239711     0.166428  0.00769221  0.206831
 0.374356  0.432844    0.839605  0.849857     0.375148  0.829517    0.248995

In [8]:
using BenchmarkTools
@benchmark innersum($A, $vs)

BenchmarkTools.Trial: 10000 samples with 379 evaluations.
 Range (min … max):  243.799 ns … 80.608 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     280.211 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   365.440 ns ± 889.015 ns  ┊ GC (mean ± σ):  6.60% ± 7.23%

In [9]:
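# Note: `xs` is a 3×10 Matrix{Float64}, so `innersum` iterates over its 30 scalar entries, not over 10 vectors.
xs = rand(3,10)
@benchmark innersum($A,$xs)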

BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.410 μs … 76.020 μs  ┊ GC (min … max): 0.00% … 75.76%
 Time  (median):     1.640 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.110 μs ±  2.261 μs  ┊ GC (mean ± σ):  3.35% ±  3.71%

In [10]:
struct OneHotVector<:AbstractVector{Bool}
    len::Int
    ind::Int
end

In [11]:
Base.getindex(v::OneHotVector, i::Int) = v.ind == i
Base.size(v::OneHotVector) = (v.len,)

In [18]:
innersum(A,OneHotVector(10,2))

0.7662328830658524

In [15]:
A*OneHotVector(10,2)

3-element Vector{Float64}:
 0.2894792597538084
 0.001954453531136102
 0.43284412755826407