# Towards a Type-Stable Generation of Vectors Containing User Defined Types

We wish to work with vectors (one-dimensional arrays or lists) whose elements have a user-defined type. We seek to understand how to allocate such vectors, and how to construct them in such a way that the number of allocations is independent of (or depends only mildly on) the length of the vector. This implies that we must avoid allocating memory inside the loop over elements in which the vector is constructed.

We wonder whether the use of static arrays can be avoided. We wish to understand the internals of Julia. Possibly (?) the use of static arrays does not carry over (?) to user-defined types.

We also wish to understand the mechanism that static arrays add (at the compiler level?) to avoid memory allocations.
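
One part of this mechanism can be inspected without any package. As far as we understand, an `SVector{N,T}` is an immutable struct that wraps an `NTuple{N,T}`. For plain-data `T` it is therefore an `isbits` type, and Julia stores `isbits` elements inline in a `Vector` rather than as pointers to separately allocated objects. The sketch below checks the same property for a user-defined type; `IntPair` is a hypothetical name introduced here only for illustration.

```julia
# Hypothetical user-defined type, introduced only for illustration.
struct IntPair
    a::Int64
    b::Int64
end

# Immutable structs with plain-data fields are isbits: they can be stored inline.
println(isbitstype(IntPair))
# A Vector is a heap object and is stored by reference inside other containers.
println(isbitstype(Vector{Int64}))
# Tuples of Int64 are isbits as well.
println(isbitstype(NTuple{2,Int64}))

# One element occupies exactly the space of its two Int64 fields.
println(sizeof(IntPair))
```

If this reasoning is right, static arrays are not required for allocation-free elements; any immutable type with `isbits` fields should behave the same way.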

This notebook is divided into various sections. More ... 

## Import Packages

In [1]:
using LinearAlgebra
using StructArrays 
using Statistics
using StaticArrays
using BenchmarkTools

## Section 1: Generate N-Vector of Single Int64 

We start with the elementary case of vectors of 64-bit integers.

<b> The good news</b>: Using the macro @btime, we show that allocating and filling the vector takes a single allocation, independent of problem size. The solution we provide here acts as a model for more complex constructions.

<b> Lessons Learned </b>: 
1. <i>undef</i> means <i>uninitialized</i>: the memory is reserved, but the element values are arbitrary until they are assigned.
2. the value <i>missing</i> is explained in the [Julia manual for missing](https://docs.julialang.org/en/v1/manual/missing/).
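
The two lessons above can be illustrated with a minimal sketch that uses only Base Julia. The variable names `v` and `m` are chosen here for illustration.

```julia
# Uninitialized storage: the length is fixed, the contents are arbitrary.
v = Vector{Int64}(undef, 3)
println(length(v))

# A vector that may hold missing values needs a Union element type.
m = Vector{Union{Missing,Int64}}(missing, 3)
m[1] = 42                      # overwrite one entry with an actual value
println(m)
println(count(ismissing, m))   # number of entries that are still missing
```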

In [2]:
function genvec1(N)
    result = Vector{Int64}(undef,N)
    for i=1:N
        result[i] = 2*i 
    end 
    return result
end 

genvec1 (generic function with 1 method)

In [3]:
# test run for a given input 
genvec1(5)

5-element Vector{Int64}:
  2
  4
  6
  8
 10

In [4]:
N = 10;    @btime genvec1(N); 
N = 100;   @btime genvec1(N);
N = 1000;  @btime genvec1(N);

  29.816 ns (1 allocation: 144 bytes)
  48.077 ns (1 allocation: 896 bytes)
  358.537 ns (1 allocation: 7.94 KiB)
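
The timings above were taken with a non-constant global `N`. The BenchmarkTools manual recommends interpolating such globals as `$N`, so that access to the global is not part of the measurement. Independently of BenchmarkTools, the allocation behaviour can be cross-checked with the `@allocated` macro from Base. The following is only a sketch: `fill_even` is a hypothetical stand-in for `genvec1`, and the exact byte counts may differ between Julia versions.

```julia
# Hypothetical stand-in for genvec1, repeated so that the block is self-contained.
function fill_even(N)
    result = Vector{Int64}(undef, N)
    for i in 1:N
        result[i] = 2 * i
    end
    return result
end

fill_even(1)  # warm-up call so that compilation is not counted

# Bytes allocated by a single call for two problem sizes.
bytes_small = @allocated fill_even(10)
bytes_large = @allocated fill_even(1000)
println((bytes_small, bytes_large))
```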


In [5]:
#@code_warntype genvec1(5)

In [6]:
#@code_lowered genvec1(5)

## Section 2: Generate N-Vector of 2-Tuple of Int64

Here we consider vectors whose elements are 2-vectors of 64-bit integers.

### Native Bad Solutions to Avoid  

In the solution that follows, a vector of size two is allocated in each iteration of the loop. The number of allocations that @btime reports increases proportionally with problem size. This solution is a prototype of what we wish to avoid.

In [7]:
function genvec2(N)
    result = Vector{Vector{Int64}}(undef, N) # allocate one Vector
    for i = 1:N
        result[i] = [2i, 2i+1] # allocate one inner Vector per iteration
    end 
    return result
end

genvec2 (generic function with 1 method)

In [8]:
N = 10;    @btime genvec2(N); 
N = 100;   @btime genvec2(N);
N = 1000;  @btime genvec2(N);

  207.957 ns (11 allocations: 928 bytes)
  1.733 μs (101 allocations: 8.69 KiB)
  16.917 μs (1001 allocations: 86.06 KiB)


In [54]:
function genvec3(N)
    result = [zeros(Int64,2) for i=1:N] # allocate one 2-element Vector per entry
    # result = Vector{Vector{Int64}}(undef, N) # fails below with UndefRefError: the inner vectors do not exist yet
    for i=1:N
        # result[i] = (2*i, 2i+1) # fails: cannot convert Tuple{Int64, Int64} to Vector{Int64}
        result[i] .= (2*i, 2i+1) # broadcast the tuple into the existing inner Vector in place
    end 
    return result
end 

genvec3 (generic function with 1 method)

In [55]:
N = 10;    @btime genvec3(N); 
N = 100;   @btime genvec3(N);
N = 1000;  @btime genvec3(N);

  237.330 ns (11 allocations: 928 bytes)
  1.944 μs (101 allocations: 8.69 KiB)
  19.125 μs (1001 allocations: 86.06 KiB)
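
For contrast, the per-element allocation seems to come from the element type rather than from the loop: each inner `Vector{Int64}` is a separate heap object. A minimal sketch that keeps the explicit loop but stores each pair as a tuple is given below. `genvec_tuple` is a hypothetical name that does not appear elsewhere in this notebook, and we have not benchmarked it here.

```julia
# Sketch: store each pair as an immutable tuple instead of a heap-allocated Vector.
function genvec_tuple(N)
    result = Vector{NTuple{2,Int64}}(undef, N)  # allocate the outer Vector once
    for i in 1:N
        result[i] = (2i, 2i + 1)  # the tuple is written inline into the buffer
    end
    return result
end

v = genvec_tuple(5)
println(v)
println(sizeof(v))   # total size of the element data in bytes
```

Since `NTuple{2,Int64}` is an `isbits` type, the five pairs occupy one contiguous buffer of 16 bytes per element.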


In [11]:
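# this implementation fails here while it does work for mikmore.
# Likely cause (assumption): before Julia 1.9, eachcol returned a generator, which cannot be indexed as result[i].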
function genvec4(N)
    result = eachcol(zeros(Int64, 2, N)) # allocate one Matrix but read it like many Vectors
    for i = 1:N
        result[i] .= (2i, 2i+1) # fill the sliced matrix with values
    end 
    return result
end

genvec4 (generic function with 1 method)
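
Independently of why `genvec4` fails, the single-matrix idea can be written with explicit column views, which we expect to behave the same across Julia versions. The sketch below returns the matrix itself; `genmat` is a hypothetical name, and whether the views are free of allocations has not been benchmarked here.

```julia
function genmat(N)
    A = zeros(Int64, 2, N)    # one 2×N matrix; column i holds the i-th pair
    for i in 1:N
        col = view(A, :, i)   # view into column i, the data is not copied
        col .= (2i, 2i + 1)   # broadcast the tuple into the column in place
    end
    return A
end

A = genmat(4)
println(A)
println(A[:, 3])              # the third pair, copied into a new Vector
```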

### More Bad Solutions Even Though Static Arrays Are Used 

In [34]:
function genvec5(N)
    # result = Vector{Vector{Int64}}(undef, N)
    result = [zeros(Int64,2) for i=1:N] # still one heap-allocated inner Vector per entry
    # result = eachcol(zeros(Int64, 2, N)) # allocate one Matrix but read it like many Vectors
    for i = 1:N
        result[i] .= SVector(2i, 2i+1) # copy the SVector into the existing inner Vector
    end 
    return result
end

genvec5 (generic function with 1 method)

In [24]:
N = 10;    @btime genvec5(N); 
N = 100;   @btime genvec5(N);
N = 1000;  @btime genvec5(N);

  234.541 ns (11 allocations: 928 bytes)
  1.931 μs (101 allocations: 8.69 KiB)
  18.667 μs (1001 allocations: 86.06 KiB)


In [29]:
# use as script - intermediate for next version 
result = @SVector [ ( 2*i, 2*i+1 ) for i=1:5 ]

5-element SVector{5, Tuple{Int64, Int64}} with indices SOneTo(5):
 (2, 3)
 (4, 5)
 (6, 7)
 (8, 9)
 (10, 11)

In [30]:
# use as function 
function genvec6(N) 
    result = @SVector [ ( 2*i, 2*i+1 ) for i=1:N ]
    return result
end

genvec6 (generic function with 1 method)

In [37]:
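# takes ages to run - what is wrong here?
# Likely cause (per the StaticArrays docs): the range of an @SVector comprehension is evaluated
# at global scope when the macro is expanded, so N in genvec6 is the global N at definition time,
# not the function argument.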
# genvec6(5)

In [31]:
N = 10;    @btime genvec6(N); 
N = 100;   @btime genvec6(N);
N = 1000;  @btime genvec6(N);

  336.788 ns (0 allocations: 0 bytes)
  336.611 ns (0 allocations: 0 bytes)
  336.406 ns (0 allocations: 0 bytes)
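
The constant timings above are consistent with `genvec6` ignoring its argument. More generally, the length of a static array is part of its type and must therefore be known to the compiler. A package-free sketch of that idea passes the length as a type parameter through `Val` and builds a tuple with `ntuple`. `gentuple` is a hypothetical name, and this is an illustration of the principle rather than of how StaticArrays is implemented.

```julia
# Hypothetical helper: the length N arrives as a type parameter via Val.
gentuple(::Val{N}) where {N} = ntuple(i -> (2i, 2i + 1), Val(N))

t = gentuple(Val(3))
println(t)
println(typeof(t))   # the length is visible in the type
```

This only pays off for small lengths fixed in the code; for a length known only at run time, an ordinary `Vector` with `isbits` elements remains the natural container.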


### Finally a Good Solution - Hurray! 

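The solution that follows employs both static arrays and map. Since an SVector of two Int64 values is an isbits type, map can store the elements inline in its result, so that only the result vector itself is allocated.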

In [32]:
function genvec7(N)
    result = map(i -> @SVector[2i,2i+1], 1:N) # allocate one Vector
    return result
end

genvec7 (generic function with 1 method)

In [33]:
N = 10;    @btime genvec7(N); 
N = 100;   @btime genvec7(N);
N = 1000;  @btime genvec7(N);

  31.061 ns (1 allocation: 224 bytes)
  60.106 ns (1 allocation: 1.77 KiB)
  356.118 ns (1 allocation: 15.75 KiB)
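
The same pattern appears to carry over to user-defined types, which is the case this notebook is ultimately after. The sketch below replaces the `SVector` by an immutable struct; `Point2` and `genpoints` are hypothetical names introduced for illustration, and the construction has not been benchmarked here.

```julia
# Hypothetical user-defined element type, introduced only for illustration.
struct Point2
    x::Int64
    y::Int64
end

# map infers the element type from the closure, so the result is a Vector{Point2}.
# Point2 is immutable with isbits fields, so its instances can be stored inline.
genpoints(N) = map(i -> Point2(2i, 2i + 1), 1:N)

p = genpoints(4)
println(eltype(p))
println(p[3])
println(sizeof(p))   # total size of the element data in bytes
```

Declaring the type as a `mutable struct` instead would typically turn each element back into a separately allocated object.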
