# Data Containers

This notebook shows how to put data into different types of "containers" (arrays, tuples, named tuples and dictionaries). 

(This notebook does not discuss [DataFrames.jl](https://juliadata.github.io/DataFrames.jl/stable/) or [Tables.jl](https://github.com/JuliaData/Tables.jl).)

## Load Packages and Extra Functions

In [1]:
using Printf

include("src/printmat.jl");

# Quick Summary 

of how to create some data containers.

In [2]:
a   = 1:3              
C   = [11 12;21 22]     #an array

t = (a,C)               #a tuple
display(t)

println()
nt = (a2=a,C2=C)        #a named tuple
display(nt)

println()
D = Dict(:a2=>a,:C2=>C) #dictionary
display(D)

(1:3, [11 12; 21 22])




(a2 = 1:3, C2 = [11 12; 21 22])




Dict{Symbol, AbstractArray{Int64}} with 2 entries:
  :a2 => 1:3
  :C2 => [11 12; 21 22]

# Arrays

are used everywhere in finance and statistics/econometrics.

## Arrays: Create

can be created in many ways: the code below demonstrates just a few of them. See the tutorial on Arrays for more details.

Notice that concatenation as in `D = [A B]` creates an independent copy, so later changing `B` does not affect `D`. However, if we define `E = B`, then `E` and `B` refer to the same array, so changing elements of `B` also changes `E`.

In [3]:
A = [100,101]           #a vector
printmat(A)             #or display(A)

   100    
   101    
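
The copy-versus-alias remark above can be illustrated with a small sketch. The vector `B` and the explicit `copy(B)` in `F` are assumptions added for illustration; they are not part of the cell above.

```julia
A = [100,101]
B = [200,201]

D = [A B]          #concatenation allocates a new 2x2 matrix (a copy of the data)
E = B              #no copy: E and B are two names for the same vector
F = copy(B)        #an explicit, independent copy

B[1] = -999        #change an element of B in place

println(D[1,2])    #200, D is unaffected
println(E[1])      #-999, E follows B
println(F[1])      #200, F is unaffected
```

If you need an independent copy of an array, `copy(B)` is one way to get it.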



## Arrays: Access and Change Elements

To access a vector element, use `A[2]` or similar. You can also change an element, as in `A[2] = -999` for a vector or `B[2,1] = -999` for a matrix: arrays are mutable.

In [4]:
println("A[2] is ",A[2])          #access an element of a vector

A[2] = -999                       #change an element of a vector
println("\nA is now")
printmat(A)

A[2] is 101

A is now
   100    
  -999    



## Arrays of Arrays (or other types)

You can store very different things (a mixture of numbers, matrices, strings) in an array. For instance, if `a` is a vector, `str` is a string and `C` is a matrix, then `x = [a,str,C]` puts them into a vector, where `x[1]` is `a`.

If you later change elements of the matrix `C` then it will affect `x` (discussed at the end of the notebook).

To first allocate an array of arrays and fill it later, use, for instance, `y = Vector{Any}(undef,3)`, which can be filled with any sort of elements, or `y = Vector{Array}(undef,3)`, which can be filled with arrays.

In [5]:
a   = 1:3
str = "Hazel"
C   = [11 12;21 22]
x  = [a,str,C]         #element 1 of x is a
foreach(printmat,x)    #loops over the elements of x

#=
println("\nAlternative approach:")
y    = Vector{Any}(undef,3)
y[1] = 1:3
y[2] = "Hazel"
y[3] = [11 12;21 22]
foreach(printmat,y)
=#

     1    
     2    
     3    

     Hazel

    11        12    
    21        22    
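
As a hedged sketch of two points made above: changing elements of `C` also changes what `x[3]` shows, and a `Vector{Array}` accepts only arrays. The contents put into `y` are illustrative assumptions.

```julia
a   = 1:3
str = "Hazel"
C   = [11 12;21 22]
x   = [a,str,C]                 #a Vector{Any}; x[3] refers to C, it is not a copy

C[1,1] = -999                   #change an element of C in place
println(x[3][1,1])              #-999, the change shows up in x too

y    = Vector{Array}(undef,2)   #allocate first, fill later; accepts only arrays
y[1] = [1,2,3]
y[2] = C
#y[1] = "Hazel"                 #error: a String cannot be converted to an Array
```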



# Tuples and Named Tuples

are very useful for collecting very different types of data (a number, a string, and a couple of vectors, say). They are also often used as inputs or outputs of functions.

Once created, you cannot change tuples (they are immutable). (Exception: *changing an element of an array* that belongs to the tuple will affect the tuple too.)
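
A minimal sketch of the immutability rule and its exception (the values are chosen for illustration only):

```julia
C = [11 12;21 22]
t = (1:3,"Hazel",C)

#t[3] = [0 0;0 0]      #error: cannot replace an element of the tuple
t[3][1,1] = -999       #works: changes the array that t[3] refers to
println(C[1,1])        #-999, since t[3] and C are the same array
```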

The next few cells show how to create tuples and named tuples, how to extract parts of them, merge them and what happens when you try to change them.

## Tuples and Named Tuples: Create

In [6]:
a   = 1:3              #how to create tuples and named tuples
str = "Hazel"
C   = [11 12;21 22]

t = (a,str,C)           #a tuple, or tuple(a,str,C)
display(t)

println()
nt = (a=a,str=str,C=C)  #a named tuple, can also give new name like (a2=a,...
display(nt)

nt_b = (;a,str,C);      #also a named tuple, names are given by variables
#display(nt_b)

(1:3, "Hazel", [11 12; 21 22])




(a = 1:3, str = "Hazel", C = [11 12; 21 22])

## Tuples and Named Tuples: Extract Elements

In [7]:
(a2,str2,C2) = t                          #extract the tuple into variables ("destructuring")
println("a2 and str2 are: $a2 $str2 \n")  #a2 is the same as t[1]

println("t[3] is ",t[3],"\n")             #can index into (tuple) t

println("nt.C is ",nt.C,"\n")             #we can use nt.C as a name (nt is a named tuple)

(;C,a) = nt                               #extract by symbol name, not position
println("C is:")
printmat(C)

a2 and str2 are: 1:3 Hazel 

t[3] is [11 12; 21 22]

nt.C is [11 12; 21 22]

C is:
    11        12    
    21        22    
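
Since tuples are often used as outputs of functions, destructuring is a convenient way to unpack a result. The sketch below assumes a hypothetical helper `summary_stats` (not part of this notebook) that returns a named tuple. The `(;lo,hi) = s` syntax requires Julia 1.7 or later.

```julia
function summary_stats(x)                 #hypothetical helper, for illustration only
    avg = sum(x)/length(x)
    return (avg=avg,lo=minimum(x),hi=maximum(x))   #a named tuple
end

s = summary_stats([1,2,6])
println(s.avg)                 #3.0

(;lo,hi) = s                   #extract by name, the order does not matter
println("$lo $hi")             #1 6

(avg2,lo2,hi2) = s             #extract by position
```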



## Tuples and Named Tuples: Add and Merge

In [8]:
t = (t...,3.14)                     #add an element like this
display(t)

println()
nt = (;nt...,abc=3.14)
display(nt)

println()
nt_c = nt[(:a,:C)]                   #create a new named tuple as a subset of another one
display(nt_c)

println()
nt_d = merge(nt,(abc=3.14,x2="a"))   #merge named tuples to create a new one
display(nt_d)

(1:3, "Hazel", [11 12; 21 22], 3.14)




(a = 1:3, str = "Hazel", C = [11 12; 21 22], abc = 3.14)




(a = 1:3, C = [11 12; 21 22])




(a = 1:3, str = "Hazel", C = [11 12; 21 22], abc = 3.14, x2 = "a")

In [9]:
#t[1] = -999                        #cannot change the tuple, uncomment to get an error
#t[4] = 34                          #cannot add elements like this, uncomment to get an error


## Create a Tuple Dynamically (extra)

when `values` and/or `names` are created dynamically in the program, not hardcoded as above.

Using `tuple(values...)` and `NamedTuple{names}(values)` creates tuples/named tuples.

In [10]:
values = [a,str,C]

t2 = tuple(values...)                        #or (values...,)
display(t2)

println()
names  = (:a, :b, :c)                        #should be a tuple of symbols (:a)   
nt2    = NamedTuple{names}(values)           #or (;zip(names,values)...)
display(nt2)

(1:3, "Hazel", [11 12; 21 22])




(a = 1:3, b = "Hazel", c = [11 12; 21 22])

# Dictionaries

offer a flexible way to collect different types of data. Dictionaries can (in contrast to tuples) be changed: they are mutable. On the other hand, they are often a bit slower.

(As usual, changing elements of an array that belongs to the dictionary will affect the dictionary too.)

A dictionary is organised as (key,value) pairs, where the key is the name of the element. You can loop over the elements (see below) and also change/add elements in a loop.

By using `get.(Ref(D),[:a,:C],missing)` you can extract several variables at once.

In [11]:
a   = 1:10
str = "Hazel"
C   = [11 12;21 22]

D = Dict(:a=>a,:str=>str,:C=>C)       #dictionary, "a" instead of :a works too
#D = Dict([(:a,a),(:str,str),(:C,C)])  #alternative syntax
display(D)

println("\n","D[:C] is ",D[:C])            #extract an element

D[:a] = -999                          #can change an element
D[:verse2] = "Stardust"               #can add an element this way
#display(D)

D_b = merge(D,Dict(:abc=>3.14));       #merge Dicts
#display(D_b)

Dict{Symbol, Any} with 3 entries:
  :a   => 1:10
  :str => "Hazel"
  :C   => [11 12; 21 22]


D[:C] is [11 12; 21 22]
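
The `get.(Ref(D),[:a,:C],missing)` approach mentioned above could look as follows. This is a sketch that rebuilds a small dictionary to be self-contained; the key `:notakey` is an assumption used to show the default value.

```julia
D = Dict(:a=>1:10,:str=>"Hazel",:C=>[11 12;21 22])

(a2,C2) = get.(Ref(D),[:a,:C],missing)    #Ref(D) stops the broadcast from looping over D
println(a2)                                #1:10

z = get(D,:notakey,missing)                #key is not in D, so the default is returned
println(ismissing(z))                      #true
```

In contrast, `D[:notakey]` would throw an error, so `get()` with a default is the safer choice when a key might be absent.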


## Dictionary:  A Potential Pitfall in Adding (extra)

If you have created a dict with only numbers by 
```
D = Dict(:aa=>1)
``` 
then you cannot add, e.g., a string by `D[:cc] = "hello"`, since `D` is set up to accept only values of type `Int`.

In [12]:
D = Dict(:aa=>1)
#D[:cc] = "hello"            #error since D only accepts Int

D = Dict{Any,Any}(:aa=>1)    #this works
D[:cc] = "hello"
display(D)

Dict{Any, Any} with 2 entries:
  :aa => 1
  :cc => "hello"

## Dictionary: Create Dynamically (extra)

See below for examples.

Remark: if you have the names as an array of strings (`names = ["a","b","c"]`), but want symbol names (`:a` etc), then use `Symbol.(names)`.

In [13]:
names  = (:a, :b, :c)           #or ["a","b","c"]
values = [a,str,C]

D = Dict(zip(names,values))
display(D)

#=                           #alternative approach
D = Dict()                   #empty dictionary
for i = 1:length(values)     #loop
    D[names[i]] = values[i]  #add this to the dictionary
end
display(D)
=#

Dict{Symbol, Any} with 3 entries:
  :a => 1:10
  :b => "Hazel"
  :c => [11 12; 21 22]
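
The remark about `Symbol.(names)` can be sketched like this. The variable names `names_str`, `vals`, `D_sym` and `D_str` are assumptions chosen to avoid clashing with the cell above.

```julia
names_str = ["a","b","c"]                 #names as an array of strings
vals      = [1:10,"Hazel",[11 12;21 22]]

D_sym = Dict(zip(Symbol.(names_str),vals))   #keys are symbols: :a, :b, :c
println(D_sym[:b])                           #Hazel

D_str = Dict(zip(names_str,vals))            #keys are strings: "a", "b", "c"
println(D_str["b"])                          #Hazel
```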

# From a Dict to a NamedTuple and Back Again (extra)

In [14]:
nt = (;D...)          #create a named tuple from a dict
display(nt)

println()
D2 = Dict(pairs(nt))  #create a dict from a named tuple
display(D2)

(a = 1:10, b = "Hazel", c = [11 12; 21 22])




Dict{Symbol, Any} with 3 entries:
  :a => 1:10
  :b => "Hazel"
  :c => [11 12; 21 22]

# Loop over a Tuple/Named Tuple/Dictionary

by using `(key,value) in pairs()`

In [15]:
for (key,value) in pairs(nt)             #or over `t` or `D`
    println("$key: $value")
end

a: 1:10
b: Hazel
c: [11 12; 21 22]
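
The same loop works for tuples and dictionaries. A small sketch, with values chosen for illustration:

```julia
t = (1:3,"Hazel")
for (key,value) in pairs(t)       #for a tuple, the keys are the positions 1,2,...
    println("$key: $value")
end

D = Dict(:a=>1:3,:str=>"Hazel")
for (key,value) in pairs(D)       #the order of the elements of a Dict is not guaranteed
    println("$key: $value")
end
```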


# Your Own Tailor Made Data Type (`struct`)

It is sometimes convenient to define your own `struct` as a container. The `struct` keyword creates an immutable type (you cannot change its fields, except for elements of arrays that belong to it). Use `mutable struct` to create a type whose fields can be changed.

In [16]:
a   = 1:10
str = "Hazel"
C   = [11 12;21 22]

struct MyType            #change to `mutable struct` to be able to change it later
   x                     #can be anything
   s::String             #has to be a String
   z::Array              #has to be an Array
end

x1 = MyType(a,str,C)    #has to specify all arguments

println("x1: ",x1,"\n")
println("x1.s: ",x1.s)

#x1 = MyType(1:10,10,[1;2])      #error since 10 is not a string
#x1.x = 3                        #error since MyType is immutable

x1: MyType(1:10, "Hazel", [11 12; 21 22])

x1.s: Hazel
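
For comparison, here is a sketch of the `mutable struct` approach. The type name `MyMutableType` is hypothetical; the fields mirror those of `MyType` above.

```julia
mutable struct MyMutableType    #hypothetical name, same fields as MyType
    x                           #can be anything
    s::String                   #has to be a String
    z::Array                    #has to be an Array
end

x2 = MyMutableType(1:10,"Hazel",[11 12;21 22])

x2.x = 3                        #works since the type is mutable
x2.s = "Stardust"
#x2.s = 10                      #error: 10 cannot be converted to a String

println(x2)                     #MyMutableType(3, "Stardust", [11 12; 21 22])
```

The field types are still enforced, so mutability only lets you change values, not the kind of data a field can hold.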
