# Data structures

Once we start working with many pieces of data at once, it will be convenient for us to store data in structures like arrays or dictionaries (rather than just relying on variables).<br>

Types of data structures covered:
1. Tuples
2. Dictionaries
3. Arrays

<br>
As an overview, tuples and arrays are both ordered sequences of elements (so we can index into them). Dictionaries and arrays are both mutable.
We'll explain this more below!

## Tuples

We can create a tuple by enclosing an ordered collection of elements in `( )`.

Syntax: <br>
```julia
(item1, item2, ...)
```

In [1]:
myfavoriteanimals = ("penguins", "cats", "sugargliders")

("penguins", "cats", "sugargliders")

We can index into this tuple,

In [2]:
myfavoriteanimals[1]

"penguins"

but since tuples are immutable, we can't update it

In [3]:
myfavoriteanimals[1] = "otters"

MethodError: MethodError: no method matching setindex!(::Tuple{String,String,String}, ::String, ::Int64)
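
Since a tuple can't be changed in place, one common workaround is to build a new tuple from the pieces of the old one. The following is a minimal sketch of that idea; the name `updated` is made up for this illustration.

```julia
myfavoriteanimals = ("penguins", "cats", "sugargliders")

# Tuples cannot be changed in place, so we construct a new tuple instead.
# Indexing a tuple with a range returns a tuple, which we splat with `...`.
updated = ("otters", myfavoriteanimals[2:end]...)

println(updated)            # the new tuple
println(myfavoriteanimals)  # the original tuple is untouched
```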

## Now in 1.0: NamedTuples

As you might guess, `NamedTuple`s are just like `Tuple`s except that each element additionally has a name! They have a special syntax using `=` inside a tuple:

```julia
(name1 = item1, name2 = item2, ...)
```

In [4]:
myfavoriteanimals = (bird = "penguins", mammal = "cats", marsupial = "sugargliders")

(bird = "penguins", mammal = "cats", marsupial = "sugargliders")

Each name can be used only once; repeating a name is a syntax error

In [5]:
myfavoriteanimals = (bird = "penguins", mammal = "cats", bird = "sugargliders")

ErrorException: syntax: field name "bird" repeated in named tuple

Like regular `Tuples`, `NamedTuples` are ordered, so we can retrieve their elements via indexing:

In [6]:
myfavoriteanimals[end]

"sugargliders"

They also add the special ability to access values by their name:

In [9]:
myfavoriteanimals.bird

"penguins"

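Beyond dot access, a few standard functions are handy for inspecting a `NamedTuple`. The sketch below uses a separate variable, `animals`, chosen only for this illustration.

```julia
animals = (bird = "penguins", mammal = "cats", marsupial = "sugargliders")

println(keys(animals))     # the names, as a tuple of Symbols
println(values(animals))   # the values, as a plain tuple
println(animals[:mammal])  # indexing by Symbol is equivalent to animals.mammal

# Iterate over name/value pairs
for (name, value) in pairs(animals)
    println(name, " => ", value)
end
```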
## Dictionaries

If we have sets of data related to one another, we may choose to store that data in a dictionary. We can create a dictionary using the `Dict()` function, which we can initialize as an empty dictionary or one storing key, value pairs.

Syntax:
```julia
Dict(key1 => value1, key2 => value2, ...)
```

A good example is a contacts list, where we associate names with phone numbers.

In [10]:
myphonebook = Dict("Jenny" => "867-5309", "Ghostbusters" => "555-2368")

Dict{String,String} with 2 entries:
  "Jenny"        => "867-5309"
  "Ghostbusters" => "555-2368"

In this example, each name and number is a "key" and "value" pair. We can grab Jenny's number (a value) using the associated key

In [11]:
myphonebook["Jenny"]

"867-5309"

We can add another entry to this dictionary as follows

In [12]:
myphonebook["Kramer"] = "555-FILK"

"555-FILK"

Let's check what our phonebook looks like now...

In [13]:
myphonebook

Dict{String,String} with 3 entries:
  "Jenny"        => "867-5309"
  "Kramer"       => "555-FILK"
  "Ghostbusters" => "555-2368"

We can delete Kramer from our contact list - and simultaneously grab his number - by using `pop!`

In [14]:
pop!(myphonebook, "Kramer")

"555-FILK"

We can add Kramer back with `push!`, which takes a `key => value` pair

In [15]:
push!(myphonebook, "Kramer" => "555")

Dict{String,String} with 3 entries:
  "Jenny"        => "867-5309"
  "Kramer"       => "555"
  "Ghostbusters" => "555-2368"

In [16]:
myphonebook

Dict{String,String} with 3 entries:
  "Jenny"        => "867-5309"
  "Kramer"       => "555"
  "Ghostbusters" => "555-2368"

Unlike tuples and arrays, dictionaries are not ordered, so we can't index into them by position.

In [17]:
myphonebook[1]

KeyError: KeyError: key 1 not found

In [18]:
myphonebook.Jenny

ErrorException: type Dict has no field Jenny

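In the first example above, Julia thinks you're trying to access a value associated with the key `1`, and `myphonebook` has no such key. In the second, the dot syntax looks for a field of the `Dict` type rather than a key, so it can't be used to look up Jenny's number.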
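
When we aren't sure a key exists, `haskey` and `get` let us look it up without triggering a `KeyError`. This is a small sketch using a fresh dictionary, `phonebook`, named only for this illustration.

```julia
phonebook = Dict("Jenny" => "867-5309", "Ghostbusters" => "555-2368")

# haskey checks for a key without risking a KeyError
println(haskey(phonebook, "Jenny"))

# get returns a fallback value when the key is missing
println(get(phonebook, "Kramer", "unknown"))

# Iteration order over a Dict is not guaranteed, so sort the keys
# when a stable order matters.
for name in sort(collect(keys(phonebook)))
    println(name, ": ", phonebook[name])
end
```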

## Arrays

Unlike tuples, arrays are mutable. Unlike dictionaries, arrays contain ordered collections. <br>
We can create an array by enclosing this collection in `[ ]`.

Syntax: <br>
```julia
[item1, item2, ...]
```


For example, we might create an array to keep track of my friends

In [19]:
myfriends = ["Ted", "Robyn", "Barney", "Lily", "Marshall"]

5-element Array{String,1}:
 "Ted"
 "Robyn"
 "Barney"
 "Lily"
 "Marshall"

The `1` in `Array{String,1}` means this is a one dimensional vector.  An `Array{String,2}` would be a 2d matrix, etc.  The `String` is the type of each element.
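
The element type and dimensionality can also be queried directly instead of being read off the printed type. A short sketch using standard functions:

```julia
myfriends = ["Ted", "Robyn", "Barney", "Lily", "Marshall"]

println(eltype(myfriends))  # element type
println(ndims(myfriends))   # number of dimensions
println(size(myfriends))    # length along each dimension, as a tuple

# Vector{T} and Matrix{T} are built-in aliases for Array{T,1} and Array{T,2}
println(Vector{String} === Array{String,1})
println(Matrix{String} === Array{String,2})
```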

We can also use an array to store a sequence of numbers

In [20]:
fibonacci = [1, 1, 2, 3, 5, 8, 13]

7-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 13

In [21]:
fibonacci[end] = 15
fibonacci

7-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 15

Arrays can also hold elements of different types, in which case the element type is `Any`

In [22]:
mixture = [1, 1, 2, 3, "Ted", "Robyn"]

6-element Array{Any,1}:
 1
 1
 2
 3
  "Ted"
  "Robyn"

Once we have an array, we can grab individual pieces of data from inside that array by indexing into the array. For example, if we want the third friend listed in `myfriends`, we write

In [23]:
myfriends[3]

"Barney"

We can use indexing to edit an existing element of an array

In [24]:
myfriends[3] = "Baby Bop"

"Baby Bop"

Yes, Julia uses 1-based indexing, not 0-based indexing like Python. Wars are fought over lesser issues. I have a friend with the wisdom of Solomon who proposes settling this once and for all with ½ 😃

We can also edit the array by using the `push!` and `pop!` functions. `push!` adds an element to the end of an array and `pop!` removes the last element of an array.

We can add another number to our Fibonacci sequence

In [25]:
push!(fibonacci, 21)

8-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 15
 21

In [26]:
fibonacci[8] = 21

21

and then remove it

In [27]:
pop!(fibonacci)

21

In [28]:
fibonacci

7-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 15
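
`push!` and `pop!` work on the end of an array. As a brief sketch, the standard functions `pushfirst!`, `popfirst!`, and `append!` cover the front of the array and adding several elements at once. The variable `seq` is made up for this illustration.

```julia
seq = [1, 1, 2, 3]

pushfirst!(seq, 0)      # add one element to the front
append!(seq, [5, 8])    # add several elements to the end
println(seq)

first_item = popfirst!(seq)   # remove and return the first element
println(first_item)
println(seq)
```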

So far I've given examples of only 1D arrays of scalars, but arrays can have an arbitrary number of dimensions and can also store other arrays. 
<br><br>
For example, the following are arrays of arrays:

In [29]:
favorites = [["koobideh", "chocolate", "eggs"],["penguins", "cats", "sugargliders"]]

2-element Array{Array{String,1},1}:
 ["koobideh", "chocolate", "eggs"]
 ["penguins", "cats", "sugargliders"]

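Indexing with `:` selects the whole outer array, so `favorites[:][2]` returns the same inner array as `favorites[2]`

In [30]:
favorites[:][2]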

3-element Array{String,1}:
 "penguins"
 "cats"
 "sugargliders"

In [34]:
numbers = [[1, 2, 3],[4, 5, 6],[7, 8, 9]]


3-element Array{Array{Int64,1},1}:
 [1, 2, 3]
 [4, 5, 6]
 [7, 8, 9]

In [40]:
#println(numbers[2])
println(numbers[:,1])

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Note the cell numbers here: this cell (In [40]) was run after the next one (In [37]), when `numbers` was already the flat 9-element vector, which is why all nine values are printed. Applied to the array of arrays defined in In [34], `numbers[:,1]` would instead return the three inner arrays.


In [37]:
numbers = [[1, 2, 3] ;[4, 5, 6] ;[7, 8, 9]]

9-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6
 7
 8
 9

In [41]:
println(numbers[2])
println(numbers[6])

2
6


In [42]:
numbers = [[1 2 3] [4 5 6] [7 8 9]]

1×9 Array{Int64,2}:
 1  2  3  4  5  6  7  8  9

In [43]:
println(numbers[2])
println(numbers[2,3])

2


BoundsError: BoundsError: attempt to access 1×9 Array{Int64,2} at index [2, 3]

In [44]:
numbers = [[1 2 3];[4 5 6];[7 8 9]]

3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

In [45]:
println(numbers[2])
println(numbers[2,3])

4
6


In [46]:
numbers = [[1 2 3],[4 5 6],[7 8 9]]

3-element Array{Array{Int64,2},1}:
 [1 2 3]
 [4 5 6]
 [7 8 9]

In [47]:
println(numbers[2])
println(numbers[2,3])

[4 5 6]


BoundsError: BoundsError: attempt to access 3-element Array{Array{Int64,2},1} at index [2, 3]

In [48]:
numbers = [[1, 2, 3] [4, 5, 6] [7, 8, 9]] 

3×3 Array{Int64,2}:
 1  4  7
 2  5  8
 3  6  9

In [49]:
println(numbers[2])
println(numbers[2,3])

2
8


The simplest way to define a matrix is to separate the entries of a row with spaces and the rows with semicolons

In [50]:
numbers = [1 2 3; 4 5 6; 7 8 9] 

3×3 Array{Int64,2}:
 1  2  3
 4  5  6
 7  8  9

The following will not work: the comma is treated as a separator of array elements, so a semicolon is not permitted between them unless one of the matrix-construction forms above is used.

In [51]:
numbers = [1, 2, 3; 4, 5, 6; 7, 8, 9] 

ErrorException: syntax: unexpected semicolon in array expression
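
To summarize the variants above: spaces, semicolons, and commas inside `[ ]` mean different things. The sketch below compares each with the standard functions `hcat` and `vcat`; the names `a` and `b` are chosen only for this illustration.

```julia
a = [1, 2, 3]
b = [4, 5, 6]

# Spaces concatenate horizontally, like hcat: columns side by side
println([a b] == hcat(a, b))
println(size([a b]))

# Semicolons concatenate vertically, like vcat: one longer vector
println([a; b] == vcat(a, b))
println(size([a; b]))

# Commas never concatenate: they build an array whose elements are a and b
println(length([a, b]))
```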

Below are examples of 2D and 3D arrays populated with random values.

In [52]:
y = rand(4,3)

4×3 Array{Float64,2}:
 0.502644  0.180816  0.605967
 0.241358  0.133781  0.492097
 0.860804  0.243546  0.80082
 0.14517   0.303648  0.686514

In [53]:
y[:]

12-element Array{Float64,1}:
 0.5026436466874751
 0.24135824527933614
 0.8608035426484164
 0.1451702892508966
 0.18081623808770053
 0.1337812520917394
 0.2435464691443654
 0.3036482233896971
 0.6059668053199894
 0.49209679398228845
 0.8008200127872795
 0.6865140836104189

In [54]:
x = rand(4, 3, 2)

4×3×2 Array{Float64,3}:
[:, :, 1] =
 0.423937  0.150621  0.0284276
 0.94715   0.461231  0.550478
 0.690565  0.684341  0.44705
 0.752426  0.625381  0.29683

[:, :, 2] =
 0.53064   0.0198399  0.372656
 0.409091  0.835435   0.417472
 0.942018  0.692341   0.745662
 0.447406  0.178595   0.754698

In [55]:
x[:]

24-element Array{Float64,1}:
 0.42393686786689844
 0.947149983540601
 0.6905654618265784
 0.7524262902436758
 0.15062053371336948
 0.46123115076010257
 0.6843409047955222
 0.625380983233002
 0.028427570327122398
 0.5504778973735025
 0.44704977423190373
 0.2968295331962947
 0.5306404322035885
 0.40909055697814156
 0.9420175163778002
 0.44740569852441925
 0.01983991214378955
 0.8354351475350112
 0.6923414499648872
 0.17859513883667377
 0.3726561687142662
 0.417471654491548
 0.7456616786850643
 0.7546978062057383
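
Comparing `y` with `y[:]` (and `x` with `x[:]`) shows that flattening walks down each column before moving to the next one: Julia stores arrays in column-major order. A small sketch with a matrix `A`, made up for this illustration, makes the ordering easier to see than with random values.

```julia
A = [1 2 3; 4 5 6]   # 2×3 matrix

# Flattening walks down each column before moving to the next one
println(A[:])

# A single index uses the same column-major numbering
println(A[2])   # second element of the first column
println(A[3])   # first element of the second column

# reshape fills a new shape in the same order
println(reshape(A[:], 2, 3) == A)
```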

Be careful when you want to copy arrays!

In [56]:
fibonacci

7-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 15

In [57]:
somenumbers = fibonacci

7-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 15

In [58]:
somenumbers[1] = 404

404

In [59]:
fibonacci

7-element Array{Int64,1}:
 404
   1
   2
   3
   5
   8
  15

Editing `somenumbers` caused `fibonacci` to get updated as well!

In the above example, we didn't actually make a copy of `fibonacci`. We just created a new way to access the entries in the array bound to `fibonacci`.

If we'd like to make a copy of the array bound to `fibonacci`, we can use the `copy` function.

In [60]:
# First, restore fibonacci
fibonacci[1] = 1
fibonacci

7-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 15

In [61]:
somemorenumbers = copy(fibonacci[1:4])

4-element Array{Int64,1}:
 1
 1
 2
 3

`somenumbers = fibonacci[:]` also behaves like `copy`, because indexing with `:` or a range creates a new array.

In [62]:
somemorenumbers[1] = 404

404

In [63]:
fibonacci

7-element Array{Int64,1}:
  1
  1
  2
  3
  5
  8
 15

In this last example, `fibonacci` was not updated. Therefore we see that the arrays bound to `somemorenumbers` and `fibonacci` are distinct.
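
One caveat matters for arrays of arrays like the ones above: `copy` only copies the outer array, so the inner arrays are still shared. The standard `deepcopy` function copies everything. A minimal sketch, with variable names made up for this illustration:

```julia
nested = [[1, 2], [3, 4]]

shallow = copy(nested)      # new outer array, same inner arrays
deep = deepcopy(nested)     # inner arrays are copied too

shallow[1][1] = 404         # mutates an inner array shared with `nested`
deep[2][1] = 505            # affects only `deep`

println(nested)
println(shallow[1] === nested[1])  # same object
println(deep[1] === nested[1])     # different object
```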

#### Summary of Data Structures

|Data structure | mutable | ordered |
|---------------|:-------:|:-------:|
|Tuple          |   no    |   yes   |
|Dictionary     |   yes   |   no    |
|Array          |   yes   |   yes   |

### Exercises

#### 3.1 
Create an array, `a_ray`, with the following code:

```julia
a_ray = [1, 2, 3]
```

Add the number `4` to the end of this array and then remove it.

In [69]:
a_ray = [1, 2, 3]

3-element Array{Int64,1}:
 1
 2
 3

In [70]:
push!(a_ray,4)

4-element Array{Int64,1}:
 1
 2
 3
 4

In [71]:
pop!(a_ray)

4

In [72]:
@assert a_ray == [1, 2, 3]

#### 3.2 
Try to add "Emergency" as a key to `myphonebook` with the integer value `911`, using the following code
```julia
myphonebook["Emergency"] = 911
```

Why doesn't this work?

In [73]:
myphonebook


Dict{String,String} with 3 entries:
  "Jenny"        => "867-5309"
  "Kramer"       => "555"
  "Ghostbusters" => "555-2368"

In [78]:
typeof(myphonebook)

Dict{String,String}

In [77]:
myphonebook["Emergency"] = 911

MethodError: MethodError: Cannot `convert` an object of type Int64 to an object of type String
Closest candidates are:
  convert(::Type{T}, !Matched::T) where T<:AbstractString at strings/basic.jl:209
  convert(::Type{T}, !Matched::AbstractString) where T<:AbstractString at strings/basic.jl:210
  convert(::Type{T}, !Matched::T) where T at essentials.jl:171

The assignment fails because `myphonebook` is a `Dict{String,String}`, and an `Int64` is not automatically converted to a `String`. Storing `string(911)` instead works, and the phonebook then looks like this:

In [76]:
myphonebook

Dict{String,String} with 4 entries:
  "Jenny"        => "867-5309"
  "Emergency"    => "911"
  "Kramer"       => "555"
  "Ghostbusters" => "555-2368"

#### 3.3 
Create a new dictionary called `flexible_phonebook` that has Jenny's number stored as an integer and Ghostbusters' number stored as a string with the following code

```julia
flexible_phonebook = Dict("Jenny" => 8675309, "Ghostbusters" => "555-2368")
```

In [79]:
flexible_phonebook = Dict("Jenny" => 8675309, "Ghostbusters" => "555-2368")

Dict{String,Any} with 2 entries:
  "Jenny"        => 8675309
  "Ghostbusters" => "555-2368"

In [82]:
typeof(flexible_phonebook)

Dict{String,Any}

In [83]:
@assert flexible_phonebook == Dict("Jenny" => 8675309, "Ghostbusters" => "555-2368")

#### 3.4 
Add the key "Emergency" with the value `911` (an integer) to `flexible_phonebook`.

In [84]:
flexible_phonebook["Emergency"]=911

911

In [85]:
flexible_phonebook

Dict{String,Any} with 3 entries:
  "Jenny"        => 8675309
  "Emergency"    => 911
  "Ghostbusters" => "555-2368"

In [86]:
@assert haskey(flexible_phonebook, "Emergency")

In [87]:
@assert flexible_phonebook["Emergency"] == 911

#### 3.5 
Why can we add an integer as a value to `flexible_phonebook` but not `myphonebook`? How could we have initialized `myphonebook` so that it would accept integers as values?

In [93]:
myphonebookv2=Dict{String,Any}("Jenny" => 8675309, "Ghostbusters" => "555-2368")

Dict{String,Any} with 2 entries:
  "Jenny"        => 8675309
  "Ghostbusters" => "555-2368"

In [95]:
myphonebookv2["Emergency"] = 911

911

In [96]:
myphonebookv2

Dict{String,Any} with 3 entries:
  "Jenny"        => 8675309
  "Emergency"    => 911
  "Ghostbusters" => "555-2368"

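`Dict{String,Any}` accepts any value at all. If only integers and strings are expected, a narrower value type can be declared instead. The sketch below is one possible alternative, not part of the original exercise; the names `strict` and `mixed` are made up for this illustration.

```julia
# The value type is inferred from the values given: all strings here
strict = Dict("Jenny" => "867-5309")
println(typeof(strict))

# Declaring the value type up front allows both integers and strings
mixed = Dict{String,Union{Int,String}}("Jenny" => 8675309)
mixed["Ghostbusters"] = "555-2368"
mixed["Emergency"] = 911
println(mixed["Emergency"])
```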
Please run the `@assert` cells upon completion of the exercises to validate your answers.