## Dictionaries

Dictionaries are suitable when we are dealing with data that fit a key-value format. Let's see a few examples.

The building block of dictionaries in Julia is called `Pair`.

In [1]:
phone = "mobile number" => 07000300 

"mobile number" => 7000300

In [2]:
typeof(phone)

Pair{String, Int64}

We can access the information stored in a pair as follows

In [3]:
phone.first

"mobile number"

In [4]:
phone.second

7000300
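
Note that the integer literal `07000300` loses its leading zero. As a minimal sketch, assuming we want to keep the number exactly as typed, the pair below stores it as a `String` instead (the name `contact` is made up for this illustration). It also shows that a pair can be destructured, and that `first` and `last` work as function-style alternatives to the field access.

```julia
# A Pair holds exactly two items: `first` and `second`.
# `contact` is a hypothetical pair; a String value keeps the leading zero.
contact = "mobile number" => "07000300"

# Destructuring assigns both parts in one statement.
label, number = contact

# `first` and `last` are function-style equivalents of the field access.
@assert first(contact) == contact.first == label
@assert last(contact) == contact.second == number

println("$label: $number")
```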

We can create a dictionary as follows

In [5]:
my_dict = Dict("short" => 15, "medium" => 20, "tall" => 30)
my_dict

Dict{String, Int64} with 3 entries:
  "medium" => 20
  "tall"   => 30
  "short"  => 15

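The built-in `enumerate` function can be useful when we want to iterate over a dictionary while keeping a running counter. Each iteration yields the counter together with the current element, which for a dictionary is a `Pair` rather than a separate key and value. The order in which a `Dict` is traversed is not guaranteed.

In [8]:
for (index, pair) in enumerate(my_dict)
    println("The index is $index")
    println("The pair is $pair")
end


The index is 1
The pair is "medium" => 20
The index is 2
The pair is "tall" => 30
The index is 3
The pair is "short" => 15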


Keep in mind that `enumerate` is not restricted to dictionaries: it works on any iterable collection and yields `(counter, element)` tuples, with the counter starting at 1.
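
Because a `Dict` does not guarantee any iteration order, the counter from `enumerate` only reflects the order in which entries happen to be visited. One possible way to get a reproducible order, sketched here with a made-up dictionary `heights`, is to sort the keys first and look up each value:

```julia
# `heights` is a hypothetical dictionary used only for this illustration.
heights = Dict("short" => 15, "medium" => 20, "tall" => 30)

# Sort the keys first so the visiting order is reproducible.
sorted_keys = sort(collect(keys(heights)))

for (i, key) in enumerate(sorted_keys)
    println("$i: $key => $(heights[key])")
end
```

With string keys the order is alphabetical, so `"medium"` comes first here.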

The built-in `zip` function can be handy when building a dictionary from two collections

In [9]:
age = [18, 3, 44, 56]

4-element Vector{Int64}:
 18
  3
 44
 56

In [10]:
names = ["John", "George", "Nick", "Kelly"]

4-element Vector{String}:
 "John"
 "George"
 "Nick"
 "Kelly"

In [11]:
family = Dict(zip(names,age))

Dict{String, Int64} with 4 entries:
  "Kelly"  => 56
  "John"   => 18
  "Nick"   => 44
  "George" => 3

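One detail worth knowing when building dictionaries this way is that `zip` stops at the shortest input. The sketch below uses made-up vectors `people` and `ages` of different lengths to show this, and also shows a generator expression as an alternative that lets us transform entries while building the dictionary.

```julia
people = ["John", "George", "Nick", "Kelly"]
ages = [18, 3, 44]                      # deliberately one element short

# zip stops at the shortest input, so "Kelly" is silently dropped.
partial = Dict(zip(people, ages))
@assert length(partial) == 3
@assert !haskey(partial, "Kelly")

# A generator of pairs builds the same kind of dictionary and can
# transform keys or values on the fly.
upper = Dict(uppercase(n) => a for (n, a) in zip(people, ages))
println(upper["JOHN"])   # prints 18
```
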
Now let's see how we can access dictionary keys and values.

In [12]:
my_dict["short"]

15

In [13]:
keys(my_dict)

KeySet for a Dict{String, Int64} with 3 entries. Keys:
  "medium"
  "tall"
  "short"

In [14]:
values(my_dict)

ValueIterator for a Dict{String, Int64} with 3 entries. Values:
  20
  30
  15

We can check whether a particular value exists

In [15]:
15 in values(my_dict)

true
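
Checking `values` scans the entries one by one, whereas a key lookup uses the hash table. As a small sketch with a made-up dictionary `sizes`, we can also test for a complete key-value pair, or collect the keys that map to a given value:

```julia
sizes = Dict("short" => 15, "medium" => 20, "tall" => 30)

# Test for a complete key-value pair.
@assert ("short" => 15) in sizes
@assert !(("short" => 99) in sizes)

# Collect every key whose value matches (there may be none or several).
matching = [k for (k, v) in sizes if v == 15]
println(matching)   # prints ["short"]
```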

Below we see what happens when we try to use a key that doesn't exist. Indexing throws a `KeyError`, while `haskey` lets us test for the key first

In [16]:
my_dict["not_so_tall"]

KeyError: KeyError: key "not_so_tall" not found

In [17]:
haskey(my_dict, "not_so_tall")

false

The `get` function is particularly helpful here: it returns the value stored under a key if the key exists, and a default that we supply otherwise

In [18]:
get(my_dict,"not_so_tall","Not found")

"Not found"

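A common use of `get` is counting, where a missing key should behave like a count of zero. The sketch below (with made-up data in `words`) shows that pattern, together with the related `get!`, which also stores the default in the dictionary when the key is missing.

```julia
words = ["bream", "bass", "bream", "carp", "bream"]

# get returns the stored count, or 0 for a word not seen yet.
# It never modifies the dictionary by itself.
counts = Dict{String, Int}()
for w in words
    counts[w] = get(counts, w, 0) + 1
end
@assert counts["bream"] == 3

# get! stores the default under the key when the key is missing,
# which is convenient for grouping. Here we group words by length.
groups = Dict{Int, Vector{String}}()
for w in words
    push!(get!(groups, length(w), String[]), w)
end
println(groups[4])   # prints ["bass", "carp"]
```
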
We can restrict the types of keys and values by giving them as type parameters when creating the dictionary

In [19]:
metrics = Dict{String, Float64}()

Dict{String, Float64}()

In [20]:
metrics["bream"] = 13.2

13.2

In [21]:
metrics["bass"] = 14.1

14.1

In [22]:
metrics

Dict{String, Float64} with 2 entries:
  "bass"  => 14.1
  "bream" => 13.2

In [23]:
metrics["mackerel"] = "small"

MethodError: MethodError: Cannot `convert` an object of type String to an object of type Float64
The function `convert` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  convert(::Type{T}, !Matched::T) where T<:Number
   @ Base number.jl:6
  convert(::Type{T}, !Matched::AbstractChar) where T<:Number
   @ Base char.jl:185
  convert(::Type{T}, !Matched::Number) where T<:Number
   @ Base number.jl:7
  ...
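
The error comes from the conversion that happens on assignment: a value is converted to the dictionary's value type when possible and rejected otherwise. A minimal sketch, re-creating an empty `metrics` dictionary so the example is self-contained (the `"carp"` entry is made up):

```julia
metrics = Dict{String, Float64}()

# An Int can be converted to Float64, so this succeeds and stores 12.0.
metrics["carp"] = 12
@assert metrics["carp"] === 12.0

# A String cannot, so the assignment throws and nothing is stored.
try
    metrics["mackerel"] = "small"
catch err
    println("rejected: ", typeof(err))   # prints rejected: MethodError
end
@assert !haskey(metrics, "mackerel")
```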


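It is easy to change the value of an existing key or to add new keys. Here `merge` returns a new dictionary, which we assign back to `my_dict`; if a key appears in both dictionaries, the value from the last one is kept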

In [24]:
my_dict = merge(my_dict,Dict("not_so_tall" => 25, "very_tall" => 50))
my_dict

Dict{String, Int64} with 5 entries:
  "medium"      => 20
  "tall"        => 30
  "short"       => 15
  "not_so_tall" => 25
  "very_tall"   => 50

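Since `merge` builds a new dictionary, it may be worth knowing the in-place alternatives. The sketch below uses a made-up dictionary `sizes` and shows index assignment, `merge!`, and `mergewith`, which combines the values of duplicate keys with a function instead of overwriting them.

```julia
sizes = Dict("short" => 15, "medium" => 20)

sizes["tall"] = 30       # add a new key
sizes["short"] = 16      # overwrite an existing key

# merge! updates the first dictionary in place.
# On duplicate keys the value from the later dictionary wins.
merge!(sizes, Dict("medium" => 21, "very_tall" => 50))
@assert sizes["medium"] == 21

# mergewith combines the values of duplicate keys with a function.
totals = mergewith(+, Dict("a" => 1, "b" => 2), Dict("b" => 10))
println(totals["b"])   # prints 12
```
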
To delete a key-value pair we can use `delete!`, which modifies the dictionary in place

In [25]:
delete!(my_dict,"not_so_tall")
my_dict

Dict{String, Int64} with 4 entries:
  "medium"    => 20
  "tall"      => 30
  "short"     => 15
  "very_tall" => 50

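When we also need the value that is being removed, `pop!` is an alternative to `delete!`. A small sketch with a made-up dictionary `sizes`:

```julia
sizes = Dict("short" => 15, "medium" => 20, "tall" => 30)

# pop! removes the key and returns its value.
removed = pop!(sizes, "tall")
@assert removed == 30 && !haskey(sizes, "tall")

# With a third argument, pop! returns that default for a missing key
# instead of throwing a KeyError.
@assert pop!(sizes, "giant", nothing) === nothing

# delete! on a missing key does nothing and returns the dictionary.
@assert delete!(sizes, "giant") === sizes

println(length(sizes))   # prints 2
```
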
The loop below goes through the key-value pairs of a dictionary, destructuring each pair into its key and its value

In [28]:
for (key,value) in pairs(my_dict)
    println("The key is $key and its value is $value")
end

The key is medium and its value is 20
The key is tall and its value is 30
The key is short and its value is 15
The key is very_tall and its value is 50


## Tuples and NamedTuples

Tuples can be created in a very similar way to arrays. Instead of square brackets we use parentheses.

In [29]:
info = (10,20,30,"red","blue")

(10, 20, 30, "red", "blue")

In [30]:
typeof(info)

Tuple{Int64, Int64, Int64, String, String}

Accessing particular elements, on the other hand, uses the same syntax as for arrays

In [31]:
info[1]

10

In [32]:
info[1:3]

(10, 20, 30)

In [33]:
for i in info
    println(i)
end

10
20
30
red
blue


A key difference from other collections is that tuples are immutable. This means that once a tuple is created we cannot modify its elements

In [34]:
info[1] = 3

MethodError: MethodError: no method matching setindex!(::Tuple{Int64, Int64, Int64, String, String}, ::Int64, ::Int64)
The function `setindex!` exists, but no method is defined for this combination of argument types.
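
Immutability means that a "change" has to produce a new tuple. A minimal sketch of one way to do that with slicing and splatting (the names `updated` and `holder` are made up); it also shows that a mutable object stored inside a tuple can still be modified itself:

```julia
info = (10, 20, 30, "red", "blue")

# Build a new tuple with the first element replaced.
# `info` itself is left untouched.
updated = (3, info[2:end]...)
@assert updated == (3, 20, 30, "red", "blue")
@assert info[1] == 10

# The tuple cannot be rebound to other elements, but a mutable
# element such as a Vector can still be changed in place.
holder = ([1, 2], "fixed")
push!(holder[1], 3)
println(holder)   # prints ([1, 2, 3], "fixed")
```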

In some cases where we would otherwise reach for a dictionary, it makes sense to use a named tuple instead.

In [35]:
professions = (tom="footballer", magnus="actor", lotta="politician", alice="professor")

(tom = "footballer", magnus = "actor", lotta = "politician", alice = "professor")

Here we can also access individual elements as follows

In [36]:
professions.tom

"footballer"

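Named tuples support several of the operations we used on dictionaries. The sketch below, using a shortened version of `professions`, shows access by `Symbol` and by position, `keys`, `haskey`, `merge`, and iteration with `pairs`. Unlike a `Dict`, a named tuple keeps its fields in the order they were written.

```julia
professions = (tom = "footballer", magnus = "actor")

# Field names are Symbols. Indexing by Symbol or by position both work.
@assert professions[:tom] == professions[1] == "footballer"
@assert keys(professions) == (:tom, :magnus)
@assert haskey(professions, :magnus)

# merge returns a new named tuple, since the original is immutable.
extended = merge(professions, (lotta = "politician",))

for (name, job) in pairs(extended)
    println("$name: $job")
end
```
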
When is it more advantageous to use a named tuple over a dictionary, or a tuple over an array? If we know in advance that the data we store will not change, tuples are a good fit: their length and element types are part of the tuple's type, which allows the Julia compiler to perform optimizations that can result in faster running code.
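
To make the type information concrete, the sketch below compares a named tuple and a dictionary that hold the same made-up data (`point` and `lookup`). It is an illustration of what the compiler can see from the types, not a benchmark.

```julia
point = (x = 1, y = 2.0)
lookup = Dict("x" => 1, "y" => 2.0)

# The named tuple's type records every field name and a concrete
# type for each field.
@assert typeof(point) == NamedTuple{(:x, :y), Tuple{Int, Float64}}

# The dictionary has one value type shared by all entries. With mixed
# values it is an abstract type, so the type of lookup["x"] is not
# known from the dictionary's type alone.
@assert !isconcretetype(valtype(lookup))

println(typeof(point))
println(typeof(lookup))
```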

## Exercises

### Exercise 1

From the following lists create a dictionary where the first list provides the keys and the second the values: `list_1 = ["n_estimators", "max_features", "min_samples_leaf", "oob_score"]` and `list_2 = [1000, 10, 5, false]`. Note that Julia promotes `false` to `0` in this integer vector; write `Any[1000, 10, 5, false]` if you want to keep it as a `Bool`. Name the dictionary `ML_params`. Update the dictionary with the key `"lambda"` and the value `1`. Check whether the key `"oob_score"` exists and retrieve its value. Then retrieve the value for the key `"learning_rate"`, using a default value of `0` in case it doesn't exist.

### Exercise 2

Unpack the tuple `(30, 5, 25)` into the variables `average`, `standard_deviation`, `variance`.

### Exercise 3

Create a named tuple from the following variables. Use the variables themselves rather than hard-coding their values: `name="paul"`, `profession="doctor"`, `height=1.8`, `age=33`.