# A dictionary is a mapping

A dictionary is like a list, but more general. In a list, the indices have to be integers; in a dictionary they can be (almost) any type.

A dictionary contains a collection of indices, which are called __keys__, and a collection of __values__. Each key is associated with a single value. The association of a key and a value is called a __key-value pair__ or sometimes an __item__.

In mathematical language, a dictionary represents a mapping from keys to values, so you can also say that each key "maps to" a value. As an example, you can imagine a study where several participants receive different doses of a drug; we’ll build a dictionary that maps from participant IDs to doses, so the keys will be strings and the values will be numbers.

The function `dict` creates a new dictionary with no items. Because `dict` is the name of a built-in function, you should avoid using it as a variable name.

In [39]:
id2dose = dict()
id2dose

{}

In [40]:
id2dose = {}
id2dose

{}

The curly braces, `{}`, represent an empty dictionary. To add items to the dictionary, you can use square brackets:

In [41]:
id2dose['02'] = 31

This line creates an item that maps from the key '02' to the value 31. If we print the dictionary again, we see a key-value pair with a colon between the key and value:

In [42]:
id2dose

{'02': 31}

This output format is also an input format. For example, you can create a new dictionary with three items:

In [43]:
id2dose = {'05': 31, '02': 38, '12': 43}

Now display `id2dose` and look at the order of the items:

In [44]:
id2dose

{'05': 31, '02': 38, '12': 43}

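The key-value pairs appear in the order you entered them; they are not sorted by key. Since Python 3.7, dictionaries are guaranteed to keep their items in insertion order; in older versions of Python the order was unpredictable.

In any case, the order rarely matters, because you access the elements of a dictionary through their keys (for example `id2dose['02']`), not through their positions as you would in a list.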
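As a minimal sketch of access by key, the cell below rebuilds the three-item dictionary from above, looks up one dose, and uses the built-in `sorted` to list the keys in sorted order when that is what you want:

```python
# Same three-item dictionary as above
id2dose = {'05': 31, '02': 38, '12': 43}

# Look up a value through its key, not through its position
dose = id2dose['02']
print(dose)  # 38

# sorted() returns a new list of the keys in sorted order;
# the dictionary itself is left unchanged
print(sorted(id2dose))  # ['02', '05', '12']
```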

If the key isn’t in the dictionary, you get an exception:

In [45]:
id2dose['01']

KeyError: '01'
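If a missing key is a situation your program should survive, one option is to catch the exception. The following is only a sketch; it assumes that a missing participant should be recorded as `None`, which is a choice made for this illustration:

```python
id2dose = {'05': 31, '02': 38, '12': 43}

try:
    dose = id2dose['01']
except KeyError as err:
    # The exception object carries the key that was not found
    dose = None
    print('no dose recorded for participant', err)
```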

The `len` function works on dictionaries; it returns the number of key-value pairs:

In [None]:
len(id2dose)

The `in` operator works on dictionaries, too; it tells you whether something appears as a key in the dictionary (appearing as a value is not good enough).

In [None]:
'02' in id2dose

In [None]:
'01' in id2dose

To see whether something appears as a value in a dictionary, you can use the method `values`, which returns a collection of values, and then use the `in` operator:

In [None]:
vals = id2dose.values()
31 in vals

# Dictionary as a collection of counters

Histograms are very common in science. A histogram is a collection of counters: you define discrete categories and count how many objects in your data fall into each category. A very simple case is a histogram of the characters in a string: you want to count how many times each letter appears. There are several ways you could do it; one elegant way is to use a dictionary.

You could create a dictionary with characters as keys and counters as the corresponding values. The first time you see a character, you would add an item to the dictionary. After that you would increment the value of an existing item.

We don’t have to know ahead of time which letters appear in the string and we only have to make room for the letters that do appear.

Here is what the code might look like:

In [None]:
def histogram(s):
    d = dict()
    for c in s:
        if c not in d:
            d[c] = 1
        else:
            d[c] += 1
    return d

The first line of the function creates an empty dictionary. The for loop traverses the string. Each time through the loop, if the character `c` is not in the dictionary, we create a new item with key `c` and the initial value 1 (since we have seen this letter once). If `c` is already in the dictionary we increment `d[c]`.

Here’s how it works:

In [None]:
h = histogram('brontosaurus')
h

The histogram indicates that the letters 'a' and 'b' appear once; 'o' appears twice, and so on.

Dictionaries have a method called `get` that takes a key and a default value. If the key appears in the dictionary, `get` returns the corresponding value; otherwise it returns the default value. For example:

In [None]:
h.get('a', 0)

In [None]:
h.get('c', 0)

As an __exercise__, use `get` to write `histogram` more concisely. You should be able to eliminate the `if` statement.

# Looping and dictionaries

If you use a dictionary in a `for` statement, it traverses the keys of the dictionary. For example, `print_hist` prints each key and the corresponding value:

In [None]:
def print_hist(h):
    for c in h:
        print(c, h[c])

Here’s what the output looks like:

In [None]:
h = histogram('parrot')
print_hist(h)
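The keys are traversed in insertion order. If you would rather see them in sorted order, one possibility is to loop over `sorted(h)` instead of `h`. The sketch below repeats `histogram` so that it runs on its own; the name `print_hist_sorted` is made up for this illustration:

```python
def histogram(s):
    d = dict()
    for c in s:
        if c not in d:
            d[c] = 1
        else:
            d[c] += 1
    return d

def print_hist_sorted(h):
    # sorted(h) returns a list of the keys in sorted order
    for c in sorted(h):
        print(c, h[c])

h = histogram('parrot')
print_hist_sorted(h)
```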

# Reverse lookup

Given a dictionary `d` and a key `k`, it is easy to find the corresponding value `v = d[k]`. This operation is called a __lookup__.

But what if you have `v` and you want to find `k`? You have two problems: first, there might be more than one key that maps to the value `v`. Depending on the application, you might be able to pick one, or you might have to make a list that contains all of them. Second, there is no simple syntax to do a reverse lookup; you have to search.

Here is a function that takes a value and returns the first key that maps to that value:

In [None]:
def reverse_lookup(d, v):
    for k in d:
        if d[k] == v:
            return k
    raise LookupError('value does not appear in the dictionary')

The `raise` statement causes a `LookupError`, which is a built-in exception used to indicate that a lookup operation failed.

If we get to the end of the loop, that means `v` doesn’t appear in the dictionary as a value, so we raise an exception.

Here is an example of a successful reverse lookup:

In [None]:
h = histogram('parrot')
key = reverse_lookup(h, 2)
key

And an unsuccessful one:

In [None]:
key = reverse_lookup(h, 3)

A reverse lookup is much slower than a forward lookup; if you have to do it often, or if the dictionary gets big, the performance of your program will suffer.
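If you need many reverse lookups on the same dictionary, one option is to build an inverted dictionary once and then do ordinary lookups in it. The sketch below is an illustration, not part of the code above: `invert_dict` is a hypothetical helper that maps each value to the list of all keys that map to it, which also covers the case where several keys share a value.

```python
def invert_dict(d):
    inverse = dict()
    for k in d:
        v = d[k]
        if v not in inverse:
            # First time we see this value: start a new list of keys
            inverse[v] = [k]
        else:
            # Value seen before: add this key to its list
            inverse[v].append(k)
    return inverse

# The histogram of 'parrot', written out by hand
h = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}
inverse = invert_dict(h)
print(inverse[2])  # ['r']
print(inverse[1])  # ['p', 'a', 'o', 't']
```

This approach only works when the values of the original dictionary are themselves allowed as keys, which is the case for numbers and strings.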

# Dictionaries and lists

Lists can appear as values in a dictionary. For example, you might want to create a dictionary that associates each participant in an experiment with the recording from a sensor (EEG, MEG, ECG, etc.) over time, which you might store in a list or a NumPy array.

Here we illustrate this with random values.

In [1]:
import numpy as np
id2recording = {'01': np.random.randn(20), '04': np.random.randn(22), '08': np.random.randn(18)}

In [None]:
id2recording['01']
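As a small sketch of how such a dictionary can be used, the loop below visits each participant and reports the length of the recording. To stay free of dependencies it uses short plain lists with made-up numbers instead of random NumPy arrays; the name `id2samples` is hypothetical.

```python
# Made-up recordings of different lengths, stored as plain lists
id2samples = {
    '01': [0.1, -0.4, 0.3],
    '04': [1.2, 0.7],
    '08': [-0.5, 0.0, 0.2, 0.9],
}

# Looping over the dictionary traverses its keys
for pid in id2samples:
    recording = id2samples[pid]
    print(pid, len(recording))

# The value is an ordinary list, so list methods work on it
id2samples['04'].append(-0.3)
print(len(id2samples['04']))  # 3
```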

Lists and arrays can be values of dictionaries but they cannot be keys.

If we try it, we get a `TypeError`.

In [None]:
l = [0, 1, 2]
d = dict()
d[l] = 'error'

This error tells us that the type we are trying to use as a key is not allowed. Objects used as keys in dictionaries have to be "[hashable](https://en.wikipedia.org/wiki/Hash_table)". You can read more about it, but you don't need to worry too much. For now it's enough to know that keys have to be of an immutable type, so mutable types such as lists and arrays are not allowed. If you really need to use sequences as keys, the workaround is to use *tuples*, which we'll see in the next section.
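As a brief preview of that workaround, here is a minimal sketch in which a tuple with the same elements as the list is accepted as a key:

```python
# A list cannot be a key, but a tuple with the same elements can
d = dict()
d[(0, 1, 2)] = 'ok'
print(d[(0, 1, 2)])  # ok

# An existing list can be converted with tuple()
l = [0, 1, 2]
d[tuple(l)] = 'also ok'

# tuple(l) equals (0, 1, 2), so this replaced the value of the
# existing item instead of adding a second one
print(len(d))  # 1
```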

# Exercises

## Exercise 1

Write a function that takes a long list of normally distributed random numbers, creates a dictionary using those values as keys, and then uses the `in` operator to test whether the value 1 is included in the list. What do you expect?

## Exercise 2

Imagine you have to organize the data recorded during an experiment. There are 10 subjects with identifiers 'S0' to 'S9'. For each subject, the neural activity is recorded from 32 channels at 60 Hz in three sessions of 5 minutes each: before anesthesia, during anesthesia and post-anesthesia. The dose of anesthetic is different for each subject. You need to store all this information in one dictionary. Think about what would be a meaningful structure and initialize the dictionary using simulated data.

Hint: a dictionary can be used as the value of a dictionary.

Extract and plot the neural activity of subject 'S7' under anesthesia. Also print a message with the dose this subject received.
