# Dictionaries

## Programming Fundamentals (NB11)

### MIEIC/2019-20

#### João Correia Lopes

INESC TEC, FEUP

## Goals

By the end of this class, the student should be able to:

- Use the main operations and methods available to work with dictionaries
- Describe the differences between dictionary aliasing and shallow copying


## Bibliography

- Peter Wentworth, Jeffrey Elkner, Allen B. Downey, and Chris Meyers, *How to Think Like a Computer Scientist — Learning with Python 3* (Section 5.4)

- Brad Miller and David Ranum, *Learning with Python: Interactive Edition*. Based on material by Jeffrey Elkner, Allen B. Downey, and Chris Meyers (Chapter 12)


# Data type: Dictionaries

### A compound data type

- So far we have seen built-in types like `int`, `float`, `bool`,
    `str` and also lists, pairs or tuples

- Strings, lists, and tuples are qualitatively different from the
    others because they are made up of smaller pieces

- Lists, tuples, and strings have been called *sequences*, because
    their items occur in order

### Dictionary

- Dictionaries are yet another kind of compound type

- They are Python’s built-in **mapping type**

- They map **keys**, which can be any *immutable type*, to **values**,
    which can be any type (heterogeneous)<sup>1</sup> 

- In other languages, they are called *associative arrays* since they
    associate a key with a value

- One way to create a dictionary is to start with the empty dictionary
    and add **key:value** pairs

```
    >>> english_spanish = {}
    >>> english_spanish['one'] = "uno"
    >>> english_spanish["two"] = 'dos'
    >>> print(english_spanish)
    {'one': 'uno', 'two': 'dos'}
```

<sup>1</sup> 
Just like the elements of a list or tuple

In [0]:
english_spanish = {}
print(english_spanish)

In [0]:
english_spanish['one'] = "uno"
english_spanish["two"] = 'dos'
print(english_spanish)
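
The rule about keys and values can be checked directly. The sketch below (the name `mixed` is only illustrative) stores values of different types under keys of different immutable types, and shows that a mutable list is rejected as a key. Strictly speaking, Python requires keys to be *hashable*, which `int`, `str`, and tuples of such values are.

```python
# Keys of several immutable types, values of several types
mixed = {}
mixed[1] = "an int key"
mixed["two"] = [2, 2.0]            # a list is fine as a value
mixed[(3, 4)] = {"nested": True}   # a tuple is fine as a key

# A list is mutable (and unhashable), so it cannot be used as a key
try:
    mixed[[5, 6]] = "a list key"
    list_key_accepted = True
except TypeError as error:
    list_key_accepted = False
    print("Rejected:", error)

print(len(mixed))
```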

### Hashing

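- The pairs are not stored by position, the way the items of a list are

- Python uses complex algorithms, designed for very fast access, to
    determine where the **key:value** pairs are stored in a dictionary

- Since Python 3.7, dictionaries preserve insertion order when printed
    or iterated; in older versions the ordering was **unpredictable**.
    Either way, a dictionary cannot be indexed by position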

- The implementation uses a technique called **hashing**

- The same concept of mapping a key to a value could be implemented
    using a list of tuples, but…

```
  >>> {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}
  {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

  >>> [("apples", 430), ("bananas", 312), ("oranges", 525), ("pears", 217)]
  [('apples', 430), ('bananas', 312), ('oranges', 525), ('pears', 217)]
```

### Efficiency

- The reason to choose this new data type is that dictionaries are **very fast**
- Hashing allows us to access a value very quickly, without examining the other pairs
- By contrast, the list of tuples implementation is slow
  - To find the value associated with a key, we would have to iterate over the tuples, checking the 0th element of each one
  - What if the key wasn’t even in the list?
  - We would have to get to the end of it to find out

### Look up a value

![mafalda](images/11/mafalda.png)

$\Rightarrow$
<https://en.wikipedia.org/wiki/Mafalda>


### Look up a value

- Another way to create a dictionary is to provide a list of
    **key:value** pairs using the same syntax as the previous output

- It doesn’t matter what order we write the pairs (there’s no
    indexing!)<sup>2</sup> 

```
  >>> english_spanish = {"one": "uno", "three": "tres", "two": "dos"}
  >>> english_spanish
  {'one': 'uno', 'three': 'tres', 'two': 'dos'}

  >>> print(english_spanish["two"])
  dos
```

<sup>2</sup> 
The dictionary is the first compound type that we’ve seen that is
    not a sequence, so we can’t index or slice a dictionary

In [0]:
english_spanish = {"one": "uno", "three": "tres", "two": "dos"}
print(english_spanish["three"])
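
A small check of the claim that the writing order of the pairs does not matter. The names `first` and `second` are illustrative; the comparison with lists of tuples is added here only for contrast.

```python
first = {"one": "uno", "two": "dos", "three": "tres"}
second = {"three": "tres", "one": "uno", "two": "dos"}

# Same pairs written in a different order: the dictionaries are equal
print(first == second)

# The same pairs as lists of tuples are different sequences
print(list(first.items()) == list(second.items()))

# Lookup is by key, never by position
print(second["two"])
```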

## 5.4.1 Dictionary operations

### Dictionary operations

- The `del` statement removes a *key:value* pair from a dictionary

- The `len` function also works on dictionaries; it returns the number
    of *key:value* pairs

```
   >>> inventory = {"apples": 430, "bananas": 312, "quinces": 217}
   >>> del inventory["bananas"]
   >>> len(inventory)
   2
```

$\Rightarrow$
<https://github.com/fpro-feup/public/tree/master/lectures/11/operations.py>

Watch for some operations:

In [0]:
inventory = {"apples": 430, "bananas": 312, "oranges": 525, "pears": 217}
print(inventory)

In [0]:
del inventory["pears"]
print(inventory)

In [0]:
inventory["bananas"] = 0
print(inventory)

In [0]:
inventory["bananas"] += 200
print(inventory)

## 5.4.2 Dictionary methods

### Dictionary methods

- Dictionaries have a number of useful built-in methods

- The `keys()` method returns what Python3 calls a *view* of its
    underlying keys

    - A *view* object has some similarities to the *range* object we saw
        earlier — it is a **lazy promise**, to deliver its elements when
        they’re needed by the rest of the program

    - We can *iterate over the view*, or turn the view into a list

- The `values()` method is similar

- The `items()` method also returns a view, which promises a list of
    tuples

```
  for key in english_spanish.keys():
      # Keys are visited in insertion order (Python 3.7+)
      print("Got key", key, "which maps to value", english_spanish[key])
```

$\Rightarrow$
<https://github.com/fpro-feup/public/tree/master/lectures/11/methods.py>

Use `keys()`:

In [0]:
english_spanish = {"one": "uno", "two": "dos", "three": "tres"}

for key in english_spanish.keys():
    print("Got key", key, "which maps to value", english_spanish[key])

In [0]:
keys = list(english_spanish.keys())
print(keys)
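
A view and a list built from it behave differently once the dictionary changes. The following sketch, with the illustrative names `keys_view` and `keys_snapshot`, shows that the view keeps following the dictionary while the list is a snapshot.

```python
english_spanish = {"one": "uno", "two": "dos"}

keys_view = english_spanish.keys()      # a view, not a list
keys_snapshot = list(english_spanish)   # a list copied right now

english_spanish["three"] = "tres"       # change the dictionary afterwards

print(list(keys_view))   # the view sees the new key
print(keys_snapshot)     # the list does not
```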

Iterating over a dictionary implicitly iterates over its keys

In [0]:
for key in english_spanish:
    print("Got key", key)

Use `values()`:

In [0]:
values = list(english_spanish.values())
print(values)

Use `items()`:

In [0]:
print(list(english_spanish.items()))

In [0]:
for (key, value) in english_spanish.items():
    print("Got key", key, "which maps to value", value)

Membership

In [0]:
print("one" in english_spanish)
print("six" in english_spanish)

Note that `in` tests keys, not values

In [0]:
print("tres" in english_spanish)

Looking up a non-existent key in a dictionary raises a `KeyError`

In [0]:
print(english_spanish["dog"])
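
A failed lookup stops the program unless it is anticipated. Two common ways to guard against it are sketched below; the variable `translation` is only illustrative.

```python
english_spanish = {"one": "uno", "two": "dos", "three": "tres"}

# Option 1: test membership before indexing
if "dog" in english_spanish:
    print(english_spanish["dog"])
else:
    print("No translation for 'dog'")

# Option 2: catch the KeyError raised by the failed lookup
try:
    translation = english_spanish["dog"]
except KeyError:
    translation = None
print(translation)
```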

## 5.4.3 Aliasing and copying

### Aliasing and copying

- As in the case of lists, because **dictionaries are mutable**, we
    need to be aware of *aliasing*

- Whenever two variables refer to the same object, changes to one
    affect the other

- To modify a dictionary and keep a copy of the original,
    use the `copy` method

```
  >>> opposites = {"up": "down", "right": "wrong", "yes": "no"}
  >>> alias = opposites
  >>> copy = opposites.copy()  # Shallow copy
```

$\Rightarrow$
<https://github.com/fpro-feup/public/tree/master/lectures/11/aliases.py>


Aliases

In [0]:
opposites = {"up": "down", "right": "wrong", "yes": "no"}

alias = opposites

Shallow copy

In [0]:
copy = opposites.copy()

What now?



In [0]:
alias["right"] = "left"
print(opposites["right"])

In [0]:
copy["right"] = "Guinness"
print(opposites["right"])
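
The word *shallow* matters when the values are themselves mutable. The sketch below uses hypothetical data (`synonyms`, with lists as values) and the standard library function `copy.deepcopy`, which is not otherwise used in this lecture, to contrast an alias, a shallow copy, and a deep copy.

```python
from copy import deepcopy

# Hypothetical data: each value is itself a (mutable) list
synonyms = {"big": ["large"], "small": ["little"]}

alias = synonyms               # same dictionary object
shallow = synonyms.copy()      # new dictionary, same inner lists
deep = deepcopy(synonyms)      # new dictionary, new inner lists

shallow["fast"] = ["quick"]    # adding a key touches only the copy
shallow["big"].append("huge")  # but the inner list is shared

print(alias is synonyms)
print("fast" in synonyms)
print(synonyms["big"])
print(deep["big"])
```

A shallow copy is enough for dictionaries such as `opposites`, whose values are immutable strings.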

## 5.4.4 Counting letters

### Generate a Frequency Table

- Consider the problem of counting the number of occurrences of each
    letter in a string

- Dictionaries provide an elegant way to generate a frequency table

```
     start with an empty dictionary
     for each letter in the string:
        find the current count (possibly zero) and increment it
     the dictionary contains pairs of letters and their frequencies 
```

$\Rightarrow$
<https://github.com/fpro-feup/public/tree/master/lectures/11/frequency-table.py>

The text (any idea where it came from?):

In [0]:
s = """ 
This parrot is no more! It has ceased to be! 
It’s expired and gone to meet its maker! 
This is a late parrot! It’s a stiff! 
Bereft of life, it rests in peace! 
If you hadn’t nailed it to the perch, it would be pushing up the daisies! 
It’s run down the curtain and joined the choir invisible! 
This is an ex-parrot!
"""

Let's count the letters (the loop below also counts spaces, punctuation, and newlines):

In [0]:
letter_counts = {}
for letter in s:
    letter_counts[letter] = letter_counts.get(letter, 0) + 1 
print(letter_counts)
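
A possible refinement, sketched under the assumption that only letters are of interest and that case should be ignored: filter with `str.isalpha()` and print the table in alphabetical order. The short sample `text` and the name `letter_items` are illustrative.

```python
text = "This is an ex-parrot!"   # short sample so the output stays small

letter_counts = {}
for char in text.lower():
    if char.isalpha():   # skip spaces, digits and punctuation
        letter_counts[char] = letter_counts.get(char, 0) + 1

# Turn the items view into a list of (letter, count) tuples sorted by letter
letter_items = sorted(letter_counts.items())
for letter, count in letter_items:
    print(letter, count)
```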

## Sparse matrices

### Sparse matrices

- We previously used a list of lists to represent a matrix

- That is a good choice for a matrix with mostly nonzero values, but
    consider a sparse matrix like this one:

$$\left[
    \begin{array}{ccccc}
    0 & 0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 0 & 0 \\
    0 & 2 & 0 & 0 & 0 \\
    0 & 0 & 0 & 0 & 0 \\
    0 & 0 & 0 & 3 & 0 \\
    \end{array}
  \right]$$

- The list representation contains a lot of zeroes

- An alternative is to use a dictionary and the `get()` method

- There’s a trade-off here: the dictionary saves space, but each access may take more time

$\Rightarrow$
<https://github.com/fpro-feup/public/tree/master/lectures/11/matrix.py>

The sparse matrix as a list

In [0]:
matrix = [[0, 0, 0, 1, 0],
          [0, 0, 0, 0, 0],
          [0, 2, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 3, 0]]

print(matrix)

The sparse matrix as a dictionary

In [0]:
# For the keys, we can use tuples that contain the row and column numbers
matrix = {(0, 3): 1, (2, 1): 2, (4, 3): 3}

print(matrix)

Accessing matrix elements

In [0]:
print(matrix[(0, 3)])

What happens if we try to access an element that is not stored?

In [0]:
print(matrix[(1, 3)])

The `get` method solves this problem:

In [0]:
# The first argument is the key; the second argument is the value get
# should return if the key is not in the dictionary

print(matrix.get((0, 3), 0))
print(matrix.get((1, 3), 0))
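
To see that the dictionary holds the same information as the list of lists, `get` can be used to rebuild the full matrix. This is a minimal sketch; the helper `to_dense` and its parameters are hypothetical names, and the matrix size has to be supplied because the dictionary does not record it.

```python
matrix = {(0, 3): 1, (2, 1): 2, (4, 3): 3}

def to_dense(sparse, n_rows, n_cols):
    """Build a list of lists, filling the missing entries with 0."""
    return [[sparse.get((row, col), 0) for col in range(n_cols)]
            for row in range(n_rows)]

dense = to_dense(matrix, 5, 5)
for row in dense:
    print(row)

# Only the nonzero entries are stored in the dictionary
print(len(matrix), "stored entries versus", 5 * 5, "in the list of lists")
```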

## Memoisation

### Memoisation

- Consider this call graph for `fib()` with n = 4: some values, such as `fib(2)`, are computed more than once

- A good solution is to keep track of values that have already been
    computed by storing them in a dictionary

- A previously computed value that is stored for later use is called a
    **memo**

![fib](images/11/fib.png)

$\Rightarrow$
<https://github.com/fpro-feup/public/tree/master/lectures/11/fib.py>

Let's have a look at 
[Fibonacci numbers](https://en.wikipedia.org/wiki/Fibonacci_number)
(using recursion!):

In [0]:
# This is a particularly inefficient algorithm; the problem could be solved
# far more efficiently iteratively or using memoisation
def fib(n):
    if n <= 1:
        return n
    t = fib(n-1) + fib(n-2)
    return t

print(fib(10))

Now, using memoisation:

In [0]:
alreadyknown = {0: 0, 1: 1}

def fib(n):
    if n not in alreadyknown:
        new_value = fib(n-1) + fib(n-2)
        alreadyknown[n] = new_value
    return alreadyknown[n]

print(fib(10))
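
To make the saving visible, the two versions can be instrumented to count how many times each function is called. This is only a sketch: the names `fib_plain`, `fib_memo`, `known`, and `call_counts` are introduced here for the comparison and are not part of the lecture code.

```python
call_counts = {"plain": 0, "memo": 0}   # hypothetical call counters

def fib_plain(n):
    call_counts["plain"] += 1
    if n <= 1:
        return n
    return fib_plain(n-1) + fib_plain(n-2)

known = {0: 0, 1: 1}   # memo dictionary, as in the version above

def fib_memo(n):
    call_counts["memo"] += 1
    if n not in known:
        known[n] = fib_memo(n-1) + fib_memo(n-2)
    return known[n]

print(fib_plain(20), fib_memo(20))
print(call_counts)
```

The plain version needs thousands of calls for n = 20, while the memoised version needs only a few dozen, because each value is computed once and then looked up.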

# Ticket to leave

## Moodle activity

[LE11: Dictionaries](https://moodle.up.pt/mod/quiz/view.php?id=39245)


$\Rightarrow$ 
[Go back to the Table of Contents](00-contents.ipynb)

$\Rightarrow$ 
[Read the Preface](00-preface.ipynb)