# Dictionaries

### Exercise solutions

[Module 10](https://colab.research.google.com/drive/1w2s28vLo26hzppP0Z-kSqniU2eoXJNU3)

### CDH course "Programming in Python"

[index](https://colab.research.google.com/drive/1kFvnhumJ0tOTzDVJnIvvMDRRJ19yk9ZS)

Previous module: [9. String manipulation](https://colab.research.google.com/drive/19yTpFfp9uhBb-kAuOmSQY8_LrMtj8Goq) - [solutions](https://colab.research.google.com/drive/1vfoSsG_arMA-m21YgdD6zYQ_7h2iuy45)

### This module

- Learn about _dictionaries_, a useful way of storing and looking up data

## Exercise 10.1: Dictionaries

1. In each of the code blocks below, try to predict what will be printed, then run the code. If your guess was incorrect, try to figure out why the result is different. If your guess was correct, celebrate!

In [1]:
{0: 0}

{0: 0}

A dictionary with `0` as the only key and `0` as its value. The dictionary as a whole is the last value in the cell, so the notebook echoes it to us.

In [2]:
{'0': 0}

{'0': 0}

Like the previous one, but with the string `'0'` as the key.

In [3]:
{1 + 2: 3 * 4}

{3: 12}

Keys and values in a literal dictionary can be expressions. They are evaluated before being stored in the dictionary.

In [4]:
{'1' + '2': {3: 'Hooray!'}}

{'12': {3: 'Hooray!'}}

Dictionaries can be used as values in other dictionaries (but not as keys).
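
The restriction on keys comes from hashing: a key must be hashable, and dictionaries (like lists) are mutable and therefore not hashable. Below is a minimal sketch of what happens if you try anyway; the variable names are only illustrative.

```python
# A dictionary is fine as a value...
nested = {'12': {3: 'Hooray!'}}
inner = nested['12'][3]
print(inner)

# ...but using one as a key raises a TypeError at run time.
try:
    broken = {{3: 'Hooray!'}: '12'}
except TypeError as error:
    message = str(error)
    print('TypeError:', message)
```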

In [5]:
programming_languages = {
    'Fortran': 1957,
    'Algol 60': 1960,
    'C': 1972,
    'Perl': 1987,
    'Python': 1991,
    'Julia': 2012,
    'Mojo': 2023,
}

Assignments cause the variable to obtain a value as a side effect. The assignment itself has no result value, so we see no output. But the notebook will remember the `programming_languages` dictionary for us, so we can use it in the following code blocks.

In [6]:
programming_languages[Perl]

NameError: name 'Perl' is not defined

If you write a word without quotes, Python attempts to find a variable with that name. We did not define one, so we get a `NameError`.

In [7]:
programming_languages[1960]

KeyError: 1960

`1960` is a value that appears in the dictionary, but the `[]` notation only looks for keys.

In [8]:
programming_languages['Perl']

1987

Third time's the charm! `'Perl'` is, indeed, a key that appears in the dictionary, and its associated value is `1987`.

In [9]:
{None: None}[None]

The notation is valid; `None` can be both a key and a value in a dictionary. The lookup is successful and returns `None`. Notebooks only echo the last value when it is **not** `None`, so we see no output.

You can verify that there is a value by wrapping the entire expression in `print()`.
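
A short sketch of that check, storing the result in a variable (the name `result` is just for illustration):

```python
result = {None: None}[None]

# print() always shows its argument, even when it is None.
print(result)

# Comparing with `is None` confirms what the lookup returned.
print(result is None)
```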

In [10]:
programming_languages.get('Per1', 2125)

2125

Note that we wrote `'Per1'`, not `'Perl'`. See the difference? Because `'Per1'` is not in the dictionary, the default (fallback) value of `2125` is used.

In [11]:
programming_languages.get('Per1')

`'Per1'` is still not in the dictionary. Since we don't supply a default in this case, we get `None`, which produces no output.

In [12]:
programming_languages.get('Python', None)

1991

`'Python'` is in the dictionary, so we get its associated value. The default of `None` is never used.

In [13]:
programming_languages.get('Python')

1991

The result is the same as in the previous code block, because the default value is irrelevant.
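
The four `.get` cases above can be summarised in one sketch. It uses a small stand-in dictionary called `languages` (not part of the exercise) so that it can run on its own, and contrasts `.get` with the `[]` notation.

```python
languages = {'Perl': 1987, 'Python': 1991}

found = languages.get('Python', None)    # key exists: the default is ignored
fallback = languages.get('Per1', 2125)   # key missing: the default is returned
missing = languages.get('Per1')          # key missing, no default given: None

# Square brackets do not have a fallback; a missing key raises KeyError.
try:
    languages['Per1']
    raised = False
except KeyError:
    raised = True

print(found, fallback, missing, raised)
```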

In [14]:
'Per1' in programming_languages

False

There's that `'Per1'` again, which is not in the dictionary.

In [15]:
'Fortran' in programming_languages

True

`'Fortran'`, on the other hand, is in the dictionary, hence `True`.

In [16]:
2012 in programming_languages

False

`2012` is in the dictionary as a value but not as a key. `in` only looks for keys.

In [17]:
programming_languages.update({'Per1': 2125, 'Raku': 2015})

`update` changes `programming_languages` as a side effect. There is no return value, so we see no output.

In [18]:
2012 in programming_languages.values()

True

Since we query the `.values()` of `programming_languages`, `2012` is now found.

In [19]:
('Per1', 'Perl') in programming_languages.items()

False

When we look `in` a dictionary's `.items()`, we query `(key, value)` pairs. `('Per1', 2125)` would return `True` and so would `('Perl', 1987)`, but not `('Per1', 1987)` or `('Perl', 2125)`.
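
As a recap of the last few cells, here is a sketch that compares `in` on the dictionary itself, on `.values()` and on `.items()`. The dictionary `languages` is a shortened stand-in for `programming_languages` after the update.

```python
languages = {'Perl': 1987, 'Julia': 2012, 'Per1': 2125}

key_hit = 'Perl' in languages                      # keys are searched
value_as_key = 2012 in languages                   # values are not
value_hit = 2012 in languages.values()             # unless we ask for the values
pair_hit = ('Perl', 1987) in languages.items()     # a matching (key, value) pair
pair_miss = ('Per1', 1987) in languages.items()    # key and value do not belong together

print(key_hit, value_as_key, value_hit, pair_hit, pair_miss)
# prints: True False True True False
```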

In [22]:
del programming_languages[2012]

for language, year in programming_languages.items():
    print(f'{language} first appeared in {year}')

KeyError: 2012

The first line attempts to remove the `2012` key from the dictionary, but there is none. Hence a `KeyError`. If you comment out the first line, you will find that the loop prints a nice line for each programming language.
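
If the intention had been to remove the language that first appeared in 2012 (an assumption on our part; the exercise only asks for a prediction), you would have to find its key first. The sketch below uses a small stand-in dictionary. It also shows `.pop()` with a default, which does not raise a `KeyError` when the key is missing.

```python
languages = {'Python': 1991, 'Julia': 2012, 'Mojo': 2023}

# .pop() with a default returns that default if the key is missing.
removed = languages.pop(2012, None)   # 2012 is not a key, so nothing is removed

# To delete by value, first collect the matching keys, then delete them.
# Collecting first avoids changing the dictionary while looping over it.
keys_from_2012 = []
for language, year in languages.items():
    if year == 2012:
        keys_from_2012.append(language)

for language in keys_from_2012:
    del languages[language]

for language, year in languages.items():
    print(f'{language} first appeared in {year}')
```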

2. The code below attempts to count the frequencies of the individual characters in our party invitation from module 6. There is a bug which prevents it from working. Fix the bug.

In [23]:
invitation = '''
    Dear Sheean,

    I hereby invite you for my Python party on the 11th of April.
    The bar will open at 2 PM. 🍸 Please bring pseudocode.

    Yours sincerely,
    Julian
'''

frequencies = {}
for character in invitation:
    count = frequencies.get(character, 0)
    # Here is the bug: dictionaries have a .get method, but no .set.
    #frequencies.set(character, count + 1)
    # The fix is to use assignment with a square bracket lookup.
    frequencies[character] = count + 1
print(frequencies)

{'\n': 8, ' ': 44, 'D': 1, 'e': 15, 'a': 7, 'r': 9, 'S': 1, 'h': 6, 'n': 8, ',': 2, 'I': 1, 'b': 3, 'y': 6, 'i': 7, 'v': 1, 't': 6, 'o': 9, 'u': 4, 'f': 2, 'm': 1, 'P': 3, 'p': 4, '1': 2, 'A': 1, 'l': 6, '.': 3, 'T': 1, 'w': 1, '2': 1, 'M': 1, '🍸': 1, 's': 4, 'g': 1, 'd': 2, 'c': 2, 'Y': 1, 'J': 1}
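
To see what the original bug looks like at run time, the sketch below calls the non-existent `.set` method on a tiny stand-in dictionary and catches the resulting `AttributeError`. It then applies the fix from the solution.

```python
frequencies = {}

try:
    frequencies.set('a', 1)   # dictionaries have no .set method
except AttributeError as error:
    message = str(error)
    print('AttributeError:', message)

# The fix: assign through square brackets instead.
frequencies['a'] = frequencies.get('a', 0) + 1
print(frequencies)
```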


3. Below are two dictionaries containing information about different types of fruit. Print a nice message about each fruit stating its colour and price. For example, _An apple is red and costs € 2.50_, etc.

In [None]:
fruit_colors = {'apple': 'red', 'banana': 'yellow', 'orange': 'orange'}
fruit_prices = {'apple': 2.50, 'banana': 2.10, 'orange': 1.50}

# The example sentence includes 'a'/'an' ('an apple' / 'a banana')
# which was a bit of an oversight: the point of this exercise
# was to loop through a dictionary and access values.
# You could make a solution like this to avoid it.

for fruit in fruit_colors:
    color = fruit_colors[fruit]
    price = fruit_prices[fruit]

    print(fruit + 's', 'are', color, 'and cost €', price, sep=' ')

# Bonus points if you DID implement a solution for choosing between
# 'a' and 'an'! Here is an example of how you can do that.

print()

def starts_with_vowel(word):
    '''
    Returns True if the word (a string) starts with a vowel.
    '''
    vowels = ['a', 'e', 'i', 'o', 'u']
    if word[0] in vowels:
        return True
    return False


for fruit in fruit_colors:
    color = fruit_colors[fruit]
    price = fruit_prices[fruit]

    if starts_with_vowel(fruit):
        article = 'An'
    else:
        article = 'A'

    print(article, fruit, 'is', color, 'and costs €', price, sep=' ')

apples are red and cost € 2.5
bananas are yellow and cost € 2.1
oranges are orange and cost € 1.5

An apple is red and costs € 2.5
A banana is yellow and costs € 2.1
An orange is orange and costs € 1.5
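
The example sentence in the exercise shows the price as `2.50`, while the output above shows `2.5`. If you want two decimals, one option is an f-string with the `:.2f` format specifier. This is a sketch of a possible variation, not part of the original solution; the `messages` list is only there to collect the results.

```python
fruit_colors = {'apple': 'red', 'banana': 'yellow', 'orange': 'orange'}
fruit_prices = {'apple': 2.50, 'banana': 2.10, 'orange': 1.50}

messages = []
for fruit, color in fruit_colors.items():
    price = fruit_prices[fruit]

    if fruit[0] in 'aeiou':
        article = 'An'
    else:
        article = 'A'

    # :.2f formats the number with exactly two digits after the decimal point
    messages.append(f'{article} {fruit} is {color} and costs € {price:.2f}')

for message in messages:
    print(message)
```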


4. Here is a longer list of fruit colours. Write a function `count_fruits` which gets a colour as input and returns the number of fruits that have that colour (according to `lots_of_fruit`).

In [None]:
lots_of_fruit = {'apple': 'red', 'banana': 'yellow', 'orange': 'orange',
                 'cucumber': 'green', 'kiwi': 'green', 'strawberry': 'red',
                 'pineapple': 'yellow','blackberry': 'black', 'cherry': 'red',
                 'gooseberry': 'green', 'raspberry': 'red', 'mandarin': 'orange',
                 'lemon': 'yellow', 'lime': 'green'}

In [None]:
def count_fruits(color):
    '''Count the number of fruits in `lots_of_fruit` that match this colour.'''
    count = 0
    for value in lots_of_fruit.values():
        if value == color:
            count = count + 1
    return count

# let's see if it works!
assert count_fruits('red') == 4
assert count_fruits('lavender') == 0
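
A shorter alternative, shown here as a sketch: convert the values to a list and use the list method `.count()`. The function name `count_fruits_short` and the shortened dictionary are made up for this example.

```python
lots_of_fruit = {'apple': 'red', 'kiwi': 'green', 'cherry': 'red',
                 'lemon': 'yellow'}

def count_fruits_short(color):
    '''Count the number of fruits in `lots_of_fruit` that match this colour.'''
    # .values() gives all colours; list.count() counts how often one occurs
    return list(lots_of_fruit.values()).count(color)

print(count_fruits_short('red'))
```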

5. The list `fruit_basket` contains a bunch of fruits. Can you make a dictionary `fruit_counts` which gives the amount for each fruit in `fruit_basket`? (Do not count the fruits by hand!)

In [None]:
fruit_basket = ['apple', 'banana', 'banana', 'banana', 'apple', 'orange',
                'orange', 'grape', 'grape', 'grape', 'grape', 'grape', 'grape',
                'grape', 'grape', 'grape', 'pear', 'apple', 'strawberry',
                'strawberry', 'strawberry', 'orange']

In [None]:
def count_items(items):
    '''
    Count the items in a list.

    Input: a list of items, such as strings
    Output: a dictionary with the total of occurrences for each item.
    '''

    counts = dict() # we will keep track of our counts in here!

    for item in items:
        # the current count: either the current value in the dictionary
        # or 0 if we haven't seen this fruit yet
        current_count = counts.get(item, 0)
        new_count = current_count + 1
        counts[item] = new_count

    return counts

fruit_counts = count_items(fruit_basket)

# let's see if it works!
assert fruit_counts['apple'] == 3
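
Counting items is such a common task that Python's standard library has a ready-made tool for it: `collections.Counter`, a dictionary subclass. The sketch below is an alternative to `count_items`, not a replacement for the exercise; `basket` and `basket_counts` are illustrative names.

```python
from collections import Counter

basket = ['apple', 'banana', 'banana', 'apple', 'grape', 'apple']
basket_counts = Counter(basket)

print(basket_counts['apple'])
print(basket_counts['mango'])         # a Counter gives 0 for items it has not seen
print(basket_counts.most_common(1))   # list of (item, count) pairs, highest count first
```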


6. Here is a different list, which contains the words in a sentence. Can you use your code above to make a dictionary `word_counts` telling us how often each word occurs? (Tip: if you need to do very similar tasks, make a function!)

Write a function that takes a dictionary like `word_counts` and tells us the most commonly occurring item and its count. Note that there can be multiple items that occur the most often.

In [None]:
# the variable sent0 contains the first sentence of The Catcher in the Rye
# split into single words
sent0 = ['If', 'you', 'really', 'want', 'to', 'hear', 'about', 'it,', 'the',
         'first', 'thing', 'you’ll', 'probably', 'want', 'to', 'know', 'is',
         'where', 'I', 'was', 'born,', 'and', 'what', 'my', 'lousy', 'childhood',
         'was', 'like,', 'and', 'how', 'my', 'parents', 'were', 'occupied',
         'and', 'all', 'before', 'they', 'had', 'me,', 'and', 'all', 'that',
         'David', 'Copperfield', 'kind', 'of', 'crap,', 'but', 'I', 'don’t',
         'feel', 'like', 'going', 'into', 'it,', 'if', 'you', 'want',
         'to', 'know', 'the', 'truth.']

In [None]:
word_counts = count_items(sent0) # we recycle our function from the last exercise

In [None]:
def most_frequent(counts):
    '''
    For a dictionary with totals, return the most commonly occurring item(s) and the count.

    Input should be a dictionary with the total number of occurrences for each key in some collection.
    Returns a tuple of two items. First is a list of the most frequent item(s). If the input
    is an empty dict, the list is empty. Second is the number of occurrences for that item.
    '''

    if not counts:
        return [], 0

    max_count = max(counts.values())

    top_items = []
    for item, count in counts.items():
        if count == max_count:
            top_items.append(item)

    return top_items, max_count

words, total = most_frequent(word_counts)
print(words, total)

# here are some assert statements you could use to check your own function
# feel free to adapt them if your function gives a different output format
assert most_frequent(fruit_counts) == (['grape'], 9)
assert most_frequent(word_counts) == (['and'], 4)
assert most_frequent({}) == ([], 0)

['and'] 4
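
None of the checks above involve a tie, even though the exercise mentions that several items can share the highest count. The sketch below repeats the function in condensed form, so that it can run on its own, and tries it on a made-up dictionary with a tie.

```python
def most_frequent(counts):
    '''Return a list of the most frequent item(s) and their count.'''
    if not counts:
        return [], 0

    max_count = max(counts.values())

    top_items = []
    for item, count in counts.items():
        if count == max_count:
            top_items.append(item)

    return top_items, max_count

# 'apple' and 'pear' share the highest count, so both are returned
tied_items, tied_count = most_frequent({'apple': 2, 'pear': 2, 'kiwi': 1})
print(tied_items, tied_count)
```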


## Next module

[11 - Working with files](https://colab.research.google.com/drive/1KsFZV-jmfaQnCFevSxIZrd7chm3Z5CJo) - [solutions](https://colab.research.google.com/drive/1UZywfMphqJx8iB7aFvBH4ePKBh7E_-Hd)