# Dictionaries 2 (Manipulating)

We've seen how dictionaries can be used to store information in {key: value} pairs, like this.

In [None]:
# Your average dictionary

room_to_square_feet = {
    # key     : value
    'dining'  : 140,
    'living'  : 160,
    'bedroom' : 120,
    'bathroom': 40,
    'kitchen' : 80
}

Now let's see how to work with them in code, rather than typing the data by hand.

## From phonebooks to frequencies

Last time, we created a phonebook where you added names and numbers in a loop. Starting from a contacts dictionary, we repeatedly added entries to it:

In [None]:
# Phonebook

contacts = {
    'John': '647-123-5678',
    'Gus': '905-555-1234'
}

# Repeatedly ask for name & number till the user quits
choice = input('Press Enter to add a contact or Q to quit: ')
while choice.strip().upper() != 'Q':
  name = input('Enter a name: ')
  number = input('Enter a number: ')
  contacts[name] = number
  choice = input('Press Enter to add a contact or Q to quit: ')

print(contacts)

Let's see now how to update values that are already in the dictionary, in order to create a tally.

We'll begin by taking a string. We'll loop through it and put each letter into a dictionary. But we're likely to see the same letter more than once, so what do we do?

We'll make the letter the key, and the value will be the *number of times we've seen it*. So it'll start at 1, and each time we see it again, we'll increase it by 1.

In [None]:
# Frequency of letters in a sentence

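def letter_frequency(s: str) -> dict:
  counts = {}

  for char in s:
    if not char.isalpha():
      continue # Skip spaces and punctuation so only letters are counted
    char = char.lower() # Normalize so 'W' and 'w' count as the same letter
    if char not in counts:
      counts[char] = 1 # First time we've seen this letter
    else:
      counts[char] += 1 # Seen before, so add to the tally

  return counts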

message = 'What if all the icebergs melt? Where will we get ice cream?'
counts = letter_frequency(message)
print(counts)

This allows us to track the **frequency** of each letter.
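The same counting pattern can be written a little more compactly with the dictionary's `get` method, which returns a default when the key is missing. This is only a sketch: the name `letter_frequency_short` is made up for illustration, and unlike a letters-only count it tallies every character it is given.

```python
def letter_frequency_short(s: str) -> dict:
  counts = {}
  for char in s.lower():
    # get() returns 0 if char isn't a key yet, so no if/else is needed
    counts[char] = counts.get(char, 0) + 1
  return counts

result = letter_frequency_short('Banana')
print(result)  # {'b': 1, 'a': 3, 'n': 2}
```

Both versions do the same work. The `if`/`else` version just spells out the two cases.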

### Frequency and code-breaking

You might be curious to know that counting frequencies is one way to break ciphers like the random-assignment one we saw yesterday. Suppose we replace the sentence with an encoded one:

In [None]:
# Frequency graph for encoded letters

message = "ijln kh luu njm ktmwmgyq cmun? ijmgm ikuu im ymn ktm tgmlc?"
counts = letter_frequency(message)
print(counts)

The most common letter, by far, is `'m'`. In the regular English sentence, that was `'e'`. Therefore, there's a pretty good chance that `'m'` stands for `'e'`. This sample is small, so the guess is unreliable, but with enough text, you can break a simple substitution cipher like this one by matching letter frequencies.
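As a rough sketch of that comparison, the snippet below finds the most common letter in both sentences. The helper name `most_common_letter` is hypothetical, and it ignores spaces and punctuation so that only letters compete.

```python
def most_common_letter(text: str) -> str:
  counts = {}
  for char in text.lower():
    if char.isalpha():  # Only count letters
      counts[char] = counts.get(char, 0) + 1
  # max() with key=counts.get picks the key with the largest value
  return max(counts, key=counts.get)

plain = 'What if all the icebergs melt? Where will we get ice cream?'
encoded = 'ijln kh luu njm ktmwmgyq cmun? ijmgm ikuu im ymn ktm tgmlc?'

print(most_common_letter(plain))    # e
print(most_common_letter(encoded))  # m
```

One matching pair is only a first guess. A codebreaker would go on to compare the second and third most common letters too.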

**Question:** If you were trying to write hard-to-decode messages, how could you avoid the above problem?

<details>
<summary>Click to reveal</summary>

> One way is to intentionally write messages using the less common letters.
</details>

### Sorting the most frequent letters

If we want to find the most frequent letter — the mode — how can we do that?

We can sort the keys based on their value. It looks a little tricky but isn't so bad.

In [None]:
# Finding the mode

message = "Dang it, did you throw out all my old albums?!"
counts = letter_frequency(message)

# Sort the keys based on value
most_frequent = sorted(counts, key=counts.get)

# Python sorts smallest to largest, so reverse it to get most frequent first
most_frequent.reverse()
print(most_frequent)

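Tied letters are not sorted any further. Their order is not random, though: it depends on the order in which the letters were first seen and on how the list was reversed, so don't rely on it.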
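To see how ties behave, here is a small sketch with a hand-made dictionary (the data is invented for illustration). It compares reversing the list afterwards with passing `reverse=True` to `sorted`, assuming a Python version where dictionaries keep insertion order (3.7 or later).

```python
counts = {'a': 2, 'b': 3, 'c': 2}  # 'a' and 'c' are tied

# Option 1: sort smallest to largest, then flip the whole list
flipped = sorted(counts, key=counts.get)
flipped.reverse()

# Option 2: ask sorted() for largest first
largest_first = sorted(counts, key=counts.get, reverse=True)

print(flipped)        # ['b', 'c', 'a']
print(largest_first)  # ['b', 'a', 'c']
```

Both lists put `'b'` first, but the tied letters come out in opposite orders.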

### Your turn: Most frequent numbers

Recreate the sorted frequency list above, with two differences:
* Ask the user to enter a set of numbers, separated by spaces
* Sort the numbers, rather than letters, from most to least frequent

Example:
```
Enter some numbers: 15 2 3 4 2 7 9 3 0 8 2
[2, 3, 15, 4, 7, 9, 0, 8]
```

<details>
<summary>Click for hint</summary>

> Don't forget to split the user's input in order to get the separate numbers.
</details>

In [None]:
# Find the mode

# Get a list of integer numbers
# TODO

# Make the frequency dictionary
# TODO

# Sort it from most frequent to least
# TODO

# Print the output
print(most_frequent)

## Looping through dictionaries

Unlike most containers, dictionaries give you two for the price of one: a key and a value. So when you loop through a dictionary, what do you get? The key, the value, or both?

Try it and find out:

In [None]:
# Looping through a dictionary

pokemon_to_evolution = {
    'Charmander': 'Charizard',
    'Squirtle': 'Blastoise',
    'Bulbasaur': 'Venusaur'
}

# Loop through the dictionary just like a list and print each item
# TODO

Well, were the loop items keys, values, or both somehow?

<details>
<summary>Click to reveal</summary>

> When you loop through a dictionary, the keys are the items. In this case, you should have gotten unevolved Pokémon names.
</details>

This might seem limiting, but it isn't so bad, because you can use the key to look up its value.

In [None]:
# Getting both key and value

pokemon_to_evolution = {
    'Charmander': 'Charizard',
    'Squirtle': 'Blastoise',
    'Bulbasaur': 'Venusaur'
}

for key in pokemon_to_evolution:
  value = pokemon_to_evolution[key]
  print(f'{key} evolves to {value}')

You can also skip the keys and loop through the values directly with `.values()`, though you rarely need to do this.

In [None]:
# Looping directly through values

pokemon_to_evolution = {
    'Charmander': 'Charizard',
    'Squirtle': 'Blastoise',
    'Bulbasaur': 'Venusaur'
}

for value in pokemon_to_evolution.values():
  print(value)
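If you want the key and the value at the same time, Python dictionaries also have an `.items()` method that hands you both on each pass. A minimal sketch, where the loop variable names `name` and `evolution` and the `lines` list are just illustrative choices:

```python
pokemon_to_evolution = {
    'Charmander': 'Charizard',
    'Squirtle': 'Blastoise',
    'Bulbasaur': 'Venusaur'
}

lines = []
# items() gives a (key, value) pair each time through the loop
for name, evolution in pokemon_to_evolution.items():
  lines.append(f'{name} evolves to {evolution}')

for line in lines:
  print(line)
```

This prints the same three sentences as the key-lookup loop above, without the extra `pokemon_to_evolution[key]` step.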

### Concept check

Take our letter frequency code or your number frequency code from above.

Copy and paste it, then add a loop through the sorted list that prints each item along with how many times it appears.

Example:
```
Enter some numbers: 4 2 1 4 3 1 4 2 3 4 1 4
4: 5
1: 3
3: 2
2: 2
```

<details>
<summary>Click for hint</summary>

> The number of times it appears can be found by looking up the key in the dictionary, as in `dict[key]`.
</details>

In [None]:
# Loop through & print values

# Copy and paste your previous code
# TODO

# Add the loop print
# TODO

## Inverting dictionaries

Finally, let's learn how to **invert** a dictionary. Inverting means that the keys become values and vice-versa.

### Challenge: Simple inversion

Let's make a first attempt. Loop through this dictionary and store each {key: value} pair as {value: key} in the new dictionary.

The result should look like:
```
{
    'Charizard': 'Charmander',
    'Blastoise': 'Squirtle',
    'Venusaur': 'Bulbasaur'
}
```

In [None]:
# Simple inversion

pokemon_to_evolution = {
    'Charmander': 'Charizard',
    'Squirtle': 'Blastoise',
    'Bulbasaur': 'Venusaur'
}

evolution_to_original = {}

# Loop through and invert the dictionary
# TODO

print(evolution_to_original)

### Complex inversion

OK, all well and good. But what about this example?

In [None]:
# Problematic inversion

hair_colours = {
    'Kamiye': 'brown',
    'Liam': 'blond',
    'Winston': 'black',
    'Camille': 'brown'
}

Can you see what would go wrong if you ran your code above on this dictionary?

<details>
<summary>Click to reveal</summary>

> Keys are unique. So when a value like `'brown'` appears twice, which name is the new key supposed to point to?
</details>
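To see concretely what happens when the same key is set twice, here is a tiny sketch. The dictionary name `colour_to_name` is made up for illustration.

```python
colour_to_name = {}

colour_to_name['brown'] = 'Kamiye'
colour_to_name['brown'] = 'Camille'  # Same key again: the old value is replaced

print(colour_to_name)  # {'brown': 'Camille'}
```

No error is raised. The first name is silently overwritten, which makes this problem easy to miss.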

### Domains & ranges

Let's compare dictionaries to functions once more. As a reminder, in math we said that functions turn input into output. The possible inputs are called the domain, and the possible outputs are called the range.

There is a restriction on mathematical functions. Each input must point to exactly one output. That is, if I input an `x` value, I will get a single `y` value back so I can make an ordered pair `(x, y)`. We call this the vertical line test: if you draw a vertical line down a graph, it's not supposed to cross your function more than once.

![Elephant.jpg](https://i.imgur.com/rm9AJcS.png)

A dictionary satisfies this condition of functions: each key points to exactly one value. Notice that a single value in the range can occur more than once, like in the parabola above. A horizontal line is allowed to cross a function more than once. In the same way, two students have brown hair above.

Do you see the problem yet? Hang tight...

When we invert a dictionary, we swap the keys and the values. When we invert a function, we swap the domain and the range, the x axis and the y axis, which flips the graph across the diagonal line y = x and tips the parabola on its side. A vertical line can now cross the graph twice, so it may no longer be a function.

![Elephant.jpg](https://i.imgur.com/c6w6ACa.png)



### Square roots revisited

What if I told you our square root dictionary was wrong?...

In [None]:
# Square root dict (incomplete)

square_roots = {
    '64': 8,
    '49': 7,
    '36': 6
}

From quadratics in math, you might remember that an equation like x² = 64 has not ONE but TWO solutions... What's the other one? And how do we store it in a dictionary?

Here's how!

In [None]:
# Square root dict (complete)

square_roots = {
    '64': [8, -8],
    '49': [7, -7],
    '36': [6, -6]
}

In short, we fix the problem of needing to map to more than one value by making the one value... a list. That way, we can pack many into one.
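Once the value is a list, you read it in two steps: look up the key to get the list, then index or loop through the list as usual. A short sketch using the dictionary above (the variable names `roots` and `squares` are just for illustration):

```python
square_roots = {
    '64': [8, -8],
    '49': [7, -7],
    '36': [6, -6]
}

roots = square_roots['49']  # Looking up the key gives back the whole list
print(roots)     # [7, -7]
print(roots[1])  # -7

# Check that each root of 36 really squares back to 36
squares = []
for root in square_roots['36']:
  squares.append(root * root)
print(squares)   # [36, 36]
```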

### Challenge: Complex inversion

OK, now we know enough to invert the hair colour dictionary. The algorithm will be similar to how we counted the frequencies — except instead of adding 1 to an int, we'll append to a list.

1. Go through the keys of the original dictionary, as before.
2. If a hair colour is new, set it to a one-item list [name].
3. If it's not new, append the name to the existing list.

The result should look like:
```
{
  'blond': ['Liam'],
  'brown': ['Kamiye', 'Camille'],
  'black': ['Winston']
}
```

<details>
<summary>Click for hint</summary>

> Reuse the structure from the frequency block above: `if char not in counts` would become `if colour not in hair_colour_to_names`, and so on, except appending instead of adding.
</details>
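If the steps still feel abstract, here is the same "new key gets a one-item list, existing key gets an append" pattern on a different problem: grouping words by their first letter. The word list and the name `first_letter_to_words` are invented for this sketch. It is not the solution to the challenge, only the same shape.

```python
words = ['apple', 'banana', 'avocado', 'cherry', 'blueberry']

first_letter_to_words = {}

for word in words:
  letter = word[0]
  if letter not in first_letter_to_words:
    first_letter_to_words[letter] = [word]       # New key: start a one-item list
  else:
    first_letter_to_words[letter].append(word)   # Existing key: add to its list

print(first_letter_to_words)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```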

In [None]:
# Complex inversion

name_to_hair_colour = {
    'Liam': 'blond',
    'Kamiye': 'brown',
    'Camille': 'brown',
    'Winston': 'black',
}

hair_colour_to_names = {}

# Loop through and invert the dictionary
# TODO

# Print the result
print(hair_colour_to_names)