## Dictionaries

### Motivating example: Shortcomings of lists

We have already seen `list` objects as a way of storing data. For a list, we use the _index_ to look up the value. _Values_ can be repeated, but indices cannot:

In [1]:
example_list = [0, 1, 1, 1, 3, 3, 6, 6, 6]

In `example_list`, the values `1`, `3` and `6` are all repeated, but the indices run from `0` to `8`. Each entry has a unique index.

Sometimes, looking things up by a numerical index is inconvenient. Consider a list of states and capitals:

In [15]:
states_and_capitals = [
    ['California', 'Sacramento'],
    ['Oregon', 'Salem'],
    ['Washington', 'Olympia'],
    ['Alaska', 'Juneau'],
    ['Hawaii', 'Honolulu'],
    ['New York', 'Albany'],
    ['Maine', 'Portland'],
    ['Illinois', 'Springfield']
]

In [3]:
states_and_capitals[2]

['Washington', 'Olympia']

Suppose we want to look up the capital of California. We need to iterate through the list, where each element is itself a two-item list. Once we find 'California', we can record its capital:

In [16]:
state_to_find = 'California'
capital = ''

for state_and_capital in states_and_capitals:
    state = state_and_capital[0]
    if state == state_to_find:
        capital = state_and_capital[1]

print(capital)

Sacramento


We can make slightly slicker versions of this loop using _variable unpacking_ and the `break` statement, but for the moment, let's keep the focus on the problem at hand:
> Lists are not very convenient for looking things up by anything other than an index. We _can_ do it by making "lists of lists", but it isn't very efficient AND it makes us write a bunch of code that obscures what we are doing.
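For reference, here is a minimal sketch of what the "slicker" loop might look like with variable unpacking and `break`. The name `state_capital_pairs` is a shortened, hypothetical copy of the list above, used so the example stands alone.

```python
# A shortened copy of the list of [state, capital] pairs (illustrative only)
state_capital_pairs = [
    ['California', 'Sacramento'],
    ['Oregon', 'Salem'],
    ['Hawaii', 'Honolulu'],
]

state_to_find = 'Hawaii'
capital = ''

# Unpack each two-item inner list into two names as we loop
for state, state_capital in state_capital_pairs:
    if state == state_to_find:
        capital = state_capital
        break  # stop searching once we have found a match

print(capital)
```

Even in this form we still have to scan the list one entry at a time, which is the underlying problem.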

### How a dictionary solves this problem

A dictionary lets us use a `key` to look up a `value`, where the key can be any immutable (more precisely, hashable) object such as a string, a number, or a tuple. Instead of looking a value up by _index_, we look it up by _key_. The idea is similar to a paper dictionary, where you use the word (the key) to look up its meaning (the value).

Let's try to make this clearer with the states/capitals example:

In [38]:
# syntax is 
# { key1: value1, key2: value2, ...... }

states_and_capitals = {
    'California': 'Sacramento',  # California is the key, Sacramento is the value
    'Oregon': 'Salem',           
    'Washington': 'Olympia',
    'Alaska': 'Juneau',
    'Hawaii': 'Honolulu',
    'New York': 'Albany',
    'Maine': 'Portland',
    'Illinois': 'Springfield'
}

We can use the keys to look things up:

In [39]:
states_and_capitals['California']

'Sacramento'

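We **cannot** look things up by value: `states_and_capitals['Sacramento']` would raise a `KeyError`. Printing the dictionary shows which entries are keys and which are values: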

In [40]:
print(states_and_capitals)

{'California': 'Sacramento', 'Oregon': 'Salem', 'Washington': 'Olympia', 'Alaska': 'Juneau', 'Hawaii': 'Honolulu', 'New York': 'Albany', 'Maine': 'Portland', 'Illinois': 'Springfield'}
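To see the error for yourself, here is a small sketch that uses a two-entry dictionary (the name `capitals` is made up for this example). It also shows that going from a value back to its key means searching every item, just as we had to with lists.

```python
capitals = {'California': 'Sacramento', 'Oregon': 'Salem'}

try:
    capitals['Sacramento']  # 'Sacramento' is a value, not a key
except KeyError as error:
    print('KeyError:', error)

# Finding the key(s) for a given value requires checking every item
matches = [state for state, capital in capitals.items() if capital == 'Sacramento']
print(matches)
```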


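The `get` method looks up a key and returns a default (here `'unknown'`) if the key is missing. We can also add new keys easily: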

In [41]:
states_and_capitals.get('Washington', 'unknown')

'Olympia'

In [42]:
states_and_capitals['Texas'] = 'Austin'

However, keys have to be unique. If we assign to an existing key, we overwrite (and lose) the previous value:

In [43]:
print(states_and_capitals['Illinois'])

Springfield


In [44]:
states_and_capitals['Illinois'] = 'Chicago'
print(states_and_capitals['Illinois'])

Chicago


In [45]:
print(states_and_capitals)

{'California': 'Sacramento', 'Oregon': 'Salem', 'Washington': 'Olympia', 'Alaska': 'Juneau', 'Hawaii': 'Honolulu', 'New York': 'Albany', 'Maine': 'Portland', 'Illinois': 'Chicago', 'Texas': 'Austin'}


When dictionaries get big, they can be hard to read. The `pprint` module from the standard library does a nice job of printing them for us (by default it sorts the keys):

In [46]:
import pprint
pprint.pprint(states_and_capitals)

{'Alaska': 'Juneau',
 'California': 'Sacramento',
 'Hawaii': 'Honolulu',
 'Illinois': 'Chicago',
 'Maine': 'Portland',
 'New York': 'Albany',
 'Oregon': 'Salem',
 'Texas': 'Austin',
 'Washington': 'Olympia'}


We can use the `in` operator to check if a key is in a dictionary. Note it only works on keys!

In [47]:
# Note that California is a key
'California' in states_and_capitals

True

In [48]:
# Sacramento isn't a key, but it is a value
'Sacramento' in states_and_capitals

False

In [49]:
'Rhode Island' in states_and_capitals

False

We cannot access dictionaries by index, only by keys:

In [50]:
# This will give an error:
states_and_capitals[0]

KeyError: 0

Dictionaries are not designed to be accessed by position. Since Python 3.7 they do keep their items in insertion order, so a `for` loop visits the keys in the order they were added, but in older versions of Python the order was not guaranteed:

In [51]:
for state in states_and_capitals:
    print(state)

California
Oregon
Washington
Alaska
Hawaii
New York
Maine
Illinois
Texas
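Looping over a dictionary directly gives us the keys. If we want the values as well, the `items()` method yields key/value pairs that we can unpack. A small sketch, using a made-up three-entry dictionary called `capitals`:

```python
capitals = {'California': 'Sacramento', 'Oregon': 'Salem', 'Washington': 'Olympia'}

# items() yields (key, value) pairs, which we unpack into two names
lines = []
for state, capital in capitals.items():
    lines.append(f'The capital of {state} is {capital}')
print('\n'.join(lines))

# keys() and values() give just one side of each pair
print(list(capitals.keys()))
print(list(capitals.values()))
```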


## Brief summary of dictionaries

- Make them with `{key1: value1, key2: value2, .... }`
- Keys must be immutable (strictly speaking, hashable). In practice, use strings, numbers, or tuples as your keys.
- Keys cannot repeat. Assigning to an existing key overwrites its value.
- Values can repeat.
- The `in` operator tests whether a key is in the dictionary or not.
- We can mutate a dictionary. 
  - To add `new_key` to a dictionary `d`, we can write `d[new_key] = .....`
  - To remove `old_key` from a dictionary `d`, we can write `del d[old_key]`
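The points above can be sketched in a few lines. The dictionary `d` and its contents are invented purely for illustration.

```python
d = {'a': 1, 'b': 2}

d['c'] = 3    # add a new key
d['a'] = 10   # assigning to an existing key overwrites the old value
del d['b']    # remove a key
print(d)
print('b' in d)

# Tuples are immutable, so they can be keys
d[(1, 2)] = 'tuple key'

# Lists are mutable (unhashable), so they cannot be keys
list_key_rejected = False
try:
    d[[1, 2]] = 'list key'
except TypeError as error:
    list_key_rejected = True
    print('TypeError:', error)
```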

## Test yourself:

We have a menu with the following items on it:

| Name | Price |
| --- | --- |
| Small fries | 1.00 |
| Hamburger | 1.00 |
| Small drink | 1.00 |
| Medium drink | 1.00 |
| Large drink | 1.00 |
| Medium fries | 1.45 |
| Large fries | 2.00 |
| Cheeseburger | 2.50 |

1. Would we be able to make a dictionary `name_to_price` where the keys are names and the values are the price?
2. Would we be able to make a dictionary `price_to_name` where the keys are prices and the values are the name?

In [53]:
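# Prices repeat, so each price (key) maps to a list of names (value)
price_to_name = {
    1.00: ['Small fries', 'Hamburger', 'Small drink', 'Medium drink', 'Large drink'],
    1.45: ['Medium fries'],
    2.00: ['Large fries'],
    2.50: ['Cheeseburger']
}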
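Rather than typing the grouped dictionary by hand, we could build it from a name-to-price dictionary. This is one possible sketch; the name `price_to_names` is an assumption chosen to avoid clashing with the cell above, and `setdefault` is used to start an empty list the first time a price is seen.

```python
name_to_price = {
    'Small fries': 1.00, 'Hamburger': 1.00, 'Small drink': 1.00,
    'Medium drink': 1.00, 'Large drink': 1.00, 'Medium fries': 1.45,
    'Large fries': 2.00, 'Cheeseburger': 2.50,
}

# Names are unique, so they work as keys. Prices repeat, so we group
# the names that share a price into a list.
price_to_names = {}
for name, price in name_to_price.items():
    price_to_names.setdefault(price, []).append(name)

print(price_to_names[1.00])
```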

## Examples of using a dictionary:

Dictionaries are quick to add keys to and quick to find keys in (they use a trick called _hashing_ that we can talk about if there is interest). Common uses include:

1. **Contact book:** key: name or id, value: phone number
2. **Counters:** key: thing to be counted, value: number of occurrences of that thing
3. **Checking uniqueness:** key: thing to be checked, value: arbitrary value. Then check that the number of keys is the same as the number of things you started with (if not, there were some repeats)
4. **More readable data structures:** We can get away with storing information in lists such as `[name, age, salary]`, but then we have to remember the order. A dictionary's keys can make it easier for the next person to read.
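As a rough illustration of uses 3 and 4, here is a sketch with invented data (the names, ages, and salaries are made up):

```python
# Checking uniqueness: every name becomes a key, and repeats collapse
names = ['Ada', 'Grace', 'Alan', 'Ada']
seen = {}
for name in names:
    seen[name] = True  # the value is arbitrary, only the keys matter

all_unique = len(seen) == len(names)
print(all_unique)

# More readable data structures: compare a list with a dictionary
employee_as_list = ['Ada', 36, 120000]  # which position was the salary?
employee = {'name': 'Ada', 'age': 36, 'salary': 120000}
print(employee['salary'])
```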


## Example: letter counter

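Given `text`, count how many times each letter appears in `text`. As a warm-up, we first count the digits of pi using a list, which works because each digit can double as an index: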

In [58]:
import math
pi_string = str(math.pi)

digit_count = [0 for num in range(10)]
for digit in pi_string:
    if digit.isnumeric():
        digit_count[int(digit)] = digit_count[int(digit)] + 1

print(digit_count)
for index in range(10):
    print(f'Digit {index} appears {digit_count[index]} times in {pi_string}')

[0, 2, 1, 3, 1, 3, 1, 1, 1, 3]
Digit 0 appears 0 times in 3.141592653589793
Digit 1 appears 2 times in 3.141592653589793
Digit 2 appears 1 times in 3.141592653589793
Digit 3 appears 3 times in 3.141592653589793
Digit 4 appears 1 times in 3.141592653589793
Digit 5 appears 3 times in 3.141592653589793
Digit 6 appears 1 times in 3.141592653589793
Digit 7 appears 1 times in 3.141592653589793
Digit 8 appears 1 times in 3.141592653589793
Digit 9 appears 3 times in 3.141592653589793


In [84]:
# In Dickens's day, authors were paid by the word
text = """
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, 
it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of 
Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing 
before us, we were all going direct to Heaven, we were all going direct the other way—in short, the period was 
so far like the present period, that some of its noisiest authorities insisted on its being received, for good 
or for evil, in the superlative degree of comparison only.
"""

In [80]:
#text = pi_string

In [90]:
counter = {}

for word in text.lower().split():
    counter[word] = counter.get(word, 0) + 1
    
counter

{'it': 10,
 'was': 11,
 'the': 14,
 'best': 1,
 'of': 12,
 'times,': 2,
 'worst': 1,
 'age': 2,
 'wisdom,': 1,
 'foolishness,': 1,
 'epoch': 2,
 'belief,': 1,
 'incredulity,': 1,
 'season': 2,
 'light,': 1,
 'darkness,': 1,
 'spring': 1,
 'hope,': 1,
 'winter': 1,
 'despair,': 1,
 'we': 4,
 'had': 2,
 'everything': 1,
 'before': 2,
 'us,': 2,
 'nothing': 1,
 'were': 2,
 'all': 2,
 'going': 2,
 'direct': 2,
 'to': 1,
 'heaven,': 1,
 'other': 1,
 'way—in': 1,
 'short,': 1,
 'period': 1,
 'so': 1,
 'far': 1,
 'like': 1,
 'present': 1,
 'period,': 1,
 'that': 1,
 'some': 1,
 'its': 2,
 'noisiest': 1,
 'authorities': 1,
 'insisted': 1,
 'on': 1,
 'being': 1,
 'received,': 1,
 'for': 2,
 'good': 1,
 'or': 1,
 'evil,': 1,
 'in': 1,
 'superlative': 1,
 'degree': 1,
 'comparison': 1,
 'only.': 1}
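Notice that `'period'` and `'period,'` are counted as different words because the punctuation is still attached. One possible fix, sketched here on a short sample rather than the full text, is to strip punctuation from the ends of each word with `str.strip` and `string.punctuation`. The names `sample` and `clean_counter` are made up for this example.

```python
import string

sample = 'It was the best of times, it was the worst of times,'

clean_counter = {}
for word in sample.lower().split():
    # Remove ASCII punctuation from both ends of the word
    word = word.strip(string.punctuation)
    if word:  # skip anything that was only punctuation
        clean_counter[word] = clean_counter.get(word, 0) + 1

print(clean_counter)
```

Note that `string.punctuation` only covers ASCII characters, and `strip` only touches the ends of a word, so punctuation in the middle of a token is left alone.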

In [69]:
states_and_capitals['Rhode Island'] = 'Providence'

In [64]:
counter['it']

In [65]:
counter['it'] = 0

{'it': 0}

In [89]:
states_and_capitals.get('California', 'not in the dictionary')

'Sacramento'

In [None]:
letter_freq = {}
for letter in text:
    if letter not in letter_freq:
        letter_freq[letter] = 0
    letter_freq[letter] += 1

In [None]:
pprint.pprint(letter_freq)

## Sorting
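One way to approach sorting, sketched with a few of the word counts from above (the variable names are assumptions): `sorted` on a dictionary sorts its keys, while sorting `items()` with a `key` function lets us order by value instead.

```python
counts = {'it': 10, 'was': 11, 'the': 14, 'of': 12, 'we': 4}

# Sorting a dictionary directly sorts its keys
by_word = sorted(counts)
print(by_word)

# Sort the (word, count) pairs by count, largest first
by_count = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(by_count)
```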

## Plotting (sort of)
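Without a plotting library, a rough text "bar chart" can be made by repeating a character once per count. This is only a sketch, using a few of the word counts from above:

```python
counts = {'it': 10, 'was': 11, 'the': 14, 'of': 12, 'we': 4}

lines = []
for word, count in sorted(counts.items(), key=lambda item: item[1], reverse=True):
    bar = '#' * count  # one '#' per occurrence
    lines.append(f'{word:>4} {bar} ({count})')

print('\n'.join(lines))
```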

## Could we count words instead?

## Scrabble

In [91]:
TILE_SCORES = {
    'A': 1, 'B': 3, 'C': 3, 'D': 2, 'E': 1,
    'F': 4, 'G': 2, 'H': 4, 'I': 1, 'J': 8,
    'K': 5, 'L': 1, 'M': 3, 'N': 1, 'O': 1,
    'P': 3, 'Q':10, 'R': 1, 'S': 1, 'T': 1,
    'U': 1, 'V': 4, 'W': 4, 'X': 8, 'Y': 4,
    'Z': 10
}

Write a function `score_word(word)` that takes `word` (e.g. "DOCTOR") and returns the score, where each letter's score is the value in `TILE_SCORES` (in this case 2+1+3+1+1+1 = 9 is the score for DOCTOR).

Bonus: you should make it case-insensitive, e.g. both `score_word("DOCTOR")` and `score_word("doctor")` return 9

In [None]:
def score_word(word):
    pass

In [None]:
score_word("DOCTOR")

In [None]:
score_word("doctor")

## Loading scrabble words

Let's load up a word list of (all?) Scrabble words from a file. Note that this is a Python list of words, not a Python dictionary.

In [92]:
with open('words.txt') as f:
    WORDS = [w.strip() for w in f.readlines()]
WORDS[:10]

['AA',
 'AAH',
 'AAHED',
 'AAHING',
 'AAHS',
 'AAL',
 'AALII',
 'AALIIS',
 'AALS',
 'AARDVARK']

These words are all in capitals. 

Question: What are the 10 highest scoring scrabble words?

## ScrabbleBot

Write a function `score_tiles(tiles)` that takes a list of tiles and returns the score _and_ a highest-scoring word you can make from those tiles. The word must be a valid word that appears in the list `WORDS`.

For example

```python
>>> score_tiles(['C', 'A', 'T'])
[5, 'CAT'] # might also return [5, 'ACT'] as a valid word
>>> score_tiles(['Q', 'Z', 'A', 'T'])
[12, 'QAT']  # Apparently QAT is a word, but we don't have words with
             # Q, Z and only A/T in them, so we cannot use all tiles.
```

Hint: 
- Any anagrams have the same tiles, so they will have the same score
- You might want a function `can_make_from_tiles(word, tiles)` to help you
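To illustrate the first hint, here is a small sketch showing that anagrams share the same sorted letters. The helper `letter_signature` is a hypothetical name, not something defined elsewhere in these notes, and this is not a full solution to the exercise.

```python
def letter_signature(word):
    """Return the letters of word in sorted order, as a string."""
    return ''.join(sorted(word))

print(letter_signature('CAT'))
print(letter_signature('CAT') == letter_signature('ACT'))
```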

In [None]:
# A potentially helpful dictionary (keys are words, values are scores)
# Use of a dictionary comprehension
SCORES = {w: score_word(w) for w in WORDS}
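If dictionary comprehensions are new to you, this sketch shows one next to the equivalent `for` loop. It uses word lengths instead of scores so that it runs on its own; `sample_words` is a made-up three-word list.

```python
sample_words = ['AA', 'AAH', 'AAHED']

# Dictionary comprehension: {key_expression: value_expression for item in iterable}
lengths = {w: len(w) for w in sample_words}

# The equivalent loop
lengths_loop = {}
for w in sample_words:
    lengths_loop[w] = len(w)

print(lengths)
print(lengths == lengths_loop)
```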

In [None]:
def can_make_from_tiles(word, tiles):
    pass

In [None]:
def score_tiles(tiles):
    pass

In [None]:
score_tiles(['Q', 'Z', 'A', 'T'])

## Telephone words

We can convert words into numbers using the telephone keypad:

![Telephone keypad](keypad.png)

For example, `1-888-WAIT-WAI` (the number for NPR's "Wait, Wait, Don't tell me") can be decoded as `1-888-924-8924`.

1. Is it possible, given a string like `1-888-WAIT-WAI`, to return a _unique_ number like `18889248924`?
2. Is it possible, given a number like `18889248924`, to return a _unique_ string like `1888WAITWAI`?


In [None]:
NUM_TO_LETTERS = {
    0: '',
    1: '',
    2: 'ABC',
    3: 'DEF',
    4: 'GHI',
    5: 'JKL',
    6: 'MNO',
    7: 'PQRS',
    8: 'TUV',
    9: 'WXYZ'
}

NUM_TO_LETTERS_LIST = {num: list(value) for num, value in NUM_TO_LETTERS.items()}

In [None]:
NUM_TO_LETTERS_LIST

In [None]:
# Can you make a dictionary that has a letter as a key, and a number as a value?

## Exercises

1. Write a function `decode_to_number(alpha_number)` that returns the number after replacing the letters with numbers (e.g. `decode_to_number("1-888-WAIT-WAI")` would return `18889248924`).

2. Write a function `get_all_alpha_numbers(phone_number)` that returns a list of all "alphanumbers" you can make from `phone_number`. The letters should correspond to actual words in `WORDS`. For example  
  - `get_all_alpha_numbers(18889248924)` would return a list containing `'1888WAIT924'`, but not `1888WAITWAI` because `WAI` isn't a word. 
  - `get_all_alpha_numbers(188892489248)` should return `1888WAITWAIT` as one of the words.