## Dictionaries

## Brief summary of dictionaries

- Make them with `{key1: value1, key2: value2, .... }`
- Keys must be immutable (strictly speaking, hashable). In practice, use strings, numbers, or tuples as your keys.
- Keys cannot repeat: assigning to an existing key overwrites its value.
- Values can repeat
- The `in` keyword tests whether a key is in the dictionary or not.
- We can mutate a dictionary. 
  - To add `new_key` to a dictionary `d`, we can write `d[new_key] = .....`
  - To remove `old_key` from a dictionary `d`, we can write `del d[old_key]`
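
As a quick check of the rules above, here is a minimal sketch using a small made-up dictionary `d` (the fruit names and numbers are purely illustrative):

```python
# A small, made-up dictionary to illustrate the rules above
d = {'apple': 3, 'pear': 5}

d['apple'] = 10        # same key: overwrites the old value 3
d['kiwi'] = 5          # new key: added (the value 5 repeats, which is fine)
del d['pear']          # removes the key 'pear' and its value

print('apple' in d)    # True: 'apple' is a key
print('pear' in d)     # False: we just deleted it
print(d)               # {'apple': 10, 'kiwi': 5}
```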

## Example

In [None]:
states_and_capitals = {
    'California': 'Sacramento',
    'Oregon': 'Salem',
    'Washington': 'Olympia',
    'Maine': 'Augusta',
    'Texas': 'Austin',
    'New York': 'Albany',
    'Illinois': 'Springfield'
}

# look up a state's capital
print(states_and_capitals['California'])

In [None]:
# Mutable: easy to add keys
states_and_capitals['Florida'] = 'Tallahassee'

In [None]:
# Can check if a key is in the dictionary using "in", but not a value
print('California' in states_and_capitals)
print('Sacramento' in states_and_capitals)
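
If we do want to test for a value, one option is to search `dict.values()` explicitly. The sketch below uses a shortened, stand-in dictionary called `capitals` rather than the full one above:

```python
capitals = {'California': 'Sacramento', 'Oregon': 'Salem'}  # small stand-in example

print('Sacramento' in capitals)            # False: "in" looks at keys only
print('Sacramento' in capitals.values())   # True: search the values explicitly
```

Searching the values has to look through them one by one, so it is slower than a key lookup on a large dictionary.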

In [None]:
# Looking up the value for a key can be done two ways:
#  - using dict[key], gives an error if key is not in dictionary
#  - using dict.get(key, value_if_missing)
states_and_capitals['Sacramento']

In [None]:
states_and_capitals.get('Sacramento', 'not a state I know about')

### Counters

Given `text`, count how many times each letter appears in text. This is the type of problem that a dictionary `counter` is great for, because:
- `counter[letter]` keeps the count of number of times we have seen letter
- If we see a new letter, we can use mutability of dictionaries to add a new key. 
- Every time we see `letter`, we should add one to `counter[letter]`

We could decide to look _only_ at letters, but we can also count all characters (e.g. spaces, new lines, full stops, tabs, ....) just as easily. See the example below.

In [None]:
# In Dickens's day, authors were paid by the word
#
text = """
It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, 
it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of 
Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing 
before us, we were all going direct to Heaven, we were all going direct the other way—in short, the period was 
so far like the present period, that some of its noisiest authorities insisted on its being received, for good 
or for evil, in the superlative degree of comparison only.
"""

In [None]:
counter = {}

for letter in text.lower():
    counter[letter] = counter.get(letter, 0) + 1
    
counter
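
The standard library also offers `collections.Counter`, which packages this counting pattern. This is an alternative to the loop above, not a replacement for understanding it. The sketch uses a short stand-in string called `sample` instead of the full `text`:

```python
from collections import Counter

sample = "it was the best of times"   # short stand-in for text
letter_counts = Counter(sample.lower())

print(letter_counts['t'])   # 4
print(letter_counts['z'])   # 0: a missing key counts as zero instead of raising KeyError
```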

## Sorting -- new

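One thing dictionaries don't do well is get sorted, because they are built for fast lookup by key rather than for ordering (in Python 3.7+ they remember insertion order, but that is not sorted order). How would we find the most popular letter?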

In [None]:
# attempt 1
sorted(counter)

Note this is just the keys in "alphabetical" (technically _lexicographic_) order. We really want the letters sorted by _value_ instead. This is a little awkward. Let's do it the long way first:

In [None]:
my_list = []
for key in counter:
    my_list.append([counter[key], key])
print(my_list)

This is now a list of lists. The entry `my_list[i]` is a list with two things:
- the value (count)
- the key (the letter)

When we sort a list of lists, we order by the first element. In the case of a tie, we sort by the second element. If both of those tie, we move onto the third (if it exists), etc.
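
To see the tie-breaking rule in isolation, here is a small sketch with made-up `[count, letter]` pairs (the name `pairs` is only for illustration):

```python
pairs = [[3, 'b'], [1, 'z'], [3, 'a'], [2, 'c']]   # made-up [count, letter] pairs

# The two entries with count 3 tie on the first element,
# so they are ordered by the second element: 'a' before 'b'.
print(sorted(pairs))
# [[1, 'z'], [2, 'c'], [3, 'a'], [3, 'b']]
```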

In [None]:
sorted(my_list)

Now we see that we have ordered the "characters" from least popular to most popular.

### Using list comprehension

In [None]:
my_list = [  [counter[key], key] for key in counter ]
sorted(my_list)

Getting the `(key, value)` pairs is so common that dictionaries provide `dict.items()` for it. Note that each pair comes out as `(key, value)`, the reverse of the `[value, key]` order we built above:

In [None]:
list(counter.items())

`sorted` also takes a keyword argument, `reverse`, which lets us reverse the order. We can also just reverse the sorted list with a slice.

In [None]:
# This is the ordinary list way
sorted(my_list)[::-1]

In [None]:
# This is the "fancy" argument reverse
sorted(my_list, reverse=True)
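
`sorted` has another keyword argument, `key`, which is a different technique from the `[value, key]` lists used above: it takes a function that says what to sort by. The sketch below sorts `(letter, count)` pairs from `dict.items()` by count. The dictionary `counts` and the helper `get_count` are made up for illustration:

```python
counts = {'a': 3, 'b': 1, 'c': 2}   # made-up counts standing in for counter

def get_count(pair):
    """Return the count from a (letter, count) pair."""
    return pair[1]

by_count = sorted(counts.items(), key=get_count, reverse=True)
print(by_count)   # [('a', 3), ('c', 2), ('b', 1)]
```

With this approach the pairs stay in `(key, value)` order, so there is no need to build a separate list with the count first.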

## Plotting (sort of)

Before learning all the plotting functionality, let's look at an "ASCII" plot.

Idea: take `[10, 'p']` and write the line as the letter, a delimiter, and 10 `'-'`s.
```
p         |----------
```
Then we can write out each line to give a simple histogram

In [None]:
for count_and_letter in sorted(my_list, reverse=True):
    count = count_and_letter[0]
    letter = count_and_letter[1]
    print(f'{letter:10s}|{"-"*count}')

#### Unpacking trick

We have `count_and_letter` set to a list (e.g. `[10, 'p']`). We can get the two entries by index
```python
count_and_letter = [10, 'p']  # set an example
count = count_and_letter[0]   # get the 10
letter = count_and_letter[1]  # get the 'p'
```

We can also "unpack" the variables in one step
```python
count_and_letter = [10, 'p']  # set an example
count, letter = count_and_letter # sets count to count_and_letter[0], letter to count_and_letter[1]
```

In [None]:
# The same result
for count_and_letter in sorted(my_list, reverse=True):
    count, letter = count_and_letter
    print(f'{letter:10s}|{"-"*count}')

Even slicker: we can "unpack" directly in the for loop, bypassing `count_and_letter` altogether:

In [None]:
# the same result
for count, letter in sorted(my_list, reverse=True):
    print(f'{letter:10s}|{"-"*count}')

###  Exercise

Change the loop above so we only print out the line if `letter` is actually a letter (not punctuation, etc). 

**Hint:** `letter.isalpha()` is useful here. 

## Scrabble

In [None]:
TILE_SCORES = {
    'A': 1, 'B': 3, 'C': 3, 'D': 2, 'E': 1,
    'F': 4, 'G': 2, 'H': 4, 'I': 1, 'J': 8,
    'K': 5, 'L': 1, 'M': 3, 'N': 1, 'O': 1,
    'P': 3, 'Q':10, 'R': 1, 'S': 1, 'T': 1,
    'U': 1, 'V': 4, 'W': 4, 'X': 8, 'Y': 4,
    'Z': 10
}

Write a function `score_word(word)` that takes `word` (e.g. "DOCTOR") and returns the score, where each letter's score is the value in `TILE_SCORES` (in this case 2+1+3+1+1+1 = 9 is the score for DOCTOR).

Bonus: you should make it case-insensitive, e.g. both `score_word("DOCTOR")` and `score_word("doctor")` return 9

In [None]:
def score_word(word):
    """Gives the score for word, using the scores in TILE_SCORES"""
    scores = [TILE_SCORES[letter] for letter in word.upper()]
    return sum(scores)

In [None]:
score_word("DOCTOR")

In [None]:
score_word("doctor")
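
Note that `score_word` raises a `KeyError` if the word contains a character with no tile, such as a hyphen. One possible variation, sketched below under the assumption that such characters should simply score 0, uses `dict.get` with a default. The name `score_word_safe` is hypothetical, and `TILE_SCORES` is repeated so the block runs on its own:

```python
TILE_SCORES = {
    'A': 1, 'B': 3, 'C': 3, 'D': 2, 'E': 1,
    'F': 4, 'G': 2, 'H': 4, 'I': 1, 'J': 8,
    'K': 5, 'L': 1, 'M': 3, 'N': 1, 'O': 1,
    'P': 3, 'Q': 10, 'R': 1, 'S': 1, 'T': 1,
    'U': 1, 'V': 4, 'W': 4, 'X': 8, 'Y': 4,
    'Z': 10
}

def score_word_safe(word):
    """Like score_word, but characters without a tile score 0 (an assumption)."""
    return sum(TILE_SCORES.get(letter, 0) for letter in word.upper())

print(score_word_safe("doctor"))    # 9
print(score_word_safe("doc-tor"))   # 9: the hyphen scores 0
```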

## Loading scrabble words

Let's load up a dictionary of (all?) scrabble words from a file.

In [None]:
with open('words.txt') as f:
    WORDS = [w.strip() for w in f.readlines()]
WORDS[:10]

These words are all in capitals. 

Question: What are the 10 highest scoring scrabble words?

## Answer

This is actually a little tricky. It isn't too hard to get the scores of each word:

In [None]:
scores = [score_word(w) for w in WORDS]
scores

Getting the top 10 scores isn't too bad:

In [None]:
sorted(scores, reverse=True)[:10]

So _one_ way of doing this is to look for all words with a score of 37 or higher

In [None]:
[w for w in WORDS if score_word(w) >= 37]

Note that we have lost the score. A better way of doing this is to make a list where the elements are `[score, word]`. We put the score first because we want to sort by score _first_.

In [None]:
## Better solution
scores_and_words = [  [score_word(w), w] for w in WORDS ]
sorted(scores_and_words, reverse=True)[:10]

We can even write this out nicely

In [None]:
scores_and_words = [  [score_word(w), w] for w in WORDS ]
ordered_scores_and_words = sorted(scores_and_words, reverse=True)[:10]

for score, word in ordered_scores_and_words:
    print(f'The word "{word}" has a score {score} in scrabble')

We have dictionary comprehensions as well.

In [None]:
# A potentially helpful dictionary (keys are words, values are scores)
# Use of a dictionary comprehension
SCORES = {w: score_word(w) for w in WORDS}

In [None]:
SCORES
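
Once the scores live in a dictionary, one way to rank the words is to sort the keys using the dictionary's own `get` method as the `key` function. The sketch below uses a tiny hand-picked dictionary, `sample_scores`, as a stand-in for `SCORES`:

```python
# Tiny stand-in for SCORES (values follow the tile scores above)
sample_scores = {'CAT': 5, 'QUIZ': 22, 'DOG': 5, 'JAZZ': 29}

# Sort the words (keys) by their score, highest first, and keep the top 2
top_words = sorted(sample_scores, key=sample_scores.get, reverse=True)[:2]

for word in top_words:
    print(f'The word "{word}" has a score {sample_scores[word]} in scrabble')
```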

## Telephone words

We can convert words into numbers using the telephone keypad:

![Telephone keypad](keypad.png)

For example, `1-888-WAIT-WAI` (the number for NPR's "Wait Wait... Don't Tell Me!") can be decoded as `1-888-924-8924`.

1. Is it possible, given a string like `1-888-WAIT-WAI` to return a _unique_ number like `18889248924`?
2. Is it possible, given a number like `18889248924` to return a _unique_ number/string like `1888WAITWAI`?


In [None]:
NUM_TO_LETTERS = {
    0: '',
    1: '',
    2: 'ABC',
    3: 'DEF',
    4: 'GHI',
    5: 'JKL',
    6: 'MNO',
    7: 'PQRS',
    8: 'TUV',
    9: 'WXYZ'
}

NUM_TO_LETTERS_LIST = {num: list(value) for num, value in NUM_TO_LETTERS.items()}

Can we go from letters to numbers? We could write the mapping out manually (`'A' --> 2`, `'B' --> 2`, ..., `'Z' --> 9`), which is the more natural direction for decoding, but it is less error-prone to construct it from the `NUM_TO_LETTERS` dictionary above.

In [None]:
LETTERS_TO_NUM = {}
for number in NUM_TO_LETTERS:
    for letter in NUM_TO_LETTERS[number]:
        LETTERS_TO_NUM[letter] = number
LETTERS_TO_NUM
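
The same mapping can also be built with a dictionary comprehension that contains two `for` clauses. This is only a sketch of an equivalent construction. The lowercase name `letters_to_num` is used so it does not overwrite the version above, and `NUM_TO_LETTERS` is repeated so the block runs on its own:

```python
NUM_TO_LETTERS = {
    0: '', 1: '', 2: 'ABC', 3: 'DEF', 4: 'GHI',
    5: 'JKL', 6: 'MNO', 7: 'PQRS', 8: 'TUV', 9: 'WXYZ'
}

# Outer loop over (number, letters) pairs, inner loop over each letter
letters_to_num = {
    letter: number
    for number, letters in NUM_TO_LETTERS.items()
    for letter in letters
}

print(letters_to_num['W'], letters_to_num['A'])   # 9 2
```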

One of the exercises was to write `decode_to_number(alpha_number)`.

For example, `decode_to_number('1-888-WAIT-WAI')` would return `18889248924`.

In [None]:
def decode_to_number(alpha_number):
    """Convert a string such as '1-888-WAIT-WAI' to the integer it dials."""
    digits_only = []
    for character in alpha_number:
        character = character.upper()
        if character.isdigit():
            # if it is a digit, pass through
            digits_only.append(character)
        elif character in LETTERS_TO_NUM:
            # if it is a letter, look up its keypad digit
            number = LETTERS_TO_NUM[character]
            digits_only.append(str(number))
        # anything else (e.g. '-') is skipped
    print(digits_only)  # show the intermediate list of digit strings
    joined_digits = ''.join(digits_only)
    print(joined_digits)
    return int(joined_digits)

In [None]:
decode_to_number('1-888-WAIT-WAI')

## Exercises

The following exercises are being "pushed"

- ScrabbleBot
- Telephone words part 2
