# Dictionaries

In this notebook we will: 
- Learn how to work with `Dictionaries`

## Dictionaries

### Motivating example: Shortcomings of lists

We have already seen `list` objects as a way of storing data. For a `list`, we use the _index_ to look up the value. Consider a list of countries and capitals:

In [1]:
countries_and_capitals = [
    ['United States of America', 'Washington D.C.'],
    ['Argentina', 'Buenos Aires'],
    ['France', 'Paris'],
    ['India', 'New Delhi'],
]

In [2]:
countries_and_capitals[2] # "this returns the third element of the list: ['France', 'Paris']"

['France', 'Paris']

In [3]:
countries_and_capitals[2][1]

'Paris'

Suppose we wanted to look up the capital of Argentina. First we need to iterate through the list, where each element is itself a list. Once we find 'Argentina', we can read off its capital. **A list cannot be indexed directly by a string such as the country name**:

In [5]:
country_to_find = 'Argentina'

for country_and_capital in countries_and_capitals: # "for each element (sublist) in the list"
    country = country_and_capital[0] # "set country to be the first element (in the sublist)"
    if country == country_to_find: # "if the first element in the sublist is Argentina"
        capital = country_and_capital[1] # "store the second element in the sublist"

print(capital)

Buenos Aires


Lists are not convenient for looking up things without an index. We _can_ do it by making "lists of lists", but it isn't efficient AND it makes us write a lot of code that obscures what we are doing.

### How a dictionary solves this problem

A dictionary allows us to use a `key` to look up a `value`. Instead of looking a value up by _index_, we look it up by _key_. The idea is similar to a printed dictionary, where you use the word (the `key`) to look up the meaning (the `value`).

Let's try to make this clearer with the countries/capitals example. __Note__ that dictionaries use curly brackets `{...}`.

In [6]:
# syntax is 
# { key1: value1, key2: value2, ...... }

countries_and_capitals = {
    'United States of America': 'Washington D.C.',  
    'Argentina': 'Buenos Aires',           
    'France': 'Paris', # France is the key, and Paris is the value
    'India': 'New Delhi',
}

We can use the `keys` to look things up with the square brackets `[...]`

In [7]:
countries_and_capitals['India']

'New Delhi'

We **cannot** use the **values** to look things up (this raises a `KeyError`):

In [None]:
countries_and_capitals['New Delhi']
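
As a rough sketch of what the failing lookup above looks like in practice, the snippet below catches the `KeyError` and then searches for the key that holds a given value. The name `capitals` and the search loop are illustrative assumptions, not part of the original notebook.

```python
capitals = {
    'United States of America': 'Washington D.C.',
    'Argentina': 'Buenos Aires',
    'France': 'Paris',
    'India': 'New Delhi',
}

try:
    capitals['New Delhi']  # 'New Delhi' is a value, not a key
except KeyError as err:
    missing_key = err.args[0]  # the key that could not be found
    print('No such key:', missing_key)

# Going from a value back to its key(s) means checking every entry
matches = []
for country in capitals:
    if capitals[country] == 'New Delhi':
        matches.append(country)
print(matches)
```

The reverse search has to visit every entry, which is exactly the kind of loop the dictionary lets us avoid when we look up by key.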

We can add new `keys` easily:

In [8]:
countries_and_capitals['Botswana'] = 'Gaborone' # "adding to dictionary"
countries_and_capitals

{'United States of America': 'Washington D.C.',
 'Argentina': 'Buenos Aires',
 'France': 'Paris',
 'India': 'New Delhi',
 'Botswana': 'Gaborone'}

However, `keys` have to be unique. If we assign to an existing key, we lose the previous value:

In [9]:
print(countries_and_capitals['Argentina'])

Buenos Aires


In [13]:
countries_and_capitals['Argentina'] = 'Paris' # "this will replace the value to be Paris"
print(countries_and_capitals['Argentina'])

Paris


The `values` do not need to be unique. Now two different `keys` have the value 'Paris'.

In [11]:
countries_and_capitals

{'United States of America': 'Washington D.C.',
 'Argentina': 'Paris',
 'France': 'Paris',
 'India': 'New Delhi',
 'Botswana': 'Gaborone'}

We can use the `in` operator to check if a key is in a dictionary. __Note__ it only works on keys!

In [12]:
# Note that Botswana is a key
'Botswana' in countries_and_capitals # "in used only for keys"

True

In [14]:
# Paris isn't a key, so it is not found
'Paris' in countries_and_capitals

False

In [15]:
'Fiji' in countries_and_capitals

False
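
If we do want to test for a value rather than a key, one option is to apply `in` to the dictionary's `.values()`. A small sketch, using a shortened stand-in dictionary named `capitals` (an assumption for illustration):

```python
capitals = {'France': 'Paris', 'India': 'New Delhi'}

key_found = 'Paris' in capitals             # checks the keys only
value_found = 'Paris' in capitals.values()  # checks the values instead

print(key_found, value_found)
```

Note that the value check scans the values one by one, so it does not get the fast lookup that key checks do.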

We cannot access dictionaries by index, only by `keys`:

In [16]:
# This will give an error, unless there is a key equal to 0
countries_and_capitals[0]

KeyError: 0

Dictionaries are not designed to be accessed by position. Since Python 3.7 they do preserve insertion order, so a `for` loop visits the keys in the order they were added, but code that depends on an item being at a particular position is usually better served by a list:

In [17]:
for country in countries_and_capitals:
    print(country) # "print the keys only"

United States of America
Argentina
France
India
Botswana


### Dictionary methods

Dictionaries have several methods. In a notebook you can list them by typing the name of the dictionary followed by a `.` and pressing the `TAB` key, or by calling `dir()` on the dictionary. Here are a few examples:

In [18]:
dir(countries_and_capitals)

['__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

In [19]:
countries_and_capitals.keys() # displays all the keys

dict_keys(['United States of America', 'Argentina', 'France', 'India', 'Botswana'])

In [20]:
countries_and_capitals.values() # displays all the values

dict_values(['Washington D.C.', 'Paris', 'Paris', 'New Delhi', 'Gaborone'])

In [26]:
countries_and_capitals.get('Fiji', 'unknown') # "does not error if the key is not found"


'unknown'

In [27]:
countries_and_capitals.items() # "returns the key/value pairs"

dict_items([('United States of America', 'Washington D.C.'), ('Argentina', 'Paris'), ('France', 'Paris'), ('India', 'New Delhi'), ('Botswana', 'Gaborone')])
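
A common use of `.items()` is to loop over keys and values together. The sketch below assumes a shortened stand-in dictionary named `capitals`; the sentence format is made up for illustration.

```python
capitals = {'France': 'Paris', 'India': 'New Delhi', 'Botswana': 'Gaborone'}

sentences = []
# Each item is a (key, value) tuple, unpacked here into two names
for country, capital in capitals.items():
    sentences.append(f'The capital of {country} is {capital}.')

print('\n'.join(sentences))
```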

### Brief summary of dictionaries

- Created with `{key1: value1, key2: value2, .... }`
- Keys must be hashable, which in practice means immutable. Use strings, numbers, or tuples (of hashable items) as your keys.
- Keys cannot repeat; assigning to the same key will overwrite the existing value
- Values can repeat
- The `in` operator tests whether a key is in the dictionary or not.
- We can mutate a dictionary. 
  - To add `new_key` to a dictionary `d`, we can write `d[new_key] = .....`
  - To remove `old_key` from a dictionary `d`, we can write `del d[old_key]`
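
The points above can be sketched in a few lines. The dictionary `d` and its contents are made up for illustration; `pop` is shown as an alternative to `del` that also returns the removed value.

```python
d = {'a': 1, 'b': 2, 'c': 3}

d['d'] = 4                     # add a new key
del d['a']                     # remove a key (KeyError if it is missing)
removed = d.pop('b')           # remove a key and get its value back
fallback = d.pop('zzz', None)  # a default avoids the KeyError

d[(1, 2)] = 'tuple key works'  # tuples of hashable items are valid keys

try:
    d[[1, 2]] = 'list key fails'  # lists are mutable, so not hashable
except TypeError as err:
    list_key_error = str(err)

print(d)
print(list_key_error)
```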

### Test:

We have a menu with the following items on it:

| Name | Price |
| --- | --- |
| Small fries | 1.00 |
| Hamburger | 1.00 |
| Small drink | 1.00 |
| Medium drink | 1.00 |
| Large drink | 1.00 |
| Medium fries | 1.45 |
| Large fries | 2.00 |
| Cheeseburger | 2.50 |

1. Would we be able to make a dictionary `name_to_price` where the keys are names and the values are the price?
2. Would we be able to make a dictionary `price_to_name` where the keys are prices and the values are the name?

In [29]:
price_to_name = {
    1.00: ['Small fries', 'Hamburger', 'Small drink', 'Medium drink', 'Large drink'],
    1.45: ['Medium fries'],
    2.00: ['Large fries'],
    2.50: ['Cheeseburger'],
}

price_to_name

{1.0: ['Small fries', 'Hamburger', 'Small drink', 'Medium drink', 'Large drink'],
 1.45: ['Medium fries'],
 2.0: ['Large fries'],
 2.5: ['Cheeseburger']}
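
For the first question, names are unique, so a plain name-to-price dictionary works directly. For the second, prices repeat, so each price has to map to a list of names. One possible sketch builds the second dictionary from the first; the name `price_to_names` and the use of `setdefault` are choices made here for illustration.

```python
name_to_price = {
    'Small fries': 1.00,
    'Hamburger': 1.00,
    'Small drink': 1.00,
    'Medium drink': 1.00,
    'Large drink': 1.00,
    'Medium fries': 1.45,
    'Large fries': 2.00,
    'Cheeseburger': 2.50,
}

price_to_names = {}
for name, price in name_to_price.items():
    # setdefault returns the existing list for this price,
    # or stores and returns a new empty list the first time
    price_to_names.setdefault(price, []).append(name)

print(price_to_names)
```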

## Examples of using a dictionary:

Dictionaries are quick to add keys, and quick to find keys (they use a trick called _hashing_). Here are a few examples where `dictionaries` are useful. 

1. **Phone book:** e.g. Key: name, value: phone number
2. **Counters:** e.g. key: thing to be counted, value: number of occurrences of the thing to be counted
3. **More readable data structures**: We can get away with storing information in lists such as `[name, age, salary]`, but then we have to remember the order. A dictionary's keys can make it easier for the next person to read.
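
To illustrate the third point, here is the same record stored both ways. The employee data is invented purely for the example.

```python
# As a list, we must remember that position 2 is the salary
employee_as_list = ['Ada', 36, 85000]
salary_from_list = employee_as_list[2]

# As a dictionary, the key documents what the value means
employee_as_dict = {'name': 'Ada', 'age': 36, 'salary': 85000}
salary_from_dict = employee_as_dict['salary']

print(salary_from_list, salary_from_dict)
```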


## Exercise:

Write a function that, given a string of digits, returns a dictionary that counts how many times each digit appears in the text. The `keys` are the digits and the `values` are the counts of how many times the digit occurs.

In [None]:
pi_string = '3.141592653589793'

### Write code here




In [32]:
### ANSWER

pi_string = '3.141592653589793'

def count_digits(text):
    digit_counter = {} # "create empty dictionary"
    for digit in text: # "for each character in text"
        if digit.isnumeric(): # "skip non-digit characters such as '.'"
            if digit not in digit_counter: # "first time we see this digit: start its count at 0"
                digit_counter[digit] = 0 
            digit_counter[digit] +=1 # "add 1 to the count for this digit"
    return digit_counter

In [31]:
digit_counter = count_digits(pi_string)
digit_counter

{'3': 3, '1': 2, '4': 1, '5': 3, '9': 3, '2': 1, '6': 1, '8': 1, '7': 1}

In [33]:
### ANOTHER ANSWER

pi_string = '3.141592653589793'

def count_digits2(text):
    digit_counter = {}
    for digit in text:
        if digit.isnumeric() and digit not in digit_counter: # "both checks in one line"
            digit_counter[digit] = text.count(digit) # "use the count method to count how many times the digit occurs in text"
    return digit_counter

In [34]:
digit_counter = count_digits2(pi_string)
digit_counter

{'3': 3, '1': 2, '4': 1, '5': 3, '9': 3, '2': 1, '6': 1, '8': 1, '7': 1}
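
Counting is common enough that the standard library provides `collections.Counter`, a dictionary subclass built for it. The sketch below is an alternative to the hand-written answers, not a replacement for the exercise; the names `digit_counts` and `manual_counts` are chosen here for illustration. It also shows the `.get` method used for counting without an explicit `if`.

```python
from collections import Counter

pi_string = '3.141592653589793'

# Counter tallies every item produced by the generator
digit_counts = Counter(ch for ch in pi_string if ch.isnumeric())

# The same idea with a plain dictionary and .get (0 if the key is new)
manual_counts = {}
for ch in pi_string:
    if ch.isnumeric():
        manual_counts[ch] = manual_counts.get(ch, 0) + 1

print(digit_counts)
print(manual_counts)
```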

If we want to figure out which digit has the **highest count**, we could **create a list** of lists with the first element being the `value` (the count) and the second element being the `key` (the digit). Finally, we sort this list as shown below. 

In [35]:
# we create a list of lists
new_list = []
for key, value in digit_counter.items(): # "gives the key/value pair, notice here for key,value"
    new_list.append([value,key])
new_list

[[3, '3'],
 [2, '1'],
 [1, '4'],
 [3, '5'],
 [3, '9'],
 [1, '2'],
 [1, '6'],
 [1, '8'],
 [1, '7']]

In [36]:
# we sort from high to low
sorted(new_list, reverse=True) # "sorts by the first element, using the second element to break ties"

[[3, '9'],
 [3, '5'],
 [3, '3'],
 [2, '1'],
 [1, '8'],
 [1, '7'],
 [1, '6'],
 [1, '4'],
 [1, '2']]
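
An alternative that avoids building the intermediate list is to pass a `key` function to `max` or `sorted`. This is a sketch of that approach, with the counts copied from the output above; the names `most_common_digit` and `sorted_pairs` are illustrative.

```python
digit_counter = {'3': 3, '1': 2, '4': 1, '5': 3, '9': 3, '2': 1, '6': 1, '8': 1, '7': 1}

# max iterates over the keys and compares them by their counts;
# when several keys tie, the first one encountered is returned
most_common_digit = max(digit_counter, key=digit_counter.get)

# Sort the (digit, count) pairs by count, highest first
sorted_pairs = sorted(digit_counter.items(), key=lambda pair: pair[1], reverse=True)

print(most_common_digit)
print(sorted_pairs)
```

Unlike the list-of-lists approach, ties here keep their original dictionary order, because the sort compares only the counts.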

### Nested Dictionaries

We can also have `dictionaries` within `dictionaries`.

In [37]:
my_dict = {'Clark': {'age': 20, 'weight': 170},
          'Bruce': {'age': 25, 'height': 6}}

my_dict

{'Clark': {'age': 20, 'weight': 170}, 'Bruce': {'age': 25, 'height': 6}}

In [39]:
my_dict['Bruce']['age'] # "subsetting in a nested dictionary"

25
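
In the example above the inner dictionaries do not share the same keys ('Clark' has no 'height'), so chained square brackets can raise a `KeyError`. A minimal sketch of one way to handle this, using `.get` on the inner dictionary; the variable names are illustrative.

```python
my_dict = {'Clark': {'age': 20, 'weight': 170},
           'Bruce': {'age': 25, 'height': 6}}

# .get returns None (or a chosen default) when the inner key is missing
clark_height = my_dict['Clark'].get('height')
bruce_height = my_dict['Bruce'].get('height')

# Loop over the outer dictionary and read one field from each inner one
ages = {}
for name, info in my_dict.items():
    ages[name] = info['age']

print(clark_height, bruce_height, ages)
```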