# Dictionaries

## Motivation

Suppose we need to represent years and the total North American fossil fuel CO2 emissions for those years. How should we do this?

One option is to use *parallel lists*, in which the `years` list at position `i` corresponds to the `emissions` list at position `i`:

In [None]:
years = [1799, 1800, 1801, 1802, 1902, 2002]
emissions = [1, 70, 74, 79, 82, 1733297]

This is fine, but how would you add new data (e.g., year 1950 and emissions 734914)?

We would need to modify both lists at the same time. We could append the data at the end:

In [None]:
years.append(1950)
emissions.append(734914)
print(years)
print(emissions)

[1799, 1800, 1801, 1802, 1902, 2002, 1950]
[1, 70, 74, 79, 82, 1733297, 734914]


However, the data points would then no longer be sorted by year, and we would have to think of a clever way of getting everything back in order.
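
One possible way to restore the order is sketched below, assuming both lists have the same length: pair the items up with `zip()`, sort the pairs by year, and split them back into two lists. The names `pairs`, `sorted_years`, and `sorted_emissions` are only illustrative.

```python
years = [1799, 1800, 1801, 1802, 1902, 2002, 1950]
emissions = [1, 70, 74, 79, 82, 1733297, 734914]

# Pair each year with its emissions value, then sort the pairs by year
pairs = sorted(zip(years, emissions))

# Split the sorted pairs back into two parallel lists
sorted_years = [year for year, _ in pairs]
sorted_emissions = [value for _, value in pairs]

print(sorted_years)
print(sorted_emissions)
```

This works, but it is a lot of bookkeeping for something as simple as adding one data point.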

And how would we edit the emissions value for a particular year?

We would need to find the year in the `years` list and then modify the corresponding item in the emissions list:

In [None]:
target_index = years.index(1800)
emissions[target_index] = 500
print(years)
print(emissions)

[1799, 1800, 1801, 1802, 1902, 2002, 1950]
[1, 500, 74, 79, 82, 1733297, 734914]


A second option is to use *a list of lists*:



In [None]:
years_emissions = [[1799, 1], [1800, 70], [1801, 74], [1802, 79], [1902, 82], [2002, 1733297]]

This keeps the years and the values paired together, but it still takes a bit of work to add or modify values since you have to search through the list manually:

In [None]:
# Search for the pair whose year is 1800 and update its emissions value
for i in range(len(years_emissions)):
  if years_emissions[i][0] == 1800:
    years_emissions[i][1] = 900
print(years_emissions)

[[1799, 1], [1800, 900], [1801, 74], [1802, 79], [1902, 82], [2002, 1733297]]


## What Is a Dictionary?

A **dictionary** is a type of object that keeps track of associations for you. In Python, it is represented by the type `dict`. A dictionary has this general form:

    d = {key1: value1, key2: value2, key3: value3, ...}

The dictionary consists of the following expressions:

* `keys`: Like a physical/metaphorical key, these expressions provide a means of gaining access to something
* `values`: The data that is associated with each key

Like lists, dictionaries are mutable.

Keys must be hashable, which for built-in types means immutable objects (e.g., `int` and `str`, but not `list`). The associated values can be any type.

In [None]:
d = {1: 5, 3: 45, 4: 10}
d = {1: 5, "abc": "def", 2: "xyz", "uvw": 3}

In [None]:
# Invalid dictionary
d = {["Diane", "F", "45"]: 105, ["John", "M", "38"]: 84}

TypeError: unhashable type: 'list'
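
If the keys really do need to hold several pieces of information, one workaround is to use tuples, which are immutable, in place of lists. The sketch below reuses the data from the invalid example:

```python
# Tuples are immutable, so they are allowed as dictionary keys
d = {("Diane", "F", "45"): 105, ("John", "M", "38"): 84}

# Look up a value using the whole tuple as the key
print(d[("Diane", "F", "45")])
```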

Keys in a dictionary must be unique. If you repeat a key while creating the dictionary, only the last value given for that key is kept:

In [None]:
d = {1: 5, 1: 8}
print(d)

{1: 8}


## Dictionary Operations

Let's go back to the emissions example:

In [6]:
emissions_by_year = {1799: 1, 1800: 70, 1801: 74, 1802: 79,
                     1902: 82, 2002: 1733297}                

We can add new or update existing key-value pairs by using the key to index into the `dict`:

In [None]:
# Adding a new pair
emissions_by_year[2009] = 9000000
print(emissions_by_year) 
        
# Updating an existing pair
emissions_by_year[2009] = 10
print(emissions_by_year)

{1799: 1, 1800: 70, 1801: 74, 1802: 79, 1902: 82, 2002: 1733297, 2009: 9000000}
{1799: 1, 1800: 70, 1801: 74, 1802: 79, 1902: 82, 2002: 1733297, 2009: 10}
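
The same indexing syntax is also how you read a value back out. The sketch below uses a small, hypothetical dictionary named `sample_emissions` so that it does not disturb `emissions_by_year`. It also shows the `get()` method, which avoids an error when the key might be missing:

```python
sample_emissions = {1799: 1, 1800: 70, 1801: 74}

# Indexing with a key returns the associated value
print(sample_emissions[1800])

# Indexing with a key that is not present raises a KeyError
try:
    sample_emissions[1950]
except KeyError:
    print("1950 is not a key")

# get() returns None (or a default you supply) instead of raising an error
print(sample_emissions.get(1950))
print(sample_emissions.get(1950, 0))
```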


You can also check whether a key is in the `dict` by using the `in` operator. Note that `in` only looks at the keys; checking for a value requires searching through the values instead:

In [None]:
print(1799 in emissions_by_year)
print(1950 in emissions_by_year)

True
False
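
To make the difference between keys and values concrete, here is a minimal sketch using a hypothetical dictionary named `sample_emissions`. Applying `in` to the result of `values()` is one way to search the values:

```python
sample_emissions = {1799: 1, 1800: 70, 1801: 74}

# 70 is a value, not a key, so this check fails
key_check = 70 in sample_emissions
print(key_check)

# Searching the values explicitly finds it
value_check = 70 in sample_emissions.values()
print(value_check)
```

Keep in mind that searching the values has to look at each one in turn, whereas a key lookup does not.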


You can remove a key-value pair by using the `del` statement:

In [None]:
print(1800 in emissions_by_year)
del emissions_by_year[1800]
print(1800 in emissions_by_year)

True
False
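
Using `del` with a key that is not in the dictionary raises a `KeyError`. The sketch below, which again uses a hypothetical `sample_emissions` dictionary, shows that behavior alongside the `pop()` method, which removes a pair and returns its value:

```python
sample_emissions = {1799: 1, 1800: 70, 1801: 74}

# pop() removes the key and returns the value that was stored there
removed = sample_emissions.pop(1800)
print(removed)

# With a default argument, pop() does not raise an error for a missing key
missing = sample_emissions.pop(1950, None)
print(missing)

# del raises a KeyError when the key is missing
try:
    del sample_emissions[1950]
except KeyError:
    print("Cannot delete 1950: not a key")
```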


You can figure out how many key-value pairs are in the `dict` by using the `len()` function:

In [None]:
len(emissions_by_year)

6

You can also check whether two `dict` objects have the same content by using the `==` operator. Notice how the order of the keys does not matter:

In [2]:
d1 = {1: 'a', 2: 'b', 3: 'c'}
d2 = {3: 'c', 1: 'a', 2: 'b'}
d1 == d2

True

## Dictionary Methods

If you want to get a list of the keys in a `dict`, you can use the method `keys()`:

In [None]:
emissions_by_year.keys()

Likewise, if you want to get a list of values in a `dict`, you can use the method `values()`:

In [8]:
emissions_by_year.values()

dict_values([1, 74, 79, 82, 1733297, 10])

If you want to get a list of key-value pairs, you can use the method `items()`:

In [7]:
emissions_by_year.items()

dict_items([(1799, 1), (1801, 74), (1802, 79), (1902, 82), (2002, 1733297), (2009, 10)])

These objects are technically **views** of the dictionary, but you can easily convert them to `list` objects as follows:

In [None]:
years = list(emissions_by_year.keys())
print(years)
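
The distinction between a view and a list matters when the dictionary changes afterwards. In this small sketch (the names `sample_emissions`, `keys_view`, and `keys_snapshot` are hypothetical), the view follows the dictionary, while the list is a snapshot taken at the moment of conversion:

```python
sample_emissions = {1799: 1, 1800: 70}

keys_view = sample_emissions.keys()   # a live view of the keys
keys_snapshot = list(keys_view)       # a separate copy as a list

# Add a new pair after creating the view and the snapshot
sample_emissions[1801] = 74

print(keys_view)      # includes 1801
print(keys_snapshot)  # still only the original two keys
```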

Also notice how the `items()` method returns a list of objects with the form `(key, value)`. These are known as **tuples**. 

In [25]:
emissions_tuples = list(emissions_by_year.items())
first_entry = emissions_tuples[0]
print(first_entry)
print(type(first_entry))


(1799, 1)
<class 'tuple'>


Like lists, tuples can hold an arbitrarily long sequence of elements. The key difference is that lists are mutable, whereas tuples are immutable (i.e., they cannot be modified after creation):

In [28]:
# Inspecting elements is okay
print(first_entry[0])
print(first_entry[1])

1799
1


In [27]:
# Modifying elements is not okay
first_entry[0] = 1805

TypeError: 'tuple' object does not support item assignment
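
Since a tuple cannot be changed in place, the usual alternative is to build a new tuple. The sketch below also shows *unpacking*, which assigns each element of a tuple to its own variable. The names `year`, `value`, and `updated_entry` are only illustrative:

```python
first_entry = (1799, 1)

# Unpacking: assign each element of the tuple to its own name
year, value = first_entry
print(year)
print(value)

# To "change" a tuple, create a new one from the pieces you want
updated_entry = (1805, value)
print(updated_entry)
```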

## Practice Exercise: Working with Dictionaries

   1. Create a variable `doctor_to_patients` that refers to an empty dictionary.
   2. Add an entry for `'Dr. Ngo'` with `1200` patients.
   3. Add another entry for `'Dr. Singh'` with `1400` patients.
   4. Add a third entry for `'Dr. Gray'` with `1350` patients.
   5. Print the number of patients associated with `'Dr. Singh'`.
   6. Change the number of patients associated with `'Dr. Singh'` to `1401`.
   7. Write an expression to get the number of key-value pairs in the dictionary.
   8. Write an expression to get the doctors.
   9. Write an expression to get the patient quantities.
   10. Write an expression to check whether `'Dr. Koch'` is a key in the dictionary.
   11. Remove the key-value pair with `'Dr. Ngo'` as the key.   

In [None]:
# Write your code here

## Iterating through a Dictionary

When you iterate through a `list`, you normally access elements in one of two ways: (1) by numerical index or (2) by the elements themselves:

In [None]:
phone_list = ['555-7632', '555-9832', '555-6677', '555-9823', '555-6342', '555-7343']

In [13]:
# By index
for i in range(len(phone_list)):
  print(phone_list[i])

555-7632
555-9832
555-6677
555-9823
555-6342
555-7343


In [15]:
# By element
for phone_num in phone_list:
  print(phone_num)

555-7632
555-9832
555-6677
555-9823
555-6342
555-7343


If you need to iterate through a `dict`, you can also access key-value pairs in one of two ways: (1) by key or (2) by the key-value pairs themselves:

In [18]:
phone_dict = {'555-7632': 'Paul', '555-9832': 'Andrew', '555-6677': 'Dan', 
         '555-9823': 'Michael', '555-6342' : 'Cathy', '555-7343' : 'Diane'}

In [19]:
# By key
for key in phone_dict:
    print('Number:', key, ', Name:', phone_dict[key])

Number: 555-7632 , Name: Paul
Number: 555-9832 , Name: Andrew
Number: 555-6677 , Name: Dan
Number: 555-9823 , Name: Michael
Number: 555-6342 , Name: Cathy
Number: 555-7343 , Name: Diane


In [32]:
# By key-value pair
for item in phone_dict.items():
    print('Number:', item[0], ', Name:', item[1])

Number: 555-7632 , Name: Paul
Number: 555-9832 , Name: Andrew
Number: 555-6677 , Name: Dan
Number: 555-9823 , Name: Michael
Number: 555-6342 , Name: Cathy
Number: 555-7343 , Name: Diane


To make it easier to access the key and the value separately when iterating by key-value pair, you can actually name them in the `for` loop itself:

In [31]:
# By key-value pair (with naming)
for (number, name) in phone_dict.items():
    print('Number:', number, ', Name:', name)

Number: 555-7632 , Name: Paul
Number: 555-9832 , Name: Andrew
Number: 555-6677 , Name: Dan
Number: 555-9823 , Name: Michael
Number: 555-6342 , Name: Cathy
Number: 555-7343 , Name: Diane


Unlike a real dictionary, iterating through a `dict` does not retrieve elements in alphanumeric order. Instead, iteration works the same as it would for a `list`: elements are retrieved in the order in which they were added to the dictionary.

**Note:** This ordering is guaranteed by the language starting with Python 3.7 (in CPython 3.6 it is an implementation detail). In Python 3.5 and earlier versions, the dictionary keys are not guaranteed to be in a particular order.
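
A short sketch of what "the order in which they were added" means in practice, assuming Python 3.7 or later and using a hypothetical dictionary named `letters`. Updating an existing key keeps its position, while deleting a key and adding it again moves it to the end:

```python
letters = {}
letters['c'] = 3
letters['a'] = 1
letters['b'] = 2

# Updating an existing key does not change its position
letters['c'] = 30
order_after_update = list(letters)
print(order_after_update)

# Deleting a key and adding it again places it at the end
del letters['c']
letters['c'] = 3
order_after_readd = list(letters)
print(order_after_readd)
```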

## Practice Exercise: Iterating over Dictionaries

The following dictionary has brand name drugs as keys and generic drug names as values:

In [33]:
brand_to_generic = {'lipitor': 'atorvastatin',
                    'zithromax': 'azithromycin',
                    'amoxcil': 'amoxicillin',
                    'singulair': 'montelukast',
                    'nexium': 'esomeprazole',
                    'plavix': 'clopidogrel',
                    'abilify': 'ARIPiprazole'}

  1. Get a list of brand name drugs that start with the letter `'a'`.

In [None]:
# Write your code here

  2. Count the number of generic drugs that end with the letter `'n'`.

In [None]:
# Write your code here

  3. Get a list of brand name drugs in alphabetical order.

In [39]:
# Write your code here

## Inverting a Dictionary 

Dictionaries are primarily designed to be searched according to their keys. However, there might be cases when you need to search by value instead.

Take this dictionary of phone numbers for example, in which people can have multiple phone numbers:

In [10]:
phone_to_person = {'555-7632': 'Paul', '555-9832': 'Andrew', 
                   '555-6677': 'Dan', '555-9823': 'Michael',
                   '555-6342' : 'Cathy', '555-2222': 'Michael',
                   '555-7343' : 'Diane', '555-1982' : 'Cathy'}

If we want to get all of the phone numbers associated with Michael, we could iterate through the dictionary looking for all key-value pairs where the value is `'Michael'`:

In [None]:
michael = []
for key in phone_to_person:
    if phone_to_person[key] == 'Michael':
        michael.append(key)
print(michael)

If we want to do this for all people, we can **invert** the dictionary so that the values become keys and the keys become values. In this example, that makes the names the keys and the numbers the values. Because a person can have more than one number, each name maps to a list of numbers:

In [11]:
person_to_phone = {}
for (number, name) in phone_to_person.items():
    # Check if the person is already in the new dictionary
    if name not in person_to_phone:
        # Initialize the key with a new list
        person_to_phone[name] = [number]
    else:
        # Add the number to the existing list
        person_to_phone[name].append(number)
print(person_to_phone)

{'Paul': ['555-7632'], 'Andrew': ['555-9832'], 'Dan': ['555-6677'], 'Michael': ['555-9823', '555-2222'], 'Cathy': ['555-6342', '555-1982'], 'Diane': ['555-7343']}
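
The check-then-initialize pattern above is common enough that the standard library offers a shortcut. As an alternative sketch (not a replacement for the loop above), `collections.defaultdict` creates an empty list automatically the first time a key is used. The names `sample_phone_to_person` and `inverted` are hypothetical and use a smaller data set:

```python
from collections import defaultdict

sample_phone_to_person = {'555-9823': 'Michael', '555-6342': 'Cathy',
                          '555-2222': 'Michael'}

# Missing keys start out with an empty list, so no membership check is needed
inverted = defaultdict(list)
for number, name in sample_phone_to_person.items():
    inverted[name].append(number)

# Convert back to a regular dict for display
print(dict(inverted))
```

The explicit `if`/`else` version makes each step visible, which is helpful while learning; the `defaultdict` version is shorter once the pattern is familiar.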
