# CSE 101: Computer Science Principles
#### Stony Brook University
#### Kevin McDonnell (ktm@cs.stonybrook.edu)
## Module 13: Dictionaries



### Overview of Dictionaries

A **dictionary** is a type of collection where we index (access) an element using a string (or a value of another hashable type, such as an integer or a tuple) instead of an integer position (as in a list).

We say that a dictionary *maps* **keys** to **values**. The keys are not sorted. Since Python 3.7 a dictionary remembers the order in which its keys were inserted, but elements are still looked up by key, never by position.

For example, a dictionary for a store's inventory could map a product code (e.g., a string) to the count (integer) of how many units of that product are in the storeroom.

Another example: a dictionary for a grade book could map a student ID # (integer) to that student's grades (a list).


An empty dictionary is created using `{}`.

In [0]:
price_list = {}
price_list

{}

A dictionary with starting keys and values can also be created using `{}`.

In [0]:
price_list = {'iPhone': 1000.0, 'speakers': 125.99, 'Blu-ray player': 150.95 }
price_list

{'Blu-ray player': 150.95, 'iPhone': 1000.0, 'speakers': 125.99}

To add a key/value pair to a dictionary, we can use the `[]` operator.

In [0]:
price_list['tv'] = 1200.0
price_list['laptop'] = 799.95
price_list['radio'] = 35.95
price_list['printer'] = 150.95
price_list

{'Blu-ray player': 150.95,
 'iPhone': 1000.0,
 'laptop': 799.95,
 'printer': 150.95,
 'radio': 35.95,
 'speakers': 125.99,
 'tv': 1200.0}

Every key in a dictionary is unique, but the values need not be unique (e.g., `150.95` in the `price_list` dictionary). If we assign a new value for an existing key, the old value is lost.

In [0]:
price_list['radio'] = 27.95
price_list

{'Blu-ray player': 150.95,
 'iPhone': 1000.0,
 'laptop': 799.95,
 'printer': 150.95,
 'radio': 27.95,
 'speakers': 125.99,
 'tv': 1200.0}

To remove a key/value pair from a dictionary, use the `del` statement.

In [0]:
del price_list['radio']
price_list

{'Blu-ray player': 150.95,
 'iPhone': 1000.0,
 'laptop': 799.95,
 'printer': 150.95,
 'speakers': 125.99,
 'tv': 1200.0}

Attempting to access a non-existing key causes an error.

In [0]:
#price_list['PS5']

Attempting to delete a non-existing key also causes an error.

In [0]:
#del price_list['Xbox']

To avoid such errors, use the `in` operator to check whether a key exists in the dictionary before attempting these operations.

In [0]:
if 'Xbox' in price_list:
    del price_list['Xbox']
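Besides checking with `in`, the built-in dictionary methods `get()` and `pop()` accept a default value, and a `try`/`except KeyError` block can catch the error directly. The following is a small sketch on a made-up two-item price list, not the full dictionary built above.

```python
# Sample data for illustration only (a smaller copy of the price list).
price_list = {'iPhone': 1000.0, 'speakers': 125.99}

# get() returns the value if the key exists, otherwise the default we supply.
ps5_price = price_list.get('PS5', 0.0)
iphone_price = price_list.get('iPhone', 0.0)

# pop() removes a key and returns its value; the second argument is
# returned instead of raising KeyError when the key is missing.
removed = price_list.pop('Xbox', None)

# A try/except block is another way to handle a missing key.
try:
    price = price_list['PS5']
except KeyError:
    price = None

print(ps5_price, iphone_price, removed, price)
```

Which approach reads best depends on the situation: `in` is clearest when the missing-key case needs its own logic, while `get()` is handy when a simple fallback value is enough.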

Dictionary values can be updated like any "normal" variable.

In [0]:
discount = 0.1
price_list['tv'] *= (1 - discount)
upcharge = 75
price_list['tv'] += upcharge
price_list

{'Blu-ray player': 150.95,
 'iPhone': 1000.0,
 'laptop': 799.95,
 'printer': 150.95,
 'speakers': 125.99,
 'tv': 1155.0}
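The same update-in-place idea works for counting. The sketch below uses the store-inventory scenario from the overview; the product codes and the `shipment` list are made up for illustration.

```python
# Hypothetical inventory: product code -> number of units in the storeroom.
inventory = {'A100': 12, 'B205': 0}

# Hypothetical shipment: one entry per unit received.
shipment = ['A100', 'C310', 'A100', 'B205']

for code in shipment:
    if code in inventory:
        inventory[code] += 1   # existing product: bump the count
    else:
        inventory[code] = 1    # new product: start counting at 1

print(inventory)
```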

A for-loop provides the easiest way to iterate over the keys of a dictionary.

In [0]:
reduction = 5
for product in price_list:
    price_list[product] -= reduction
price_list

{'Blu-ray player': 145.95,
 'iPhone': 995.0,
 'laptop': 794.95,
 'printer': 145.95,
 'speakers': 120.99,
 'tv': 1150.0}
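Changing the *values* during a loop, as above, is safe. Adding or removing *keys* while looping over the dictionary itself raises a `RuntimeError`, so one common workaround is to loop over a copy of the keys. A minimal sketch, using a shortened price list and an arbitrary $200 cutoff:

```python
# Shortened sample data for illustration.
price_list = {'iPhone': 995.0, 'speakers': 120.99, 'tv': 1150.0}

# list(price_list) makes a separate list of the keys, so deleting
# from the dictionary inside the loop does not disturb the iteration.
for product in list(price_list):
    if price_list[product] < 200:
        del price_list[product]

print(price_list)
```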

Use `dict_name.values()` to iterate over the values in the dictionary. Similarly, `dict_name.keys()` lets you iterate over the keys. Both return *view* objects rather than lists; wrap them in `list()` if you need an actual list.

In [0]:
for price in price_list.values():
    print(price)

995.0
120.99
145.95
1150.0
794.95
145.95
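Because `values()` produces an iterable, it can be passed straight to built-in functions such as `sum()` and `max()`. The sketch below, on a shortened sample price list, also shows that the object returned by `values()` is a live view that reflects later changes to the dictionary.

```python
# Shortened sample data for illustration.
price_list = {'iPhone': 995.0, 'speakers': 120.99, 'tv': 1150.0}

values_view = price_list.values()
total = sum(values_view)      # add up all the prices
highest = max(values_view)    # most expensive price

# The view tracks the dictionary: adding a key changes what it shows.
price_list['radio'] = 30.0
count_after = len(values_view)

# Convert to a real list when indexing or slicing is needed.
as_list = list(values_view)

print(total, highest, count_after, as_list)
```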


Use `dict_name.items()` to access the keys and their associated values in tandem.

In [0]:
for product, price in price_list.items():
    print(f'{product}: ${price:0.2f}')

iPhone: $995.00
speakers: $120.99
Blu-ray player: $145.95
tv: $1150.00
laptop: $794.95
printer: $145.95
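If a particular order is wanted, the built-in `sorted()` function can be applied to the keys or to the key/value pairs. A short sketch with a shortened sample price list:

```python
# Shortened sample data for illustration.
price_list = {'iPhone': 995.0, 'speakers': 120.99, 'tv': 1150.0}

# sorted() on a dictionary returns a sorted list of its keys.
names_in_order = sorted(price_list)

# Sort the (product, price) pairs by price, most expensive first.
by_price = sorted(price_list.items(), key=lambda pair: pair[1], reverse=True)

for product, price in by_price:
    print(f'{product}: ${price:0.2f}')
```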


### Dictionary Comprehensions

Similar to a list comprehension, a **dictionary comprehension** lets you create a new dictionary based on one or more existing data structures. 

In [0]:
names = ['Sara', 'Jenn', 'Deb', 'Mike', 'Maria']
jobs = ['scientist', 'doctor', 'engineer', 'teacher', 'lawyer']
careers = {name: job for name, job in zip(names, jobs)}
careers

{'Deb': 'engineer',
 'Jenn': 'doctor',
 'Maria': 'lawyer',
 'Mike': 'teacher',
 'Sara': 'scientist'}
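Like a list comprehension, a dictionary comprehension can also filter with an `if` clause or transform the values. The sketch below assumes a shortened sample price list, a $200 budget, and a 10% sale, all chosen only for illustration.

```python
# Shortened sample data for illustration.
price_list = {'iPhone': 995.0, 'speakers': 120.99,
              'printer': 145.95, 'tv': 1150.0}

# Keep only the products that cost less than $200.
budget_items = {product: price
                for product, price in price_list.items()
                if price < 200}

# Build a new dictionary with every price reduced by 10%.
sale_prices = {product: round(price * 0.9, 2)
               for product, price in price_list.items()}

print(budget_items)
print(sale_prices)
```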

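Remember that the keys in a dictionary must be unique. If we build a dictionary from non-unique keys, each repeated key keeps only the last value assigned to it, so some key/value pairs are lost. Below, two products share the price `145.95`, and only one of them survives.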

In [0]:
reversed_price_list = {price_list[product]: product for product in price_list}
reversed_price_list

{120.99: 'speakers',
 145.95: 'printer',
 794.95: 'laptop',
 995.0: 'iPhone',
 1150.0: 'tv'}
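One way to reverse a mapping without losing anything is to map each price to a *list* of products. This is a sketch of that idea on a shortened sample price list; the name `products_by_price` is made up for this example.

```python
# Shortened sample data; two products share the price 145.95.
price_list = {'iPhone': 995.0, 'speakers': 120.99,
              'Blu-ray player': 145.95, 'printer': 145.95}

products_by_price = {}
for product, price in price_list.items():
    if price not in products_by_price:
        products_by_price[price] = []      # first product at this price
    products_by_price[price].append(product)

print(products_by_price)
```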

### Application: Autocorrection

A simple autocorrection feature can be implemented with a dictionary that maps each correctly spelled word to a list of its common misspellings.

In [0]:
def autocorrect(message, mappings):
    message = message.split()
    corrected = ''
    for word in message:  # go through all the words in the message
        word_found = False
        for correct_word in mappings:  # go through all the keys
            misspellings = mappings[correct_word]
            if word in misspellings:
                corrected += correct_word
                word_found = True
        if not word_found:
            corrected += word
        corrected += ' '
    return corrected.strip()

replacements = {  
    'the': ['hte', 'teh'],
    'this': ['thsi', 'tis','htis', 'tshi'],
    'hey': ['hye', 'ehy', 'yhe'],
    'you': ['yuo', 'ouy', 'uyo', 'u'],
    'how': ['haw', 'hwo'],
    'are': ['r','aer'],
    'is': ['si'],
    'for': ['fir'],
    'test': ['tset', 'tets', 'etts'],
    'am': ['ma','m'],
    'best': ['bset', 'bets', 'btes'],
    'me': ['em', 'mi'],
    'hello': ['hallo', 'heello', 'helo', 'hell']
}

print(autocorrect('hye thsi si mi', replacements))
print(autocorrect('you are the best', replacements))
print(autocorrect('helo aer yuo ready fir teh tset', replacements))

hey this is me
you are the best
hello are you ready for the test


Another, perhaps more efficient, approach would be to map each misspelled word to its correct spelling. We will explore this implementation later in the module.

### Application: Electronic Grade Book

We could envision many scenarios where a value is associated with a list of other values:

* rosters of names in courses (map a string to a list of strings)
* lists of stores in different zip codes (map an integer zip code to a list of strings)
* an electronic grade book (map an integer student ID # to a list of numbers)

Let's set up a grade book with some starting values.

In [0]:
gradebook = {
    100000728: [80, 91, 83],
    100000122: [89, 99, 90],
    100000345: [92, 77, 87],
    100000912: [60, 81, 74]
}
roster = {
    100000728: 'Charlie',
    100000122: 'Steve',
    100000345: 'Molly',
    100000912: 'Paula',
}

We have some new test scores to add to the grade book.

In [0]:
scores = {
    100000728: 77,
    100000122: 91,
    100000345: 67,
    100000912: 88
}

In [0]:
for id_num, score in scores.items():
    gradebook[id_num].append(score)
gradebook

{100000122: [89, 99, 90, 91],
 100000345: [92, 77, 87, 67],
 100000728: [80, 91, 83, 77],
 100000912: [60, 81, 74, 88]}
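The loop above assumes every ID in `scores` already has an entry in `gradebook`; otherwise it raises a `KeyError`. One possible way to handle a newly enrolled student is the dictionary method `setdefault()`, sketched here with shortened data. The ID `100000555` is invented for the example.

```python
# Shortened sample grade book.
gradebook = {
    100000728: [80, 91, 83],
    100000122: [89, 99, 90]
}

# 100000555 is a made-up ID that is not yet in the grade book.
new_scores = {100000728: 77, 100000555: 95}

for id_num, score in new_scores.items():
    # setdefault() returns the existing list, or stores and returns
    # a new empty list if the ID is not present yet.
    gradebook.setdefault(id_num, []).append(score)

print(gradebook)
```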

Let's compute everyone's average.

In [0]:
averages = {}
for id_num, score_list in gradebook.items():  # score_list avoids reusing the name `scores` from above
    averages[id_num] = sum(score_list)/len(score_list)
averages

{100000122: 92.25, 100000345: 80.75, 100000728: 82.75, 100000912: 75.75}
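The same computation can be written as a dictionary comprehension, and `max()` with a `key` function can then pick out the student with the highest average. A brief sketch on two of the students:

```python
# Two entries from the grade book, for illustration.
gradebook = {
    100000728: [80, 91, 83, 77],
    100000122: [89, 99, 90, 91]
}

# One average per student ID.
averages = {id_num: sum(score_list) / len(score_list)
            for id_num, score_list in gradebook.items()}

# max() compares the keys by their associated average.
top_id = max(averages, key=averages.get)

print(averages)
print(top_id)
```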

Let's compute letter grades.

* 90 or higher: A
* at least 80 but below 90: B
* at least 70 but below 80: C
* at least 60 but below 70: D
* below 60: F

A dictionary is not well-suited for mapping ranges to values, so we will use a function instead.



In [0]:
def compute_grade(average):
    if average >= 90:
        return 'A'
    elif average >= 80:
        return 'B'
    elif average >= 70:
        return 'C'
    elif average >= 60:
        return 'D'
    else:
        return 'F'

grades = {}
for id_num in averages:
    grades[id_num] = compute_grade(averages[id_num])
grades

{100000122: 'A', 100000345: 'B', 100000728: 'B', 100000912: 'C'}
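The `roster` dictionary defined earlier shares its keys with `grades`, so the two can be combined to report grades by name. This sketch assumes that every ID in `grades` appears in `roster` and that the names are unique.

```python
roster = {
    100000728: 'Charlie',
    100000122: 'Steve',
    100000345: 'Molly',
    100000912: 'Paula',
}
grades = {100000122: 'A', 100000345: 'B', 100000728: 'B', 100000912: 'C'}

# Look up each ID's name in the roster and use it as the new key.
grades_by_name = {roster[id_num]: grade for id_num, grade in grades.items()}

for name, grade in grades_by_name.items():
    print(f'{name}: {grade}')
```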

### Application: Electronic Address Book / Contacts List

For a typical contacts list, we can store several pieces of information about a contact:
* name
* one or more phone numbers
* one or more email addresses
* birthday (see the [`datetime`](https://docs.python.org/3/library/datetime.html) module for how to handle dates and times)
* notes
* etc.

We can store this kind of data using **nested dictionaries**, wherein each name (or phone number or whatever unique value we like) is the key, and the associated value is another dictionary with all the information about the contact.

In [0]:
from datetime import date

contacts = {
    'Bill': {
        'mobile': '6319871234',
        'email': ['bill@stonybrook.edu', 'bill@gmail.com'],
        'birthday': date(2000, 12, 5)
    },
    'Nancy': {
        'home': '2129223104',
        'mobile': '6317290113',
        'email': ['nancy@yahoo.com'],
        'notes': 'allergic to peanuts'
    }
}
contacts

{'Bill': {'birthday': datetime.date(2000, 12, 5),
  'email': ['bill@stonybrook.edu', 'bill@gmail.com'],
  'mobile': '6319871234'},
 'Nancy': {'email': ['nancy@yahoo.com'],
  'home': '2129223104',
  'mobile': '6317290113',
  'notes': 'allergic to peanuts'}}

Let's retrieve Nancy's info.

In [0]:
contacts['Nancy']

{'email': ['nancy@yahoo.com'],
 'home': '2129223104',
 'mobile': '6317290113',
 'notes': 'allergic to peanuts'}

Let's look up Nancy's mobile phone number.

In [0]:
contacts['Nancy']['mobile']

'6317290113'

Let's retrieve Bill's info.

In [0]:
contacts['Bill']

{'birthday': datetime.date(2000, 12, 5),
 'email': ['bill@stonybrook.edu', 'bill@gmail.com'],
 'mobile': '6319871234'}

Let's look up Bill's birthday and find out what month it's in.

In [0]:
contacts['Bill']['birthday'].month  # or .day, or .year 

12

Let's look up Bill's email address(es).

In [0]:
contacts['Bill']['email']

['bill@stonybrook.edu', 'bill@gmail.com']

Bill has several email addresses. Let's retrieve the first one.

In [0]:
contacts['Bill']['email'][0]

'bill@stonybrook.edu'
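Not every contact has every field (Bill has no `'home'` number, for instance), so chained `[]` lookups can raise a `KeyError`. The sketch below shows one way to guard against that with `get()`, plus adding a field and collecting data across all contacts. The contact `'Zoe'` and the note text are invented for the example.

```python
# Trimmed copy of the contacts dictionary (birthday omitted for brevity).
contacts = {
    'Bill': {
        'mobile': '6319871234',
        'email': ['bill@stonybrook.edu', 'bill@gmail.com']
    },
    'Nancy': {
        'home': '2129223104',
        'mobile': '6317290113',
        'email': ['nancy@yahoo.com'],
        'notes': 'allergic to peanuts'
    }
}

# Bill has no 'home' entry, so get() supplies a fallback value.
bill_home = contacts['Bill'].get('home', 'n/a')

# 'Zoe' is not a contact; falling back to {} lets the second get() run safely.
zoe_mobile = contacts.get('Zoe', {}).get('mobile')

# Add a new field to an existing contact's inner dictionary.
contacts['Bill']['notes'] = 'prefers email'

# Gather every email address from every contact.
all_emails = [email for info in contacts.values() for email in info['email']]

print(bill_home, zoe_mobile, all_emails)
```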

### Application: Autocorrection (Improved)

The `replacements` dictionary above that maps a correctly-spelled word to common misspellings is backwards, in a way. We should probably reverse the mapping and map each misspelled word to its correct spelling.

In [0]:
replacements = {  
    'the': ['hte', 'teh'],
    'this': ['thsi', 'tis','htis', 'tshi'],
    'hey': ['hye', 'ehy', 'yhe'],
    'you': ['yuo', 'ouy', 'uyo', 'u'],
    'how': ['haw', 'hwo'],
    'are': ['r','aer'],
    'is': ['si'],
    'for': ['fir'],
    'test': ['tset', 'tets', 'etts'],
    'am': ['ma','m'],
    'best': ['bset', 'bets', 'btes'],
    'me': ['em', 'mi'],
    'hello': ['hallo', 'heello', 'helo', 'hell']
}

corrections = {}
for word in replacements:
    for misspelling in replacements[word]:
        corrections[misspelling] = word
corrections

{'aer': 'are',
 'bets': 'best',
 'bset': 'best',
 'btes': 'best',
 'ehy': 'hey',
 'em': 'me',
 'etts': 'test',
 'fir': 'for',
 'hallo': 'hello',
 'haw': 'how',
 'heello': 'hello',
 'hell': 'hello',
 'helo': 'hello',
 'hte': 'the',
 'htis': 'this',
 'hwo': 'how',
 'hye': 'hey',
 'm': 'am',
 'ma': 'am',
 'mi': 'me',
 'ouy': 'you',
 'r': 'are',
 'si': 'is',
 'teh': 'the',
 'tets': 'test',
 'thsi': 'this',
 'tis': 'this',
 'tset': 'test',
 'tshi': 'this',
 'u': 'you',
 'uyo': 'you',
 'yhe': 'hey',
 'yuo': 'you'}

In [0]:
def autocorrect2(message, mappings):
    message = message.split()
    corrected = ''
    for word in message: 
        if word in mappings:
            corrected += mappings[word]
        else:
            corrected += word
        corrected += ' '
    return corrected.strip()

print(autocorrect2('hye thsi si mi', corrections))
print(autocorrect2('you are the best', corrections))
print(autocorrect2('helo aer yuo ready fir teh tset', corrections))

hey this is me
you are the best
hello are you ready for the test


We can do even better with a list comprehension and `join`.

In [0]:
def autocorrect3(message, mappings):
    return ' '.join([mappings[word] if word in mappings else word for word in message.split()])

print(autocorrect3('hye thsi si mi', corrections))
print(autocorrect3('you are the best', corrections))
print(autocorrect3('helo aer yuo ready fir teh tset', corrections))

hey this is me
you are the best
hello are you ready for the test
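A further variation, offered here only as a sketch, replaces the conditional expression with the dictionary method `get()`, which returns the word itself when no correction is found. The function name `autocorrect4` and the shortened `corrections` dictionary are made up for this example.

```python
# Shortened corrections dictionary: misspelling -> correct spelling.
corrections = {
    'helo': 'hello', 'aer': 'are', 'yuo': 'you',
    'fir': 'for', 'teh': 'the', 'tset': 'test'
}

def autocorrect4(message, mappings):
    # mappings.get(word, word) yields the correction if one exists,
    # otherwise the original word unchanged.
    return ' '.join(mappings.get(word, word) for word in message.split())

print(autocorrect4('helo aer yuo ready fir teh tset', corrections))
print(autocorrect4('you are the best', corrections))
```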
