# Introduction: Dictionaries
The compound data types we have studied in detail so far — strings and lists — are sequential collections. This means that the items in the collection are ordered from left to right and they use integers as indices to access the values they contain. This also means that looking for a particular value requires scanning the many items in the list until you find the desired value.

Data can sometimes be organized more usefully by associating a key with the value we are looking for. For example, if you are asked for the page number for the start of chapter 5 in a large textbook, you might flip around the book looking for the chapter 5 heading. If the chapter number appears in the header or footer of each page, you might be able to find the page number fairly quickly, but it’s generally easier and faster to go to the table of contents and see that chapter 5 starts on page 78.

This sort of direct lookup of a value in Python is done with an object called a dictionary. Dictionaries are a different kind of collection. They are Python’s built-in mapping type. A map is an associative collection: values are accessed by key rather than by position. (Since Python 3.7 a dictionary remembers the order in which its keys were inserted, but it is still not indexed by integers.) The association, or mapping, is from a key, which can be of any immutable type (e.g., the chapter name and number in the analogy above), to a value (the starting page number), which can be any Python data object. You’ll learn how to use these collections in the sections that follow.
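
To make the analogy concrete, here is a minimal sketch of a chapter-to-page lookup. The name chapter_start_pages is made up for illustration, and only the chapter 5 entry comes from the analogy above; the other page numbers are invented.

```python
# Hypothetical table of contents: chapter label -> starting page.
# Only "Chapter 5" -> 78 comes from the text; the other entries are invented.
chapter_start_pages = {"Chapter 4": 61, "Chapter 5": 78, "Chapter 6": 97}

# Direct lookup by key: no flipping through pages is needed.
print(chapter_start_pages["Chapter 5"])   # 78
```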

# Getting Started with Dictionaries
To provide an example of this new kind of datatype, we will create a dictionary to translate English words into Spanish. For this dictionary, the keys are strings and the values will also be strings.

One way to create a dictionary is to start with the empty dictionary and add key-value pairs. The empty dictionary is denoted {}.
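
A minimal sketch of this approach; the name eng2sp and the three word pairs are chosen here for illustration.

```python
eng2sp = {}                 # start with the empty dictionary
eng2sp['one'] = 'uno'       # each assignment adds one key-value pair
eng2sp['two'] = 'dos'
eng2sp['three'] = 'tres'

print(eng2sp)
print(eng2sp['two'])        # look up a value by its key: dos
```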

In [1]:
#What is printed by the following statements?

mydict = {"cat":12, "dog":6, "elephant":23}
print(mydict["dog"])

6


Create a dictionary that keeps track of the USA’s Olympic medal count. Each key of the dictionary should be the type of medal (gold, silver, or bronze) and each key’s value should be the number of medals of that type the USA has won. Currently, the USA has 33 gold medals, 17 silver, and 12 bronze. Create a dictionary saved in the variable medals that reflects this information.

In [2]:
medals = {"gold":33, "silver":17, "bronze":12}
print(medals)

{'gold': 33, 'silver': 17, 'bronze': 12}


In [3]:
#You are keeping track of Olympic medals for Italy in the 2016 Rio Summer Olympics! At the moment, Italy has 7 gold medals, 8 silver medals, and 6 bronze medals. Create a dictionary called olympics where the keys are the types of medals, and the values are the number of medals of that type that Italy has won so far.
olympics = {"gold":7, "silver":8, "bronze":6}
print(olympics)


{'gold': 7, 'silver': 8, 'bronze': 6}


# Dictionary operations
The del statement removes a key-value pair from a dictionary. For example, the following dictionary contains the names of various fruits and the number of each fruit in stock. If someone buys all of the pears, we can remove the entry from the dictionary.
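
A minimal sketch of that scenario, assuming the same inventory figures that appear later in this chapter:

```python
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

del inventory['pears']          # someone bought all of the pears
print('pears' in inventory)     # False
print(len(inventory))           # 3 key-value pairs remain
```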

In [4]:
# What is printed by the following statements?

mydict = {"cat":12, "dog":6, "elephant":23}
mydict["mouse"] = mydict["cat"] + mydict["dog"]
print(mydict["mouse"])

18


In [6]:
#Update the value for “Phelps” in the dictionary swimmers to include his medals from the Rio Olympics by adding 5 to the current value (Phelps will now have 28 total medals). Do not rewrite the dictionary.
swimmers = {'Manuel':4, 'Lochte':12, 'Adrian':7, 'Ledecky':5, 'Dirado':4, 'Phelps':23}
swimmers['Phelps'] += 5   # add the 5 Rio medals to the existing count instead of hard-coding 28
print(swimmers)

{'Manuel': 4, 'Lochte': 12, 'Adrian': 7, 'Ledecky': 5, 'Dirado': 4, 'Phelps': 28}


# Dictionary methods
Dictionaries have a number of useful built-in methods. The following table provides a summary and more details can be found in the Python Documentation.

| Method | Parameters | Description |
| --- | --- | --- |
| keys | none | Returns a view of the keys in the dictionary |
| values | none | Returns a view of the values in the dictionary |
| items | none | Returns a view of the key-value pairs in the dictionary |
| get | key | Returns the value associated with key; None otherwise |
| get | key, alt | Returns the value associated with key; alt otherwise |

As we saw earlier with strings and lists, dictionary methods use dot notation, which specifies the name of the method to the right of the dot and the name of the object on which to apply the method immediately to the left of the dot. The empty parentheses in the case of keys indicate that this method takes no parameters. If x is a variable whose value is a dictionary, x.keys is the method object, and x.keys() invokes the method, returning a view of the keys.

The keys method returns the keys in the order in which they were added to the dictionary. This ordering is guaranteed since Python 3.7; older versions made no promise about the order.

In [7]:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

for akey in inventory.keys():     # keys come back in insertion order (Python 3.7+)
    print("Got key", akey, "which maps to value", inventory[akey])

ks = list(inventory.keys())
print(ks)

Got key apples which maps to value 430
Got key bananas which maps to value 312
Got key oranges which maps to value 525
Got key pears which maps to value 217
['apples', 'bananas', 'oranges', 'pears']


In [8]:
#It’s so common to iterate over the keys in a dictionary that you can omit the keys method call in the for loop — iterating over a dictionary implicitly iterates over its keys.
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

for k in inventory:
    print("Got key", k)

Got key apples
Got key bananas
Got key oranges
Got key pears


In [9]:
#The values and items methods are similar to keys. They return view objects that can be iterated over. Note that each item is a tuple containing a key and its associated value.
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

print(list(inventory.values()))
print(list(inventory.items()))

for k, v in inventory.items():   # each item is a (key, value) tuple
    print("Got", k, "that maps to", v)

[430, 312, 525, 217]
[('apples', 430), ('bananas', 312), ('oranges', 525), ('pears', 217)]
Got apples that maps to 430
Got bananas that maps to 312
Got oranges that maps to 525
Got pears that maps to 217


In [10]:
#The in and not in operators can test if a key is in the dictionary:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}
print('apples' in inventory)
print('cherries' in inventory)

if 'bananas' in inventory:
    print(inventory['bananas'])
else:
    print("We have no bananas")

True
False
312


The in operator can be very useful, since looking up a non-existent key with the [ ] operator causes a runtime error (a KeyError).
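
As a small illustration of that error, the sketch below looks up a key that is not present and catches the resulting exception, then shows the same lookup guarded by in. The try/except is used here only so the example runs to completion.

```python
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

try:
    print(inventory['cherries'])      # 'cherries' is not a key
except KeyError as err:
    print("No such key:", err)

# Checking with `in` first avoids the error entirely
if 'cherries' in inventory:
    print(inventory['cherries'])
else:
    print("We have no cherries")
```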

The get method allows us to access the value associated with a key, similar to the [ ] operator. The important difference is that get will not cause a runtime error if the key is not present. It will instead return None. A variation of get takes a second parameter that serves as an alternative return value when the key is not present. This can be seen in the final example below: since “cherries” is not a key, get returns 0 (instead of None).

In [12]:
inventory = {'apples': 430, 'bananas': 312, 'oranges': 525, 'pears': 217}

print(inventory.get("apples"))
print(inventory.get("cherries"))

print(inventory.get("cherries",0))

430
None
0


In [13]:
# What is printed by the following statements?

mydict = {"cat":12, "dog":6, "elephant":23, "bear":20}
answer = mydict.get("cat")//mydict.get("dog")
print(answer)

2


In [14]:
#What is printed by the following statements?

mydict = {"cat":12, "dog":6, "elephant":23, "bear":20}
print("dog" in mydict)

True


In [15]:
# What is printed by the following statements?

mydict = {"cat":12, "dog":6, "elephant":23, "bear":20}
print(23 in mydict)

False


In [16]:
# What is printed by the following statements?

total = 0
mydict = {"cat":12, "dog":6, "elephant":23, "bear":20}
for akey in mydict:
    if len(akey) > 3:
        total = total + mydict[akey]
print(total)

43


In [18]:
#Every four years, the summer Olympics are held in a different country. Add a key-value pair to the dictionary places that reflects that the 2016 Olympics were held in Brazil. Do not rewrite the entire dictionary to do this!
places = {"Australia":2000, "Greece":2004, "China":2008, "England":2012}
places["Brazil"] = 2016
print(places)

{'Australia': 2000, 'Greece': 2004, 'China': 2008, 'England': 2012, 'Brazil': 2016}


In [20]:
#We have a dictionary of the specific events that Italy has won medals in and the number of medals they have won for each event. Assign to the variable events a list of the keys from the dictionary medal_events. Do not hard code this.
medal_events = {'Shooting': 7, 'Fencing': 4, 'Judo': 2, 'Swimming': 3, 'Diving': 2}
events = list(medal_events.keys())
print(events)

['Shooting', 'Fencing', 'Judo', 'Swimming', 'Diving']


# Aliasing and copying
Because dictionaries are mutable, you need to be aware of aliasing (as we saw with lists). Whenever two variables refer to the same dictionary object, changes to one affect the other. For example, opposites is a dictionary that contains pairs of opposites.

In [21]:
opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}
alias = opposites

print(alias is opposites)

alias['right'] = 'left'
print(opposites['right'])

True
left


In [22]:
# What is printed by the following statements?

mydict = {"cat":12, "dog":6, "elephant":23, "bear":20}
yourdict = mydict
yourdict["elephant"] = 999
print(mydict["elephant"])

999
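
If you want to modify a dictionary while keeping the original unchanged, one option is the dictionary copy method, which returns a new dictionary with the same key-value pairs (a shallow copy). The sketch below reuses the opposites dictionary from above; the name acopy is chosen here for illustration.

```python
opposites = {'up': 'down', 'right': 'wrong', 'true': 'false'}
acopy = opposites.copy()        # a new dictionary object with the same pairs

print(acopy is opposites)       # False: two distinct objects

acopy['right'] = 'left'         # changes only the copy
print(opposites['right'])       # wrong
print(acopy['right'])           # left
```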


# Accumulating Multiple Results In a Dictionary
You have previously seen the accumulator pattern; it goes through the items in a sequence, updating an accumulator variable each time. Rather than accumulating a single result, it’s possible to accumulate many results. Suppose, for example, we wanted to find out which letters are used most frequently in English.

Suppose we had a reasonably long text that we thought was representative of general English usage. For our purposes in this chapter, we will use the text of the Sherlock Holmes story, “A Study in Scarlet”, by Sir Arthur Conan Doyle. The text actually includes a few lines about the source of the transcription (Project Gutenberg), but those will not materially affect our analyses so we will just leave them in. You can access this text within this chapter with the code open('scarlet.txt', 'r').

In [None]:
# If we want to find out how often the letter ‘t’ occurs, we can accumulate the result in a count variable.
f = open('scarlet.txt', 'r')
txt = f.read()
# now txt is one long string containing all the characters
t_count = 0 #initialize the accumulator variable
for c in txt:
    if c == 't':
        t_count = t_count + 1   #increment the counter
print("t: " + str(t_count) + " occurrences")

t: 17584 occurrences

In [None]:
# We can accumulate counts for more than one character as we traverse the text. Suppose, for example, we wanted to compare the counts of ‘t’ and ‘s’ in the text.
f = open('scarlet.txt', 'r')
txt = f.read()
# now txt is one long string containing all the characters
t_count = 0 #initialize the accumulator variable
s_count = 0 # initialize the s counter accumulator as well
for c in txt:
    if c == 't':
        t_count = t_count + 1   #increment the t counter
    elif c == 's':
        s_count = s_count + 1
print("t: " + str(t_count) + " occurrences")
print("s: " + str(s_count) + " occurrences")

t: 17584 occurrences

s: 11830 occurrences

OK, but you can see this is going to get tedious if we try to accumulate counts for all the letters. We will have to initialize a lot of accumulators, and there will be a very long if..elif..elif statement. Using a dictionary, we can do a lot better.

One dictionary can hold all of the accumulator variables. Each key in the dictionary will be one letter, and the corresponding value will be the count so far of how many times that letter has occurred.



In [None]:
f = open('scarlet.txt', 'r')
txt = f.read()
# now txt is one long string containing all the characters
x = {} # start with an empty dictionary
x['t'] = 0  # initialize the t counter
x['s'] = 0  # initialize the s counter
for c in txt:
    if c == 't':
        x['t'] = x['t'] + 1  # increment the t counter
    elif c == 's':
        x['s'] = x['s'] + 1  # increment the s counter

print("t: " + str(x['t']) + " occurrences")
print("s: " + str(x['s']) + " occurrences")

t: 17584 occurrences

s: 11830 occurrences

In [None]:
#This hasn’t really improved things yet, but look closely at lines 8-11 in the code above. Whichever character we’re seeing, t or s, we’re incrementing the counter for that character. So lines 9 and 11 could really be the same.
f = open('scarlet.txt', 'r')
txt = f.read()
# now txt is one long string containing all the characters
x = {} # start with an empty dictionary
x['t'] = 0  # initialize the t counter
x['s'] = 0  # initialize the s counter
for c in txt:
    if c == 't':
        x[c] = x[c] + 1   # increment the t counter
    elif c == 's':
        x[c] = x[c] + 1   # increment the s counter

print("t: " + str(x['t']) + " occurrences")
print("s: " + str(x['s']) + " occurrences")

In [None]:
f = open('scarlet.txt', 'r')
txt = f.read()
# now txt is one long string containing all the characters
x = {} # start with an empty dictionary
for c in txt:
    if c not in x:
        # we have not seen this character before, so initialize a counter for it
        x[c] = 0

    #whether we've seen it before or not, increment its counter
    x[c] = x[c] + 1

print("t: " + str(x['t']) + " occurrences")
print("s: " + str(x['s']) + " occurrences")


In [None]:
f = open('scarlet.txt', 'r')
txt = f.read()
# now txt is one long string containing all the characters
letter_counts = {} # start with an empty dictionary
for c in txt:
    if c not in letter_counts:
        # we have not seen this character before, so initialize a counter for it
        letter_counts[c] = 0

    #whether we've seen it before or not, increment its counter
    letter_counts[c] = letter_counts[c] + 1

for c in letter_counts.keys():
    print(c + ": " + str(letter_counts[c]) + " occurrences")


Note that only those characters that actually occur in the text are shown. Some punctuation marks that are possible in English, but were never used in the text, are omitted completely. The blank line partway through the output may surprise you. That’s actually saying that the newline character, \n, appears 5155 times in the text. In other words, there are 5155 lines of text in the file. Let’s test that hypothesis.

In [None]:
f = open('scarlet.txt', 'r')
txt_lines = f.readlines()
# now txt_lines is a list, where each item is one
# line of text from the story
print(len(txt_lines))
print(txt_lines[70:85])

In [25]:
#Provided is a string saved to the variable name sentence. Split the string into a list of words, then create a dictionary that contains each word and the number of times it occurs. Save this dictionary to the variable name word_counts.
sentence = "The dog chased the rabbit into the forest but the rabbit was too quick."

words = sentence.split()
word_counts = {}
for word in words:
    if word not in word_counts:
        word_counts[word] = 0
    word_counts[word] += 1
print(word_counts)

{'The': 1, 'dog': 1, 'chased': 1, 'the': 3, 'rabbit': 2, 'into': 1, 'forest': 1, 'but': 1, 'was': 1, 'too': 1, 'quick.': 1}


In [26]:
#Create a dictionary called char_d from the string stri, so that the key is a character and the value is how many times it occurs.
stri = "what can I do"

char_d = {}
for char in stri:   # iterating over a string yields one character at a time
    if char not in char_d:
        char_d[char] = 0
    char_d[char] += 1
print(char_d)

{'w': 1, 'h': 1, 'a': 2, 't': 1, ' ': 3, 'c': 1, 'n': 1, 'I': 1, 'd': 1, 'o': 1}
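
The "if ... not in" check used throughout this section can also be folded into a single line with the get method described earlier. This is an alternative sketch, not the approach the exercises above ask for; it uses a short made-up string instead of scarlet.txt so that it runs without the file.

```python
txt = "the cat sat on the mat"    # stand-in text, not taken from the story
letter_counts = {}
for c in txt:
    # get returns 0 the first time we see c, so no separate initialization is needed
    letter_counts[c] = letter_counts.get(c, 0) + 1

print(letter_counts['t'])   # 5
print(letter_counts['a'])   # 3
```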


# Accumulating Results From a Dictionary
Just as we have iterated through the elements of a list to accumulate a result, we can also iterate through the keys in a dictionary, accumulating a result that may depend on the values associated with each of the keys.

For example, suppose that we wanted to compute a Scrabble score for the Study in Scarlet text. Each occurrence of the letter ‘e’ earns one point, but ‘q’ earns 10. We have a second dictionary, stored in the variable letter_values. Now, to compute the total score, we start an accumulator at 0 and go through each of the letters in the counts dictionary. For each of those letters that has a letter value (no points for spaces, punctuation, capital letters, etc.), we multiply that value by the letter’s count and add the product to the total score.

In [None]:
f = open('scarlet2.txt', 'r')
txt = f.read()
# now txt is one long string containing all the characters
x = {} # start with an empty dictionary
for c in txt:
    if c not in x:
        # we have not seen this character before, so initialize a counter for it
        x[c] = 0

    #whether we've seen it before or not, increment its counter
    x[c] = x[c] + 1

letter_values = {'a': 1, 'b': 3, 'c': 3, 'd': 2, 'e': 1, 'f':4, 'g': 2, 'h':4, 'i':1, 'j':8, 'k':5, 'l':1, 'm':3, 'n':1, 'o':1, 'p':3, 'q':10, 'r':1, 's':1, 't':1, 'u':1, 'v':4, 'w':4, 'x':8, 'y':4, 'z':10}

tot = 0
for y in x:
    if y in letter_values:
        tot = tot + letter_values[y] * x[y]

print(tot)


In [28]:
#The dictionary travel contains the number of countries within each continent that Jackie has traveled to. Find the total number of countries that Jackie has been to, and save this number to the variable name total. Do not hard code this!
travel = {"North America": 2, "Europe": 8, "South America": 3, "Asia": 4, "Africa":1, "Antarctica": 0, "Australia": 1}

total = 0
for key in travel:
    total += travel[key]
total

19

In [30]:
#schedule is a dictionary where a class name is a key and its value is how many credits it was worth. Go through and accumulate the total number of credits that have been earned so far and assign that to the variable total_credits. Do not hardcode.
schedule = {"UARTS 150": 3, "SPANISH 103": 4, "ENGLISH 125": 4, "SI 110": 4, "ENS 356": 2, "WOMENSTD 240": 4, "SI 106": 4, "BIO 118": 3, "SPANISH 231": 4, "PSYCH 111": 4, "LING 111": 3, "SPANISH 232": 4, "STATS 250": 4, "SI 206": 4, "COGSCI 200": 4, "AMCULT 202": 4, "ANTHRO 101": 4}

total_credits = 0
for key in schedule:
    total_credits += schedule[key]
total_credits

63

# Accumulating the Best Key

In [31]:
#Write a program that finds the key in a dictionary that has the maximum value. If two keys have the same maximum value, it’s OK to print out either one. Fill in the skeleton code
d = {'a': 194, 'b': 54, 'c':34, 'd': 44, 'e': 312, 'full':31}

ks = d.keys()
best_key_so_far = list(ks)[0]  # Have to turn ks into a real list before using [] to select an item
for k in ks:
    if d[k] > d[best_key_so_far]:
        best_key_so_far = k

print("key " + best_key_so_far + " has the highest value, " + str(d[best_key_so_far]))


key e has the highest value, 312
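
As a cross-check on the loop above, Python's built-in max function accepts a key argument, and passing d.get makes it compare keys by their values. This is an alternative sketch, not the skeleton's intended solution; the name best_key is chosen here for illustration.

```python
d = {'a': 194, 'b': 54, 'c': 34, 'd': 44, 'e': 312, 'full': 31}

# max iterates over the keys of d and ranks each key by d.get(key), i.e. its value
best_key = max(d, key=d.get)

print("key " + best_key + " has the highest value, " + str(d[best_key]))
```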


In [32]:
#Create a dictionary called d that keeps track of all the characters in the string placement and notes how many times each character was seen. Then, find the key with the lowest value in this dictionary and assign that key to min_value.
placement = "Beaches are cool places to visit in spring however the Mackinaw Bridge is near. Most people visit Mackinaw later since the island is a cool place to explore."
d = {}
for k in placement:
    if k not in d:
        d[k] = 0
    d[k] += 1

ks = d.keys()
min_value = list(ks)[0]  # Have to turn ks into a real list before using [] to select an item
for k in ks:
    if d[k] < d[min_value]:
        min_value = k
print(min_value)        


x


In [33]:
#Create a dictionary called lett_d that keeps track of all of the characters in the string product and notes how many times each character was seen. Then, find the key with the highest value in this dictionary and assign that key to max_value
product = "iphone and android phones"
lett_d = {}
for k in product:
    if k not in lett_d:
        lett_d[k] = 0
    lett_d[k] += 1

ks = lett_d.keys()
max_value = list(ks)[0]  # Have to turn ks into a real list before using [] to select an item
for k in ks:
    if lett_d[k] > lett_d[max_value]:
        max_value = k
print(max_value)


n


In [36]:
#Predict what will print out from the following code. If a line causes a run-time error, comment it out and see whether the rest of your predictions were correct.
d = {'apples': 15, 'grapes': 12, 'bananas': 35}
print(d['bananas'])
d['oranges'] = 20
print(len(d))
print('grapes' in d)
#print(d['pears'])
#print(d.get('pears', 0))
fruits = d.keys()
print(fruits)
del d['apples']
print('apples' in d)

35
4
True
dict_keys(['apples', 'grapes', 'bananas', 'oranges'])
False


Here’s a table of English to Pirate translations:

| English | Pirate |
| --- | --- |
| sir | matey |
| hotel | fleabag inn |
| student | swabbie |
| boy | matey |
| madam | proud beauty |
| professor | foul blaggart |
| restaurant | galley |
| your | yer |
| excuse | arr |
| students | swabbies |
| are | be |
| lawyer | foul blaggart |
| the | th’ |
| restroom | head |
| my | me |
| hello | avast |
| is | be |
| man | matey |

Write a program that asks the user for a sentence in English and then translates that sentence to Pirate.



In [38]:
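# English -> Pirate, taken from the table above
pirate = {
    'sir': 'matey', 'hotel': 'fleabag inn', 'student': 'swabbie',
    'boy': 'matey', 'madam': 'proud beauty', 'professor': 'foul blaggart',
    'restaurant': 'galley', 'your': 'yer', 'excuse': 'arr',
    'students': 'swabbies', 'are': 'be', 'lawyer': 'foul blaggart',
    'the': "th'", 'restroom': 'head', 'my': 'me',
    'hello': 'avast', 'is': 'be', 'man': 'matey',
}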

sentence = input("Please enter a sentence in English")

psentence = []
words = sentence.split()
for aword in words:
    if aword in pirate:
        psentence.append(pirate[aword])
    else:
        psentence.append(aword)

print(" ".join(psentence))


Please enter a sentence in Englishsir
matey


In [None]:
#Write a program that finds the most used 7 letter word in scarlet3.txt.
f = open('scarlet3.txt', 'r')
contents = f.read()
d = {}

for w in contents.split():
    if len(w) == 7:
        if w not in d:
            d[w] = 1
        else:
            d[w] = d[w] + 1

dkeys = list(d.keys())   # a dict_keys view cannot be indexed, so make a real list first
most_used = dkeys[0]
for k in dkeys:
    if d[k] > d[most_used]:
        most_used = k

print("The most used word is '"+most_used+"', which is used "+str(d[most_used])+" times")


In [39]:
#Write a program that allows the user to enter a string. It then prints a table of the letters of the alphabet in alphabetical order which occur in the string together with the number of times each letter occurs. Case should be ignored.
x = input("Enter a sentence")

x = x.lower()   # convert to all lowercase

alphabet = 'abcdefghijklmnopqrstuvwxyz'

letter_count = {} # empty dictionary
for char in x:
    if char in alphabet: # ignore any punctuation, numbers, etc
        if char in letter_count:
            letter_count[char] = letter_count[char] + 1
        else:
            letter_count[char] = 1

keys = letter_count.keys()
for char in sorted(keys):
    print(char, letter_count[char])


Enter a sentencekjyug
g 1
j 1
k 1
u 1
y 1
