# Introduction
In this section, we will turn our attention to the dictionary data structure. The dictionary is an extremely flexible data structure, but it is also quite complex and can be difficult to understand. We will start by looking at similarities and differences between lists and dictionaries. As with lists, we will cover issues related to accessing and modifying dictionaries. We conclude with a discussion of example uses for dictionaries.  

# Dictionaries
Like a list, a dictionary is a collection of values. In a list, however, the elements are organized as a sequence and accessed by position. In a dictionary, they are organized as a collection of key-value pairs and accessed by key. Historically, dictionaries had no defined order; since Python 3.7 they preserve the order in which keys were inserted, but elements still cannot be referenced by position the way list elements can. Like lists, dictionaries can hold numbers, strings, or even other lists and dictionaries.

Dictionaries can be thought of as similar in purpose and function to a typical English dictionary. Consider the following entry:
<br><img src="http://thislondonhouse.hopto.org/Jupyter/Images/10-Dictionaries-01.png" /><br>
Presumably, 'python' is an entry in a word dictionary. This entry has other associated data elements such as definitions (a list of definitions), parts of speech (a list of parts of speech), language of origin, phonetic spelling, pronunciation, etc. Each element has its own value(s). So, when we think about a word dictionary, we know it has a hierarchical structure. The dictionary datatype allows us to create similar structures within our Python applications.
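
The word-dictionary analogy can be sketched as a nested dictionary. The variable name, field names and values below are illustrative assumptions, not a transcription of the entry pictured above.

```python
# A hypothetical word-dictionary entry; all field names and values are illustrative.
wordDict = {
    'python': {
        'parts_of_speech': ['noun'],
        'definitions': ['a large constricting snake', 'a programming language'],
        'origin': 'Greek',
    }
}

# Each data element is reached through its key, one level at a time.
entry = wordDict['python']
print(entry['parts_of_speech'])
print(len(entry['definitions']))
```

The outer key identifies the word, and the inner keys identify the data elements that belong to it.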

In [2]:
myList = ['whether','I','shall','turn','out','to','be','the','hero','of','my','own','life','or','whether','that','station','will','be','held','by','anybody','else','these','pages','must','show'] 
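# Note: braces around bare values (no key: value pairs) create a set, not a dictionary
myDict = {'whether','I','shall','turn','out','to','be','the','hero','of','my','own','life','or','whether','that','station','will','be','held','by','anybody','else','these','pages','must','show'}
emptyDict = {}  # empty braces create an empty dictionary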

In [3]:
print(myList)
print(type(myList))
print(len(myList))

['whether', 'I', 'shall', 'turn', 'out', 'to', 'be', 'the', 'hero', 'of', 'my', 'own', 'life', 'or', 'whether', 'that', 'station', 'will', 'be', 'held', 'by', 'anybody', 'else', 'these', 'pages', 'must', 'show']
<class 'list'>
27


In [4]:
print(myDict)
print(type(myDict))
print(len(myDict))

{'of', 'will', 'turn', 'that', 'show', 'whether', 'my', 'shall', 'own', 'station', 'hero', 'by', 'anybody', 'be', 'pages', 'life', 'I', 'else', 'must', 'held', 'these', 'the', 'or', 'to', 'out'}
<class 'set'>
25


In [5]:
print(myList[0])

whether


In [6]:
print(myDict['whether'])

TypeError: 'set' object is not subscriptable
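
The TypeError arises because curly braces that hold bare values, with no key: value pairs, create a set rather than a dictionary. A minimal sketch of the difference, using hypothetical variable names:

```python
# Bare values in braces create a set (duplicates are dropped).
wordSet = {'whether', 'I', 'shall', 'whether'}
# key: value pairs in braces create a dictionary.
wordCounts = {'whether': 2, 'I': 1, 'shall': 1}

print(type(wordSet))          # <class 'set'>
print(type(wordCounts))       # <class 'dict'>
print(len(wordSet))           # 3, because the repeated 'whether' is stored once
print(wordCounts['whether'])  # 2, dictionaries support lookup by key
```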

In [7]:
for item in myList:
    print(item)

whether
I
shall
turn
out
to
be
the
hero
of
my
own
life
or
whether
that
station
will
be
held
by
anybody
else
these
pages
must
show


In [8]:
for item in myDict:
    print(item)

of
will
turn
that
show
whether
my
shall
own
station
hero
by
anybody
be
pages
life
I
else
must
held
these
the
or
to
out


These examples compare a list with a collection written in curly braces. The first variable is a list of words. The second, despite its name, is actually a set rather than a dictionary: curly braces around bare values, with no key: value pairs, create a set, which is why type() reports set, the repeated words are stored only once (25 elements instead of 27), and the lookup by 'whether' raises a TypeError. The comparison still shows how the two kinds of collection are similar: both hold many values and both can be traversed with a for loop. What sets dictionaries apart from lists is the flexibility afforded by organizing elements by key instead of by position. In a list, you need to know the position of an element to access the value stored there. In a dictionary, each value is stored under a key. Consider the following values:

In [9]:
aList = ['Charles Dickens', 'm', 58, [1812, 1870]] 
aDict = {'name':'Charles Dickens', 'gender':'m', 'age':58, 'life_range':[1812, 1870]} 

In [10]:
print(len(aList))
print(len(aDict))

4
4


In [11]:
print(aList[2])

58


In [12]:
print(aDict['age'])

58


In the first example, the elements are simply stored in an ordered list, and you need to know the position of each element to access it. If the list changes, your application has to keep track of those changes so that it doesn't lose access to values. In the second example, each value is stored under a descriptive key. However the rest of the dictionary changes over time, you can always access the author's name by referring to the 'name' key, as long as that key itself is not removed.

## Accessing Dictionaries 

### Referencing
To access a specific element in a list, you use bracket notation. You use the same notation for dictionaries, but dictionary elements are not addressed by position, so an index is meaningless. Instead of using an index to reference a value, you use a key.

In [13]:
print(aList[0]) 

Charles Dickens


In [14]:
print(aList[0] + " died in " + str(aList[3][1]) + " at age " + str(aList[2])) 

Charles Dickens died in 1870 at age 58


In [15]:
print(aDict['name']) 

Charles Dickens


In [17]:
print(aDict['name'] + " died in " + str(aDict['life_range'][1]) + " at age " + str(aDict['age'])) 

Charles Dickens died in 1870 at age 58


The first cell prints the first element in the list, and the third prints the value associated with the 'name' key. The values are the same, but the means of accessing them differ. The second and fourth cells build the same sentence, once from positions and once from keys.

### Traversing
When traversing a list, the loop starts at the first item and retrieves each item in order. When traversing a dictionary, the loop starts at the first key and visits each key in turn.

In [18]:
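# enumerate() pairs each element with its position; .index() would report
# the first match only, which is wrong if the list holds duplicate values.
for position, aItem in enumerate(aList):
    print("Item [" + str(position) + "]: " + str(aItem))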

Item [0]: Charles Dickens
Item [1]: m
Item [2]: 58
Item [3]: [1812, 1870]


In [19]:
for aKey in aDict: 
    print("Item ['" + str(aKey) + "']: " + str(aDict[aKey]))

Item ['life_range']: [1812, 1870]
Item ['age']: 58
Item ['name']: Charles Dickens
Item ['gender']: m


In [20]:
for aKey, aValue in aDict.items():
    print("Item ['" + str(aKey) + "']: " + str(aValue))

Item ['life_range']: [1812, 1870]
Item ['age']: 58
Item ['name']: Charles Dickens
Item ['gender']: m


In the first loop, Python steps through the list and sets the variable aItem equal to the current element. In the second loop, Python steps through each key and sets the aKey variable equal to that key; the corresponding value is then accessible via the retrieved key. The third loop uses .items() to retrieve each key and its value together.
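
As a side note on traversal order, the sketch below assumes Python 3.7 or later, where a dictionary is traversed in the order its keys were inserted. The output recorded above appears to come from an older version in which the order was arbitrary. The names demoDict and keyOrder are hypothetical.

```python
demoDict = {'name': 'Charles Dickens', 'gender': 'm', 'age': 58}
demoDict['life_range'] = [1812, 1870]   # added last, so it is traversed last

# Iterating over a dictionary yields its keys.
keyOrder = [aKey for aKey in demoDict]
print(keyOrder)
```

Even with a predictable traversal order, values are still retrieved by key, not by position.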

## Modifying Dictionaries
Dictionaries are mutable in many of the same ways that lists are. You can add and delete items, and a few list methods, such as .pop(), .copy() and .clear(), have dictionary counterparts.

### Adding Elements
When adding an element to a dictionary, you assign a value to a new key. Because values are stored and retrieved by key, it doesn't make sense to add an element without an associated key.


In [34]:
print(aDict)

{'novels': ['Great Expectations', 'David Copperfiled', 'Tale of Two Cities', 'Bleak House'], 'name': 'Charles Dickens', 'occupation': 'authors', 'age': 58, 'life_range': [1812, 1870], 'nationality': 'English', 'gender': 'm'}


In [25]:
aDict['occupation'] = 'authors' 

In [29]:
aDict.setdefault('nationality', 'British') 

'English'

In [31]:
novelsList = ["Great Expectations", "David Copperfiled", "Tale of Two Cities"] 
aDict['novels'] = novelsList 

As with lists, you can add strings, numbers, lists or even other dictionaries. The .setdefault() method creates a key and gives it an initial value only if the key does not already exist; in either case it returns the value now stored under that key. This makes it useful when you are unsure whether a key exists. In the run above, the call returned 'English' rather than 'British', which indicates that the 'nationality' key had already been set in an earlier run of the notebook, so its value was left alone.
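
Both branches of .setdefault() can be seen side by side in a small sketch. The dictionary demoDict and its contents are hypothetical.

```python
demoDict = {'name': 'Charles Dickens', 'nationality': 'English'}

# Key already exists: the stored value is returned and left unchanged.
existing = demoDict.setdefault('nationality', 'British')
# Key is missing: it is created with the given default, which is also returned.
created = demoDict.setdefault('occupation', 'author')

print(existing)
print(created)
print(demoDict)
```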

When referencing elements in a dictionary, you can edit those elements just as if they were individual variables. 

In [33]:
aDict['novels'].append("Bleak House") 

In [35]:
print(aDict['novels'])

['Great Expectations', 'David Copperfiled', 'Tale of Two Cities', 'Bleak House']


In [36]:
print(len(aDict['novels']))

4


In [37]:
print(aDict['novels'].index('Tale of Two Cities'))

2


In [38]:
print(aDict['novels'][3])

Bleak House


In [40]:
authorDict = {'Charles Dickens':{'gender':'m', 'age':58, 'life_range':[1812, 1870], 'novels':["Great Expectations", "David Copperfiled"]}}
authorDict['Jane Austen'] = {'gender':'f', 'age':41, 'life_range':[1787, 1817], 'novels':['Pride and Prejudice', 'Emma']}
authorDict['George Eliot'] = {'gender':'f', 'age':61, 'life_range':[1819, 1880], 'novels':['Middlemarch']}
print(authorDict.keys())

dict_keys(['Charles Dickens', 'George Eliot', 'Jane Austen'])


In [41]:
print(authorDict['George Eliot']['gender'])

f


**Note:** Keys in a dictionary must be unique. The example above assumes that no two authors in our data structure share a name. That is likely, but not guaranteed, and assigning to an existing key silently overwrites the earlier entry. Treat the example above as an illustration rather than a best practice. When you are storing multiple records of like data, it is often better to use a list to which each record is appended, as in the example below.

In [42]:
authorList = []
authorList.append({'name':'Charles Dickens', 'gender':'m', 'age':58, 'life_range':[1812, 1870], 'novels':["Great Expectations", "David Copperfiled"]})
authorList.append({'name':'Jane Austen', 'gender':'f', 'age':41, 'life_range':[1787, 1817], 'novels':['Pride and Prejudice', 'Emma']})
authorList.append({'name':'George Eliot', 'gender':'f', 'age':61, 'life_range':[1819, 1880], 'novels':['Middlemarch']})

In this code, a dictionary representing each author is appended to a list. Because the name is now stored as a value rather than used as a key, two authors with the same name would not collide, and all data associated with each author stay together in one dictionary.

In [43]:
for author in authorList:
    print(author['name'], "\t", author['gender'], "\t", author['age'], "\t", author['life_range'], "\t", author['novels'])

Charles Dickens 	 m 	 58 	 [1812, 1870] 	 ['Great Expectations', 'David Copperfiled']
Jane Austen 	 f 	 41 	 [1787, 1817] 	 ['Pride and Prejudice', 'Emma']
George Eliot 	 f 	 61 	 [1819, 1880] 	 ['Middlemarch']
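
One consequence of the list-of-dictionaries layout is that finding an author by name requires a search rather than a direct key lookup. A minimal sketch, using a shortened hypothetical list named demoAuthors:

```python
demoAuthors = [
    {'name': 'Charles Dickens', 'gender': 'm'},
    {'name': 'Jane Austen', 'gender': 'f'},
    {'name': 'George Eliot', 'gender': 'f'},
]

# Collect every record whose 'name' matches; records sharing a name would all be returned.
matches = [author for author in demoAuthors if author['name'] == 'Jane Austen']
print(matches)

# Collect names by a shared attribute.
femaleNames = [author['name'] for author in demoAuthors if author['gender'] == 'f']
print(femaleNames)
```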


### Deleting Elements
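In dictionaries, elements are deleted when keys are deleted. The .pop() method accepts a key, removes that key and its value from the dictionary, and returns the value. If the key does not exist, .pop() raises a KeyError unless a default value is supplied as a second argument.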

In [44]:
print(aList[0] + " died in " + str(aList[3][1]) + " at age " + str(aList[2]))
print(aDict['name'] + " died in " + str(aDict['life_range'][1]) + " at age " + str(aDict['age']))

Charles Dickens died in 1870 at age 58
Charles Dickens died in 1870 at age 58


In [45]:
aList.pop(1) 
print(aList[0] + " died in " + str(aList[3][1]) + " at age " + str(aList[2]))

IndexError: list index out of range

In [46]:
aDict.pop('gender') 
print(aDict['name'] + " died in " + str(aDict['life_range'][1]) + " at age " + str(aDict['age']))

Charles Dickens died in 1870 at age 58


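The code above illustrates an advantage of dictionaries. Because values are referenced by key rather than by position, removing one entry does not affect the way you reference the other elements in the data structure. In the list, by contrast, removing the element at position 1 shifted every later element, so aList[3] no longer existed.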
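
The different ways of removing a key can be sketched as follows. The dictionary demoDict and the 'height' key are hypothetical.

```python
demoDict = {'name': 'Charles Dickens', 'gender': 'm', 'age': 58}

removed = demoDict.pop('gender')        # removes the key and returns its value
missing = demoDict.pop('height', None)  # key absent: the default is returned, no error
del demoDict['age']                     # del removes a key without returning anything

try:
    demoDict.pop('height')              # no default given, so a KeyError is raised
    raisedKeyError = False
except KeyError:
    raisedKeyError = True

print(removed, missing, raisedKeyError, demoDict)
```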

### Dictionary Methods
Dictionaries share only a few of their methods with lists, and several useful methods are unique to dictionaries, including .keys() and .values(). The .keys() method returns a view of all keys in the dictionary (a dict_keys object that can be iterated over or converted to a list). This method is not recursive (meaning it won't return the keys of a dictionary inside a dictionary), and will only show the keys of the elements at the root level of the specified dictionary.

In [48]:
print(authorDict.keys()) 

dict_keys(['Charles Dickens', 'George Eliot', 'Jane Austen'])


In [49]:
if 'Charles Dickens' in authorDict.keys(): 
    print(True) 
else: 
    print(False) 

True


The .keys() method is useful because referencing a key that does not exist raises a KeyError. Testing for membership first, either in authorDict.keys() or simply in authorDict itself, verifies that a key exists without causing an error. The .setdefault() method takes a different approach: rather than checking, it guarantees that the key exists by creating it when necessary.
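
A few ways of guarding against a missing key are sketched below, using a shortened hypothetical dictionary named demoAuthors. The .get() method is a standard dictionary method that is not otherwise covered in this section.

```python
demoAuthors = {'Charles Dickens': {'age': 58}, 'George Eliot': {'age': 61}}

# The in operator tests keys directly; calling .keys() first is optional.
hasDickens = 'Charles Dickens' in demoAuthors
hasAusten = 'Jane Austen' in demoAuthors

# .get() returns a fallback (None unless specified) instead of raising a KeyError.
austenData = demoAuthors.get('Jane Austen')
eliotAge = demoAuthors.get('George Eliot', {}).get('age')

print(hasDickens, hasAusten, austenData, eliotAge)
```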

Similarly, the .values() method returns a view of all values in the dictionary. Like .keys(), it is not recursive: a nested dictionary is returned whole as a single value, with its structure intact, so a value stored inside a subdictionary (such as 'f' below) is not itself a member of the result.

In [50]:
print(authorDict.values()) 

dict_values([{'life_range': [1812, 1870], 'novels': ['Great Expectations', 'David Copperfiled'], 'gender': 'm', 'age': 58}, {'life_range': [1819, 1880], 'novels': ['Middlemarch'], 'gender': 'f', 'age': 61}, {'life_range': [1787, 1817], 'novels': ['Pride and Prejudice', 'Emma'], 'gender': 'f', 'age': 41}])


In [51]:
if 'f' in authorDict.values(): 
    print(True) 
else: 
    print(False) 
print(len(authorDict.values()))

False
3
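
The membership test prints False because the top-level values here are whole dictionaries. To test a field inside each nested dictionary, one option is the built-in any() function. A sketch with a shortened hypothetical dictionary:

```python
demoAuthors = {
    'Charles Dickens': {'gender': 'm'},
    'George Eliot': {'gender': 'f'},
}

# 'f' is not itself a top-level value, so this membership test fails.
topLevel = 'f' in demoAuthors.values()
# Look inside each nested dictionary instead.
nested = any(data['gender'] == 'f' for data in demoAuthors.values())

print(topLevel, nested)
```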


If we wanted to traverse our dictionary and look for entries that have specific values, we could use the .items() method, which returns an iterable view of (key, value) tuples, one for each entry in the dictionary. See the code below:

In [47]:
for author, authorData in authorDict.items():
    if authorData['gender'] == 'f':
        print(author + ' is female')

George Eliot is female
Jane Austen is female


Another useful method is .copy(). Assigning an existing dictionary to a new variable does not create a copy; it creates a second reference to the same dictionary, which means that any change made through either variable is visible through the other. To create an independent copy, use the .copy() method. Note that .copy() is shallow: nested lists or dictionaries are still shared between the original and the copy. Consider the following code:

In [57]:
aDict = {'name':'Charles Dickens', 'gender':'m', 'age':58, 'life_range':[1812, 1870]} 

In [59]:
aNewDict = aDict

In [61]:
aNewDict = aDict.copy()

In [62]:
aDict.pop('age')

58

In [63]:
print(aNewDict)
print(aDict)

{'life_range': [1812, 1870], 'name': 'Charles Dickens', 'gender': 'm', 'age': 58}
{'life_range': [1812, 1870], 'name': 'Charles Dickens', 'gender': 'm'}


Because aNewDict was rebound to an independent copy before 'age' was removed, only aDict lost the key.
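
The difference between a reference, a shallow copy and a deep copy can be sketched as follows. The variable names and the 'novels' entry are hypothetical, and copy.deepcopy() comes from the standard library's copy module, which this section does not otherwise use.

```python
import copy

original = {'name': 'Charles Dickens', 'novels': ['Bleak House']}

alias = original                 # a second name for the same dictionary
shallow = original.copy()        # a new dictionary whose nested list is still shared
deep = copy.deepcopy(original)   # a new dictionary with its own nested list

original['age'] = 58                             # top-level change
original['novels'].append('Great Expectations')  # change inside the nested list

print('age' in alias, 'age' in shallow, 'age' in deep)
print(len(alias['novels']), len(shallow['novels']), len(deep['novels']))
```

The top-level change shows up only through the alias, while the change to the nested list shows up through both the alias and the shallow copy.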

### Dictionary Functions
Many of the built-in functions that work on lists also work on dictionaries. The sorted() function accepts a dictionary and returns a list of its keys in sorted order (alphabetical for string keys). This is useful when you want to impose an order other than the one in which the keys were inserted.

In [64]:
sortedKeys = sorted(aDict)
print(sortedKeys)

['gender', 'life_range', 'name']


Consider the following example that creates a dictionary of common words. This dictionary is used to keep track of the number of times each common word appears in a string. First, the dictionary is initialized with each key representing a common word and each value set to zero (the number of times the word has appeared so far). The string is then split on whitespace to create a list of words, and each word is checked against the dictionary's keys; when a word matches a key, that key's count is incremented.

In [65]:
copperFieldIntro = """Whether I shall turn out to be the hero of my own life, or whether that station will be held by anybody else, these pages must show. To begin my life with the beginning of my life, I record that I was born (as I have been informed and believe) on a Friday, at twelve o'clock at night. It was remarked that the clock began to strike, and I began to cry, simultaneously."""
commonWordDict = {'to':0,'be':0,'the':0,'of':0,'or':0,'that':0,'and':0}
for aWord in copperFieldIntro.split():
    if aWord in commonWordDict.keys():
        print("Found '" + aWord + "' key")
        commonWordDict[aWord] += 1
print(commonWordDict)

Found 'to' key
Found 'be' key
Found 'the' key
Found 'of' key
Found 'or' key
Found 'that' key
Found 'be' key
Found 'the' key
Found 'of' key
Found 'that' key
Found 'and' key
Found 'that' key
Found 'the' key
Found 'to' key
Found 'and' key
Found 'to' key
{'be': 2, 'of': 2, 'the': 3, 'or': 1, 'that': 3, 'to': 3, 'and': 2}
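
Splitting on whitespace leaves punctuation attached and preserves capitalization, so a token such as 'To' does not match the lowercase key 'to'. One possible refinement, sketched here on a shorter sample with hypothetical variable names, is to normalize each token before the lookup:

```python
import string

sampleText = "To begin my life with the beginning of my life, I record that I was born."
sampleCounts = {'to': 0, 'the': 0, 'of': 0, 'that': 0}

for rawWord in sampleText.split():
    # Lowercase the token and strip leading/trailing punctuation before matching.
    cleanWord = rawWord.lower().strip(string.punctuation)
    if cleanWord in sampleCounts:
        sampleCounts[cleanWord] += 1

print(sampleCounts)
```

Without the normalization step, the capitalized 'To' at the start of the sentence would go uncounted.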


The next blocks of code consider different ways of presenting the data stored in the common words dictionary. In the first block, the keys are sorted alphabetically and used to iterate through the elements in the dictionary. In the second block, the keys are sorted by their values in descending order, and this list is then used to iterate through the dictionary elements. The third block finds the most frequent word by tracking the key with the highest count seen so far; when several words tie, it keeps the first one encountered.

In [66]:
print(sorted(commonWordDict))
for word in sorted(commonWordDict):
    print(word, commonWordDict[word])

['and', 'be', 'of', 'or', 'that', 'the', 'to']
and 2
be 2
of 2
or 1
that 3
the 3
to 3


In [67]:
# Sort keys by their counts, highest first; .get looks up each key's value.
sortedKeys = sorted(commonWordDict, key=commonWordDict.get, reverse=True)
for word in sortedKeys:
    print(word, commonWordDict[word])

the 3
that 3
to 3
be 2
of 2
and 2
or 1


In [68]:
freqWord = ""
for key, value in commonWordDict.items():
    if freqWord == "" or commonWordDict[freqWord] < value:
        freqWord = key

print(freqWord)

the
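
The same search can be written more compactly with the built-in max() function. The sketch below uses a hypothetical dictionary named demoCounts and assumes Python 3.7 or later, where traversal follows insertion order; on a tie, max() returns the first maximal key it encounters.

```python
demoCounts = {'to': 3, 'be': 2, 'the': 3, 'or': 1}

# max() compares keys by their counts; on a tie it returns the first one encountered.
mostFrequent = max(demoCounts, key=demoCounts.get)

# Collect every key that shares the highest count.
topCount = max(demoCounts.values())
tiedWords = [word for word, count in demoCounts.items() if count == topCount]

print(mostFrequent, tiedWords)
```

Collecting the tied words makes it explicit that more than one key can share the highest count.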


# Exercise
Write code to create an empty dictionary. Add entries for your immediate family. Use their names as keys and let each key contain a dictionary as a value. The nested dictionary should contain keys and values about each member of your family (e.g., age, height, birthday, etc.).

In [None]:
# Step 1...

# Step 2...