Dictionaries

@Authors: Sridhar Nerur, Samuel Jayarajan, and Mahyar Vaghefi

In this IPython notebook, we will examine dictionaries, which are built-in containers that store key-value pairs. A lot of publicly available data is stored in JSON (JavaScript Object Notation) format, and a JSON object maps directly onto a Python dictionary. So, what you learn in this notebook will be very useful for extracting information from JSON files.
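To make the JSON connection concrete, here is a minimal sketch using the standard-library json module. The JSON text and the name record are made up for illustration and are not part of this notebook's data.

```python
import json

# A small JSON document (hypothetical data, written inline for illustration)
json_text = '{"country": "Norway", "capital": "Oslo", "population_millions": 5.5}'

# json.loads parses a JSON string; a JSON object becomes a Python dict
record = json.loads(json_text)

print(type(record))       # <class 'dict'>
print(record["capital"])  # Oslo
```

Once the data is a dictionary, everything covered in the rest of this notebook applies to it.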
It is very important to know that keys must be hashable, which in practice means immutable. Therefore, strings and tuples (whose items are themselves immutable) are appropriate for keys, but not lists or sets.
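The restriction on keys can be checked directly. The sketch below uses a hypothetical dictionary named coordinates: a tuple works as a key, while a list raises a TypeError.

```python
# A tuple of immutable items is hashable, so it can serve as a key
coordinates = {}
coordinates[(59.91, 10.75)] = "Oslo"

# A list is mutable and unhashable, so using it as a key raises TypeError
try:
    coordinates[[59.91, 10.75]] = "Oslo"
    list_key_error = None
except TypeError as err:
    list_key_error = str(err)

print(coordinates)     # only the tuple key was stored
print(list_key_error)  # the message explains that a list is unhashable
```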

In [1]:
#Let us start by creating an empty dictionary
capitals = {} #dictionary to store countries and their capitals
#Let us add a country as a key and its capital as the value
capitals["USA"] = "Washington D.C."
#let us display our dictionary
capitals

{'USA': 'Washington D.C.'}

Note that {} is an empty dictionary and NOT a set. Remember how we created an empty set? We would use set() to create an empty set.
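A quick way to confirm this is to ask Python for the type of each object. The variable names below are illustrative only.

```python
empty_dict = {}    # empty braces always create a dictionary
empty_set = set()  # an empty set needs the set() constructor

print(type(empty_dict))  # <class 'dict'>
print(type(empty_set))   # <class 'set'>

# Braces create a set only when they contain bare items (no key: value pairs)
countries = {"USA", "Japan"}
print(type(countries))   # <class 'set'>
```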

In [2]:
#let us add a few more countries and their capitals
capitals['Japan'] = 'Tokyo'
capitals['Norway'] = 'Oslo'
capitals['Denmark'] = 'Copenhagen'
capitals

{'USA': 'Washington D.C.',
 'Japan': 'Tokyo',
 'Norway': 'Oslo',
 'Denmark': 'Copenhagen'}

In [3]:
#Let us retrieve the capital of Norway - remember Norway is the key
capitals['Norway']

'Oslo'

What if we use a key that is not in our dictionary?

In [4]:
capitals['India'] #will give an error

KeyError: 'India'

In [5]:
#Is there a better way to retrieve capitals? Yes, use get() instead.
capitals.get('India', "India is not in the dictionary")

'India is not in the dictionary'

In [6]:
capitals.get('Japan', 'Capital of Japan not found')

'Tokyo'

Note that get() returns the value (here, a capital) if the key exists; otherwise, it returns the second parameter. In the example above, 'Japan' is a key, so 'Tokyo' is returned and the default 'Capital of Japan not found' is never used.
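Two further details of get() may be worth seeing in code: the second parameter is optional (None is returned when it is omitted), and the default makes counting patterns compact. This is a small sketch with a made-up letter_counts dictionary.

```python
capitals = {"USA": "Washington D.C.", "Norway": "Oslo"}

# With no second argument, get() returns None for a missing key
missing = capitals.get("India")
print(missing)  # None

# A common use of the default: counting occurrences without a KeyError
letter_counts = {}
for letter in "oslo":
    letter_counts[letter] = letter_counts.get(letter, 0) + 1

print(letter_counts)  # {'o': 2, 's': 1, 'l': 1}
```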

In [7]:
#can we check to see if a country appears as a key in the dictionary?
'Canada' in capitals

False

In [8]:
'USA' in capitals

True
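Keep in mind that the in operator looks only at keys. To test whether something appears as a value, check against values() instead, as in this short sketch.

```python
capitals = {"USA": "Washington D.C.", "Norway": "Oslo"}

oslo_is_key = "Oslo" in capitals             # searches the keys only
oslo_is_value = "Oslo" in capitals.values()  # searches the values

print(oslo_is_key)    # False
print(oslo_is_value)  # True
```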

In [9]:
#deleting a country/key from capitals
del capitals['Japan']
capitals #should not have Japan in it

{'USA': 'Washington D.C.', 'Norway': 'Oslo', 'Denmark': 'Copenhagen'}

In [10]:
#Displaying the keys in our dictionary
capitals.keys()

dict_keys(['USA', 'Norway', 'Denmark'])

In [11]:
#keys() returns a view object; if you need a list, convert it as follows
list(capitals.keys())

['USA', 'Norway', 'Denmark']

In [12]:
#How about the values or capitals in our case?
list(capitals.values())

['Washington D.C.', 'Oslo', 'Copenhagen']

In [13]:
#We can also get the key-value pairs in tuples, as shown below
list(capitals.items())

[('USA', 'Washington D.C.'), ('Norway', 'Oslo'), ('Denmark', 'Copenhagen')]
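In practice, items() is most often used in a for loop, unpacking each tuple into a key and a value. The sentence format below is just an illustration.

```python
capitals = {"USA": "Washington D.C.", "Norway": "Oslo", "Denmark": "Copenhagen"}

# Each item is a (key, value) tuple, which the loop unpacks into two names
sentences = []
for country, capital in capitals.items():
    sentences.append(f"The capital of {country} is {capital}")

for sentence in sentences:
    print(sentence)
```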

In [14]:
#len() works with dictionaries too. It tells you how many keys
#are in the dictionary
len(capitals)

3

In [19]:
#Dictionaries can be easily sorted by key or by value as shown below
import operator
sorted_list = sorted(capitals.items(), key = operator.itemgetter(0))
sorted_list

[('Denmark', 'Copenhagen'), ('Norway', 'Oslo'), ('USA', 'Washington D.C.')]

Note that sorted takes tuples containing countries and their capitals and sorts them by the first element of the tuple (which is the country). The second parameter, key = operator.itemgetter(0) indicates that the sort should be based on the first element of the tuple. Had it been key = operator.itemgetter(1), it would have sorted by value.
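An equivalent way to write the sort key, without importing operator, is a lambda that picks the element to sort on. The sketch below also shows the reverse parameter of sorted(); the variable names are illustrative.

```python
capitals = {"USA": "Washington D.C.", "Norway": "Oslo", "Denmark": "Copenhagen"}

# pair[1] is the value (capital), so this sorts by capital
by_value = sorted(capitals.items(), key=lambda pair: pair[1])

# pair[0] is the key (country); reverse=True sorts in descending order
by_key_desc = sorted(capitals.items(), key=lambda pair: pair[0], reverse=True)

print(by_value)
# [('Denmark', 'Copenhagen'), ('Norway', 'Oslo'), ('USA', 'Washington D.C.')]
print(by_key_desc)
# [('USA', 'Washington D.C.'), ('Norway', 'Oslo'), ('Denmark', 'Copenhagen')]
```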

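Also, note that the list of tuples can be easily converted to a dictionary by passing it to the built-in dict() constructor, as shown below. Since Python 3.7, dictionaries preserve insertion order, so the resulting dictionary keeps the sorted order.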

In [20]:
sorted_dictionary = dict(sorted_list)
sorted_dictionary

{'Denmark': 'Copenhagen', 'Norway': 'Oslo', 'USA': 'Washington D.C.'}

In [21]:
capitals

{'USA': 'Washington D.C.', 'Norway': 'Oslo', 'Denmark': 'Copenhagen'}

In [22]:
#let us sort by value (for this data, the result happens to match the sort by key)
sorted_dictionary = dict(sorted(capitals.items(), key = operator.itemgetter(1)))
sorted_dictionary

{'Denmark': 'Copenhagen', 'Norway': 'Oslo', 'USA': 'Washington D.C.'}

In [24]:
#Updating a dictionary using contents from another dictionary
#Suppose we have the following dictionary
more_capitals = {"India": "New Delhi", "China": "Beijing"}
#Let us update capitals with this dictionary
capitals.update(more_capitals)
capitals

{'USA': 'Washington D.C.',
 'Norway': 'Oslo',
 'Denmark': 'Copenhagen',
 'India': 'New Delhi',
 'China': 'Beijing'}
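One detail that the example above does not exercise: when a key already exists, update() overwrites its value rather than adding a duplicate. A minimal sketch, using a hypothetical corrections dictionary:

```python
capitals = {"USA": "Washington D.C.", "Norway": "Oslo"}

# "USA" already exists in capitals; "Sweden" does not
corrections = {"USA": "Washington, D.C.", "Sweden": "Stockholm"}
capitals.update(corrections)

print(capitals)
# {'USA': 'Washington, D.C.', 'Norway': 'Oslo', 'Sweden': 'Stockholm'}
```

The existing key keeps its original position, and new keys are appended at the end.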

In [1]:
#Going from lists to list of tuples to dictionary
#Suppose we wish to create a dictionary of authors and the
#books they have written given the following lists
authors = ['PG Wodehouse', "Louis L'Amour", "Edgar Wallace"]
books = [["Little Nugget", "The Return of Jeeves"],
         ["To Tame a Land", "The Sacketts"], 
         ["THe Council of Justice", "The Four Just Men"]]
#We will use a function called zip to create a list of tuples
#containing the author and the list of books each has written.

author_books = zip(authors, books)
list(author_books)

[('PG Wodehouse', ['Little Nugget', 'The Return of Jeeves']),
 ("Louis L'Amour", ['To Tame a Land', 'The Sacketts']),
 ('Edgar Wallace', ['THe Council of Justice', 'The Four Just Men'])]
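A point that is easy to miss: zip returns an iterator, which can be consumed only once. After list() has walked through it, nothing is left, so zip has to be called again before the pairs can be reused. The shortened lists below are for illustration only.

```python
authors = ["PG Wodehouse", "Edgar Wallace"]
books = [["Little Nugget"], ["The Four Just Men"]]

pairs = zip(authors, books)
first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # the iterator is now exhausted

print(first_pass)
# [('PG Wodehouse', ['Little Nugget']), ('Edgar Wallace', ['The Four Just Men'])]
print(second_pass)
# []
```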

In [4]:
#Let us convert the list of authors and the respective books they have written
#into a dictionary (zip is called again because the earlier iterator was used up)
author_books = zip(authors, books)
d_author_books = dict(author_books)
d_author_books

{'PG Wodehouse': ['Little Nugget', 'The Return of Jeeves'],
 "Louis L'Amour": ['To Tame a Land', 'The Sacketts'],
 'Edgar Wallace': ['THe Council of Justice', 'The Four Just Men']}
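A dictionary comprehension is another way to build the same kind of dictionary, and it is handy when the keys or values need some processing along the way. This is an alternative sketch, not the approach used in the cell above; book_counts is a made-up name.

```python
authors = ["PG Wodehouse", "Louis L'Amour"]
books = [["Little Nugget", "The Return of Jeeves"],
         ["To Tame a Land", "The Sacketts"]]

# Same result as dict(zip(authors, books))
d_author_books = {author: titles for author, titles in zip(authors, books)}

# A comprehension can also transform the values, e.g. count each author's books
book_counts = {author: len(titles) for author, titles in d_author_books.items()}

print(book_counts)  # {'PG Wodehouse': 2, "Louis L'Amour": 2}
```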

In [46]:
#Let us look at all the methods in a dictionary
help(dict)

Help on class dict in module builtins:

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)
 |  
 |  Methods defined here:
 |  
 |  __contains__(self, key, /)
 |      True if D has a key k, else False.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |

In [6]:
#let us see what pop does
#let us pop one of our authors - it should return the value and remove the author
d_author_books.pop("Louis L'Amour")

['To Tame a Land', 'The Sacketts']

In [7]:
d_author_books #Louis L'Amour should have been removed

{'PG Wodehouse': ['Little Nugget', 'The Return of Jeeves'],
 'Edgar Wallace': ['THe Council of Justice', 'The Four Just Men']}

In [8]:
#What if we try to pop an author who is not in the dictionary?
d_author_books.pop("JK Rowling")

KeyError: 'JK Rowling'

In [9]:
#we can avoid the error by doing the following instead
d_author_books.pop("JK Rowling", "Author not found")

'Author not found'

In [10]:
#popitem() removes and returns the most recently added key-value pair
#(last in, first out since Python 3.7; arbitrary in older versions)
d_author_books.popitem()

('Edgar Wallace', ['THe Council of Justice', 'The Four Just Men'])

In [11]:
d_author_books

{'PG Wodehouse': ['Little Nugget', 'The Return of Jeeves']}
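In Python 3.7 and later, popitem() follows last-in, first-out order: it removes the pair that was inserted most recently. The small sketch below, with a made-up scores dictionary, shows the order in which pairs come out.

```python
scores = {}
scores["first"] = 1
scores["second"] = 2
scores["third"] = 3

last_pair = scores.popitem()  # removes the most recently inserted pair
next_pair = scores.popitem()  # then the one inserted before it

print(last_pair)  # ('third', 3)
print(next_pair)  # ('second', 2)
print(scores)     # {'first': 1}
```

Calling popitem() on an empty dictionary raises a KeyError.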

Methods such as clear() are self-explanatory. You may remember the copy() method from our lesson on lists. copy() creates a shallow copy.
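What "shallow" means is easiest to see when the values are mutable, such as lists. In the sketch below, the copy gets its own set of keys but shares the list objects with the original; copy.deepcopy from the standard library is shown as one way to get fully independent values.

```python
import copy

original = {"PG Wodehouse": ["Little Nugget"]}
shallow = original.copy()

# Adding a key to the copy does not affect the original
shallow["Edgar Wallace"] = ["The Four Just Men"]

# Mutating a shared list DOES affect the original
shallow["PG Wodehouse"].append("The Return of Jeeves")

print(original)
# {'PG Wodehouse': ['Little Nugget', 'The Return of Jeeves']}

# A deep copy duplicates the nested lists as well
deep = copy.deepcopy(original)
deep["PG Wodehouse"].append("Leave It to Psmith")

print(len(original["PG Wodehouse"]))  # 2
print(len(deep["PG Wodehouse"]))      # 3
```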

That covers the essentials of dictionaries. We will use this structure quite frequently throughout the course.