# Dictionaries and structuring data

In [1]:
myCat = {"size": "fat", "color": "grey", "disposition": "loud"}
print (myCat["size"])

fat


In [2]:
print ("My cat has " + myCat["color"] + " fur.")

My cat has grey fur.


Like a list, a dictionary is a collection of many values. But unlike indexes for lists, indexes for dictionaries can use many different data types, not just integers. Indexes for dictionaries are called keys, and a key with its associated value is called a key-value pair.

Dictionaries can still use integer values as keys, just like lists use integers for indexes, but they do not have to start at 0 and can be any number.
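
As a small illustration of integer keys, the sketch below uses an invented `room_names` dictionary whose keys neither start at 0 nor run consecutively:

```python
# Hypothetical example: room numbers used as integer keys.
room_names = {101: "lobby", 250: "library", 7: "storage"}

print(room_names[250])        # looks up the key 250, not "position 250"

room_names[-3] = "basement"   # any integer can serve as a key
print(len(room_names))
```

The number inside the square brackets is matched against the keys, so `room_names[0]` would raise a `KeyError` here because no key `0` was ever stored.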

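Unlike list items, the items in a dictionary are not accessed by position. The first item in a list named `spam` would be `spam[0]`, but a dictionary has no index 0 unless you create a key `0` yourself. Since Python 3.7 a dictionary does remember the order in which its keys were inserted, yet that order plays no part in comparisons: while the order of items matters for determining whether two lists are the same, it does not matter in what order the key-value pairs are typed in a dictionary.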

List:


In [3]:
spam = ["cats", "dogs", "moose"]
bacon = ["dogs", "moose", "cats"]
spam == bacon

False

Dictionary:

In [4]:
person1 = {"name": "Silfa", "species": "human", "age": "33"}
person2 = {"species": "human", "age": "33", "name": "Silfa"}
person1 == person2


True

Because dictionaries have no positional indexes, they can’t be sliced like lists.
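
If you do need a slice-like subset of a dictionary, one possible workaround is to convert its key-value pairs to a list first. This is only a sketch: the `pet` dictionary is invented, and the result relies on Python 3.7 or later keeping keys in insertion order.

```python
pet = {"name": "Zophie", "species": "cat", "age": 7}

# A dictionary has no positions, so turn the key-value pairs into a list first.
pairs = list(pet.items())

# Slice the list, then rebuild a dictionary from the selected pairs.
first_two = dict(pairs[:2])
print(first_two)
```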

Even without positional indexes, the fact that you can have arbitrary values for the keys allows you to organize your data in powerful ways. Say you wanted your program to store data about your friends’ birthdays. You can use a dictionary with the names as keys and the birthdays as values.

In [5]:
# Create an initial dictionary and store it in birthdays
birthdays = {"Alice": "Apr 1", "Bob": "Dec 12", "Carol": "Mar 4"}

while True:
    # Ask for a name; a blank entry ends the loop
    print ("Enter a name: (blank to quit)")
    name = input()
    if name == "":
        break

    # The "in" keyword checks whether the entered name exists as a key.
    # If it does, access the associated value with square brackets.
    if name in birthdays:
        print (name + "'s birthday is on " + birthdays[name])
    else:
        # The name is not yet in the dictionary, so add it
        print ("I do not have birthday information for " + name)
        print ("What is their birthday?")
        bday = input()
        birthdays[name] = bday
        print ("Birthday database updated")

Enter a name: (blank to quit)
Duncan
I do not have birthday information for Duncan
What is their birthday?
2 July 1984
Birthday database updated
Enter a name: (blank to quit)
Silfa
I do not have birthday information for Silfa
What is their birthday?
28 July 1984
Birthday database updated
Enter a name: (blank to quit)
Duncan
Duncan's birthday is on 2 July 1984
Enter a name: (blank to quit)
Soofa
I do not have birthday information for Soofa
What is their birthday?
quit
Birthday database updated
Enter a name: (blank to quit)



## The keys(), values() and items() methods


There are three dictionary methods that will return list-like values of the dictionary’s keys, values, or both keys and values: `keys()`, `values()`, and `items()`. The values returned by these methods are not true lists: They cannot be modified and do not have an `append()` method. But these data types (`dict_keys`, `dict_values`, and `dict_items`, respectively) can be used in for loops. 

In [6]:
spam1 = {"color": "red", "age": "42"}
for v in spam1.values():
    print(v)

red
42


A `for` loop iterates over each of the values in the `spam1` dictionary. A `for` loop can also iterate over the keys or both keys and values.

In [7]:
for k in spam1.keys():
    print(k)

color
age


In [8]:
for i in spam1.items():
    print(i)

('color', 'red')
('age', '42')


Using the `keys()`, `values()`, and `items()` methods, a `for` loop can iterate over the keys, values or key-value pairs in a dictionary, respectively. Notice that the values in the `dict_items` value returned by the `items()` method are tuples of the key and value. 

If you want a true list from one of these methods, pass its list-like return value to the `list()` function. 

In [9]:
spam1.keys()

dict_keys(['color', 'age'])

In [10]:
list(spam1.keys())

['color', 'age']

The `list(spam1.keys())` line takes the `dict_keys` value returned from `keys()` and passes it to `list()`, which then returns a list value of `['color', 'age']`.
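
To make the difference between a view and a true list concrete, here is a short sketch (the `pet` dictionary is invented for illustration). It shows that a `dict_keys` value has no `append()` method, while the list built from it does:

```python
pet = {"color": "red", "age": "42"}
keys_view = pet.keys()

print(type(keys_view).__name__)       # name of the view's type
print(hasattr(keys_view, "append"))   # views have no append() method

keys_list = list(keys_view)           # a real, independent list
keys_list.append("size")
print(keys_list)
print(pet)                            # the dictionary itself is unchanged
```

Appending to `keys_list` only changes the list; to add a key to the dictionary you still assign with square brackets, as in `pet["size"] = "small"`.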

You can also use the multiple assignment trick in a `for` loop to assign the key and value to separate variables. 

In [11]:
for k, v in spam1.items():
    print("Key: " + k + ", Value: " + str(v))

Key: color, Value: red
Key: age, Value: 42


## Checking whether a key or value exists in a dictionary

The `in` and `not in` operators can check whether a value exists in a list. You can also use these operators to see whether a certain key or value exists in a dictionary.

In [12]:
spam3 = {"name": "Silfa", "age": "33"}
"name" in spam3.keys()

True

In [13]:
"Silfa" in spam3.values()

True

In [14]:
"Color" not in spam3.keys()

True

In [15]:
"color" in spam3.keys()

False

In [16]:
"color" in spam3 #a shorter version of the previous command

False



If you ever want to check whether a value is, or is not, a key in the dictionary, you can simply use the `in` or `not in` operator with the dictionary value itself.

In [17]:
"Silfa" in spam1

False

In [18]:
"Silfa" in spam1.values()

False

## The get() method

It's tedious to check whether a key exists in a dictionary before accessing that key's value. Fortunately, dictionaries have a `get()` method that takes two arguments: the key of the value to retrieve and a fallback value to return if that key does not exist. 

In [19]:
picnic_items = {"apples": 5, "cups": 2}
print ("I am bringing " + str(picnic_items.get("cups", 0)) + " cups.")

I am bringing 2 cups.


In [20]:
print ("I am bringing also " + str(picnic_items.get("eggs", 0)) + " eggs.")

I am bringing also 0 eggs.


Because there is no `eggs` key in the `picnic_items` dictionary, the default value `0` is returned by the `get()` method. Without using `get()`, the code would have caused an error message, such as in the following example:

In [21]:
print ("I am bringing " + str(picnic_items["eggs"]) + " eggs.")

KeyError: 'eggs'
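
The fallback argument is optional: when it is left out, `get()` returns `None` for a missing key instead of raising an error. A brief sketch reusing the same picnic data:

```python
picnic_items = {"apples": 5, "cups": 2}

eggs = picnic_items.get("eggs")   # no fallback given
print(eggs)                       # None, and no KeyError is raised

# get() only reads; it never adds the missing key to the dictionary.
print("eggs" in picnic_items)
```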

## The setdefault() method

You will often have to set a value in a dictionary for a certain key only if that key does not already have a value. For example:

In [None]:
spam4 = {"name": "Pooka", "age": "3"}

if "color" not in spam4:
    spam4["color"] = "black"

print (spam4)

The `setdefault()` method offers a way to do this in one line of code. The first argument passed to the method is the key to check for, and the second argument is the value to set at that key if the key does not exist. If the key does exist, the `setdefault()` method returns the key's value. 

In [None]:
spam4.setdefault("size", "short")



Once a key exists, calling `setdefault()` again with a different value does not change the value stored for that key.

In [None]:
print (spam4)


In [None]:
spam4.setdefault("size", "large")

In [None]:
print (spam4)

The `setdefault()` method is a nice shortcut to ensure that a key exists. 
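
Since the cells above are shown without their output, here is a self-contained sketch of the same behaviour. The variable names `pet`, `first`, and `second` are chosen for this illustration only:

```python
pet = {"name": "Pooka", "age": "3"}

# Key is missing: setdefault() stores the value and returns it.
first = pet.setdefault("size", "short")

# Key now exists: the stored value is returned and the new value is ignored.
second = pet.setdefault("size", "large")

print(first, second)
print(pet)
```

Both calls return `"short"`, because the second call finds the key already present and leaves it alone.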


### Counting letters in a string
Here is a short program that counts the number of occurrences of each letter in a string. 

The program loops over each character in the `message` variable's string, counting how often each character appears. The `setdefault()` method call ensures that the key is in the `count` dictionary (with the default value of 0) so that the program doesn't throw a `KeyError` when `count[character] + 1` is executed. 

In [None]:
message = "It was a bright cold day in April, and the clocks were striking thirteen."
count = {}

for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1

print (count)


print (list(count.values()))

From the output, you can see that the lowercase letter c appears 3 times, the space character appears 13 times, and the uppercase A appears 1 time. This program will work no matter what string is inside the `message` variable, even if the string is millions of characters long!
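
Those three counts can be checked directly. The sketch below repeats the counting loop and then cross-checks every tally against the built-in `str.count()` method; the cross-check is an addition for illustration, not part of the original program.

```python
message = "It was a bright cold day in April, and the clocks were striking thirteen."
count = {}

for character in message:
    count.setdefault(character, 0)
    count[character] = count[character] + 1

# Look up the three characters mentioned in the text.
print(count["c"], count[" "], count["A"])

# Every tally should agree with what str.count() reports for that character.
all_match = all(tally == message.count(character)
                for character, tally in count.items())
print(all_match)
```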

### Pretty printing

If you import the `pprint` module into your programs, you'll have access to the `pprint()` and `pformat()` functions that will "pretty print" a dictionary's values. This is helpful when you want a cleaner display of the items in a dictionary than what `print()` provides. 

In [None]:
import pprint
pprint.pprint(count)


This output looks much cleaner, and the keys are sorted.

The `pprint.pprint()` function is especially helpful when the dictionary itself contains nested lists or dictionaries.


If you want to obtain the prettified text as a string value instead of displaying it on the screen, call `pprint.pformat()` instead. 


In [None]:
print(pprint.pformat(count))
print(type(pprint.pformat(count)))
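
To see the effect on nested data, here is a sketch with an invented `guests` dictionary whose values are themselves dictionaries and lists. The `width=40` argument is only there to force the output to wrap onto several lines.

```python
import ast
import pprint

# Hypothetical nested data: dictionaries and a list inside a dictionary.
guests = {
    "Alice": {"apples": 5, "pretzels": 12},
    "Bob": {"ham sandwiches": 3, "apples": 2},
    "Carol": {"allergies": ["nuts", "shellfish"], "cups": 3},
}

pprint.pprint(guests, width=40)

# pformat() returns the same text as a string; for plain data like this,
# the text is still a valid Python literal that can be read back.
text = pprint.pformat(guests, width=40)
print(ast.literal_eval(text) == guests)
```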

## Using data structures to model real-world things

### Exercise: A tic-tac-toe board

A tic-tac-toe board looks like a large hash symbol (#) with nine slots that can each contain an X, an O, or a blank. To represent the board with a dictionary, you can assign each slot a string-value key. 

You can use string values to represent what’s in each slot on the board: 'X', 'O', or ' ' (a space character). Thus, you’ll need to store nine strings. You can use a dictionary of values for this. The string value with the key 'top-R' can represent the top-right corner, the string value with the key 'low-L' can represent the bottom-left corner, the string value with the key 'mid-M' can represent the middle, and so on.

This dictionary is a data structure that represents a tic-tac-toe board. Store this board-as-a-dictionary in a variable named `theBoard`. Open a new file editor window, and enter the following source code:

In [23]:
theBoard = {"top-L": " ", "top-M": " ", "top-R": " ", 
            "mid-L": " ", "mid-M": " ", "mid-R": " ",
            "low-L": " ", "low-M": " ", "low-R": " "}


The data structure stored in the `theBoard` variable represents the tic-tac-toe board. Since the value for every key in `theBoard` is a single-space string, this dictionary represents a completely clear board. If player X went first and chose the middle space, you could represent that board with the same dictionary, only with the entry `"mid-M": "X"`.

Let's create a function to print the board dictionary onto the screen. 

In [24]:
def printBoard(board):
    print(board['top-L'] + '|' + board['top-M'] + '|' + board['top-R'])
    print('-+-+-')
    print(board['mid-L'] + '|' + board['mid-M'] + '|' + board['mid-R'])
    print('-+-+-')
    print(board['low-L'] + '|' + board['low-M'] + '|' + board['low-R'])
printBoard(theBoard)


 | | 
-+-+-
 | | 
-+-+-
 | | 


If you change the `theBoard` dictionary, the modeled board will change.

Because you created a data structure to represent a tic-tac-toe board and wrote code in `printBoard()` to interpret that data structure, you now have a program that "models" the tic-tac-toe board.

The `printBoard()` function expects the tic-tac-toe data structure to be a dictionary with keys for all nine slots. If the dictionary you passed were missing a key, for example `mid-L`, the function would stop with a `KeyError`.
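
One way to catch an incomplete board before printing it is sketched below. The `SLOTS` list and the `missing_slots()` helper are hypothetical names introduced for this illustration; they are not part of the program above.

```python
# The nine slot names used by theBoard in this section.
SLOTS = ["top-L", "top-M", "top-R",
         "mid-L", "mid-M", "mid-R",
         "low-L", "low-M", "low-R"]

def missing_slots(board):
    """Return the slot names that the board dictionary lacks."""
    return [slot for slot in SLOTS if slot not in board]

# Build a blank board, then remove one slot to simulate the problem.
broken_board = {slot: " " for slot in SLOTS}
del broken_board["mid-L"]

print(missing_slots(broken_board))

try:
    broken_board["mid-L"]   # the same lookup printBoard() would perform
except KeyError as error:
    print("KeyError:", error)
```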

Now let's add code that allows the players to enter their moves. 

In [25]:
turn = "X"

for i in range(9):
    #print the board at the start of each turn
    printBoard(theBoard)
    print ("Turn for " + turn + ". Move on which space?")
    
    #get the active player's move
    move = input()
    
    #update the game board accordingly
    theBoard[move] = turn
    
    #swap player
    if turn == "X":
        turn = "O"
    else:
        turn = "X"

printBoard(theBoard)

 | | 
-+-+-
 | | 
-+-+-
 | | 
Turn for X. Move on which space?
mid-M
 | | 
-+-+-
 |X| 
-+-+-
 | | 
Turn for O. Move on which space?
top-R
 | |O
-+-+-
 |X| 
-+-+-
 | | 
Turn for X. Move on which space?
low-R
 | |O
-+-+-
 |X| 
-+-+-
 | |X
Turn for O. Move on which space?
top-L
O| |O
-+-+-
 |X| 
-+-+-
 | |X
Turn for X. Move on which space?
top-M
O|X|O
-+-+-
 |X| 
-+-+-
 | |X
Turn for O. Move on which space?
low-M
O|X|O
-+-+-
 |X| 
-+-+-
 |O|X
Turn for X. Move on which space?
mid-L
O|X|O
-+-+-
X|X| 
-+-+-
 |O|X
Turn for O. Move on which space?
mid-R
O|X|O
-+-+-
X|X|O
-+-+-
 |O|X
Turn for X. Move on which space?
low-R
O|X|O
-+-+-
X|X|O
-+-+-
 |O|X
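
Note that the loop above accepts any input: a mistyped slot name would silently add a new key to the dictionary, and the final `low-R` move in the sample run targets a slot that was already taken, leaving `low-L` empty. One possible guard is sketched here with a hypothetical `is_valid_move()` helper that is not part of the original program:

```python
def is_valid_move(board, move):
    """Return True if move names an existing slot that is still blank."""
    return move in board and board[move] == " "

# A board on which X has already taken the middle slot.
board = {"top-L": " ", "top-M": " ", "top-R": " ",
         "mid-L": " ", "mid-M": "X", "mid-R": " ",
         "low-L": " ", "low-M": " ", "low-R": " "}

print(is_valid_move(board, "top-L"))    # blank slot
print(is_valid_move(board, "mid-M"))    # already taken
print(is_valid_move(board, "middle"))   # not a slot name
```

In the game loop, you could keep asking for input until `is_valid_move()` returns `True` and only then update the board.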


## Nested dictionaries and lists

Modeling a tic-tac-toe board was fairly simple: The board needed only a single dictionary value with nine key-value pairs. 

As you model more complicated things, you may find you need dictionaries and lists that contain other dictionaries and lists. Lists are useful to contain an ordered series of values, and dictionaries are useful for associating keys with values. 

For example, here’s a program that uses a dictionary that contains other dictionaries in order to see who is bringing what to a picnic. The `totalBrought()` function can read this data structure and calculate the total number of an item being brought by all the guests.

In [30]:
allGuests = {"Alice": {"apples": 5, "pretzels": 12}, "Bob": {"ham sandwiches": 3, "apples": 2}, "Carol": {"cups": 3, "apple pies": 1}}

#define a function with the formal parameters guests and item
def totalBrought(guests, item): 
    numBrought = 0
    
    #inside the totalBrought function, the for loop iterates over the key-value pairs in guests
    for k, v in guests.items():
        #get() gets a key's value. If the key does not exist, the value is 0
        numBrought = numBrought + v.get(item, 0)
    return numBrought

print ("Number of things being brought:")

#function totalBrought is called, with guests = allGuests and item = "apples", "cups", etc.
#function is called with actual parameters 
print (" - Apples         " + str(totalBrought(allGuests, "apples")))
print (" - Cups           " + str(totalBrought(allGuests, "cups")))
print (" - Cakes          " + str(totalBrought(allGuests, "cakes")))
print (" - Ham Sandwiches " + str(totalBrought(allGuests, "ham sandwiches")))
print (" - Apple Pies     " + str(totalBrought(allGuests, "apple pies")))

Number of things being brought:
 - Apples         7
 - Cups           3
 - Cakes          0
 - Ham Sandwiches 3
 - Apple Pies     1
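
Dictionaries can hold lists as well as other dictionaries. The following sketch uses an invented `guests` structure (the `dishes` and `seats` keys are assumptions made for this example) to show how square brackets chain through each level:

```python
# Hypothetical data: each guest maps to a dictionary holding a list and a number.
guests = {
    "Alice": {"dishes": ["apple pie", "pretzels"], "seats": 2},
    "Bob": {"dishes": ["ham sandwiches"], "seats": 1},
}

# Chain square brackets to reach into each level: dict, dict, then list.
print(guests["Alice"]["dishes"][0])

# The inner list keeps its order and its list methods.
guests["Bob"]["dishes"].append("lemonade")

# Loop over the inner dictionaries to total one of their values.
total_seats = sum(info["seats"] for info in guests.values())
print(total_seats)
```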


To remind you how functions work, here is another example of a function. In `sumProblem()`, `x` and `y` are the formal parameters. When the function is called later on (inside another function called `main()`), actual parameters are provided.

In [31]:
def sumProblem(x, y):
    sum = x + y
    sentence = "The sum of {} and {} is {}.".format(x,y,sum)
    print(sentence)

def main():
    sumProblem(2, 3)
    sumProblem(1234456678, 123823842)
    a = int(input("Enter an integer: "))
    b = int(input("Enter another integer: "))
    sumProblem(a, b)

main()

The sum of 2 and 3 is 5.
The sum of 1234456678 and 123823842 is 1358280520.
Enter an integer: 3
Enter another integer: 345
The sum of 3 and 345 is 348.


## Summary

Lists and dictionaries are values that contain multiple values, including other lists and dictionaries. Dictionaries are useful because you can map one item (the key) to another (the value), as opposed to lists, which simply contain a series of values in order. Values inside a dictionary are accessed using square brackets just as with lists. Instead of an integer index, dictionaries can have keys of a variety of data types: integers, floats, strings or tuples. By organising a program's values into data structures, you can create representations of real-world objects.
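
As an illustration of non-string keys, the tic-tac-toe board could also be modelled with `(row, column)` tuples as keys. This is an alternative sketch, not the representation used earlier in this section:

```python
# Hypothetical alternative board: (row, column) tuples as keys instead of strings.
board = {(row, col): " " for row in range(3) for col in range(3)}
board[(1, 1)] = "X"   # the middle slot

# Print the board row by row, with separator lines between the rows.
for row in range(3):
    print("|".join(board[(row, col)] for col in range(3)))
    if row < 2:
        print("-+-+-")
```

With numeric coordinates as keys, loops can generate the keys instead of spelling out nine strings by hand.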