<h1>Default Dictionaries </h1>

<p>
Dictionaries are a convenient way to store data for later retrieval by name (key). Keys must be unique, immutable objects, and are typically strings. The values in a dictionary can be anything. For many applications the values are simple types such as integers and strings.
</p>

<p>
It gets more interesting when the values in a dictionary are collections (lists, dicts, etc.). In this case, the value (an empty list or dict) must be initialized the first time a given key is used. While this is relatively easy to do manually, the defaultdict type automates and simplifies this kind of operation.
</p>
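<p>
As a rough illustration of the manual initialization described above, the sketch below groups words by their first letter, first with a plain dict and then with a defaultdict. The word list and variable names are made up for this example, and the code is written to run under both Python 2 and Python 3.
</p>

```python
from collections import defaultdict

# Hypothetical sample data, used only for this illustration.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Plain dict: the empty list must be created by hand for each new key.
by_letter = {}
for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = []
    by_letter[letter].append(word)

# defaultdict: the empty list is created automatically on first access.
by_letter_dd = defaultdict(list)
for word in words:
    by_letter_dd[word[0]].append(word)

print(by_letter['a'])             # ['apple', 'avocado']
print(by_letter == by_letter_dd)  # True
```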

<p>
A defaultdict works exactly like a normal dict, but it is initialized with a function (“default factory”) that takes no arguments and provides the default value for a nonexistent key.
</p>

<b><p><font style="background-color:yellow">Looking up a missing key with d[key] on a defaultdict does not raise a KeyError. Instead, the key is added to the dictionary with the value returned by the default factory.</font></p></b>


In [1]:
from collections import defaultdict
ice_cream = defaultdict(lambda: 'Vanilla')
print ice_cream # will print defaultdict(<function <lambda> at 0x0000000003FE7E48>, {}); the address varies between runs

ice_cream['Sarah'] = 'Chunky Monkey'
ice_cream['Abdul'] = 'Butter Pecan'
print ice_cream['Sarah'] # will print Chunky Monkey - a value already defined in dictionary.
print ice_cream['Joe'] # will print Vanilla, the default dictionary value.


defaultdict(<function <lambda> at 0x0000000003FE7E48>, {})
Chunky Monkey
Vanilla
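
One detail worth checking in the example above is what happens to the dictionary itself when a missing key is looked up. The following sketch, which reuses the ice_cream name and should run under both Python 2 and Python 3, suggests that only indexing with square brackets calls the default factory, and that doing so also stores the new key. The name 'Maria' is made up for this illustration.

```python
from collections import defaultdict

ice_cream = defaultdict(lambda: 'Vanilla')

# Membership tests and .get() do not call the default factory.
print('Joe' in ice_cream)    # False
print(ice_cream.get('Joe'))  # None
print(len(ice_cream))        # 0

# Indexing a missing key calls the factory and stores the result.
print(ice_cream['Joe'])      # Vanilla
print('Joe' in ice_cream)    # True

# Other methods, such as pop(), still raise KeyError for a missing key.
try:
    ice_cream.pop('Maria')
except KeyError:
    print('pop raised KeyError')
```

Because a lookup inserts the key, reading many missing keys can grow the dictionary as a side effect. Using .get() or the in operator avoids this when the key should not be created.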


Be sure to pass the function object itself to defaultdict(), not the result of calling it: write defaultdict(func), not defaultdict(func()).
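
To make the difference between passing a function and calling it concrete, here is a minimal sketch. The helper default_flavor is hypothetical and introduced only for this illustration. The code should run under both Python 2 and Python 3.

```python
from collections import defaultdict

def default_flavor():
    # Hypothetical helper used only for this illustration.
    return 'Vanilla'

# Correct: pass the function object itself.
ice_cream = defaultdict(default_flavor)
print(ice_cream['Joe'])  # Vanilla

# Incorrect: default_flavor() evaluates to the string 'Vanilla',
# which is not callable, so defaultdict rejects it with a TypeError.
try:
    broken = defaultdict(default_flavor())
except TypeError as error:
    print('TypeError: %s' % error)
```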

<h2>Using Default Dictionaries to Build Dictionary Counters</h2>

In the following example, a defaultdict is used for counting. The default factory is int, which returns zero when called with no arguments. (Note: “lambda: 0” would also work in this situation.) For each food in the list, the value stored under that food's key is incremented by one. We do not need to check whether the food is already a key; a missing key starts from the default value of zero.

In [18]:
# First, let's use a conventional dictionary to iterate across a list of food items
# and count the number of times each food occurs in the list.
food_list = 'spam spam spam spam spam spam eggs spam'.split()
food_count = {'spam':0} # create a dictionary that contains only 'spam'

for food in food_list:
    food_count[food] += 1 # increment the count by 1; this raises a KeyError
                          # at 'eggs' because the dictionary has no 'eggs' key



KeyError: 'eggs'

In [12]:
from collections import defaultdict
food_list = 'spam spam spam spam spam spam eggs spam'.split()
food_count = defaultdict(int) # default value of int is 0
for food in food_list:
    food_count[food] += 1 # increment element's value by 1

for food in food_count:
    print food, food_count[food]
print food_count # prints defaultdict(<type 'int'>, {'eggs': 1, 'spam': 7})


eggs 1
spam 7
defaultdict(<type 'int'>, {'eggs': 1, 'spam': 7})
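
The note that “lambda: 0” would also work as the default factory can be checked with a short sketch. The names count_with_int and count_with_lambda are made up for this illustration, and the code should run under both Python 2 and Python 3.

```python
from collections import defaultdict

food_list = 'spam spam spam spam spam spam eggs spam'.split()

# int called with no arguments returns 0, which is why it works as a factory.
print(int())  # 0

count_with_int = defaultdict(int)
count_with_lambda = defaultdict(lambda: 0)
for food in food_list:
    count_with_int[food] += 1
    count_with_lambda[food] += 1

# Both factories produce the same counts.
print(count_with_int == count_with_lambda)  # True
```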


<h2>A More Complex Example </h2>
In the next example, we start with a list of (state, city) pairs. We want to build a dictionary where the keys are the state abbreviations and the values are lists of all cities for that state. To build this dictionary of lists, we use a defaultdict with a default factory of list. A new empty list is created the first time each key is accessed.

In [19]:
from collections import defaultdict
city_list = [('TX','Austin'), ('TX','Houston'), ('NY','Albany'), ('NY', 'Syracuse'), ('NY', 'Buffalo'), ('NY', 'Rochester'), ('TX', 'Dallas'), ('CA','Sacramento'), ('CA', 'Palo Alto'), ('GA', 'Atlanta')]

cities_by_state = defaultdict(list)
for state, city in city_list:
    cities_by_state[state].append(city)

for state, cities in cities_by_state.iteritems():
    print state, ', '.join(cities)

NY Albany, Syracuse, Buffalo, Rochester
CA Sacramento, Palo Alto
GA Atlanta
TX Austin, Houston, Dallas


<p>In conclusion, whenever you need a dictionary in which every new key should start with a default value, use a defaultdict.</p>
