Item 17 Prefer defaultdict Over setdefault to Handle Missing Items in Internal State

Things to Remember
- If you're creating a dictionary to manage an arbitrary set of potential keys, prefer a defaultdict instance from the collections built-in module if it suits your problem.
- If a dictionary of arbitrary keys is passed to you and you don't control its creation, prefer the get method to access its items. However, it's worth considering the setdefault method for the few situations in which it leads to shorter code.

In [None]:

visits = { # visits is a dict
    'Mexico': {'Tulum', 'Puerto Vallarta'}, # visits['Mexico'] is a set
    'Japan': {'Hakone'}
}

- Use set() to create an empty set.
- The { } syntax cannot create an empty set because it creates an empty dictionary.
- A set is an unordered collection of items.
- Every set element is unique (no duplicates) and must be hashable, which in practice usually means immutable.
- However, a set itself is mutable: we can add or remove items.
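
The points above can be checked with a short sketch. The variable names here are only illustrative and are not part of the original notes.

```python
# {} is an empty dict, so an empty set needs set()
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set

# Duplicates collapse into a single element
cities = {'Tulum', 'Tulum', 'Hakone'}
print(len(cities))  # 2

# The set itself is mutable
cities.add('Kyoto')
cities.discard('Hakone')
print(sorted(cities))  # ['Kyoto', 'Tulum']

# Elements must be hashable, so a list cannot be a member
try:
    {['Tulum']}
    list_rejected = False
except TypeError:
    list_rejected = True
print(list_rejected)  # True
```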

In [None]:
# - When you don't control the dict's creation,
#   you have to create a new set yourself when
#   adding a new key to the dict.

# The 'get' approach is preferred: a new set is built only when the key is missing
if (japan := visits.get('Japan')) is None:
    visits['Japan'] = japan = set()  # initialize it to a new empty set
japan.add('Kyoto')

# - The setdefault method can give you shorter code.
# - The trade-off is that set() is evaluated on every call, even when
#   the key already exists, so it is not cheaper than the 'get' approach.
#   It's worth considering when brevity matters more than that small cost.
visits.setdefault('France', set()).add('Arles')
print(visits)
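
To make the cost difference between the two approaches concrete, here is a rough sketch that counts how many default sets each one constructs when the key already exists. CountingSet and visits_demo are hypothetical names introduced only for this illustration.

```python
class CountingSet(set):
    """Hypothetical helper: a set that counts how many instances are built."""
    created = 0

    def __init__(self, *args):
        super().__init__(*args)
        CountingSet.created += 1


visits_demo = {'Japan': CountingSet({'Hakone'})}
CountingSet.created = 0  # ignore the set built for the initial data

# 'get' approach: a new set is constructed only when the key is missing
for city in ('Kyoto', 'Osaka', 'Nara'):
    if (japan := visits_demo.get('Japan')) is None:
        visits_demo['Japan'] = japan = CountingSet()
    japan.add(city)
get_allocations = CountingSet.created

# setdefault approach: the default argument is evaluated on every call
CountingSet.created = 0
for city in ('Kyoto', 'Osaka', 'Nara'):
    visits_demo.setdefault('Japan', CountingSet()).add(city)
setdefault_allocations = CountingSet.created

print(get_allocations, setdefault_allocations)  # 0 3
```

Both loops leave the same cities in the dictionary; only the number of throwaway sets differs.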

In [None]:
# - When you do control the dict's creation, for example when a dict
#   instance keeps track of the internal state of a class

class Visits:
    def __init__(self):
        self.data = {} # data is a dict
    def add(self, country, city):
        # - Using setdefault makes this a bit harder to understand,
        #   as described in Item 16.
        # - A new set instance is constructed on every call, even if
        #   the country already exists, which isn't efficient.
        city_set = self.data.setdefault(country, set())
        city_set.add(city)

visits = Visits()
visits.add('Russia', 'Yekaterinburg')
visits.add('Tanzania', 'Zanzibar')
print(visits.data)

- The defaultdict class from the collections built-in module automatically stores a default value when you access a key that doesn't exist.
- You pass the defaultdict constructor a function that returns the default value; it is called each time a missing key is accessed.
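
A small sketch of when the default function actually runs. The make_city_set factory and the factory_calls list are hypothetical, added only to record the calls.

```python
from collections import defaultdict

factory_calls = []  # one entry is appended per factory call


def make_city_set():
    """Hypothetical factory that records each call before returning a set."""
    factory_calls.append(1)
    return set()


data = defaultdict(make_city_set)
data['England'].add('Bath')    # missing key: factory called, result stored
data['England'].add('London')  # existing key: factory not called
data['France'].add('Arles')    # another missing key: factory called again

print(len(factory_calls))  # 2
print(dict(data) == {'England': {'Bath', 'London'}, 'France': {'Arles'}})  # True
```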


In [None]:
from collections import defaultdict

class Visits:
    def __init__(self):
        # - defaultdict calls set() only when a missing key is
        #   accessed, once per missing key, and stores the result
        #   under that key. Unlike setdefault, no set is built
        #   when the key already exists.
        self.data = defaultdict(set)
    def add(self, country, city):
        self.data[country].add(city)
visits = Visits()
visits.add('England', 'Bath')
visits.add('England', 'London')
print(visits.data)
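
One consequence of storing defaults on access is worth keeping in mind when deciding whether defaultdict suits your problem: indexing a missing key inserts it. The sketch below uses a bare defaultdict rather than the Visits class, and its variable names are only illustrative.

```python
from collections import defaultdict

data = defaultdict(set)
data['England'].add('Bath')

# Membership tests and get() do not create entries
has_spain_before = 'Spain' in data
maybe_spain = data.get('Spain')

# Indexing a missing key does: an empty set is stored under 'Spain'
_ = data['Spain']
has_spain_after = 'Spain' in data

print(has_spain_before, maybe_spain, has_spain_after)  # False None True
print(len(data))  # 2
```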
