# Chapter 3. Dictionaries and Sets
---

## ToC


[Automatic Handling of Missing Keys](#automatic-handling-of-missing-keys)
1. [Approach I. defaultdict: Another Take on Missing Keys](#approach-i-defaultdict-another-take-on-missing-keys)


---

## Automatic Handling of Missing Keys

Sometimes it is convenient for a mapping to return a default value when a
missing key is looked up. There are two main approaches to this:

- I. Use a `defaultdict` instead of a plain `dict`.
- II. Subclass `dict` or any other mapping type and add a `__missing__` method.

### Approach I. defaultdict: Another Take on Missing Keys

A `collections.defaultdict` instance creates items with a default value on demand whenever a missing key is looked up using the `d[k]` syntax.

When instantiating a `defaultdict`, you provide a callable to produce a default value whenever `__getitem__` is passed a nonexistent key argument.

For example, given a `defaultdict` created as `dd = defaultdict(list)`, if `'new-key'` is not in `dd`, the expression `dd['new-key']` performs the following steps:

1. Calls `list()` to create a new list.
2. Inserts the list into `dd` using `'new-key'` as the key.
3. Returns a reference to that list.

The callable that produces the default values is held in an instance attribute named `default_factory`.
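The three steps above can be observed directly. The following is a minimal sketch; the names `dd` and `value` are chosen only for illustration.

```python
from collections import defaultdict

dd = defaultdict(list)
print('new-key' in dd)        # False: nothing has been inserted yet

# A d[k] lookup on a missing key calls list(), stores the result, returns it
value = dd['new-key']
print(value)                  # []
print('new-key' in dd)        # True: the lookup itself inserted the key

# The returned object is the same list that is stored in the mapping
value.append(1)
print(dd['new-key'])          # [1]

# The factory is kept in the default_factory instance attribute
print(dd.default_factory)     # <class 'list'>
```

Because the lookup inserts the key as a side effect, reading `dd[k]` for many absent keys grows the mapping.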

Revisiting the example from the previous section:

```python
import collections
import re
import sys

WORD_RE = re.compile(r'\w+')

# Create a defaultdict with the list constructor as default_factory
index = collections.defaultdict(list)
# When running from a terminal:
# with open(sys.argv[1], encoding='utf-8') as fp:
# When running in a notebook:
with open("zen.txt", encoding='utf-8') as fp:
    for line_no, line in enumerate(fp, 1):
        for match in WORD_RE.finditer(line):
            word = match.group()
            column_no = match.start() + 1
            location = (line_no, column_no)
            # A word seen for the first time gets a new empty list automatically
            index[word].append(location)

# Display in alphabetical order, ignoring case
for word in sorted(index, key=str.upper):
    print(word, index[word])
```

```text
a [(19, 48), (20, 53)]
Although [(11, 1), (16, 1), (18, 1)]
ambiguity [(14, 16)]
and [(15, 23)]
are [(21, 12)]
aren [(10, 15)]
at [(16, 38)]
bad [(19, 50)]
be [(15, 14), (16, 27), (20, 50)]
beats [(11, 23)]
Beautiful [(3, 1)]
better [(3, 14), (4, 13), (5, 11), (6, 12), (7, 9), (8, 11), (17, 8), (18, 25)]
break [(10, 40)]
by [(1, 20)]
cases [(10, 9)]
complex [(5, 23)]
Complex [(6, 1)]
complicated [(6, 24)]
counts [(9, 13)]
dense [(8, 23)]
do [(15, 64), (21, 48)]
Dutch [(16, 61)]
easy [(20, 26)]
enough [(10, 30)]
Errors [(12, 1)]
explain [(19, 34), (20, 34)]
Explicit [(4, 1)]
explicitly [(13, 8)]
face [(14, 8)]
first [(16, 41)]
Flat [(7, 1)]
good [(20, 55)]
great [(21, 28)]
guess [(14, 52)]
hard [(19, 26)]
honking [(21, 20)]
idea [(19, 54), (20, 60), (21, 34)]
If [(19, 1), (20, 1)]
implementation [(19, 8), (20, 8)]
implicit [(4, 25)]
In [(14, 1)]
is [(3, 11), (4, 10), (5, 8), (6, 9), (7, 6), (8, 8), (17, 5), (18, 16), (19, 23), (20, 23)]
it [(15, 67), (19, 43), (20, 43)]
let [(21, 42)]
...
```

If no `default_factory` is provided, the usual `KeyError` is raised for missing keys.
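A short sketch of that case follows, assuming a `defaultdict` created with no arguments so that its `default_factory` is `None`. The variable names are illustrative only.

```python
from collections import defaultdict

no_factory = defaultdict()        # default_factory defaults to None
no_factory['present'] = 1

error = None
try:
    no_factory['absent']          # no factory to call, so this raises
except KeyError as exc:
    error = exc

print(no_factory.default_factory)  # None
print(repr(error))                 # KeyError('absent')
print('absent' in no_factory)      # False: the failed lookup inserted nothing
```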


![Figure 37](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/37.PNG)

The mechanism that makes `defaultdict` work by calling `default_factory` is the
`__missing__` special method.
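As a rough illustration, `__missing__` can be called directly on a `defaultdict`, which has the same effect as a failed `d[k]` lookup. The sketch also shows that `dict.get` and the `in` operator do not go through `__getitem__`, so they never trigger the factory. The variable names are illustrative only.

```python
from collections import defaultdict

dd = defaultdict(list)

# get() and `in` bypass __getitem__, so default_factory is not called
got = dd.get('absent')
present_before = 'absent' in dd
print(got, present_before)        # None False

# Calling __missing__ by hand does what dd['absent'] would do on a miss:
# it calls default_factory, stores the result under the key, and returns it
created = dd.__missing__('absent')
print(created)                    # []
print('absent' in dd)             # True
```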

### The `__missing__` Method