# defaultdict

## Initializing a `defaultdict`

Create a `defaultdict` with `defaultdict(default_factory)`, where `default_factory` is a callable that takes no arguments (such as `list`, `int`, `set`, or a custom function) and returns the default value for any missing key.

In [1]:
from collections import defaultdict

# Missing keys default to an empty list
dd = defaultdict(list)  # list is called with no arguments to build each default value

dd["key1"].append(1)  # "key1" doesn't exist yet, so defaultdict creates it with the value []
# This is equivalent to:
# dd["key1"] = list()
# dd["key1"].append(1)

# Merely reading a missing key with [] also inserts it into the dictionary
print(dd["key3"])
print(dd)

[]
defaultdict(<class 'list'>, {'key1': [1], 'key3': []})
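
The cell above uses the built-in `list` as the factory. As a minimal sketch of the "custom function" case, the example below uses a hypothetical factory `default_settings` (the function name, key names, and values are illustrative assumptions, not part of the original). It also shows that only indexing with `[]` triggers the factory; membership tests and `.get()` leave the dictionary unchanged.

```python
from collections import defaultdict


def default_settings():
    # Hypothetical factory: builds a fresh dict for every missing key
    return {"enabled": False, "retries": 3}


settings = defaultdict(default_settings)

# Membership tests and .get() neither call the factory nor insert keys
has_key = "service_a" in settings
fetched = settings.get("service_a")
size_before = len(settings)

# Indexing with [] calls the factory and stores the result under the key
settings["service_a"]["enabled"] = True
size_after = len(settings)

print(has_key, fetched, size_before, size_after)
# False None 0 1

# Each missing key gets its own object, not a shared one
print(settings["service_b"] is settings["service_a"])
# False
```

Because the factory runs once per missing key, mutable defaults such as lists or dicts are never shared between keys.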


## `defaultdict` versus the `setdefault` Method

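On a plain `dict`, `setdefault(key, default)` returns the existing value for `key`, or inserts `default` and returns it if the key is missing. The `default` argument is evaluated on every call, even when the key already exists, whereas `defaultdict` calls `default_factory` only when a key is actually missing. This makes `defaultdict` slightly faster at handling missing keys than the `setdefault` method.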

In [2]:
from collections import defaultdict

std_dict = {}
std_dict.setdefault("key", "Default")
print(std_dict)

# default_factory must be a callable (or None), so a constant is wrapped in a lambda
dd = defaultdict(lambda: "Default")
dd["key"]  # looking up the missing key inserts the default value
print(dd)

{'key': 'Default'}
defaultdict(<function <lambda> at 0x7f3dbae31a80>, {'key': 'Default'})
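
One way to see the difference in how the two approaches build defaults is to count how often the default is constructed. The sketch below is illustrative only: the helper functions and the `calls` counter are hypothetical names introduced for this example. It assumes the common pattern of passing a freshly built list to `setdefault` on every iteration.

```python
from collections import defaultdict

# Hypothetical counter tracking how often each default is constructed
calls = {"setdefault": 0, "defaultdict": 0}


def make_default_for_setdefault():
    calls["setdefault"] += 1
    return []


def make_default_for_defaultdict():
    calls["defaultdict"] += 1
    return []


std_dict = {}
dd = defaultdict(make_default_for_defaultdict)

for _ in range(3):
    # The argument is evaluated before setdefault runs, on every iteration
    std_dict.setdefault("key", make_default_for_setdefault()).append(1)
    # The factory is called only on the first iteration, when "key" is missing
    dd["key"].append(1)

print(calls)
# {'setdefault': 3, 'defaultdict': 1}
print(std_dict["key"], dd["key"])
# [1, 1, 1] [1, 1, 1]
```

Both dictionaries end up with the same contents; the difference is only in how many throwaway default objects are created along the way.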


## Grouping Elements

In [3]:
from collections import defaultdict

# Sample list of files, each as (filename, filetype)
files = [
    ("report.docx", "document"),
    ("summary.pdf", "document"),
    ("budget.xlsx", "spreadsheet"),
    ("data.csv", "spreadsheet"),
    ("photo.jpg", "image"),
    ("diagram.png", "image")
]

# Group files by type
grouped_files = defaultdict(list)

for filename, file_type in files:
    grouped_files[file_type].append(filename)

print(dict(grouped_files))

{'document': ['report.docx', 'summary.pdf'], 'spreadsheet': ['budget.xlsx', 'data.csv'], 'image': ['photo.jpg', 'diagram.png']}


## Grouping Unique Elements

In [4]:
from collections import defaultdict

files = [
    ("report.docx", "document"),
    ("summary.pdf", "document"),
    ("budget.xlsx", "spreadsheet"),
    ("data.csv", "spreadsheet"),
    ("photo.jpg", "image"),
    ("diagram.png", "image"),
    ("photo.jpg", "image")  # duplicate file
]

# Group files by type with unique entries
grouped_files = defaultdict(set)

for filename, file_type in files:
    grouped_files[file_type].add(filename)

print(dict(grouped_files))

{'document': {'summary.pdf', 'report.docx'}, 'spreadsheet': {'budget.xlsx', 'data.csv'}, 'image': {'diagram.png', 'photo.jpg'}}
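
Sets are unordered, so the element order in the output above can differ between runs. If a stable order is needed, one option is to convert each set to a sorted list after grouping. The following is a small sketch using a shortened version of the sample data; the variable name `stable_groups` is an assumption introduced here.

```python
from collections import defaultdict

files = [
    ("report.docx", "document"),
    ("photo.jpg", "image"),
    ("diagram.png", "image"),
    ("photo.jpg", "image"),  # duplicate file
]

grouped_files = defaultdict(set)
for filename, file_type in files:
    grouped_files[file_type].add(filename)

# Convert each set to a sorted list so the output order is deterministic
stable_groups = {
    file_type: sorted(names) for file_type, names in grouped_files.items()
}

print(stable_groups)
# {'document': ['report.docx'], 'image': ['diagram.png', 'photo.jpg']}
```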


## Implementing Basic Counting Logic

In [5]:
from collections import defaultdict

# Sample list of letters
letters = ["a", "b", "a", "c", "b", "a"]

letter_counts = defaultdict(int)  # int() returns 0, so every count starts at zero

for letter in letters:
    letter_counts[letter] += 1

print(letter_counts)

defaultdict(<class 'int'>, {'a': 3, 'b': 2, 'c': 1})
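
The counting and grouping patterns can be combined by nesting one `defaultdict` inside another. As a rough sketch, the example below counts file extensions per file type; the sample data (including the hypothetical `notes.docx` entry) and the name `extension_counts` are assumptions made for illustration.

```python
from collections import defaultdict

# Sample data; "notes.docx" is a hypothetical entry added for illustration
files = [
    ("report.docx", "document"),
    ("summary.pdf", "document"),
    ("notes.docx", "document"),
    ("photo.jpg", "image"),
]

# Outer keys are file types; each inner defaultdict counts extensions from 0
extension_counts = defaultdict(lambda: defaultdict(int))

for filename, file_type in files:
    extension = filename.rsplit(".", 1)[-1]  # text after the last dot
    extension_counts[file_type][extension] += 1

# Convert to plain dicts for a cleaner printout
plain_counts = {
    file_type: dict(counts) for file_type, counts in extension_counts.items()
}

print(plain_counts)
# {'document': {'docx': 2, 'pdf': 1}, 'image': {'jpg': 1}}
```

The outer factory has to be a `lambda` (or a named function) because `defaultdict(int)` is a call that produces a dictionary, not a callable that can itself serve as a factory.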
