# Chapter 2: Lists and Dictionaries

---

## Item 11: Know How to Slice Sequences

### Slicing Basics

Python provides slicing syntax to access subsets of sequences. The basic form is `somelist[start:end]` where `start` is inclusive and `end` is exclusive.

| Slicing Pattern | Description |
|----------------|-------------|
| `a[start:end]` | Items from start through end-1 |
| `a[start:]` | Items from start through the rest |
| `a[:end]` | Items from beginning through end-1 |
| `a[:]` | Copy of the whole list |
| `a[-n:]` | Last n items |
| `a[:-n]` | All items except last n |

In [80]:
# Creating a sample list
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('Original list:', a)

Original list: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']


### Best Practices for Slicing

**Leave out zero index when slicing from the start:**

In [155]:
# Create a sample list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
print("Original list:", letters)

# Best Practice: Omit zero when starting from beginning
print('\n=== Starting from index 0 ===')
print('letters[:5] =', letters[:5])      # ✅ Preferred - cleaner
print('letters[0:5] =', letters[0:5])    # ❌ Explicit but verbose
assert letters[:5] == letters[0:5]       # Both are equivalent

Original list: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

=== Starting from index 0 ===
letters[:5] = ['a', 'b', 'c', 'd', 'e']
letters[0:5] = ['a', 'b', 'c', 'd', 'e']



**Leave out final index when slicing to the end:**

In [82]:
# These are equivalent - prefer the shorter version
print('a[5:]      =', a[5:])
print('a[5:len(a)]=', a[5:len(a)])
assert a[5:] == a[5:len(a)]

a[5:]      = ['f', 'g', 'h']
a[5:len(a)]= ['f', 'g', 'h']


### Common Slicing Patterns

In [None]:
# Common slicing patterns on the eight-item list used throughout this item
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

print('Full list:     a[:]    =', a[:])      # Shallow copy of the entire list
print('First 5 items: a[:5]   =', a[:5])     # Indices 0 through 4
print('All but last:  a[:-1]  =', a[:-1])    # Everything except the last item
print('From index 4:  a[4:]   =', a[4:])     # Index 4 through the end
print('Last 3 items:  a[-3:]  =', a[-3:])    # Final three items
print('Middle slice:  a[2:5]  =', a[2:5])    # Indices 2 through 4
print('Middle to end: a[2:-1] =', a[2:-1])   # Index 2 up to, but not including, the last item
print('Last 2 to -1:  a[-3:-1]=', a[-3:-1])  # Third-to-last and second-to-last items

Full list:     a[:]    = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
First 5 items: a[:5]   = ['a', 'b', 'c', 'd', 'e']
All but last:  a[:-1]  = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
From index 4:  a[4:]   = ['e', 'f', 'g', 'h']
Last 3 items:  a[-3:]  = ['f', 'g', 'h']
Middle slice:  a[2:5]  = ['c', 'd', 'e']
Middle to end: a[2:-1] = ['c', 'd', 'e', 'f', 'g']
Last 2 to -1:  a[-3:-1]= ['f', 'g']


### Slicing Beyond Boundaries

Slicing handles out-of-bounds indices gracefully by silently omitting missing items.

In [84]:
# Slicing beyond boundaries is safe
first_twenty_items = a[:20]
last_twenty_items = a[-20:]
print('First 20 (safe):', first_twenty_items)
print('Last 20 (safe): ', last_twenty_items)

First 20 (safe): ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
Last 20 (safe):  ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']


In [85]:
# But direct indexing raises an error
try:
    print(a[20])
except IndexError as e:
    print(f'IndexError: {e}')

IndexError: list index out of range


### Slicing Creates New Lists

**Important:** The result of slicing is a new list. It is a shallow copy: the new list holds references to the same objects, but assigning to an index of the slice doesn't affect the original list.

In [None]:
b = a[3:]
print('Before: ', b)
# type: ignore comment suppresses the warning
b[1] = 99  # type: ignore
print('After:  ', b)
print('No change in original:', a)

Before:  ['d', 'e', 'f', 'g', 'h']
After:   ['d', 99, 'f', 'g', 'h']
No change in original: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
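
The "new list, same references" behavior can be illustrated with nested data. This is a minimal sketch; the `rows` and `tail` names are hypothetical and not part of the original example.

```python
# Hypothetical nested data to show that a slice is a shallow copy.
rows = [[1, 2], [3, 4], [5, 6]]
tail = rows[1:]

# Rebinding an index of the slice leaves the original list untouched.
tail[0] = ['replaced']
print(rows[1])  # [3, 4]

# Mutating an object that both lists reference is visible through both.
tail[1].append(7)
print(rows[2])  # [5, 6, 7]
```

Assigning to `tail[0]` only changes which object the slice points at, while `tail[1].append(7)` changes an inner list that `rows` still shares.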


### Slice Assignment

Slices can be used in assignments to replace ranges in the original list. The lengths don't need to match.

In [None]:
# List shrinks when replacement is shorter
a = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('Before:', a)
a[2:7] = [99, 22, 14]  # type: ignore
print('After: ', a)

Before: ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After:  ['a', 'b', 99, 22, 14, 'h']


In [None]:
# List grows when replacement is longer
print('Before:', a)
a[2:3] = [47, 11]  # type: ignore
print('After: ', a)

Before: ['a', 'b', 99, 22, 14, 'h']
After:  ['a', 'b', 47, 11, 22, 14, 'h']
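
The flexible-length rule above applies to ordinary slices. As a hedged side note, once a stride is part of the assignment target, the replacement has to supply exactly as many items as the slice selects. The sketch below uses a hypothetical `letters` list to show both cases.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# letters[::2] selects indexes 0, 2, 4, so three replacement items are accepted.
letters[::2] = [1, 2, 3]
print(letters)  # [1, 'b', 2, 'd', 3, 'f']

# Supplying a different number of items to a strided slice raises ValueError.
error_message = ''
try:
    letters[::2] = [9, 9]
except ValueError as e:
    error_message = str(e)
    print(f'ValueError: {e}')
```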


### Copying Lists

In [89]:
# Create a copy using slice syntax
b = a[:]
print('Equal but not same object:', b == a and b is not a)

Equal but not same object: True


In [90]:
# Replace entire contents while keeping same object
b = a
print('Before a:', a)
print('Before b:', b)
a[:] = [101, 102, 103]  # Replaces contents in-place
print('After a: ', a)
print('After b: ', b)
print('Same object:', a is b)

Before a: ['a', 'b', 47, 11, 22, 14, 'h']
Before b: ['a', 'b', 47, 11, 22, 14, 'h']
After a:  [101, 102, 103]
After b:  [101, 102, 103]
Same object: True


### Key Takeaways

- Avoid being verbose: Don't supply 0 for start or length for end
- Slicing is forgiving of out-of-bounds indices
- Slice assignment replaces ranges even if lengths differ

---

## Item 12: Avoid Striding and Slicing in a Single Expression

### Stride Syntax

Python's slice syntax includes a stride parameter: `somelist[start:end:stride]`

This allows taking every nth item when slicing.

In [91]:
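# Grouping by position: x[::2] takes the 1st, 3rd, 5th items (indexes 0, 2, 4),
# and x[1::2] takes the 2nd, 4th, 6th items (indexes 1, 3, 5)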
x = ['red', 'orange', 'yellow', 'green', 'blue', 'purple']
odds = x[::2]
evens = x[1::2]
print('Odds: ', odds)
print('Evens:', evens)

Odds:  ['red', 'yellow', 'blue']
Evens: ['orange', 'green', 'purple']


### Reversing with Stride

In [92]:
# Common Python trick for reversing
x = b'mongoose'
y = x[::-1]
print('Original:', x)
print('Reversed:', y)

Original: b'mongoose'
Reversed: b'esoognom'


In [93]:
# Works with Unicode strings
x = '寿司'
y = x[::-1]
print('Original:', x)
print('Reversed:', y)

Original: 寿司
Reversed: 司寿


In [94]:
# But breaks with UTF-8 byte strings
w = '寿司'
x = w.encode('utf-8')
y = x[::-1]
try:
    z = y.decode('utf-8')
except UnicodeDecodeError as e:
    print(f'UnicodeDecodeError: {e}')

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 0: invalid start byte
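
One way around the failure above, sketched here under the assumption that reversing by code point is what is wanted, is to decode the bytes first, reverse the resulting `str`, and encode again. The variable names are illustrative only.

```python
data = '寿司'.encode('utf-8')

# Decode to str, reverse the characters, then encode back to bytes.
reversed_text = data.decode('utf-8')[::-1]
reversed_bytes = reversed_text.encode('utf-8')

print(reversed_text)                    # 司寿
print(reversed_bytes.decode('utf-8'))   # 司寿
```

Reversing the `str` keeps each multi-byte character intact because the stride operates on code points rather than on individual bytes.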


### The Confusion of Complex Striding

In [95]:
# Simple stride examples
x = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
print('x[::2]  =', x[::2])   # Every 2nd item from start
print('x[::-2] =', x[::-2])  # Every 2nd item from end, backwards

x[::2]  = ['a', 'c', 'e', 'g']
x[::-2] = ['h', 'f', 'd', 'b']


In [96]:
# Confusing combinations
print('x[2::2]   =', x[2::2])    # Start at 2, every 2nd
print('x[-2::-2] =', x[-2::-2])  # Start at -2, backwards, every 2nd
print('x[-2:2:-2]=', x[-2:2:-2]) # Start -2, end 2, backwards, every 2nd
print('x[2:2:-2] =', x[2:2:-2])  # Impossible - returns empty

x[2::2]   = ['c', 'e', 'g']
x[-2::-2] = ['g', 'e', 'c', 'a']
x[-2:2:-2]= ['g', 'e']
x[2:2:-2] = []


### Best Practice: Separate Striding and Slicing

In [97]:
# Instead of complex expressions, use two operations
y = x[::2]      # First stride
z = y[1:-1]     # Then slice
print('Step 1 (stride):', y)
print('Step 2 (slice): ', z)

Step 1 (stride): ['a', 'c', 'e', 'g']
Step 2 (slice):  ['c', 'e']


### Alternative: itertools.islice

In [98]:
from itertools import islice

# islice is lazy: it yields items on demand instead of building an intermediate list
x = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
result = list(islice(x, 1, 7, 2))  # start=1, end=7, step=2
print('Using islice:', result)

Using islice: ['b', 'd', 'f']
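
`islice` also accepts iterables that do not support slice syntax at all, and it rejects negative values. The following sketch uses a hypothetical `count_up` generator as a stand-in for a large or streaming source.

```python
from itertools import islice

def count_up(limit):
    """Hypothetical generator standing in for a large or streaming source."""
    for i in range(limit):
        yield i

# Generators can't be sliced with [start:end:stride], but islice works on them.
every_third = list(islice(count_up(20), 2, 15, 3))
print(every_third)  # [2, 5, 8, 11, 14]

# islice refuses negative start, stop, or step values.
negative_rejected = False
try:
    islice(count_up(20), 0, 10, -1)
except ValueError as e:
    negative_rejected = True
    print(f'ValueError: {e}')
```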


### Key Takeaways

- Using start, end, and stride together is confusing
- Prefer positive stride values without start/end
- Avoid negative stride values if possible
- Separate striding and slicing into two operations if needed

---

## Item 13: Prefer Catch-All Unpacking Over Slicing

### The Problem with Basic Unpacking

In [99]:
# Basic unpacking requires exact length match
car_ages = [0, 9, 4, 8, 7, 20, 19, 1, 6, 15]
car_ages_descending = sorted(car_ages, reverse=True)

try:
    oldest, second_oldest = car_ages_descending
except ValueError as e:
    print(f'ValueError: {e}')

ValueError: too many values to unpack (expected 2)


### Traditional Approach: Indexing and Slicing

In [100]:
# Using indexes - verbose and error-prone
oldest = car_ages_descending[0]
second_oldest = car_ages_descending[1]
others = car_ages_descending[2:]
print(f'Oldest: {oldest}, Second: {second_oldest}, Others: {others}')

Oldest: 20, Second: 19, Others: [15, 9, 8, 7, 6, 4, 1, 0]


### Starred Expressions (Catch-All Unpacking)

In [101]:
# Much cleaner with starred expression
oldest, second_oldest, *others = car_ages_descending
print(f'Oldest: {oldest}, Second: {second_oldest}, Others: {others}')

Oldest: 20, Second: 19, Others: [15, 9, 8, 7, 6, 4, 1, 0]


### Starred Expression Can Appear Anywhere

In [102]:
# In the middle
oldest, *others, youngest = car_ages_descending
print(f'Oldest: {oldest}, Youngest: {youngest}, Others: {others}')

Oldest: 20, Youngest: 0, Others: [19, 15, 9, 8, 7, 6, 4, 1]


In [103]:
# At the beginning
*others, second_youngest, youngest = car_ages_descending
print(f'Youngest: {youngest}, Second youngest: {second_youngest}')
print(f'Others: {others}')

Youngest: 0, Second youngest: 1
Others: [20, 19, 15, 9, 8, 7, 6, 4]


### Restrictions on Starred Expressions

In [104]:
# A starred expression needs at least one required part alongside it
try:
    exec('*others = car_ages_descending')  # exec defers the SyntaxError to run time so it can be caught
except SyntaxError as e:
    print(f'SyntaxError: {e.msg}')

SyntaxError: starred assignment target must be in a list or tuple


In [None]:
# Can't use multiple starred expressions at same level
try:
    exec('first, *middle, *second_middle, last = [1, 2, 3, 4]')
except SyntaxError as e:
    print(f'SyntaxError: {e.msg}')

SyntaxError: multiple starred expressions in assignment


### Starred Expressions Always Become Lists

In [106]:
# Even with no leftover items, the starred expression becomes an empty list
short_list = [1, 2]
first, second, *rest = short_list
print(f'First: {first}, Second: {second}, Rest: {rest}')
print(f'Type of rest: {type(rest)}')

First: 1, Second: 2, Rest: []
Type of rest: <class 'list'>


### Unpacking Iterators

In [107]:
# Works with any iterator
it = iter(range(1, 3))
first, second = it
print(f'{first} and {second}')

1 and 2


In [108]:
# Practical example: CSV processing
def generate_csv():
    yield ('Date', 'Make', 'Model', 'Year', 'Price')
    for i in range(200):
        yield (f'2024-{i%12+1:02d}-01', 'Toyota', 'Camry', 2024, 25000+i*100)

# Separate header from rows
it = generate_csv()
header, *rows = it
print('CSV Header:', header)
print('Row count: ', len(rows))
print('First row: ', rows[0])

CSV Header: ('Date', 'Make', 'Model', 'Year', 'Price')
Row count:  200
First row:  ('2024-01-01', 'Toyota', 'Camry', 2024, 25000)
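
Because `header, *rows = it` materializes every remaining row in a list, a very large source may not fit in memory. A possible alternative, sketched below with a hypothetical `generate_rows` generator, is to take the header with `next()` and process the rest one row at a time.

```python
def generate_rows():
    """Hypothetical stand-in for a large CSV source."""
    yield ('Date', 'Make', 'Model', 'Year', 'Price')
    for i in range(200):
        yield ('2024-01-01', 'Toyota', 'Camry', 2024, 25000 + i * 100)

it = generate_rows()
header = next(it)        # Consume only the first item

total = 0
row_count = 0
for row in it:           # Remaining rows are handled one at a time
    total += row[-1]
    row_count += 1

print('CSV Header:', header)
print('Row count: ', row_count)   # 200
print('Total price:', total)      # 6990000
```

This trades the convenience of having `rows` as a list for constant memory use, so it fits best when each row can be handled independently.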


### Key Takeaways

- Starred expressions catch all values not assigned to other parts
- Can appear in any position
- Always become a list (possibly empty)
- Much less error-prone than slicing and indexing
- Be cautious with memory when unpacking large iterators

---

## Item 14: Sort by Complex Criteria Using the key Parameter

### Basic Sorting

In [109]:
# Sorting numbers - natural order
numbers = [93, 86, 11, 68, 70]
numbers.sort()
print('Sorted numbers:', numbers)

Sorted numbers: [11, 68, 70, 86, 93]


### Sorting Custom Objects

In [None]:
# Define a custom class
class Tool:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    def __repr__(self):
        return f'Tool({self.name!r}, {self.weight})'

tools = [
    Tool('level', 3.5),
    Tool('hammer', 1.25),
    Tool('screwdriver', 0.5),
    Tool('chisel', 0.25),
]

In [111]:
# Sorting without comparison methods fails
try:
    tools.sort()
except TypeError as e:
    print(f'TypeError: {e}')

TypeError: '<' not supported between instances of 'Tool' and 'Tool'


### Using the key Parameter

In [112]:
# Sort by name using lambda
print('Unsorted:', tools)
tools.sort(key=lambda x: x.name)
print('\nSorted by name:', tools)

Unsorted: [Tool('level', 3.5), Tool('hammer', 1.25), Tool('screwdriver', 0.5), Tool('chisel', 0.25)]

Sorted by name: [Tool('chisel', 0.25), Tool('hammer', 1.25), Tool('level', 3.5), Tool('screwdriver', 0.5)]


In [113]:
# Sort by weight
tools.sort(key=lambda x: x.weight)
print('Sorted by weight:', tools)

Sorted by weight: [Tool('chisel', 0.25), Tool('screwdriver', 0.5), Tool('hammer', 1.25), Tool('level', 3.5)]


### Transforming Values

In [114]:
# Case-insensitive sorting
places = ['home', 'work', 'New York', 'Paris']
places.sort()
print('Case sensitive:  ', places)

places.sort(key=lambda x: x.lower())
print('Case insensitive:', places)

Case sensitive:   ['New York', 'Paris', 'home', 'work']
Case insensitive: ['home', 'New York', 'Paris', 'work']


### Sorting by Multiple Criteria Using Tuples

Tuples are compared element by element, making them perfect for multi-criteria sorting.

In [115]:
# How tuple comparison works
saw = (5, 'circular saw')
jackhammer = (40, 'jackhammer')
print('Comparison by first element:', jackhammer < saw)

drill = (4, 'drill')
sander = (4, 'sander')
print('Same first element, compare second:', drill < sander)

Comparison by first element: False
Same first element, compare second: True


In [116]:
# Sort by weight, then by name
power_tools = [
    Tool('drill', 4),
    Tool('circular saw', 5),
    Tool('jackhammer', 40),
    Tool('sander', 4),
]

power_tools.sort(key=lambda x: (x.weight, x.name))
print('Sorted by weight then name:', power_tools)

Sorted by weight then name: [Tool('drill', 4), Tool('sander', 4), Tool('circular saw', 5), Tool('jackhammer', 40)]


### Mixing Sort Directions

In [117]:
# All ascending or all descending
power_tools.sort(key=lambda x: (x.weight, x.name), reverse=True)
print('All descending:', power_tools)

All descending: [Tool('jackhammer', 40), Tool('circular saw', 5), Tool('sander', 4), Tool('drill', 4)]


In [118]:
# Use unary minus for numeric values
power_tools.sort(key=lambda x: (-x.weight, x.name))
print('Weight descending, name ascending:', power_tools)

Weight descending, name ascending: [Tool('jackhammer', 40), Tool('circular saw', 5), Tool('drill', 4), Tool('sander', 4)]


### Multiple Sort Calls (Stable Sort)

In [119]:
# Python's sort is stable - preserves relative order for equal keys
power_tools = [
    Tool('drill', 4),
    Tool('circular saw', 5),
    Tool('jackhammer', 40),
    Tool('sander', 4),
]

# Sort by name first (lowest priority)
power_tools.sort(key=lambda x: x.name)
print('After name sort:', power_tools)

# Then sort by weight descending (highest priority)
power_tools.sort(key=lambda x: x.weight, reverse=True)
print('After weight sort:', power_tools)

After name sort: [Tool('circular saw', 5), Tool('drill', 4), Tool('jackhammer', 40), Tool('sander', 4)]
After weight sort: [Tool('jackhammer', 40), Tool('circular saw', 5), Tool('drill', 4), Tool('sander', 4)]


### Comparison Table

| Method | Advantages | Disadvantages |
|--------|------------|---------------|
| Tuple key | Simple, one call | All criteria same direction |
| Unary minus | Mix directions | Only works for numbers |
| Multiple sorts | Works for all types | Multiple calls, less obvious |
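
The last row of the table covers the case that unary minus cannot: reversing a non-numeric criterion. The sketch below sorts by weight ascending and name descending with two stable sorts. The `Tool` dataclass is a simplified stand-in for the class defined earlier in this item.

```python
from dataclasses import dataclass

@dataclass
class Tool:
    """Simplified stand-in for the Tool class used in this item."""
    name: str
    weight: float

tools = [
    Tool('drill', 4),
    Tool('circular saw', 5),
    Tool('jackhammer', 40),
    Tool('sander', 4),
]

# Lowest-priority criterion first: name, descending.
tools.sort(key=lambda t: t.name, reverse=True)
# Highest-priority criterion last: weight, ascending.
tools.sort(key=lambda t: t.weight)

names = [t.name for t in tools]
print(names)  # ['sander', 'drill', 'circular saw', 'jackhammer']
```

The second sort keeps `sander` ahead of `drill` because stability preserves the order the first sort established among tools of equal weight.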

### Key Takeaways

- Use `key` parameter to customize sort behavior
- Return tuples from `key` for multiple criteria
- Use unary minus to reverse numeric sorts
- Use stable sort with multiple calls when necessary

---

## Item 15: Be Cautious When Relying on dict Insertion Ordering

### Historical Context

| Python Version | Dictionary Behavior |
|---------------|---------------------|
| Python 3.5 and earlier | Arbitrary iteration order (insertion order not preserved) |
| Python 3.6 | Insertion order preserved (implementation detail) |
| Python 3.7+ | Insertion order preserved (language specification) |

In [120]:
# Modern Python preserves insertion order
baby_names = {
    'cat': 'kitten',
    'dog': 'puppy',
}
print('Dictionary:', baby_names)
print('Keys:', list(baby_names.keys()))
print('Values:', list(baby_names.values()))

Dictionary: {'cat': 'kitten', 'dog': 'puppy'}
Keys: ['cat', 'dog']
Values: ['kitten', 'puppy']


### Dictionary Methods Preserve Order

In [121]:
# All dictionary methods maintain insertion order
print('keys():  ', list(baby_names.keys()))
print('values():', list(baby_names.values()))
print('items(): ', list(baby_names.items()))
print('popitem():', baby_names.popitem())  # Removes last inserted
print('After popitem:', baby_names)

keys():   ['cat', 'dog']
values(): ['kitten', 'puppy']
items():  [('cat', 'kitten'), ('dog', 'puppy')]
popitem(): ('dog', 'puppy')
After popitem: {'cat': 'kitten'}


### Function Keyword Arguments

In [122]:
# Keyword arguments now preserve order
def my_func(**kwargs):
    for key, value in kwargs.items():
        print(f'{key} = {value}')

my_func(goose='gosling', kangaroo='joey')

goose = gosling
kangaroo = joey


### Class Instance Dictionaries

In [123]:
# Instance __dict__ preserves attribute assignment order
class MyClass:
    def __init__(self):
        self.alligator = 'hatchling'
        self.elephant = 'calf'

a = MyClass()
for key, value in a.__dict__.items():
    print(f'{key} = {value}')

alligator = hatchling
elephant = calf


### The Problem: Duck Typing and Custom Dict-Like Classes

In [124]:
# Example: Voting system
votes = {
    'otter': 1281,
    'polar bear': 587,
    'fox': 863,
}

def populate_ranks(votes, ranks):
    names = list(votes.keys())
    names.sort(key=votes.get, reverse=True)
    for i, name in enumerate(names, 1):
        ranks[name] = i

def get_winner(ranks):
    return next(iter(ranks))

ranks = {}
populate_ranks(votes, ranks)
print('Ranks:', ranks)
winner = get_winner(ranks)
print('Winner:', winner)

Ranks: {'otter': 1, 'fox': 2, 'polar bear': 3}
Winner: otter


In [None]:
# Custom dict-like class with different iteration order
from collections.abc import MutableMapping

class SortedDict(MutableMapping):
    def __init__(self):
        self.data = {}

    def __getitem__(self, key):
        return self.data[key]

    def __setitem__(self, key, value):
        self.data[key] = value

    def __delitem__(self, key):
        del self.data[key]

    def __iter__(self):
        keys = list(self.data.keys())
        keys.sort()
        for key in keys:
            yield key

    def __len__(self):
        return len(self.data)

sorted_ranks = SortedDict()
populate_ranks(votes, sorted_ranks)
print('Sorted ranks data:', sorted_ranks.data)
winner = get_winner(sorted_ranks)
print('Winner (WRONG!):', winner)  # Returns 'fox' instead of 'otter'!

Sorted ranks data: {'otter': 1, 'fox': 2, 'polar bear': 3}
Winner (WRONG!): fox


### Solution 1: Conservative Implementation

In [126]:
# Don't assume insertion order
def get_winner(ranks):
    for name, rank in ranks.items():
        if rank == 1:
            return name

winner = get_winner(sorted_ranks)
print('Winner (correct):', winner)

Winner (correct): otter


### Solution 2: Runtime Type Checking

In [127]:
# Enforce dict type at runtime
def get_winner(ranks):
    if not isinstance(ranks, dict):
        raise TypeError('must provide a dict instance')
    return next(iter(ranks))

try:
    winner = get_winner(sorted_ranks)
except TypeError as e:
    print(f'TypeError: {e}')

TypeError: must provide a dict instance


### Solution 3: Type Annotations

In [128]:
# Use type hints for static analysis
from typing import Dict

def populate_ranks(votes: Dict[str, int], ranks: Dict[str, int]) -> None:
    names = list(votes.keys())
    names.sort(key=votes.get, reverse=True)
    for i, name in enumerate(names, 1):
        ranks[name] = i

def get_winner(ranks: Dict[str, int]) -> str:
    return next(iter(ranks))

# Type checkers like mypy would catch the error
# sorted_ranks = SortedDict()  # Type error!
# populate_ranks(votes, sorted_ranks)

### Key Takeaways

- Since Python 3.7, dict preserves insertion order
- Can't assume dict-like objects preserve insertion order
- Three solutions:
  1. Write code that doesn't rely on insertion order
  2. Check for dict type at runtime
  3. Use type annotations and static analysis

---

## Item 16: Prefer get Over in and KeyError to Handle Missing Dictionary Keys

### The Problem: Accessing Missing Keys

In [129]:
# Dictionary of vote counters
counters = {
    'pumpernickel': 2,
    'sourdough': 1,
}

### Approach 1: Using `in` Expression

In [130]:
# Check with 'in' - verbose
key = 'wheat'

if key in counters:
    count = counters[key]
else:
    count = 0

counters[key] = count + 1
print('After vote:', counters)

After vote: {'pumpernickel': 2, 'sourdough': 1, 'wheat': 1}


### Approach 2: Using KeyError Exception

In [131]:
# Use exception handling - only one key lookup when the key exists
key = 'brioche'

try:
    count = counters[key]
except KeyError:
    count = 0

counters[key] = count + 1
print('After vote:', counters)

After vote: {'pumpernickel': 2, 'sourdough': 1, 'wheat': 1, 'brioche': 1}


### Approach 3: Using get Method (BEST)

In [132]:
# Best approach - clean and efficient
key = 'multigrain'

count = counters.get(key, 0)
counters[key] = count + 1
print('After vote:', counters)

After vote: {'pumpernickel': 2, 'sourdough': 1, 'wheat': 1, 'brioche': 1, 'multigrain': 1}


### Comparison of Approaches

| Method | Key Lookups | Assignments | Lines of Code (including the update) | Readability |
|--------|-------------|-------------|--------------------------------------|-------------|
| `in` expression | 2 | 1 | 5 | Medium |
| `KeyError` | 1 | 1 | 5 | Low |
| `get()` | 1 | 1 | 2 | **High** |

### Complex Values: Lists

In [133]:
# Dictionary with list values
votes = {
    'baguette': ['Bob', 'Alice'],
    'ciabatta': ['Coco', 'Deb'],
}

key = 'brioche'
who = 'Elmer'

In [134]:
# Using 'in' expression
if key in votes:
    names = votes[key]
else:
    votes[key] = names = []

names.append(who)
print('Votes:', votes)

Votes: {'baguette': ['Bob', 'Alice'], 'ciabatta': ['Coco', 'Deb'], 'brioche': ['Elmer']}


In [135]:
# Using KeyError exception
key = 'pumpernickel'
who = 'Frank'

try:
    names = votes[key]
except KeyError:
    votes[key] = names = []

names.append(who)
print('Votes:', votes)

Votes: {'baguette': ['Bob', 'Alice'], 'ciabatta': ['Coco', 'Deb'], 'brioche': ['Elmer'], 'pumpernickel': ['Frank']}


In [136]:
# Using get with assignment expression (Python 3.8+)
key = 'sourdough'
who = 'Grace'

if (names := votes.get(key)) is None:
    votes[key] = names = []

names.append(who)
print('Votes:', votes)

Votes: {'baguette': ['Bob', 'Alice'], 'ciabatta': ['Coco', 'Deb'], 'brioche': ['Elmer'], 'pumpernickel': ['Frank'], 'sourdough': ['Grace']}


### The setdefault Method

In [137]:
# Using setdefault - shorter but confusing name
key = 'rye'
who = 'Henry'

names = votes.setdefault(key, [])
names.append(who)
print('Votes:', votes)

Votes: {'baguette': ['Bob', 'Alice'], 'ciabatta': ['Coco', 'Deb'], 'brioche': ['Elmer'], 'pumpernickel': ['Frank'], 'sourdough': ['Grace'], 'rye': ['Henry']}


### Problem with setdefault

In [138]:
# setdefault assigns the default directly (not a copy)
data = {}
key = 'foo'
value = []

data.setdefault(key, value)
print('Before:', data)

value.append('hello')
print('After: ', data)  # Shared reference!

Before: {'foo': []}
After:  {'foo': ['hello']}
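
A simple way to avoid the shared reference, assuming the default really has to come from an existing object, is to pass a copy (or a freshly constructed value) to `setdefault`. The `template` name below is hypothetical.

```python
data = {}
template = []

# Passing a copy keeps the stored default independent of `template`.
data.setdefault('foo', list(template))

template.append('hello')
print(data)  # {'foo': []}
```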


### When NOT to Use setdefault

In [139]:
# For counters, setdefault is wasteful
counters = {'pumpernickel': 2, 'sourdough': 1}
key = 'wheat'

# setdefault does unnecessary assignment
count = counters.setdefault(key, 0)
counters[key] = count + 1

# get is better - only one assignment needed
count = counters.get(key, 0)
counters[key] = count + 1
print('Counters:', counters)

Counters: {'pumpernickel': 2, 'sourdough': 1, 'wheat': 2}


### Key Takeaways

- Four ways to handle missing keys: `in`, `KeyError`, `get`, `setdefault`
- `get` is best for simple types like counters
- `get` with assignment expressions works well for complex types
- `setdefault` is rarely the best choice
- Consider `defaultdict` instead (next item)

---

## Item 17: Prefer defaultdict Over setdefault to Handle Missing Items in Internal State

### The Problem Revisited

In [140]:
# Tracking cities visited per country
visits = {
    'Mexico': {'Tulum', 'Puerto Vallarta'},
    'Japan': {'Hakone'},
}

In [141]:
# Using setdefault
visits.setdefault('France', set()).add('Arles')
print('Visits:', visits)

Visits: {'Mexico': {'Puerto Vallarta', 'Tulum'}, 'Japan': {'Hakone'}, 'France': {'Arles'}}


### Wrapping in a Class

In [None]:
# Class using setdefault
class Visits:
    def __init__(self):
        self.data = {}

    def add(self, country, city):
        city_set = self.data.setdefault(country, set())
        city_set.add(city)

visits = Visits()
visits.add('Russia', 'Yekaterinburg')
visits.add('Tanzania', 'Zanzibar')
print('Visits:', visits.data)

Visits: {'Russia': {'Yekaterinburg'}, 'Tanzania': {'Zanzibar'}}


### Problems with setdefault

1. Confusing method name
2. Constructs a new `set()` on every call, even when the key already exists (inefficient)
3. Not clear what it does
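
The second point can be checked by counting how often the default value gets built. In this sketch `CountingFactory` is a hypothetical helper, not part of the original example. It compares `setdefault` with `defaultdict` from the `collections` module.

```python
from collections import defaultdict

class CountingFactory:
    """Hypothetical factory that records how many sets it builds."""
    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return set()

cities = ['Bath', 'London', 'Leeds']

# setdefault: the default argument is evaluated before every call.
eager = CountingFactory()
plain = {}
for city in cities:
    plain.setdefault('England', eager()).add(city)

# defaultdict: the factory runs only when the key is missing.
lazy = CountingFactory()
auto = defaultdict(lazy)
for city in cities:
    auto['England'].add(city)

print(eager.calls, lazy.calls)  # 3 1
```

Two of the three sets built for `setdefault` are discarded immediately, which is the waste the list above refers to.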

### Solution: defaultdict

In [None]:
from collections import defaultdict

class Visits:
    def __init__(self):
        self.data = defaultdict(set)

    def add(self, country, city):
        self.data[country].add(city)

visits = Visits()
visits.add('England', 'Bath')
visits.add('England', 'London')
print('Visits:', visits.data)

Visits: defaultdict(<class 'set'>, {'England': {'London', 'Bath'}})


### Benefits of defaultdict

| Feature | setdefault | defaultdict |
|---------|-----------|-------------|
| Clarity | Poor | Good |
| Efficiency | Creates default every time | Creates only when needed |
| Code length | Longer | Shorter |
| Intent | Unclear | Clear |

### How defaultdict Works

In [144]:
# defaultdict calls the factory function when key is missing
from collections import defaultdict

# With lists
list_dict = defaultdict(list)
list_dict['key'].append('value')
print('List defaultdict:', dict(list_dict))

# With ints (useful for counters)
counter = defaultdict(int)
counter['wheat'] += 1
counter['wheat'] += 1
counter['rye'] += 1
print('Counter defaultdict:', dict(counter))

List defaultdict: {'key': ['value']}
Counter defaultdict: {'wheat': 2, 'rye': 1}
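
One consequence of this design is that merely reading a missing key with subscript syntax inserts it. The sketch below contrasts that with `get()` and `in`, which leave the dictionary unchanged.

```python
from collections import defaultdict

counter = defaultdict(int)
counter['wheat'] += 1

# get() and `in` do not trigger the factory, so nothing is inserted.
looked_up = counter.get('rye')
was_present = 'rye' in counter
print(looked_up, was_present)  # None False

# A plain subscript read inserts the missing key with the default value.
value = counter['rye']
print(dict(counter))  # {'wheat': 1, 'rye': 0}
```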


### Custom Default Factory

In [145]:
# Using a custom factory function
def default_list():
    print('Creating new list')
    return []

custom_dict = defaultdict(default_list)
custom_dict['key1'].append('value1')
custom_dict['key1'].append('value2')
custom_dict['key2'].append('value3')
print('Custom defaultdict:', dict(custom_dict))

Creating new list
Creating new list
Custom defaultdict: {'key1': ['value1', 'value2'], 'key2': ['value3']}


### Complete Example: Vote Tracking

In [None]:
from collections import defaultdict

class VoteTracker:
    def __init__(self):
        self.votes = defaultdict(list)

    def add_vote(self, bread_type, voter_name):
        self.votes[bread_type].append(voter_name)

    def get_results(self):
        results = {}
        for bread, voters in self.votes.items():
            results[bread] = len(voters)
        return results

tracker = VoteTracker()
tracker.add_vote('sourdough', 'Alice')
tracker.add_vote('sourdough', 'Bob')
tracker.add_vote('rye', 'Charlie')
tracker.add_vote('sourdough', 'Diana')

print('All votes:', dict(tracker.votes))
print('Vote counts:', tracker.get_results())

All votes: {'sourdough': ['Alice', 'Bob', 'Diana'], 'rye': ['Charlie']}
Vote counts: {'sourdough': 3, 'rye': 1}


### Key Takeaways

- Use `defaultdict` for managing internal state with arbitrary keys
- `defaultdict` is cleaner and more efficient than `setdefault`
- Still use `get` for dictionaries you don't control
- `setdefault` is rarely the best choice

---

## Item 18: Know How to Construct Key-Dependent Default Values with __missing__

### The Limitation of setdefault and defaultdict

In [147]:
# Managing file handles for profile pictures
pictures = {}
path = 'profile_1234.png'

### Approach 1: Using get with Assignment Expression

In [148]:
# Best approach without __missing__

if (handle := pictures.get(path)) is None:
    try:
        handle = open(path, 'a+b')
    except OSError:
        print(f'Failed to open path {path}')
        raise
    else:
        pictures[path] = handle

# Use the handle
handle.seek(0)
image_data = handle.read()
print(f'Read {len(image_data)} bytes')

# Cleanup
handle.close()
if path in pictures:
    del pictures[path]

Read 0 bytes


### Why setdefault Doesn't Work

In [149]:
# Problems with setdefault:
# 1. Always calls open() even if key exists
# 2. Exception handling is complicated
# 3. Can't separate exception sources

try:
    handle = pictures.setdefault(path, open(path, 'a+b'))
except OSError:
    print(f'Failed to open path {path}')
    raise
else:
    handle.seek(0)
    image_data = handle.read()
    print(f'Read {len(image_data)} bytes')
    handle.close()

# Cleanup
if path in pictures:
    del pictures[path]

Read 0 bytes


### Why defaultdict Doesn't Work

In [150]:
# defaultdict factory can't receive the key as parameter
from collections import defaultdict

def open_picture(profile_path):
    try:
        return open(profile_path, 'a+b')
    except OSError:
        print(f'Failed to open path {profile_path}')
        raise

# This won't work - defaultdict calls its factory with no arguments
try:
    pictures = defaultdict(open_picture)
    handle = pictures[path]
except TypeError as e:
    print(f'TypeError: {e}')

TypeError: open_picture() missing 1 required positional argument: 'profile_path'


### Solution: Custom dict with __missing__

In [151]:
class Pictures(dict):
    def __missing__(self, key):
        value = open_picture(key)
        self[key] = value
        return value

pictures = Pictures()
handle = pictures[path]
handle.seek(0)
image_data = handle.read()
print(f'Read {len(image_data)} bytes')
handle.close()

# Cleanup
if path in pictures:
    del pictures[path]

Read 0 bytes


### How __missing__ Works

In [152]:
class DemoDict(dict):
    def __missing__(self, key):
        print(f'__missing__ called with key: {key}')
        value = f'default_{key}'
        self[key] = value
        return value

demo = DemoDict()
print('First access:')
result1 = demo['key1']
print(f'Result: {result1}')

print('\nSecond access (no __missing__ call):')
result2 = demo['key1']
print(f'Result: {result2}')

print('\nDictionary contents:', dict(demo))

First access:
__missing__ called with key: key1
Result: default_key1

Second access (no __missing__ call):
Result: default_key1

Dictionary contents: {'key1': 'default_key1'}
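
It may help to see which operations trigger `__missing__`. In the sketch below, which redefines a quieter `DemoDict` so that it runs on its own, only subscript access calls the hook; `get()` and `in` bypass it.

```python
class DemoDict(dict):
    def __missing__(self, key):
        value = f'default_{key}'
        self[key] = value
        return value

demo = DemoDict()

via_get = demo.get('key1')       # get() bypasses __missing__
present = 'key1' in demo         # membership tests bypass it too
print(via_get, present)          # None False

via_subscript = demo['key1']     # subscript access calls __missing__
print(via_subscript)             # default_key1
print(dict(demo))                # {'key1': 'default_key1'}
```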


### Practical Example: Path-Dependent Configuration

In [None]:
import json

class ConfigDict(dict):
    def __init__(self, config_dir):
        super().__init__()
        self.config_dir = config_dir

    def __missing__(self, key):
        # Load configuration based on key
        config_path = f'{self.config_dir}/{key}.json'
        try:
            with open(config_path, 'r') as f:
                value = json.load(f)
        except (FileNotFoundError, json.JSONDecodeError):
            # Return default empty config
            value = {}

        self[key] = value
        return value

# Usage example (won't work without actual config files)
# config = ConfigDict('./configs')
# db_config = config['database']  # Loads ./configs/database.json
# api_config = config['api']      # Loads ./configs/api.json
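
To try the idea without real configuration files, a temporary directory can stand in for `./configs`. This is a self-contained sketch. The class is restated with `pathlib`, and the `database.json` contents are invented purely for illustration.

```python
import json
import tempfile
from pathlib import Path

class ConfigDict(dict):
    def __init__(self, config_dir):
        super().__init__()
        self.config_dir = Path(config_dir)

    def __missing__(self, key):
        # Load <config_dir>/<key>.json, falling back to an empty config.
        try:
            value = json.loads((self.config_dir / f'{key}.json').read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            value = {}
        self[key] = value
        return value

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical config file created only for this demonstration.
    (Path(tmp) / 'database.json').write_text('{"host": "localhost", "port": 5432}')

    config = ConfigDict(tmp)
    db_config = config['database']   # Loaded from database.json
    api_config = config['api']       # No api.json, so the default {} is cached

print(db_config)       # {'host': 'localhost', 'port': 5432}
print(api_config)      # {}
print(sorted(config))  # ['api', 'database']
```

Both results are cached in the dictionary, so later lookups for the same key do not touch the file system again.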

### Comparison Table

| Method | Key-Dependent Default | Handles Exceptions | Code Clarity |
|--------|----------------------|-------------------|-------------|
| `setdefault` | ❌ | ❌ | Poor |
| `defaultdict` | ❌ | ❌ | Good |
| `get` + assignment expr | ✅ | ✅ | Good |
| `__missing__` | ✅ | ✅ | **Excellent** |

### Another Example: Lazy Database Connections

In [None]:
class DatabasePool(dict):
    """Lazy connection pool that creates connections on demand"""

    def __missing__(self, db_name):
        print(f'Creating connection to {db_name}')
        # Simulate connection creation
        connection = f'Connection to {db_name}'
        self[db_name] = connection
        return connection

pool = DatabasePool()

# Connections created only when needed
conn1 = pool['users_db']
print(f'Got: {conn1}')

conn2 = pool['products_db']
print(f'Got: {conn2}')

# Reusing existing connection
conn3 = pool['users_db']
print(f'Reused: {conn3}')

Creating connection to users_db
Got: Connection to users_db
Creating connection to products_db
Got: Connection to products_db
Reused: Connection to users_db


### Key Takeaways

- `setdefault` is bad when default value is expensive or can raise exceptions
- `defaultdict` can't make default value depend on the key
- Use `__missing__` method to create key-dependent default values
- `__missing__` is called only when key doesn't exist
- Perfect for lazy initialization and resource management

---

## Chapter Summary

### Lists and Sequences
- Master slicing syntax for clean, Pythonic code
- Avoid complex stride expressions
- Use starred expressions for catch-all unpacking

### Sorting
- Use `key` parameter for custom sort behavior
- Leverage tuple comparison for multi-criteria sorts
- Understand Python's stable sort algorithm

### Dictionaries
- Rely on insertion order only for the built-in `dict` (guaranteed since Python 3.7)
- Be cautious with dict-like custom classes
- Prefer `get()` over `in` and `KeyError`
- Use `defaultdict` for internal state management
- Implement `__missing__` for key-dependent defaults

### Best Practices Hierarchy
1. **Most cases**: Use `get()` method
2. **Internal state**: Use `defaultdict`
3. **Key-dependent defaults**: Implement `__missing__`
4. **Rarely**: Use `setdefault`