# Associative Arrays and Dictionaries (advanced, but not too much)

This builds on the basics and adds a few power moves you'll use a lot:
- safer lookups (`get`, `setdefault`),
- elegant construction (dict comprehensions),
- iteration patterns (over keys/values/items),
- merging and unpacking (`|`, `|=`, `**`),
- safe deletion (`pop`, `popitem`),
- handy batteries (`collections.Counter`, `defaultdict`),
- sorting, pretty printing, and copying,
- notes on hashable keys, plus a few mini-exercises.

## Safer lookups: `get`, default values, and `setdefault`

In [1]:
person = {
    'first_name': 'Eric',
    'last_name': 'Idle',
    'year_born': 1943
}

# .get avoids KeyError and lets you provide a fallback
month = person.get('month_born', 'UNKNOWN')
print('month:', month)

# .setdefault returns existing value or inserts-and-returns default
fav_color = person.setdefault('favorite_color', 'blue')
print('favorite_color:', fav_color)
print(person)  # notice the key got added

month: UNKNOWN
favorite_color: blue
{'first_name': 'Eric', 'last_name': 'Idle', 'year_born': 1943, 'favorite_color': 'blue'}
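
Two details of these lookups are easy to miss, so here is a small sketch of them. The `middle_name` and `nickname` keys are made up for illustration. It shows that `.get` only falls back when the key is absent (a stored `None` is returned as-is), and that `.get` never inserts the key while `.setdefault` does.

```python
person = {'first_name': 'Eric', 'middle_name': None}

# The key exists, so .get returns the stored None, not the fallback
middle = person.get('middle_name', 'UNKNOWN')
print(middle)  # None

# One common idiom to treat None like "missing"
# (careful: `or` also replaces other falsy values such as 0 or '')
middle_or_default = person.get('middle_name') or 'UNKNOWN'
print(middle_or_default)  # UNKNOWN

# .get never mutates the dict ...
person.get('nickname', 'n/a')
present_after_get = 'nickname' in person
print(present_after_get)  # False

# ... while .setdefault inserts the default when the key is absent
person.setdefault('nickname', 'n/a')
present_after_setdefault = 'nickname' in person
print(present_after_setdefault)  # True
```

Note also that the default passed to `.setdefault` is evaluated before the call, whether or not the key already exists.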


## Constructing dictionaries cleanly (comprehensions)

In [2]:
words = ['python', 'rocks', 'spam', 'eggs']

# Map each word to its length
lengths = {w: len(w) for w in words}
print(lengths)

# Filter while building: keep only words with length >= 5
long_only = {w: len(w) for w in words if len(w) >= 5}
print(long_only)

# Build from pairs
pairs = [('a', 1), ('b', 2), ('c', 3)]
d_from_pairs = {k: v for k, v in pairs}
print(d_from_pairs)

{'python': 6, 'rocks': 5, 'spam': 4, 'eggs': 4}
{'python': 6, 'rocks': 5}
{'a': 1, 'b': 2, 'c': 3}
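
As a complement to comprehensions, the sketch below shows two other constructors, `dict(zip(...))` and `dict.fromkeys`, and a pitfall with the latter when the default is mutable. The variable names are illustrative only.

```python
keys = ['a', 'b', 'c']
values = [1, 2, 3]

# Pair up two parallel sequences
zipped = dict(zip(keys, values))
print(zipped)  # {'a': 1, 'b': 2, 'c': 3}

# dict.fromkeys reuses ONE default object for every key
shared = dict.fromkeys(keys, [])
shared['a'].append(1)
print(shared)  # {'a': [1], 'b': [1], 'c': [1]}

# A comprehension evaluates the value expression once per key,
# so each key gets its own list
separate = {k: [] for k in keys}
separate['a'].append(1)
print(separate)  # {'a': [1], 'b': [], 'c': []}
```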


## Iteration patterns: keys, values, items (and unpacking during loops)

In [3]:
capitals = {'FR': 'Paris', 'IT': 'Rome', 'ES': 'Madrid'}

print('keys:')
for code in capitals:  # same as capitals.keys()
    print(code)

print('\nvalues:')
for city in capitals.values():
    print(city)

print('\nitems (unpack into k, v):')
for code, city in capitals.items():
    print(f"{code} -> {city}")

# Dict views are dynamic: they reflect mutations
items_view = capitals.items()
capitals['DE'] = 'Berlin'
print('\nnow items_view includes DE:', list(items_view))

keys:
FR
IT
ES

values:
Paris
Rome
Madrid

items (unpack into k, v):
FR -> Paris
IT -> Rome
ES -> Madrid

now items_view includes DE: [('FR', 'Paris'), ('IT', 'Rome'), ('ES', 'Madrid'), ('DE', 'Berlin')]
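
Because views are live, changing a dict's size while looping over it directly fails. The following is a minimal sketch of that failure, the usual workaround (looping over a snapshot made with `list(...)`), and the set operations that key views support. `scratch` and `wanted` are names invented for this example.

```python
capitals = {'FR': 'Paris', 'IT': 'Rome', 'ES': 'Madrid', 'DE': 'Berlin'}

# Changing the size of a dict while iterating over it raises RuntimeError.
scratch = dict(capitals)  # throwaway copy so the failed loop leaves capitals intact
error_message = None
try:
    for code in scratch:
        del scratch[code]
except RuntimeError as exc:
    error_message = str(exc)
print('error:', error_message)

# Iterating over a snapshot of the keys makes deletion safe.
for code in list(capitals):
    if capitals[code].startswith('R'):
        del capitals[code]
print(capitals)  # {'FR': 'Paris', 'ES': 'Madrid', 'DE': 'Berlin'}

# Key views behave like sets, so intersections work directly.
wanted = {'FR', 'US'}
common = capitals.keys() & wanted
print(common)  # {'FR'}
```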


## Merging and unpacking dictionaries: `|`, `|=`, and `**`

In [4]:
base = {'a': 1, 'b': 2}
extra = {'b': 20, 'c': 3}

# Python 3.9+: union operator creates a new dict
merged = base | extra   # right-hand side wins on conflicts
print('merged:', merged)

# In-place union
base |= {'d': 4}
print('base after |=:', base)

# Double-star unpacking when building literals
cfg_default = {'timeout': 10, 'retries': 3}
cfg_env = {'timeout': 20}
cfg_final = {**cfg_default, **cfg_env, 'verbose': True}
print('cfg_final:', cfg_final)

merged: {'a': 1, 'b': 20, 'c': 3}
base after |=: {'a': 1, 'b': 2, 'd': 4}
cfg_final: {'timeout': 20, 'retries': 3, 'verbose': True}
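
Both `|` and `**` merge only the top level, which matters once values are themselves dicts. The sketch below, using made-up `defaults`/`overrides` data and a hypothetical `connect` function, shows the shallow behavior, one way to merge a single nested level by hand, and `**` unpacking in a function call. It assumes Python 3.9+ for `|`.

```python
defaults = {'timeout': 10, 'log': {'level': 'INFO', 'file': 'app.log'}}
overrides = {'log': {'level': 'DEBUG'}}

# The whole nested 'log' dict is replaced, so 'file' is lost
merged = defaults | overrides
print(merged['log'])  # {'level': 'DEBUG'}

# Merge the nested level explicitly to keep 'file'
merged_nested = {**defaults, 'log': defaults['log'] | overrides['log']}
print(merged_nested['log'])  # {'level': 'DEBUG', 'file': 'app.log'}

# ** also unpacks a dict into keyword arguments
def connect(host, timeout=5, retries=1):  # hypothetical function for illustration
    return f"{host} timeout={timeout} retries={retries}"

cfg = {'timeout': 20, 'retries': 3}
result = connect('example.org', **cfg)
print(result)  # example.org timeout=20 retries=3
```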


## Deleting safely: `pop`, `pop` with default, and `popitem` (LIFO)

In [5]:
stock = {'AAPL': 3, 'MSFT': 5}
qty = stock.pop('AAPL')  # returns removed value
print('removed AAPL:', qty)

# Provide a default to avoid KeyError
missing = stock.pop('GOOG', 0)
print('removed GOOG (defaulted):', missing)

# popitem removes and returns the most recently inserted pair
# (LIFO order is guaranteed since Python 3.7; earlier versions removed an arbitrary pair)
ticker, qty = stock.popitem()
print('popped pair:', ticker, qty)
print('stock now:', stock)

removed AAPL: 3
removed GOOG (defaulted): 0
popped pair: MSFT 5
stock now: {}
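
For contrast with `pop`, here is a short sketch of the cases that do raise: `del` on a missing key and `popitem` on an empty dict. It also drains a dict in LIFO order. The extra tickers are made up for the example.

```python
stock = {'AAPL': 3, 'MSFT': 5, 'GOOG': 2}

# del removes a key but returns nothing and has no default
del stock['AAPL']
del_failed = False
try:
    del stock['TSLA']
except KeyError:
    del_failed = True
print('del on a missing key raised KeyError:', del_failed)

# Drain the dict like a stack: last inserted comes out first
drained = []
while stock:
    drained.append(stock.popitem())
print(drained)  # [('GOOG', 2), ('MSFT', 5)]

# popitem has no default either
popitem_failed = False
try:
    stock.popitem()
except KeyError:
    popitem_failed = True
print('popitem on an empty dict raised KeyError:', popitem_failed)
```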


## Counting and grouping with `collections` (`Counter`, `defaultdict`)

In [6]:
from collections import Counter, defaultdict

letters = "abracadabra"
counts = Counter(letters)
print('Counter:', counts)
print('most common 2:', counts.most_common(2))

# defaultdict for convenient grouping/aggregation
words = ['apple', 'art', 'bee', 'bat', 'car']
by_initial = defaultdict(list)
for w in words:
    by_initial[w[0]].append(w)
print('grouped:', dict(by_initial))  # cast to regular dict for display

Counter: Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
most common 2: [('a', 5), ('b', 2)]
grouped: {'a': ['apple', 'art'], 'b': ['bee', 'bat'], 'c': ['car']}
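
A few more behaviors of these two classes are worth seeing side by side. The sketch below uses invented sample data to show `Counter` arithmetic, the fact that a `Counter` reports 0 for unseen keys without storing them, and that a `defaultdict` does store a key as soon as it is read.

```python
from collections import Counter, defaultdict

a = Counter('abracadabra')
b = Counter('cadabra')

# Missing keys count as 0 and are NOT inserted
print(a['z'])      # 0
print('z' in a)    # False

# Subtraction keeps only positive counts; addition sums them
diff = a - b
total = a + b
print(diff)        # Counter({'a': 2, 'b': 1, 'r': 1})
print(total['a'])  # 8

# defaultdict(int) for running totals
sales = [('apple', 3), ('pear', 1), ('apple', 2)]
totals = defaultdict(int)
for name, qty in sales:
    totals[name] += qty
print(dict(totals))  # {'apple': 5, 'pear': 1}

# Reading a missing key from a defaultdict inserts it
_ = totals['kiwi']
print('kiwi' in totals)  # True
```

If that auto-insertion is unwanted, `totals.get('kiwi', 0)` reads without adding the key.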


## Sorting dictionaries by key or by value (produces lists of pairs)

In [7]:
prices = {'banana': 2.5, 'apple': 1.2, 'pear': 2.1}

# Sort by key (alphabetical)
by_key = sorted(prices.items())
print('by key:', by_key)

# Sort by value (ascending)
by_value = sorted(prices.items(), key=lambda kv: kv[1])
print('by value:', by_value)

# Sort by value, breaking ties by key (tuples compare element by element)
by_value_then_key = sorted(prices.items(), key=lambda kv: (kv[1], kv[0]))
print('by value then key:', by_value_then_key)

by key: [('apple', 1.2), ('banana', 2.5), ('pear', 2.1)]
by value: [('apple', 1.2), ('pear', 2.1), ('banana', 2.5)]
by value then key: [('apple', 1.2), ('pear', 2.1), ('banana', 2.5)]
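
Since dicts preserve insertion order (Python 3.7+), a sorted list of pairs can be turned back into a dict. This sketch reuses the `prices` data above to show a descending sort rebuilt as a dict and, when only the extreme entry is needed, `min`/`max` with `key=prices.get`.

```python
prices = {'banana': 2.5, 'apple': 1.2, 'pear': 2.1}

# Sort by value, highest first, and rebuild a dict in that order
by_value_desc = dict(sorted(prices.items(), key=lambda kv: kv[1], reverse=True))
print(by_value_desc)  # {'banana': 2.5, 'pear': 2.1, 'apple': 1.2}

# No full sort needed to find a single extreme
cheapest = min(prices, key=prices.get)
priciest = max(prices, key=prices.get)
print(cheapest, priciest)  # apple banana
```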


## Pretty printing nested dictionaries (quick debugging), plus a copy reminder

In [8]:
from pprint import pprint

user = {
    'id': 101,
    'name': {'first': 'Terry', 'last': 'Gilliam'},
    'roles': ['writer', 'director'],
    'prefs': {'theme': 'dark', 'notifications': {'email': True, 'sms': False}}
}

pprint(user)

# shallow vs deep copy reminder (dicts hold references)
import copy
shallow = user.copy()
deep = copy.deepcopy(user)
user['prefs']['theme'] = 'light'
print('\nAfter mutating original nested key:')
print("shallow['prefs']['theme'] ->", shallow['prefs']['theme'])  # 'light': the shallow copy shares the nested dict
print("deep['prefs']['theme']    ->", deep['prefs']['theme'])     # 'dark': the deep copy is independent

{'id': 101,
 'name': {'first': 'Terry', 'last': 'Gilliam'},
 'prefs': {'notifications': {'email': True, 'sms': False}, 'theme': 'dark'},
 'roles': ['writer', 'director']}

After mutating original nested key:
shallow['prefs']['theme'] -> light
deep['prefs']['theme']    -> dark
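
`pprint` sorts keys by default, which is why `prefs` appears before `roles` in the output above. The sketch below shows a few options that may help when debugging: `sort_dicts=False` (Python 3.8+) to keep insertion order, `depth` to collapse deep nesting, and `pformat` to get the string instead of printing it.

```python
from pprint import pformat, pprint

user = {
    'id': 101,
    'name': {'first': 'Terry', 'last': 'Gilliam'},
    'roles': ['writer', 'director'],
    'prefs': {'theme': 'dark', 'notifications': {'email': True, 'sms': False}}
}

# Keep insertion order instead of sorting keys
pprint(user, sort_dicts=False, width=60)

# Collapse everything below the first level into "..."
pprint(user, depth=1)

# pformat returns the formatted text, e.g. for logging
unsorted_text = pformat(user, sort_dicts=False)
collapsed_text = pformat(user, depth=1)
```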


## Keys must be hashable: tuples OK, lists not; `frozenset` can help

In [9]:
# Tuples are hashable if all their contents are hashable
grid = {(0, 0): 'origin', (1, 0): 'x', (0, 1): 'y'}
print(grid[(1, 0)])

# Using a frozenset when the logical key is a set (order-insensitive)
edges = {}
a, b = 'A', 'B'
edges[frozenset({a, b})] = 1.0   # same key as frozenset({b, a})
print('edge AB weight:', edges[frozenset({'B', 'A'})])

x
edge AB weight: 1.0
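
To see the failure mode named in the heading, this sketch tries a list as a key and a tuple that contains a list. Both raise `TypeError`. It then shows the usual fix, converting to a tuple. The exact error text varies between Python versions, so it is not reproduced here.

```python
# A list is mutable and therefore unhashable
list_key_error = None
try:
    bad = {[0, 0]: 'origin'}
except TypeError as exc:
    list_key_error = exc
print('list as key failed:', list_key_error is not None)

# A tuple is only hashable if everything inside it is hashable
nested_key_error = None
try:
    bad = {(0, [1, 2]): 'mixed'}
except TypeError as exc:
    nested_key_error = exc
print('tuple containing a list failed:', nested_key_error is not None)

# Fix: convert the list to a tuple before using it as a key
point = [1, 0]
grid = {tuple(point): 'x'}
print(grid[(1, 0)])  # x
```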


## Practical mini-exercises

### 1) Frequency of words (normalized, case-insensitive) using `Counter` and `casefold()`

In [10]:
import re
from collections import Counter

text = "Spam spam SPAM eggs, bacon; spam!"
tokens = re.findall(r"[\w']+", text.casefold())  # simple tokenization: runs of word characters and apostrophes
freq = Counter(tokens)
print(freq)

Counter({'spam': 4, 'eggs': 1, 'bacon': 1})
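
Why `casefold()` rather than `lower()`? For plain ASCII text they give the same result, but `casefold()` is more aggressive for some non-ASCII letters. A small sketch with the German "ß" (sample words chosen for illustration):

```python
from collections import Counter

words = ['Straße', 'STRASSE', 'strasse']

# lower() leaves 'ß' alone, so the spellings do not all collapse
lowered = Counter(w.lower() for w in words)
print(lowered)  # Counter({'strasse': 2, 'straße': 1})

# casefold() maps 'ß' to 'ss', so all three spellings match
folded = Counter(w.casefold() for w in words)
print(folded)   # Counter({'strasse': 3})
```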


### 2) Invert a dictionary (values must be unique & hashable)

In [11]:
countries = {'FR': 'France', 'IT': 'Italy', 'ES': 'Spain'}
inv = {v: k for k, v in countries.items()}
print(inv)

# If values may repeat, invert to sets of keys
dup = {'a': 1, 'b': 1, 'c': 2}
inv_multi = defaultdict(set)
for k, v in dup.items():
    inv_multi[v].add(k)
print(dict(inv_multi))

{'France': 'FR', 'Italy': 'IT', 'Spain': 'ES'}
{1: {'a', 'b'}, 2: {'c'}}
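
If values repeat and you invert with a plain comprehension anyway, later keys silently overwrite earlier ones. The sketch below shows that data loss and one simple uniqueness check you could run first. The name `has_unique_values` is just illustrative.

```python
dup = {'a': 1, 'b': 1, 'c': 2}

# 'a' is lost: both 'a' and 'b' map to 1, and the last one wins
lossy = {v: k for k, v in dup.items()}
print(lossy)  # {1: 'b', 2: 'c'}

# Check uniqueness before choosing an inversion strategy
has_unique_values = len(set(dup.values())) == len(dup)
print(has_unique_values)  # False
```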


### 3) Safe nested access with chained `.get()` (quick-and-dirty)

In [12]:
payload = {'meta': {'page': 1}, 'data': {'user': {'name': 'Ada'}}}
name = payload.get('data', {}).get('user', {}).get('name')
email = payload.get('data', {}).get('user', {}).get('email', '<none>')
print('name:', name, '| email:', email)

name: Ada | email: <none>
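
The chained form breaks when an intermediate key exists but holds `None` (or any non-dict), because the `{}` fallback only applies to missing keys. As one possible alternative, here is a minimal sketch of a helper. `deep_get` is a hypothetical name, not a standard-library function.

```python
def deep_get(mapping, path, default=None):
    """Follow `path` (a sequence of keys) through nested dicts.

    Returns `default` as soon as a key is missing or a level is not a dict.
    """
    current = mapping
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

broken = {'data': {'user': None}}

# 'user' exists, so .get returns None and the next .get fails
chained_failed = False
try:
    broken.get('data', {}).get('user', {}).get('name')
except AttributeError:
    chained_failed = True
print('chained .get raised AttributeError:', chained_failed)

print(deep_get(broken, ['data', 'user', 'name'], '<none>'))  # <none>

full = {'data': {'user': {'name': 'Ada'}}}
print(deep_get(full, ['data', 'user', 'name']))  # Ada
```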
