# Fluent Python - CH2

## Unpacking Iterables and Sequences

Unpacking items from iterables and sequences is useful because it avoids error-prone indexing. Also, **indexing doesn't work with iterators**, while unpacking works with any iterable.

In [195]:
# parallel assignment in unpacking
lax_cords = (33.934, -118.93902)
lat, long = lax_cords # unpacking
lat

33.934

In [196]:
# also elegant swapping
lat, long = long, lat
lat

-118.93902
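A minimal sketch of the iterator point above. The generator `coords` is just an illustrative stand-in for any iterator: indexing it fails, but unpacking works because unpacking only needs iteration.

```python
# A generator is an iterator: it supports iteration but not indexing.
coords = (c for c in (33.934, -118.93902))

try:
    coords[0]
    indexing_worked = True
except TypeError:
    indexing_worked = False

# Unpacking only needs iteration, so it works on the same object.
lat, lon = coords
print(indexing_worked, lat, lon)
```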

### Using * To Grab Excess Items 

Defining function parameters with `*args` to grab arbitrary excess arguments is a classic Python feature.

In Python 3, this idea was extended to apply to parallel assignment too.

In [197]:
# grabbing excess arguments from range - turns to list
a, b, *rest = range(5)
a, b, rest

(0, 1, [2, 3, 4])

In [198]:
def fun(a, b, c, d, *rest):
    return a, b, c, d, rest

# the first four values fill a, b, c and d;
# everything left over is collected into the tuple `rest`
fun(*[1, 2], 3, *range(4, 7))

(1, 2, 3, 4, (5, 6))

In [199]:
# can even do this to define list, tuple, or set literals
{*range(4), 4, *[5, 6, 7]}

{0, 1, 2, 3, 4, 5, 6, 7}

### Nested Unpacking 
Nested unpacking can be crucial for nested data types.

e.g. `(a, b, (c, d))`

Python handles these targets properly as long as the value has the same nesting structure.

In [200]:
# consider this nested data structure with useful info
metro_areas = [
    ('Tokyo', 'JP', 36.933, (35.689722, 139.691667)),
    ('Delhi NCR', 'IN', 21.935, (28.613889, 77.208889)),
    ('Mexico City', 'MX', 20.142, (19.433333, -99.133333)),
    ('New York-Newark', 'US', 20.104, (40.808611, -74.020386)),
    ('São Paulo', 'BR', 19.649, (-23.547778, -46.635833)),
]

In [201]:
def main():
    for name, _, _, (lat, lon) in metro_areas:
        if lon <= 0:
            print(f"{name:15} | {lat:9.4f} | {lon:9.4f}")

main()

Mexico City     |   19.4333 |  -99.1333
New York-Newark |   40.8086 |  -74.0204
São Paulo       |  -23.5478 |  -46.6358


### Match and Case Syntax and Sequences 

The `match/case` syntax is a more readable alternative to a long `if/elif/else` chain.

The pattern in `case pattern1` can take many forms.

1. `pattern1 = "hello"`: a literal pattern, which matches a subject equal to `"hello"`.
2. `pattern1 = _`: the wildcard pattern, which matches anything and acts as the default case.
3.

In [202]:
def process_data(data):
    match data:
        case int(x) if x > 0:
            print(f"Positive Integer: {x}")

        case str(s) if len(s) > 5:
            print(f"Long String: {s}")

        case [head, *tail]:
            print(f"List with head: {head} and tail: {tail}")
        
        case {"name": name, "age": age}:
            print(f"Name: {name}, age {age}")

        case _:
            print("Unknown data or pattern.")

process_data(10)
process_data("Hello World")
process_data([1, 2, 3])
process_data({"name": "Alice", "age": 10})


Positive Integer: 10
Long String: Hello World
List with head: 1 and tail: [2, 3]
Name: Alice, age 10


As you can see, we can make patterns even more specific by adding type information.

`case [str(name), _, _, (float(lat), float(lon))]`

These **AREN'T** constructor calls; they're a **runtime type check**. If an item fails the check, the pattern simply doesn't match and Python moves on to the next `case`, so no error pops up.

On the surface this looks like a `switch/case` from JavaScript or C, but one key improvement over `switch` is **destructuring**, a more advanced form of unpacking. Let's use our previous `metro_areas` example.

In [203]:
# Using regular if/elif syntax
def main():
    for name, _, _, (lat, lon) in metro_areas:
        if lon <= 0:
            print(f"{name:15} | {lat:9.4f} | {lon:9.4f}")


# Using match/case
def main():
    for record in metro_areas:
        match record:
            # runs only if pattern matches and guard expression is truthy
            case [str(name), _, _, (lat, lon)] if lon <= 0:
                print(f"{name:15} | {lat:9.4f} | {lon:9.4f}")

main()

Mexico City     |   19.4333 |  -99.1333
New York-Newark |   40.8086 |  -74.0204
São Paulo       |  -23.5478 |  -46.6358


**Notice two crucial things.**

1. The subject of this match is `record` -- i.e, each of the tuples in `metro_areas`

2.  A `case` clause has two parts:

    2.1 A pattern

    2.2 An optional guard with the `if` keyword
    

For our case, a sequence pattern matches the subject if:
1. The subject is a sequence, and
2. The subject and the pattern have the same number of items, and
3. Each corresponding item matches, including nested items.

**Warning**

`str`, `bytes`, and `bytearray` are not handled as sequences in the context of `match/case`. A match subject of one of those types is treated as an atomic value, just as the integer 987 is treated as one value and not as a sequence of digits. To treat an object of those types as a sequence subject, convert it in the `match` clause. For example:

In [204]:
phone = "16003001000" # ex. +1 U.S phone number

# list(phone) turns the str into a real sequence of 1-character strings
match list(phone):
    case ["1", *rest]:
        print(*rest)

6 0 0 3 0 0 1 0 0 0


### Inner Depths of Slicing in Python

The Pythonic convention of excluding the last item in slices and ranges works well with the zero-based indexing used in Python and C.

**Slice Objects**

This is no secret, but `s[a:b:c]` can be used to specify a stride or step `c`, causing the slice to skip items. The stride can also be negative, returning items in reverse. The three examples below make this clear.

In [205]:
s = 'bicycle'

# every third letter, starting from the first
print(s[::3])

# negative stride: all letters in reverse order
print(s[::-1])

# every second letter, going backwards from the last
print(s[::-2])

bye
elcycib
eccb


The notation `a:b:c` is only valid within `[]` when used as the indexing or subscript operator, and it produces a slice object: `slice(a, b, c)`.

Later on we'll see that, to evaluate the expression `seq[start:stop:step]`, Python calls `seq.__getitem__(slice(start, stop, step))`. Knowing about slice objects is useful because it lets you assign names to slices, just like spreadsheets allow naming of cell ranges.

In [206]:
slicee = slice(0, 10, 5)
slicee

slice(0, 10, 5)
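Here is a rough sketch of the "named slices" idea. The report line and the column widths (6, 20, 8) are made up for illustration; the point is only that `line[SKU]` reads better than `line[0:6]`.

```python
# Hypothetical fixed-width record: 6 columns of SKU,
# 20 of description, 8 of price.
line = f"{'1909':<6}{'Pimoroni PiBrella':<20}{'$17.50':>8}"

SKU = slice(0, 6)
DESCRIPTION = slice(6, 26)
PRICE = slice(26, None)

print(line[SKU].strip(), "|", line[DESCRIPTION].strip(), "|", line[PRICE].strip())
```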

**Multidimensional Slicing and Ellipsis**

The `[]` operator can also take multiple indexes or slices separated by commas. The `__getitem__` and `__setitem__` special methods that handle the `[]` operator simply receive the indices in `a[i, j]` as a tuple. In other words, to evaluate `a[i, j]`, Python calls `a.__getitem__((i, j))`.

This is used by packages like **NumPy**, where items of a 2D `numpy.ndarray` can be fetched with the syntax:

`a[i, j]` => `i` is the index of the row (axis 0), `j` is the index of the column (axis 1)

However, except for `memoryview`, the built-in sequence types in Python are 1D, so they support only one index or slice, **NOT** a tuple of them. As for `...`, it is an alias for the `Ellipsis` object (an instance, not a class name, btw). NumPy uses it as a shortcut when slicing arrays of many dimensions, and the built-in sequence types don't use it for slicing.
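To see what `[]` actually hands to `__getitem__`, here is a toy class (the name `ShowKey` is made up) that just returns the key it receives.

```python
class ShowKey:
    """Toy class that returns whatever key `[]` passes to `__getitem__`."""
    def __getitem__(self, key):
        return key

s = ShowKey()
print(s[1])          # 1
print(s[1:4:2])      # slice(1, 4, 2)
print(s[1, 2])       # (1, 2)
print(s[1:3, ...])   # (slice(1, 3, None), Ellipsis)
```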

In [207]:
# Exploring some basic NumPy indexing
import numpy as np

# build the array [0, 1, 2, 3, 4], then square each element
a = np.arange(5) ** 2
print(a)

# basic slicing works just like with built-in sequences
print(a[::-1])
print(type(a), a.shape)

[ 0  1  4  9 16]
[16  9  4  1  0]
<class 'numpy.ndarray'> (5,)


**Assigning to Slices**

Mutable sequences can be changed in place using slice notation, either:

1. On the left-hand side of an assignment statement

2. As the target of a `del` statement

In [208]:
l = list(range(10))
print(l)

# replace the three items at indices 2, 3, 4 with two new items;
# the list shrinks by one and the old item 5 stays right after them
l[2:5] = [20, 30]
print(l)

# delete the items at indices 5 and 6
del l[5:7]
print(l)

l[3::2] = [11, 22]
print(l)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 20, 30, 5, 6, 7, 8, 9]
[0, 1, 20, 30, 5, 8, 9]
[0, 1, 20, 11, 5, 22, 9]


Note: when the target of an assignment is a slice, the right-hand side **must** be an iterable object, even if it has just one item.
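A quick sketch of that rule: assigning a bare `int` to a slice raises `TypeError`, while wrapping it in a one-item list works.

```python
l = list(range(10))
error = None

try:
    l[2:5] = 100          # a bare int is not iterable
except TypeError as exc:
    error = str(exc)

l[2:5] = [100]            # a one-item list works
print(error)
print(l)
```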

**Using + and * with Sequences**

It's expected that sequences support `+` and `*`. Usually both operands of `+` must be of the same sequence type, and neither of them is modified; instead, a new sequence of that same type (a new object in memory) is created as the result of the concatenation.

***Original sequences remain unchanged, new sequences made***

To concatenate multiple copies of the same sequence, multiply it by an integer. Again, a new sequence is created and the original stays as it was.

`l = [1, 2, 3]`

`print(l * 5)`

`>>>[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]`

`print(l)`

`>>>[1, 2, 3]`

**Warning** 

Both `+` and `*` always create a new object, and never change their operands.

Beware of expressions like

`a * n`

where `a` is a sequence containing mutable items, because the result may surprise you. For example, trying to init a list of lists with the code below results in a list with 3 references to the same inner list.

In [209]:
# WARNING BAD CODE

my_list = [[]] * 3
my_list

[[], [], []]
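The output above looks harmless, so here is a short check that makes the problem visible: appending to one inner list shows up in all three, because they are the same object.

```python
my_list = [[]] * 3
my_list[0].append("x")   # mutate what looks like the first inner list only
print(my_list)           # all three "rows" changed
print(my_list[0] is my_list[1] is my_list[2])
```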

**Building Lists of Lists**

Sometimes we need to init a list with a certain number of nested lists, for example to distribute students in a list of teams or to represent squares on a game board. The best way of doing so is with a list comprehension.

In [210]:
board = [["_"] * 3 for i in range(3)]

# 3x3 empty game board
print(board)

# the inner expression on its own:
# a list with the entry repeated 3 times
x = ["_"] * 3
print(x)

board[1][2] = "X"
print(board)

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]
['_', '_', '_']
[['_', '_', '_'], ['_', '_', 'X'], ['_', '_', '_']]


1. Created a list of three lists of three items each. Then checked the structure.

2. Placed a mark in row 1, column 2.

The wrong way to do it is shown below.

In [211]:
bad_board = [["_"] * 3] * 3
# while it looks right..
print(bad_board)

# it holds 3 references to the same list, so it's useless.
bad_board[1][2] = "X"

# not what we want..
print(bad_board)

# in essence it behaves like this code..
row = ["_"] * 3
board = []

for i in range(3):
    # same 'row' object is appended to board
    board.append(row)

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]
[['_', '_', 'X'], ['_', '_', 'X'], ['_', '_', 'X']]


On the other hand, the list comprehension behaves like the code below.

1. Each iteration builds a new `row` and appends it to `board`.

2. No shared-reference bugs here.

In [212]:
board = []
for i in range(3):
    # each iteration builds a new row
    row = ["_"] * 3
    board.append(row)

**Augmented Assignment with Sequences** 

The `+=` and `*=` operators behave quite differently depending on the first operand. Whatever applies to `+=` also applies to `*=`, which is implemented via `__imul__`.

The special method that makes `+=` work is `__iadd__` (in-place addition).

However, if `__iadd__` is not implemented, Python falls back to calling `__add__`. Consider this simple example.

`a += b`

If `a` implements `__iadd__`, that method will be called. In the case of mutable sequences (e.g. `list`, `bytearray`, `array.array`), `a` will be changed in place.

Changed in place means the effect is similar to `a.extend(b)`.

If `a` **does NOT** implement `__iadd__`, the statement has the same effect as

`a = a + b`

The expression `a + b` is evaluated first, producing a new object, which is then bound to `a`.

In other words, the identity of the object bound to `a` may or may not change, depending on whether `__iadd__` is implemented.


In general, for mutable sequences it's fine to assume that `__iadd__` is implemented and that `+=` happens in place, much like `a.extend(b)`.

In [213]:
l = [1, 2, 3]
print(id(l))
# *= on a list works in place: similar to l.extend([1, 2, 3])
l *= 2
print(l)
print(id(l))

# now with tuples.

t = (1, 2, 3)
print(id(t))

# notice after multiplication new tuple was created
t *= 2 
print(t)
print(id(t))

4389168960
[1, 2, 3, 1, 2, 3]
4389168960
4631559360
(1, 2, 3, 1, 2, 3)
4391564192


For obvious reasons repeated concatenation of immutable sequences (like `tuple`) is inefficient.

The interpreter has to create a whole new object instead of just modifying the original target sequence.

**Edge Cases with the `+=` operator**

Let's look at some weird outputs based on the `+=` operator.

In [214]:
t = (1, 2, [30, 40])
t[2] += [50, 60]

TypeError: 'tuple' object does not support item assignment

In [None]:
# The item assignment raised an error, but the list inside
# the (immutable) tuple was still modified in place
print(t)

(1, 2, [30, 40, 50, 60])

The lessons from strange behavior like this:

1. Avoid putting mutable items in tuples.

2. Augmented assignment is not an atomic operation: we just saw it throw an exception after doing part of its job.

3. Inspecting Python bytecode isn't too difficult, and it isn't a horrible idea when you want to see what happens under the hood.
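A self-contained version of the experiment, as a sketch. It catches the `TypeError` so the script keeps running, then prints the bytecode for a generic `s[a] += b` with `dis`. The exact opcodes differ between Python versions, so no listing is shown here.

```python
import dis

t = (1, 2, [30, 40])
try:
    t[2] += [50, 60]
except TypeError as exc:
    print(f"TypeError: {exc}")

# the list was extended in place before the failed item assignment
print(t)

# bytecode for the general case; the exact opcodes vary by Python version
dis.dis("s[a] += b")
```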

**list.sort VS the sorted built-in**

The `list.sort` method sorts a list in place, that is, without making a copy. It returns `None` to remind us that it changes the receiver and does **NOT** create a new list.

^ This is a crucial Python API convention: functions or methods that change an object in place **should** return `None` to make it clear to the caller that the receiver was changed.

In contrast, the built-in function `sorted` creates a new list and returns it. It accepts **any iterable object** as an argument, including immutable sequences and generators.

Both, however, take two optional, keyword-only arguments:

1. `reverse`: if `True`, the items are returned in descending order (i.e., from greatest to least). The default is `False`.

**more importantly**

2. `key`: A one-argument function that will be applied to each item to produce its sorting key. For example, when sorting a list of strings, `key=str.lower` can be used to perform a ***case-insensitive sort***, and `key=len` will sort the strings by character length.

^ The default is the identity function (i.e., the items themselves are compared).

***Note:*** You can also use the keyword parameter `key` with the `min()` and `max()` built-ins and with other functions from the standard library (e.g. `itertools.groupby()` and `heapq.nlargest()`).
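A short sketch of `key` outside of sorting, plus the `None` return of `list.sort`. The `fruits` list is just sample data.

```python
import heapq

fruits = ['grape', 'raspberry', 'apple', 'banana']

longest = max(fruits, key=len)
shortest = min(fruits, key=len)          # on ties, the first one found wins
two_longest = heapq.nlargest(2, fruits, key=len)

# list.sort works in place and returns None
result = fruits.sort(key=str.lower)
print(longest, shortest, two_longest, result, fruits)
```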

These are some examples to show the use of the built-in `sorted()` function.

In [None]:
fruits =  ['grape', 'raspberry', 'apple', 'banana']
new_fruits = sorted(fruits)

# the original list is unchanged; sorted() returned a new list
print(fruits)
print(new_fruits)

new_fruits = sorted(fruits, key=len)
# sorted by their length
print(f"Sorted by length: {new_fruits}")

new_fruits = sorted(fruits, key=len, reverse=True)
print(f"Sorted by their length (descending): {new_fruits}")

['grape', 'raspberry', 'apple', 'banana']
['apple', 'banana', 'grape', 'raspberry']
Sorted by length: ['grape', 'apple', 'banana', 'raspberry']
Sorted by their length (descending): ['raspberry', 'banana', 'grape', 'apple']


**When a List is Not the Answer**

The `list` type is flexible and very easy to use, but depending on specific requirements, there are better options.

For example, an `array` saves a lot of memory when you need to handle millions of floating-point values. On the other hand, if you are constantly adding and removing items from opposite ends of a list, it's good to use a `deque` from Python's standard library.

This is basic data-structures material, to be honest.
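For the "opposite ends" case, a minimal `collections.deque` sketch. The `maxlen=5` bound is an arbitrary choice for illustration: when a bounded deque is full, adding at one end discards an item from the other.

```python
from collections import deque

dq = deque(range(5), maxlen=5)
dq.appendleft(-1)     # full deque: 4 is dropped from the right end
dq.append(10)         # now -1 is dropped from the left end
first = dq.popleft()
print(dq, first)
```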

**Arrays**

If a list only contains numbers, an `array.array` is a more efficient replacement. Arrays support all mutable sequence operations (including `.pop`, `.insert`, and `.extend`), as well as additional methods for very fast loading / saving.

A Python array is as lean as a C array. An array of `float` values does not hold full-fledged `float` instances, but only packed bytes representing their machine values, similar to an array of `double` in the C language.

When creating an `array` you provide a typecode, a letter to determine the underlying C type used to store each item in the array. For example, `b` is the typecode for what C calls a `signed char`, an integer ranging from -128 to 127.

If you create an `array('b')`, then each item will be stored in a single byte and interpreted as an integer. For large sequences of numbers this saves a lot of memory. Python also enforces the type: it will not let you put in any value that doesn't match the array's typecode.
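A small sketch of that type enforcement with typecode `'b'`: a value outside -128..127 and a non-number are both rejected, and the array is left untouched.

```python
from array import array

small = array('b', [1, 2, 3])
print(small.itemsize)         # bytes per item

errors = []
for bad_value in (200, 'x'):
    try:
        small.append(bad_value)
    except (OverflowError, TypeError) as exc:
        errors.append(type(exc).__name__)

print(errors, small)
```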

In [None]:
from array import array
from random import random

floats = array("d", (random() for i in range(10**7)))
print(floats[0:3])

with open("floats.bin", "wb") as fp:
    floats.tofile(fp)

floats2 = array("d")
with open("floats.bin", "rb") as fp:
    floats2.fromfile(fp, 10**7)

print(floats2[0:3])
print(floats == floats2)

array('d', [0.9892777292048522, 0.4033524040657086, 0.8112342353416552])
array('d', [0.9892777292048522, 0.4033524040657086, 0.8112342353416552])
True


1. Imported the array type

2. Create an array of double-precision floats (typecode `'d'`) from any iterable object -- (in this case a generator expression)

3. Inspect the first 3 elements in the array.

4. Save the array to a binary file

5. Create an empty array of doubles

6. Read the 10 million numbers from the binary file.

7. Inspect the first 3 elements in the array.

8. Verify they're the same (element-wise).

***Note:*** `array.tofile` and `array.fromfile` are extremely fast. According to the book's measurements, it takes about 0.1 seconds to load 10 million double-precision floats from a binary file, which is nearly 60 times faster than reading the numbers from a text file.

To learn more check out `array.array` docs.


**Memory Views**

The built-in `memoryview` class is a shared memory sequence type that lets you handle slices of arrays without copying bytes.

A memoryview can be viewed as a generalized NumPy array structure in Python itself (without all the mathy stuff). It allows you to share memory between data structures (things like `PIL` images, SQLite databases, NumPy arrays, etc.) *without first copying*, something crucial for large datasets.

Using notation similar to the `array` module, the `memoryview.cast` method lets you change the way multiple bytes are read or written as units without moving bits around. `memoryview.cast` returns yet another `memoryview` object, always sharing the same memory.

In [None]:
from array import array
octets = array('B', range(6))
m1 = memoryview(octets)

# show m1
print(m1.tolist())

# make new memory view w/ 2 rows and 3 columns
m2 = m1.cast('B', [2, 3])
print(m2.tolist())

# again new memory view from m1 (3 rows, 2 columns)
m3 = m1.cast('B', [3, 2])
print(m3.tolist())

# modify one byte through m2 and one through m3
m2[1, 1] = 22 # row 1, col 1 of the 2x3 view
m3[1, 1] = 33 # row 1, col 1 of the 3x2 view

print(f"m2 changed: {m2.tolist()}\nm3 changed: {m3.tolist()}")
print(f"\nThis ALL changed m1:\n{octets}")

[0, 1, 2, 3, 4, 5]
[[0, 1, 2], [3, 4, 5]]
[[0, 1], [2, 3], [4, 5]]
m2 changed: [[0, 1, 2], [33, 22, 5]]
m3 changed: [[0, 1], [2, 33], [22, 5]]

This ALL changed m1:
array('B', [0, 1, 2, 33, 22, 5])


1. Build an array of 6 bytes (typecode 'B')

2. Build a memoryview from that array, then export it as a `list`.

3. Build new memoryview from prev one but with 2 rows and 3 columns..

4. Yet another memoryview, now with 3 rows and 2 columns.

5. Overwrite byte in `m2` at row 1 column 1 with 22.

6. Overwrite byte in `m3` at row 1 column 1 with 33.

7. Display the original array to show memory was shared among `octets`, `m1`, `m2`, and `m3`.

**NumPy**

Let's take a quick reintroduction to `numpy`, just because of how neat and helpful it is, both for quant/ML use and for scientific computing. Here are some high-level operations.

In [None]:
import numpy as np

a = np.arange(12)
print(a, type(a), a.shape)

a.shape = (3, 4)
print(f"Changed a shape:\n{a, type(a)}")

# get the last row of the 3x4 array
print(a[2])
print(a[-1])

# get column 1 of a
print(a[:, 1])

# of course
print(f"\nTranspose of A: {a.T}")

[ 0  1  2  3  4  5  6  7  8  9 10 11] <class 'numpy.ndarray'> (12,)
Changed a shape:
(array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]]), <class 'numpy.ndarray'>)
[ 8  9 10 11]
[ 8  9 10 11]
[1 5 9]


### Summary of Chapter 2

Since some of the chapter's notes were not included above, these are the most important things to know from this chapter.

1. Every Python object in memory has a header with metadata. The simplest Python object, a `float`, has a value field and two metadata fields:

- `ob_refcnt:` the object’s reference count
- `ob_type:` a pointer to the object’s type
- `ob_fval:` a C `double` holding the value of the float.

On a 64-bit Python build, each of those fields takes 8 bytes. That’s why an array of floats is much more compact than a tuple of floats:

the array is a single object holding the raw values of the floats, while the tuple consists of several objects: the tuple itself and each `float` object contained in it.
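A rough way to check this with `sys.getsizeof`. This is a sketch, and the byte counts it prints are specific to CPython and the platform; only the comparison between the two totals matters.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]
as_array = array('d', values)
as_tuple = tuple(values)

# array: one object holding 8-byte raw doubles
array_bytes = sys.getsizeof(as_array)
# tuple: the tuple of pointers plus one float object per item
tuple_bytes = sys.getsizeof(as_tuple) + sum(sys.getsizeof(v) for v in as_tuple)

print(as_array.itemsize, array_bytes, tuple_bytes)
```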

Another way of grouping sequence types is by mutability:

*Mutable Sequences*

For example, `list, bytearray, array.array,` and `collections.deque`

*Immutable Sequences*

For example, `tuple, str,` and `bytes`

Always keep in mind: mutable v immutable; container v flat.

**Generator Expressions are Crucial!**

In Python code, line breaks are ignored inside pairs of `[]`, `{}`, or `()`. So you can build multiline lists, listcomps, tuples, dictionaries, etc., without using the `\` line continuation escape, which doesn't work if you accidentally type a space after it. Also, when those delimiter pairs are used to define a literal with a comma-separated series of items, a trailing comma will be ignored. So, for example, when coding a multiline list literal, it is thoughtful to put a comma after the last item, making it a little easier for the next coder to add one more item to that list, and reducing noise when reading diffs.

In [None]:
colors = ["white", "black"]
sizes = ["S", "M", "L"] 
tshirts = [(color, size) for color in colors for size in sizes]
print(tshirts)


**The Generator Expressions**

To init tuples, arrays, and other types of sequences, you could also start from a listcomp, but a genexp saves memory because it yields items one by one using the iterator protocol instead of building a whole list just to feed it into another type constructor.

Genexps use the same syntax as listcomps, but are enclosed in parentheses rather than brackets.

In [None]:
# init a tuple and an array from a generator expression
symbols = "$¢£¥€¤"
x = tuple(ord(symbol) for symbol in symbols)
print(f"tuple built of genexps: {x}")

from array import array
# the "I" typecode sets the storage type: unsigned int
y = array("I", (ord(symbol) for symbol in symbols))
print(f"array built on genexps: {y}")

tuple built of genexps: (36, 162, 163, 165, 8364, 164)


## Chapter 3: Dictionaries and Sets 

*Hash tables* are the engines behind Python's high-performance `dict`. Other built-in types, like `set` and `frozenset`, are based on hash tables too.

**Modern dict Syntax**

The syntax of listcomps and genexps was adapted to `dict` comprehensions (and `set` comprehensions as well). A *dictcomp* builds a `dict` instance by taking `key: value` pairs from any iterable. Let's look at an example.

In [None]:
dial_codes = [
    (880, 'Bangladesh'),
    (55, 'Brazil'),
    (86, 'China'),
    (91, 'India'),
    (62, 'Indonesia'),
    (81, 'Japan'),
    (234, 'Nigeria'),
    (92, 'Pakistan'),
    (7, 'Russia'),
    (1, 'United States'),
]

country_dial = {country: code for code, country in dial_codes}
print(country_dial)

# go further: sort by country name, keep codes below 70, uppercase the names
country_dial = {country.upper(): code for country, code in 
                sorted(country_dial.items()) if code < 70}

print(f"\nadvanced dict comp results: {country_dial}")

{'Bangladesh': 880, 'Brazil': 55, 'China': 86, 'India': 91, 'Indonesia': 62, 'Japan': 81, 'Nigeria': 234, 'Pakistan': 92, 'Russia': 7, 'United States': 1}

advanced dict comp results: {'BRAZIL': 55, 'INDONESIA': 62, 'RUSSIA': 7, 'UNITED STATES': 1}


**Unpacking Mappings / Dict Unpacking** 

We can apply `**` to more than one argument in a function call. This works when the keys are all strings and unique across all arguments (duplicate keyword arguments are not allowed).

When the `**` unpacking operator is used in a function call, it unpacks a dictionary's key-value pairs as keyword arguments; in a `dict` literal, it unpacks them into the new dictionary, as seen below.

Notice as well that duplicate keys are allowed when `**` is used in a `dict` literal, because later occurrences just overwrite previous ones.

In [None]:
def dump(**kwargs):
    return kwargs

a = dump(**{'x': 1}, y=2, **{'z': 3})
print(a)

# also ** can be used in a dict literal - multiple times
b = {'a': 0, **{'x': 1}, 'y': 2, **{'z':3, 'x': 4}}
print(b)

def greet(name, greeting='Hello'):
    print(f"{greeting} {name}!")

person_data = {"name": "Alice", "greeting": "tootles!"}
greet(**person_data)

d1, d2 = {'a': 1, 'b': 2}, {'c': 3, 'd': 4}
merged_d = {**d1, **d2}
print(f'\nmerged dict: {merged_d}')

{'x': 1, 'y': 2, 'z': 3}
{'a': 0, 'x': 4, 'y': 2, 'z': 3}
tootles! Alice!

merged dict: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
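The two duplicate-key rules side by side, as a quick sketch: the same key twice in a call is a `TypeError`, while in a `dict` literal the last value wins.

```python
def dump(**kwargs):
    return kwargs

try:
    dump(**{'x': 1}, **{'x': 2})
    outcome = "accepted"
except TypeError as exc:
    outcome = f"TypeError: {exc}"

# the same keys are fine in a dict literal: the last value wins
merged = {**{'x': 1}, **{'x': 2}}
print(outcome)
print(merged)
```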


**Merging Mappings / Dicts with |**

As we just saw we can merge dicts in a `dict` literal with the use of the `**` unpacking operator.

However, Python 3.9 and later also support `|` and `|=` to merge mappings/dictionaries. This makes sense, since these are also the set union operators.

In [None]:
d1 = {'a': 1, 'b': 3}
d2 = {'a': 2, 'b': 4, 'c': 6}

# Notice - on key conflicts, the values from d2 (right operand) win
print(d1 | d2)


# Notice - d1 unchanged.
print(d1)

# Can also be used to update an existing mapping in place
d1 |= d2
print(d1)


{'a': 2, 'b': 4, 'c': 6}
{'a': 1, 'b': 3}


**Standard API of Mapping Types**

The `collections.abc` module provides the `Mapping` and `MutableMapping` ABCs describing the interfaces of `dict` and similar types.

The main value of the ABCs is **documenting and formalizing the standard interfaces for mappings**, and serving as criteria for `isinstance` tests in code that needs to support mappings in a broad sense.

In [1]:
from collections import abc
my_dict = {}
print(isinstance(my_dict, abc.Mapping))
print(isinstance(my_dict, abc.MutableMapping))

True
True


To implement a custom mapping, it's easier to extend `collections.UserDict`, or to wrap a `dict` by composition, instead of subclassing these ABCs.

The `collections.UserDict` class and all concrete mapping classes in the standard library encapsulate the basic `dict` in their implementation, which in turn is built on a hash table. Therefore, they all share the limitation that keys must be hashable.
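A minimal sketch of extending `UserDict`. The class `UpperKeyDict` and its upper-casing rule are made up for illustration; the point is that overriding a few methods is enough, because `UserDict` routes `update()`, `get()`, and the constructor through them and keeps the real data in `self.data`.

```python
from collections import UserDict, abc

class UpperKeyDict(UserDict):
    """Hypothetical mapping that stores every string key in upper case."""

    @staticmethod
    def _norm(key):
        return key.upper() if isinstance(key, str) else key

    def __setitem__(self, key, value):
        self.data[self._norm(key)] = value

    def __getitem__(self, key):
        return self.data[self._norm(key)]

    def __contains__(self, key):
        return self._norm(key) in self.data

d = UpperKeyDict({'a': 1})
d['b'] = 2
d.update(c=3)
print(d['A'], d['b'], 'c' in d, d.get('C'), isinstance(d, abc.MutableMapping))
print(d.data)
```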

**What is hashable**

Remember that all keys in a `dict` must be hashable. From the *Python Glossary*:

*An object is hashable if it has a hash code which never changes during its lifetime (it needs a `__hash__()` method), and can be compared to other objects (it needs an `__eq__()` method). Hashable objects which compare equal must have the same hash code.*

Of course, numeric and flat immutable types (`str`, `int`) are all hashable. Container types are hashable if they are immutable and all contained objects are also hashable.

A `tuple`, for example, is hashable only if all of its items are hashable.

`tl = (1, 2, [30, 40])` <= `hash(tl)` would throw a `TypeError` because of the list inside.

User-defined types are hashable by default because their hash code is derived from their `id()`, and the `__eq__()` method inherited from the `object` class simply compares the object IDs.

^ ***Note:*** Of course this changes if you implement your own `__eq__()` method: a class that defines `__eq__()` without `__hash__()` becomes unhashable. It is hashable again only if you also define a `__hash__()` that always returns the same hash code for a given object and is consistent with `__eq__()`.
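A few quick checks of these rules. The helper `is_hashable` and the three toy classes are made up for illustration, and the constant hash in `EqAndHash` is only there to keep the sketch short.

```python
def is_hashable(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True

tt = (1, 2, (30, 40))
tl = (1, 2, [30, 40])                 # building the tuple is fine...
tf = (1, 2, frozenset([30, 40]))

class Plain:
    pass                              # inherits __eq__ and __hash__ from object

class OnlyEq:
    def __eq__(self, other):          # defining __eq__ alone removes __hash__
        return isinstance(other, OnlyEq)

class EqAndHash:
    def __eq__(self, other):
        return isinstance(other, EqAndHash)
    def __hash__(self):               # equal objects share this hash code
        return 42

for obj in (tt, tl, tf, Plain(), OnlyEq(), EqAndHash()):
    print(type(obj).__name__, is_hashable(obj))
```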


**Overview of Common Mapping Methods**

The basic API for mappings is quite rich. The table shows the methods implemented by `dict` and two popular variations, `defaultdict` and `OrderedDict`, both from `collections`.


*Dictionary Methods Comparison*



| Method | dict | defaultdict | OrderedDict | Description |
|--------|------|-------------|-------------|-------------|
| `d.clear()` | ● | ● | ● | Remove all items |
| `d.__contains__(k)` | ● | ● | ● | `k in d` |
| `d.copy()` | ● | ● | ● | Shallow copy |
| `d.__copy__()` | | ● | | Support for `copy.copy(d)` |
| `d.default_factory` | | ● | | Callable invoked by `__missing__` to set missing values |
| `d.__delitem__(k)` | ● | ● | ● | `del d[k]` - remove item with key k |
| `d.fromkeys(it, [initial])` | ● | ● | ● | New mapping from keys in iterable, with optional initial value (defaults to None) |
| `d.get(k, [default])` | ● | ● | ● | Get item with key k, return default or None if missing |
| `d.__getitem__(k)` | ● | ● | ● | `d[k]` - get item with key k |
| `d.items()` | ● | ● | ● | Get view over items - (key, value) pairs |
| `d.__iter__()` | ● | ● | ● | Get iterator over keys |
| `d.keys()` | ● | ● | ● | Get view over keys |
| `d.__len__()` | ● | ● | ● | `len(d)` - number of items |
| `d.__missing__(k)` | | ● | | Called when `__getitem__` cannot find the key |
| `d.move_to_end(k, [last])` | | | ● | Move k to the first or last position (last is True by default) |
| `d.__or__(other)` | ● | ● | ● | Support for `d1 \| d2` to create new dict merging d1 and d2 (Python ≥ 3.9) |
| `d.__ior__(other)` | ● | ● | ● | Support for `d1 \|= d2` to update d1 with d2 (Python ≥ 3.9) |
| `d.pop(k, [default])` | ● | ● | ● | Remove and return value at k, or default or None if missing |
| `d.popitem()` | ● | ● | ● | Remove and return the last inserted item as (key, value) |
| `d.__reversed__()` | ● | ● | ● | Support for `reversed(d)` - returns iterator for keys from last to first inserted |
| `d.__ror__(other)` | ● | ● | ● | Support for `other \| d` - reversed union operator (Python ≥ 3.9) |
| `d.setdefault(k, [default])` | ● | ● | ● | If k in d, return d[k]; else set d[k] = default and return it |
| `d.__setitem__(k, v)` | ● | ● | ● | `d[k] = v` - put v at k |
| `d.update(m, [**kwargs])` | ● | ● | ● | Update d with items from mapping or iterable of (key, value) pairs |
| `d.values()` | ● | ● | ● | Get view over values |

### Key Differences:
- **defaultdict**: Has `default_factory` and `__missing__` for automatic value creation, plus `__copy__()`
- **OrderedDict**: Has `move_to_end()` for order manipulation
- **dict**: Standard implementation with all basic functionality
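A short sketch of the two type-specific features, `default_factory` and `move_to_end()`, using made-up sample data.

```python
from collections import defaultdict, OrderedDict

index = defaultdict(list)            # default_factory is list
for word in ["apple", "avocado", "banana"]:
    index[word[0]].append(word)      # missing key -> __missing__ -> list()

od = OrderedDict(a=1, b=2, c=3)
od.move_to_end('a')                  # to the last position
od.move_to_end('c', last=False)      # to the first position
print(dict(index), list(od))
```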

The way `d.update(m)` handles its first argument `m` is a prime example of *duck typing*: it first checks whether `m` has a `keys` method and, if it does, assumes it is a mapping.

Otherwise, `update()` falls back to iterating over m, assuming its items are `(key, value)` pairs.

The constructor for most Python mappings uses the logic of `update()` internally, which means they can be initialized from other mappings or from any other iterable object producing `(key, value)` pairs.
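The `update()` logic described above can be sketched like this. `KeysOnly` is a made-up class that is not a `dict` at all; it only has `keys()` and `__getitem__`, which is enough for `update()` to treat it as a mapping.

```python
d = {'a': 1}
d.update({'b': 2})                   # a mapping: has a keys() method
d.update([('c', 3), ('d', 4)])       # an iterable of (key, value) pairs
d.update(zip('ef', (5, 6)), g=7)     # pairs from zip, plus a keyword argument

class KeysOnly:
    """Hypothetical non-dict class with just keys() and __getitem__."""
    def keys(self):
        return ['h']
    def __getitem__(self, key):
        return 8

d.update(KeysOnly())
print(d)

# the dict constructor accepts the same kinds of input
same = dict([('x', 1), ('y', 2)]) == dict({'x': 1, 'y': 2})
print(same)
```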

`setdefault()` avoids redundant key lookups when we need to update the value of an item in place. The next section shows a bit more about it.

**Inserting or Updating Mutable Values**