### Collections

* namedtuple, dataclass
* OrderedDict
* defaultdict
* Counter

Sidenote: shelve

----

* UserDict
* UserList
* UserString

### Record like structures

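* `namedtuple` documents tuple-like structures by giving each position a name
* drop-in replacement for a plain tuple, with no per-instance memory overhead
* a class generator: calling `namedtuple()` returns a new tuple subclass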

In [1]:
import collections

In [2]:
Record = collections.namedtuple("Record", "a b c")

In [3]:
record = Record(1, 2, 3)

In [4]:
record

Record(a=1, b=2, c=3)

In [10]:
# record = Record(b=2, c=3) # TypeError

In [11]:
record = Record(1, 2, c=3)

In [12]:
record._asdict()

{'a': 1, 'b': 2, 'c': 3}

In [14]:
record._field_defaults

{}

> defaults can be None or an iterable of default values. Since fields with a default value must come after any fields without a default, the defaults are applied to the rightmost parameters. For example, if the fieldnames are ['x', 'y', 'z'] and the defaults are (1, 2), then x will be a required argument, y will default to 1, and z will default to 2.

In [19]:
Record = collections.namedtuple("Record", "a b c", defaults=[None, None, 100])

In [20]:
record = Record(1, 2)

In [21]:
record

Record(a=1, b=2, c=100)

In [23]:
Record = collections.namedtuple("Record", "a b c", defaults=[100]) # a single default applies to the rightmost field, "c"

In [25]:
record = Record(1, 2)

In [26]:
record

Record(a=1, b=2, c=100)

In [28]:
# record = Record(1) # TypeError

Note: 
    
* consider a namedtuple if your code depends on tuple indexing
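The tuple behaviour mentioned in the note can be sketched as follows. `Point` is a hypothetical example type, not part of the cells above.

```python
import collections

# Hypothetical record type used only for this illustration.
Point = collections.namedtuple("Point", "x y")
p = Point(1, 2)

# Tuple behaviour is preserved: indexing, unpacking and
# comparison with plain tuples all work.
first = p[0]
x, y = p
same_as_tuple = (p == (1, 2))

# Instances are immutable; _replace returns a modified copy.
moved = p._replace(x=10)

print(first, x, y, same_as_tuple, moved)
```

Because `Point` is a real subclass of `tuple`, existing code that indexes or unpacks plain tuples keeps working when a named tuple is passed in.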

### DataClasses

A data class is a record-like structure. Its main idea is to generate useful special methods (such as `__init__`, `__repr__` and `__eq__`) automatically.

> automatically adding generated special methods

* new in 3.7


In [32]:
from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

In [35]:
InventoryItem.__init__

<function __main__.__create_fn__.<locals>.__init__(self, name: str, unit_price: float, quantity_on_hand: int = 0) -> None>

In [36]:
import inspect

The `dataclass` decorator accepts several parameters for customization:
    
> init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False, match_args=True, kw_only=False, slots=False, weakref_slot=False

>  A field is defined as a class variable that has a type annotation. With two exceptions described below, nothing in dataclass() examines the type specified in the variable annotation.

In [45]:
@dataclass
class Example:
    a : int
    b : str

In [47]:
ex = Example(a=1.0, b="ok")

In [48]:
ex

Example(a=1.0, b='ok')

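The annotations are not enforced at runtime: `a` is annotated as `int`, yet `1.0` is accepted. Additional machinery (e.g. a static type checker or explicit validation) is required to ensure type safety.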
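One possible form of such machinery is a runtime check in `__post_init__`. This is a minimal sketch, not a feature of the `dataclasses` module: the class name `CheckedExample` is hypothetical, and the check assumes every annotation is a plain class such as `int` or `str` (it does not handle generics like `List[int]` or string annotations).

```python
from dataclasses import dataclass, fields


@dataclass
class CheckedExample:  # hypothetical name, mirrors Example above
    a: int
    b: str

    def __post_init__(self):
        # Compare each value with its annotation. Assumption: f.type is a
        # plain class, so isinstance() can be used directly.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )


ok = CheckedExample(a=1, b="ok")

try:
    CheckedExample(a=1.0, b="ok")
except TypeError as exc:
    error = str(exc)
    print(error)
```

A static type checker run over the code base is the other common option; it catches the mismatch before the program runs, without any runtime cost.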

> [unsafe_hash]: If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). 

...

> frozen: If true (the default is False), assigning to fields will generate an exception. This emulates read-only frozen instances.
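The interaction of `frozen`, `eq` and hashing described in the quotes can be tried out with a small sketch. `FrozenPoint` and `MutablePoint` are hypothetical names chosen for this illustration.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FrozenPoint:  # hypothetical example class
    x: int
    y: int


@dataclass
class MutablePoint:  # hypothetical example class
    x: int
    y: int


p = FrozenPoint(1, 2)

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    p.x = 10
except FrozenInstanceError:
    assignment_failed = True
else:
    assignment_failed = False

# eq=True (the default) together with frozen=True gives a generated
# __hash__, so equal instances collapse to one element in a set.
seen = {p, FrozenPoint(1, 2)}

# eq=True with frozen=False sets __hash__ to None (unhashable).
mutable_is_hashable = MutablePoint.__hash__ is not None

print(assignment_failed, len(seen), mutable_is_hashable)
```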

Special field options.

> For common and simple use cases, no other functionality is required. There are, however, some dataclass features that require additional per-field information. To satisfy this need for additional information, you can replace the default field value with a call to the provided field() function.

In [54]:
from dataclasses import field
from typing import List

In [56]:
@dataclass
class C:
    mylist: List[int] = field(default_factory=list)


In [57]:
c = C()
c.mylist += [1, 2, 3]

In [58]:
c

C(mylist=[1, 2, 3])

Arguments for `field()`:
    
> default=MISSING, default_factory=MISSING, init=True, repr=True, hash=None, compare=True, metadata=None, kw_only=MISSING
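Two of these arguments, `repr` and `compare`, can be illustrated with a short sketch. The `User` class and its field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:  # hypothetical example class
    name: str
    # repr=False keeps the value out of the generated __repr__.
    password: str = field(repr=False)
    # compare=False excludes the field from the generated __eq__.
    login_count: int = field(default=0, compare=False)


u1 = User("ann", "secret", login_count=1)
u2 = User("ann", "secret", login_count=5)

print(u1)        # the password does not appear in the output
print(u1 == u2)  # login_count is ignored by the comparison
```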

Other representations:
    
* tuple
* json?

In [59]:
@dataclass
class Example:
    a : int
    b : str

In [60]:
ex = Example(a=1, b="hello")

In [62]:
import dataclasses

In [63]:
dataclasses.astuple(ex)

(1, 'hello')

In [64]:
dataclasses.asdict(ex)

{'a': 1, 'b': 'hello'}

In [65]:
import json

In [67]:
# json.dumps(ex) # TypeError

In [68]:
json.dumps(dataclasses.asdict(ex))

'{"a": 1, "b": "hello"}'

Other serialization problems may persist: `asdict()` leaves values such as a `set` unchanged, and `json.dumps` cannot serialize them.
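One way around this is the `default` hook of `json.dumps`, which is called for every object the encoder cannot handle. The following is a sketch under assumptions: `Tagged` and `encode_extra` are hypothetical names, and turning a set into a sorted list is only one possible convention.

```python
import dataclasses
import json
from dataclasses import dataclass, field


@dataclass
class Tagged:  # hypothetical example class
    name: str
    tags: set = field(default_factory=set)


def encode_extra(obj):
    """Fallback for objects the json module cannot serialize itself."""
    if isinstance(obj, (set, frozenset)):
        # Assumption: the elements are sortable; sorting makes the
        # output deterministic.
        return sorted(obj)
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


item = Tagged("hello", {"b", "a"})
encoded = json.dumps(dataclasses.asdict(item), default=encode_extra)
print(encoded)
```

Note that the round trip is lossy: loading the JSON again yields a list, not a set.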

Post-init processing

* `__post_init__` is a hook that the generated `__init__` calls as its last step



In [69]:
@dataclass
class C:
    a: float
    b: float
    c: float = field(init=False)

    def __post_init__(self):
        self.c = self.a + self.b

In [71]:
c = C(a=1, b=2)

In [72]:
c

C(a=1, b=2, c=3)


### Custom dictionaries


* OrderedDict
* defaultdict
* Counter
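A brief sketch of what each of the three classes adds on top of a plain `dict`. The word list is made-up sample data.

```python
from collections import Counter, OrderedDict, defaultdict

words = ["a", "b", "a", "c", "a", "b"]  # made-up sample data

# Counter: counts hashable items.
counts = Counter(words)
top = counts.most_common(1)

# defaultdict: missing keys are created by calling the factory (here list).
positions = defaultdict(list)
for index, word in enumerate(words):
    positions[word].append(index)

# OrderedDict: offers order-specific operations such as move_to_end.
ordered = OrderedDict(a=1, b=2, c=3)
ordered.move_to_end("a")

print(top)
print(dict(positions))
print(list(ordered))
```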

### Subclassing builtin types

Various options: subclass `str`, `list`, `dict` directly or use:

* UserString
* UserList
* UserDict

Additionally, we could use an ABC (e.g. `collections.abc.MutableMapping` for a dictionary).

Use cases:

* ABC, if the subclass behaves very differently from the builtin
* the UserX classes date from a time when subclassing a builtin was not possible

Still, the UserX approach has an advantage: the classes require less adjustment and boilerplate.
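The difference in boilerplate can be made visible with a small sketch. Both classes below are hypothetical and try to store all keys in upper case by overriding only `__setitem__`.

```python
from collections import UserDict


class UpperDict(dict):  # hypothetical: subclasses the builtin directly
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


class UpperUserDict(UserDict):  # hypothetical: same idea with UserDict
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


# dict.__init__ and dict.update do not call the overridden __setitem__,
# so only the explicit item assignment is converted.
d = UpperDict(a=1)
d.update(b=2)
d["c"] = 3

# UserDict routes __init__ and update through __setitem__,
# so one override covers all three ways of adding a key.
u = UpperUserDict(a=1)
u.update(b=2)
u["c"] = 3

print(list(d))
print(list(u))
```

To get consistent behaviour from the `dict` subclass, `__init__`, `update` and `setdefault` would have to be overridden as well.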



**Task**: Create a dictionary that allows "dot" access.

In [2]:
import collections

In [54]:
type(dict)

type