# Data Class Builders

 - a data class is a simple class that is just a collection of fields, with little or no extra functionality
 - three possible data class builders:
    - `collections.namedtuple`
    - `typing.NamedTuple`
    - `@dataclasses.dataclass`

In [1]:
class Coordinate:

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon

prague = Coordinate(51.07, 14.41)
display(prague)

mystery_location = Coordinate(51.07, 14.41)
display(prague == mystery_location)
display((prague.lat, prague.lon) == (mystery_location.lat, mystery_location.lon))

<__main__.Coordinate at 0x10d16b220>

False

True

 - `__repr__` inherited from `object` is not helpful
 - `__eq__` inherited from `object` compares object IDs
 - comparing coordinates requires explicit comparison of each attribute

 => data class builders provide the necessary `__init__`, `__repr__` and `__eq__` methods automatically

 => none of them depends on inheritance to do its work - each uses a different metaprogramming technique to inject methods and data attributes into the class under construction
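
For comparison, here is a minimal sketch of what would have to be written by hand to get a readable `__repr__` and a field-by-field `__eq__` for the `Coordinate` class above. It is only an illustration of the boilerplate the builders save, not the code they actually generate.

```python
class Coordinate:
    """Hand-written version of the methods a data class builder provides."""

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon

    def __repr__(self):
        # Show the field values instead of the memory address.
        return f'Coordinate(lat={self.lat!r}, lon={self.lon!r})'

    def __eq__(self, other):
        # Compare field by field; let Python handle unrelated types.
        if not isinstance(other, Coordinate):
            return NotImplemented
        return (self.lat, self.lon) == (other.lat, other.lon)


prague = Coordinate(51.07, 14.41)
print(prague)                               # Coordinate(lat=51.07, lon=14.41)
print(prague == Coordinate(51.07, 14.41))   # True
```

Every new field would have to be added in three places (`__init__`, `__repr__`, `__eq__`), which is the repetition the builders remove.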

##### `collections.namedtuple`

In [2]:
from collections import namedtuple

Coordinate = namedtuple('Coordinate', 'lat lon')
display(issubclass(Coordinate, tuple))

prague = Coordinate(51.07, 14.41)
display(prague)
display(prague == Coordinate(51.07, 14.41))

True

Coordinate(lat=51.07, lon=14.41)

True

 - useful `__repr__`
 - meaningful `__eq__`

##### `typing.NamedTuple`

In [None]:
import typing
from typing import NamedTuple

Coordinate = typing.NamedTuple('Coordinate', [('lat', float), 
                                              ('lon', float)])
display(issubclass(Coordinate, tuple))

class Coordinate(NamedTuple):
    lat: float
    lon: float

    def __str__(self):
        ns = 'N' if self.lat >= 0 else 'S'
        we = 'E' if self.lon >= 0 else 'W'

        return f'{abs(self.lat):.1f}°{ns}, {abs(self.lon):.1f}°{we}'



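 - supports a type annotation for each field
 - since Python 3.6, `NamedTuple` can also be used in a `class` statement, with fields written as annotated class attributes; this is more readable and makes it easy to add or override methods
 - although `NamedTuple` appears in the `class` statement as a superclass, it is not one: `typing.NamedTuple` uses a metaclass to customize the creation of the class, which ends up as a plain `tuple` subclass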
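
A small check of the class hierarchy, as a sketch: it inspects the class built by the `class` statement and reads the field types back with `typing.get_type_hints`. The last line relies on the fact that Python does not enforce annotations at runtime.

```python
import typing
from typing import NamedTuple


class Coordinate(NamedTuple):
    lat: float
    lon: float


# The only real base class is tuple; NamedTuple is not in the MRO.
print(Coordinate.__mro__ == (Coordinate, tuple, object))   # True

# The field types are kept and can be read back at runtime.
print(typing.get_type_hints(Coordinate) == {'lat': float, 'lon': float})   # True

# Annotations are not checked at runtime, so this call succeeds.
print(Coordinate('not a float', None))
```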

##### `@dataclasses.dataclass`

In [None]:
from dataclasses import dataclass

@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float

    def __str__(self):
        ns = 'N' if self.lat >= 0 else 'S'
        we = 'E' if self.lon >= 0 else 'W'

        return f'{abs(self.lat):.1f}°{ns}, {abs(self.lon):.1f}°{we}'

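 - reads the variable annotations in the class body to declare instance attributes and automatically generates methods such as `__init__`, `__repr__` and `__eq__`
 - does not depend on inheritance or a metaclass -> the class hierarchy is unchanged (here `Coordinate` is still a direct subclass of `object`)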
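
A minimal sketch of the behaviour described above, without the custom `__str__`: the generated `__repr__` and `__eq__`, the unchanged class hierarchy, and the effect of `frozen=True`.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float


prague = Coordinate(51.07, 14.41)
print(prague)                               # Coordinate(lat=51.07, lon=14.41)
print(prague == Coordinate(51.07, 14.41))   # True

# The decorator returns the same class; no base class is injected.
print(Coordinate.__mro__ == (Coordinate, object))   # True

try:
    prague.lat = 0.0   # frozen=True blocks attribute assignment
except FrozenInstanceError as exc:
    print(f'cannot modify: {exc}')
```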

##### Comparison of data class builders

![table1](Illustrations/p1ch5_1.png)

 - `collections.namedtuple` and `typing.NamedTuple` produce `tuple` subclasses, therefore instances are immutable
 - `@dataclass` produces mutable classes (unless `frozen=True`)

 - only `typing.NamedTuple` and `@dataclass` support the regular `class` statement syntax, which makes it easier to add methods

 - both tuple variants provide an instance method, `._asdict`, to construct a `dict` object from the fields of a data class instance
 - the `dataclasses` module provides a function for the same purpose: `dataclasses.asdict`

 - both tuple variants provide field names and default values with `._fields` and `._field_defaults` class attributes
 - `dataclasses` module provides a `fields` function

 - both `typing.NamedTuple` and `@dataclass` record field types (in the class attribute `__annotations__`); `collections.namedtuple` does not

 - both tuple variants use the instance method `._replace` to create a new instance with some attributes changed, while `dataclasses` provides the module-level function `dataclasses.replace`

 - a framework may need to build data classes on the fly at runtime -> both tuple variants can be built with the standard function-call syntax (`namedtuple(...)`, `NamedTuple(...)`), while `dataclasses` provides the module-level function `dataclasses.make_dataclass`
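
The differences listed above can be put side by side in a short sketch. The class names `PointT`, `PointD` and `PointR` are hypothetical and used only for this illustration; the output comments assume Python 3.8+, where `_asdict()` returns a plain `dict`.

```python
from collections import namedtuple
from dataclasses import asdict, dataclass, fields, make_dataclass, replace

# Hypothetical example classes: one named tuple, one dataclass.
PointT = namedtuple('PointT', 'x y', defaults=[0])


@dataclass
class PointD:
    x: int
    y: int = 0


t = PointT(1)
d = PointD(1)

# Convert to dict: instance method vs. module-level function.
print(t._asdict())   # {'x': 1, 'y': 0}
print(asdict(d))     # {'x': 1, 'y': 0}

# Field names and defaults: class attributes vs. fields() function.
print(PointT._fields, PointT._field_defaults)   # ('x', 'y') {'y': 0}
print([f.name for f in fields(PointD)])         # ['x', 'y']

# New instance with changes: instance method vs. module-level function.
print(t._replace(y=5))   # PointT(x=1, y=5)
print(replace(d, y=5))   # PointD(x=1, y=5)

# Build a class at runtime.
PointR = make_dataclass('PointR', [('x', int), ('y', int)])
print(PointR(1, 2))      # PointR(x=1, y=2)
```

Note that neither `_replace` nor `replace` modifies the original object; `t` and `d` keep `y == 0`.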

##### Classic named tuples

 - a class name and a list of field names (either an iterable of strings or a single space-delimited string) are needed to create a named tuple
 - field values must be passed as separate arguments to the constructor, positional or keyword (in contrast, the `tuple` constructor takes a single iterable)
 - fields can be accessed by name or position

In [3]:
City = namedtuple('City', 'name country population coordinates')

prague = City('Prague', 'Czech Republic', 1.309, (51.07, 14.41))
display(prague)
display(prague.country)
display(prague[2])

City(name='Prague', country='Czech Republic', population=1.309, coordinates=(51.07, 14.41))

'Czech Republic'

1.309

 - as a `tuple` subclass, it inherits useful methods such as `__eq__` and the special methods for comparison operators, including `__lt__`, which enables sorting
 - in addition to those inherited from `tuple`, the `_fields` class attribute and the `_asdict()` instance method are the most useful

In [5]:
display(City._fields)
display(prague._asdict())

('name', 'country', 'population', 'coordinates')

{'name': 'Prague',
 'country': 'Czech Republic',
 'population': 1.309,
 'coordinates': (51.07, 14.41)}
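
The comparison behaviour inherited from `tuple` can be sketched with a small example. `Point` is a hypothetical class used only here; instances compare field by field from the left, exactly like plain tuples.

```python
from collections import namedtuple
from operator import attrgetter

Point = namedtuple('Point', 'x y')   # hypothetical example class

points = [Point(2, 1), Point(1, 5), Point(1, 3)]

# __lt__ comes from tuple, so sorted() works without extra code.
print(sorted(points))
# [Point(x=1, y=3), Point(x=1, y=5), Point(x=2, y=1)]

# Sorting by one named field is also straightforward.
print(sorted(points, key=attrgetter('y')))
# [Point(x=2, y=1), Point(x=1, y=3), Point(x=1, y=5)]

# __eq__ also comes from tuple: a named tuple equals a plain tuple
# with the same values.
print(Point(1, 3) == (1, 3))   # True
```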

 - since Python 3.7, `namedtuple` accepts the `defaults` keyword-only argument: an iterable of N default values, one for each of the N rightmost fields of the class
 - methods can be added to a `namedtuple` class by defining a function and assigning it to a class attribute, but this is hacky; it is much easier with the other two data class builders
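
Both points can be sketched briefly. The `reference` field and the `hemisphere` function are assumptions made up for this illustration.

```python
from collections import namedtuple

# One default for three fields: it applies to the rightmost field only.
Coordinate = namedtuple('Coordinate', 'lat lon reference', defaults=['WGS84'])

print(Coordinate(0, 0))             # Coordinate(lat=0, lon=0, reference='WGS84')
print(Coordinate._field_defaults)   # {'reference': 'WGS84'}


# The "hacky" way to add a method: write a plain function whose first
# parameter is self, then assign it to a class attribute.
def hemisphere(self):
    return 'N' if self.lat >= 0 else 'S'


Coordinate.hemisphere = hemisphere

print(Coordinate(-33.9, 18.4).hemisphere())   # S
```

With `typing.NamedTuple` or `@dataclass`, the same method would simply be written inside the `class` body.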

##### Typed named tuples