# Chapter 5. Data Class Builders
---

## ToC

[Objectives](#objectives)  

1. [Overview of Data Class Builders](#overview-of-data-class-builders)  
    1.1. [Main Features](#main-features)  
2. [Classic Named Tuples](#classic-named-tuples)  
3. [Typed Named Tuples](#typed-named-tuples)

---

## Objectives

Python offers a few ways to build a simple class that is just a collection of fields, with little or no extra functionality. That pattern is known as a “data class”, and `dataclasses` is one of the packages that supports it. This chapter covers three different class builders that you may use as shortcuts to write data classes:

- `collections.namedtuple`:
The simplest way—available since Python 2.6.

- `typing.NamedTuple`:
An alternative that requires type hints on the fields—since Python 3.5, with
`class` syntax added in 3.6.

- `@dataclasses.dataclass`:
A class decorator that allows more customization than previous alternatives, adding lots of options and potential complexity—since Python 3.7.

After covering those class builders, we will discuss why Data Class is also the name of a *code smell*: a coding pattern that may be a symptom of poor object-oriented design.

![Figure 71](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/71.PNG)

## Overview of Data Class Builders

Consider a simple class to represent a geographic coordinate pair:

### Simple Class

In [1]:
"""
``Coordinate``: a simple class without a custom ``__str__`` or ``__repr__``::

    >>> moscow = Coordinate(55.756, 37.617)
    >>> print(moscow)  # doctest:+ELLIPSIS
    <coordinates.Coordinate object at 0x...>
"""

class Coordinate:

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon


In [2]:
# __repr__ inherited from object is not very helpful
moscow = Coordinate(55.76, 37.62)
moscow

<__main__.Coordinate at 0x15adae27e00>

In [3]:
location = Coordinate(55.76, 37.62)
# Meaningless '=='. The __eq__ method inherited from object compares object IDs
location == moscow

False

In [4]:
# Comparing two coordinates requires explicit comparison of each attribute.
(location.lat, location.lon) == (moscow.lat, moscow.lon)

True
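
Getting a readable `__repr__` and a meaningful `==` out of a plain class means writing both methods by hand. The following is a minimal sketch of that boilerplate, not code from the chapter; it assumes that comparing `lat` and `lon` is the intended notion of equality.

```python
class Coordinate:
    """Plain class with hand-written __repr__ and __eq__."""

    def __init__(self, lat, lon):
        self.lat = lat
        self.lon = lon

    def __repr__(self):
        return f'Coordinate(lat={self.lat!r}, lon={self.lon!r})'

    def __eq__(self, other):
        # Let Python try the other operand when it is not a Coordinate.
        if not isinstance(other, Coordinate):
            return NotImplemented
        return (self.lat, self.lon) == (other.lat, other.lon)


moscow = Coordinate(55.76, 37.62)
location = Coordinate(55.76, 37.62)
print(moscow)              # Coordinate(lat=55.76, lon=37.62)
print(location == moscow)  # True
```

Note that defining `__eq__` without `__hash__` makes instances unhashable. This is the kind of detail the data class builders take care of.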

![Figure 72](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/72.PNG)

### `collections.namedtuple`

`namedtuple`: a factory function that builds a subclass of tuple with the name and fields you specify:

In [5]:
from collections import namedtuple
Coordinate = namedtuple('Coordinate', 'lat lon')
issubclass(Coordinate, tuple)

True

In [6]:
moscow = Coordinate(55.756, 37.617)
moscow

Coordinate(lat=55.756, lon=37.617)

In [8]:
moscow == Coordinate(lat=55.756, lon=37.617)

True

### `typing.NamedTuple`

The newer `typing.NamedTuple` provides the same functionality, adding a type annotation
to each field.

In [12]:
import typing
Coordinate = typing.NamedTuple("Coordinate", [("lat", float), ("lon", float)])
issubclass(Coordinate, tuple)

True

In [13]:
typing.get_type_hints(Coordinate)

{'lat': float, 'lon': float}
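
The type annotations are recorded on the class, but Python does not check them when an instance is created. A small sketch to illustrate; the odd argument values are made up for the demonstration.

```python
import typing

Coordinate = typing.NamedTuple('Coordinate', [('lat', float), ('lon', float)])

# Neither argument is a float, yet no error is raised at runtime.
odd = Coordinate('not a number', None)
print(odd)  # Coordinate(lat='not a number', lon=None)

# The declared types are still available for tools to inspect.
print(typing.get_type_hints(Coordinate))
```

A static type checker would be expected to flag the call that builds `odd`; at runtime the hints are documentation only.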

![Figure 73](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/73.PNG)

`typing.NamedTuple` can also be used in a class statement:

In [4]:
from typing import NamedTuple
class Coordinate(NamedTuple):
    lat: float
    lon: float
    def __str__(self):
        ns = 'N' if self.lat >= 0 else 'S'
        we = 'E' if self.lon >= 0 else 'W'
        return f'{abs(self.lat):.1f}°{ns}, {abs(self.lon):.1f}°{we}'

In [5]:
nt = Coordinate(55.756, 37.617)
nt.lat

55.756

In [6]:
print(nt)

55.8°N, 37.6°E


![Figure 74](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/74.PNG)

### `dataclass`

In [7]:
from dataclasses import dataclass
@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float
    def __str__(self):
        ns = 'N' if self.lat >= 0 else 'S'
        we = 'E' if self.lon >= 0 else 'W'
        return f'{abs(self.lat):.1f}°{ns}, {abs(self.lon):.1f}°{we}'

The `@dataclass` decorator does not depend on inheritance or a metaclass, so it should not interfere with your own use of these mechanisms. In the previous example, the `Coordinate` class is a direct subclass of `object`.
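
These claims can be checked directly. The sketch below redefines a frozen `Coordinate` without `__str__` for brevity, inspects its base classes and metaclass, and shows what `frozen=True` does on assignment; the variable `frozen_error_raised` exists only for this demonstration.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lon: float


moscow = Coordinate(55.756, 37.617)

# No extra base class and no custom metaclass are involved.
print(Coordinate.__mro__ == (Coordinate, object))  # True
print(type(Coordinate) is type)                    # True

# The generated __eq__ compares field values.
print(moscow == Coordinate(55.756, 37.617))        # True

# frozen=True makes assignment to a field raise an exception.
frozen_error_raised = False
try:
    moscow.lat = 0.0
except FrozenInstanceError:
    frozen_error_raised = True
print(frozen_error_raised)                         # True
```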

### Main Features

The different data class builders have a lot in common, as summarized in the following table:

![Figure 75](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/75.PNG)

![Figure 76](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/76.PNG)
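
As an illustration of how much the builders share, the sketch below performs the same three operations (listing fields, converting to a `dict`, and making a modified copy) on a named tuple and on a dataclass. The names `PointNT` and `PointDC` are hypothetical and chosen only to keep the two classes apart.

```python
from collections import namedtuple
from dataclasses import dataclass, asdict, fields, replace

PointNT = namedtuple('PointNT', 'lat lon')


@dataclass(frozen=True)
class PointDC:
    lat: float
    lon: float


nt = PointNT(55.756, 37.617)
dc = PointDC(55.756, 37.617)

# Field names: a class attribute vs. a module-level function.
print(PointNT._fields)                         # ('lat', 'lon')
print(tuple(f.name for f in fields(PointDC)))  # ('lat', 'lon')

# Conversion to dict: a method vs. a module-level function.
print(nt._asdict() == asdict(dc))              # True

# Modified copy: a method vs. a module-level function.
print(nt._replace(lat=0.0))                    # PointNT(lat=0.0, lon=37.617)
print(replace(dc, lat=0.0) == PointDC(0.0, 37.617))  # True
```

The named tuple variants expose these operations as underscore-prefixed methods and attributes, while `dataclasses` provides them as module-level functions.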

## Classic Named Tuples

The `collections.namedtuple` function is a factory that builds subclasses of `tuple` enhanced with field names, a class name, and an informative `__repr__`. Many functions in the Python standard library that used to return plain tuples now return named tuples for convenience, without affecting the user’s code at all.

![Figure 77](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/77.PNG)
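
One standard library function that returns a named tuple is `urllib.parse.urlparse`. The sketch below (the URL is a made-up example) shows that the result still behaves like a tuple, so code written for plain tuples keeps working, while new code can use the field names.

```python
from urllib.parse import urlparse

parts = urlparse('https://example.com/docs?page=2')

# The result is a tuple subclass with named fields.
print(isinstance(parts, tuple))  # True
print(parts.scheme, parts[0])    # https https

# Tuple unpacking works as it would with a plain tuple.
scheme, netloc, *rest = parts
print(netloc)                    # example.com
print(parts.query)               # page=2
```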

**Example:** Defining and using a named tuple type

In [12]:
from collections import namedtuple
City = namedtuple('City', 'name country population coordinates')
# Field values must be passed as separate positional arguments to the constructor 
# (in contrast, the tuple constructor takes a single iterable).
tokyo = City('Tokyo', 'JP', 36.933, (35.689722, 139.691667))
tokyo

City(name='Tokyo', country='JP', population=36.933, coordinates=(35.689722, 139.691667))

In [13]:
tokyo.population

36.933

In [14]:
tokyo.coordinates

(35.689722, 139.691667)

In [15]:
tokyo[1]

'JP'

**Example:** Named tuple attributes and methods

In [16]:
City._fields

('name', 'country', 'population', 'coordinates')

`._make()` builds City from an iterable; `City(*delhi_data)` would do the same.

In [25]:
Coordinate = namedtuple('Coordinate', 'lat lon')
delhi_data = ('Delhi NCR', 'IN', 21.935, Coordinate(28.613889, 77.208889))
delhi = City._make(delhi_data)
delhi

City(name='Delhi NCR', country='IN', population=21.935, coordinates=Coordinate(lat=28.613889, lon=77.208889))
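
The equivalence between `._make()` and argument unpacking can be verified directly. This sketch repeats the definitions so that it runs on its own; `via_make`, `via_star`, `via_gen`, and `short_input_rejected` are names introduced only for the demonstration.

```python
from collections import namedtuple

City = namedtuple('City', 'name country population coordinates')
Coordinate = namedtuple('Coordinate', 'lat lon')
delhi_data = ('Delhi NCR', 'IN', 21.935, Coordinate(28.613889, 77.208889))

via_make = City._make(delhi_data)
via_star = City(*delhi_data)
print(via_make == via_star)  # True

# _make accepts any iterable, for example a generator expression.
via_gen = City._make(item for item in delhi_data)
print(via_gen == via_make)   # True

# An iterable with the wrong number of items is rejected.
short_input_rejected = False
try:
    City._make(['Delhi NCR', 'IN'])
except TypeError:
    short_input_rejected = True
print(short_input_rejected)  # True
```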

In [26]:
delhi._asdict()

{'name': 'Delhi NCR',
 'country': 'IN',
 'population': 21.935,
 'coordinates': Coordinate(lat=28.613889, lon=77.208889)}

In [27]:
# serialize the data in JSON format
import json
json.dumps(delhi._asdict())

'{"name": "Delhi NCR", "country": "IN", "population": 21.935, "coordinates": [28.613889, 77.208889]}'

![Figure 78](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/78.PNG)

`namedtuple` accepts the keyword-only argument `defaults`, an iterable of N default values, one for each of the N *rightmost* fields of the class.

**Example:** Named tuple with default values

In [32]:
Coordinate = namedtuple('Coordinate', 'lat lon reference', defaults=[0, 0, 'WGS84'])
Coordinate()

Coordinate(lat=0, lon=0, reference='WGS84')

In [37]:
Coordinate = namedtuple('Coordinate', 'lat lon reference', defaults=[0, 'WGS84'])
Coordinate(1)

Coordinate(lat=1, lon=0, reference='WGS84')

In [29]:
Coordinate._field_defaults

{'lon': 0, 'reference': 'WGS84'}
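
For comparison, here is a sketch with a single default value. Only the rightmost field, `reference`, gets a default, so `lat` and `lon` remain required; `missing_args_rejected` is a name introduced only for the demonstration.

```python
from collections import namedtuple

Coordinate = namedtuple('Coordinate', 'lat lon reference', defaults=['WGS84'])

print(Coordinate._field_defaults)  # {'reference': 'WGS84'}
print(Coordinate(1, 2))            # Coordinate(lat=1, lon=2, reference='WGS84')

# lon has no default here, so omitting it is an error.
missing_args_rejected = False
try:
    Coordinate(1)
except TypeError:
    missing_args_rejected = True
print(missing_args_rejected)       # True
```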

It’s easier to code methods with the class syntax supported by `typing.NamedTuple` and `@dataclass`. You can also add methods to a `namedtuple`, but it’s a hack (p. 171).
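
A minimal sketch of what such a hack can look like: define a plain function and attach it to the class as an attribute. The function `hemisphere` is a hypothetical example and is not taken from the text.

```python
from collections import namedtuple

Coordinate = namedtuple('Coordinate', 'lat lon')


def hemisphere(coord):
    """Return 'N' or 'S' depending on the sign of the latitude."""
    return 'N' if coord.lat >= 0 else 'S'


# The hack: attach the function to the class after it has been created.
# From then on it behaves like a method on every instance.
Coordinate.hemisphere = hemisphere

print(Coordinate(55.756, 37.617).hemisphere())  # N
print(Coordinate(-33.9, 18.4).hemisphere())     # S
```

This works because classes are mutable objects, but the method definition ends up separated from the class definition, which is why the class syntax is usually easier to read.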

## Typed Named Tuples

In [7]:
from typing import NamedTuple
from typing import get_type_hints
from inspect import get_annotations
class Coordinate(NamedTuple):
    lat: float = 0
    lon: float = 0
    reference: str = 'WGS84'

c = Coordinate()    
c

Coordinate(lat=0, lon=0, reference='WGS84')

In [11]:
get_type_hints(Coordinate)

{'lat': float, 'lon': float, 'reference': str}

In [12]:
get_type_hints(c)

{'lat': float, 'lon': float, 'reference': str}

In [14]:
get_annotations(c)

TypeError: Coordinate(lat=0, lon=0, reference='WGS84') is not a module, class, or callable.

In [13]:
get_annotations(Coordinate)

{'lat': float, 'lon': float, 'reference': str}

`inspect.get_annotations` is stricter than `typing.get_type_hints`: as the example above shows, it only works on the following objects:
- Modules
- Classes
- Callables (functions/methods)