## Named Tuples vs Data Classes

In [172]:
import sys
from collections import namedtuple
from typing import Any, NamedTuple, OrderedDict

A named tuple can be defined either by calling the `collections.namedtuple` factory function or by defining a class that inherits from `typing.NamedTuple`.

In [173]:
Grade = namedtuple('Grade', ('score', 'weight'))  # the factory returns a new tuple subclass

In [174]:
grade = Grade(99, 0.10)
grade

Grade(score=99, weight=0.1)

In [175]:
class GradeNt(NamedTuple):
    """Represents a grade, inheriting from NamedTuple."""
    score: int
    weight: float


In [176]:
grade_nt = GradeNt(99, 0.10)
grade_nt

GradeNt(score=99, weight=0.1)

In [177]:
grade, grade_nt

(Grade(score=99, weight=0.1), GradeNt(score=99, weight=0.1))

Both objects have the same value and size.

In [178]:
grade == grade_nt

True
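
The comparison succeeds because named tuples inherit equality from `tuple`, which compares element by element and ignores both the class and the field names. A minimal sketch of the consequence (the `Point` type is a hypothetical addition, not part of the notebook above):

```python
from collections import namedtuple

Grade = namedtuple('Grade', ('score', 'weight'))
Point = namedtuple('Point', ('x', 'y'))  # hypothetical, unrelated type

grade = Grade(99, 0.10)

# Equality is inherited from tuple: only the contents are compared.
equals_plain_tuple = grade == (99, 0.10)
equals_other_type = grade == Point(99, 0.10)
print(equals_plain_tuple, equals_other_type)
```

Both comparisons evaluate to `True`, so a grade can compare equal to an object that represents something entirely different.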

In [179]:
sys.getsizeof(grade), sys.getsizeof(grade_nt)

(72, 72)

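However, the class syntax has the added benefit of type annotations, a real docstring, and default values declared next to each field (the `namedtuple` function accepts defaults only through its `defaults` argument, available since Python 3.7). One could also argue that the code is more readable and explicit.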

In [180]:
class GradeNtDefault(NamedTuple):
    """Represents a grade, inheriting from NamedTuple."""
    score: int
    weight: float
    comment: str = 'Good job from NamedTuple class'

In [181]:
grade_nt1 = GradeNtDefault(99, 0.10)
grade_nt1

GradeNtDefault(score=99, weight=0.1, comment='Good job from NamedTuple class')
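
For comparison, the function form can carry defaults too, through the `defaults` argument of `collections.namedtuple` (Python 3.7+). The sketch below is an illustration only; the name `GradeDefault` and the comment text are assumptions made for this example.

```python
from collections import namedtuple

# `defaults` applies to the rightmost fields, so only `comment` gets one here.
GradeDefault = namedtuple(
    'GradeDefault',
    ('score', 'weight', 'comment'),
    defaults=('Good job from namedtuple function',),
)

grade_default = GradeDefault(99, 0.10)
print(grade_default)
print(GradeDefault._field_defaults)
```

The defaults are matched positionally from the right, which is easier to get wrong than the per-field syntax of the class form.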

In both cases values are still accessible via named attributes and numerical index:

In [182]:
grade.score, grade_nt1.score

(99, 99)

In [183]:
grade[0], grade_nt1[0]

(99, 99)

To prevent access via numerical index, we can override the dunder (double underscore) `__getitem__` method and raise a `TypeError` whenever a value is accessed by index.

In [184]:
class GradeNtGi(NamedTuple):
    """Represents a grade, inheriting from NamedTuple."""
    score: int
    weight: float
    comment: str = 'Good job from NamedTuple class with overridden __getitem__'

    def __getitem__(self, index):
        """Prevent accessing a value via numerical index."""
        # Report the actual class name instead of a hard-coded 'Grade'.
        raise TypeError(f"'{type(self).__name__}' is not subscriptable by index")


In [185]:
grade_nt2 = GradeNtGi(99, 0.10)
grade_nt2

GradeNtGi(score=99, weight=0.1, comment='Good job from NamedTuple class with overridden __getitem__')

In [186]:
# grade_nt2[0]  # Throws TypeError
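
The commented-out line can be exercised safely by catching the exception. This is a minimal, self-contained sketch; the class name `GradeNoIndex` is hypothetical and stands in for the class defined above.

```python
from typing import NamedTuple


class GradeNoIndex(NamedTuple):  # hypothetical stand-in for the class above
    score: int
    weight: float

    def __getitem__(self, index):
        raise TypeError(f"'{type(self).__name__}' is not subscriptable by index")


grade_no_index = GradeNoIndex(99, 0.10)

message = None
try:
    grade_no_index[0]
except TypeError as error:
    # Keep the message so it can be inspected after the block.
    message = str(error)

print(message)
```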

This works as expected for a numerical index. On Python 3.7 and earlier, however, it also raises a `TypeError` for named attributes, because each field accessor retrieves its value through `__getitem__`:

In [187]:
# grade_nt2.score  # Throws TypeError
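
Whether attribute access is affected appears to depend on the interpreter: up to Python 3.7 the field accessors are properties built on `operator.itemgetter`, while CPython 3.8 and later use a C-level descriptor that reads the tuple slot directly. Rather than assume one behaviour, the sketch below probes the running interpreter (the class name `GradeProbe` is hypothetical):

```python
import sys
from typing import NamedTuple


class GradeProbe(NamedTuple):  # hypothetical name, used only for this probe
    score: int
    weight: float

    def __getitem__(self, index):
        raise TypeError(f"'{type(self).__name__}' is not subscriptable by index")


probe = GradeProbe(99, 0.10)

try:
    outcome = probe.score  # goes through __getitem__ only on older versions
except TypeError:
    outcome = 'TypeError'

print(sys.version_info[:2], outcome)
```

If the probe prints `99`, the `__getattribute__` workaround described next is not needed on that interpreter.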

To allow access with named attributes we need to override another special method: `__getattribute__`, which receives the attribute name as the argument `name`. The override first tries the normal lookup and, if that raises a `TypeError`, falls back to `_asdict()`, using the attribute name as the key. Named tuple instances have no `__dict__` attribute, which is why `_asdict()` is used.

In [188]:
class GradeGiGa(NamedTuple):
    """Represents a grade, inheriting from NamedTuple."""
    score: int
    weight: float
    comment: str = 'Good job from NamedTuple class with overridden __getitem__ and __getattribute__'

    def __getattribute__(self, name: str):
        """Enable getting values with named attributes.

        A named tuple instance has no __dict__, so _asdict() is used for the fallback.
        """
        try:
            # object.__getattribute__ is called directly: zero-argument super() in a
            # NamedTuple body triggers a __classcell__ warning on Python 3.7 and
            # fails at class creation on later versions.
            return object.__getattribute__(self, name)
        except TypeError:
            # Raised when a field accessor goes through the overridden __getitem__.
            return self._asdict()[name]

    def __getitem__(self, index):
        """Prevent accessing a value via numerical index to avoid unintentional use in external APIs."""
        raise TypeError(f"'{type(self).__name__}' is not subscriptable by index")


In [189]:
grade_nt3 = GradeGiGa(99, 0.10)
grade_nt3

GradeGiGa(score=99, weight=0.1, comment='Good job from NamedTuple class with overridden __getitem__ and __getattribute__')

Now we can access values with named attributes but not with numerical index.

In [190]:
grade_nt3.score, grade_nt3.weight

(99, 0.1)

In [191]:
# grade_nt3[0], grade_nt3[1]  # Throws TypeError
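
The guard is only partial, because iteration and unpacking use `tuple.__iter__` and never call `__getitem__`. A minimal sketch of the gap (the class name `GradeGuarded` is hypothetical):

```python
from typing import NamedTuple


class GradeGuarded(NamedTuple):  # hypothetical name
    score: int
    weight: float

    def __getitem__(self, index):
        raise TypeError(f"'{type(self).__name__}' is not subscriptable by index")


guarded = GradeGuarded(99, 0.10)

# Positional access is still possible through iteration and unpacking.
score, weight = guarded
as_list = list(guarded)
print(score, weight, as_list)
```

Code that unpacks the object positionally therefore keeps working, even though explicit indexing is blocked.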

This works but is not an ideal solution. Since Python **3.7** the standard library has included the [`dataclasses` module](https://www.python.org/dev/peps/pep-0557/). Data classes can replace named tuples in many cases, though not where tuple behaviour such as unpacking is actually needed. Like a class inheriting from `NamedTuple`, a data class is declared with annotated fields and does not require a hand-written `__init__` method. The class is decorated with the `@dataclass` decorator, and to make the objects "immutable", the `frozen` keyword can be set to `True`, so that assigning to a field raises an error.

In [192]:
from dataclasses import dataclass

@dataclass(frozen=True)  # Try commenting out the decorator
class GradeDc:  # Inherits from object
    """Represents a grade."""
    score: int
    weight: float
    comment: str = 'Good job from dataclass!'


In [193]:
grade_dc = GradeDc(99, 0.10)
grade_dc

GradeDc(score=99, weight=0.1, comment='Good job from dataclass!')
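
To see what `frozen=True` does in practice, the sketch below tries to reassign a field and catches `dataclasses.FrozenInstanceError`. The class name `GradeFrozen` is hypothetical and mirrors the class above.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class GradeFrozen:  # hypothetical name
    score: int
    weight: float


grade_frozen = GradeFrozen(99, 0.10)

assignment_blocked = False
try:
    grade_frozen.score = 100  # normal assignment is rejected
except FrozenInstanceError:
    assignment_blocked = True

print(assignment_blocked, grade_frozen)
```

The quotation marks around "immutable" are deserved: the check lives in `__setattr__`, so `object.__setattr__(grade_frozen, 'score', 100)` would still change the value.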

Data classes already have the desired behaviour of only allowing access via named attributes.

In [194]:
grade_dc.score, grade_dc.weight

(99, 0.1)

In [195]:
# grade_dc[0], grade_dc[1]  # Throws TypeError
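
When positional or mapping-style access is genuinely needed, the `dataclasses` module offers explicit conversions instead. A small sketch, with the hypothetical class name `GradeRecord`:

```python
from dataclasses import asdict, astuple, dataclass


@dataclass(frozen=True)
class GradeRecord:  # hypothetical name
    score: int
    weight: float


record = GradeRecord(99, 0.10)

try:
    record[0]
    subscriptable = True
except TypeError:
    # A plain data class defines no __getitem__, so indexing fails.
    subscriptable = False

as_tuple = astuple(record)  # explicit opt-in to positional access
as_dict = asdict(record)    # explicit opt-in to key-based access
print(subscriptable, as_tuple, as_dict)
```

The conversion is a deliberate call at the use site, so positional access cannot happen by accident.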

In [196]:
grade_nt3, grade_dc

(GradeGiGa(score=99, weight=0.1, comment='Good job from NamedTuple class with overridden __getitem__ and __getattribute__'),
 GradeDc(score=99, weight=0.1, comment='Good job from dataclass!'))

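Judging by `sys.getsizeof`, a data class instance looks somewhat more lightweight than a named tuple, but that figure is shallow: it leaves out the instance `__dict__` that holds the field values, so the comparison should be read with care. The main benefit is that we don't have to override any dunder methods to get the desired behaviour, and a data class describes more accurately what we want to achieve: a lightweight class that is easy to declare and instantiate.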

In [197]:
sys.getsizeof(grade), sys.getsizeof(grade_nt), sys.getsizeof(grade_nt1), sys.getsizeof(grade_dc)

(72, 72, 80, 64)
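
These numbers are shallow sizes. A named tuple instance has no `__dict__`, while a regular data class instance keeps its fields in one, and `sys.getsizeof` does not count it. The sketch below adds the dictionary's own size as a rough correction; the class names are hypothetical, and the exact byte counts vary between Python versions and platforms, so none are shown.

```python
import sys
from dataclasses import dataclass
from typing import NamedTuple


class GradeTuple(NamedTuple):  # hypothetical name
    score: int
    weight: float


@dataclass(frozen=True)
class GradeData:  # hypothetical name
    score: int
    weight: float


grade_tuple = GradeTuple(99, 0.10)
grade_data = GradeData(99, 0.10)

has_dict_tuple = hasattr(grade_tuple, '__dict__')

# Shallow size of the instance, then a rough total including its __dict__.
shallow = sys.getsizeof(grade_data)
with_dict = shallow + sys.getsizeof(vars(grade_data))

print(has_dict_tuple, sys.getsizeof(grade_tuple), shallow, with_dict)
```

Even this total is an approximation, since neither figure includes the field values themselves.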