- Author: Ben Du
- Date: 2020-10-14 23:34:04
- Title: Dataclass vs namedtuple in Python
- Slug: dataclass-vs-namedtuple-in-python
- Category: Computer Science
- Tags: Computer Science, Python, programming, collections, namedtuple, dataclass
- Modified: 2021-04-14 23:34:04


## Tips and Traps 

1. Prefer `dataclass` to `namedtuple` in most cases.
    - A namedtuple is always immutable, while a dataclass can be either
        mutable (`frozen=False`, which is the default) or immutable (`frozen=True`).
        
    However,
    a namedtuple does have one advantage over a dataclass:
    members of a namedtuple are accessible both via the dot operator and by index.
    In situations where both dot access and index access to members are required,
    a namedtuple comes in handy.
    For example,
    a list of namedtuple objects can be used directly as the data for creating a pandas DataFrame,
    whereas a list of dataclass objects is accepted only by pandas 1.1.0 and later.
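
The difference in member access can be sketched as follows. The types `PointNT` and `PointDC` are hypothetical names introduced only for this illustration; `dataclasses.astuple` is one way to get index access to the fields of a dataclass object when it is needed.

```python
from collections import namedtuple
from dataclasses import astuple, dataclass

# Hypothetical example types, used only for this illustration.
PointNT = namedtuple("PointNT", ["x", "y"])


@dataclass
class PointDC:
    x: int
    y: int


nt = PointNT(1, 2)
dc = PointDC(1, 2)

# A namedtuple supports both attribute access and positional indexing.
nt_by_name = nt.x
nt_by_index = nt[0]

# A dataclass object supports attribute access only; indexing raises TypeError.
try:
    dc[0]
    dc_indexable = True
except TypeError:
    dc_indexable = False

# astuple converts a dataclass object into a plain tuple, which can be indexed.
dc_as_tuple = astuple(dc)
```

After running this, `nt_by_name` and `nt_by_index` hold the same value, `dc_indexable` is `False`, and `dc_as_tuple[0]` gives index access to the first field.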


## dataclass - mutable

In [1]:
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int = 10

In [2]:
Person("Ben")

Person(name='Ben', age=10)

In [3]:
p = Person("Ben", 34)
p

Person(name='Ben', age=34)

In [4]:
str(p)

"Person(name='Ben', age=34)"

Attributes of the class `Person` are mutable
since `frozen=False` (the default).

In [13]:
p.age = 20
p

Person(name='Ben', age=20)

## dataclass - immutable

In [18]:
@dataclass(frozen=True)
class PersonImmutable:
    name: str
    age: int = 10

In [19]:
PersonImmutable("Ben")

PersonImmutable(name='Ben', age=10)

In [20]:
p = PersonImmutable("Ben", 30)
p

PersonImmutable(name='Ben', age=30)

Attributes of `PersonImmutable` are immutable since `frozen=True`.

In [22]:
p.age = 20

FrozenInstanceError: cannot assign to field 'age'
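
A minimal sketch of how the error above might be handled in code, and how a changed copy of a frozen object can be made instead. It redefines `PersonImmutable` so that the block is self-contained; the variable names `assignment_failed`, `older` and `lookup` are introduced here for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class PersonImmutable:
    name: str
    age: int = 10


p = PersonImmutable("Ben", 30)

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    p.age = 20
    assignment_failed = False
except FrozenInstanceError:
    assignment_failed = True

# replace() builds a new object with the given fields changed; p is untouched.
older = replace(p, age=31)

# With frozen=True and the default eq=True, objects are hashable,
# so they can be used as dict keys or set members.
lookup = {p: "original", older: "copy"}
```

`dataclasses.replace` plays the same role for a frozen dataclass that `_replace` plays for a namedtuple.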

## namedtuple

In [2]:
from collections import namedtuple

In [7]:
PersonNT = namedtuple("PersonNT", ["name", "age"])

In [8]:
p = PersonNT("Ben Du", 30)
p

PersonNT(name='Ben Du', age=30)

In [9]:
p[0]

'Ben Du'

In [10]:
p[1]

30
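
Besides indexing, a namedtuple behaves like a regular tuple in other ways. The sketch below shows unpacking, immutability, and the helper methods `_replace` and `_asdict`. The default value for `age` is an assumption added here to mirror the dataclass examples; it is not part of the `PersonNT` definition above.

```python
from collections import namedtuple

# defaults apply to the rightmost fields, so only age gets a default here.
PersonNT = namedtuple("PersonNT", ["name", "age"], defaults=[10])

p = PersonNT("Ben Du", 30)

# Tuple unpacking works because a namedtuple is a tuple.
name, age = p

# Fields cannot be reassigned; the attempt raises AttributeError.
try:
    p.age = 31
    mutated = True
except AttributeError:
    mutated = False

# _replace returns a new object; _asdict maps field names to values.
older = p._replace(age=31)
as_dict = p._asdict()

# Omitting age falls back to the default.
default_age = PersonNT("Ben Du").age
```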

## Objects of namedtuple as pandas DataFrame Data

In [11]:
import pandas as pd

In [12]:
pd.DataFrame(data=[p])

|   | name   | age |
|---|--------|-----|
| 0 | Ben Du | 30  |
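
When a list of dataclass objects cannot be passed to the DataFrame constructor directly, one possible workaround is to convert each object to a dict first with `dataclasses.asdict`. The sketch below stops at the list of dicts so that it runs without pandas installed; the second person, "Amy", is made-up sample data.

```python
from dataclasses import asdict, dataclass


@dataclass
class Person:
    name: str
    age: int = 10


# "Amy" is hypothetical sample data added for this illustration.
people = [Person("Ben", 34), Person("Amy")]

# Each record is a plain dict keyed by field name, one dict per row.
records = [asdict(person) for person in people]
```

Passing `records` to `pd.DataFrame(records)` then gives a DataFrame with the columns `name` and `age`, one row per object.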


## References

[collections.namedtuple](https://docs.python.org/3/library/collections.html#collections.namedtuple)

[Difference between DataClass vs NamedTuple vs Object in Python](https://www.geeksforgeeks.org/difference-between-dataclass-vs-namedtuple-vs-object-in-python/)

[DataClass vs NamedTuple vs Object: A Battle of Performance in Python](https://medium.com/@jacktator/dataclass-vs-namedtuple-vs-object-for-performance-optimization-in-python-691e234253b9)