## <font color='darkblue'>Preface</font>
<font size='3ptx'><b>([article source](https://haosquare.com/python-dataclass/#dataclass_%E8%A7%A3%E6%B1%BA%E7%9A%84%E7%97%9B%E9%BB%9E)) Learning from examples with context is usually efficient. Here we start from a simple coding example to see why [dataclasses](https://docs.python.org/3/library/dataclasses.html) is useful and how to use it.</b></font>

Let's imagine you are responsible for maintaining a function that returns an employee's information given his/her ID:

In [18]:
from datetime import datetime

UNKNOWN_EMPLOYEE = ('?', '?', 0, datetime.now())

employee_info_dict = {
    # UID: (Name, Job, Salary, On-boarding date)
    1: ('John', 'Developer', 12345, datetime.strptime('2023-03-01', '%Y-%m-%d')),
    2: ('Mary', 'Manager', 45000, datetime.strptime('2022-01-12', '%Y-%m-%d')),
    3: ('Ken', 'CEO', 999999, datetime.strptime('2021-06-06', '%Y-%m-%d')),
}

def get_employee(uid: int) -> tuple[str, str, int, datetime]:
    return employee_info_dict.get(uid, UNKNOWN_EMPLOYEE)

In [19]:
# Retrieve employee 'Mary'
get_employee(2)

('Mary', 'Manager', 45000, datetime.datetime(2022, 1, 12, 0, 0))

In [20]:
# Return unknown employee
get_employee(99)

('?', '?', 0, datetime.datetime(2023, 4, 26, 22, 35, 54, 986183))

## <font color='darkblue'>Use Class Employee</font>
Using a tuple to store employee information makes it inconvenient to extend the record and to access a specific field. Say we want to add one more field, `address`, to the employee tuple; then we have to rewrite both the data and the function `get_employee`:

In [21]:
UNKNOWN_EMPLOYEE = ('?', '?', 0, datetime.now(), '?')

employee_info_dict = {
    # UID: (Name, Job, Salary, On board date, Address)
    1: ('John', 'Developer', 12345, datetime.strptime('2023-03-01', '%Y-%m-%d'), 'address1'),
    2: ('Mary', 'Manager', 45000, datetime.strptime('2022-01-12', '%Y-%m-%d'), 'address2'),
    3: ('Ken', 'CEO', 999999, datetime.strptime('2021-06-06', '%Y-%m-%d'), 'address3'),
}

def get_employee(uid: int) -> tuple[str, str, int, datetime, str]:
    return employee_info_dict.get(uid, UNKNOWN_EMPLOYEE)

In [22]:
# Get employee John
employee_john = get_employee(1)
print(f'Address of John is {employee_john[4]}')  # Accessing the address by position 4 hurts readability.

Address of John is address1


So we decide to define a class <b><font color='blue'>Employee</font></b> to represent the employee's information:

In [23]:
class Employee:
    def __init__(self, uid: int, name: str, job: str, salary: int, on_board_date: datetime, address: str):
        self.uid = uid
        self.name = name
        self.job = job
        self.salary = salary
        self.on_board_date = on_board_date
        self.address = address

Next, we have to rebuild `employee_info_dict` using class <b><font color='blue'>Employee</font></b>:

In [24]:
UNKNOWN_EMPLOYEE = Employee(-1, '?', '?', -1, datetime.now(), '?')

employee_info_dict = {
    # UID: Employee(uid, name, job, salary, on_board_date, address)
    1: Employee(1, 'John', 'Developer', 12345, datetime.strptime('2023-03-01', '%Y-%m-%d'), 'address1'),
    2: Employee(2, 'Mary', 'Manager', 45000, datetime.strptime('2022-01-12', '%Y-%m-%d'), 'address2'),
    3: Employee(3, 'Ken', 'CEO', 999999, datetime.strptime('2021-06-06', '%Y-%m-%d'), 'address3'),
}

def get_employee(uid: int) -> Employee:
    return employee_info_dict.get(uid, UNKNOWN_EMPLOYEE)

In [25]:
# Get employee Ken
employee_ken = get_employee(3)

# Now we could access the data of employee in a better way
print(f'{employee_ken.name} lives in {employee_ken.address}')

Ken lives in address3


In [26]:
# What if we print the employee object?
employee_ken

<__main__.Employee at 0x7f066c70d6a0>

It is not easy to interpret the information when printing the <b><font color='blue'>Employee</font></b> object directly. Let's implement methods `__repr__` and `__str__` to print useful employee information for better readability:

In [27]:
class Employee:
    def __init__(self, uid: int, name: str, job: str, salary: int, on_board_date: datetime, address: str):
        self.uid = uid
        self.name = name
        self.job = job
        self.salary = salary
        self.on_board_date = on_board_date
        self.address = address
        
    def __str__(self):
        return self.__repr__()
    
    def __repr__(self):
        return f'Employee: name={self.name}; job={self.job}; onboarding date={self.on_board_date}'

In [28]:
UNKNOWN_EMPLOYEE = Employee(-1, '?', '?', -1, datetime.now(), '?')

employee_info_dict = {
    # UID: Employee(uid, name, job, salary, on_board_date, address)
    1: Employee(1, 'John', 'Developer', 12345, datetime.strptime('2023-03-01', '%Y-%m-%d'), 'address1'),
    2: Employee(2, 'Mary', 'Manager', 45000, datetime.strptime('2022-01-12', '%Y-%m-%d'), 'address2'),
    3: Employee(3, 'Ken', 'CEO', 999999, datetime.strptime('2021-06-06', '%Y-%m-%d'), 'address3'),
}

In [29]:
# Printing the employee object now gives a more meaningful message:
get_employee(3)

Employee: name=Ken; job=CEO; onboarding date=2021-06-06 00:00:00
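
As a side note, Python's default `__str__` falls back to `__repr__`, so defining `__repr__` alone is usually enough to get the same output from `print()`. A minimal sketch, using a hypothetical `Point` class that is not part of the example above:

```python
class Point:
    """Hypothetical class that only defines __repr__."""

    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f'Point(x={self.x}, y={self.y})'


p = Point(1, 2)

# str() and print() fall back to __repr__ because __str__ is not overridden.
print(str(p))
print(repr(p))
```

Both calls print `Point(x=1, y=2)`. An explicit `__str__` is only needed when the user-facing text should differ from the debugging representation.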

If, for some reason, you want to prevent the object's fields from being edited, extra work is needed. Right now any field can be reassigned freely:

In [33]:
employee = get_employee(3)
print(f'Before editing: {employee}')
employee.name = 'AAA'
print(f'After editing: {employee}')

Before editing: Employee: name=Ken; job=CEO; onboarding date=2021-06-06 00:00:00
After editing: Employee: name=AAA; job=CEO; onboarding date=2021-06-06 00:00:00


In [34]:
get_employee(3)

Employee: name=AAA; job=CEO; onboarding date=2021-06-06 00:00:00

Now let's rewrite class <b><font color='blue'>Employee</font></b> to make its fields read-only:

In [35]:
class Employee:
    def __init__(self, uid: int, name: str, job: str, salary: int, on_board_date: datetime, address: str):
        self._uid = uid
        self._name = name
        self._job = job
        self._salary = salary
        self._on_board_date = on_board_date
        self._address = address
        
    @property
    def uid(self):
        return self._uid
    
    @property
    def name(self):
        return self._name
    
    @property
    def job(self):
        return self._job
    
    @property
    def salary(self):
        return self._salary
    
    @property
    def on_board_date(self):
        return self._on_board_date
    
    @property
    def address(self):
        return self._address
        
    def __str__(self):
        return self.__repr__()
    
    def __repr__(self):
        return f'Employee: name={self.name}; job={self.job}; on board date={self.on_board_date}'

In [36]:
UNKNOWN_EMPLOYEE = Employee(-1, '?', '?', -1, datetime.now(), '?')

employee_info_dict = {
    # UID: Employee(uid, name, job, salary, on_board_date, address)
    1: Employee(1, 'John', 'Developer', 12345, datetime.strptime('2023-03-01', '%Y-%m-%d'), 'address1'),
    2: Employee(2, 'Mary', 'Manager', 45000, datetime.strptime('2022-01-12', '%Y-%m-%d'), 'address2'),
    3: Employee(3, 'Ken', 'CEO', 999999, datetime.strptime('2021-06-06', '%Y-%m-%d'), 'address3'),
}

In [37]:
employee = get_employee(3)
print(employee)

Employee: name=Ken; job=CEO; on board date=2021-06-06 00:00:00


In [38]:
employee.name

'Ken'

In [39]:
# AttributeError: can't set attribute 'name'
# employee.name = 'Kenny'

AttributeError: can't set attribute
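
The read-only behaviour above can be checked without crashing the cell by catching the exception. The sketch below uses a hypothetical, trimmed-down `ReadOnlyName` class (an assumption for illustration, not the `Employee` class itself). It also shows a limit of this approach: the underscore-prefixed attribute is private only by convention.

```python
class ReadOnlyName:
    """Hypothetical, trimmed-down version of the read-only class above."""

    def __init__(self, name: str):
        self._name = name

    @property
    def name(self) -> str:
        return self._name


person = ReadOnlyName('Ken')

try:
    person.name = 'Kenny'  # No setter is defined, so this raises.
    assignment_blocked = False
except AttributeError:
    assignment_blocked = True

print(assignment_blocked)

# Nothing stops a caller from reassigning the underlying attribute directly.
person._name = 'Kenny'
print(person.name)
```

So properties without setters guard against accidental edits, at the cost of one property per field, but they do not make the object truly immutable.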

Now let's consider another example: suppose we want to do some operations on the X-Y plane:

In [42]:
from __future__ import annotations


class XYPos:
    def __init__(self, x: int, y: int):
        self._x = x
        self._y = y
        
    @property
    def x(self) -> int:
        return self._x
    
    @property
    def y(self) -> int:
        return self._y
    
    def up(self) -> XYPos:
        return XYPos(x=self.x, y=self.y + 1)
    
    def down(self) -> XYPos:
        return XYPos(x=self.x, y=self.y - 1)
    
    def right(self) -> XYPos:
        return XYPos(x=self.x + 1, y=self.y)
        
    def left(self) -> XYPos:
        return XYPos(x=self.x - 1, y=self.y)
    
    def __str__(self):
        return f'Pos(x={self.x}, y={self.y})'
    
    def __repr__(self):
        return self.__str__()
    
    
ORIGIN = XYPos(0, 0)

In [40]:
print(f'{ORIGIN.up()}')
print(f'{ORIGIN.down()}')
print(f'{ORIGIN.left()}')
print(f'{ORIGIN.right()}')

Pos(x=0, y=1)
Pos(x=0, y=-1)
Pos(x=-1, y=0)
Pos(x=1, y=0)


So we expect `ORIGIN.up().down()` to be `ORIGIN`. But...

In [43]:
ORIGIN.up().down() == ORIGIN

False

The reason is the default implementation of method [`__eq__`](https://docs.python.org/3/reference/datamodel.html#object.__eq__) of a class:
> By default, object implements \_\_eq__() by using **[is](https://docs.python.org/3/library/operator.html#operator.is_)**, returning <font color='blue'><b>NotImplemented</b></font> in the case of a false comparison: `True if x is y else NotImplemented`.

In [44]:
ORIGIN.up().down() is ORIGIN

False

So we have to implement [`__eq__`](https://docs.python.org/3/reference/datamodel.html#object.__eq__) on our own so that two objects at the same position compare as equal.

In [45]:
from typing import Any


class XYPos:
    def __init__(self, x: int, y: int):
        self._x = x
        self._y = y
        
    @property
    def x(self) -> int:
        return self._x
    
    @property
    def y(self) -> int:
        return self._y
    
    def __eq__(self, another_pos: Any) -> bool:
        if not isinstance(another_pos, XYPos):
            return False
        
        return another_pos.x == self.x and another_pos.y == self.y
    
    def up(self) -> XYPos:
        return XYPos(x=self.x, y=self.y + 1)
    
    def down(self) -> XYPos:
        return XYPos(x=self.x, y=self.y - 1)
    
    def right(self) -> XYPos:
        return XYPos(x=self.x + 1, y=self.y)
        
    def left(self) -> XYPos:
        return XYPos(x=self.x - 1, y=self.y)
    
    def __str__(self):
        return f'Pos(x={self.x}, y={self.y})'
    
    def __repr__(self):
        return self.__str__()
    
    
ORIGIN = XYPos(0, 0)

In [46]:
# Now we could get what we want:
ORIGIN.up().down() == ORIGIN

True

## <font color='darkblue'>Using Dataclasses</font>
Now let's see how to apply [**dataclasses**](https://docs.python.org/3/library/dataclasses.html) to the Employee use case:
> [**dataclasses**](https://docs.python.org/3/library/dataclasses.html) provides a decorator and functions for automatically adding generated [special method](https://docs.python.org/3/glossary.html#term-special-method)s such as [`__init__()`](https://docs.python.org/3/reference/datamodel.html#object.__init__) and [`__repr__()`](https://docs.python.org/3/reference/datamodel.html#object.__repr__) to user-defined classes. It was originally described in [**PEP 557**](https://peps.python.org/pep-0557/).

In [11]:
from dataclasses import dataclass, field

### <font color='darkgreen'>Employee dataclass</font>

In [47]:
@dataclass(frozen=True)
class Employee:
    uid: int = -1
    name: str = '?'
    job: str = '?'
    salary: int = -1
    on_board_date: datetime = field(default_factory=datetime.now)
    address: str = '?'
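
The `field(default_factory=datetime.now)` default above is called every time an instance is created without an explicit `on_board_date`, whereas a plain `datetime.now()` default would be evaluated only once, when the class is defined. The sketch below illustrates the difference with a hypothetical counter (`next_ticket`) standing in for the clock so that the output is deterministic; the class names are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from itertools import count

_ticket = count(1)  # Hypothetical counter standing in for a clock.


def next_ticket() -> int:
    return next(_ticket)


@dataclass(frozen=True)
class EagerDefault:
    # Evaluated once, when the class body runs.
    ticket: int = next_ticket()


@dataclass(frozen=True)
class LazyDefault:
    # Called again for every instance that omits `ticket`.
    ticket: int = field(default_factory=next_ticket)


print(EagerDefault().ticket, EagerDefault().ticket)
print(LazyDefault().ticket, LazyDefault().ticket)
```

The first line prints the same value twice, while the second prints two different values.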

In [48]:
UNKNOWN_EMPLOYEE = Employee()

employee_info_dict = {
    1: Employee(1, 'John', 'Developer', 12345, datetime.strptime('2023-03-01', '%Y-%m-%d'), 'address1'),
    2: Employee(2, 'Mary', 'Manager', 45000, datetime.strptime('2022-01-12', '%Y-%m-%d'), 'address2'),
    3: Employee(3, 'Ken', 'CEO', 999999, datetime.strptime('2021-06-06', '%Y-%m-%d'), 'address3'),
}

In [49]:
def get_employee(uid: int) -> Employee:
    return employee_info_dict.get(uid, UNKNOWN_EMPLOYEE)

### <font color='darkgreen'>Meaningful Object Representation</font>

In [50]:
get_employee(1)

Employee(uid=1, name='John', job='Developer', salary=12345, on_board_date=datetime.datetime(2023, 3, 1, 0, 0), address='address1')

### <font color='darkgreen'>Object equality</font>

In [52]:
print(get_employee(1) == get_employee(1))
print(get_employee(1) == UNKNOWN_EMPLOYEE)

True
False
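
The same idea carries over to the X-Y plane example from earlier. The following is a sketch of how `XYPos` might look as a frozen dataclass, assuming the same four movement methods; the generated `__eq__` compares field values, so no hand-written comparison is needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XYPos:
    x: int = 0
    y: int = 0

    def up(self) -> 'XYPos':
        return XYPos(x=self.x, y=self.y + 1)

    def down(self) -> 'XYPos':
        return XYPos(x=self.x, y=self.y - 1)

    def right(self) -> 'XYPos':
        return XYPos(x=self.x + 1, y=self.y)

    def left(self) -> 'XYPos':
        return XYPos(x=self.x - 1, y=self.y)


ORIGIN = XYPos(0, 0)

print(ORIGIN.up())                         # Uses the generated __repr__
print(ORIGIN.up().down() == ORIGIN)        # Generated __eq__ compares fields
print(len({ORIGIN, ORIGIN.up().down()}))   # frozen + eq also generates __hash__
```

This prints `XYPos(x=0, y=1)`, `True` and `1`. Because the class is frozen and comparable, instances are hashable and can be used in sets or as dictionary keys.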


### <font color='darkgreen'>Frozen object</font>

In [54]:
employee_john = get_employee(1)

# FrozenInstanceError: cannot assign to field 'name'
# employee_john.name = 'Johny'
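
When a changed copy of a frozen object is needed, `dataclasses.replace` builds a new instance and leaves the original untouched. A minimal sketch, using a hypothetical trimmed-down `Person` dataclass in place of `Employee`:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Person:
    """Hypothetical, trimmed-down stand-in for Employee."""
    name: str = '?'
    job: str = '?'


john = Person('John', 'Developer')

try:
    john.name = 'Johny'  # Assignment to a frozen field raises.
except FrozenInstanceError as error:
    print(f'Blocked: {error}')

# replace() returns a new instance with the given fields changed.
johny = replace(john, name='Johny')
print(john)
print(johny)
```

`FrozenInstanceError` is a subclass of `AttributeError`, so code that already catches `AttributeError` for the property-based version keeps working.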

## <font color='darkblue'>Supplement</font>
* [Medium - 9 Reasons Why You Should Start Using Python Dataclasses](https://towardsdatascience.com/9-reasons-why-you-should-start-using-python-dataclasses-98271adadc66)
* [RealPython - Data Classes in Python 3.7+ (Guide)](https://realpython.com/python-data-classes/)