# Dataclasses

Before Python 3.7, representing simple data structures often involved using tuples, dictionaries, or basic classes, each with its limitations in readability, maintainability, and structure. 
Data classes, introduced in Python 3.7 via PEP 557, addressed these challenges by offering a structured yet flexible way to define classes primarily for storing data.

With **dataclasses**:

- Readability is enhanced with less boilerplate and clear type annotations.
- Maintainability improves due to auto-generated methods and reduced custom code.
- Flexibility is achieved as they combine the fixed, ordered structure of tuples with the named access of dictionaries.
- Type safety is promoted by integrating with static type checkers like mypy. The annotations are not enforced at runtime.

In essence, data classes represent a modern approach to data-centric object-oriented programming in Python, emphasizing clarity, structure, and type safety.

## Defining a dataclass

In [3]:
from dataclasses import dataclass, field

@dataclass
class DcPerson:
    first_name: str
    last_name: str
    age: int

The `dataclasses` module generates the boilerplate for several foundational class operations, such as initialization, representation, and comparison, so there is less code to write and maintain.

A dataclass is still an ordinary Python class: you can add methods, override the generated ones, and use inheritance, so you keep full control over the class's behavior and structure.
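To make the generated boilerplate concrete, the sketch below writes out by hand roughly what `@dataclass` produces for a three-field class such as `DcPerson`. `ManualPerson` is a hypothetical name used only for illustration, and the real generated methods differ in detail (for example, the generated `__repr__` uses the class's qualified name).

```python
from dataclasses import dataclass


@dataclass
class DcPerson:
    first_name: str
    last_name: str
    age: int


class ManualPerson:
    """Hand-written approximation of what @dataclass generates (illustrative only)."""

    def __init__(self, first_name: str, last_name: str, age: int) -> None:
        self.first_name = first_name
        self.last_name = last_name
        self.age = age

    def __repr__(self) -> str:
        return (
            f"{type(self).__name__}(first_name={self.first_name!r}, "
            f"last_name={self.last_name!r}, age={self.age!r})"
        )

    def __eq__(self, other: object) -> bool:
        # Like the generated __eq__, only compare instances of the same class.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.first_name, self.last_name, self.age) == (
            other.first_name, other.last_name, other.age)


manual = ManualPerson("Jas", "Gujral", 35)
generated = DcPerson("Jas", "Gujral", 35)
print(manual)
print(generated)
```

Both objects print their field values and compare equal to another instance with the same data; the dataclass version gets this behavior from three annotated lines.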

Basic functionality provided by dataclasses:

1. **Auto-generation of Special Methods**:
    - `__init__`: Constructor method
    - `__repr__`: Representation method
    - `__eq__`: Equality comparison method
    - Ordering methods (`__lt__`, `__le__`, `__gt__`, `__ge__`) with the `order` parameter
    - `__hash__`: Hash method, generated under certain conditions
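As a small sketch of the last two items, the hypothetical `Version` class below uses `order=True` to get the comparison methods and `frozen=True`, which (together with the default `eq=True`) also makes instances hashable. Ordering compares the fields as a tuple, in definition order.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(order=True, frozen=True)
class Version:
    major: int
    minor: int


v1 = Version(1, 4)
v2 = Version(2, 0)

# Ordering compares (major, minor) field by field.
print(v1 < v2)                                  # True
print(sorted([v2, v1]) == [v1, v2])             # True

# eq=True plus frozen=True generates __hash__, so instances work in sets.
print(len({Version(1, 4), Version(1, 4), v2}))  # 2

# Frozen instances reject attribute assignment.
try:
    v1.major = 3
except FrozenInstanceError as exc:
    print(f"cannot modify: {exc}")
```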


As shown below, a dataclass does not need a hand-written `__init__` method; the decorator generates one from the annotated fields.

In [16]:
from dataclasses import dataclass, field, asdict, astuple, replace, is_dataclass

@dataclass(order=True, frozen=True)
class Person:
    name: str
    age: int
    email: str = field(default='')


person1 = Person('Alice', 25, 'alice@example.com')
person2 = Person('Bob', 30, 'bob@example.com')
person3 = Person('Charlie', 35, 'charlie@example.com')

print(person1, person2, person3, sep='\n')


Person(name='Alice', age=25, email='alice@example.com')
Person(name='Bob', age=30, email='bob@example.com')
Person(name='Charlie', age=35, email='charlie@example.com')


As shown above, `@dataclass` automatically adds a `__repr__` implementation that presents the class name together with the value of every field.

To exclude a field from the `__repr__` output, assign it `field(repr=False)`, as shown below.

In [13]:
@dataclass
class Person1:
    name: str
    age: int = field(repr=False)
    email: str = field(default='')
    
person4 = Person1('Dave', 40, 'dave@dummuy.com')

print(person4)

Person1(name='Dave', email='dave@dummuy.com')


A regular class inherits its equality behavior from `object`, which compares instances by identity. A dataclass instead compares all field values by default. Individual fields can be excluded from the generated `__eq__` with `field(compare=False)`.

In [23]:
class DefaultPerson:
    def __init__(self, name, age, email=''):
        self.name = name
        self.age = age
        self.email = email
        
person5 = DefaultPerson('Eve', 45, 'dummy@abc.com')
person6 = DefaultPerson('Eve', 45, 'dummy@abc.com')

print(f'Is {person5} and {person6} from Default class equal? {person5 == person6}')

person7 = Person('Eve', 45, 'dummy@abc.com')
person8 = Person('Eve', 45, 'dummy@abc.com')

print(f'Is {person7} and \n {person8} from Dataclass equal? {person7 == person8}')


Is <__main__.DefaultPerson object at 0x107df78b0> and <__main__.DefaultPerson object at 0x107df7340> from Default class equal? False
Is Person(name='Eve', age=45, email='dummy@abc.com') and 
 Person(name='Eve', age=45, email='dummy@abc.com') from Dataclass equal? True


A dataclass can also be configured to compare only the age of a person when determining equality, by marking every other field with `field(compare=False)`.
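A minimal sketch of age-only equality follows. The class name `AgeOnlyPerson` and the sample values are hypothetical and not part of the original notebook.

```python
from dataclasses import dataclass, field


@dataclass
class AgeOnlyPerson:
    # compare=False excludes a field from the generated __eq__
    # (and from the ordering methods, if order=True is used).
    name: str = field(compare=False)
    age: int
    email: str = field(default='', compare=False)


person_a = AgeOnlyPerson('Eve', 45, 'eve@example.com')
person_b = AgeOnlyPerson('Frank', 45, 'frank@example.com')
person_c = AgeOnlyPerson('Eve', 46, 'eve@example.com')

print(person_a == person_b)  # True: same age, other fields ignored
print(person_a == person_c)  # False: ages differ
```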

2. **Field Definitions**:
    - Default values for fields
    - `default_factory`: A callable that produces a default value for fields
    - Field metadata


3. **Decorator Parameters**:
    - `order`: Generate ordering methods
    - `frozen`: Create immutable data classes
    - `unsafe_hash`: Force generation of a `__hash__` method

4. **Utility Functions**:
    - `dataclasses.field()`: Provides customization for individual fields
    - `dataclasses.asdict()`: Converts data class instances to dictionaries
    - `dataclasses.astuple()`: Converts data class instances to tuples
    - `dataclasses.replace()`: Creates a new instance replacing specified fields
    - `dataclasses.is_dataclass()`: Checks if an object is a data class or an instance of one

5. **Post Initialization**:
    - `__post_init__`: Allows further initialization after the `__init__` method

6. **Field Objects**:
    - Describe the name, type, and default value of each field in the data class

7. **Inheritance**:
    - Ability to inherit from other data classes and regular classes
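The utility functions and `Field` objects listed above can be tried out with a short sketch. The `Point` class and its `metadata` key are hypothetical, chosen only for illustration.

```python
from dataclasses import (
    dataclass, field, fields, asdict, astuple, replace, is_dataclass
)


@dataclass(frozen=True)
class Point:
    x: int
    y: int = 0
    # metadata is an arbitrary mapping; dataclasses itself does not interpret it.
    label: str = field(default='', metadata={'doc': 'display name'})


p = Point(1, 2, 'origin-ish')

print(asdict(p))    # {'x': 1, 'y': 2, 'label': 'origin-ish'}
print(astuple(p))   # (1, 2, 'origin-ish')

# replace() builds a new instance; the frozen original is left unchanged.
moved = replace(p, x=10)
print(moved.x, p.x)  # 10 1

# is_dataclass() accepts both the class and its instances.
print(is_dataclass(Point), is_dataclass(p), is_dataclass(42))  # True True False

# fields() returns Field objects describing each field.
for f in fields(Point):
    print(f.name, f.type, dict(f.metadata))
```

Because `Point` is frozen, `replace()` is the idiomatic way to get a modified copy.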

In [1]:
class Person:
    
    def __init__(self, first_name, last_name, age):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age

In [2]:
personA = Person("Jas", "Gujral", 35)
personB = Person("Jas", "Gujral", 35)
personA == personB

False

In [4]:
dc_personA = DcPerson("Jas", "Gujral", 35)
dc_personB = DcPerson("Jas", "Gujral", 35)
dc_personA == dc_personB

True

In [5]:
print(f'{personA=}')
print(f'{dc_personA=}')

personA=<__main__.Person object at 0x107fc72e0>
dc_personA=DcPerson(first_name='Jas', last_name='Gujral', age=35)


In [6]:
@dataclass
class DcPerson:
    first_name: str
    last_name: str
    age: int
    married: bool = False
    friends: list[str] = []

ValueError: mutable default <class 'list'> for field friends is not allowed: use default_factory

In [9]:
@dataclass
class DcPerson:
    first_name: str
    last_name: str
    age: int
    married: bool = False
    friends: list[str] = field(default_factory=list)  # The factory is called separately for each new instance.

In [10]:
dc_personC = DcPerson("Jas", "Gujral", 35)
dc_personC

DcPerson(first_name='Jas', last_name='Gujral', age=35, married=False, friends=[])
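The `ValueError` above exists because a plain `[]` default would be a single list shared by every instance. A minimal sketch, using a hypothetical `Team` class, shows that `default_factory` gives each instance its own list:

```python
from dataclasses import dataclass, field


@dataclass
class Team:
    name: str
    # list() is called once per instance, so no list is shared.
    members: list[str] = field(default_factory=list)


team_a = Team('A')
team_b = Team('B')
team_a.members.append('Jas')

print(team_a.members)                    # ['Jas']
print(team_b.members)                    # []
print(team_a.members is team_b.members)  # False
```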

In [None]:
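@dataclass
class DcPerson1:
    person = []  # No annotation, so this is a shared class attribute, not a dataclass field.
    first_name: str
    last_name: str
    age: int
    
    def __post_init__(self):
        cls = self.__class__
        # Assumed intent (the original cell stops here): record each new instance in the class-level list.
        cls.person.append(self)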
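A common use of `__post_init__` is to validate input or compute one field from the others. The sketch below uses hypothetical `Employee` and `Manager` classes (not from the original notebook) to show a derived `full_name` field that is excluded from `__init__` with `init=False`, and a dataclass inheriting from another dataclass.

```python
from dataclasses import dataclass, field


@dataclass
class Employee:
    first_name: str
    last_name: str
    age: int
    # init=False: not a constructor argument; filled in by __post_init__.
    full_name: str = field(init=False)

    def __post_init__(self):
        # Called automatically at the end of the generated __init__.
        if self.age < 0:
            raise ValueError('age must be non-negative')
        self.full_name = f'{self.first_name} {self.last_name}'


@dataclass
class Manager(Employee):
    # Base-class fields come first, followed by the subclass's own fields.
    reports: list[str] = field(default_factory=list)


boss = Manager('Jas', 'Gujral', 35)
print(boss.full_name)  # Jas Gujral
print(boss)
```

`Manager` inherits `__post_init__` from `Employee`, so the validation and the derived field also apply to the subclass.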
        

For detailed examples and features, refer to the official documentation: https://docs.python.org/3.10/library/dataclasses.html