<h1 align = "center">Introduction to <code>dataclasses</code></h1>

---

Among the various *[Python runtime services](https://docs.python.org/3/library/python.html)*, the **[`dataclasses`](https://docs.python.org/3/library/dataclasses.html)** module provides a decorator and functions for automatically adding special methods (such as `__init__()` and `__repr__()`) to user-defined classes. This notebook is a simple guide to the patterns I have used, and still use, in some of my projects. In addition, a general-purpose dataclass structure (boilerplate) is defined at the end. Helpful links:
* https://zetcode.com/python/dataclass/
* https://www.infoworld.com/article/3563878/how-to-use-python-dataclasses.html
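
To make concrete what the decorator generates, the sketch below compares a hand-written class with its dataclass counterpart. The names `ManualPoint` and `Point` are hypothetical and exist only for this illustration. Besides `__init__()` and `__repr__()`, `@dataclass` also adds `__eq__()` by default.

```python
from dataclasses import dataclass


class ManualPoint:
    """Hand-written equivalent of the dataclass below (hypothetical example)."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"ManualPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other: object) -> bool:
        # compare field by field, but only against the same class
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


@dataclass
class Point:
    """Same behaviour; the special methods are generated by the decorator."""

    x: float
    y: float


manual, auto = ManualPoint(1.0, 2.0), Point(1.0, 2.0)
print(manual, auto)
print(auto == Point(1.0, 2.0), manual == ManualPoint(1.0, 2.0))
```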

In [105]:
import inspect # required for adv. handling; check boilerplate
import warnings # warnings module to warn user of certain things

In [99]:
from dataclasses import (
    asdict,
    dataclass,
    field,
    fields
)

## Defining a Data Class

In [65]:
@dataclass()
class MyClass(object):
    name : str
    height : float
    weight : float

In [7]:
obj = MyClass("Debmalya Pramanik", 168.0, 78.8)
print(obj, repr(obj), asdict(obj))

MyClass(name='Debmalya Pramanik', height=168.0, weight=78.8) MyClass(name='Debmalya Pramanik', height=168.0, weight=78.8) {'name': 'Debmalya Pramanik', 'height': 168.0, 'weight': 78.8}


### Controlling `repr` and `str` Functionalities

In [66]:
@dataclass
class MyClass(object):
    name : str
    height : float = field(default = 168.2)
    weight : float = field(default = 78.80, repr = False)

In [11]:
obj = MyClass("Debmalya Pramanik")
print(obj, repr(obj), asdict(obj))

MyClass(name='Debmalya Pramanik', height=168.2) MyClass(name='Debmalya Pramanik', height=168.2) {'name': 'Debmalya Pramanik', 'height': 168.2, 'weight': 78.8}


### Inherit `dataclass` and Initialize

In [12]:
class ActualClass(MyClass):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        
obj = ActualClass("Debmalya Pramanik")
print(obj, repr(obj), asdict(obj))

ActualClass(name='Debmalya Pramanik', height=168.2) ActualClass(name='Debmalya Pramanik', height=168.2) {'name': 'Debmalya Pramanik', 'height': 168.2, 'weight': 78.8}
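
A subclass can also be decorated with `@dataclass` itself, in which case its own fields are appended after those of the parent. The following is a minimal sketch with the hypothetical names `Person` and `Employee`. Because the parent already has a field with a default value, the field added in the child needs a default as well.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Person:
    name   : str
    height : float = field(default = 168.2)


@dataclass
class Employee(Person):
    # fields of the parent come first in the generated `__init__`;
    # since `height` has a default, `designation` needs one too
    designation : str = field(default = "manager")


emp = Employee("Debmalya Pramanik", designation = "analyst")
print(emp, asdict(emp))
```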


## Controls and Adv. Functionalities

A dataclass is still a regular class, so its behaviour can be customised with `@property` attributes, ordinary methods, and the `__post_init__` hook, as shown below.

### `@property` in a dataclass

In [67]:
@dataclass
class EmployeeDetails(object):
    firstName  : str
    familyName : str
        
    @property
    def fullName(self) -> str:
        # this is not displayed in `repr`
        return self.firstName + " " + self.familyName

In [74]:
emp = EmployeeDetails("Debmalya", "Pramanik")
repr(emp), emp.fullName

("EmployeeDetails(firstName='Debmalya', familyName='Pramanik')",
 'Debmalya Pramanik')

### Adding methods

In [76]:
@dataclass
class EmployeeDetails(object):
    firstName   : str
    familyName  : str
    designation : str = field(default = "manager")
        
    @property
    def fullName(self) -> str:
        # this is not displayed in `repr`
        return self.firstName + " " + self.familyName
    
    def salary(self, arg : object = None) -> str:
        # `arg` is an extra argument, added only to show that methods
        # of a dataclass accept arguments like those of any other
        # class; any other feature can be defined in the same way
        if self.designation == "manager":
            s = 15_000
        else:
            s = 10_000
            
        return f"Salary = {s}; Argument (arg) = {arg}"
    
emp = EmployeeDetails("Debmalya", "Pramanik")
repr(emp), emp.fullName, emp.salary()

("EmployeeDetails(firstName='Debmalya', familyName='Pramanik', designation='manager')",
 'Debmalya Pramanik',
 'Salary = 15000; Argument (arg) = None')

### Controlling with `__post_init__`

In [8]:
@dataclass
class distance(object):
    """
    Base distance class provides module defaults
    for keyword arguments and/or other parameters.
    """
    
    input_distance_unit  : str = field(default = "km")
    output_distance_unit : str = field(default = "km")

    def __post_init__(self):
        if self.input_distance_unit not in ["km", "m"]:
            raise ValueError()

In [11]:
obj = distance(input_distance_unit = "jh") # this raises `ValueError`
obj # not reached, as the constructor above already fails

ValueError: 
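
The empty `ValueError` above does not say which argument was rejected. As a hedged variation, the sketch below validates both fields and names the offending one in the message. The class name `Distance` and the constant `ACCEPTED_UNITS` are assumptions made for this example.

```python
from dataclasses import dataclass, field, fields

ACCEPTED_UNITS = ("km", "m")  # assumed set of valid units


@dataclass
class Distance:
    input_distance_unit  : str = field(default = "km")
    output_distance_unit : str = field(default = "km")

    def __post_init__(self) -> None:
        # validate every field and name the offending one in the message
        for f in fields(self):
            value = getattr(self, f.name)
            if value not in ACCEPTED_UNITS:
                raise ValueError(
                    f"`{f.name}` must be one of {ACCEPTED_UNITS}, got {value!r}"
                )


try:
    Distance(input_distance_unit = "jh")
except ValueError as err:
    message = str(err)
    print(message)
```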

## Boilerplate

In [114]:
class UnitError(ValueError):
    """Raised when the Provided Argument Value is not Accepted"""

In [104]:
class TypeWarning(Warning):
    """Warning is Raised when Argument has a Type that is Not Expected"""

In [116]:
@dataclass
class MyClass(object):
    """
    `dataclass` boilerplate that I personally use in most of my
    projects. The boilerplate provides the following use cases in
    addition to the basic features discussed above:
      * A `__post_init__` method is defined which can be used for
        parameter checking.
      * A `classmethod` that can take several arguments, and
        filters only necessary arguments to the class. This is
        particularly helpful when a child class/function accepts
        many keyword arguments in addition to the ones defined here.
    """
    
    foo : str
    bar : type = field(default = "value", repr = False)
        
    @classmethod
    def from_dict(cls, env):
        """
        This method accepts `n` keyword arguments, even ones
        which are not defined in the data class and filters only
        the ones defined here. Help link:
        # https://stackoverflow.com/a/55096964/6623589
        """
        
        return cls(**{
            k : v for k, v in env.items()
            if k in inspect.signature(cls).parameters
        })
    
    def __accepted_units__(self, param : str):
        """
        Let's assume that `foo` accepts only certain values;
        this function can be used to define those values,
        and to check whether a passed value is accepted.
        
        :param param: Name of the parameter, typically obtained
                      using: `dataclasses.fields(self)[#].name`
        """
        
        return {
            "foo" : ["accepted-1", "accepted-2"]
        }.get(param, None)
        
    def __post_init__(self) -> None:
        """
        This method is automatically invoked just after `__init__()`,
        as the name suggests. The method can be used for controlling
        `init` arguments, for example when an argument only accepts
        certain values. In addition, the method can also be used to
        check if the type of a passed argument matches the one given
        in the definition, and raise a warning or an error if not.
        """
        
        for f in fields(self): # `fields` is imported from `dataclasses` above
            # for each f (field object) check if the defined
            # type matches the data type of the stored value
            if type(getattr(self, f.name)) != f.type:
                # if the data type is not matched, then either
                # raise an error like:
                # raise TypeError(f"Expected `{f.name}` of type {f.type}, but got {type(getattr(self, f.name))}")
                # or, you can even raise a custom warning like:
                warnings.warn(f"Expected `{f.name}` of type {f.type}, but got {type(getattr(self, f.name))}", TypeWarning)
                
            # for each f (field object) check if the given value
            # is accepted or not. This is set under `try... except`
            # as not all fields define a list of accepted values.
            try:
                if getattr(self, f.name) not in self.__accepted_units__(f.name):
                    # raise an error like:
                    raise UnitError(f"{getattr(self, f.name)} is not accepted.")
            except TypeError:
                # TypeError: argument of type 'NoneType' is not iterable
                pass

In [118]:
MyClass(foo = "accepted-3")

UnitError: accepted-3 is not accepted.
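
The boilerplate compares `type(value) != f.type`, which also flags subclasses of the annotated type. A possible alternative, sketched below under the assumption that annotations are plain classes, uses `isinstance` instead. The class name `Record` is hypothetical. String annotations and non-class annotations such as `typing.Optional[int]` are simply skipped here.

```python
import warnings
from dataclasses import dataclass, field, fields


class TypeWarning(Warning):
    """Issued when a field value does not match its annotation."""


@dataclass
class Record:
    foo : str
    bar : int = field(default = 0)

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            # only annotations that are plain classes are checked;
            # anything else is skipped in this sketch
            if isinstance(f.type, type) and not isinstance(value, f.type):
                warnings.warn(
                    f"Expected `{f.name}` of type {f.type.__name__}, "
                    f"but got {type(value).__name__}",
                    TypeWarning,
                )


with warnings.catch_warnings(record = True) as caught:
    warnings.simplefilter("always")
    Record(foo = "text", bar = True)   # `bool` is a subclass of `int`: no warning
    Record(foo = "text", bar = "oops") # `str` is not an `int`: one warning
    print([str(w.message) for w in caught])
```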

In [113]:
"a" in None

TypeError: argument of type 'NoneType' is not iterable
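
Catching `TypeError` works, but it can also hide an unrelated `TypeError` raised inside the check. A hedged alternative is to test for `None` explicitly, as in the sketch below. The names `Checked` and `ACCEPTED` are assumptions for this example and stand in for `MyClass` and `__accepted_units__`.

```python
from dataclasses import dataclass, field, fields


class UnitError(ValueError):
    """Raised when a value is not in the accepted list."""


ACCEPTED = {"foo" : ["accepted-1", "accepted-2"]}  # hypothetical lookup table


@dataclass
class Checked:
    foo : str
    bar : str = field(default = "value")

    def __post_init__(self) -> None:
        for f in fields(self):
            accepted = ACCEPTED.get(f.name)
            if accepted is None:
                continue  # no restriction defined for this field
            if getattr(self, f.name) not in accepted:
                raise UnitError(f"{getattr(self, f.name)} is not accepted.")


ok = Checked(foo = "accepted-1", bar = "anything")
try:
    Checked(foo = "accepted-3")
except UnitError as err:
    message = str(err)
    print(message)
```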

### Advanced Functionalities and Multiple Initialization

In [69]:
@dataclass()
class distance(object):
    input_distance_unit  : str = field(default = "km")
    output_distance_unit : str = field(default = "km")
        
    @classmethod
    def from_dict(cls, env):
        return cls(**{
            k : v for k, v in env.items()
            if k in inspect.signature(cls).parameters
        })
    
    @property
    def max_distance(self) -> float:
        if self.input_distance_unit == "km":
            d = 10
        else:
            d = 10_000
            
        return d

In [73]:
distance().max_distance
# distance().max_distance() would fail: a property is accessed, not called

10

In [71]:
distance(input_distance_unit = "m")

distance(input_distance_unit='m', output_distance_unit='km')

In [70]:
asdict(distance()), repr(distance())

({'input_distance_unit': 'km', 'output_distance_unit': 'km'},
 "distance(input_distance_unit='km', output_distance_unit='km')")

In [45]:
def myFunction(a = 10, **kwargs):
    dist = distance.from_dict(kwargs)
    return dist

In [46]:
myFunction(input_distance_unit = "m", omg = 9)

distance(input_distance_unit='m', output_distance_unit='km')
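
The same filtering can be written with `dataclasses.fields` instead of `inspect.signature`. The sketch below is one possible variant, not the approach used in this notebook. The class `Settings` and its `label` field are hypothetical. Checking `f.init` keeps fields declared with `init = False` out of the constructor call.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Settings:
    unit  : str = field(default = "km")
    # excluded from `__init__`, so it must not be passed to the constructor
    label : str = field(default = "n/a", init = False)

    @classmethod
    def from_dict(cls, env : dict) -> "Settings":
        # keep only the keys that the generated `__init__` accepts
        accepted = {f.name for f in fields(cls) if f.init}
        return cls(**{k : v for k, v in env.items() if k in accepted})


settings = Settings.from_dict({"unit" : "m", "label" : "x", "omg" : 9})
print(settings)
```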

In [32]:
@dataclass
class Config:
    var_1: str = field(default = "km")
    var_2: str = field(default = "km")

    @classmethod
    def from_dict(cls, env):      
        return cls(**{
            k: v for k, v in env.items() 
            if k in inspect.signature(cls).parameters
        })


# usage:
params = {'var_1': 'a', 'var_2': 'b', 'var_3': 'c'}
# pass the dictionary itself: unpacking it as `**params` raises
# TypeError: from_dict() got an unexpected keyword argument 'var_1'
c = Config.from_dict(params)   # works without raising a TypeError
asdict(c)

{'var_1': 'a', 'var_2': 'b'}