# Dataclasses

## Instance Level Attributes
Any time I assign a value to an attribute that does not yet exist on an object, it is spawned into existence.

In [5]:
class One:
    def __init__(self):
        # assigning to self.name spawns the name attribute
        self.name = "ek"

In [6]:
one = One()
print(one.name)
one.__dict__

ek


{'name': 'ek'}

In [7]:
try:
    one.value
except AttributeError as err:
    print(err)

'One' object has no attribute 'value'


In [8]:
# Now value is a valid attribute
one.value = 1
print(one.value)
one.__dict__

1


{'name': 'ek', 'value': 1}

A common practice I have seen is to define all the instance attributes in the init method and give them all some default values.

In [13]:
class Two:
    def __init__(self, value: int):
        self.name: str | None = None
        self.value = value

In [14]:
two = Two(2)
print(two.name, two.value)

None 2


In [15]:
two.name = "do"
print(two.name, two.value)

do 2


## Class Level Attributes
The closest analogy in the C family of languages is a static member variable.

In [27]:
class Three:
    name = "teen"

    def __init__(self):
        self.value = 3

In [28]:
print(Three.name)
Three.__dict__

teen


mappingproxy({'__module__': '__main__',
              '__firstlineno__': 1,
              'name': 'teen',
              '__init__': <function __main__.Three.__init__(self)>,
              '__static_attributes__': ('value',),
              '__dict__': <attribute '__dict__' of 'Three' objects>,
              '__weakref__': <attribute '__weakref__' of 'Three' objects>,
              '__doc__': None})

In [29]:
three = Three()
print(three.name, three.value)
three.__dict__

teen 3


{'value': 3}

Even though I can access `three.name` on the instance, it is just referring to the class level `name`. When I change this attribute on the class, all the objects will see the change.

In [30]:
three_1 = Three()
print(three_1.name, three_1.value)

teen 3


In [31]:
Three.name = "three"
print(three.name, three_1.name)

three three


However, I can create an instance level attribute with the same name `name`. `three.name` will now refer to this new instance level attribute.

In [32]:
three.name = "tres"
three.__dict__

{'value': 3, 'name': 'tres'}

Now when I change the class level attribute, this instance does not change.

In [34]:
Three.name = "trun"
print(three.name, three_1.name)

tres trun
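
The lookup order behind this behaviour can be sketched with a small, hypothetical `Greeting` class (not one of the classes above). As far as I understand it, reading an attribute checks the instance first and falls back to the class, so deleting the instance attribute should make the class level value visible again.

```python
class Greeting:
    word = "hello"  # class level attribute


g = Greeting()
g.word = "hola"       # creates an instance attribute that shadows the class one
shadowed = g.word     # read comes from the instance
del g.word            # removes only the instance attribute
restored = g.word     # lookup falls back to the class attribute
print(shadowed, restored, Greeting.word)
```

The class attribute itself is never touched by the assignment or the `del`; only the instance's own `__dict__` changes.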


Sometimes I'll see what look like class level variables with only a type annotation and no value. These do not create an attribute at runtime; Python only records them in the class's `__annotations__`, so they are really there for mypy and the human reader.

In [35]:
class Four:
    name: str

    def __init__(self, value: int):
        self.value = value

In [36]:
Four.__dict__

mappingproxy({'__module__': '__main__',
              '__firstlineno__': 1,
              '__annotations__': {'name': str},
              '__init__': <function __main__.Four.__init__(self, value: int)>,
              '__static_attributes__': ('value',),
              '__dict__': <attribute '__dict__' of 'Four' objects>,
              '__weakref__': <attribute '__weakref__' of 'Four' objects>,
              '__doc__': None})

In [37]:
try:
    Four.name
except AttributeError as err:
    print(err)

type object 'Four' has no attribute 'name'


I have seen this used as an alternative way of declaring all the instance variables in one place. The benefit over setting every instance level attribute in the init method is that I don't have to provide placeholder default values, e.g., I don't need to annotate `name` as `str | None`; it can just be `str`.
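
A minimal sketch of this convention, using a hypothetical `Pet` class: the annotation documents the attribute, while the init method is still what creates it on each instance.

```python
class Pet:
    # Annotation only: recorded for type checkers, creates no class attribute.
    name: str

    def __init__(self, name: str):
        self.name = name  # the instance attribute is actually created here


pet = Pet("Rex")
print(Pet.__annotations__)
print(hasattr(Pet, "name"), pet.name)
```

Nothing enforces that the init method really assigns `name`; that part remains a convention.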

### Slots
Both ways of declaring instance level variables, either in the init method with default values or as class level annotations, rely on convention. As a user, I can still spawn arbitrary attributes on the object. To avoid this, I can use the class level `__slots__` variable to define the allowable instance attribute names. Instances of a class that declares `__slots__` no longer get a `__dict__` attribute.

More details [here](https://wiki.python.org/moin/UsingSlots)

In [47]:
class Five:
    __slots__ = ("name", "value")

    def __init__(self):
        self.name = "paanch"
        self.value = 5

In [48]:
five = Five()
print(five.name, five.value)

paanch 5


In [49]:
try:
    five.__dict__
except AttributeError as err:
    print(err)

'Five' object has no attribute '__dict__'


In [51]:
try:
    five.prev = 4  # "prev" is not listed in __slots__
except AttributeError as err:
    print(err)

'Five' object has no attribute 'prev' and no __dict__ for setting new attributes
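
One caveat worth sketching, with hypothetical `Point` and `LoosePoint` classes: the restriction only holds while every class in the hierarchy declares `__slots__`. A subclass that omits it gets a `__dict__` back, and arbitrary attributes can be spawned again.

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class LoosePoint(Point):
    pass  # no __slots__ here, so instances get a __dict__ again


p = LoosePoint(1, 2)
p.label = "somewhere"            # allowed: stored in the instance __dict__
print(p.__dict__)                # x and y still live in the slots, not here
print(type(Point.x).__name__)    # each slot is a descriptor on the class
```

If I wanted the subclass to stay closed, I would assume it needs its own `__slots__` declaration, even an empty one.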


## Dataclass

Dataclasses are a convenient way to define classes. As a user, all I need to do is define the fields (aka attributes) of the dataclass.

> A field is defined as a class variable that has a type annotation

Once I do this, I get the following for free:
  * `__init__` that accepts the fields as positional or keyword args
  * `__repr__`
  * `__eq__`

In [72]:
from dataclasses import dataclass, FrozenInstanceError

In [54]:
@dataclass
class Cookie:
    calories: int = 0
    flavor: str = ""

In [55]:
# Got the init method
c1 = Cookie(calories=200, flavor="Chocolate Chip")

In [56]:
# Got the __repr__
c1

Cookie(calories=200, flavor='Chocolate Chip')

In [59]:
# Got the __eq__
c2 = Cookie(calories=200, flavor="Chocolate Chip")
c1 == c2

True
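
For comparison, here is roughly what I would have to write by hand to get the same three methods. `ManualCookie` is a hypothetical class, and this is only an approximation of what the decorator generates, not its actual output.

```python
class ManualCookie:
    def __init__(self, calories: int = 0, flavor: str = ""):
        self.calories = calories
        self.flavor = flavor

    def __repr__(self) -> str:
        name = type(self).__name__
        return f"{name}(calories={self.calories!r}, flavor={self.flavor!r})"

    def __eq__(self, other):
        # Only compare against objects of exactly the same class.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.calories, self.flavor) == (other.calories, other.flavor)


a = ManualCookie(200, "Chocolate Chip")
b = ManualCookie(calories=200, flavor="Chocolate Chip")
print(a, a == b)
```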

In [74]:
# I can set these fields
c1.calories = 220
c1

Cookie(calories=220, flavor='Chocolate Chip')

In [61]:
# But instances are not hashable by default, because they are mutable and define __eq__
try:
    set((c1,))
except TypeError as err:
    print(err)

unhashable type: 'Cookie'


There are two ways to make a dataclass hashable: the brute force way, and "freezing" the class.

In [66]:
# The brute force way is to set the unsafe_hash to true

@dataclass(unsafe_hash=True)
class IceCream:
    calories: int = 0
    flavor: str = ""

In [64]:
ice1 = IceCream(calories=200, flavor="Strawberry")
ice1

IceCream(calories=200, flavor='Strawberry')

In [65]:
set((ice1,))

{IceCream(calories=200, flavor='Strawberry')}
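
The reason the flag is called "unsafe" can be sketched with a hypothetical `Sundae` class. I am assuming here that the generated hash is computed from the field values, so mutating a field after the object is already in a set leaves it stored under a stale hash.

```python
from dataclasses import dataclass


@dataclass(unsafe_hash=True)
class Sundae:
    calories: int = 0
    flavor: str = ""


sundae = Sundae(calories=300, flavor="Vanilla")
bowl = {sundae}
found_before = sundae in bowl
sundae.calories = 350          # mutating a field changes the object's hash
found_after = sundae in bowl   # the set looks it up under the new hash
print(found_before, found_after, len(bowl))
```

The object is still physically in the set, but membership checks can no longer find it.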

In [68]:
# A better and more semantically correct way is to freeze the class.
@dataclass(frozen=True)
class Stock:
    symbol: str = ""
    price: float = 0.0

In [69]:
s1 = Stock("AAPL", 243.36)
set((s1,))

{Stock(symbol='AAPL', price=243.36)}

Of course this means that I cannot set any of the fields of an existing object.

In [73]:
try:
    s1.price = 245
except FrozenInstanceError as err:
    print(err)

cannot assign to field 'price'
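
When I do need a changed value, one option is `dataclasses.replace`, which builds a new object instead of mutating the existing one. A minimal sketch with a hypothetical `Quote` class and made-up values:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Quote:
    symbol: str = ""
    price: float = 0.0


q1 = Quote("XYZ", 10.0)
q2 = replace(q1, price=10.5)   # new object; q1 is left untouched
print(q1, q2, q1 is q2)
```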


The so-called field declarations of a dataclass are still plain old class level attributes. The generated init method creates the actual instance attributes, which mirror the names and default values of the class level attributes.

In [75]:
print(Cookie.flavor, Cookie.calories)

 0
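
Because the defaults live on the class, a mutable default would be shared by every instance. As far as I can tell, `dataclass` refuses list, dict and set defaults for this reason and points to `field(default_factory=...)` instead. A sketch with hypothetical `BadCart` and `Cart` classes:

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class BadCart:
        items: list = []   # rejected when the class is defined
except ValueError as err:
    error_message = str(err)


@dataclass
class Cart:
    items: list = field(default_factory=list)   # a fresh list per instance


a, b = Cart(), Cart()
a.items.append("cookie")
print(error_message)
print(a.items, b.items, hasattr(Cart, "items"))
```

With a `default_factory` there is no class level `items` attribute at all; the list only exists on each instance.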


While a field has to have a type annotation, it is not required to have a default value. In the example below, `title` and `rating` don't exist as class attributes; they only come into existence on each object when it is instantiated.

In [80]:
@dataclass
class VideoGame:
    title: str
    rating: float

In [81]:
try:
    VideoGame.title
except AttributeError as err:
    print(err)

type object 'VideoGame' has no attribute 'title'


In [82]:
game = VideoGame(title="Halo", rating=4.9)
game

VideoGame(title='Halo', rating=4.9)
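
Even without class attributes, the field definitions can still be inspected through `dataclasses.fields`. The sketch below uses a hypothetical `BoardGame` class, and also shows one ordering rule I am aware of: a field without a default cannot follow a field that has one.

```python
from dataclasses import MISSING, dataclass, fields


@dataclass
class BoardGame:
    title: str
    rating: float = 3.0


# (name, annotated type, "has no default?") for every field
summary = [(f.name, f.type, f.default is MISSING) for f in fields(BoardGame)]
print(summary)

try:
    @dataclass
    class Broken:
        rating: float = 3.0
        title: str   # non-default field after a default one
except TypeError as err:
    ordering_error = str(err)
print(ordering_error)
```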