<a id='sect0'></a>
## <font color='darkblue'>Data Classes in Python 3.7+ (Guide)</font>

### <font color='darkgreen'>Table of Contents</font>
* <font size='3ptx'><b><a href='#sect1'>Alternatives to Data Classes</a></b></font>
* <font size='3ptx'><b><a href='#sect2'>Basic Data Classes</a></b></font>
  * Default Values
  * Type Hints
  * Adding Methods
* <font size='3ptx'><b><a href='#sect3'>More Flexible Data Classes</a></b></font>
  * Advanced Default Values
  * You Need Representation?
  * Comparing Cards
* <font size='3ptx'><b><a href='#sect4'>Immutable Data Classes</a></b></font>
* <font size='3ptx'><b><a href='#sect5'>Inheritance</a></b></font>
* <font size='3ptx'><b><a href='#sect6'>Optimizing Data Classes</a></b></font>

<a id='sect1'></a>
## <font color='darkblue'>Alternatives to Data Classes</font> ([back](#sect0))
([article source](https://realpython.com/python-data-classes/)) <font size='3ptx'><b>For simple data structures, you have probably already used a tuple or a dict</b></font>. You could represent the queen of hearts card in either of the following ways:

```python
queen_of_hearts_tuple = ('Q', 'Hearts')
queen_of_hearts_dict = {'rank': 'Q', 'suit': 'Hearts'}
```
<br/>

It works. However, it puts a lot of responsibility on you as a programmer:
* You need to remember that the `queen_of_hearts_...` <a href='https://realpython.com/python-variables/'>variable</a> represents a card.
* For the <b><a href='https://docs.python.org/3/library/stdtypes.html#tuple'>tuple</a></b> version, you need to remember the order of the attributes. Writing `('Spades', 'A')` will mess up your program but probably not give you an easily understandable error message.
* If you use the <b><a href='https://docs.python.org/3/library/stdtypes.html#dict'>dict</a></b> version, you must make sure the names of the attributes are consistent. For instance `{'value': 'A', 'suit': 'Spades'}` will not work as expected.
<br/>

Furthermore, using these structures is not ideal:

```python
>>> queen_of_hearts_tuple[0]  # No named access
'Q'
>>> queen_of_hearts_dict['suit']  # Would be nicer with .suit
'Hearts'
```
<br/>

A better alternative is the <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b>. It has long been used to create readable small data structures. The card example above can be recreated using a <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b> like this:

In [1]:
from collections import namedtuple

NamedTupleCard = namedtuple('NamedTupleCard', ['rank', 'suit'])

This definition of <b><font color='blue'>NamedTupleCard</font></b> gives the same readable output that a data class version of the card (<b>DataClassCard</b>) would:

In [2]:
queen_of_hearts = NamedTupleCard('Q', 'Hearts')
queen_of_hearts

NamedTupleCard(rank='Q', suit='Hearts')

In [3]:
print(f'rank={queen_of_hearts.rank}; suit={queen_of_hearts.suit}')

rank=Q; suit=Hearts


In [4]:
queen_of_hearts == NamedTupleCard('Q', 'Hearts')

True

<b>So why even bother with data classes?</b> First of all, data classes come with many more features than you have seen so far. At the same time, the <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b> has some other features that are not necessarily desirable. By design, a <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b> is a regular tuple. This can be seen in comparisons, for instance:

In [5]:
queen_of_hearts == ('Q', 'Hearts')

True

While this might seem like a good thing, <b>this lack of awareness about its own type can lead to subtle and hard-to-find bugs, especially since it will also happily compare two different namedtuple classes</b>:

In [6]:
Person = namedtuple('Person', ['first_initial', 'last_name'])
ace_of_spades = NamedTupleCard('A', 'Spades')
ace_of_spades == Person('A', 'Spades')

True

The <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b> also comes with some restrictions. For instance, it is hard to add default values to some of the fields in a <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b>. A namedtuple is also by nature immutable. That is, the value of a <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b> can never change. In some applications, this is an awesome feature, but in other settings, it would be nice to have more flexibility:

In [7]:
card = NamedTupleCard('7', 'Diamonds')
# AttributeError: can't set attribute
# card.rank = '9'
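
As a hedged aside: on Python 3.7 and later, `namedtuple` accepts a `defaults` argument, and `._replace()` returns a modified copy, which softens both restrictions somewhat. The sketch below uses a hypothetical `DefaultCard` type that is not part of the original example:

```python
from collections import namedtuple

# Hypothetical type for illustration; defaults apply to the rightmost fields.
DefaultCard = namedtuple('DefaultCard', ['rank', 'suit'], defaults=['Spades'])

ace = DefaultCard('A')           # suit falls back to the default
nine = ace._replace(rank='9')    # a new tuple; `ace` itself is unchanged

try:
    ace.rank = '9'               # direct assignment is still rejected
except AttributeError as exc:
    assignment_error = exc

print(ace, nine)
```

Even so, the instance itself stays immutable, so this only helps when working with copies is acceptable.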

<b><a href='https://docs.python.org/3/library/dataclasses.html'>Data classes</a></b> will not replace all uses of <b><a href='https://dbader.org/blog/writing-clean-python-with-namedtuples'>namedtuple</a></b>. For instance, if you need your data structure to behave like a tuple, then a named tuple is a great alternative!

Another alternative, and one of the <a href='https://mail.python.org/pipermail/python-dev/2017-December/151034.html'>inspirations for data classes</a>, is the attrs project. With <b><a href='http://www.attrs.org/'>attrs</a></b> installed (<font color='blue'>pip install attrs</font>), you can write a card class as follows:

In [8]:
#!pip install attrs

In [9]:
import attr

@attr.s
class AttrsCard:
    rank = attr.ib()
    suit = attr.ib()

This can be used in exactly the same way as the <b><font color='blue'>DataClassCard</font></b> and <b><font color='blue'>NamedTupleCard</font></b> examples earlier. The <b><a href='http://www.attrs.org/'>attrs</a></b> project is great and does support some features that data classes do not, including converters and validators. Furthermore, <b><a href='http://www.attrs.org/'>attrs</a></b> has been around for a while and is supported in Python 2.7 as well as Python 3.4 and up. However, as <b><a href='http://www.attrs.org/'>attrs</a></b> is not a part of the standard library, it does add an external dependency to your projects. Through <b><a href='https://docs.python.org/3/library/dataclasses.html'>data classes</a></b>, similar functionality will be available everywhere.

<a id='sect2'></a>
## <font color='darkblue'>Basic Data Classes</font> ([back](#sect0))
* <font size='3ptx'><b><a href='#sect2_1'>Default Values</a></b></font>
* <font size='3ptx'><b><a href='#sect2_2'>Type Hints</a></b></font>
* <font size='3ptx'><b><a href='#sect2_3'>Adding Methods</a></b></font>

Let us get back to <b><a href='https://docs.python.org/3/library/dataclasses.html'>data classes</a></b>. As an example, we will create a <b><font color='blue'>Position</font></b> class that will represent geographic positions with a <i>name</i> as well as the latitude and longitude:

In [10]:
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float
    lat: float

What makes this a data class is the <b><font color='orange'>@dataclass</font></b> decorator just above the class definition. Beneath the `class Position:` line, you simply list the fields you want in your data class. The <b>:</b> notation used for the fields relies on a feature introduced in Python 3.6 called <a href='https://www.python.org/dev/peps/pep-0526/'><b><font color='darkblue'>variable annotations</font></b></a>. We will <a href='https://realpython.com/python-data-classes/#type-hints'>soon</a> talk more about this notation and why we specify data types like `str` and `float`.

Those few lines of code are all you need. The new class is ready for use:

In [11]:
pos = Position(
    name='Oslo',
    lon=10.8,
    lat=59.9)

print(pos)

Position(name='Oslo', lon=10.8, lat=59.9)


In [12]:
print(f'{pos.name} is at {pos.lat}°N, {pos.lon}°E')

Oslo is at 59.9°N, 10.8°E


You can also create data classes similarly to how named tuples are created. The following is (<font color='brown'>almost</font>) equivalent to the definition of <b><font color='blue'>Position</font></b> above:

In [13]:
from dataclasses import make_dataclass

Position = make_dataclass('Position', ['name', 'lon', 'lat'])

A data class is a regular Python class. The only thing that sets it apart is that it has basic <b><a href='https://docs.python.org/reference/datamodel.html#basic-customization'>data model methods</a></b> like .\_\_init__(), .\_\_repr__(), and .\_\_eq__() implemented for you.
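
To make that concrete, here is a rough sketch of what the decorator saves you from writing. `AutoPosition` and `ManualPosition` are hypothetical names used only for this comparison, and the hand-written methods approximate the generated ones rather than reproduce them exactly:

```python
from dataclasses import dataclass

@dataclass
class AutoPosition:
    name: str
    lon: float
    lat: float

class ManualPosition:
    """Hand-written approximation of what @dataclass generates."""

    def __init__(self, name, lon, lat):
        self.name = name
        self.lon = lon
        self.lat = lat

    def __repr__(self):
        return (f'{self.__class__.__name__}'
                f'(name={self.name!r}, lon={self.lon!r}, lat={self.lat!r})')

    def __eq__(self, other):
        # Only instances of the very same class are compared field by field.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return ((self.name, self.lon, self.lat)
                == (other.name, other.lon, other.lat))

auto = AutoPosition('Oslo', 10.8, 59.9)
manual = ManualPosition('Oslo', 10.8, 59.9)
print(auto)
print(manual)
```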

<a id='sect2_1'></a>
### <font color='darkgreen'>Default Values</font>
It is easy to add default values to the fields of your data class:

In [14]:
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

This works exactly as if you had specified the default values in the definition of the .\_\_init__() method of a regular class:

In [15]:
Position(name='Null Island')

Position(name='Null Island', lon=0.0, lat=0.0)

In [16]:
Position(name='Greenwich', lat=51.8)

Position(name='Greenwich', lon=0.0, lat=51.8)

In [17]:
Position(name='Vancouver', lon=-123.1, lat=49.3)

Position(name='Vancouver', lon=-123.1, lat=49.3)
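
One consequence of the `__init__()` analogy is that a field without a default cannot follow a field with one, just as with function parameters. A minimal sketch, using a hypothetical `BrokenPosition` class that exists only to trigger the error:

```python
from dataclasses import dataclass

try:
    @dataclass
    class BrokenPosition:
        name: str = 'Unknown'
        lon: float            # no default, but it follows a field that has one
        lat: float = 0.0
except TypeError as exc:
    # The class is rejected when the decorator runs, before any instance exists.
    error_message = str(exc)
    print(error_message)
```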

<a href='https://realpython.com/python-data-classes/#advanced-default-values'>Later</a> you will learn about <a href='https://docs.python.org/3/library/dataclasses.html#default-factory-functions'>default_factory</a>, which gives a way to provide more complicated default values.

<a id='sect2_2'></a>
### <font color='darkgreen'>Type Hints</font>
So far, we have not made a big fuss of the fact that data classes support <b><a href='https://realpython.com/python-type-checking/'>typing</a></b> out of the box. You have probably noticed that we defined the fields with a type hint: `name: str` says that `name` should be a text string (<font color='brown'>`str` type</font>).

In fact, <b>adding some kind of type hint is mandatory when defining the fields in your data class.</b> Without a type hint, the field will not be a part of the data class. However, if you do not want to add explicit types to your data class, use <b><a href='https://docs.python.org/3/library/typing.html#the-any-type'>typing.Any</a></b>:

In [18]:
from dataclasses import dataclass
from typing import Any

@dataclass
class WithoutExplicitTypes:
    name: Any
    value: Any = 42

<b>While you need to add type hints in some form when using data classes, <font color='darkred'>these types are not enforced at runtime</font></b>. The following code runs without any problems:

In [19]:
Position(3.14, 'pi day', 2018)

Position(name=3.14, lon='pi day', lat=2018)

This is how typing in Python usually works: <b><a href='https://www.python.org/dev/peps/pep-0484/#non-goals'>Python is and will always be a dynamically typed language</a></b>. To actually catch type errors, type checkers like <b><a href='http://mypy-lang.org/'>Mypy</a></b> can be run on your source code.
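
If a lightweight runtime check is wanted anyway, one possible approach is to compare each value against its annotation by hand. The helper `type_mismatches()` below is a hypothetical sketch, not part of `dataclasses`; it uses the `fields()` function covered later in this guide and only handles annotations that are plain classes such as `str` or `float`:

```python
from dataclasses import dataclass, fields

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

def type_mismatches(instance):
    """Return the names of fields whose value does not match a plain-class annotation."""
    return [
        f.name
        for f in fields(instance)
        # Skip annotations that are not plain classes (e.g. typing generics).
        if isinstance(f.type, type)
        and not isinstance(getattr(instance, f.name), f.type)
    ]

print(type_mismatches(Position('Oslo', 10.8, 59.9)))
print(type_mismatches(Position(3.14, 'pi day', 2018)))
```

Note that the integer `2018` is reported as well, since `isinstance(2018, float)` is false; a static checker such as Mypy treats `int` as acceptable where `float` is expected, so this sketch is stricter than usual typing rules.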

<a id='sect2_3'></a>
### <font color='darkgreen'>Adding Methods</font>
You already know that <font size='3ptx'><b>a data class is just a regular class. That means that you can freely add your own methods to a data class</b></font>. As an example, let us calculate the distance between one position and another, along the Earth’s surface. One way to do this is by using <b><a href='https://en.wikipedia.org/wiki/Haversine_formula'>the haversine formula</a></b>:
![1.png](images/1.PNG)
<br/>

You can add a <font color='blue'>.distance_to()</font> method to your data class just like you can with normal classes:

In [20]:
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

    def distance_to(self, other):
        r = 6371  # Earth radius in kilometers
        lam_1, lam_2 = radians(self.lon), radians(other.lon)
        phi_1, phi_2 = radians(self.lat), radians(other.lat)
        h = (sin((phi_2 - phi_1) / 2)**2
             + cos(phi_1) * cos(phi_2) * sin((lam_2 - lam_1) / 2)**2)
        return 2 * r * asin(sqrt(h))

It works as you would expect:

In [21]:
oslo = Position(name='Oslo', lon=10.8, lat=59.9)
vancouver = Position(name='Vancouver', lon=-123.1, lat=49.3)
oslo.distance_to(vancouver)

7181.7841229421165

<a id='sect3'></a>
## <font color='darkblue'>More Flexible Data Classes</font> ([back](#sect0))
* <font size='3ptx'><b><a href='#sect3_1'>Advanced Default Values</a></b></font>
* <font size='3ptx'><b><a href='#sect3_2'>You Need Representation?</a></b></font>
* <font size='3ptx'><b><a href='#sect3_3'>Comparing Cards</a></b></font>

So far, you have seen some of the basic features of the data class: it gives you some convenience methods, and you can still add default values and other methods. <b>Now you will learn about some more advanced features like parameters to the <a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass'>@dataclass</a> decorator and the <a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.field'>field()</a> function. Together, they give you more control when creating a data class</b>.

Let us return to the playing card example you saw at the beginning of the tutorial and add a class containing a deck of cards while we are at it:
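> The four suits in a deck of playing cards, namely spades, hearts, clubs, and diamonds, represent the four seasons of the year (spring, summer, autumn, and winter), and each suit has exactly 13 cards, standing for the 13 weeks in each season. ([source](https://kknews.cc/news/ax965yg.html))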

In [22]:
from dataclasses import dataclass
from typing import List

@dataclass
class PlayingCard:
    rank: str
    suit: str

@dataclass
class Deck:
    cards: List[PlayingCard]

A simple deck containing only two cards can be created like this:

In [23]:
queen_of_hearts = PlayingCard(rank='Q', suit='Heart')
ace_of_spades = PlayingCard(rank='A', suit='Spade')
two_cards = Deck([queen_of_hearts, ace_of_spades])

two_cards

Deck(cards=[PlayingCard(rank='Q', suit='Heart'), PlayingCard(rank='A', suit='Spade')])

<a id='sect3_1'></a>
### <font color='darkgreen'>Advanced Default Values</font>
<b>Say that you want to give a default value to the <font color='blue'>Deck</font></b>. It would for example be convenient if <font color='blue'>Deck()</font> created a regular (<font color='brown'>French</font>) deck of 52 playing cards. First, specify the different ranks and suits. Then, add a function <font color='blue'>make_french_deck()</font> that creates a list of instances of <font color='blue'><b>PlayingCard</b></font>:

In [24]:
RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
SUITS = '♣ ♢ ♡ ♠'.split()

def make_french_deck():
    print("Making french deck...")
    return [PlayingCard(rank=r, suit=s) for s in SUITS for r in RANKS]

For fun, the four different suits are specified using their <b><a href='https://en.wikipedia.org/wiki/Playing_cards_in_Unicode'>Unicode symbols</a></b>.

To simplify comparisons of cards later, the ranks and suits are also listed in their usual order.

In [25]:
make_french_deck()[:10]

Making french deck...


[PlayingCard(rank='2', suit='♣'),
 PlayingCard(rank='3', suit='♣'),
 PlayingCard(rank='4', suit='♣'),
 PlayingCard(rank='5', suit='♣'),
 PlayingCard(rank='6', suit='♣'),
 PlayingCard(rank='7', suit='♣'),
 PlayingCard(rank='8', suit='♣'),
 PlayingCard(rank='9', suit='♣'),
 PlayingCard(rank='10', suit='♣'),
 PlayingCard(rank='J', suit='♣')]

In theory, you could now use this function to specify a default value for <font color='blue'><b>Deck</b>.cards</font>:
```python
from dataclasses import dataclass
from typing import List

@dataclass
class Deck:  # Will NOT work: mutable default <class 'list'> for field cards is not allowed: use default_factory
    cards: List[PlayingCard] = make_french_deck()
```

In [26]:
#@dataclass
#class Deck:  # Will NOT work
#    cards: List[PlayingCard] = make_french_deck()

<b>Don’t do this! This introduces one of the most common anti-patterns in Python: <a href='http://docs.python-guide.org/en/latest/writing/gotchas/#mutable-default-arguments'>using mutable default arguments</a>.</b> The problem is that all instances of <b><font color='blue'>Deck</font></b> will use the same list object as the default value of the <font color='blue'>.cards</font> property. This means that if, say, one card is removed from one <b><font color='blue'>Deck</font></b>, then it disappears from all other instances of <b><font color='blue'>Deck</font></b> as well. Actually, data classes try to <a href='https://www.python.org/dev/peps/pep-0557/#mutable-default-values'>prevent you from doing this</a>, and the code above will raise a <b><font color='blue'>ValueError</font></b>.
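
The check can be observed directly. The sketch below uses a hypothetical `BrokenDeck` class with a bare list as its default, and captures the error instead of letting it stop the notebook:

```python
from dataclasses import dataclass
from typing import List

try:
    @dataclass
    class BrokenDeck:
        cards: List[str] = []    # a bare list as the default value
except ValueError as exc:
    # Raised at class-definition time, not when an instance is created.
    error_message = str(exc)
    print(error_message)
```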

Instead, data classes use something called a `default_factory` to handle mutable default values. To use `default_factory` (<font color='brown'>and many other cool features of data classes</font>), you need to use the <a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.field'>field()</a> specifier:

In [27]:
from dataclasses import dataclass, field
from typing import List

@dataclass
class Deck:
    cards: List[PlayingCard] = field(default_factory=make_french_deck)

The argument to `default_factory` can be any zero parameter callable. Now it is easy to create a full deck of playing cards:

In [28]:
Deck()

Making french deck...


Deck(cards=[PlayingCard(rank='2', suit='♣'), PlayingCard(rank='3', suit='♣'), ..., PlayingCard(rank='K', suit='♠'), PlayingCard(rank='A', suit='♠')])

In [29]:
d1 = Deck()
print(f'd1 has {len(d1.cards)} cards')
d1.cards = []
print(f'd1 has {len(d1.cards)} cards')

# Every new Deck will trigger `make_french_deck`
d2 = Deck()
print(f'd2 has {len(d2.cards)} cards')

Making french deck...
d1 has 52 cards
d1 has 0 cards
Making french deck...
d2 has 52 cards


In [30]:
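def method1(alist=[1]):
    # The default list is created once, when the function is defined,
    # so every call without an argument mutates the same object.
    alist.append(alist[-1]+1)
    print(alist)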

In [31]:
method1()

[1, 2]


In [32]:
method1()

[1, 2, 3]
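
The two calls above share one list, which is why the second result keeps growing. For plain functions, a common way around this is a `None` sentinel, which plays roughly the role that `default_factory` plays for data classes. `append_next()` is a hypothetical rewrite of `method1()` for illustration:

```python
def append_next(alist=None):
    # Build a fresh list on every call instead of sharing one default object.
    if alist is None:
        alist = [1]
    alist.append(alist[-1] + 1)
    return alist

print(append_next())   # [1, 2]
print(append_next())   # [1, 2]
```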


<b>The <a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.field'>field()</a> specifier is used to customize each field of a data class individually</b>. You will see some other examples later. For reference, these are the parameters <a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.field'>field()</a> supports:
* **default**: Default value of the field
* **default_factory**: Function that returns the initial value of the field
* **init**: Use field in .\_\_init__() method? (<font color='brown'>Default is True.</font>)
* **repr**: Use field in repr of the object? (<font color='brown'>Default is True.</font>)
* **compare**: Include the field in comparisons? (<font color='brown'>Default is True.</font>)
* **hash**: Include the field when calculating <a href='https://docs.python.org/3/library/functions.html#hash'>hash()</a>? (<font color='brown'>Default is to use the same as for compare.</font>)
* **metadata**: A mapping with information about the field
<br/>
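
As a small illustration of one of these parameters, the sketch below uses `compare=False` to keep a field out of equality checks. The `Measurement` class is a hypothetical example, not part of the card or position code:

```python
from dataclasses import dataclass, field

@dataclass
class Measurement:
    value: float
    # Free-form note that should not influence equality.
    note: str = field(default='', compare=False)

a = Measurement(1.5, note='first run')
b = Measurement(1.5, note='second run')
print(a == b)   # True, because `note` is ignored by __eq__()
```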

In the Position example, you saw how to add simple default values by writing <font color='blue'>lat: float = 0.0</font>. However, if you also want to customize the field, for instance to hide it in the repr, you need to use the `default` parameter: <font color='blue'>lat: float = field(default=0.0, repr=False)</font>. You may not specify both `default` and `default_factory`.

In [33]:
@dataclass
class MyData:
    name: str = field(default='John', repr=True)
    age: int = field(default=99, repr=False)

# Field `age` won't be shown
my_data = MyData()
my_data

MyData(name='John')

In [34]:
print(f'my_data.age={my_data.age}')

my_data.age=99


The `metadata` parameter is not used by the data classes themselves but is available for you (<font color='brown'>or third party packages</font>) to attach information to fields. In the <b><font color='blue'>Position</font></b> example, you could for instance <b>specify that latitude and longitude should be given in degrees</b>:

In [35]:
from dataclasses import dataclass, field

@dataclass
class Position:
    name: str
    lon: float = field(default=0.0, metadata={'unit': 'degrees'})
    lat: float = field(default=0.0, metadata={'unit': 'degrees'})

The `metadata` (<font color='brown'>and other information about a field</font>) can be retrieved using the <font color='blue'>fields()</font> function (<font color='brown'>note the plural s</font>):

In [36]:
from dataclasses import fields

fields(Position)

(Field(name='name',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD),
 Field(name='lon',type=<class 'float'>,default=0.0,default_factory=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'unit': 'degrees'}),_field_type=_FIELD),
 Field(name='lat',type=<class 'float'>,default=0.0,default_factory=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'unit': 'degrees'}),_field_type=_FIELD))

In [37]:
lat_unit = fields(Position)[2].metadata['unit']
lat_unit

'degrees'

In [38]:
p = Position('test')
# Metadata lives on the Field objects of the class, not on the values of an instance.
# p.lon is a plain float, so the following line would raise
# AttributeError: 'float' object has no attribute 'metadata'
# p.lon.metadata

# fields() also accepts an instance of a data class.
# fields(p)
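
Building on this, one way to collect the metadata for every field at once is a dict comprehension over `fields()`. The variable names `units` and `same_fields` are just illustrative:

```python
from dataclasses import dataclass, field, fields

@dataclass
class Position:
    name: str
    lon: float = field(default=0.0, metadata={'unit': 'degrees'})
    lat: float = field(default=0.0, metadata={'unit': 'degrees'})

# Map each field name to its unit, or None when no unit was attached.
units = {f.name: f.metadata.get('unit') for f in fields(Position)}
print(units)   # {'name': None, 'lon': 'degrees', 'lat': 'degrees'}

# fields() returns the same Field objects for the class and for an instance.
p = Position('test')
same_fields = [f.name for f in fields(p)] == [f.name for f in fields(Position)]
print(same_fields)   # True
```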

<a id='sect3_2'></a>
### <font color='darkgreen'>You Need Representation?</font>
Recall that we can create decks of cards out of thin air:
```python
>>> Deck()
Deck(cards=[PlayingCard(rank='2', suit='♣'), PlayingCard(rank='3', suit='♣'), ...
            PlayingCard(rank='K', suit='♠'), PlayingCard(rank='A', suit='♠')])
```
<br/>

While this representation of a Deck is explicit and readable, it is also very verbose. I have deleted 48 of the 52 cards in the deck in the output above. On an 80-column display, simply printing the full Deck takes up 22 lines! Let us add a more concise representation. In general, a Python object has two different string representations:
* **repr(obj)** is defined by obj.\_\_repr__() and should return a developer-friendly representation of obj. If possible, this should be code that can recreate obj. Data classes do this.

* **str(obj)** is defined by obj.\_\_str__() and should return a user-friendly representation of obj. Data classes do not implement a .\_\_str__() method, so Python will fall back to the .\_\_repr__() method.
<br/>
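
Both points can be checked on a bare data class. This sketch repeats the earlier `PlayingCard` definition so that it runs on its own; the use of `eval()` is only a demonstration that the repr is valid constructor code, not a recommended practice:

```python
from dataclasses import dataclass

@dataclass
class PlayingCard:
    rank: str
    suit: str

card = PlayingCard('Q', '♡')
print(repr(card))                  # PlayingCard(rank='Q', suit='♡')
print(str(card) == repr(card))     # True: no __str__(), so str() falls back to repr()
print(eval(repr(card)) == card)    # True: the repr can recreate the object
```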

In [39]:
class MyClass:
    def __repr__(self):
        return "__repr__"
    
    def __str__(self):
        return "__str__"
    
my_obj = MyClass()
my_obj

__repr__

In [40]:
str(my_obj)

'__str__'

In [41]:
print(my_obj)

__str__


In [42]:
repr(my_obj)

'__repr__'

Let us implement a user-friendly representation of a <b><font color='blue'>PlayingCard</font></b>:

In [43]:
from dataclasses import dataclass

@dataclass
class PlayingCard:
    rank: str
    suit: str

    def __str__(self):
        return f'{self.suit}{self.rank}'

The cards now look much nicer, but the deck is still as verbose as ever:

In [44]:
ace_of_spades = PlayingCard(rank='A', suit='♠')
print(ace_of_spades)

♠A


In [45]:
print(Deck())

Making french deck...
Deck(cards=[PlayingCard(rank='2', suit='♣'), PlayingCard(rank='3', suit='♣'), ..., PlayingCard(rank='K', suit='♠'), PlayingCard(rank='A', suit='♠')])

To show that it is possible to add your own .\_\_repr__() method as well, we will violate the principle that it should return code that can recreate an object. <b><a href='https://www.python.org/dev/peps/pep-0020/'>Practicality beats purity</a></b> after all. The following code adds a more concise representation of the <b><font color='blue'>Deck</font></b>:

In [46]:
from dataclasses import dataclass, field
from typing import List

@dataclass
class Deck:
    cards: List[PlayingCard] = field(default_factory=make_french_deck)

    def __repr__(self):
        cards = ', '.join(f'{c!s}' for c in self.cards)
        return f'{self.__class__.__name__}({cards})'

Note the `!s` specifier in the `{c!s}` format string. It means that we explicitly want to use the <font color='blue'>str()</font> representation of each <font color='blue'><b>PlayingCard</b></font>. With the new .\_\_repr__(), the representation of <b><font color='blue'>Deck</font></b> is easier on the eyes:

In [47]:
Deck()

Making french deck...


Deck(♣2, ♣3, ♣4, ♣5, ♣6, ♣7, ♣8, ♣9, ♣10, ♣J, ♣Q, ♣K, ♣A, ♢2, ♢3, ♢4, ♢5, ♢6, ♢7, ♢8, ♢9, ♢10, ♢J, ♢Q, ♢K, ♢A, ♡2, ♡3, ♡4, ♡5, ♡6, ♡7, ♡8, ♡9, ♡10, ♡J, ♡Q, ♡K, ♡A, ♠2, ♠3, ♠4, ♠5, ♠6, ♠7, ♠8, ♠9, ♠10, ♠J, ♠Q, ♠K, ♠A)
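
For comparison, here is a short sketch of how the conversion flags behave on a single card, and why printing a plain list of cards stays verbose: containers use `repr()` for their elements. The class is repeated so the cell runs on its own:

```python
from dataclasses import dataclass

@dataclass
class PlayingCard:
    rank: str
    suit: str

    def __str__(self):
        return f'{self.suit}{self.rank}'

card = PlayingCard('A', '♠')
print(f'{card!s}')   # ♠A
print(f'{card!r}')   # PlayingCard(rank='A', suit='♠')
print(f'{card}')     # ♠A (no flag: formatting falls back to str())
print([card])        # [PlayingCard(rank='A', suit='♠')]
```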

<a id='sect3_3'></a>
### <font color='darkgreen'>Comparing Cards</font>
<b>In many card games, cards are compared to each other</b>. For instance in a typical <b><a href='https://en.wikipedia.org/wiki/Trick-taking_game'>trick taking game</a></b>, the highest card takes the trick. As it is currently implemented, the <b><font color='blue'>PlayingCard</font></b> class does not support this kind of comparison:
```python
>>> queen_of_hearts = PlayingCard('Q', '♡')
>>> ace_of_spades = PlayingCard('A', '♠')
>>> ace_of_spades > queen_of_hearts
TypeError: '>' not supported between instances of 'PlayingCard' and 'PlayingCard'
```
<br/>

This is, however, (seemingly) easy to rectify:

In [48]:
from dataclasses import dataclass

@dataclass(order=True)
class PlayingCard:
    rank: str
    suit: str

    def __str__(self):
        return f'{self.suit}{self.rank}'

The <b><a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass'>@dataclass</a></b> decorator has two forms. So far you have seen the simple form where <b><a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass'>@dataclass</a></b> is specified without any parentheses and parameters. However, you can also give parameters to the <b><a href='https://docs.python.org/3/library/dataclasses.html#dataclasses.dataclass'>@dataclass</a></b> decorator in parentheses. The following parameters are supported:
* **init**: Add .\_\_init__() method? (<font color='brown'>Default is True.</font>)
* **repr**: Add .\_\_repr__() method? (<font color='brown'>Default is True.</font>)
* **eq**: Add .\_\_eq__() method? (<font color='brown'>Default is True.</font>)
* **order**: Add ordering methods? (<font color='brown'>Default is False.</font>)
* **unsafe_hash**: Force the addition of a .\_\_hash__() method? (<font color='brown'>Default is False.</font>)
* **frozen**: If True, assigning to fields raises an exception. (<font color='brown'>Default is False.</font>)
<br/>
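
To illustrate one of the less obvious parameters: with the default settings (`eq=True`, `frozen=False`), instances are not hashable, and `unsafe_hash=True` forces a `.__hash__()` based on the fields. `PlainCard` and `HashableCard` are hypothetical names for this sketch:

```python
from dataclasses import dataclass

@dataclass
class PlainCard:
    rank: str
    suit: str

@dataclass(unsafe_hash=True)
class HashableCard:
    rank: str
    suit: str

try:
    hash(PlainCard('Q', '♡'))
except TypeError as exc:
    hash_error = str(exc)    # default data classes cannot be hashed
    print(hash_error)

# Equal cards hash alike, so the set keeps only one of them.
cards = {HashableCard('Q', '♡'), HashableCard('Q', '♡')}
print(len(cards))   # 1
```

The parameter is called "unsafe" for a reason: if a field changes after the object has been placed in a set or used as a dict key, lookups can silently fail.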

See the <a href='https://www.python.org/dev/peps/pep-0557/#id7'>original PEP</a> for more information about each parameter. After setting <font color='blue'>order=True</font>, instances of <b><font color='blue'>PlayingCard</font></b> can be compared:

In [49]:
queen_of_hearts = PlayingCard('Q', '♡')
ace_of_spades = PlayingCard('A', '♠')
ace_of_spades > queen_of_hearts

False

In [50]:
ord('A')

65

In [51]:
ord('Q')

81

In [52]:
('A', '♠') > ('Q', '♡')

False

How are the two cards compared though? You have not specified how the ordering should be done, and for some reason Python seems to believe that a Queen is higher than an Ace…

<b>It turns out that data classes compare objects as if they were tuples of their fields</b>. In other words, a Queen is higher than an Ace because 'Q' comes after 'A' in the alphabet:

In [53]:
('A', '♠') > ('Q', '♡')

False
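
The tuple analogy can be verified with `dataclasses.astuple()`. The sketch below repeats the `order=True` class without the sort index, and the extra cards `ten` and `nine` are added here only to show a second surprise caused by comparing ranks as strings:

```python
from dataclasses import astuple, dataclass

@dataclass(order=True)
class PlayingCard:
    rank: str
    suit: str

queen = PlayingCard('Q', '♡')
ace = PlayingCard('A', '♠')
ten = PlayingCard('10', '♡')
nine = PlayingCard('9', '♡')

# The generated comparison agrees with comparing the field tuples.
same_as_tuples = (ace > queen) == (astuple(ace) > astuple(queen))
print(same_as_tuples)   # True
print(ten < nine)       # True, because the string '10' sorts before '9'
```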

That does not really work for us. Instead, we need to define some kind of sort index that uses the order of RANKS and SUITS. Something like this:
```python
>>> RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
>>> SUITS = '♣ ♢ ♡ ♠'.split()
>>> card = PlayingCard('Q', '♡')
>>> RANKS.index(card.rank) * len(SUITS) + SUITS.index(card.suit)
42
```
<br/>
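
As a sanity check on this formula, the sketch below computes the index for every rank and suit combination and confirms that the 52 cards map onto the numbers 0 to 51 exactly once. The helper name `sort_index()` is an assumption made for this illustration:

```python
RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
SUITS = '♣ ♢ ♡ ♠'.split()

def sort_index(rank, suit):
    # Rank is the major key, suit breaks ties within a rank.
    return RANKS.index(rank) * len(SUITS) + SUITS.index(suit)

indexes = [sort_index(r, s) for r in RANKS for s in SUITS]
print(indexes == list(range(52)))   # True: every card gets a unique index
print(sort_index('Q', '♡'))         # 42
print(sort_index('A', '♠'))         # 51
```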

For <b><font color='blue'>PlayingCard</font></b> to use this sort index for comparisons, we need to add a field <font color='violet'>.sort_index</font> to the class. However, this field should be calculated from the other fields <font color='violet'>.rank</font> and <font color='violet'>.suit</font> automatically. This is exactly what the special method .\_\_post_init__() is for. It allows for special processing after the regular .\_\_init__() method is called:

In [54]:
from dataclasses import dataclass, field

RANKS = '2 3 4 5 6 7 8 9 10 J Q K A'.split()
SUITS = '♣ ♢ ♡ ♠'.split()

@dataclass(order=True)
class PlayingCard:
    sort_index: int = field(init=False, repr=False)
    rank: str
    suit: str

    def __post_init__(self):        
        self.sort_index = (RANKS.index(self.rank) * len(SUITS)
                           + SUITS.index(self.suit))
        print(f'{self} with sort_index={self.sort_index}')

    def __str__(self):
        return f'{self.suit}{self.rank}'

<b>Note that <font color='violet'>.sort_index</font> is added as the first field of the class. That way, the comparison is first done using <font color='violet'>.sort_index</font> and only if there are ties are the other fields used.</b> Using <font color='blue'>field()</font>, you must also specify that <font color='violet'>.sort_index</font> should not be included as a parameter in the .\_\_init__() method (<font color='brown'>because it is calculated from the</font> <font color='violet'>.rank</font> <font color='brown'>and</font> <font color='violet'>.suit</font> <font color='brown'>fields</font>). To avoid confusing the user about this implementation detail, it is probably also a good idea to remove <font color='violet'>.sort_index</font> from the `repr` of the class.

Finally, aces are high:

In [55]:
queen_of_hearts = PlayingCard('Q', '♡')
ace_of_spades = PlayingCard('A', '♠')
print(f'queen_of_hearts.sort_index={queen_of_hearts.sort_index}; ace_of_spades.sort_index={ace_of_spades.sort_index}')

♡Q with sort_index=42
♠A with sort_index=51
queen_of_hearts.sort_index=42; ace_of_spades.sort_index=51


In [56]:
ace_of_spades > queen_of_hearts

True

In [57]:
(51, 'A', '♠') > (42, 'Q', '♡')

True

You can now easily create a sorted deck:

In [58]:
Deck(sorted(make_french_deck()))

Making french deck...
♣2 with sort_index=0
♣3 with sort_index=4
♣4 with sort_index=8
♣5 with sort_index=12
♣6 with sort_index=16
♣7 with sort_index=20
♣8 with sort_index=24
♣9 with sort_index=28
♣10 with sort_index=32
♣J with sort_index=36
♣Q with sort_index=40
♣K with sort_index=44
♣A with sort_index=48
♢2 with sort_index=1
♢3 with sort_index=5
♢4 with sort_index=9
♢5 with sort_index=13
♢6 with sort_index=17
♢7 with sort_index=21
♢8 with sort_index=25
♢9 with sort_index=29
♢10 with sort_index=33
♢J with sort_index=37
♢Q with sort_index=41
♢K with sort_index=45
♢A with sort_index=49
♡2 with sort_index=2
♡3 with sort_index=6
♡4 with sort_index=10
♡5 with sort_index=14
♡6 with sort_index=18
♡7 with sort_index=22
♡8 with sort_index=26
♡9 with sort_index=30
♡10 with sort_index=34
♡J with sort_index=38
♡Q with sort_index=42
♡K with sort_index=46
♡A with sort_index=50
♠2 with sort_index=3
♠3 with sort_index=7
♠4 with sort_index=11
♠5 with sort_index=15
♠6 with sort_index=19
♠7 with sort_inde

Deck(♣2, ♢2, ♡2, ♠2, ♣3, ♢3, ♡3, ♠3, ♣4, ♢4, ♡4, ♠4, ♣5, ♢5, ♡5, ♠5, ♣6, ♢6, ♡6, ♠6, ♣7, ♢7, ♡7, ♠7, ♣8, ♢8, ♡8, ♠8, ♣9, ♢9, ♡9, ♠9, ♣10, ♢10, ♡10, ♠10, ♣J, ♢J, ♡J, ♠J, ♣Q, ♢Q, ♡Q, ♠Q, ♣K, ♢K, ♡K, ♠K, ♣A, ♢A, ♡A, ♠A)

Or, if you don’t care about sorting, this is how you draw a random hand of 10 cards:

In [59]:
from random import sample

Deck(sample(make_french_deck(), k=10))

Deck(♠A, ♠2, ♢Q, ♠J, ♠7, ♠4, ♡2, ♣8, ♡8, ♣5)

<a id='sect4'></a>
## <font color='darkblue'>Immutable Data Classes</font> ([back](#sect0))
<b>One of the defining features of the <a href='https://docs.python.org/3/library/collections.html#collections.namedtuple'>namedtuple</a> you saw earlier is that it is immutable.</b> That is, the value of its fields may never change. For many types of data classes, this is a great idea! To make a data class immutable, set <font color='blue'>frozen=True</font> when you create it. For example, the following is an immutable version of the <font color='blue'><b>Position</b></font> class <a href='https://realpython.com/python-data-classes/#basic-data-classes'>you saw earlier</a>:

In [60]:
from dataclasses import dataclass

@dataclass(frozen=True)
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

In a frozen data class, you cannot assign values to the fields after creation:
```python
>>> pos = Position('Oslo', 10.8, 59.9)
>>> pos.name
'Oslo'
>>> pos.name = 'Stockholm'
dataclasses.FrozenInstanceError: cannot assign to field 'name'
```
<br/>
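
When a changed value is needed, a common approach is to build a new instance rather than modify the existing one. The following is a minimal sketch using `replace()` and `FrozenInstanceError` from the standard `dataclasses` module; the Stockholm coordinates are illustrative values only:

```python
from dataclasses import FrozenInstanceError, dataclass, replace

@dataclass(frozen=True)
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

pos = Position('Oslo', 10.8, 59.9)

# Assigning to a field of a frozen instance raises FrozenInstanceError.
try:
    pos.name = 'Stockholm'
except FrozenInstanceError as err:
    print(f'Rejected: {err}')

# replace() returns a new instance with the given fields changed;
# the original object is left untouched.
moved = replace(pos, name='Stockholm', lon=18.1, lat=59.3)
print(pos)
print(moved)
```
<br/>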

Be aware though that if your data class contains mutable fields, those might still change. This is true for all nested data structures in Python (<font color='brown'>see <a href='https://www.youtube.com/watch?v=p9ppfvHv2Us'>this video for further info</a></font>):

In [61]:
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class ImmutableCard:
    rank: str
    suit: str

@dataclass(frozen=True)
class ImmutableDeck:
    cards: List[ImmutableCard]

In [62]:
ideck = ImmutableDeck([ImmutableCard('Q', '♡'), ImmutableCard('A', '♠')])
# ideck.cards = []  # cannot assign to field 'cards'
ideck.cards.append(ImmutableCard('2', '♣'))
ideck

ImmutableDeck(cards=[ImmutableCard(rank='Q', suit='♡'), ImmutableCard(rank='A', suit='♠'), ImmutableCard(rank='2', suit='♣')])

Even though both <b><font color='blue'>ImmutableCard</font></b> and <b><font color='blue'>ImmutableDeck</font></b> are immutable, the list holding cards is not. You can therefore still change the `cards` in the deck:
```python
>>> queen_of_hearts = ImmutableCard('Q', '♡')
>>> ace_of_spades = ImmutableCard('A', '♠')
>>> deck = ImmutableDeck([queen_of_hearts, ace_of_spades])
>>> deck
ImmutableDeck(cards=[ImmutableCard(rank='Q', suit='♡'), ImmutableCard(rank='A', suit='♠')])
>>> deck.cards[0] = ImmutableCard('7', '♢')
>>> deck
ImmutableDeck(cards=[ImmutableCard(rank='7', suit='♢'), ImmutableCard(rank='A', suit='♠')])
```
<br/>

<b>To avoid this, make sure all fields of an immutable data class use immutable types</b> (<font color='brown'>but remember that types are not enforced at runtime</font>). The <font color='blue'><b>ImmutableDeck</b></font> should be implemented using a tuple instead of a list.
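
A tuple-based deck could look like the following sketch. The class name `TupleDeck` is made up for this illustration. Because type hints are not enforced, the caller is still responsible for actually passing a tuple:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class ImmutableCard:
    rank: str
    suit: str

@dataclass(frozen=True)
class TupleDeck:  # hypothetical name, used only in this sketch
    cards: Tuple[ImmutableCard, ...]

deck = TupleDeck((ImmutableCard('Q', '♡'), ImmutableCard('A', '♠')))

# A tuple does not support item assignment, so the cards cannot be swapped out.
try:
    deck.cards[0] = ImmutableCard('7', '♢')
except TypeError as err:
    print(f'Rejected: {err}')

# With only immutable, hashable fields, the frozen deck is hashable as well.
same_deck = TupleDeck(deck.cards)
print(hash(deck) == hash(same_deck))
```
<br/>

A side benefit of this design is that such a deck can be used as a dictionary key or stored in a set.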

<a id='sect5'></a>
## <font color='darkblue'>Inheritance</font> ([back](#sect0))
<b>You can <a href='https://realpython.com/python3-object-oriented-programming/'>subclass</a> data classes quite freely</b>. As an example, we will extend our <b><font color='blue'>Position</font></b> example with a `country` field and use it to record capitals:

In [63]:
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float
    lat: float

@dataclass
class Capital(Position):
    country: str

In this simple example, everything works without a hitch:

In [64]:
Capital(name='Oslo', lon=10.8, lat=59.9, country='Norway')

Capital(name='Oslo', lon=10.8, lat=59.9, country='Norway')

The `country` field of <font color='blue'><b>Capital</b></font> is added after the three original fields in Position. Things get a little more complicated if any fields in the base class have default values. For example:
```python
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

@dataclass
class Capital(Position):
    country: str  # Does NOT work
```
<br/>

<b>This code will immediately crash with a <font color='blue'>TypeError</font> complaining that “non-default argument ‘country’ follows default argument.”</b> The problem is that our new `country` field has no default value, while the `lon` and `lat` fields have default values. The data class will try to write an .\_\_init__() method with the following signature:
```python
def __init__(self, name: str, lon: float = 0.0, lat: float = 0.0, country: str):
    ...
```
<br/>

However, this is not valid Python. <a href='https://docs.python.org/reference/compound_stmts.html#function-definitions'>If a parameter has a default value, all following parameters must also have a default value</a>. In other words, if a field in a base class has a default value, then all new fields added in a subclass must have default values as well.
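
The sketch below illustrates both sides of this rule. The name `BrokenCapital` is hypothetical, and the exact wording of the error message may differ between Python versions:

```python
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

# The error is raised when the class is created, not when an instance is made.
error_message = None
try:
    @dataclass
    class BrokenCapital(Position):
        country: str  # no default, but follows fields that have defaults
except TypeError as err:
    error_message = str(err)
    print(f'TypeError: {error_message}')

# Giving the new field a default value makes the subclass valid.
@dataclass
class Capital(Position):
    country: str = 'Unknown'

print(Capital('Oslo', 10.8, 59.9, 'Norway'))
```
<br/>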

<b>Another thing to be aware of is how fields are ordered in a subclass.</b> Starting with the base class, fields are ordered in the order in which they are first defined. If a field is redefined in a subclass, its order does not change. For example, if you define <b><font color='blue'>Position</font></b> and <b><font color='blue'>Capital</font></b> as follows:

In [65]:
from dataclasses import dataclass

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0
        

@dataclass
class Capital(Position):
    country: str = 'Unknown'
    lat: float = 40.0

Then the order of the fields in <b><font color='blue'>Capital</font></b> will still be `name`, `lon`, `lat`, `country`. However, the default value of lat will be 40.0.

In [66]:
Capital('Madrid', country='Spain')

Capital(name='Madrid', lon=0.0, lat=40.0, country='Spain')

In [67]:
from dataclasses import fields
for f in fields(Capital): print(f)

Field(name='name',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,default_factory=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)
Field(name='lon',type=<class 'float'>,default=0.0,default_factory=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)
Field(name='lat',type=<class 'float'>,default=40.0,default_factory=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)
Field(name='country',type=<class 'str'>,default='Unknown',default_factory=<dataclasses._MISSING_TYPE object at 0x7f9d61bac2b0>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),_field_type=_FIELD)
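
If the full `Field` output is hard to scan, the same ordering can be checked more compactly. This sketch redefines the two classes so that it runs on its own, and uses `inspect.signature()` from the standard library to look at the generated .\_\_init__() method:

```python
import inspect
from dataclasses import dataclass, fields

@dataclass
class Position:
    name: str
    lon: float = 0.0
    lat: float = 0.0

@dataclass
class Capital(Position):
    country: str = 'Unknown'
    lat: float = 40.0

# Field order follows the first definition, even though lat is redefined last.
field_names = [f.name for f in fields(Capital)]
print(field_names)  # ['name', 'lon', 'lat', 'country']

# The generated __init__() takes its parameters in the same order.
init_params = inspect.signature(Capital).parameters
print(list(init_params))  # ['name', 'lon', 'lat', 'country']
print(init_params['lat'].default)  # 40.0
```
<br/>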


<a id='sect6'></a>
## <font color='darkblue'>Optimizing Data Classes</font> ([back](#sect0))
I’m going to end this tutorial with a few words about <b><a href='https://docs.python.org/reference/datamodel.html#slots'>slots</a></b>. Slots can be used to make classes faster and use less memory. Before Python 3.10, data classes have no explicit syntax for working with slots (Python 3.10 added a `slots=True` parameter to `@dataclass`), but the normal way of creating slots works for data classes as well. (<font color='brown'>They really are just regular classes!</font>)

In [68]:
from dataclasses import dataclass

@dataclass
class SimplePosition:
    name: str
    lon: float
    lat: float

@dataclass
class SlotPosition:
    __slots__ = ['name', 'lon', 'lat']
    name: str
    lon: float
    lat: float

Essentially, slots are defined using .\_\_slots__ to list the variables on a class. Variables or attributes not present in .\_\_slots__ may not be defined. Furthermore, a slots class may not have default values.
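
Both restrictions can be seen in a short sketch. The class name `SlotWithDefault` is hypothetical, and the exact error messages depend on the Python version:

```python
from dataclasses import dataclass

@dataclass
class SlotPosition:
    __slots__ = ['name', 'lon', 'lat']
    name: str
    lon: float
    lat: float

slot = SlotPosition('Madrid', -3.7, 40.4)

# Attributes that are not listed in __slots__ cannot be added to an instance.
attr_error = None
try:
    slot.country = 'Spain'
except AttributeError as err:
    attr_error = err
    print(f'AttributeError: {err}')

# A default value is stored as a class attribute, which clashes with the
# slot of the same name, so the class body itself fails with a ValueError.
default_error = None
try:
    @dataclass
    class SlotWithDefault:
        __slots__ = ['name', 'lon']
        name: str
        lon: float = 0.0
except ValueError as err:
    default_error = err
    print(f'ValueError: {err}')
```
<br/>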

<b>The benefit of adding such restrictions is that certain optimizations may be done. For instance, slots classes take up less memory</b>, as can be measured using <b><a href='https://pythonhosted.org/Pympler/'>Pympler</a></b>:

In [69]:
#!pip install pympler

In [70]:
from pympler import asizeof

simple = SimplePosition('London', -0.1, 51.5)
slot = SlotPosition('Madrid', -3.7, 40.4)
asizeof.asizesof(simple, slot)
# (424, 160)

(424, 160)

<b>Similarly, slots classes are typically faster to work with.</b> The following example measures the speed of attribute access on a slots data class and a regular data class using <b><a href='https://docs.python.org/3/library/timeit.html'>timeit</a></b> from the standard library.

In [71]:
from timeit import timeit

timeit('slot.name', setup="slot=SlotPosition('Oslo', 10.8, 59.9)", globals=globals())

0.03253677603788674

In [72]:
timeit('simple.name', setup="simple=SimplePosition('Oslo', 10.8, 59.9)", globals=globals())

0.034331331960856915

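In this particular run, the slots class is only about 5% faster (roughly 0.0325 seconds versus 0.0343 seconds for one million attribute lookups). The exact difference will vary between machines and Python versions.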
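
On Python 3.10 and later, `@dataclass` also accepts a `slots=True` argument that generates `__slots__` automatically and allows default values. The following sketch guards on the interpreter version so that it still runs on older Pythons; the class name `AutoSlotPosition` is made up for this example:

```python
import sys
from dataclasses import dataclass

pos = None
if sys.version_info >= (3, 10):
    # slots=True asks the decorator to build the slots class for you.
    @dataclass(slots=True)
    class AutoSlotPosition:  # hypothetical name, used only in this sketch
        name: str
        lon: float = 0.0
        lat: float = 0.0

    pos = AutoSlotPosition('Oslo', 10.8)
    print(pos)
    print(AutoSlotPosition.__slots__)
    # Like a hand-written slots class, instances have no per-instance __dict__.
    print(hasattr(pos, '__dict__'))
else:
    print('slots=True requires Python 3.10 or later')
```
<br/>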