# Chapter 5. Data Class Builders
---

## ToC

1. [Data Class as a Code Smell](#data-class-as-a-code-smell)  
    1.1. [Data Class as Scaffolding](#data-class-as-scaffolding)  
    1.2. [Data Class as Intermediate Representation](#data-class-as-intermediate-representation)  
2. [Pattern Matching Class Instances](#pattern-matching-class-instances)  
    2.1. [Simple Class Patterns](#simple-class-patterns)  
    2.2. [Keyword Class Patterns](#keyword-class-patterns)  
    2.3. [Positional Class Patterns](#positional-class-patterns)

---

## Data Class as a Code Smell

Whether you implement a data class by writing all the code yourself or leveraging
one of the class builders described in this chapter, be aware that it may signal a problem
in your design.

**Resources**

[Martin Fowler Post (Must Read!)](https://martinfowler.com/bliki/CodeSmell.html)  
[Refactoring Guru Website](https://refactoring.guru/refactoring/smells)


In [Refactoring: Improving the Design of Existing Code, 2nd ed.](https://martinfowler.com/books/refactoring.html) (Addison-Wesley), Martin Fowler and Kent Beck present a catalog of “code smells”—patterns in code that may indicate the need for refactoring. The entry titled “Data Class” starts like
this:
> These are classes that have fields, getting and setting methods for fields, and nothing
else. Such classes are dumb data holders and are often being manipulated in far too
much detail by other classes.

The main idea of object-oriented programming is to place behavior and data together
in the same code unit: a class. If a class is widely used but has no significant behavior
of its own, it’s possible that code dealing with its instances is scattered (and even
duplicated) in methods and functions throughout the system—a recipe for maintenance
headaches. That’s why Fowler’s refactorings to deal with a data class involve
bringing responsibilities back into it.
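
As a minimal sketch of what "bringing responsibilities back" can look like, consider the following. The `LineItem` class and its fields are hypothetical and not taken from Fowler's book; the point is only that a calculation callers would otherwise repeat is moved into the class that owns the data:

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    """Hypothetical order line, used only for illustration."""
    description: str
    unit_price: float
    quantity: int = 1

    def subtotal(self) -> float:
        # Behavior lives next to the data instead of being
        # recomputed by every caller that handles a LineItem.
        return self.unit_price * self.quantity


item = LineItem('notebook', 2.5, 4)
print(item.subtotal())  # 10.0
```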

Taking that into account, there are a couple of common scenarios where it makes
sense to have a data class with little or no behavior.

### Data Class as Scaffolding

In this scenario, the data class is an initial, simplistic implementation of a class to
jump-start a new project or module. With time, the class should get its own methods,
instead of relying on methods of other classes to operate on its instances. Scaffolding
is temporary; eventually your custom class may become fully independent from the
builder you used to start it.

Python is also used for quick problem solving and experimentation, and then it’s OK
to leave the scaffolding in place.

### Data Class as Intermediate Representation

Python’s data class builders all provide a method or function to convert
an instance to a plain `dict`, and you can always invoke the constructor with a
`dict` used as keyword arguments expanded with `**`. Such a `dict` is very close to a
JSON record.
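
The round trip can be sketched as follows, assuming a `@dataclass` (for which the conversion function is `dataclasses.asdict`; named tuples offer the `._asdict()` method instead). The `Coordinate` class is a hypothetical example:

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class Coordinate:
    """Hypothetical record type for this sketch."""
    lat: float
    lon: float


point = Coordinate(35.68, 139.69)

record = asdict(point)                        # instance -> plain dict
payload = json.dumps(record)                  # dict -> JSON text
restored = Coordinate(**json.loads(payload))  # dict -> instance via **

print(record)             # {'lat': 35.68, 'lon': 139.69}
print(restored == point)  # True
```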

In this scenario, the data class instances should be handled as immutable objects—
even if the fields are mutable, you should not change them while they are in this
intermediate form. If you do, you’re losing the key benefit of having data and behavior
close together. When importing/exporting requires changing values, you should
implement your own builder methods instead of using the given “as dict” methods or
standard constructors.
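
One possible shape for such builder methods is sketched below. The `Event` class and the method names `from_record` and `to_record` are assumptions chosen for illustration, not part of any class builder's API:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:
    """Hypothetical record whose date needs conversion on import/export."""
    name: str
    day: date

    @classmethod
    def from_record(cls, record: dict) -> 'Event':
        # Import: convert the ISO 8601 string into a date object.
        return cls(name=record['name'], day=date.fromisoformat(record['day']))

    def to_record(self) -> dict:
        # Export: convert the date back into a JSON-friendly string.
        return {'name': self.name, 'day': self.day.isoformat()}


event = Event.from_record({'name': 'release', 'day': '2024-01-31'})
print(event.to_record())  # {'name': 'release', 'day': '2024-01-31'}
```

Keeping the conversion logic in the class means the value changes happen in one place, rather than in every function that reads or writes the intermediate `dict`.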

## Pattern Matching Class Instances

Class patterns are designed to match class instances by type and—optionally—by
attributes. There are three variations of class patterns: simple, keyword, and positional.

### Simple Class Patterns

Recall the example of simple class patterns used in notebook `Part_I/Chapter_02_ArrayOfSequences/02_UnpackingSeqsIterables_PatternMatching.ipynb`:

```python
case [str(name), _, _, (float(lat), float(lon))]:
```

That pattern matches a four-item sequence where the first item must be an instance
of `str`, and the last item must be a 2-tuple with two instances of `float`.

The following is a class pattern that matches `float` values without binding a variable (the case body can
refer to `x` directly if needed):

```python
match x:
    case float():
        do_something_with(x)
```

This is equivalent to the following:

```python
if isinstance(x, float):
    do_something_with(x)
```

But the following is likely to be a bug in your code:

```python
match x:
    case float:  # DANGER!!!
        do_something_with(x)
```

In the latter example, `case float:` matches any subject, because Python treats
`float` as a capture variable, which is then bound to the subject.

The simple pattern syntax of `float(x)` is a special case that applies only to nine
blessed built-in types, listed at the end of the [“Class Patterns” section of PEP 634—
Structural Pattern Matching: Specification](https://peps.python.org/pep-0634/):

`bytes` `dict` `float` `frozenset` `int` `list` `set` `str` `tuple`

If the class is not one of those nine blessed built-ins, then the argument-like variables
represent patterns to be matched against attributes of an instance of that class.

```python
match 3.14:
    # bind the entire matched value
    case float(x):
        print(x)
```

```text
3.14
```


```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


# type hints are not enforced at runtime, so 'a' is accepted for x
match Point('a', 2):
    # matched against the attributes of the Point instance
    case Point(x, y):
        print(x, y)
```

```text
a 2
```


```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def check_point(p):
    match p:
        case Point(x, y) if isinstance(x, int) and isinstance(y, int) and x > 0 and y > 0:
            print(f"Matched: x={x}, y={y}")
        case _:
            print("No match")


def describe(p):
    match p:
        case Point(x, y) if x == y:
            print("Diagonal point (x == y)")
        case Point(x, y) if x > 0 and y > 0:
            print("Point in the first quadrant")
        case Point(x, y):
            print(f"Point at x={x}, y={y}")
        case _:
            print("Not a point")


# --- Run test cases ---
print("=== check_point examples ===")
check_point(Point(3, 4))       # Match
check_point(Point(-1, 4))      # No match
check_point(Point(0, 0))       # No match
check_point(Point("x", 2))     # No match
check_point("not a point")     # No match
print("\n" + "=" * 15 + " END OF CHECK " + "=" * 15 + "\n")


print("\n=== describe examples ===")
describe(Point(2, 2))          # Diagonal
describe(Point(5, 3))          # First quadrant
describe(Point(-1, 0))         # General point
describe("not a point")        # Not a Point instance
```


```text
=== check_point examples ===
Matched: x=3, y=4
No match
No match
No match
No match

=============== END OF CHECK ===============


=== describe examples ===
Diagonal point (x == y)
Point in the first quadrant
Point at x=-1, y=0
Not a point
```


### Keyword Class Patterns

```python
import typing


class City(typing.NamedTuple):
    continent: str
    name: str
    country: str


cities = [
    City('Asia', 'Tokyo', 'JP'),
    City('Asia', 'Delhi', 'IN'),
    City('North America', 'Mexico City', 'MX'),
    City('North America', 'New York', 'US'),
    City('South America', 'São Paulo', 'BR')
]
```

```python
def match_asian_cities():
    results = []
    for city in cities:
        match city:
            case City(continent='Asia'):
                results.append(city)
    return results


match_asian_cities()
```

```text
[City(continent='Asia', name='Tokyo', country='JP'),
 City(continent='Asia', name='Delhi', country='IN')]
```

If you want to collect the value of the `country` attribute, you could write:

```python
def match_asian_countries():
    results = []
    for city in cities:
        match city:
            case City(continent='Asia', country=cc):
                results.append(cc)
    return results


match_asian_countries()
```

```text
['JP', 'IN']
```

Now the `cc` variable is bound to the `country` attribute of the instance. This
also works if the pattern variable is itself named `country`:

```python
match city:
    case City(continent='Asia', country=country):
        results.append(country)
```

Keyword class patterns are very readable, and work with any class that has public
instance attributes, but they are somewhat verbose.

### Positional Class Patterns

```python
def match_asian_cities_pos():
    results = []
    for city in cities:
        match city:
            case City('Asia'):
                results.append(city)
    return results


match_asian_cities_pos()
```

```text
[City(continent='Asia', name='Tokyo', country='JP'),
 City(continent='Asia', name='Delhi', country='IN')]
```

The pattern `City('Asia')` matches any `City` instance where the **first** attribute value
is `'Asia'`, regardless of the values of the other attributes.

If you want to collect the value of the `country` attribute, you could write:

```python
def match_asian_countries_pos():
    results = []
    for city in cities:
        match city:
            case City('Asia', _, country):
                results.append(country)
    return results


match_asian_countries_pos()
```

```text
['JP', 'IN']
```

The pattern `City('Asia', _, country)` matches the same cities as before, but now
the `country` variable is bound to the **third** attribute of the instance.

What makes `City` or any other class work with positional patterns is the presence of a special
class attribute named `__match_args__`:

```python
City.__match_args__
```

```text
('continent', 'name', 'country')
```

`__match_args__` declares the names of the attributes in the order they
will be used in positional patterns.

![Figure 85](https://raw.githubusercontent.com/berserkhmdvhb/Training-Python/main/figures/Part_I/85.PNG)