# Python Data Model

> Coding like poetry should be short and concise. ―Santosh Kalwar

- Overview
- Special Methods
- Number and Boolean Value
- Collection
- Class and Metaclass

## Motivation

There are three basic requirements in the design of data operations:

- Consistent: it is easy to predict the behaviors of a new construct instance.
- Composable: constructs can be selected and assembled in various combinations to enable desired behaviors.
- Open: developers can create new types that work in the same way as the built-in types and standard library types.

## Language Constructs

Python language constructs can be classified into four categories:

- built-in operators such as `+`, `-`, `>=`, list index `[]`, function call `()`, and so on.
- built-in functions such as `len()`, `repr()`, `bool()`, etc.
- specific syntax such as `for` loop statement and `with` context manager statement.
- built-in types and new type definitions (`class`). 


## Python Data Model

The Python *data model* is the set of APIs that defines the interfaces of the language constructs and satisfies the three basic requirements:

- Consistent: it is standardized by the Python language specification and PEPs.
- Composable: the APIs work well with each other.
- Open: new objects fit well with the Python language syntax.

It is defined in the [Data Model](https://docs.python.org/3/reference/datamodel.html) chapter of the Python Language Reference.

## Pythonic

> "There should be one-- and preferably only one --obvious way to do it."  - The Zen of Python

Python promotes an idiomatic coding style, the so-called *Pythonic* style, that leverages the Python data model and idiomatic language features.

For example, to find an object's length, you use the built-in `len()` function, not a function like `length()`/`size()`, or a method like `my_object.len()` or `my_object.size()`.

Every Python developer should be familiar with common Python idioms. The following are two resources:

- [Python Programming/Idioms](https://en.wikibooks.org/wiki/Python_Programming/Idioms)
- [Idiomatic Python](https://intermediate-and-advanced-software-carpentry.readthedocs.io/en/latest/idiomatic-python.html)

# Special Methods

The Python data model is a set of APIs. The APIs are defined as a set of standard *special methods*.

All special methods follow a special naming style: starting and ending with double underscores: `__*__`. They are known as *dunder* (double underscore) methods.

Developers *should not* create or use any dunder identifier not standardized by the language reference because they are subject to breakage without warning in future Python versions.

In the Python interpreter, built-in functions, operators, and special syntax invoke these special methods to perform data operations.

## Built-in Functions

`len()` invokes the `__len__()` method to get the length/size of an object.

`repr()` invokes the `__repr__()` method to compute the string representation of an object. It is intended for developers and is mainly used for debugging.

`str()` and `print()` invoke the `__str__()` method to compute a user-friendly representation of an object. `format()` invokes `__format__()`, whose default implementation falls back to `str()` when no format specification is given. The default `object.__str__()` calls `object.__repr__()`.
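
A minimal sketch of how these built-in functions dispatch to special methods. The `Playlist` class and its attributes are hypothetical names used only for illustration.

```python
class Playlist:
    """Hypothetical container used only to illustrate the dispatch."""

    def __init__(self, name, songs):
        self.name = name
        self.songs = list(songs)

    def __len__(self):
        # called by len(playlist)
        return len(self.songs)

    def __repr__(self):
        # called by repr(playlist); aimed at developers
        return f"Playlist({self.name!r}, {self.songs!r})"

    def __str__(self):
        # called by str(playlist) and print(playlist); aimed at end users
        return f"{self.name} ({len(self.songs)} songs)"


playlist = Playlist("Road Trip", ["Song A", "Song B"])

print(len(playlist))   # 2
print(repr(playlist))  # Playlist('Road Trip', ['Song A', 'Song B'])
print(str(playlist))   # Road Trip (2 songs)
```

If `__str__()` were removed from this sketch, `str(playlist)` would fall back to the `__repr__()` output.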

## Built-in Operators

`+` invokes the `__add__()` method of its left operand. If the left operand doesn't define `__add__()`, or the method returns `NotImplemented`, Python invokes the `__radd__()` method of the right operand. If neither method handles the operation, Python raises a `TypeError`.
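
The lookup order for `+` can be sketched as follows. `Meters` is a hypothetical type invented for this example; the point is the `NotImplemented` return value and the `__radd__()` fallback.

```python
class Meters:
    """Hypothetical length type used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # handles Meters + Meters and Meters + number
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # signal that this operand type is not supported
        return NotImplemented

    def __radd__(self, other):
        # handles number + Meters, tried after int.__add__ returns NotImplemented
        return self.__add__(other)


print((Meters(2) + 3).value)  # 5
print((3 + Meters(2)).value)  # 5

try:
    Meters(2) + "three"
except TypeError as error:
    print(type(error).__name__)  # TypeError
```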

`==` invokes the `__eq__()` method. By default, `object` implements `__eq__()` using `is`, which checks whether two references point to the same object. For types that represent values, this is usually not what you want, and you should implement the `__eq__()` method.
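
A small sketch contrasting the default identity comparison with a value-based `__eq__()`. Both classes are hypothetical. Defining `__eq__()` without `__hash__()` makes instances unhashable, so the sketch defines both.

```python
class Point:
    """Hypothetical value type that compares by content."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # keep instances usable as dict keys and set members
        return hash((self.x, self.y))


class PlainPoint:
    """Same data, but relies on the default identity-based __eq__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


print(Point(1, 2) == Point(1, 2))            # True
print(PlainPoint(1, 2) == PlainPoint(1, 2))  # False
```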

`self[key]` invokes `type(self).__getitem__(self, key)`. For a sequence type, `key` is an integer (or a slice); for a mapping type, `key` is any hashable value.

A call `self(...)` invokes `type(self).__call__(self, ...)`. If a class defines a `__call__(self, ...)` method, its instances are callable using the syntax `instance(...)`.
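
Both operators can be illustrated with one hypothetical `Polynomial` type, assuming its coefficients are stored from the constant term upward:

```python
class Polynomial:
    """Hypothetical type: coefficients ordered from the constant term up."""

    def __init__(self, *coefficients):
        self.coefficients = list(coefficients)

    def __getitem__(self, index):
        # called by polynomial[index]
        return self.coefficients[index]

    def __call__(self, x):
        # called by polynomial(x); evaluates the polynomial at x
        return sum(c * x ** power for power, c in enumerate(self.coefficients))


p = Polynomial(1, 2, 3)  # 1 + 2x + 3x^2

print(p[1])         # 2
print(p(2))         # 17
print(callable(p))  # True
```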

## Special Syntax

The `for` statement calls the `__iter__()` method to obtain an iterator, then repeatedly calls the iterator's `__next__()` method to loop over the items of a collection.
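
A minimal sketch of an iterable type. `Countdown` is a hypothetical class; writing `__iter__()` as a generator function is one convenient way to return a fresh iterator each time.

```python
class Countdown:
    """Hypothetical iterable used only for illustration."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # a generator function returns a new iterator on every call
        current = self.start
        while current > 0:
            yield current
            current -= 1


for number in Countdown(3):
    print(number)  # 3, then 2, then 1

# roughly what the for statement does behind the scenes
iterator = iter(Countdown(3))  # calls Countdown.__iter__
print(next(iterator))          # 3, via the iterator's __next__
```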

The `with` statement uses the `__enter__()` and `__exit__()` methods to manage an object's context. Classes that deal with files, databases, and network connections should implement these two methods to manage resources.
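
A sketch of the context manager protocol. `Resource` is a hypothetical class that only records events; a real implementation would open and close a file, connection, or similar resource.

```python
class Resource:
    """Hypothetical context manager that records what happens to it."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("acquire")
        return self  # bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        # runs even when the body of the with statement raises
        self.events.append("release")
        return False  # do not suppress the exception


resource = Resource()
try:
    with resource as r:
        r.events.append("work")
        raise ValueError("something went wrong")
except ValueError:
    pass

print(resource.events)  # ['acquire', 'work', 'release']
```

Returning a falsy value from `__exit__()` lets the exception propagate after cleanup, which is usually the safer default.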

## Built-in Types and New Types

Python defines a set of built-in types. Each type has a set of valid operations. New types are defined using `class` and can emulate the behavior of the built-in types. The following sections give examples of emulating built-in types and customizing the creation of new types.

- Number and Bool
- Collection
- Class and Metaclass

# Emulating Number

Python is a high-level programming language with built-in numeric operators and functions such as `+`, `-` (unary negation or binary subtraction), `*`, `/`, `//` (floor division), `%`, `**`, `<`, `<=`, `==`, `!=`, `>`, `>=`, `abs()`, `&` (bitwise AND), `~` (bitwise inversion), `^` (bitwise XOR), `|` (bitwise OR), and so on.

Each of these operators or functions has one or more corresponding special methods. For example, `+` invokes `__add__()` of its left operand, or `__radd__()` of its right operand if the left operand does not define `__add__()`.

If a new type defines the corresponding methods, instances of the type can be used as operands of the built-in operators and functions.

## `+` and `*` Operators

As an example, the following code defines a new `Vector` type that works well with `+` and
`*`. It also defines `__repr__` to have a better string representation of the data.

```python
class Vector:

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __add__(self, other):
        x = self.x + other.x
        y = self.y + other.y
        return Vector(x, y)

    # scalar multiplication
    def __mul__(self, number):
        x = self.x * number
        y = self.y * number
        return Vector(x, y)

    # a string representation
    # the x!r conversion flag means `repr(x)`
    def __repr__(self):
        return f"Vector({self.x!r}, {self.y!r})"


point_1 = Vector(2, 4)
point_2 = Vector(3, 5)
point_3 = point_1 + point_2
point_4 = point_3 * 10

print(point_3, point_4)  # Vector(5, 9) Vector(50, 90)
```
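
The `Vector` class above supports `vector * 10` but not `10 * vector`, because `int` does not know how to multiply by a `Vector`. One possible extension, shown here as a trimmed-down sketch rather than as part of the original example, is to add the reflected method `__rmul__()`:

```python
class Vector:
    """Trimmed-down Vector with a reflected multiplication method."""

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __mul__(self, number):
        # vector * number
        return Vector(self.x * number, self.y * number)

    def __rmul__(self, number):
        # number * vector: tried after int.__mul__ returns NotImplemented
        return self * number

    def __repr__(self):
        return f"Vector({self.x!r}, {self.y!r})"


print(Vector(2, 4) * 10)  # Vector(20, 40)
print(10 * Vector(2, 4))  # Vector(20, 40)
```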

## Boolean Value

Any Python object can be used in a boolean context or passed to the built-in `bool()` function. Boolean contexts include the condition of an `if` or `while` statement and the operands of the `and`, `or`, and `not` logical operators. Every object is either *truthy* or *falsy* in a boolean context.

By default, any instance of a new type is truthy unless the type defines either the `__bool__()` or the `__len__()` method. In a boolean context or a call to `bool()`, Python calls the `__bool__()` method. If `__bool__()` is not defined, Python calls the `__len__()` method: a result of 0 makes the object falsy (`False`); any other result makes it truthy (`True`).

The `Vector` type has an additional `__bool__()` method in the following code:

```python
class Vector:

    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __add__(self, other):
        x = self.x + other.x
        y = self.y + other.y
        return Vector(x, y)

    # scalar multiplication
    def __mul__(self, number):
        x = self.x * number
        y = self.y * number
        return Vector(x, y)

    # a string representation
    # the x!r conversion flag means `repr(x)`
    def __repr__(self):
        return f"Vector({self.x!r}, {self.y!r})"

    # a vector is falsy only when both components are zero
    def __bool__(self):
        return bool(self.x) or bool(self.y)


point_1 = Vector()
point_2 = Vector(3, 5)

print(bool(point_1), bool(point_2))  # False True
```
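
The `__len__()` fallback and the default truthiness can be sketched with two hypothetical classes, neither of which defines `__bool__()`:

```python
class Basket:
    """Hypothetical container with __len__ but no __bool__."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Token:
    """Hypothetical type with neither __bool__ nor __len__."""


print(bool(Basket()))           # False, because len() is 0
print(bool(Basket(["apple"])))  # True, because len() is 1
print(bool(Token()))            # True, the default for any object
```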

# Collection

Python built-in collection types include `str`, `list`, `tuple`, `range`, `set`, `dict`, and so on. They all have three special methods:

- `__len__()` to support the built-in `len()` and `bool()` functions.
- `__iter__()` to support `for`, unpacking, and other iteration operations.
- `__contains__()` to support the `in` operator.

Except for `set`, all of these collection types support getting a value by a key (an integer index or a hashable object) using the syntax `obj[key]`. It is equivalent to `type(obj).__getitem__(obj, key)`.

By implementing the corresponding special methods, a new type can emulate a built-in collection type that works well in a Pythonic style.
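
As a rough sketch, the four methods above are enough for a new type to behave like a read-only sequence. `Deck` and its `_cards` attribute are hypothetical names chosen for illustration.

```python
class Deck:
    """Hypothetical read-only sequence of card names."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        # supports len(deck) and truthiness
        return len(self._cards)

    def __iter__(self):
        # supports for loops and unpacking
        return iter(self._cards)

    def __contains__(self, card):
        # supports `card in deck`
        return card in self._cards

    def __getitem__(self, index):
        # supports deck[index]
        return self._cards[index]


deck = Deck(["ace", "king", "queen"])

print(len(deck))          # 3
print("king" in deck)     # True
print(deck[0], deck[-1])  # ace queen
first, *rest = deck
print(first, rest)        # ace ['king', 'queen']
print(bool(Deck([])))     # False
```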