### 05 - Classes and Decorators

#### Outline

* Class Definition, Initialization, and Inheritance
* Methods, Properties, and Caching
* Abstract Base Classes & Protocols
* Decorators
* Context Managers
* Pydantic and Dataclasses


In [None]:
from __future__ import (
    annotations,
)  # Must be at beginning of file; required for self-referencing class method

# These are part of the standard library
import os
from pathlib import Path

working_directory = Path(
    os.path.abspath("")
)  # Immediately stuff the string into a Path object
static_dir = working_directory / "static" / "05"

----
#### Class Definition, Initialization, and Inheritance

Classes combine data with the functions that operate on that data.

Inheritance can be useful but messy.

As a rule of thumb (but not a death pact),
* Don't use multiple inheritance unless the second superclass is a primitive
  * Sometimes the usage is specifically required by an API (like Qt); this is a red flag, but sometimes you just need to power through
* Don't use multiple levels of inheritance (two levels ok if the first level is an abstract class)

Inheriting classes carry all of the baggage (extra methods, stored fields, etc) of their ancestor classes. If used in excess, this can bloat both the complexity and RAM usage of a system.

In [None]:
# This is the class definition
class Example:
    """Classes should also have a docstring describing what they do"""

    # These annotations declare the attributes each instance will carry.
    # NOTE: Similar to functions, we can assign a default here, which creates
    # a class attribute that is shared among all instances,
    # so we have to remember to avoid mutable defaults.
    a: list[float]
    """A list of stuff"""  # Attributes can and should have a docstring
    b: int
    """A number of things"""
    c: float
    """How much stuff"""

    # This is the initializer (commonly called the constructor); it runs when an instance is created
    def __init__(self, a: list[float], b: int, c: float):
        """The init function docstring is the one people will see when they mouse-over the class name

        Args:
            a: _description_
            b: _description_
            c: _description_
        """
        # These are instance attributes
        # Best-practice is for these to always match a declaration of a class attribute,
        # but syntax does not impose this requirement
        self.a = a
        self.b = b
        self.c = c
        print("Example init")

    # This is an instance method
    def bc(self) -> float:
        """b times c"""
        return self.b * self.c


# These are instances
e1 = Example([1.0], 3, 7.0)  # This is calling Example.__init__()
e2 = Example([2.0, 4.3], 1, 0.0)

# We can access attributes and call functions
print(f"e1.a: {e1.a}")
print(f"e2.a: {e2.a}")
print(f"e1.bc() = {e1.bc()}")
print(f"e2.bc() = {e2.bc()}")
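
The note about shared defaults in the class body can be made concrete. The following is a minimal sketch, not part of the original notebook; `SharedDefault` and `PerInstanceDefault` are hypothetical names chosen for illustration.

```python
class SharedDefault:
    """Hypothetical class: the list is created once, at class definition."""

    items: list[int] = []  # One list object shared by every instance


class PerInstanceDefault:
    """Hypothetical class: each instance gets its own list in __init__."""

    items: list[int]

    def __init__(self) -> None:
        self.items = []  # A fresh list per instance


s1, s2 = SharedDefault(), SharedDefault()
s1.items.append(1)  # Mutates the class-level list...
shared_leak = s2.items  # ...so the change is visible through s2 as well

p1, p2 = PerInstanceDefault(), PerInstanceDefault()
p1.items.append(1)
independent = p2.items  # Unaffected by the change to p1

print(shared_leak, independent)
```

Assigning the mutable value inside `__init__` is the usual way to avoid the shared-state surprise.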

In [None]:
# This is inheritance
import json


# We can inherit from the class we defined (or anything else)
class HashableExample(Example):
    """A wrapper for `Example` that does explicit hashing based on attr values rather than address"""

    def __init__(self, a: list[float], b: int, c: float):
        # Call the superclass's initializer; otherwise the attributes it sets up will be missing
        # NOTE: Show what happens if we remove super init
        super().__init__(a, b, c)
        print("HashableExample init")

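    def __hash__(self) -> int:
        """Inefficient but effective method for hashing this type"""
        # Hash the attribute values only; including id(self) would tie the hash to the address
        return hash(json.dumps({"a": self.a, "b": self.b, "c": self.c}))

    def __eq__(self, other: object) -> bool:
        """Compare by value so that instances with equal hashes also compare equal"""
        if not isinstance(other, HashableExample):
            return NotImplemented
        return (self.a, self.b, self.c) == (other.a, other.b, other.c)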


e3 = HashableExample([1.0], 3, 7.0)

emap: dict[HashableExample, int] = {e3: id(e3)}
emap
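
The note about removing the `super().__init__()` call can be illustrated with a small standalone sketch. The classes `Base`, `ForgetsSuper`, and `CallsSuper` are hypothetical and exist only for this example.

```python
class Base:
    """Hypothetical parent that sets an attribute in its initializer."""

    def __init__(self, value: int) -> None:
        self.value = value


class ForgetsSuper(Base):
    """Hypothetical child that overrides __init__ without calling super()."""

    def __init__(self, value: int) -> None:
        pass  # Base.__init__ never runs, so self.value is never assigned


class CallsSuper(Base):
    """Hypothetical child that forwards to the parent initializer."""

    def __init__(self, value: int) -> None:
        super().__init__(value)


ok = CallsSuper(3)
broken = ForgetsSuper(3)

try:
    broken.value
    missing = False
except AttributeError:
    missing = True  # The attribute was never created

print(ok.value, missing)
```

The failure only appears when the missing attribute is first accessed, which can be far from the place where the instance was created.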

----
#### Methods, Properties, and Caching

Methods
* **Instance methods** are functions that belong to a given instance of the class, and take `self` as the first argument
* **Class methods** provide functions that can be run without instantiating the class, and take `cls` as the first argument
* **Static methods** have no syntactic association with the class, but are grouped with it conceptually

Properties
* **Properties** act like an attribute, but they run a function when accessed
  * This removes the need for trivial `get_x()` / `set_x()` methods
* **Cached properties** are run once when accessed, and never again
  * This allows constructing implicit dependency trees 
  * Should only be combined with **frozen** classes! There is no cache invalidation


##### Kinds of methods and properties

In [None]:
import numpy as np


class MethodsAndPropertiesExample:
    """
    An example of different kinds of methods and properties,
    representing a cylinder.
    """

    _radius: float  # relatively-private, get/set with property
    """[m]"""
    density: float
    """[kg/m^3]"""

    def __init__(self, radius: float, density: float):
        self.radius = radius
        self.density = density

    # Class methods take `cls`, the class, as the first argument rather than an instance of the class
    # NOTE: we need future annotations to annotate these properly
    @classmethod
    def from_density_and_mass_per_length(
        cls, density: float, mass_per_length: float
    ) -> MethodsAndPropertiesExample:
        """Back-calculate radius from mass per length and density

        Args:
            density: [kg/m^3]
            mass_per_length: [kg/m]

        Returns:
            Instance of MethodsAndPropertiesExample
        """
        radius = (mass_per_length / density / np.pi) ** 0.5  # [m]
        return cls(radius, density)

    # Staticmethods are associated with the class, but don't take either `self` or `cls` arguments.
    # They _could_ be standalone functions, but are grouped with the class by choice.
    @staticmethod
    def cylinder_volume(radius: float, length: float) -> float:
        """[m^3]"""
        area = np.pi * radius**2  # [m^2]
        return area * length

    # This is an instance method, which takes `self` (an instance of the class) as its first argument
    def mass(self, length: float) -> float:
        """[kg] mass for a given length"""
        return (
            self.section_area * length * self.density
        )  # This uses the `section_area` property

    @property
    def section_area(self) -> float:
        """[m^2] cross-sectional area"""
        return np.pi * self.radius**2

    # The minimal form of a property only provides a getter function
    @property
    def mass_per_length(self) -> float:
        """[kg/m] 1D mass density"""
        return (
            MethodsAndPropertiesExample.cylinder_volume(self.radius, 1.0) * self.density
        )

    # Properties can also specify `setter` and `deleter` functions
    @property
    def radius(self) -> float:
        """[m]"""
        return self._radius

    @radius.setter
    def radius(self, val: float):
        if val <= 0.0:
            raise ValueError("Radius must be positive nonzero")
        else:
            self._radius = val

    @radius.deleter
    def radius(self):
        raise AttributeError("Radius is a required attribute and must not be deleted")
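
To see how these pieces behave from the caller's side, here is a smaller standalone sketch. `Square` is a hypothetical class, not part of the original notebook, that uses the same patterns: an alternate constructor, a validated property, and a read-only property.

```python
class Square:
    """Hypothetical shape with a validated `side` property."""

    def __init__(self, side: float) -> None:
        self.side = side  # Goes through the property setter below

    @classmethod
    def from_area(cls, area: float) -> "Square":
        """Alternate constructor: back-calculate the side from the area."""
        return cls(area**0.5)

    @property
    def side(self) -> float:
        return self._side

    @side.setter
    def side(self, val: float) -> None:
        if val <= 0.0:
            raise ValueError("Side must be positive")
        self._side = val

    @property
    def area(self) -> float:
        """Read-only: there is no setter, so assignment raises AttributeError."""
        return self._side**2


sq = Square.from_area(9.0)
print(sq.side, sq.area)

try:
    sq.side = -1.0  # Rejected by the setter; the old value is kept
except ValueError as e:
    print(e)

try:
    sq.area = 4.0  # No setter defined for `area`
except AttributeError:
    print("area is read-only")
```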


##### Caching

Caching, aka memoizing, is the practice of storing results for reuse.

Cache invalidation - figuring out when you need to calculate a new value instead of using the stored one - is considered one of the "very hard problems" of software engineering.

In [None]:
# Cached properties require careful handling because, by default, the cache is never invalidated
from functools import cached_property
from dataclasses import dataclass


@dataclass(frozen=True)  # Error on assigning to an attribute after init
class CachedPropertyExample:
    a: float
    """First value"""
    b: float
    """Second value"""

    @cached_property
    def ab(self) -> float:
        """a times b. Calculated once, then cached indefinitely."""
        print("Calculating a times b")
        return self.a * self.b

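    def invalidate_cache(self):
        """Property caches can be invalidated manually"""
        # `del self.ab` raises FrozenInstanceError (a subclass of AttributeError)
        # on a frozen dataclass, so drop the cached value from the instance dict instead
        self.__dict__.pop("ab", None)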


# NOTE: discuss lack of look-through docs on dataclass constructor
cpe = CachedPropertyExample(1.0, 2.0)
cpe.ab  # NOTE: notice how the spam only prints once
cpe.ab
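
The reason for pairing cached properties with frozen classes is easier to see on a class that is not frozen. This is a minimal sketch with a hypothetical `MutableWithCache` class; it relies on `cached_property` storing its result in the instance `__dict__`.

```python
from functools import cached_property


class MutableWithCache:
    """Hypothetical non-frozen class, to show the stale-cache hazard."""

    def __init__(self, a: float, b: float) -> None:
        self.a = a
        self.b = b

    @cached_property
    def ab(self) -> float:
        return self.a * self.b


m = MutableWithCache(2.0, 3.0)
first = m.ab  # Computed and stored in m.__dict__["ab"]
m.a = 10.0  # Nothing tells the cache that an input changed
stale = m.ab  # Still the old value

m.__dict__.pop("ab", None)  # Manual invalidation: drop the stored value
fresh = m.ab  # Recomputed from the current attributes

print(first, stale, fresh)
```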


In [None]:
# We can also cache functions that are not part of a class.
# The arguments must be hashable and will be compared by their hash to find a matching cache entry.
# Only use this if the function is significantly more expensive than hashing the inputs!
from functools import cache, lru_cache


@cache  # Unbounded cache size - may run you out of RAM
def infinite_cache_func(c: HashableExample) -> float:
    # <Some expensive operation>
    ...


@lru_cache(maxsize=10)
def limited_cache_func(c: HashableExample) -> float:
    # <Some expensive operation>
    ...
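
Functions wrapped by `lru_cache` expose a `cache_info()` method, which is a quick way to check whether the cache is actually being hit. A minimal sketch, with `slow_square` as a hypothetical stand-in for an expensive function:

```python
from functools import lru_cache

call_count = 0  # Counts how often the function body actually runs


@lru_cache(maxsize=2)
def slow_square(x: int) -> int:
    """Hypothetical stand-in for an expensive function."""
    global call_count
    call_count += 1
    return x * x


slow_square(2)  # miss
slow_square(2)  # hit
slow_square(3)  # miss
slow_square(4)  # miss; evicts the least recently used entry (x=2)
slow_square(2)  # miss again, because x=2 was evicted

info = slow_square.cache_info()
print(info, call_count)
```

A low hit count relative to misses suggests the cache is too small or the inputs rarely repeat.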


##### Special methods

Methods on a class that are enclosed by double underscores are "special" or "magic" methods, and have some connection to language syntax or builtin functions.

All special methods can be overridden with user-defined functions. This allows implementing things like
* Operator overloading
* Assignment validation or freezing
* Nested data access patterns

and much more.
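
As one illustration of assignment freezing, the sketch below overrides `__setattr__` to reject writes after initialization. `FrozenPoint` is a hypothetical class for this example; frozen dataclasses achieve a similar effect without hand-written code.

```python
class FrozenPoint:
    """Hypothetical class that rejects attribute assignment after init."""

    def __init__(self, x: float, y: float) -> None:
        # Bypass our own __setattr__ while initializing
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)

    def __setattr__(self, name: str, value: object) -> None:
        raise AttributeError(f"{type(self).__name__} is frozen; cannot set {name!r}")


p = FrozenPoint(1.0, 2.0)
try:
    p.x = 5.0
    rejected = False
except AttributeError as e:
    rejected = True
    print(e)
```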

The Python language reference documents the special method names in its "Data model" chapter. The table below collects the ones I have found most useful; it is not exhaustive.

| Special Method | Related Builtin | Comments |
| -------------- | --------------- | -------- |
| `__call__` | `instance(*args, **kwargs)` | Call an instance as a function |
| _Identification and representation_ | | |
| - | `id(instance)` | Get the identity of an instance (its memory address in CPython); there is no special method to override |
| `__str__` | `str(instance)` | Make a human-readable string representation of the instance; used by `print()` |
| `__repr__` | `repr(instance)` | Unambiguous string representation, mainly for debugging and the REPL. Historically, it was expected to be an executable expression that rebuilds the instance from scratch (don't do this, though); a storable representation like json is a common alternative. |
| _Access_ | | |
| `__getattribute__(attr)` | `getattr(instance, attr)` | Called on every attribute lookup on an instance |
| `__getattr__(attr)` | `getattr(instance, attr)` | Fallback; called only when the normal lookup through `__getattribute__` raises `AttributeError` |
| `__setattr__(attr, val)` | `setattr(instance, attr, val)` | Set the value of an attribute |
| - | `hasattr(instance, attr)` | Check if an instance of a class has an attribute |
| `__dict__` | `vars(instance)` | Get a dictionary of the instance's own attributes (methods live on the class, not in this dictionary) |
| _Collections_ | | |
| `__getitem__(key)` | `val = instance[key]` | Indexing and slicing syntax (`key` may be a `slice` object) |
| `__setitem__(key, val)` | `instance[key] = val` |  |
| `__contains__(key)` | `key in instance` | |
| `__reversed__` | `reversed(instance)` | Get a representation of the instance with the order of elements reversed |
| _Iterators_ | | |
| `__len__` | `len(instance)` | Get the length of an instance that represents a collection |
| `__iter__` | `iter(instance)` | Get an iterator over the instance; the iterator must implement `__next__` |
| `__next__` | `next(iterator)` | Get the next value from an iterator or `raise StopIteration` if no elements remain |
| _Operator variants_ | | |
| `__r<op>__` | `other <op> self` | All binary operators have a right-side variant like `__radd__` that is used if the left operand's method returns `NotImplemented`. Among other things, this allows supporting bidirectional logic in custom numeric types, which is essential for things like symbolics and autodiff tracers. |
| `__i<op>__` | `self <op>= other` | All binary operators that support compound assignment like `a *= b` have an `__i<op>__` variant that defines operate-assign behavior. |
| _Math operators_ | | |
| `__add__` | `self + other` | |
| `__sub__` | `self - other` | |
| `__mul__` | `self * other` | |
| `__truediv__` | `self / other` | True division (the `/` operator in Python 3) |
| `__pow__` | `self ** other` | |
| `__floordiv__` | `self // other` | Floor-rounding division |
| `__mod__` | `self % other` | modulo (remainder) operator |
| `__neg__` | `-instance` | unary negation |
| `__pos__` | `+instance` | unary plus |
| `__invert__` | `~instance` | bitwise inversion (bitwise NOT) |
| _Comparison operators_ | | Note it is not possible to overload the `and` and `or` logical operators in Python |
| `__eq__` | `self == other` | |
| `__gt__` | `self > other` | |
| `__lt__` | `self < other` | |
| `__le__` | `self <= other` | |
| `__ge__` | `self >= other` | |
| `__ne__` | `self != other` | |
| _Bitwise operators_ | | |
| `__or__` | `self \| other` | bitwise-or |
| `__and__` | `self & other` | bitwise-and |
| `__xor__` | `self ^ other` | bitwise exclusive or |
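| `__invert__` | `~instance` | bitwise-not; this is the same `__invert__` listed under math operators (there is no `__not__` special method) |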
| _Context objects_ | | |
| `__enter__` | `with instance as local_name:` | Enter a context |
| `__exit__(self, exc_type, exc_value, traceback)` | | Called unconditionally on exiting the context |
| _Descriptor objects_ | | Descriptor objects allow implicitly defining things like properties. This is uncommon to see used. They have their own special methods, which are left out of this table.|
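
A few rows of the table can be combined into a small operator-overloading sketch. `Meters` is a hypothetical wrapper type used only for illustration; returning `NotImplemented` is what allows Python to fall back to the other operand's right-side method.

```python
class Meters:
    """Hypothetical numeric wrapper used to illustrate operator overloading."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __repr__(self) -> str:
        return f"Meters({self.value})"

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Meters):
            return NotImplemented
        return self.value == other.value

    def __add__(self, other: object) -> "Meters":
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        return NotImplemented  # Lets Python try the other operand's __radd__

    def __radd__(self, other: object) -> "Meters":
        # Called for `2.0 + Meters(1.0)`, after float.__add__ returns NotImplemented
        return self.__add__(other)

    def __iadd__(self, other: object) -> "Meters":
        # In-place variant used by `+=`; mutates and returns self
        increment = other.value if isinstance(other, Meters) else other
        self.value += increment
        return self


left = Meters(1.0) + 2.0  # Meters.__add__
right = 2.0 + Meters(1.0)  # Meters.__radd__
acc = Meters(1.0)
acc += Meters(0.5)  # Meters.__iadd__
print(left, right, acc)
```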

----
#### Abstract Base Classes and Protocols

* `ABC` enforces that subclasses must implement certain interface methods or properties
  * This allows specifying that subclasses should _behave in a certain way_ without being too specific about how that is achieved
* `Protocol` implicitly indicates that any class that _already_ implements certain methods or properties can be considered to be an instance of that protocol
  * `Protocol` does not work well with pydantic, dataclasses, typechecking, or basically anything else, and is not recommended for use
  * Names of fields and arguments have to match exactly to pass typechecking
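
For recognition when reading other code, here is a minimal sketch of what a `Protocol` looks like. `HasArea`, `Rect`, and `total_area` are hypothetical names; `Rect` never inherits from the protocol and qualifies only because it has a matching method.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasArea(Protocol):
    """Hypothetical protocol: anything with an `area()` method qualifies."""

    def area(self) -> float: ...


class Rect:
    """Does not inherit from HasArea, but matches its shape."""

    def __init__(self, w: float, h: float) -> None:
        self.w = w
        self.h = h

    def area(self) -> float:
        return self.w * self.h


def total_area(shapes: list[HasArea]) -> float:
    return sum(s.area() for s in shapes)


total = total_area([Rect(1.0, 2.0), Rect(3.0, 4.0)])
# runtime_checkable only checks that the method exists, not its signature
matches = isinstance(Rect(1.0, 1.0), HasArea)
print(total, matches)
```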

In [None]:
# from typing import Self  # python 3.11+
from typing_extensions import Self  # Python 3.10- backport
from abc import ABC, abstractmethod


class ExampleAbstractBase(ABC):
    @abstractmethod
    def must_implement_method(self, other: Self) -> Self:
        """Join `other` with self"""
        ...

    @abstractmethod
    def method_with_default_impl(self) -> str:
        return str(self)

    @classmethod
    @abstractmethod
    def must_implement_classmethod(cls, repr: str) -> Self: ...

    @property
    @abstractmethod
    def must_implement_property(self): ...

    @must_implement_property.setter
    def must_implement_property(self, val): ...


# NOTE: show autocomplete for methods & docstring inherited from parent
class ExampleConstrained(ExampleAbstractBase):
    joined: list
    """Instances joined with this one"""

    def __init__(self):
        self.joined = []

    def must_implement_method(self, other) -> Self:
        self.joined.append(other)
        return self

    def method_with_default_impl(self):
        return super().method_with_default_impl()

    # And so on
    ...


# NOTE: We did not define all required functions here, but it doesn't error on the class definition!
#       The error will happen on the first attempt to initialize an instance of the class
try:
    ExampleConstrained()
except TypeError as e:
    print(e)
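
For contrast with the incomplete subclass above, the following sketch shows a subclass that implements every abstract member and can therefore be instantiated. `Shape` and `Box` are hypothetical names, not part of the original notebook.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical abstract base with one abstract method and one property."""

    @abstractmethod
    def scaled(self, factor: float) -> "Shape": ...

    @property
    @abstractmethod
    def area(self) -> float: ...

    def describe(self) -> str:
        # Concrete method on the base; works for any complete subclass
        return f"{type(self).__name__} with area {self.area}"


class Box(Shape):
    """Implements every abstract member, so it can be instantiated."""

    def __init__(self, side: float) -> None:
        self.side = side

    def scaled(self, factor: float) -> "Box":
        return Box(self.side * factor)

    @property
    def area(self) -> float:
        return self.side**2


big = Box(2.0).scaled(3.0)
print(big.describe())

try:
    Shape()  # Still abstract, so this raises TypeError
    instantiated = True
except TypeError:
    instantiated = False
```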

----
#### Decorators

Decorators are function wrappers. They are a function-of-a-function that returns a new, transformed, function.

This can be used to add logging or instrumentation to a program, or to add new features (like caching or symbolic expression mapping) to an existing function.

In addition to operating on functions like `def decorator(func)`, decorators can also operate on instance methods, class methods, and classes in the same way.

In [None]:
# Basic decorators are just a function that returns a function,
# but by default, they wipe out the docstring and type annotations
# of the original
from typing import Callable


def my_decorator(f: Callable) -> Callable:
    """A minimal decorator"""
    print(
        "This part runs when the decorator is applied to a function (during module import)"
    )

    def f_wrapped(*args, **kwargs):
        """A wrapped function"""
        print(f"Running wrapped function with: {args}")
        return f(*args, **kwargs)

    return f_wrapped


@my_decorator
def my_function(a: int) -> int:
    """A minimal function"""
    return a**2


# NOTE: show that type hints are erased in mouseover
print("The wrapper runs as expected:")
my_function(5)
print("...but the docstring of the original function is lost:")
print(my_function.__doc__)

In [None]:
# Decorators that preserve docstring and type hints
from functools import wraps
from typing import Callable, ParamSpec, TypeVar  # Python 3.10+

P = ParamSpec("P")
T = TypeVar("T")


def my_decorator_2(f: Callable[P, T]) -> Callable[P, T]:  # This preserves type hints
    """A better decorator"""
    print(
        "This part runs when the decorator is applied to a function (during module import)"
    )

    @wraps(f)  # This preserves docs
    def f_wrapped(*args: P.args, **kwargs: P.kwargs) -> T:  # This preserves type hints
        """A better wrapped function"""
        print(f"Running wrapped function with: {args}")
        return f(*args, **kwargs)

    return f_wrapped


@my_decorator_2
def my_function_2(a: int) -> int:
    """A minimal function"""
    return a**2


# NOTE: show that type hints are restored in mouseover
print("The wrapper runs as expected:")
my_function_2(5)
print("...and it even has the original type hints and docstring:")
print(my_function_2.__doc__)
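
Two further variations come up often: decorators applied to classes, and decorators that take their own arguments. The sketch below is one possible way to write each; `REGISTRY`, `register`, `repeat`, and `Plugin` are hypothetical names chosen for illustration.

```python
from functools import wraps
from typing import Callable, TypeVar

T = TypeVar("T")

REGISTRY: dict[str, type] = {}  # Hypothetical registry filled in by the class decorator


def register(cls: type) -> type:
    """Class decorator: record the class by name and return it unchanged."""
    REGISTRY[cls.__name__] = cls
    return cls


def repeat(times: int) -> Callable[[Callable[..., T]], Callable[..., list[T]]]:
    """Decorator factory: `repeat(3)` builds the actual decorator."""

    def decorator(f: Callable[..., T]) -> Callable[..., list[T]]:
        @wraps(f)
        def wrapped(*args, **kwargs) -> list[T]:
            return [f(*args, **kwargs) for _ in range(times)]

        return wrapped

    return decorator


@register
class Plugin:
    @repeat(times=3)
    def ping(self) -> str:
        """Reply to a ping"""
        return "pong"


print(REGISTRY)
print(Plugin().ping())
print(Plugin.ping.__doc__)
```

A decorator with arguments is three functions deep: the factory receives the arguments, the decorator receives the function, and the wrapper receives the call.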

----
#### Context Managers

Context managers provide a setup-and-teardown pattern. They are used primarily to manage resources outside the program, such as file handles, pipes, and connections to non-Python objects (for example, C drivers for a USB device).

A context manager is any class that implements `__enter__` and `__exit__` methods. `__exit__` _always_ runs, even if there is an exception raised, much like a `finally` block.

In [None]:
class MyContext:
    def __init__(self, *args, **kwargs):
        pass

    def __enter__(self):
        """This is called when we enter a `with` scope"""
        print("Entering context")
        return self  # The returned value is what `as ctx` binds to

    def __exit__(self, exc_type, exc_value, traceback):
        """
        This is called when we leave the `with` scope, even if there is an exception raised,
        much like a `finally` statement.
        """
        print(f"Exiting context with exception info: {exc_type, exc_value, traceback}")


with MyContext() as ctx:
    print("Doing stuff")

In [None]:
try:
    with MyContext() as ctx:
        print("Doing stuff in inner scope")
        raise ValueError("oops!")

except ValueError:
    print("Captured ValueError in outer scope")
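
The standard library also offers a generator-based way to write the same pattern with `contextlib.contextmanager`. This is a minimal sketch; `managed_resource` and the `events` list are hypothetical and exist only to record the order of operations.

```python
from contextlib import contextmanager
from typing import Iterator

events: list[str] = []  # Records the order of setup, body, and teardown


@contextmanager
def managed_resource(name: str) -> Iterator[str]:
    """Hypothetical resource; code before `yield` is setup, code after is teardown."""
    events.append(f"open {name}")
    try:
        yield name  # The value bound by `as`
    finally:
        events.append(f"close {name}")  # Runs even if the body raises


try:
    with managed_resource("device") as handle:
        events.append(f"use {handle}")
        raise ValueError("oops!")
except ValueError:
    events.append("handled")

print(events)
```

The `try`/`finally` around the `yield` is what gives the same guarantee as `__exit__`: teardown runs before the exception reaches the outer scope.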

----
#### Pydantic and Dataclasses

* Dataclasses are useful if all you want is immutability or to automate generating the `__init__` function
  * Nested dataclasses become nonfunctional rapidly - this is not a real path to robust serialization capability
* Pydantic provides that along with tools for validation, initialization, run-time typechecking, serialization/deserialization, and more
  * Use pydantic for anything that will be nested, uses defaults on mutable types, or needs serialization/deserialization, typechecking, etc
  * Also provides much better look-through/mouseover docs
  * Can be extended to handle array data
  * Pydantic classes can be autogenerated from an OpenAPI json spec using `datamodel-code-generator`
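
For the simple dataclass case described in the first bullet, a minimal standard-library sketch looks like this. `Settings` is a hypothetical class; `field(default_factory=list)` is how dataclasses handle defaults of mutable types.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, field


@dataclass(frozen=True)
class Settings:
    """Hypothetical frozen dataclass with a mutable-typed default."""

    name: str
    tags: list[str] = field(default_factory=list)  # Fresh list per instance


s1 = Settings("first")
s2 = Settings("second", tags=["x"])

try:
    s1.name = "renamed"  # frozen=True turns assignment into an error
    frozen_ok = False
except FrozenInstanceError:
    frozen_ok = True

# frozen only blocks rebinding attributes; the list itself is still mutable
s1.tags.append("y")

print(asdict(s1), asdict(s2), frozen_ok)
```

Note that freezing is shallow: the contents of a mutable field can still change, which matters if cached properties depend on them.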

In [None]:
# Handling numpy arrays in pydantic

import numpy as np
from pydantic import ConfigDict
from pydantic_numpy.model import NumpyModel
from pydantic_numpy.typing import NpNDArray  # Array of any type or dimensionality


class SerializableWithArray(NumpyModel):
    model_config = ConfigDict(validate_assignment=True, frozen=False, extra="forbid")

    metadata: dict[str, str]
    arr1: NpNDArray
    arr2: NpNDArray


d = SerializableWithArray(
    metadata={"hello": "arrays"}, arr1=np.ones(2), arr2=np.zeros(2)
)
d_serialized = d.model_dump_json(indent=2)
d_deserialized = SerializableWithArray.model_validate_json(d_serialized)
print(d_serialized)

In [None]:
# Pydantic with numpy and pint

from pydantic import Field, model_validator
from pint import Quantity, UnitRegistry
from typing import Union, Any

ureg = UnitRegistry()
Q_ = ureg.Quantity


class SerializableQuantity(NumpyModel):
    model_config = ConfigDict(validate_assignment=True, frozen=False, extra="forbid")
    
    # NOTE: discuss tagged vs. untagged unions
    magnitude: Union[float, NpNDArray] = Field(union_mode="left_to_right")
    units: str

    @model_validator(mode="before")
    @classmethod
    def validate(cls, v: Any) -> Any:
        # Allow assigning a Quantity to a field of this type
        if isinstance(v, Quantity):
            return {"magnitude": v.m, "units": str(v.units)}
        else:
            return v

    def as_quantity(self) -> Quantity:
        return Quantity(self.magnitude, self.units)


class UsesSerializableQuantity(NumpyModel):
    metadata: dict[str, str]
    q1: SerializableQuantity
    q2: SerializableQuantity


b = UsesSerializableQuantity(
    metadata={"hello": "arrays"}, q1=Q_(1.0, "m"), q2=Q_(np.zeros(2), "kg")
)
b_ser = b.model_dump_json(indent=2)
b_des = UsesSerializableQuantity.model_validate_json(b_ser)
print(b_ser)
print(b)
print(b_des)

print(f"\nAccess value with units like this: b.q1.as_quantity() = {b.q1.as_quantity()}")

In [None]:
# Validation, floats, & discriminated unions
from pydantic import BaseModel, EmailStr, Field, field_validator, model_validator
from typing import Any, Callable, Literal

class Left(BaseModel):
    kind: Literal["left"] = "left"

class Right(BaseModel):
    kind: Literal["right"] = "right"

class MyValidated(BaseModel):
    
    # Simple field validation has shorthand
    a: int = Field(gt=0)
    # Some common but not-so-simple validations are available as types
    b: EmailStr
    # Custom validation functions can be provided
    c: list[str]
    # By default, non-finite numbers are not handled because the json spec doesn't include them
    d: float = Field(allow_inf_nan=True)
    # Unions can be distinguished by a discriminator field
    e: Left | Right = Field(discriminator="kind")

    # Some fields can be kept for internal use.
    # This field is not serialized or initialized by pydantic,
    # and is instead initialized by user code in the after-validator (post-init function).
    _f: Callable[[int], str]

    @field_validator("c")
    @classmethod
    def validate_c(cls, v: Any) -> Any:
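        # Raise explicitly rather than assert, since asserts are skipped under `python -O`
        if len(v) == 0:
            raise ValueError("c must not be empty")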
        return v
    
    @model_validator(mode="before")
    @classmethod
    def validate_input(cls, v: Any) -> Any:
        # This runs on the inputs before the instance is constructed.
        # We can do things like check for software version compatibility here before parsing a json
        return v
    
    @model_validator(mode="after")
    def post_init(self) -> MyValidated:
        # This runs on an instance after pydantic is done constructing it.
        def f(i: int) -> str:
            return self.c[i]
        self._f = f

        return self
        