# Understanding the Python Data Model: Metaclasses and ABCs

## Python Objects
- _"Objects are Python’s abstraction for data. All data in a Python program is represented by objects or by relations between objects"_
- Every `object` has:
    + **Identity**: it never changes once the object has been created (_~= memory address_)
    + **Type**: a reference to another object (its `class` or `type`) which determines the operations that the object supports. In theory, it shouldn't change after creation
    + **Value**: 
        + _usually_ just a mapping from strings to counted object references (`__dict__`). It supports definition of new attributes at any time
        + _but_ some classes are more restricted (e.g. slotted classes, builtins and types defined in CPython extensions)
- The _type_ of an object is another object referenced in the `__class__` attribute
    - _Usually_ `type(foo) == foo.__class__`    
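
As a minimal sketch of the three properties above (the `Point` and `SlottedPoint` classes are hypothetical names used only for illustration), the following snippet checks that identity and type stay fixed while the value changes, and that a slotted class rejects new attributes:

```python
class Point:
    """Regular class: instances store their value in a per-instance __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    """Slotted class: instances have no __dict__, only the declared slots."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
identity = id(p)
p.z = 3  # new attributes can be added at any time

same_identity = id(p) == identity  # identity is unaffected by value changes
same_type = type(p) is p.__class__ is Point
value = dict(vars(p))  # the "value" lives in p.__dict__

s = SlottedPoint(1, 2)
try:
    s.z = 3  # "z" is not a declared slot
except AttributeError:
    slotted_accepts_new_attribute = False
else:
    slotted_accepts_new_attribute = True

print(f"{same_identity = }, {same_type = }, {value = }")
print(f"{slotted_accepts_new_attribute = }, {hasattr(s, '__dict__') = }")
```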

## Python Classes

- A Python class is just another Python object:
    + It stores shared definitions for all instances (dunder and regular methods, class variables)
    + It works as a factory of new instances:
        + Its `__call__()` method creates new objects whose `__class__` attribute points back to the class
        + The default `type.__call__(cls, ...)` implementation calls `cls.__new__(cls, ...)` and then, if the result is an instance of `cls`, calls `__init__(...)` on it
- Python syntax is translated into _dunder_ method calls that are looked up in the class of the object (`instance.__class__` and its bases, one level of abstraction above the calling instance), not in the instance itself
    + `foo + bar` => `foo.__class__.__add__(foo, bar)`

- The class of a class, a _metaclass_, is also a class (and therefore it essentially follows the same pattern)
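
The two mechanisms above can be observed directly. This is a small sketch with a hypothetical `Vec` class: it records the order in which `__new__()` and `__init__()` run when the class is called, and shows that the `+` operator ignores an `__add__` attribute stored on the instance because the lookup happens on the type:

```python
calls = []


class Vec:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, x):
        calls.append("__init__")
        self.x = x

    def __add__(self, other):
        return Vec(self.x + other.x)


v = Vec(1)  # handled by type.__call__(Vec, 1)
calls_for_one_instance = list(calls)

w = Vec(2)
# An attribute stored on the instance is ignored by the `+` operator
w.__add__ = lambda other: "instance attribute"
total = w + v  # looked up as type(w).__add__(w, v)

print(f"{calls_for_one_instance = }")
print(f"{type(total).__name__ = }, {total.x = }")
print(f"{type(Vec) = }")
```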

## Python Data Model

<img src="images/meta-diagram.svg" width="1400">

**Note:** _"The type of all types is `type`"_  is not _strictly_ true. The opposite case is only useful for very narrow use cases


## Metaclasses: Implementing custom class behaviour

- Example: we want to use `A + B` as alternative syntax for `A | B`:
    ```python
    A + B == A | B == Union[A, B]
    ```
- Where should the `__add__()` method be defined?
    ```python
    class A:
        def __add__(self, other):
            ...
    ```
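    - **NO!!** This only defines `+` for _instances_ of `A`: `a = A(); b = B(); c = a + b`
    - To support `A + B` on the classes themselves, `__add__()` must be defined in _their_ class, i.e. in a custom metaclass!!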
    

In [1]:
from typing import Union

class CustomMeta(type):
    def __add__(cls, other):
        return Union[cls, other]

# 'metaclass' keyword sets the metaclass of a new class
class A(metaclass=CustomMeta):
    pass

# Inheriting from classes with custom metaclass
# also sets the metaclass of the new class
class B(A):
    pass

print(f"{type(A) = }, {type(B) = }")

print(f"{A + B = }")

print("\nCreating A instances...")
a = A()
b = B()
print(a)
# This should fail:
#print(a + b)

type(A) = <class '__main__.CustomMeta'>, type(B) = <class '__main__.CustomMeta'>
A + B = typing.Union[__main__.A, __main__.B]

Creating A instances...
<__main__.A object at 0x7f6b00069d80>


## Implementing custom class behaviour with `type`
- The standard `type` metaclass provides some hooks to customize the class behavior directly in the class, to avoid excessive use of small metaclasses:
    - _(auto-classmethod)_ `object.__init_subclass__(cls, **kwargs)`: Called whenever the containing `class` is subclassed. `cls` is then the new subclass.


In [2]:
class Parent:
    def __init_subclass__(cls, /, **kwargs):
        print(f"__init_subclass__({cls}, {kwargs})")
        super().__init_subclass__(**kwargs)
        if "FOO" not in cls.__dict__:
            cls.FOO = "Default!!"
        
class Child1(Parent):
    FOO = 42

class Child2(Child1):
    pass

print(f"\n{Child1.FOO = }, {Child2.FOO = }")

__init_subclass__(<class '__main__.Child1'>, {})
__init_subclass__(<class '__main__.Child2'>, {})

Child1.FOO = 42, Child2.FOO = 'Default!!'
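
The `**kwargs` received by `__init_subclass__()` are the keyword arguments written in the `class` statement (other than `metaclass`). As a hedged sketch, a hypothetical `PluginBase` class could use them to build a registry of subclasses; the names `PluginBase`, `plugin_name`, `CsvPlugin` and `Helper` are assumptions made for this example:

```python
class PluginBase:
    registry = {}

    def __init_subclass__(cls, /, plugin_name=None, **kwargs):
        # Forward the remaining keyword arguments along the MRO
        super().__init_subclass__(**kwargs)
        if plugin_name is not None:
            PluginBase.registry[plugin_name] = cls


class CsvPlugin(PluginBase, plugin_name="csv"):
    pass


class Helper(PluginBase):  # no keyword argument: not registered
    pass


print(f"{PluginBase.registry = }")
```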


## Implementing custom class behaviour with `type`
- `type` metaclass customization hooks:
    - _(auto-classmethod)_ `object.__class_getitem__(cls, key)`: Return an object representing the specialization of a generic class by type arguments found in key.
        + It has lower priority than `__getitem__` in the metaclass
        + It is used mostly for run-time implementation of `Generic` typing annotations

In [3]:
class Parent:
    def __class_getitem__(cls, key):
        if isinstance(key, str):
            return [sub for sub in cls.__subclasses__() if sub.__name__ == key]
        else:
            return cls.__subclasses__()[key]
        
class Child0(Parent): ...
class Child1(Parent): ...
class Child2(Parent): ...

print(f"{Parent[0:-1] = }")

print(f"\n{Parent['Child2'] = }")

Parent[0:-1] = [<class '__main__.Child0'>, <class '__main__.Child1'>]

Parent['Child2'] = [<class '__main__.Child2'>]
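
The priority rule mentioned above can be checked with a short sketch (the `Meta`, `WithBoth` and `OnlyHook` names are hypothetical): when the metaclass defines `__getitem__()`, subscripting the class never reaches `__class_getitem__()`.

```python
class Meta(type):
    def __getitem__(cls, key):
        return f"Meta.__getitem__({key!r})"


class WithBoth(metaclass=Meta):
    def __class_getitem__(cls, key):
        return f"__class_getitem__({key!r})"


class OnlyHook:
    def __class_getitem__(cls, key):
        return f"__class_getitem__({key!r})"


with_both = WithBoth["x"]  # resolved by the metaclass
only_hook = OnlyHook["x"]  # falls back to __class_getitem__

print(f"{with_both = }")
print(f"{only_hook = }")
```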


## Metaclasses: Implementing custom class behaviour
- With metaclasses it is also possible to override class-related built-ins like `isinstance()` and `issubclass()`
    + `class.__instancecheck__(cls, instance)`: Return `True` if `instance` should be considered a (direct or indirect) instance of `class`.
    + For performance reasons, `isinstance()` returns `True` without calling it when `type(instance) is cls`


In [4]:
import dataclasses, types

class CustomMeta(type):
    def __instancecheck__(cls, instance):
        print(f"__instancecheck__({cls}, {instance})")
        if hasattr(cls, "DISCRIMINATING_ATTR"):
            assert isinstance(cls.DISCRIMINATING_ATTR, str)
            return getattr(instance, cls.DISCRIMINATING_ATTR, False) is True
        return False

class HappyClass(metaclass=CustomMeta):
    DISCRIMINATING_ATTR = "happy"

a = types.SimpleNamespace(happy=True)
b = types.SimpleNamespace(value=42)

@dataclasses.dataclass
class Foo:
    happy: bool

foo = Foo(True)

print(f"{isinstance(HappyClass(), HappyClass) = }")  # Skip custom __instancecheck__
print(f"{isinstance(a, HappyClass) = }")
print(f"{isinstance(b, HappyClass) = }")
print(f"{isinstance(foo, HappyClass) = }")


isinstance(HappyClass(), HappyClass) = True
__instancecheck__(<class '__main__.HappyClass'>, namespace(happy=True))
isinstance(a, HappyClass) = True
__instancecheck__(<class '__main__.HappyClass'>, namespace(value=42))
isinstance(b, HappyClass) = False
__instancecheck__(<class '__main__.HappyClass'>, Foo(happy=True))
isinstance(foo, HappyClass) = True


## Metaclasses: Implementing custom class behaviour
- With metaclasses it is also possible to override class-related built-ins like `isinstance()` and `issubclass()`
    + `class.__subclasscheck__(cls, subclass)`: Return `True` if `subclass` should be considered a (direct or indirect) subclass of `class`.


In [5]:
class CustomMeta(type):
    def __subclasscheck__(cls, subclass):
        print(f"__subclasscheck__({cls}, {subclass})")
        name = cls.__name__.split(".")[-1]
        sub_name = subclass.__name__.split(".")[-1]
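        # The length guard avoids an IndexError when both names are equal,
        # e.g. issubclass(Foo, Foo)
        return (
            sub_name.startswith(name)
            and len(sub_name) > len(name)
            and sub_name[len(name)].isupper()
        )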

class Foo(metaclass=CustomMeta):
    pass

class Bar(Foo):
    pass

class FooSomething:
    pass

class NotFooSomething:
    pass

print(f"{issubclass(Bar, Foo) = }")
print(f"{issubclass(FooSomething, Foo) = }")
print(f"{issubclass(NotFooSomething, Foo) = }")

__subclasscheck__(<class '__main__.Foo'>, <class '__main__.Bar'>)
issubclass(Bar, Foo) = False
__subclasscheck__(<class '__main__.Foo'>, <class '__main__.FooSomething'>)
issubclass(FooSomething, Foo) = True
__subclasscheck__(<class '__main__.Foo'>, <class '__main__.NotFooSomething'>)
issubclass(NotFooSomething, Foo) = False


## Metaclasses: customizing class creation    
- Customization can happen at multiple points
- For subclasses of `type`, the simplest way is to use the same mechanisms available for customizing instances, since classes are just instances of the metaclass:
    - `__new__()` -> before the _instance_ (a new class in this case) is created 
    - `__init__()` -> after the instance has been created (weird usage in this case)

In [6]:
import abc

class MyMeta(type):
    def __new__(*args, **kwargs):
        print(f"MyType.__new__({args}, {kwargs})")
        return type.__new__(*args, **kwargs)
    
    def __init__(cls, *args, **kwargs):
        print(f"\nMyType.__init__({cls}, {args}, {kwargs})")
        cls.VALUE = 0
        return None
    

class A(metaclass=MyMeta):
    VALUE = 42

class ABC(metaclass=abc.ABCMeta):
    pass
    
class B(abc.ABC): pass
    
B.__class__

#print(f"\n{A.VALUE = }")

MyType.__new__((<class '__main__.MyMeta'>, 'A', (), {'__module__': '__main__', '__qualname__': 'A', 'VALUE': 42}), {})

MyType.__init__(<class '__main__.A'>, ('A', (), {'__module__': '__main__', '__qualname__': 'A', 'VALUE': 42}), {})


abc.ABCMeta
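
A more compact variant of the same idea, as a sketch with hypothetical names (`TracingMeta`, `Traced`): it records the order in which the metaclass `__new__()` and `__init__()` run, and shows that an attribute assigned in the metaclass `__init__()` overwrites the value set in the class body.

```python
events = []


class TracingMeta(type):
    def __new__(mcls, name, bases, ns, **kwargs):
        events.append(("__new__", name))
        return super().__new__(mcls, name, bases, ns, **kwargs)

    def __init__(cls, name, bases, ns, **kwargs):
        events.append(("__init__", name))
        super().__init__(name, bases, ns)
        cls.VALUE = 0  # runs after the class body has been processed


class Traced(metaclass=TracingMeta):
    VALUE = 42


print(f"{events = }")
print(f"{Traced.VALUE = }")
```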

## Metaclasses: customizing class creation
- The `class` keyword is just syntactic sugar for calling the metaclass
    ```python
    class MyType(A, B, C, metaclass=MyMeta, kwarg2=33):
        VALUE = 42        
        def x(self, b): return ...
    ```
    is roughly equivalent to ([3.3.3.1. Metaclasses](https://docs.python.org/3/reference/datamodel.html#metaclasses)):
    ```python
    # MRO entries are resolved
    bases = solve_mro_entries(A, B, C)  # For regular bases == (A, B, C)
    # the proper metaclass (the most derived `type` subclass) is selected
    kwargs = dict(metaclass=MyMeta, kwarg2=33)
    metaclass = kwargs.pop("metaclass", type)
    # the class namespace is prepared
    body_ns = metaclass.__prepare__("MyType", bases, **kwargs) # == dict()
    # the class body is executed
    exec(SOURCE_LINES[1:4], globals(), body_ns)
    # the class object is created
    new_class = metaclass("MyType", bases, body_ns, **kwargs)
    ```
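
The standard library exposes most of these steps through `types.resolve_bases()` and `types.prepare_class()`, so the pseudocode can be turned into a runnable sketch. The `MyMeta` and `A` definitions and the `BODY` string below are stand-ins assumed for this example; in particular, this `MyMeta` consumes the extra class keyword itself instead of forwarding it.

```python
import types


class MyMeta(type):
    def __new__(mcls, name, bases, ns, **kwargs):
        # Consume the extra class keywords here: forwarding them to
        # type.__new__ would pass them on to object.__init_subclass__(),
        # which rejects unknown keyword arguments.
        cls = super().__new__(mcls, name, bases, ns)
        cls.class_kwargs = kwargs
        return cls

    def __init__(cls, name, bases, ns, **kwargs):
        super().__init__(name, bases, ns)


class A:
    pass


# Stand-in for the source code of the class body
BODY = """
VALUE = 42
def x(self, b):
    return b
"""

bases = types.resolve_bases((A,))  # unchanged for regular classes
metaclass, namespace, kwds = types.prepare_class(
    "MyType", bases, {"metaclass": MyMeta, "kwarg2": 33}
)
exec(BODY, globals(), namespace)  # run the class body in the prepared namespace
MyType = metaclass("MyType", bases, namespace, **kwds)

print(f"{type(MyType) = }, {MyType.__bases__ = }")
print(f"{MyType.VALUE = }, {MyType.class_kwargs = }")
```

`types.prepare_class()` returns the selected metaclass, the namespace produced by `__prepare__()` and the remaining keywords with `metaclass` already removed, which is why `kwds` can be passed straight to the final call.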
    

In [8]:
import pprint

class FooMeta(type):
    def __new__(mcls, name, bases, ns, **kwargs):
        new_ns = {}
        filtered = {}
        for key, value in ns.items():
            if key in ["foo", "bar"]:
                filtered[key] = value
            else:
                new_ns[key] = value
        new_ns["__filtered__"] = filtered
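        # Delegate to type.__new__ with mcls so that type(Foo) is FooMeta
        # (calling type(...) directly would create a plain `type` instance)
        return super().__new__(mcls, name, bases, new_ns, **kwargs)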

class Foo(metaclass=FooMeta):
    def foo(self):
        return 42
    
    def bar(self):
        return 42
    
    def something(self):
        return 42

pprint.pprint(vars(Foo))

mappingproxy({'__dict__': <attribute '__dict__' of 'Foo' objects>,
              '__doc__': None,
              '__filtered__': {'bar': <function Foo.bar at 0x7f6b00070dc0>,
                               'foo': <function Foo.foo at 0x7f6b00070550>},
              '__module__': '__main__',
              '__weakref__': <attribute '__weakref__' of 'Foo' objects>,
              'something': <function Foo.something at 0x7f6b00071b40>})


## abc — Abstract Base Classes module

- A module of the standard library providing the infrastructure for defining abstract base classes (ABCs) ([PEP 3119](https://peps.python.org/pep-3119))
- Used in other modules of the standard library
    - [`collections.abc`](https://docs.python.org/3/library/collections.abc.html) module (_Abstract Base Classes for Containers_) (motivating use case discussed in PEP 3119)
    - [`numbers`](https://docs.python.org/3/library/numbers.html) module (_Numeric abstract base classes_) ([PEP 3141](https://peps.python.org/pep-3141))
        + Number :> Complex :> Real :> Rational :> Integral
- Contents:
    - `@abstractmethod` decorator to flag methods as _abstract_
    - `ABCMeta` metaclass (and the `ABC` helper class to inherit from as a shortcut)
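
A few checks against the standard library ABCs mentioned above illustrate how they are used in practice. This is only a sketch; the variable names are arbitrary.

```python
import collections.abc
import numbers
from fractions import Fraction

# Numeric tower: Number :> Complex :> Real :> Rational :> Integral
int_is_integral = isinstance(1, numbers.Integral)
integral_is_rational = issubclass(numbers.Integral, numbers.Rational)
fraction_is_rational = isinstance(Fraction(1, 3), numbers.Rational)
fraction_is_integral = isinstance(Fraction(1, 3), numbers.Integral)
float_is_real = isinstance(1.5, numbers.Real)
float_is_rational = isinstance(1.5, numbers.Rational)
complex_is_real = isinstance(1j, numbers.Real)

# Container ABCs: builtins pass the checks without inheriting from them
list_is_sequence = issubclass(list, collections.abc.Sequence)
sequence_in_list_mro = collections.abc.Sequence in list.__mro__
dict_is_mapping = isinstance({}, collections.abc.Mapping)

print(f"{int_is_integral = }, {fraction_is_rational = }, {float_is_real = }")
print(f"{fraction_is_integral = }, {float_is_rational = }, {complex_is_real = }")
print(f"{list_is_sequence = }, {sequence_in_list_mro = }, {dict_is_mapping = }")
```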
    

## Abstract Base Classes and Abstract Methods

- Methods decorated with `@abc.abstractmethod` **must** be overridden in subclasses that are meant to be instantiated
    - If an implementation is not provided, the subclass remains _abstract_ and therefore _non-instantiable_
    - The goal is to detect classes with partial implementation of interfaces even if the missing methods are not called

- `abc.abstractmethod()` can be combined with other common method decorators (`classmethod`, `staticmethod`, `property`); in that case it should be applied as the innermost decorator

In [9]:
import abc

class MyABC(abc.ABC): # Equivalent to: class MyABC(metaclass=abc.ABCMeta)
    @abc.abstractmethod
    def foo(self, number):
        ...

class MyClass(MyABC):
    def foo(self, number):
        return number * 2

print(f"{MyClass = }")
print(f"{MyClass() = }\n")
    
class MyClass2(MyABC):
    pass

print(f"{MyClass2 = }")
print(f"{MyClass2() = }")

MyClass = <class '__main__.MyClass'>
MyClass() = <__main__.MyClass object at 0x7f6b0006b670>

MyClass2 = <class '__main__.MyClass2'>


TypeError: Can't instantiate abstract class MyClass2 with abstract method foo
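
Combining `abstractmethod` with `property` and `classmethod` could look like the following sketch. The `Shape`, `Square` and `Incomplete` classes are hypothetical; `abstractmethod` is placed as the innermost decorator.

```python
import abc


class Shape(abc.ABC):
    @property
    @abc.abstractmethod
    def area(self):
        ...

    @classmethod
    @abc.abstractmethod
    def unit(cls):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    @property
    def area(self):
        return self.side**2

    @classmethod
    def unit(cls):
        return cls(1)


class Incomplete(Shape):  # `unit` is still abstract here
    @property
    def area(self):
        return 0


try:
    Incomplete()
except TypeError as error:
    incomplete_error = error
else:
    incomplete_error = None

print(f"{sorted(Shape.__abstractmethods__) = }")
print(f"{Square.unit().area = }")
print(f"{sorted(Incomplete.__abstractmethods__) = }")
```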

## Abstract Base Classes and custom subclass checks

- `ABCMeta` class defines custom `__subclasscheck__()` and `__instancecheck__()` methods with extra functionality:
    - `ABCMeta.__subclasscheck__()` calls `class.__subclasshook__()` class method
    - Results of previous positive and negative checks are cached
    - Specific classes can be manually _registered_ as _virtual_ subclasses of the ABC
    - `isinstance(x, B)` is equivalent to `issubclass(x.__class__, B)` (or `issubclass(type(x), B)`)


- `__subclasshook__()` works as follows:
  ```python
  @classmethod
  def __subclasshook__(cls, C): ...
  ```
- Returning `True` -> `C` is considered a subclass of the ABC
- Returning `False` -> `C` is **not** considered a subclass of the ABC
- Returning `NotImplemented` -> the subclass check continues with the usual mechanism


In [10]:
import abc

# Previous example with custom metaclass:  CustomMeta.__subclasscheck__(...)
# __subclasshook__ provides the same functionality but it is defined in the
# class body, not in the metaclass
class Foo(abc.ABC):
    @classmethod
    def __subclasshook__(cls, C):
        print(f"__subclasshook__({cls}, {C})")
        name = cls.__name__.split(".")[-1]
        C_name = C.__name__.split(".")[-1]
        if (
            C_name.startswith(f"{name}")
            and len(C_name) > len(name)
            and C_name[len(name)].isupper()
        ):
            return True

        return NotImplemented

In [11]:
class Bar(Foo): ...

class FooSomething: ...
class NotFooSomething: ...
class BarSomething: ...

print(f"{isinstance(Foo(), Foo) = }\n")

print(f"{issubclass(Bar, Foo) = }")
print(f"{isinstance(Bar(), Foo) = }\n")

print(f"{issubclass(FooSomething, Foo) = }, {isinstance(FooSomething(), Foo) = }\n")

# Even if a class is not a direct subclass of Foo and is not accepted by
# Foo.__subclasshook__(), it might still be considered a subclass of Foo
# if it satisfies the __subclasshook__() of any of the subclasses of Foo
print(f"{issubclass(NotFooSomething, Foo) = }\n")

print(f"{issubclass(BarSomething, Bar) = }")
print(f"{issubclass(BarSomething, Foo) = }")


isinstance(Foo(), Foo) = True

__subclasshook__(<class '__main__.Foo'>, <class '__main__.Bar'>)
issubclass(Bar, Foo) = True
isinstance(Bar(), Foo) = True

__subclasshook__(<class '__main__.Foo'>, <class '__main__.FooSomething'>)
issubclass(FooSomething, Foo) = True, isinstance(FooSomething(), Foo) = True

__subclasshook__(<class '__main__.Foo'>, <class '__main__.NotFooSomething'>)
__subclasshook__(<class '__main__.Bar'>, <class '__main__.NotFooSomething'>)
issubclass(NotFooSomething, Foo) = False

__subclasshook__(<class '__main__.Bar'>, <class '__main__.BarSomething'>)
issubclass(BarSomething, Bar) = True
__subclasshook__(<class '__main__.Foo'>, <class '__main__.BarSomething'>)
issubclass(BarSomething, Foo) = True
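
A common variation checks for the presence of a method instead of a naming convention, and guards the hook with `cls is ...` so that classes inheriting from the ABC do not apply the same structural test again. This is a sketch with hypothetical names (`Closeable`, `File`, `Text`):

```python
import abc


class Closeable(abc.ABC):
    @classmethod
    def __subclasshook__(cls, C):
        # Only apply the structural check to Closeable itself
        if cls is Closeable:
            if any("close" in B.__dict__ for B in C.__mro__):
                return True
        # Fall back to the usual mechanism (inheritance and registration)
        return NotImplemented


class File:
    def close(self):
        return None


class Text:
    pass


file_is_closeable = issubclass(File, Closeable)
file_instance_is_closeable = isinstance(File(), Closeable)
text_is_closeable = issubclass(Text, Closeable)

print(f"{file_is_closeable = }, {file_instance_is_closeable = }")
print(f"{text_is_closeable = }")
```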


## Abstract Base Classes and custom subclass checks

- `register(cls)` takes one class argument and registers it as a _virtual_ subclass of the ABC
- After the call `B.register(C)`, the call `issubclass(C, B)` will return `True`


In [11]:
from abc import ABC

class MyABC(ABC):
    pass

print(f"{issubclass(tuple, MyABC) = }")
print(f"{isinstance((), MyABC) = }\n")

MyABC.register(tuple)

print(f"{issubclass(tuple, MyABC) = }")
print(f"{isinstance((), MyABC) = }")


issubclass(tuple, MyABC) = False
isinstance((), MyABC) = False

issubclass(tuple, MyABC) = True
isinstance((), MyABC) = True
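
`register()` returns the class it receives, so it can also be used as a class decorator. The sketch below (with hypothetical `Serializer` and `JsonSerializer` classes) also shows that a virtual subclass passes the checks without inheriting anything from the ABC:

```python
from abc import ABC


class Serializer(ABC):
    def describe(self):
        return "serializer"


@Serializer.register
class JsonSerializer:
    pass


is_virtual_subclass = issubclass(JsonSerializer, Serializer)
is_virtual_instance = isinstance(JsonSerializer(), Serializer)
abc_in_mro = Serializer in JsonSerializer.__mro__
inherits_methods = hasattr(JsonSerializer, "describe")

print(f"{is_virtual_subclass = }, {is_virtual_instance = }")
print(f"{abc_in_mro = }, {inherits_methods = }")
```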


## Abstract Base Classes and Protocols

- Typing Protocols are defined in [PEP 544](https://peps.python.org/pep-0544)
    - `Protocol` implementation is just an ABC with a custom `_ProtocolMeta` metaclass inheriting from `ABCMeta`
- They solve more or less the same problems as ABCs, but by means of a static type checker:
    - Define the interface that any class should implement to be considered a subclass of the protocol
    - Subclasses don't need to inherit directly from the protocol class
- But they are different from ABCs:
    - `ABC`s **only** provide runtime checks; static type checkers only recognize direct (explicit) subclasses of the ABC
    - `Protocol`s are meant to be used with a static type checker and they do not define runtime checks by default
    - If the `Protocol` contains data (non-method) members:
        + `isinstance(x, Protocol)` is **NOT** equivalent to `issubclass(x.__class__, Protocol)` 


In [12]:
from typing import Protocol

class SupportsClose(Protocol):
    def close(self) -> None:
        ...

class A:
    def close(self) -> None:
        return None

class B:
    close = print

class C:
    close = sum

class D:
    close = None

instanceA: SupportsClose = A()
instanceB: SupportsClose = B()
instanceC: SupportsClose = C()
instanceD: SupportsClose = D()

class S(SupportsClose):
    pass

instanceS: SupportsClose = S()


[https://mypy-play.net/?mypy=latest&python=3.10&gist=6338fd8b19db825bcb14d01e82d55ab1](https://mypy-play.net/?mypy=latest&python=3.10&gist=6338fd8b19db825bcb14d01e82d55ab1)

In [13]:
from typing import Protocol, List

class Template(Protocol):
    name: str        # This is a protocol member
    value: int = 0   # This one too (with default)

    def method(self) -> None:
        self.temp: List[int] = [] # Error in type checker

class Concrete:
    def __init__(self, name: str, value: int) -> None:
        self.name = name
        self.value = value

    def method(self) -> None:
        return

var: Template = Concrete('value', 42)  # OK

[https://mypy-play.net/?mypy=latest&python=3.10&gist=26b78d56b12e9ff56db4842634bcedf4](https://mypy-play.net/?mypy=latest&python=3.10&gist=26b78d56b12e9ff56db4842634bcedf4)

## Runtime-checkable Protocols

- The optional `@runtime_checkable` decorator provides runtime checks by defining a valid `ABCMeta.__subclasshook__()`
- **Warning:** `isinstance()` checks with `@runtime_checkable` Protocols have serious performance issues
    - Some extra checks required for specific `typing` module implementation details
    - If the `Protocol` contains data (non-method) members:
        - `isinstance(x, Protocol)` is **NOT** equivalent to `issubclass(x.__class__, Protocol)` ...
        - which means `isinstance()`  results are not cached per class but recomputed every time
        - `issubclass()` checks are not supported
    

In [14]:
from typing import Protocol, List, runtime_checkable

@runtime_checkable
class Template(Protocol):
    name: str        # This is a protocol member
    value: int = 0   # This one too (with default)

    def method(self) -> None:
        self.temp: List[int] = [] # Error in type checker

class Concrete:
    def __init__(self, name: str, value: int) -> None:
        self.name = name
        self.value = value

    def method(self) -> None:
        return

var: Template = Concrete('value', 42)  # OK
    
print(f"{isinstance(var, Template) = }")
print(f"{issubclass(var.__class__, Template) = }")


isinstance(var, Template) = True


TypeError: Protocols with non-method members don't support issubclass()
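
For protocols with only method members both checks are available, but the runtime check only looks for the presence of the members, not their signatures. A sketch with hypothetical classes (`Good`, `WrongSignature`, `NoClose`):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None:
        ...


class Good:
    def close(self) -> None:
        return None


class WrongSignature:
    # Rejected by a static type checker, but accepted at runtime
    def close(self, a, b, c):
        return 42


class NoClose:
    pass


good_instance = isinstance(Good(), SupportsClose)
wrong_signature_instance = isinstance(WrongSignature(), SupportsClose)
no_close_instance = isinstance(NoClose(), SupportsClose)
good_subclass = issubclass(Good, SupportsClose)  # allowed: method-only protocol

print(f"{good_instance = }, {wrong_signature_instance = }")
print(f"{no_close_instance = }, {good_subclass = }")
```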