# Chapter 24: Class Metaprogramming

## Classes as Objects

## `type`: The Built-In Class Factory

In [5]:
class MySuperClass:
    pass

class MyMixin:
    pass

class MyClass(MySuperClass, MyMixin):
    x = 42
    
    def x2(self):
        return self.x * 2
    
obj = MyClass()
print(obj.x)
print(obj.x2())

42
84


In [4]:
MyClass2 = type('MyClass',
                (MySuperClass, MyMixin),
                {'x': 42, 'x2': lambda self:self.x*2},
                )

obj2 = MyClass2()
print(obj2.x)
print(obj2.x2())

42
84
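
To make the equivalence between the two cells above concrete, the following sketch builds a class with the three-argument form of `type()` and inspects the result. The names `Base`, `Mixin`, and `Dynamic` are made up for this illustration.

```python
class Base:
    pass


class Mixin:
    pass


# type(name, bases, namespace) builds a new class at runtime.
Dynamic = type('Dynamic', (Base, Mixin), {'x': 42, 'x2': lambda self: self.x * 2})

obj = Dynamic()
print(Dynamic.__name__)                    # Dynamic
print(Dynamic.__bases__ == (Base, Mixin))  # True
print(type(Dynamic) is type)               # True: classes are instances of type
print(obj.x2())                            # 84
```

The first argument becomes the class's `__name__`, regardless of the variable the class is bound to. That is why `MyClass2` in the cell above still reports `'MyClass'` as its name.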


In [None]:
monkey = {'name': 'Monkey', 
        'attrs': ['age', 'weight','food']}

# How to create a Monkey class from this dict?
# attributes should not have default values and they'll
# be initialized in the __init__ method.

# Use a class factory function.

## A Class Factory Function

In [10]:
# tag::RECORD_FACTORY[]
from typing import Union, Any
from collections.abc import Iterable, Iterator

FieldNames = Union[str, Iterable[str]]  # <1>

def record_factory(cls_name: str, field_names: FieldNames) -> type[tuple]:  # <2>

    slots = parse_identifiers(field_names)  # <3>

    def __init__(self, *args, **kwargs) -> None:  # <4>
        attrs = dict(zip(self.__slots__, args))
        attrs.update(kwargs)
        for name, value in attrs.items():
            setattr(self, name, value)

    def __iter__(self) -> Iterator[Any]:  # <5>
        for name in self.__slots__:
            yield getattr(self, name)

    def __repr__(self):  # <6>
        values = ', '.join(f'{name}={value!r}'
            for name, value in zip(self.__slots__, self))
        cls_name = self.__class__.__name__
        return f'{cls_name}({values})'

    cls_attrs = dict(  # <7>
        __slots__=slots,
        __init__=__init__,
        __iter__=__iter__,
        __repr__=__repr__,
    )

    return type(cls_name, (object,), cls_attrs)  # <8>


def parse_identifiers(names: FieldNames) -> tuple[str, ...]:
    if isinstance(names, str):
        names = names.replace(',', ' ').split()  # <9>
    if not all(s.isidentifier() for s in names):
        raise ValueError('names must all be valid identifiers')
    return tuple(names)
# end::RECORD_FACTORY[]

In [17]:
Dog = record_factory('Dog', 'name weight owner')
rex = Dog('Rex', 30, 'Bob')
rex

Dog(name='Rex', weight=30, owner='Bob')

In [18]:
monkey = {'name': 'Monkey', 
        'attrs': ['age', 'weight','food']}

Monkey = record_factory(monkey['name'], monkey['attrs'])
leo = Monkey(3, 30, 'banana')
print(leo)

Monkey(age=3, weight=30, food='banana')
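
Two consequences of building the class with `__slots__` and `__iter__` are worth checking: instances can be unpacked, and they reject attributes that were not declared. The sketch below uses `make_record`, a hypothetical, trimmed-down cousin of `record_factory` written only so the block runs on its own.

```python
def make_record(cls_name, field_names):
    """Simplified stand-in for record_factory (an assumption, not the original)."""
    slots = tuple(field_names.split())

    def __init__(self, *args):
        for name, value in zip(self.__slots__, args):
            setattr(self, name, value)

    def __iter__(self):
        for name in self.__slots__:
            yield getattr(self, name)

    namespace = dict(__slots__=slots, __init__=__init__, __iter__=__iter__)
    return type(cls_name, (object,), namespace)


Dog = make_record('Dog', 'name weight owner')
rex = Dog('Rex', 30, 'Bob')

name, weight, owner = rex          # __iter__ makes unpacking possible
print(name, weight, owner)         # Rex 30 Bob

try:
    rex.color = 'brown'            # 'color' is not listed in __slots__
except AttributeError:
    print('cannot add attributes that are not in __slots__')

print(hasattr(rex, '__dict__'))    # False: slotted instances have no __dict__
```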


## Introducing `__init_subclass__`

## Enhancing Classes with a Class Decorator

A class decorator is a callable that behaves similarly to a function decorator: it gets the decorated class as an argument, and should return a class to replace the decorated class. Class decorators often return the decorated class itself, after injecting more methods into it via attribute assignment.

The most common reason to choose a class decorator over the simpler `__init_subclass__` is to avoid interfering with other class features, such as inheritance and metaclasses.
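
A minimal sketch of that pattern follows. The decorator `add_describe` and the class `Point` are hypothetical names chosen for the example; the decorator injects one method by attribute assignment and returns the same class object it received.

```python
def add_describe(cls):
    """Hypothetical class decorator: inject a describe() method into cls."""
    def describe(self):
        return f'{type(self).__name__} with {vars(self)}'

    cls.describe = describe   # attribute assignment on the class object
    return cls                # return the same class, now enhanced


@add_describe
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(p.describe())   # Point with {'x': 1, 'y': 2}
```

Because the decorator returns the original class, inheritance and the class's metaclass are left untouched.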

## What Happens When: Import Time Versus Runtime

At import time, the interpreter:

1. Parses the source code of a .py module in one pass from top to bottom. This is when a `SyntaxError` may occur.
2. Compiles the bytecode to be executed.
3. Executes the top-level code of the compiled module.

Although parsing and compiling are definitely "import time" activities, other things may happen at that time, because almost every statement in Python is executable in the sense that they can potentially run user code and may change the state of the user program.

In particular, the `import` statement is not merely a declaration, but it actually runs all the top-level code of a module when it is imported for the first time in the process. Further imports of the same module will use a cache, and then the only effect will be binding the imported objects to names in the client module. That top-level code may do anything, including actions typical of "runtime", such as writing to a log or connecting to a database. That's why the border between "import time" and "runtime" is fuzzy: the `import` statement can trigger all sorts of "runtime" behavior. Conversely, "import time" can also happen deep inside runtime, because the `import` statement and the `__import__()` built-in can be used inside any regular function.
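
The caching behavior can be observed with a throwaway module. In this sketch, `noisy_mod` and `import.log` are hypothetical names; the module's top-level code appends a line to a log file, so counting lines shows how many times the module body ran.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp, 'import.log')
    # Hypothetical module whose top-level code has a visible side effect.
    Path(tmp, 'noisy_mod.py').write_text(
        f'with open({str(log)!r}, "a") as f:\n'
        '    f.write("top-level code ran\\n")\n'
    )
    sys.path.insert(0, tmp)
    try:
        import noisy_mod             # first import: executes the module body
        import noisy_mod as again    # second import: served from sys.modules
        runs_after_imports = len(log.read_text().splitlines())
        importlib.reload(noisy_mod)  # an explicit reload executes the body again
        runs_after_reload = len(log.read_text().splitlines())
    finally:
        sys.path.remove(tmp)
        sys.modules.pop('noisy_mod', None)

print(runs_after_imports)     # 1
print(runs_after_reload)      # 2
print(again is noisy_mod)     # True: both names refer to the cached module
```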

### Evaluation Time Experiments

In [1]:
# Example 24-10 builderlib.py: top of the module

print('@ builderlib module start')

class Builder:
    print('@ Builder body')
    
    def __init_subclass__(cls) -> None:
        print(f'@ Builder.__init_subclass__({cls!r})')
        
        def inner_0(self):
            print(f'@ SuperA.__init_subclass__:inner_0({self!r})')
            
        cls.method_a = inner_0
            
    def __init__(self) -> None:
        super().__init__()
        print(f'@ Builder.__init__({self!r})')
        
def deco(cls):
    print(f'@ deco({cls!r})')
    
    def inner_1(self):
        print(f'@ deco:inner_1({self!r})')
    
    cls.method_b = inner_1
    return cls

# Example 24-11 builderlib.py: bottom of the module

class Descriptor:
    print('@ Descriptor body')
    
    def __init__(self) -> None:
        print(f'@ Descriptor.__init__({self!r})')
        
    def __set_name__(self, owner, name):
        args = (self, owner, name)
        # args is a tuple, so its repr already supplies the parentheses
        print(f'@ Descriptor.__set_name__{args!r}')
        
    def __set__(self, instance, value):
        args = (self, instance, value)
        print(f'@ Descriptor.__set__{args!r}')
        
    def __repr__(self) -> str:
        return '<Descriptor instance>'
    
print('@ builderlib module end')

@ builderlib module start
@ Builder body
@ Descriptor body
@ builderlib module end


1. `print` calls in a class body are executed at import time.
2. `print` calls inside a `def` body are not executed until the function is called.

In [None]:
# Example 24-12. evaldemo.py: script to experiment with builderlib.py

from builderlib import Builder, deco, Descriptor

print('# evaldemo module start')

@deco  # <1>
class Klass(Builder):  # <2>
    print('# Klass body')

    attr = Descriptor()  # <3>

    def __init__(self):
        super().__init__()
        print(f'# Klass.__init__({self!r})')

    def __repr__(self):
        return '<Klass instance>'


def main():  # <4>
    obj = Klass()
    obj.method_a()
    obj.method_b()
    obj.attr = 999

if __name__ == '__main__':
    main()

print('# evaldemo module end')

Example 24-13. Importing `evaldemo.py` at the Python prompt (this cannot be reproduced in the Jupyter notebook)

```python
>>> import evaldemo
@ builderlib module start  # 1
@ Builder body
@ Descriptor body
@ builderlib module end
# evaldemo module start
# Klass body  # 2
@ Descriptor.__init__(<Descriptor instance>) # 3
@ Descriptor.__set_name__(<Descriptor instance>,
    <class 'evaldemo.Klass'>, 'attr')  # 4
@ Builder.__init_subclass__(<class 'evaldemo.Klass'>) # 5
@ deco(<class 'evaldemo.Klass'>) # 6
# evaldemo module end
```

1. The top four lines are the result of `from builderlib import ...`. They will not appear if you import `evaldemo` again in the same console session, because `builderlib` is already loaded.

2. This signals that Python started reading the body of `Klass`. At this point, the class object does not exist yet.

3. The descriptor instance is created and bound to `attr` in the namespace that Python will pass to the default class object constructor: `type.__new__()`.

4. At this point, Python's built-in `type.__new__()` has created the `Klass` object and calls `__set_name__` on each descriptor instance of descriptor classes that provide that method, passing `Klass` as the `owner` argument.

5. `type.__new__` then calls `__init_subclass__` on `Builder`, passing `Klass` as the single argument.

6. When `type.__new__` returns the class object, Python applies the decorator `deco` to it. In this example, the class returned by `deco` is bound to `Klass` in the module namespace.
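
The same sequence can be checked inside a single cell. This is a condensed variant of the experiment, not the original files: instead of printing, each hook appends to an `events` list, which is an addition made for this sketch.

```python
events = []


class Descriptor:
    def __init__(self):
        events.append('Descriptor.__init__')

    def __set_name__(self, owner, name):
        events.append(f'Descriptor.__set_name__({name!r})')


class Builder:
    def __init_subclass__(cls):
        super().__init_subclass__()
        events.append('Builder.__init_subclass__')


def deco(cls):
    events.append('deco')
    return cls


@deco
class Klass(Builder):
    events.append('Klass body')   # runs while the class body is evaluated
    attr = Descriptor()


for event in events:
    print(event)
# Klass body
# Descriptor.__init__
# Descriptor.__set_name__('attr')
# Builder.__init_subclass__
# deco
```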

Example 24-14. Running `evaldemo.py` as a program (this cannot be reproduced in the Jupyter notebook)

```python
@ builderlib module start
@ Builder body
@ Descriptor body
@ builderlib module end
# evaldemo module start
# Klass body
@ Descriptor.__init__(<Descriptor instance>)
@ Descriptor.__set_name__(<Descriptor instance>, <class '__main__.Klass'>, 'attr')
@ Builder.__init_subclass__(<class '__main__.Klass'>) 
@ deco(<class '__main__.Klass'>) # 1
@ Builder.__init__(<Klass instance>) # 2
# Klass.__init__(<Klass instance>)
@ SuperA.__init_subclass__:inner_0(<Klass instance>)  # 3
@ deco:inner_1(<Klass instance>) # 4
@ Descriptor.__set__(<Descriptor instance>, <Klass instance>, 999) # 5
# evaldemo module end
```

1. The top 10 lines--including this one--are the same as shown in Example 24-13.

2. Triggered by `super().__init__()` in `Klass.__init__()`.

3. Triggered by `obj.method_a()` in `main`; `method_a` was injected by `Builder.__init_subclass__` (the printed message labels it `SuperA`).

4. Triggered by `obj.method_b()` in `main`; `method_b` was injected by `deco`.

5. Triggered by `obj.attr = 999` in `main`; the assignment was intercepted by `Descriptor.__set__`.

## Metaclasses 101

> [Metaclasses] are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don't (the people who actually need them know with certainty that they need them, and don't need an explanation about why).
> 
> -- Tim Peters, inventor of the Timsort algorithm, and author of the Zen of Python

![Both diagrams are true](./img/2024-01-05-12-02-49.png)

### How a Metaclass Customizes a Class

To use a metaclass, it's critical to understand how `__new__` works on any class.

The same mechanics happen at a "meta" level when a metaclass is about to create a new instance, which is a class. Consider this declaration:

```python
class Klass(SuperKlass, metaclass=MetaKlass):
    x = 42
    def __init__(self, y) -> None:
        self.y = y
```

To process that `class` statement, Python calls `MetaKlass.__new__` with these arguments:

- `meta_cls`: The metaclass itself (`MetaKlass`), because `__new__` works as a class method.

- `cls_name`: The string `'Klass'`.

- `bases`: The single-element tuple `(SuperKlass,)`, with more elements in the case of multiple inheritance.

- `cls_dict`: A mapping like: `{'x': 42, '__init__': <function __init__ at 0x7f9e9c2b9d30>}`


When you implement `MetaKlass.__new__`, you can inspect and change those arguments before passing them to `super().__new__`, which will eventually call `type.__new__` to create the new class object.

After `super().__new__` returns, you can also apply further processing to the newly created class before returning it to Python. By that point, `type.__new__` has already called `__set_name__` on the descriptors and `SuperKlass.__init_subclass__`, passing the class it created. Once your `__new__` returns, Python applies a class decorator to the class, if one is present. Finally, Python binds the class object to its name in the surrounding namespace--usually the global namespace of a module, if the `class` statement was a top-level statement.

What does "top-level statement" mean?

A top-level statement is a statement that is not nested inside any other statement. For example, a `class` statement written at the outermost level of a module is a top-level statement, but a `def` statement nested inside a `class` body is not.

The most common processing made in a metaclass `__new__` is to add or replace items in the `cls_dict`--the mapping that represents the namespace of the class under construction. For instance, before calling `super().__new__`, you can inject methods in the class under construction by adding functions to `cls_dict`. However, note that adding methods can also be done after the class is built, which is why we were able to do it using `__init_subclass__` or a class decorator.
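
As an illustration of injecting a method through `cls_dict`, here is a minimal sketch. `InjectingMeta` and `shout` are hypothetical names; the metaclass adds one function to the namespace before delegating to `super().__new__`.

```python
class InjectingMeta(type):
    """Hypothetical metaclass that adds a method to every class it builds."""

    def __new__(meta_cls, cls_name, bases, cls_dict):
        def shout(self):
            return f'{cls_name} says {self.x}!'

        cls_dict['shout'] = shout   # change the namespace before the class exists
        return super().__new__(meta_cls, cls_name, bases, cls_dict)


class Klass(metaclass=InjectingMeta):
    x = 42


print(Klass().shout())               # Klass says 42!
print(type(Klass) is InjectingMeta)  # True
```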

One attribute that you must add to the `cls_dict` before `type.__new__` runs is `__slots__`, as discussed on page 921. The `__new__` method of a metaclass is the ideal place to configure `__slots__`. The next section shows how to do that.
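
The timing constraint on `__slots__` can be seen without a metaclass. In this sketch, `Late` and `Early` are hypothetical classes: one gets `__slots__` assigned after the class exists, the other has it in the namespace passed to `type()`.

```python
class Late:
    pass


Late.__slots__ = ('x',)   # too late: the instance layout is already fixed

late = Late()
late.y = 1                # still allowed, instances keep their __dict__
print(hasattr(late, '__dict__'))    # True

# Here __slots__ is in the namespace when type.__new__ builds the class.
Early = type('Early', (), {'__slots__': ('x',)})

early = Early()
early.x = 1
try:
    early.y = 2           # 'y' is not a slot
except AttributeError:
    print('Early instances accept only the declared slots')
print(hasattr(early, '__dict__'))   # False
```

A metaclass `__new__` is in the same position as the `type()` call above: it sees the namespace before the class is built, so it can still add `__slots__` to it.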

### A Nice Metaclass Example

Example 24-15. metabunch/from3.6/bunch.py: MetaBunch metaclass and Bunch class

In [None]:
class MetaBunch(type):  # <1>
    def __new__(meta_cls, cls_name, bases, cls_dict):  # <2>

        defaults = {}  # <3>

        def __init__(self, **kwargs):  # <4>
            for name, default in defaults.items():  # <5>
                setattr(self, name, kwargs.pop(name, default))
            if kwargs:  # <6>
                extra = ', '.join(kwargs)
                raise AttributeError(f'No slots left for: {extra!r}')

        def __repr__(self):  # <7>
            rep = ', '.join(f'{name}={value!r}'
                            for name, default in defaults.items()
                            if (value := getattr(self, name)) != default)
            return f'{cls_name}({rep})'

        new_dict = dict(__slots__=[], __init__=__init__, __repr__=__repr__)  # <8>

        for name, value in cls_dict.items():  # <9>
            if name.startswith('__') and name.endswith('__'):  # <10>
                if name in new_dict:
                    raise AttributeError(f"Can't set {name!r} in {cls_name!r}")
                new_dict[name] = value
            else:  # <11>
                new_dict['__slots__'].append(name)
                defaults[name] = value
        return super().__new__(meta_cls, cls_name, bases, new_dict)  # <12>


class Bunch(metaclass=MetaBunch):  # <13>
    pass

1. To create a new metaclass, inherit from `type`.

2. `__new__` works as a class method, but the class is a metaclass, so I like to name the first argument `meta_cls` to make it clear that it refers to the metaclass, not to the class under construction. The remaining three arguments are the same as in the three-argument call to `type()` that creates a class directly.

3. `defaults` will hold a mapping of attribute names and their default values.

4. This will be injected into the new class.

5. Read the `defaults` and set the corresponding instance attribute with a value popped from `kwargs` or a default value.

6. If there is still any item in `kwargs`, it means there are no slots left where we can place them. We believe in failing fast as best practice, so we don't want to silently ignore extra items. The code raises an `AttributeError` that lists the unexpected names.

7. `__repr__` returns a string that looks like a constructor call--e.g., `Point(x=3)`, omitting the keyword arguments with default values.

8. Initialize the namespace for the new class.

9. Iterate over the namespace of the user class (`cls_dict`).

10. If a dunder `name` is found, copy the item to the new class namespace, unless it's already there. This prevents users from overwriting `__init__`, `__repr__`, and other attributes set by Python, such as `__qualname__` and `__module__`.

11. If not a dunder `name`, append to `__slots__` and save its `value` in `defaults`.

12. Build and return the new class.

13. Provide a base class, so users don't need to see `MetaBunch`.

`MetaBunch` works because it is able to configure `__slots__` before calling `super().__new__` to build the final class. (Inside `MetaBunch.__new__`, `super()` resolves to `type`, the only base class of `MetaBunch`.) As usual when metaprogramming, understanding the sequence of actions is key. Let's do another evaluation time experiment, now with a metaclass.

### Metaclass Evaluation Time Experiment

`method_c` was injected by `MetaKlass.__new__`, which runs before `Klass` is decorated with `deco`.

## Metaclass in the Real World

Metaclasses are powerful, but tricky. Before deciding to implement a metaclass, consider the following points.

### Modern Features Simplify or Replace Metaclasses

Over time, several common use cases of metaclasses were made redundant by new language features:

- Class decorator:
  - Simpler to understand than metaclasses, and less likely to cause conflicts with base classes and metaclasses.

- `__set_name__`:
  - Avoids the need for custom metaclass logic to automatically set the name of a descriptor.

- `__init_subclass__`:
  - Provides a way to customize creation that is transparent to the end user and even simpler than a decorator--but may introduce conflicts in a complex class hierarchy.

- Built-in `dict` preserving key insertion order:
  - Eliminated the #1 reason to use `__prepare__`: to provide an `OrderedDict` to store the namespace of the class under construction. Python only calls `__prepare__` on metaclasses, so if you needed to process the class namespace in the order it appears in the source code, you had to use a metaclass before Python 3.6.
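
Two of the features listed above can be sketched without any metaclass. All names here (`Positive`, `LineItem`, `Plugin`, and its subclasses) are hypothetical: the descriptor learns its attribute name through `__set_name__`, and the base class registers its subclasses through `__init_subclass__`.

```python
class Positive:
    """Hypothetical validating descriptor; __set_name__ tells it its own name."""

    def __set_name__(self, owner, name):
        self.storage_name = name

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f'{self.storage_name} must be > 0')
        instance.__dict__[self.storage_name] = value


class LineItem:
    weight = Positive()
    price = Positive()

    def __init__(self, weight, price):
        self.weight = weight
        self.price = price


class Plugin:
    """Hypothetical base class that keeps a registry of its subclasses."""
    registry = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Plugin.registry.append(cls.__name__)


class CsvPlugin(Plugin):
    pass


class JsonPlugin(Plugin):
    pass


item = LineItem(2, 9.5)
print(LineItem.__dict__['price'].storage_name)   # price
print(Plugin.registry)                           # ['CsvPlugin', 'JsonPlugin']
```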

### Metaclasses Are Stable Language Features

Metaclasses were introduced in Python 2.2 in 2002, together with so-called "new-style classes", descriptors, and properties.

It is remarkable that the `MetaBunch` example, first posted in July 2002, still works in Python 3.9.

### A Class Can Only Have One Metaclass

If your class declaration involves two or more metaclasses, you will see this puzzling error message:

> TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases.

This may happen even without multiple inheritance. For example, a declaration like this could trigger that `TypeError`:

```python
class Record(abc.ABC, metaclass=PersistentMeta):
    pass
```

We saw that `abc.ABC` is an instance of the `abc.ABCMeta` metaclass. If the `PersistentMeta` metaclass is not itself a subclass of `abc.ABCMeta`, the `TypeError` will be raised.

There are two ways of dealing with that error:

- Find some other way of doing what you need to do, while avoiding at least one of the metaclasses involved.

- Write your own `PersistentABCMeta` metaclass as a subclass of both `abc.ABCMeta` and `PersistentMeta`, using multiple inheritance, and use that as the only metaclass of `Record`.
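
The second option can be sketched as follows. Here `PersistentMeta` is an empty stand-in, since its real behavior is not shown in these notes; the point is only to reproduce the conflict and then resolve it with a combined metaclass.

```python
import abc


class PersistentMeta(type):
    """Empty stand-in for PersistentMeta (an assumption for this sketch)."""


try:
    class Record(abc.ABC, metaclass=PersistentMeta):
        pass
except TypeError as exc:
    conflict = str(exc)
    print(conflict.startswith('metaclass conflict'))   # True


class PersistentABCMeta(abc.ABCMeta, PersistentMeta):
    """Combined metaclass: a subclass of both metaclasses involved."""


class Record(abc.ABC, metaclass=PersistentABCMeta):
    pass


print(type(Record) is PersistentABCMeta)   # True
print(isinstance(Record, abc.ABCMeta))     # True
print(isinstance(Record, PersistentMeta))  # True
```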

### Metaclasses Should Be Implementation Details

Besides `type`, there are only six metaclasses in the entire Python 3.9 standard library. The best-known metaclasses are probably `abc.ABCMeta`, `typing.NamedTupleMeta`, and `enum.EnumMeta`. None of them are intended to appear explicitly in user code. We may consider them implementation details.