# Programming with Python

## Lecture 07: Metaprogramming and descriptors

### Armen Gabrielyan

#### Yerevan State University / ASDS

#### 29 Mar, 2025

### Descriptor invocation

A descriptor can be called directly with `desc.__get__(obj)` or `desc.__get__(None, cls)`.

But it is more common for a descriptor to be invoked automatically from attribute access.

Descriptors are invoked by the [`__getattribute__(self, name)`](https://docs.python.org/3/reference/datamodel.html#object.__getattribute__) method, which returns the attribute value or raises an `AttributeError` exception if an attribute is not found.

The expression `obj.x` looks up the attribute `x` in the chain of namespaces for `obj`. If the search finds a descriptor outside of the instance `__dict__`, its `__get__()` method is invoked according to the precedence rules listed below.

1. **`__getattribute__` method:** First the object's `__getattribute__` method is called, which is responsible for attribute access.
2. **Data Descriptors:** If the attribute is found in the class (or its parent classes) and is a data descriptor, the descriptor's `__get__` method is called.
3. **Instance Dictionary:** If the attribute is found in the object's `__dict__`, that value is returned.
4. **Non-Data Descriptors:** If the attribute is found in the class (or its parent classes) and is a non-data descriptor (implements only `__get__`), the descriptor's `__get__` method is called.
5. **Class Dictionary:** If the attribute is found in the class's `__dict__` (or its parent classes), that value is returned.
6. **`__getattr__` method:** If the attribute is not found anywhere else and the object has a `__getattr__` method, that method is called.
7. **`AttributeError`:** If all the above steps fail, Python raises an `AttributeError`.
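
The precedence rules above (steps 2 to 5 and 7) can be sketched in pure Python. This is a simplified emulation in the spirit of the one in the Python Descriptor HowTo guide, not the actual C implementation. The names `find_name_in_mro`, `lookup` and `Demo` are illustrative, and the `__getattr__` fallback (step 6) is deliberately left out.

```python
_MISSING = object()  # sentinel meaning "not found"

def find_name_in_mro(cls, name):
    """Return the raw class-level value for name without invoking descriptors."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING

def lookup(obj, name):
    """Simplified sketch of what object.__getattribute__ does."""
    cls = type(obj)
    cls_var = find_name_in_mro(cls, name)
    descr_get = getattr(type(cls_var), "__get__", _MISSING)
    is_data = hasattr(type(cls_var), "__set__") or hasattr(type(cls_var), "__delete__")

    if descr_get is not _MISSING and is_data:
        return descr_get(cls_var, obj, cls)   # data descriptor wins
    if name in getattr(obj, "__dict__", {}):
        return obj.__dict__[name]             # instance dictionary
    if descr_get is not _MISSING:
        return descr_get(cls_var, obj, cls)   # non-data descriptor
    if cls_var is not _MISSING:
        return cls_var                        # plain class attribute
    raise AttributeError(name)

class Demo:
    kind = "class attribute"

    @property
    def size(self):      # property is a data descriptor
        return 10

    def method(self):    # functions are non-data descriptors
        return "bound"

d = Demo()
d.__dict__["size"] = 99            # ignored: the data descriptor takes precedence
d.__dict__["method"] = "shadowed"  # wins over the non-data descriptor
d.color = "red"

print(lookup(d, "size"), lookup(d, "method"), lookup(d, "kind"), lookup(d, "color"))
# 10 shadowed class attribute red
```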

In [None]:
class DataDescriptor:
    """A descriptor that implements both __get__ and __set__"""
    def __init__(self, name):
        self.name = name
        
    def __get__(self, instance, owner=None):
        print(f"2. DataDescriptor.__get__ called for {self.name}")
        return f"DataDescriptor value for {self.name}"
        
    def __set__(self, instance, value):
        print(f"DataDescriptor.__set__ called for {self.name} with value {value}")

class NonDataDescriptor:
    """A descriptor that implements only __get__"""
    def __init__(self, name):
        self.name = name
        
    def __get__(self, instance, owner=None):
        print(f"4. NonDataDescriptor.__get__ called for {self.name}")
        return f"NonDataDescriptor value for {self.name}"
    
class OverridingNoGetDescriptor:
    """A descriptor that implements only __set__"""
    def __init__(self, name):
        self.name = name
        
    def __set__(self, instance, value):
        print(f"OverridingNoGetDescriptor.__set__ called for {self.name} with value {value}")

class MyClass:
    data_desc_attr = DataDescriptor("data_desc") # Data descriptor
    non_data_desc_attr = NonDataDescriptor("non_data_desc") # Non-data descriptor
    over_no_get_attr = OverridingNoGetDescriptor("overriding_no_get_attr") # Overriding descriptor with no __get__
    class_attr = "class attribute" # Regular class attribute
    
    def __init__(self):
        self.instance_attr = "instance attribute" # Regular instance attribute
        
    def __getattr__(self, name):
        print(f"6. __getattr__ called for {name}")
        return f"__getattr__ value for {name}"
        
    def __getattribute__(self, name):
        print(f"1. __getattribute__ called for {name}")
        result = super().__getattribute__(name)
        print(f"__getattribute__ result={result}")
        return result

#### Lookup chain for different attributes

In [None]:
obj = MyClass()

print("Accessing data descriptor:")
print(obj.data_desc_attr)

print("\nAccessing instance attribute:")
print(obj.instance_attr)

print("\nAccessing non-data descriptor:")
print(obj.non_data_desc_attr)

print("\nAccessing class attribute:")
print(obj.class_attr)

print("\nAccessing non-existent attribute:")
print(obj.doesnt_exist_attr)

print("\nAccessing overriding with no __get__ descriptor:")
print(obj.over_no_get_attr)

#### Instance attributes override non-data descriptors

In [None]:
obj = MyClass()

obj.__dict__["non_data_desc_attr"] = "instance value overriding non-data descriptor"

print("\nAccessing non-data descriptor after adding instance attribute:")
print(obj.non_data_desc_attr)

#### Data descriptors override instance attributes

In [None]:
obj = MyClass()

obj.__dict__["data_desc_attr"] = "instance value trying to override data descriptor"

print("\nAccessing data descriptor after adding instance attribute:")
print(obj.data_desc_attr)

#### Instance attributes override "Overriding descriptor with no `__get__`" descriptors

In [None]:
obj = MyClass()

obj.__dict__["over_no_get_attr"] = "instance value overriding \"Overriding descriptor with no __get__\" descriptor"

print("\nAccessing overriding with no __get__ descriptor after adding instance attribute:")
print(obj.over_no_get_attr)

### `__set_name__` magic method

The `__set_name__(self, owner, name)` method is a special hook for descriptors. Python calls it automatically when the owning class is created, passing the owner class and the attribute name the descriptor was assigned to. The descriptor can store that name, so we do not need to pass it manually to the constructor.

In [None]:
class SimpleDescriptor:
    def __set_name__(self, owner, name):
        print(f"__set_name__ called: owner={owner}, name={name}")
        self.name = name  # Store attribute name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # Accessed on the class: return the descriptor itself
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

class MyClass:
    attr = SimpleDescriptor()  # Triggers __set_name__

In [None]:
obj = MyClass()

obj.attr = 42
obj.attr

### Validation example

The `Validator` class is both an abstract base class and a managed attribute descriptor. Custom validators need to inherit from `Validator` and must supply a `validate()` method.

In [None]:
from abc import ABC, abstractmethod

class Validator(ABC):
    """Base descriptor class for validation"""

    def __set_name__(self, owner, name):
        self.name = name
        self.private_name = '_' + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        self.validate(value)
        setattr(obj, self.private_name, value)

    @abstractmethod
    def validate(self, value):
        pass

In [None]:
class StringValidator(Validator):
    """Validates string values with optional length constraints"""

    def __init__(self, min_length=None, max_length=None):
        self.min_length = min_length
        self.max_length = max_length

    def validate(self, value):
        if not isinstance(value, str):
            raise TypeError(f"{self.name} must be a string")
        
        if self.min_length is not None and len(value) < self.min_length:
            raise ValueError(f"{self.name} must be at least {self.min_length} characters long")
        
        if self.max_length is not None and len(value) > self.max_length:
            raise ValueError(f"{self.name} must be no more than {self.max_length} characters long")
        
        return value

In [None]:
class RangeValidator(Validator):
    """Validates that a numeric value is within a specified range"""

    def __init__(self, minimum=None, maximum=None):
        self.minimum = minimum
        self.maximum = maximum
    
    def validate(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError(f"{self.name} must be a number")
        
        if self.minimum is not None and value < self.minimum:
            raise ValueError(f"{self.name} must be at least {self.minimum}")
        
        if self.maximum is not None and value > self.maximum:
            raise ValueError(f"{self.name} must be no more than {self.maximum}")
        
        return value

In [None]:
class Person:
    name = StringValidator(min_length=2, max_length=50)
    age = RangeValidator(minimum=0, maximum=150)
    
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person(name={self.name}, age={self.age})"

In [None]:
try:
    Person("J", 42)
except ValueError as e:
    print(e)

In [None]:
try:
    Person("John Doe", -42)
except ValueError as e:
    print(e)

In [None]:
Person("John Doe", 42)

### LazyProperty example

A lazy evaluation of a property is a design pattern used in programming where a property of an object is computed only when it is first accessed, and the result is then cached for future accesses. This can improve performance by delaying expensive computations until they are actually needed.

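We can use a *non-data* descriptor to implement a lazy property: once the computed value is stored in the instance `__dict__` under the same name, it takes precedence over the descriptor on later accesses.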

In [None]:
import time

class LazyProperty:
    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self.func(instance)
        setattr(instance, self.name, value)  # Cache the computed value
        return value

class Model:
    @LazyProperty
    def expensive_value(self):
        time.sleep(5)
        return 42

In [None]:
m = Model()

In [None]:
m.expensive_value # computed on the first access

In [None]:
m.expensive_value # cached value is returned

In [None]:
m.__dict__
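
For comparison, the standard library ships `functools.cached_property` (Python 3.8+), which follows the same idea: the computed value is stored in the instance `__dict__` and reused afterwards. The sketch below is only illustrative. The `Report` class and its `calls` counter are hypothetical and exist just to make the caching visible.

```python
from functools import cached_property

class Report:
    def __init__(self):
        self.calls = 0  # counts how often the expensive computation runs

    @cached_property
    def total(self):
        self.calls += 1
        return sum(range(1_000))

r = Report()
first = r.total    # computed on the first access
second = r.total   # served from r.__dict__, the function is not called again

print(first, second, r.calls, "total" in r.__dict__)
# 499500 499500 1 True
```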

## Descriptors in Python internals

### Descriptors in properties

`property()` is implemented in terms of the descriptor protocol. Calling `property()` returns a `property` object, a data descriptor that uses the parameters `fget`, `fset` and `fdel` to implement the three methods of the protocol (`__get__`, `__set__` and `__delete__`).

Here is a pure Python equivalent that implements most of the core functionality.

In [None]:
class Property:
    "Emulate PyProperty_Type() in Objects/descrobject.c"

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc

    def __set_name__(self, owner, name):
        self.__name__ = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError
        self.fdel(obj)

    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)

    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)

    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)
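
The same behaviour can be observed on the built-in `property`. The following sketch, with a hypothetical `Circle` class, fetches the raw property object from the class dictionary and calls `__get__` by hand. It also shows that a property without a setter still blocks assignment, because `property` always defines `__set__`.

```python
class Circle:
    def __init__(self, radius):
        self._radius = radius

    @property
    def radius(self):
        return self._radius

c = Circle(2)
prop = vars(Circle)["radius"]  # the raw property object stored on the class

# Attribute access and an explicit __get__ call give the same result.
same = prop.__get__(c, Circle) == c.radius

# property is a data descriptor even without a setter, so assignment
# raises instead of writing to the instance __dict__.
try:
    c.radius = 5
    assignment_failed = False
except AttributeError:
    assignment_failed = True

print(same, assignment_failed, "radius" in c.__dict__)
# True True False
```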

### Descriptors in functions and methods

Python’s object oriented features are built upon a function based environment. Using non-data descriptors, the two are merged seamlessly.

Functions stored in class dictionaries get turned into methods when invoked. Methods only differ from regular functions in that the object instance is prepended to the other arguments. By convention, the instance is called `self` but could be called `this` or any other variable name.

Methods can be created manually with `types.MethodType` which is roughly equivalent to the following class in Python.

In [None]:
class MethodType:
    "Emulate PyMethod_Type in Objects/classobject.c"

    def __init__(self, func, obj):
        self.__func__ = func
        self.__self__ = obj

    def __call__(self, *args, **kwargs):
        func = self.__func__
        obj = self.__self__
        return func(obj, *args, **kwargs)

    def __getattribute__(self, name):
        "Emulate method_getset() in Objects/classobject.c"
        if name == '__doc__':
            return self.__func__.__doc__
        return object.__getattribute__(self, name)

    def __getattr__(self, name):
        "Emulate method_getattro() in Objects/classobject.c"
        return getattr(self.__func__, name)

    def __get__(self, obj, objtype=None):
        "Emulate method_descr_get() in Objects/classobject.c"
        return self
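
As a small usage sketch of the real `types.MethodType`, a plain function can be bound to an instance by hand. The `User` class and `greet` function are hypothetical names used only for illustration.

```python
import types

class User:
    def __init__(self, name):
        self.name = name

def greet(self):
    return f"Hello, {self.name}"

user = User("Ann")
bound = types.MethodType(greet, user)  # bind a plain function to an instance

print(bound())                   # Hello, Ann
print(bound.__func__ is greet)   # True
print(bound.__self__ is user)    # True
```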

To support automatic creation of methods, functions include the `__get__()` method for binding methods during attribute access. This means that functions are non-data descriptors that return bound methods during dotted lookup from an instance.

The following code shows how this works.

In [None]:
class Function:
    ...

    def __get__(self, obj, objtype=None):
        "Simulate func_descr_get() in Objects/funcobject.c"
        if obj is None:
            return self
        return MethodType(self, obj)

#### How the function descriptor works in practice

In [None]:
class MyClass:
    def my_func(self):
         return self

Accessing the function through the class dictionary does not invoke `__get__()`; it just returns the underlying function object. Dotted access from a class does call `__get__()`, but with `None` as the instance, so it returns the underlying function unchanged.

In [None]:
MyClass.__dict__["my_func"]

In [None]:
MyClass.my_func

The dotted lookup from an instance calls `__get__()` which returns a bound method object.

In [None]:
obj = MyClass()
obj.my_func

Internally, the bound method stores the underlying function and the bound instance.

In [None]:
obj.my_func.__func__

In [None]:
obj.my_func.__self__

In [None]:
obj is obj.my_func.__self__

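This is where `self` comes from in regular methods: when the bound method is called, it passes the stored instance (`__self__`) as the first argument to the underlying function (`__func__`).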
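
The binding step can also be triggered explicitly. In this sketch (the `Greeter` class is hypothetical), calling `__get__` on the raw function with an instance produces the same bound method as dotted access, while calling it with `None` as the instance returns the function itself.

```python
class Greeter:
    def hello(self):
        return "hello"

g = Greeter()
func = vars(Greeter)["hello"]           # raw function from the class dictionary

bound = func.__get__(g, Greeter)        # explicit binding to the instance g
unbound = func.__get__(None, Greeter)   # no instance: the function comes back as is

print(bound == g.hello)     # True
print(unbound is func)      # True
print(bound() == func(g))   # True
```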

### Method binding

Non-data descriptors provide a simple mechanism for variations on the usual patterns of binding functions into methods.

This chart summarizes the binding and its two most useful variants:


| Transformation | Called from an object | Called from a class |
|---|---|---|
| function | `f(obj, *args)` | `f(*args)` |
| staticmethod | `f(*args)` | `f(*args)` |
| classmethod | `f(type(obj), *args)` | `f(cls, *args)` |
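
The three rows of the chart can be checked with a short example. The `Demo` class and its method names are made up for illustration.

```python
class Demo:
    def regular(self, x):
        return (self, x)

    @staticmethod
    def static(x):
        return x

    @classmethod
    def klass(cls, x):
        return (cls, x)

d = Demo()

print(d.regular(1) == (d, 1))         # True: the instance is prepended
print(Demo.regular(d, 1) == (d, 1))   # True: from the class, pass the instance yourself
print(d.static(1) == Demo.static(1))  # True: nothing is prepended either way
print(d.klass(1) == Demo.klass(1) == (Demo, 1))  # True: the class is prepended either way
```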

### Static methods

Using the non-data descriptor protocol, a pure Python version of `staticmethod()` would look like the following.

In [None]:
class StaticMethod:
    "Emulate PyStaticMethod_Type() in Objects/funcobject.c"

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        return self.f

    def __call__(self, *args, **kwds):
        return self.f(*args, **kwds)

### Class methods

Using the non-data descriptor protocol, a pure Python version of `classmethod()` would look like the following.

In [None]:
class ClassMethod:
    "Emulate PyClassMethod_Type() in Objects/funcobject.c"
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)
        def newfunc(*args, **kwargs):  # forward keyword arguments as well
            return self.f(klass, *args, **kwargs)
        return newfunc

## Metaprogramming

**Metaprogramming** is a computer programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyse, or transform other programs, and even modify itself, while running.

[Wikipedia - Metaprogramming](https://en.wikipedia.org/wiki/Metaprogramming)

Some of the metaprogramming features in Python are:

- dynamic code generation
- decorators
- descriptors
- metaclasses

### `eval()`

The `eval()` function evaluates a string containing a single Python expression and returns its result.

<div class="alert alert-block alert-warning">
<b>Note:</b> Avoid using eval() with untrusted input because it is dangerous and can execute arbitrary code.
</div>

In [None]:
# basic example

eval("4 + 2")

In [None]:
# using variables

x = 42
y = 24

eval("x + y")

In [None]:
# evaluating collections

eval("{'name': 'John Doe', 'age': 42}")
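
When the string is expected to contain only a literal (numbers, strings, lists, dicts and similar), `ast.literal_eval()` from the standard library is a more restrictive alternative. The sketch below parses the same dictionary and shows that a function call is rejected. It only covers literals, so it is not a drop-in replacement for every use of `eval()`.

```python
import ast

text = "{'name': 'John Doe', 'age': 42}"
data = ast.literal_eval(text)  # accepts literals only, never calls functions
print(data["age"])  # 42

try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True

print(rejected)  # True
```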

### `exec()`

The `exec()` function allows executing dynamically generated code. Unlike `eval()`, which only evaluates expressions, `exec()` can run statements like loops, function definitions, and class definitions.

<div class="alert alert-block alert-warning">
<b>Note:</b> Avoid using exec() with untrusted input because it is dangerous and can execute arbitrary code.
</div>

In [None]:
# basic example

code = """
x = 4
y = 2

print(x + y)
"""

exec(code)

In [None]:
# defining functions

code = """
def greet(name="world"):
    return f"Hello, {name}"
"""

exec(code)

greet("John Doe")

In [None]:
# executing loops

exec(
"""
for i in range(3):
    print("Iteration:", i)
"""
)

In [None]:
# defining classes

code = """
class DynamicClass:
    def greet(self):
        return "Hello from DynamicClass!"
"""

exec(code)

obj = DynamicClass()
obj.greet()
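
`exec()` also accepts an explicit dictionary to use as the global namespace, so the names defined by the executed code do not leak into the current module. A minimal sketch follows, where the variable name `namespace` is arbitrary. This keeps things tidy but does not make untrusted code safe to run.

```python
code = """
def greet(name="world"):
    return f"Hello, {name}"
"""

namespace = {}         # names defined by the executed code end up here
exec(code, namespace)

print(namespace["greet"]("John Doe"))  # Hello, John Doe
print("greet" in namespace)            # True
```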

## Class metaprogramming

This section is heavily influenced by the following:

*References:*

- Fluent Python, Luciano Ramalho

### Classes are objects

In [None]:
class A:
    pass


class B(A):
    pass


class C(B):
    pass

In [None]:
B.__bases__

In [None]:
B.__subclasses__()

In [None]:
B.mro()

### `type()`

The `type()` function in Python is used to determine the type of an object or dynamically create new classes.

In [None]:
type(42)

In [None]:
for t in int, str, list, set, tuple:
    print(type(t))

In [None]:
class A:
    pass

type(A)

In [None]:
type(ValueError)

In [None]:
type(type)

`type` is a **metaclass**, meaning it is a class that builds other classes.

```
type(name, bases, attributes)
```

- `name`: Name of the class.
- `bases`: Tuple of base classes.
- `attributes`: Dictionary of attributes and methods.

In [None]:
A = type("A", (object,), {"a": 42, "greet": lambda self: "Hello from A!"})
B = type("B", (A,), {"b": 24, "greet": lambda self: "Hello from B!"})

In [None]:
from dis import dis

In [None]:
a = A()
a.a, a.greet()

In [None]:
dis(A)

In [None]:
class A:
#     def __init__(self):
#         self.a = 42
        
    def greet(self):
        return "Hello from A!"

In [None]:
dis(A)

In [None]:
b = B()
b.b, b.greet()

### Class factory function

In the following example we define a `record_factory` function that builds simple record classes at runtime, in a similar spirit to `@dataclass`.

In [None]:
from typing import Any
from collections.abc import Iterable, Iterator


FieldNames = str | Iterable[str]

def parse_identifiers(names: FieldNames) -> tuple[str, ...]:
    if isinstance(names, str):
        names = names.replace(',', ' ').split()
    if not all(s.isidentifier() for s in names):
        raise ValueError('names must all be valid identifiers')
    return tuple(names)


def record_factory(cls_name: str, field_names: FieldNames) -> type[tuple]:
    slots = parse_identifiers(field_names)

    def __init__(self, *args, **kwargs) -> None:
        attrs = dict(zip(self.__slots__, args))
        attrs.update(kwargs)
        for name, value in attrs.items():
            setattr(self, name, value)

    def __iter__(self) -> Iterator[Any]:
        for name in self.__slots__:
            yield getattr(self, name)

    def __repr__(self):
        values = ", ".join(f"{name}={value!r}"
            for name, value in zip(self.__slots__, self))
        cls_name = self.__class__.__name__
        return f"{cls_name}({values})"

    cls_attrs = dict(
        __slots__=slots,
        __init__=__init__,
        __iter__=__iter__,
        __repr__=__repr__,
    )

    return type(cls_name, (object,), cls_attrs)

In [None]:
Person = record_factory("Person", "name age")

In [None]:
p = Person(name="John Doe", age=42)
p

In [None]:
name, age = p

print(name)
print(age)

### `__init_subclass__()` method

The `__init_subclass__()` method is a special class method that is automatically called whenever a subclass is created. It allows a base class to customize the behavior of its subclasses.
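
Before the larger example, here is a minimal sketch of the hook on its own: a base class that registers every subclass as it is defined. The `Plugin` class, its `registry` dictionary and the `name` keyword are hypothetical and only illustrate the mechanism, including the fact that keyword arguments in the `class` statement are forwarded to `__init_subclass__()`.

```python
class Plugin:
    registry = {}  # maps a plugin name to its class

    def __init_subclass__(cls, name=None, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once for each subclass, right after the subclass is created.
        cls.registry[name or cls.__name__.lower()] = cls

class CsvExporter(Plugin):
    pass

class JsonExporter(Plugin, name="json"):
    pass

print(sorted(Plugin.registry))  # ['csvexporter', 'json']
```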

In [None]:
from collections.abc import Callable
from typing import Any, NoReturn, get_type_hints


class Field:
    def __init__(self, name: str, constructor: Callable) -> None:
        if not callable(constructor) or constructor is type(None):
            raise TypeError(f'{name!r} type hint must be callable')
        self.name = name
        self.constructor = constructor

    def __set__(self, instance: Any, value: Any) -> None:
        if value is ...:
            value = self.constructor()
        else:
            try:
                value = self.constructor(value)
            except (TypeError, ValueError) as e:
                type_name = self.constructor.__name__
                msg = f'{value!r} is not compatible with {self.name}:{type_name}'
                raise TypeError(msg) from e
        instance.__dict__[self.name] = value

In [None]:
class Checked:
    @classmethod
    def _fields(cls) -> dict[str, type]:
        return get_type_hints(cls)

    def __init_subclass__(subclass) -> None:
        super().__init_subclass__()
        for name, constructor in subclass._fields().items():
            setattr(subclass, name, Field(name, constructor))

    def __init__(self, **kwargs: Any) -> None:
        for name in self._fields():
            value = kwargs.pop(name, ...)
            setattr(self, name, value)
        if kwargs:
            self.__flag_unknown_attrs(*kwargs)

    def __setattr__(self, name: str, value: Any) -> None:
        if name in self._fields():
            cls = self.__class__
            descriptor = getattr(cls, name)
            descriptor.__set__(self, value)
        else:
            self.__flag_unknown_attrs(name)

    def __flag_unknown_attrs(self, *names: str) -> NoReturn:
        plural = 's' if len(names) > 1 else ''
        extra = ', '.join(f'{name!r}' for name in names)
        cls_name = repr(self.__class__.__name__)
        raise AttributeError(f'{cls_name} object has no attribute{plural} {extra}')

    def _asdict(self) -> dict[str, Any]:
        return {
            name: getattr(self, name)
            for name, attr in self.__class__.__dict__.items()
            if isinstance(attr, Field)
        }

    def __repr__(self) -> str:
        kwargs = ', '.join(
            f'{key}={value!r}' for key, value in self._asdict().items()
        )
        return f'{self.__class__.__name__}({kwargs})'

In [None]:
class Person(Checked):
    name: str
    age: int
    salary: float

In [None]:
person = Person(name="John Doe", age=42, salary=199_999.99)
person

In [None]:
person.name, person.age

In [None]:
person.age = "text"

In [None]:
person = Person()
person

In [None]:
person = Person(first_name="John Doe", age=42, salary=199_999.99)

The `__init_subclass__()` method runs only after the subclass object has been built. `__slots__` is processed while the class is being built, so adding it to an existing class has no effect, which means `__init_subclass__()` cannot be used to configure `__slots__`.
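
The effect can be checked directly. In this sketch (the `Plain` and `Slotted` classes are made up for illustration), assigning `__slots__` after the class exists changes nothing, whereas declaring it in the class body removes the instance `__dict__`.

```python
class Plain:
    pass

Plain.__slots__ = ("x",)  # too late: the class has already been built

p = Plain()
p.y = 1                   # still allowed, instances keep their __dict__
print(hasattr(p, "__dict__"))  # True

class Slotted:
    __slots__ = ("x",)    # processed while the class is being created

s = Slotted()
try:
    s.y = 1
    blocked = False
except AttributeError:
    blocked = True

print(blocked, hasattr(s, "__dict__"))  # True False
```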