# Chapter 35: Descriptors Deep Dive

This notebook explores Python's descriptor protocol -- the mechanism that powers `property`, `classmethod`, `staticmethod`, and many ORM frameworks. You will learn the difference between data and non-data descriptors, how `__set_name__` works, and how to build reusable validation descriptors.

## Key Concepts
- **Descriptor protocol**: Objects that define `__get__`, `__set__`, or `__delete__`
- **Data descriptor**: Defines `__set__` or `__delete__` (usually together with `__get__`) -- takes priority over instance `__dict__`
- **Non-data descriptor**: Defines only `__get__` -- instance `__dict__` takes priority
- **`__set_name__`**: Called automatically at class creation for each descriptor assigned in the class body
- **Validation descriptors**: Reusable attribute validators using the descriptor protocol

## Section 1: The Descriptor Protocol

A descriptor is any object that defines at least one of `__get__`, `__set__`, or `__delete__`. When such an object is stored as a class attribute, Python invokes these methods during attribute access instead of returning the descriptor itself.

In [None]:
# A minimal descriptor that logs access
class VerboseDescriptor:
    """A descriptor that prints when its __get__ is called."""

    def __get__(self, obj: object, objtype: type | None = None) -> str:
        print(f"__get__ called: obj={obj!r}, objtype={objtype!r}")
        return "descriptor value"


class Demo:
    attr = VerboseDescriptor()


# Accessing via instance triggers __get__ with obj set
d = Demo()
print(f"Instance access: {d.attr}")

print()

# Accessing via class triggers __get__ with obj=None
print(f"Class access: {Demo.attr}")
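
To make the translation explicit, the sketch below performs by hand roughly what instance and class access do once a descriptor has been found on the class. The names `Answer` and `Holder` are hypothetical, and the manual calls deliberately skip the instance `__dict__` and MRO steps covered later in this chapter.

```python
class Answer:
    """Hypothetical non-data descriptor that reports how it was accessed."""

    def __get__(self, obj, objtype=None):
        if obj is None:
            return f"class access via {objtype.__name__}"
        return f"instance access via {type(obj).__name__}"


class Holder:
    attr = Answer()


h = Holder()

# Fetch the raw descriptor object without triggering __get__
raw = Holder.__dict__["attr"]

# Instance access is roughly: raw.__get__(h, type(h))
manual_instance = raw.__get__(h, type(h))
# Class access is roughly: raw.__get__(None, Holder)
manual_class = raw.__get__(None, Holder)

print(h.attr == manual_instance)
print(Holder.attr == manual_class)
```

Reading the descriptor through `Holder.__dict__` is the usual way to inspect it without invoking the protocol.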

## Section 2: Data Descriptors

A **data descriptor** defines `__set__` or `__delete__`, usually together with `__get__`. Data descriptors take precedence over instance `__dict__` entries. This is how `property` works under the hood.

In [None]:
class DataDescriptor:
    """A data descriptor that stores values in the instance __dict__."""

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(self, obj: object, objtype: type | None = None) -> object:
        if obj is None:
            return self
        print(f"  DataDescriptor.__get__ for '{self.name}'")
        return obj.__dict__.get(self.name, "<not set>")

    def __set__(self, obj: object, value: object) -> None:
        print(f"  DataDescriptor.__set__ for '{self.name}' = {value!r}")
        obj.__dict__[self.name] = value


class Product:
    name = DataDescriptor()
    price = DataDescriptor()


p = Product()
p.name = "Widget"  # Goes through __set__
p.price = 9.99

print()
print(f"Name:  {p.name}")   # Goes through __get__
print(f"Price: {p.price}")

print()

# Even though 'name' exists in instance __dict__, the data descriptor
# __get__ is still called (data descriptors have priority)
print(f"Instance __dict__: {p.__dict__}")
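
The priority rule can also be used to make an attribute read-only. The following is a minimal sketch, assuming a hypothetical `ReadOnly` descriptor and `Settings` class: because `__set__` exists, the descriptor is a data descriptor, so even a value planted directly in the instance `__dict__` cannot shadow it.

```python
class ReadOnly:
    """Hypothetical data descriptor whose value is fixed at class definition."""

    def __init__(self, value):
        self.value = value

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self.value

    def __set__(self, obj, value):
        # Defining __set__ is what makes this a data descriptor
        raise AttributeError(f"'{self.name}' is read-only")


class Settings:
    version = ReadOnly("1.0")


s = Settings()

try:
    s.version = "2.0"
except AttributeError as exc:
    print(f"Rejected: {exc}")

# Writing to the instance __dict__ directly bypasses __set__ ...
s.__dict__["version"] = "2.0"
# ... but reads still go through the descriptor's __get__
print(s.version)
```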

## Section 3: Non-Data Descriptors

A **non-data descriptor** defines only `__get__` (no `__set__` or `__delete__`). Because the instance `__dict__` takes priority over non-data descriptors, a value stored there under the same name shadows the descriptor. This is the mechanism behind `functools.cached_property`.

In [None]:
from typing import Any, Callable


class CachedProperty:
    """Non-data descriptor that caches the result in the instance __dict__."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self.func = func
        self.name = func.__name__

    def __get__(self, obj: object, objtype: type | None = None) -> Any:
        if obj is None:
            return self
        print(f"  Computing '{self.name}' for the first time...")
        value = self.func(obj)
        # Store in instance __dict__ so next access bypasses descriptor
        obj.__dict__[self.name] = value
        return value


class Circle:
    def __init__(self, radius: float) -> None:
        self.radius = radius

    @CachedProperty
    def area(self) -> float:
        return 3.14159 * self.radius ** 2


c = Circle(5.0)

# First access goes through __get__ and computes
print(f"Area: {c.area}")

# Second access reads from instance __dict__ (no descriptor call)
print(f"Area: {c.area}")

# Verify the value is cached in instance __dict__
print(f"\n'area' in __dict__: {'area' in c.__dict__}")
print(f"Cached value: {c.__dict__['area']}")
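
A cached value can go stale when the inputs change. The sketch below uses the standard library's `functools.cached_property` with a hypothetical `Square` class to show one way to invalidate the cache: deleting the attribute removes the entry from the instance `__dict__`, so the next access reaches the descriptor again.

```python
from functools import cached_property


class Square:
    def __init__(self, side):
        self.side = side
        self.computations = 0  # counts how often area is actually computed

    @cached_property
    def area(self):
        self.computations += 1
        return self.side ** 2


sq = Square(3.0)
first = sq.area      # computed and stored in sq.__dict__
second = sq.area     # served from sq.__dict__

sq.side = 4.0
stale = sq.area      # still the cached value for side=3.0

del sq.area          # removes the cached entry from sq.__dict__
fresh = sq.area      # recomputed with the new side

print(first, stale, fresh, sq.computations)
```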

## Section 4: Data vs Non-Data Descriptor Lookup Order

Python's attribute lookup follows this order:
1. **Data descriptors** on the class (and its MRO)
2. **Instance `__dict__`**
3. **Non-data descriptors** on the class (and its MRO)

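This is why data descriptors always intercept access, while non-data descriptors can be shadowed by a same-named entry in the instance `__dict__`. Plain (non-descriptor) class attributes are found in step 3 as well. If all three steps fail, Python falls back to `__getattr__` when the class defines it and raises `AttributeError` otherwise.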

In [None]:
class DataDesc:
    """Data descriptor (has __get__ and __set__)."""

    def __get__(self, obj: object, objtype: type | None = None) -> str:
        return "from data descriptor"

    def __set__(self, obj: object, value: object) -> None:
        pass  # Intentionally does nothing for demo


class NonDataDesc:
    """Non-data descriptor (has only __get__)."""

    def __get__(self, obj: object, objtype: type | None = None) -> str:
        return "from non-data descriptor"


class Example:
    data_attr = DataDesc()
    non_data_attr = NonDataDesc()


e = Example()

# Put values directly in instance __dict__
e.__dict__["data_attr"] = "from instance dict"
e.__dict__["non_data_attr"] = "from instance dict"

# Data descriptor wins over instance __dict__
print(f"data_attr:     {e.data_attr}")

# Instance __dict__ wins over non-data descriptor
print(f"non_data_attr: {e.non_data_attr}")
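
The lookup order can be written out as ordinary Python. The function below is a simplified sketch, not CPython's actual implementation: `lookup`, `find_in_mro`, and the demo classes are hypothetical names, and the code ignores `__getattr__` and `__slots__`.

```python
_MISSING = object()  # sentinel: None is a legitimate attribute value


def find_in_mro(cls, name):
    """Return the raw class attribute found along the MRO, or _MISSING."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def lookup(obj, name):
    """Simplified instance lookup; ignores __getattr__ and __slots__."""
    cls = type(obj)
    class_attr = find_in_mro(cls, name)
    attr_type = type(class_attr)
    has_get = hasattr(attr_type, "__get__")
    is_data = hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")

    # 1. Data descriptors on the class win
    if has_get and is_data:
        return attr_type.__get__(class_attr, obj, cls)
    # 2. Then the instance __dict__
    if name in vars(obj):
        return vars(obj)[name]
    # 3. Then non-data descriptors, then plain class attributes
    if has_get:
        return attr_type.__get__(class_attr, obj, cls)
    if class_attr is not _MISSING:
        return class_attr
    raise AttributeError(name)


class Shadowable:
    def __get__(self, obj, objtype=None):
        return "non-data"


class Sticky:
    def __get__(self, obj, objtype=None):
        return "data"

    def __set__(self, obj, value):
        pass


class Sample:
    plain = "class value"
    soft = Shadowable()
    hard = Sticky()


sample = Sample()
vars(sample).update(soft="instance", hard="instance")

for attr in ("plain", "soft", "hard"):
    print(attr, lookup(sample, attr) == getattr(sample, attr))
```

Note that the descriptor methods are looked up on the type of the class attribute, not on the attribute object itself, which matches how Python resolves special methods.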

## Section 5: `__set_name__` -- Automatic Name Discovery

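When a class is created, Python automatically calls `__set_name__(self, owner, name)` on every attribute in the class body that defines it. This lets the descriptor know its attribute name without requiring it to be passed explicitly. The hook runs only during class creation; a descriptor attached to an existing class afterwards does not receive the call.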

In [None]:
class Tracker:
    """Descriptor that tracks which names it was assigned to."""

    def __set_name__(self, owner: type, name: str) -> None:
        self.owner_name = owner.__name__
        self.attr_name = name
        print(f"__set_name__ called: owner={owner.__name__}, name={name!r}")

    def __get__(self, obj: object, objtype: type | None = None) -> object:
        if obj is None:
            return self
        return f"{self.owner_name}.{self.attr_name}"


# __set_name__ is called for each descriptor during class creation
class MyModel:
    field_a = Tracker()
    field_b = Tracker()
    field_c = Tracker()


print()
m = MyModel()
print(f"field_a: {m.field_a}")
print(f"field_b: {m.field_b}")
print(f"field_c: {m.field_c}")
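
The timing of the hook matters when a descriptor is attached after the class already exists. The sketch below, using hypothetical `Named` and `Model` classes, shows that late assignment does not trigger `__set_name__` and that the method can be called by hand in that case.

```python
class Named:
    """Hypothetical descriptor that records the name it was given."""

    name = None  # stays None until __set_name__ runs

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self.name


class Model:
    early = Named()  # __set_name__ runs when the class body finishes


# Attaching a descriptor after class creation does NOT trigger __set_name__
Model.late = Named()

model = Model()
early_name = model.early
late_before = model.late

# The hook can be called manually for descriptors added later
vars(Model)["late"].__set_name__(Model, "late")
late_after = model.late

print(early_name, late_before, late_after)
```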

## Section 6: Validation Descriptors

A common use of descriptors is building reusable validators. A validation descriptor checks each value in `__set__` and raises an error before invalid data is stored.

In [None]:
class Validated:
    """Data descriptor that validates values are non-negative integers."""

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(self, obj: object, objtype: type | None = None) -> object:
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj: object, value: object) -> None:
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{self.name} must be a non-negative int")
        obj.__dict__[self.name] = value


class Order:
    quantity = Validated()
    item_count = Validated()


order = Order()
order.quantity = 10
order.item_count = 5
print(f"Quantity:   {order.quantity}")
print(f"Item count: {order.item_count}")

# Invalid values are rejected
print()
for bad_value in [-1, "ten", 3.5]:
    try:
        order.quantity = bad_value
    except ValueError as exc:  # avoid `e`, which would unbind the Example instance from Section 4
        print(f"Rejected {bad_value!r}: {exc}")
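
One pitfall of an `isinstance(value, int)` check is that `bool` is a subclass of `int`, so `True` and `False` pass it. The following is a sketch of a stricter variant, assuming a hypothetical `StrictNonNegativeInt` descriptor that also separates type errors from range errors.

```python
class StrictNonNegativeInt:
    """Hypothetical variant of the validator above that also rejects bool."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        # bool is a subclass of int, so it needs an explicit check
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{self.name} must be an int, got {type(value).__name__}")
        if value < 0:
            raise ValueError(f"{self.name} must be non-negative, got {value!r}")
        obj.__dict__[self.name] = value


class Cart:
    items = StrictNonNegativeInt()


cart = Cart()
cart.items = 3

rejected = []
for bad_value in [True, "ten", 3.5, -1]:
    try:
        cart.items = bad_value
    except (TypeError, ValueError) as exc:
        rejected.append(type(exc).__name__)
        print(f"Rejected {bad_value!r}: {exc}")
```

Whether to raise `TypeError` or `ValueError` for a wrong type is a design choice; the original example uses `ValueError` for both cases.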

## Section 7: Composable Validation Descriptors

We can build a family of validators by using a base class with a `validate` method that subclasses override.

In [None]:
class ValidatorBase:
    """Base class for validation descriptors."""

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(self, obj: object, objtype: type | None = None) -> object:
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj: object, value: object) -> None:
        self.validate(value)
        obj.__dict__[self.name] = value

    def validate(self, value: object) -> None:
        raise NotImplementedError


class PositiveInt(ValidatorBase):
    def validate(self, value: object) -> None:
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{self.name} must be a positive integer, got {value!r}")


class NonEmptyString(ValidatorBase):
    def validate(self, value: object) -> None:
        if not isinstance(value, str) or len(value) == 0:
            raise ValueError(f"{self.name} must be a non-empty string, got {value!r}")


class Employee:
    name = NonEmptyString()
    age = PositiveInt()
    employee_id = PositiveInt()


emp = Employee()
emp.name = "Alice"
emp.age = 30
emp.employee_id = 1001

print(f"Name: {emp.name}, Age: {emp.age}, ID: {emp.employee_id}")

# Test validation
print()
for attr, bad_val in [("name", ""), ("age", -5), ("employee_id", "ABC")]:
    try:
        setattr(emp, attr, bad_val)
    except ValueError as exc:  # avoid `e`, which would unbind the Example instance from Section 4
        print(f"Rejected {attr}={bad_val!r}: {exc}")
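
The same base-class pattern extends to validators that take parameters. The sketch below is self-contained and uses hypothetical names (`Validator`, `OneOf`, `InRange`, `Ticket`); it makes the base class abstract so that a subclass without `validate` cannot be instantiated, and it shows that assignments inside `__init__` are validated too.

```python
from abc import ABC, abstractmethod


class Validator(ABC):
    """Hypothetical base class mirroring the ValidatorBase pattern above."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        self.validate(value)
        obj.__dict__[self.name] = value

    @abstractmethod
    def validate(self, value):
        """Raise ValueError if value is not acceptable."""


class OneOf(Validator):
    def __init__(self, *options):
        self.options = set(options)

    def validate(self, value):
        if value not in self.options:
            raise ValueError(f"{self.name} must be one of {sorted(self.options)}, got {value!r}")


class InRange(Validator):
    def __init__(self, low, high):
        self.low = low
        self.high = high

    def validate(self, value):
        if not (self.low <= value <= self.high):
            raise ValueError(f"{self.name} must be in [{self.low}, {self.high}], got {value!r}")


class Ticket:
    status = OneOf("open", "closed")
    priority = InRange(1, 5)

    def __init__(self, status, priority):
        # Assignments in __init__ also go through __set__
        self.status = status
        self.priority = priority


ticket = Ticket("open", 3)

try:
    Ticket("pending", 3)
except ValueError as exc:
    print(f"Rejected: {exc}")
```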

## Section 8: How `property` Is a Descriptor

Python's built-in `property` is itself a data descriptor: it defines `__get__`, `__set__`, and `__delete__`, and forwards them to the getter, setter, and deleter functions you supply.

In [None]:
# property is a data descriptor because it has __get__, __set__, and __delete__
print(f"property has __get__:    {hasattr(property, '__get__')}")
print(f"property has __set__:    {hasattr(property, '__set__')}")
print(f"property has __delete__: {hasattr(property, '__delete__')}")


# A property-based class
class Temperature:
    def __init__(self, celsius: float) -> None:
        self._celsius = celsius

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: float) -> None:
        if value < -273.15:
            raise ValueError("Temperature below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self) -> float:
        return self._celsius * 9 / 5 + 32


t = Temperature(100.0)
print(f"\n{t.celsius}C = {t.fahrenheit}F")

t.celsius = 0.0
print(f"{t.celsius}C = {t.fahrenheit}F")

# Temperature.celsius is a property descriptor on the class
print(f"\nType of Temperature.celsius: {type(Temperature.__dict__['celsius'])}")
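
To see how little machinery is involved, here is a simplified pure-Python emulation. It is a sketch only: `SimpleProperty` and `Account` are hypothetical names, and the class omits the deleter, docstring handling, and other details of the real `property`.

```python
class SimpleProperty:
    """Simplified, hypothetical emulation of property (getter and setter only)."""

    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def setter(self, fset):
        # Return a new descriptor that keeps the getter and adds the setter
        return type(self)(self.fget, fset)


class Account:
    def __init__(self, balance):
        self._balance = balance

    @SimpleProperty
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value

    @SimpleProperty
    def summary(self):
        return f"balance={self._balance}"


acct = Account(100)
acct.balance = 50
print(acct.balance, acct.summary)

try:
    acct.summary = "x"  # no setter was registered
except AttributeError as exc:
    print(f"Rejected: {exc}")
```

Because `__set__` is always defined, even a getter-only `SimpleProperty` is a data descriptor, which is why the assignment to `summary` is rejected instead of landing in the instance `__dict__`.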

## Section 9: Functions Are Non-Data Descriptors

Regular functions are non-data descriptors. Their `__get__` method is what creates bound methods when accessed via an instance.

In [None]:
class Greeter:
    def greet(self, name: str) -> str:
        return f"Hello, {name}!"


g = Greeter()

# Functions have __get__ but not __set__ -- they are non-data descriptors
greet_func = Greeter.__dict__["greet"]
print(f"greet is a: {type(greet_func)}")
print(f"Has __get__: {hasattr(greet_func, '__get__')}")
print(f"Has __set__: {hasattr(greet_func, '__set__')}")

# Accessing via instance invokes __get__, producing a bound method
bound = g.greet
print(f"\nBound method: {bound}")
print(f"Bound to:     {bound.__self__}")
print(f"Result:       {bound('World')}")

# Manually calling __get__ does the same thing
manual_bound = greet_func.__get__(g, Greeter)
print(f"\nManual bind:  {manual_bound('Python')}")
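
The binding step itself can be approximated in pure Python with `types.MethodType`. The wrapper below is a sketch under stated assumptions: `BindableFunction` and `Robot` are hypothetical names, and real functions implement this logic in C.

```python
from types import MethodType


class BindableFunction:
    """Hypothetical wrapper that mimics how a function binds itself to an instance."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class access: return the unbound wrapper
        return MethodType(self, obj)  # instance access: bind obj as first argument


class Robot:
    def __init__(self, name):
        self.name = name

    @BindableFunction
    def hello(self, other):
        return f"{self.name} greets {other}"


r = Robot("R2")
bound_hello = r.hello
print(bound_hello("Ada"))
print(bound_hello.__self__ is r)
print(Robot.hello(r, "Ada"))
```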

## Section 10: Descriptor with `__delete__`

A descriptor can also define `__delete__` to control what happens when `del obj.attr` is called.

In [None]:
class ManagedAttribute:
    """Descriptor that manages the full lifecycle of an attribute."""

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(self, obj: object, objtype: type | None = None) -> object:
        if obj is None:
            return self
        return obj.__dict__.get(self.name, "<undefined>")

    def __set__(self, obj: object, value: object) -> None:
        print(f"  Setting {self.name} = {value!r}")
        obj.__dict__[self.name] = value

    def __delete__(self, obj: object) -> None:
        print(f"  Deleting {self.name}")
        obj.__dict__.pop(self.name, None)


class Config:
    setting = ManagedAttribute()


cfg = Config()
cfg.setting = "enabled"
print(f"Value: {cfg.setting}")

del cfg.setting
print(f"After delete: {cfg.setting}")
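
Returning a placeholder such as `"<undefined>"` means `hasattr` always reports the attribute as present. A common alternative, sketched here with hypothetical `StrictAttribute` and `Options` classes, is to raise `AttributeError` when the value is missing, both on read and on delete.

```python
class StrictAttribute:
    """Hypothetical variant that raises AttributeError for missing values."""

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        try:
            return obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(f"{self.name!r} is not set") from None

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

    def __delete__(self, obj):
        try:
            del obj.__dict__[self.name]
        except KeyError:
            raise AttributeError(f"{self.name!r} is not set") from None


class Options:
    mode = StrictAttribute()


opts = Options()
before = hasattr(opts, "mode")   # __get__ raises AttributeError
opts.mode = "fast"
during = hasattr(opts, "mode")
del opts.mode
after = hasattr(opts, "mode")

print(before, during, after)
```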

## Summary

### Descriptor Protocol
- A **descriptor** is any object with `__get__`, `__set__`, or `__delete__`
- **Data descriptors** define `__set__` or `__delete__` (usually together with `__get__`) -- they have priority over instance `__dict__`
- **Non-data descriptors** define only `__get__` -- instance `__dict__` entries shadow them

### Attribute Lookup Order
1. Data descriptors on the class (via MRO)
2. Instance `__dict__`
3. Non-data descriptors on the class (via MRO)

### `__set_name__`
- Called automatically during class creation: `descriptor.__set_name__(owner_class, attr_name)`
- Lets descriptors discover their attribute name without explicit configuration

### Validation Descriptors
- Enforce constraints in `__set__` before storing values
- Use a base class with a `validate` method for composable validators
- Store validated values in `obj.__dict__[self.name]` rather than with `setattr`, which would call `__set__` again and recurse infinitely

### Built-in Descriptors
- `property` is a data descriptor (`__get__` + `__set__` + `__delete__`)
- Functions are non-data descriptors (`__get__` only) -- this creates bound methods