### Descriptors and Slots

#### Where do objects and classes store their attributes?

In [1]:
class A:
    v = 123

In [2]:
A.__dict__

mappingproxy({'__module__': '__main__',
              'v': 123,
              '__dict__': <attribute '__dict__' of 'A' objects>,
              '__weakref__': <attribute '__weakref__' of 'A' objects>,
              '__doc__': None})

A `mappingproxy` is a read-only view of a dictionary. For class dictionaries, changes go through attribute assignment, which also enforces string keys.

In [3]:
from types import MappingProxyType

In [4]:
mp = MappingProxyType({1: 2})

In [5]:
mp

mappingproxy({1: 2})

In [6]:
# mp[1] = 3 # TypeError: 'mappingproxy' object does not support item assignment
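The same read-only behaviour can be observed on a class dictionary. The following is a minimal sketch (the class `A` is redefined here so the snippet is self-contained): item assignment on `A.__dict__` fails, regular attribute assignment works, and `setattr` rejects non-string names.

```python
class A:
    v = 123

# Item assignment on the class dictionary is rejected,
# because A.__dict__ is a read-only mappingproxy.
try:
    A.__dict__['v'] = 456
except TypeError as exc:
    print("TypeError:", exc)

# Regular attribute assignment works and is visible through the proxy.
A.v = 456
print(A.__dict__['v'])  # 456

# Attribute names must be strings, so non-string keys are rejected.
try:
    setattr(A, 1, 2)
except TypeError as exc:
    print("TypeError:", exc)
```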

In [7]:
a = A()

In [8]:
a.__dict__

{}

In [9]:
a.__class__.__dict__

mappingproxy({'__module__': '__main__',
              'v': 123,
              '__dict__': <attribute '__dict__' of 'A' objects>,
              '__weakref__': <attribute '__weakref__' of 'A' objects>,
              '__doc__': None})

To use the descriptor, it must be stored as a **class variable** in another class.

#### What is a descriptor?

> In general, a descriptor is an attribute value that has one of the methods in the descriptor protocol. Those methods are `__get__()`, `__set__()`, and `__delete__()`. If any of those methods are defined for an attribute, it is said to be a descriptor. 

* read-only attributes
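As a rough illustration of the read-only use case, a descriptor can return a value from `__get__` and refuse assignment in `__set__`. The names `ReadOnly` and `Config` are made up for this sketch.

```python
class ReadOnly:
    """Descriptor exposing a fixed value and rejecting assignment."""

    def __init__(self, value):
        self.value = value

    def __get__(self, obj, objtype=None):
        return self.value

    def __set__(self, obj, value):
        # Defining __set__ lets the descriptor intercept instance assignment.
        raise AttributeError("attribute is read-only")


class Config:
    version = ReadOnly("1.0")


cfg = Config()
print(cfg.version)  # 1.0

try:
    cfg.version = "2.0"
except AttributeError as exc:
    print("AttributeError:", exc)
```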

By default, attribute access performs the usual lookup, in this order:

* instance
* type
* continuing types in the method-resolution order
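For plain (non-descriptor) attributes, the order above can be sketched as follows. The classes `Base` and `Child` are illustrative only.

```python
class Base:
    from_base = "found on Base"
    shadowed = "class value"


class Child(Base):
    from_child = "found on Child"


c = Child()
c.shadowed = "instance value"    # stored in c.__dict__

print(c.shadowed)      # instance value  (instance dictionary first)
print(c.from_child)    # found on Child  (then the type)
print(c.from_base)     # found on Base   (then the rest of the MRO)
print([cls.__name__ for cls in Child.__mro__])  # ['Child', 'Base', 'object']
```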

#### A class with a class attribute and a descriptor

In [10]:
class A:
    x = 10

In [11]:
a = A()

In [12]:
a.x

10

In [13]:
class D:
    def __get__(self, obj, objtype=None):
        print("D.__get__", obj)
        return 10

In [14]:
class A:
    x = D()

In [15]:
a = A()

In [16]:
a.x

D.__get__ <__main__.A object at 0x7fed817ac550>


10

Now x is a descriptor.

In [17]:
A.__dict__

mappingproxy({'__module__': '__main__',
              'x': <__main__.D at 0x7fed817ac6d0>,
              '__dict__': <attribute '__dict__' of 'A' objects>,
              '__weakref__': <attribute '__weakref__' of 'A' objects>,
              '__doc__': None})

The value of `A.x` is now a computed value.
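As noted earlier, the descriptor only takes effect when it is stored as a class variable. A small sketch of the difference (class `B` is an addition for contrast):

```python
class D:
    def __get__(self, obj, objtype=None):
        return 10


class A:
    x = D()              # stored on the class: __get__ is invoked


class B:
    def __init__(self):
        self.x = D()     # stored on the instance: no descriptor call


print(A().x)                   # 10
print(isinstance(B().x, D))    # True, the raw D object comes back
```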

#### Example: Directory Size

In [18]:
import os

class DirectorySize:

    def __get__(self, obj, objtype=None):
        return len(os.listdir(obj.dirname))

class Directory:

    size = DirectorySize()              # Descriptor instance

    def __init__(self, dirname):
        self.dirname = dirname          # Regular instance attribute

In [19]:
d = Directory(".")
d.size

30

In [20]:
!ls -a

.				    Descriptors.ipynb
..				    Exceptions.ipynb
ClassDecorators.ipynb		    FlowControl.ipynb
CollectionsABC.ipynb		    FunctionalAspects.ipynb
Collections.ipynb		    Generators.ipynb
DataModelAttributeMagic.ipynb	    Interfaces.ipynb
DataModelAugmentedAssignment.ipynb  .ipynb_checkpoints
DataModelComparison.ipynb	    Iterators.ipynb
DataModelInitialization.ipynb	    MatchStatement.ipynb
DataModelItemAccess.ipynb	    Metaclasses.ipynb
DataModelOperatorMagic.ipynb	    MethodCalls.ipynb
DataModelOverview.ipynb		    PositionalOnlyArgs.ipynb
DataModelStringMagic.ipynb	    SentinelValue.ipynb
DataModelTypeConversions.ipynb	    Slots.ipynb
DatamodelUnary.ipynb		    Snippets
Decorators.ipynb		    Tasks.ipynb


To do something instance-specific, the instance is passed in as `obj`. In this example, the descriptor also depends on the object having a `dirname` attribute, so it is less general.

#### Sidenote: Properties

There is a similar technique, called [properties](https://docs.python.org/3/library/functions.html#property).
    
* Properties work best when they know about the class
* **Descriptors are more general**, can often apply to any class
* Use descriptors if behaviour is different for classes and instances
* Properties are syntactic sugar
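One way to see that properties are built on the same mechanism is to fetch the raw `property` object from the class dictionary and call the descriptor protocol by hand. This is a sketch for illustration only. Normal code would just write `d.size`.

```python
import os


class Directory:

    def __init__(self, dirname):
        self.dirname = dirname

    @property
    def size(self):
        return len(os.listdir(self.dirname))


# Going through __dict__ bypasses the attribute lookup
# and returns the property object itself.
prop = Directory.__dict__["size"]
print(type(prop).__name__)        # property
print(hasattr(prop, "__get__"))   # True

d = Directory(".")
# Calling __get__ manually runs the same getter that d.size would.
print(prop.__get__(d, Directory))
```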


In [21]:
class Directory:
    
    def __init__(self, dirname):
        self.dirname = dirname          # Regular instance attribute
    
    @property
    def size(self):
        return self._compute_size()
        
    def _compute_size(self):
        return len(os.listdir(self.dirname))

In [22]:
d = Directory(".")
d.size

30

In [23]:
# d.size = 3 # read-only: AttributeError: property 'size' of 'Directory' object has no setter

A note on properties:
    
* Keep in mind that readers of the code expect attribute access to be fast.
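If the computation is expensive, one option is to compute it once and reuse the result. The sketch below uses `functools.cached_property` (available since Python 3.8). Whether a cached, possibly stale value is acceptable depends on the use case.

```python
import os
from functools import cached_property


class Directory:

    def __init__(self, dirname):
        self.dirname = dirname

    @cached_property
    def size(self):
        # Runs once per instance; the result is then stored
        # in the instance dictionary under the name "size".
        return len(os.listdir(self.dirname))


d = Directory(".")
first = d.size
print("size" in vars(d))    # True
print(d.size == first)      # True, served from the instance dictionary
```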

#### Revisiting `objtype`

In [24]:
import os

class DirectorySize:

    def __get__(self, obj, objtype=None):
        if obj is None:
            print(f"called on type! {objtype}")
        else:
            print(f"DirectorySize: {obj} {objtype}")
            return len(os.listdir(obj.dirname))

class Directory:

    size = DirectorySize()              # Descriptor instance

    def __init__(self, dirname):
        self.dirname = dirname          # Regular instance attribute

In [25]:
d = Directory(".")
d.size

DirectorySize: <__main__.Directory object at 0x7fed817ac2b0> <class '__main__.Directory'>


30

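Why do we need the extra type information? Because the instance could belong to some subclass, and because on class access (`Directory.size`) `obj` is `None`, so `objtype` is the only context the descriptor receives.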

In [26]:
Directory.size

called on type! <class '__main__.Directory'>
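The subclass case can be sketched with a descriptor that simply reports `objtype`. The names `ShowOwner` and `TempDirectory` are hypothetical.

```python
class ShowOwner:
    def __get__(self, obj, objtype=None):
        # objtype is the class the lookup started from.
        return objtype.__name__


class Directory:
    kind = ShowOwner()


class TempDirectory(Directory):
    pass


print(Directory().kind)        # Directory
print(TempDirectory().kind)    # TempDirectory
print(TempDirectory.kind)      # TempDirectory (here obj is None)
```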


#### Use case: Managing instance data

* descriptor is a delegate and can intercept attribute access

Example: logging access

Notes:

* we are still storing the instance data in the instance
* by convention, we use the private name `_age` to store the value

In [27]:
import logging

logging.basicConfig(level=logging.INFO)

class LoggedAgeAccess:

    def __get__(self, obj, objtype=None):
        value = obj._age
        logging.info('Accessing %r giving %r', 'age', value)
        return value

    def __set__(self, obj, value):
        logging.info('Updating %r to %r', 'age', value)
        obj._age = value

class Person:

    age = LoggedAgeAccess()             # Descriptor instance

    def __init__(self, name, age):
        self.name = name                # Regular instance attribute
        self.age = age                  # Calls __set__()

    def birthday(self):
        self.age += 1                   # Calls both __get__() and __set__()

In [28]:
mary = Person('Mary M', 30)         # The initial age update is logged
dave = Person('David D', 40)

INFO:root:Updating 'age' to 30
INFO:root:Updating 'age' to 40


In [29]:
vars(mary), vars(dave)                     # The actual data is in a private attribute

({'name': 'Mary M', '_age': 30}, {'name': 'David D', '_age': 40})

In [30]:
mary.birthday()                     # Updates are logged as well

INFO:root:Accessing 'age' giving 30
INFO:root:Updating 'age' to 31


In [31]:
dave.name, dave.age                           # Only the age lookup is logged; name is a regular attribute


INFO:root:Accessing 'age' giving 40


('David D', 40)

#### A few problems
    
* `_age` is hardcoded
* and we do not know which attribute name the Person class uses

In PEP 487 – Simpler customisation of class creation, we get some support for this kind of problem.

* https://peps.python.org/pep-0487/ 

> upon class creation, a `__set_name__` hook is called on all the attribute (descriptors) defined in the class [...]

* added in Python 3.6

Previously, you had to pass the attribute name explicitly to the descriptor, which was repetitive. Alternatively, you could use a class decorator that automated the name mapping.

```python
class configure_descriptors:
    def __init__(self, **kwargs):
        self.descs = {dname: dcls(dname) for dname, dcls in kwargs.items()}

    def __call__(self, class_):
        for dname, descriptor in self.descs.items():
            setattr(class_, dname, descriptor)
        return class_


# LoggedAttr is assumed to be a descriptor class whose __init__ takes the attribute name.
@configure_descriptors(
    descriptor=LoggedAttr
)
class DecoratedManaged:
    """The descriptor is provided by the decorator"""
```
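To make the decorator approach concrete, here is a runnable sketch. `LoggedAttr` is a hypothetical descriptor written for this example (the original snippet does not define it). It receives its name through `__init__` instead of `__set_name__`.

```python
class LoggedAttr:
    """Hypothetical descriptor that receives its name explicitly."""

    def __init__(self, name):
        self.private_name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        print(f"setting {self.private_name} to {value!r}")
        setattr(obj, self.private_name, value)


class configure_descriptors:
    def __init__(self, **kwargs):
        # Instantiate each descriptor class with its attribute name.
        self.descs = {dname: dcls(dname) for dname, dcls in kwargs.items()}

    def __call__(self, class_):
        # Attach the prepared descriptors to the decorated class.
        for dname, descriptor in self.descs.items():
            setattr(class_, dname, descriptor)
        return class_


@configure_descriptors(age=LoggedAttr)
class Person:
    def __init__(self, age):
        self.age = age


p = Person(30)
print(p.age, vars(p))   # 30 {'_age': 30}
```

Note that `setattr` on an already created class does not trigger `__set_name__`, which is why the name is passed in explicitly here.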

We want to follow a convention, e.g. to store every attribute with a `_` prefix in the instance.

Let's see `__set_name__` in a smaller example.



In [39]:
class D:
    def __set_name__(self, owner, name):
        print(f"__set_name__ owner={owner} name={name}")
    
class A:
    attr = D()

# __set_name__ runs when the class is defined


__set_name__ owner=<class '__main__.A'> name=attr


In [40]:
import logging

logging.basicConfig(level=logging.INFO)

class LoggedAccess:

    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = '_' + name

    def __get__(self, obj, objtype=None):
        value = getattr(obj, self.private_name)
        logging.info('Accessing %r giving %r', self.public_name, value)
        return value

    def __set__(self, obj, value):
        logging.info('Updating %r to %r', self.public_name, value)
        setattr(obj, self.private_name, value)

class Person:

    name = LoggedAccess()                # First descriptor instance
    age = LoggedAccess()                 # Second descriptor instance

    def __init__(self, name, age):
        self.name = name                 # Calls the first descriptor
        self.age = age                   # Calls the second descriptor

    def birthday(self):
        self.age += 1



In [41]:
mary = Person('Mary M', 30)         # The initial name and age updates are logged
dave = Person('David D', 40)

INFO:root:Updating 'name' to 'Mary M'
INFO:root:Updating 'age' to 30
INFO:root:Updating 'name' to 'David D'
INFO:root:Updating 'age' to 40


In [42]:
vars(mary)

{'_name': 'Mary M', '_age': 30}

In [43]:
vars(dave)

{'_name': 'David D', '_age': 40}

Notes:
    
> Optionally, descriptors can have a `__set_name__()` method. This is only used in cases where a descriptor needs to know either the **class** where it was created or the **name of class variable** it was assigned to.

`__set_name__` is not descriptor specific.
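A short sketch of this point: the hook is also called on a class attribute that defines none of the descriptor methods. `Tag` and `Model` are made-up names.

```python
class Tag:
    """Not a descriptor: defines none of __get__, __set__, __delete__."""

    def __set_name__(self, owner, name):
        self.owner = owner
        self.name = name


class Model:
    title = Tag()


# Without __get__, plain lookup returns the Tag instance itself.
print(Model.title.name)              # title
print(Model.title.owner is Model)    # True
```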

Sidenote:

Previously, this needed to be done manually, e.g. passing the attribute name again as an argument to the descriptor.

```python
attr = CachedAttr("attr", f)
```

#### Task: Descriptor for adding range limits for integer values

In [52]:
class IntRange:
    
    def __init__(self, min=None, max=None):
        self.min = min
        self.max = max

    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = '_' + name
        
    def __get__(self, obj, objtype=None):
        return getattr(obj, self.private_name)
    
    def __set__(self, obj, value):
        if self.min is not None:
            if value < self.min:
                raise ValueError("value too small")
        if self.max is not None:
            if value > self.max:
                raise ValueError("value too large")
        setattr(obj, self.private_name, value)

In [53]:
class Record:
    x = IntRange(min=10, max=20)
    y = IntRange(min=0)

In [54]:
record = Record()

In [55]:
record.x = 10

In [57]:
# record.x = 9 # ValueError: value too small

In [59]:
# record.x = 100 # ValueError: value too large

#### Code Review: Validation

A small example of a tiny validation layer with a pleasant interface.

In [60]:
from abc import ABC, abstractmethod

class Validator(ABC):

    def __set_name__(self, owner, name):
        self.private_name = '_' + name

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        self.validate(value)
        setattr(obj, self.private_name, value)

    @abstractmethod
    def validate(self, value):
        pass

These examples are slightly more general. They do not depend on the existence of specific attributes in the enclosing class and can be used in many contexts. The only thing we touch on the instance is a private attribute.

In [61]:
class OneOf(Validator):

    def __init__(self, *options):
        self.options = set(options)

    def validate(self, value):
        if value not in self.options:
            raise ValueError(f'Expected {value!r} to be one of {self.options!r}')

We can define numbers or strings with some bounds, too.

In [62]:
class Number(Validator):

    def __init__(self, minvalue=None, maxvalue=None):
        self.minvalue = minvalue
        self.maxvalue = maxvalue

    def validate(self, value):
        if not isinstance(value, (int, float)):
            raise TypeError(f'Expected {value!r} to be an int or float')
        if self.minvalue is not None and value < self.minvalue:
            raise ValueError(
                f'Expected {value!r} to be at least {self.minvalue!r}'
            )
        if self.maxvalue is not None and value > self.maxvalue:
            raise ValueError(
                f'Expected {value!r} to be no more than {self.maxvalue!r}'
            )

class String(Validator):

    def __init__(self, minsize=None, maxsize=None, predicate=None):
        self.minsize = minsize
        self.maxsize = maxsize
        self.predicate = predicate

    def validate(self, value):
        if not isinstance(value, str):
            raise TypeError(f'Expected {value!r} to be an str')
        if self.minsize is not None and len(value) < self.minsize:
            raise ValueError(
                f'Expected {value!r} to be no smaller than {self.minsize!r}'
            )
        if self.maxsize is not None and len(value) > self.maxsize:
            raise ValueError(
                f'Expected {value!r} to be no bigger than {self.maxsize!r}'
            )
        if self.predicate is not None and not self.predicate(value):
            raise ValueError(
                f'Expected {self.predicate} to be true for {value!r}'
            )


To use the validators, we assign them as class attributes.


In [63]:
class Component:

    name = String(minsize=3, maxsize=10, predicate=str.isupper)
    kind = OneOf('wood', 'metal', 'plastic')
    quantity = Number(minvalue=0)

    def __init__(self, name, kind, quantity):
        self.name = name
        self.kind = kind
        self.quantity = quantity
        

In [68]:
c = Component("PART213", "wood", 10)

In [70]:
# c = Component("PART213", "woody", 10) # ValueError

#### Other use cases

* validation
* logging, tracing
* ORM
* caching

Interestingly, you can see people using descriptors for what is today implemented as type hints, e.g.

```python
class A:
    name = TypedAttribute("name", str)
    x = TypedAttribute("x", int)
```
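What such a `TypedAttribute` might have looked like is sketched below. This is a guess at the pattern, not a specific library's implementation. The name is passed explicitly, as was necessary before `__set_name__`.

```python
class TypedAttribute:
    """Hypothetical descriptor that enforces a type on assignment."""

    def __init__(self, name, expected_type):
        self.private_name = "_" + name
        self.expected_type = expected_type

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"expected {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(obj, self.private_name, value)


class A:
    name = TypedAttribute("name", str)
    x = TypedAttribute("x", int)


a = A()
a.name = "point"
a.x = 3

try:
    a.x = "three"
except TypeError as exc:
    print("TypeError:", exc)
```

Unlike this descriptor, type hints alone are not checked by the interpreter at runtime.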