## 20. Attribute Descriptors
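A descriptor is a class that implements a protocol consisting of the __get__, __set__, and __delete__ methods. Implementing any one of them is enough to make a class a descriptor; most descriptors implement only __get__ and __set__, and many implement just one of the two. Understanding descriptors is key to Python mastery.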
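
As a minimal sketch of the full protocol, the descriptor below implements all three methods and records each call it receives. The names Logged and Demo are hypothetical and only serve this illustration.

```python
class Logged:
    """Hypothetical descriptor that records every protocol call it receives."""

    def __init__(self, name):
        self.name = name      # key used in the managed instance's __dict__
        self.calls = []       # log of the protocol methods invoked so far

    def __get__(self, instance, owner):
        self.calls.append('get')
        if instance is None:  # attribute was read through the class
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        self.calls.append('set')
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        self.calls.append('delete')
        del instance.__dict__[self.name]


class Demo:
    x = Logged('x')


d = Demo()
d.x = 10        # goes through Logged.__set__
value = d.x     # goes through Logged.__get__
del d.x         # goes through Logged.__delete__

# Reading the class __dict__ fetches the descriptor without calling __get__
calls = Demo.__dict__['x'].calls
```

After running this, calls holds 'set', 'get', 'delete' in that order, one entry per attribute operation on d.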

## LineItem Take #3: A Simple Descriptor
-  Descriptor class: a class implementing the descriptor protocol (Quantity in the example below).
-  Managed class: the class where the descriptor instances are declared as class attributes (LineItem in the example below).

In [2]:
class Quantity:
    def __init__(self, storage_name):
        self.storage_name = storage_name

    def __set__(self, instance, value):
        ''' self is the descriptor instance, and instance is the managed instance.'''
        if value > 0:
            instance.__dict__[self.storage_name] = value
        else:
            raise ValueError("value must be > 0")
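
Quantity.__set__ writes to instance.__dict__ directly rather than calling setattr. The sketch below shows why: because the storage name equals the managed attribute name, setattr would invoke __set__ again and never terminate. RecursiveQuantity and Broken are hypothetical names used only for this demonstration.

```python
class RecursiveQuantity:
    """Hypothetical variant of Quantity that (wrongly) stores via setattr."""

    def __init__(self, storage_name):
        self.storage_name = storage_name

    def __set__(self, instance, value):
        # storage_name is also the managed attribute name, so this
        # assignment is routed back to __set__ over and over.
        setattr(instance, self.storage_name, value)


class Broken:
    weight = RecursiveQuantity('weight')


try:
    Broken().weight = 1
    outcome = 'no error'
except RecursionError:
    outcome = 'RecursionError'
```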


In [4]:
class LineItem:
    weight = Quantity("weight")
    price = Quantity("price")
    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight
        self.price = price
        
    def subtotal(self):
        return self.weight * self.price


In [5]:
item = LineItem("toy", 2.5, 100)

In [6]:
item.__dict__

{'description': 'toy', 'weight': 2.5, 'price': 100}

In [8]:
LineItem.__dict__

mappingproxy({'__module__': '__main__',
              'weight': <__main__.Quantity at 0x1eb22c70470>,
              'price': <__main__.Quantity at 0x1eb22c70518>,
              '__init__': <function __main__.LineItem.__init__(self, description, weight, price)>,
              'subtotal': <function __main__.LineItem.subtotal(self)>,
              '__dict__': <attribute '__dict__' of 'LineItem' objects>,
              '__weakref__': <attribute '__weakref__' of 'LineItem' objects>,
              '__doc__': None})

In [9]:
import traceback

try:
    item.weight = 0
except ValueError:
    traceback.print_exc()

Traceback (most recent call last):
  File "<ipython-input-9-f23604991c7c>", line 4, in <module>
    item.weight = 0
  File "<ipython-input-2-31662b8935ca>", line 9, in __set__
    raise ValueError("value must be > 0")
ValueError: value must be > 0


## LineItem Take #4: Automatic Storage Attribute Names
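To avoid retyping the attribute name in the descriptor declarations, we’ll generate a unique string for the storage_name of each Quantity instance. The generated name (for example, _Quantity#0) contains a # character, so it is not a valid identifier and cannot clash with attributes created through dot notation, yet getattr and setattr still accept it. Because the storage name now differs from the managed attribute name, the descriptor can use getattr and setattr without triggering itself again.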

In [2]:
class Quantity:
    __counter = 0
    def __init__(self):
        cls = self.__class__
        prefix = cls.__name__
        index = cls.__counter
        self.storage_name = '_{}#{}'.format(prefix, index)
        cls.__counter += 1

    def __get__(self, instance, owner):
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value > 0:
            setattr(instance, self.storage_name, value)
        else:
            raise ValueError('value must be > 0')


In [3]:
class LineItem:
    weight = Quantity()
    price = Quantity()
    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight
        self.price = price
    
    def subtotal(self):
        return self.weight * self.price


In [4]:
item1 = LineItem("toy", 1.5, 100)
item2 = LineItem("computer", 2, 300)

In [5]:
item1.__dict__

{'description': 'toy', '_Quantity#0': 1.5, '_Quantity#1': 100}

In [6]:
item2.__dict__

{'description': 'computer', '_Quantity#0': 2, '_Quantity#1': 300}

In [7]:
LineItem.__dict__

mappingproxy({'__module__': '__main__',
              'weight': <__main__.Quantity at 0x21bcbe954e0>,
              'price': <__main__.Quantity at 0x21bcbe95588>,
              '__init__': <function __main__.LineItem.__init__(self, description, weight, price)>,
              'subtotal': <function __main__.LineItem.subtotal(self)>,
              '__dict__': <attribute '__dict__' of 'LineItem' objects>,
              '__weakref__': <attribute '__weakref__' of 'LineItem' objects>,
              '__doc__': None})

In [9]:
LineItem.__dict__["weight"].__dict__

{'storage_name': '_Quantity#0'}
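
Generated names such as _Quantity#0 are awkward to read when debugging. As an aside that goes beyond these notes: Python 3.6 and later call __set_name__(owner, name) on each descriptor when the managed class is created, so the descriptor can learn its own attribute name. The sketch below assumes that feature is available and is an alternative to the counter approach, not the version used in this take.

```python
class Quantity:
    def __set_name__(self, owner, name):
        # Called automatically when the managed class is created (Python 3.6+)
        self.storage_name = name

    def __set__(self, instance, value):
        if value > 0:
            # Write to __dict__ directly to avoid re-entering __set__
            instance.__dict__[self.storage_name] = value
        else:
            raise ValueError('value must be > 0')


class LineItem:
    weight = Quantity()   # no name argument needed
    price = Quantity()

    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight
        self.price = price


item = LineItem('toy', 1.5, 100)
stored = vars(item)
```

Here the values end up under the plain names weight and price, as in Take #3, without repeating the name in each declaration.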

## LineItem Take #5: A New Descriptor Type
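In this take we factor the descriptor logic into two base classes: AutoStorage, which manages storage attributes automatically, and Validated, an abstract subclass that handles validation by delegating to an abstract validate method. We then rewrite Quantity and implement NonBlank by inheriting from Validated and coding just the validate methods.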

In [10]:
import abc

class AutoStorage:
    __counter = 0
    def __init__(self):
        cls = self.__class__
        prefix = cls.__name__
        index = cls.__counter
        self.storage_name = '_{}#{}'.format(prefix, index)
        cls.__counter += 1

    def __get__(self, instance, owner):
        if instance is None:
            return self
        else:
            return getattr(instance, self.storage_name)
 
    def __set__(self, instance, value):
        setattr(instance, self.storage_name, value)


In [11]:
class Validated(abc.ABC, AutoStorage):
    def __set__(self, instance, value):
        value = self.validate(instance, value)
        super().__set__(instance, value)
    
    @abc.abstractmethod
    def validate(self, instance, value):
        """ return validated value or raise ValueError """


In [12]:
class Quantity(Validated):
    """a number greater than zero"""
    def validate(self, instance, value):
        if value <= 0:
            raise ValueError('value must be > 0')
        return value


class NonBlank(Validated):
    """a string with at least one non-space character"""
    def validate(self, instance, value):
        value = value.strip()
        if len(value) == 0:
            raise ValueError('value cannot be empty or blank')
        return value


These descriptors could live in a separate module and be imported; here we simply define and use them within the notebook.

In [14]:
class LineItem:
    description = NonBlank()  # description must be a non-blank string
    weight = Quantity()       # the quantities must be > 0
    price = Quantity()
    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight
        self.price = price
    
    def subtotal(self):
        return self.weight * self.price
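
The notebook does not exercise this version of LineItem, so here is a self-contained check of how the pieces fit together. It repeats the class definitions from the cells above in compact form; the sample values are arbitrary.

```python
import abc


class AutoStorage:
    __counter = 0

    def __init__(self):
        cls = self.__class__
        self.storage_name = '_{}#{}'.format(cls.__name__, cls.__counter)
        cls.__counter += 1

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        setattr(instance, self.storage_name, value)


class Validated(abc.ABC, AutoStorage):
    def __set__(self, instance, value):
        value = self.validate(instance, value)
        super().__set__(instance, value)

    @abc.abstractmethod
    def validate(self, instance, value):
        """return validated value or raise ValueError"""


class Quantity(Validated):
    def validate(self, instance, value):
        if value <= 0:
            raise ValueError('value must be > 0')
        return value


class NonBlank(Validated):
    def validate(self, instance, value):
        value = value.strip()
        if len(value) == 0:
            raise ValueError('value cannot be empty or blank')
        return value


class LineItem:
    description = NonBlank()
    weight = Quantity()
    price = Quantity()

    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight
        self.price = price


item = LineItem('  toy  ', 2, 10)   # whitespace is stripped by NonBlank.validate

errors = []
for bad_args in [('   ', 2, 10), ('toy', 0, 10)]:
    try:
        LineItem(*bad_args)
    except ValueError as exc:
        errors.append(str(exc))
```

Each Validated subclass keeps its own counter, so the storage names are _NonBlank#0, _Quantity#0, and _Quantity#1.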


## Overriding Versus Nonoverriding Descriptors
Recall that there is an important asymmetry in the way Python handles attributes. Reading an attribute through an instance normally returns the attribute defined in the instance, but if there is no such attribute in the instance, a class attribute will be retrieved. On the other hand, assigning to an attribute in an instance normally creates the attribute in the instance, without affecting the class at all.
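
Before descriptors enter the picture, the asymmetry can be seen with a plain class attribute. Config and level are hypothetical names for this small sketch.

```python
class Config:
    level = 1            # class attribute


c = Config()
before = c.level         # no instance attribute yet, so the class attribute is read
c.level = 5              # creates an instance attribute; the class is not affected
after = c.level          # the instance attribute now wins
class_value = Config.level
```

After the assignment, vars(c) contains level while Config.level still holds its original value.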


In [1]:
def cls_name(obj_or_cls):
    cls = type(obj_or_cls)
    if cls is type:
        cls = obj_or_cls
    return cls.__name__.split('.')[-1]


In [2]:
def display(obj):
    cls = type(obj)
    if cls is type:
        return '<class {}>'.format(obj.__name__)
    elif cls in [type(None), int]:
        return repr(obj)
    else:
        return '<{} object>'.format(cls_name(obj))


In [4]:
def print_args(name, *args):
    pseudo_args = ', '.join(display(x) for x in args)
    print('-> {}.__{}__({})'.format(cls_name(args[0]), name, pseudo_args))


In [5]:
class Overriding:
    """a.k.a. data descriptor or enforced descriptor"""
    def __get__(self, instance, owner):
        print_args('get', self, instance, owner)

    def __set__(self, instance, value):
        print_args('set', self, instance, value)


In [6]:
class OverridingNoGet:
    """an overriding descriptor without ``__get__``"""
    def __set__(self, instance, value):
        print_args('set', self, instance, value)


In [7]:
class NonOverriding:
    """a.k.a. non-data or shadowable descriptor"""
    def __get__(self, instance, owner):
        print_args('get', self, instance, owner)


In [8]:
class Managed:
    over = Overriding()
    over_no_get = OverridingNoGet()
    non_over = NonOverriding()
    def spam(self):
        print('-> Managed.spam({})'.format(display(self)))


## Overriding Descriptor
A descriptor that implements the __set__ method is called an overriding descriptor, because although it is a class attribute, a descriptor implementing __set__ will override attempts to assign to instance attributes.

In [9]:
obj = Managed()
obj.over

-> Overriding.__get__(<Overriding object>, <Managed object>, <class Managed>)


In [10]:
Managed.over

-> Overriding.__get__(<Overriding object>, None, <class Managed>)


In [11]:
obj.over = 7

-> Overriding.__set__(<Overriding object>, <Managed object>, 7)


In [12]:
obj.over

-> Overriding.__get__(<Overriding object>, <Managed object>, <class Managed>)


In [13]:
obj.__dict__['over'] = 8  # bypassing the descriptor by setting a value directly to obj.__dict__

In [14]:
vars(obj)

{'over': 8}

In [15]:
obj.over

-> Overriding.__get__(<Overriding object>, <Managed object>, <class Managed>)


## Overriding Descriptor Without __get__
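Usually, overriding descriptors implement both __set__ and __get__, but it’s also possible to implement only __set__, as the Quantity descriptor in LineItem Take #3 does. In this case, only writing is handled by the descriptor. Reading the descriptor through an instance returns the descriptor object itself, because there is no __get__ to handle that access. If a namesake instance attribute is created by writing directly to the instance __dict__, subsequent reads return that value, while normal assignment still goes through __set__.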

In [16]:
obj.over_no_get

<__main__.OverridingNoGet at 0x22bc1be0588>

In [17]:
Managed.over_no_get

<__main__.OverridingNoGet at 0x22bc1be0588>

In [18]:
obj.over_no_get = 7

-> OverridingNoGet.__set__(<OverridingNoGet object>, <Managed object>, 7)


In [19]:
obj.over_no_get

<__main__.OverridingNoGet at 0x22bc1be0588>

In [20]:
obj.__dict__['over_no_get'] = 9 

In [21]:
obj.over_no_get

9

In [22]:
obj.over_no_get = 7

-> OverridingNoGet.__set__(<OverridingNoGet object>, <Managed object>, 7)


In [24]:
obj.over_no_get  # descriptor is shadowed as long as there is a namesake instance attribute.

9

## Nonoverriding Descriptor
If a descriptor does not implement __set__, then it’s a nonoverriding descriptor. Setting an instance attribute with the same name will shadow the descriptor, rendering it ineffective for handling that attribute in that specific instance. Methods are implemented as nonoverriding descriptors.

In [26]:
obj = Managed()

In [27]:
obj.non_over

-> NonOverriding.__get__(<NonOverriding object>, <Managed object>, <class Managed>)


In [28]:
obj.non_over = 7

In [29]:
obj.non_over  # shadows the namesake descriptor attribute in the Managed class.

7

In [30]:
Managed.non_over

-> NonOverriding.__get__(<NonOverriding object>, None, <class Managed>)


In [31]:
del obj.non_over

In [32]:
obj.non_over

-> NonOverriding.__get__(<NonOverriding object>, <Managed object>, <class Managed>)


Python contributors and authors use different terms when discussing these concepts. Overriding descriptors are also called data descriptors or enforced descriptors. Nonoverriding descriptors are also known as non-data descriptors or shadowable descriptors.


## Overwriting a Descriptor in the Class
Regardless of whether a descriptor is overriding or not, it can be overwritten by assignment to the class. Reading a class attribute can be controlled by a descriptor with __get__ attached to the managed class, but writing a class attribute is not handled by a descriptor with __set__ attached to that same class.

In [33]:
obj = Managed()

In [34]:
Managed.over = 1
Managed.over_no_get = 2
Managed.non_over = 3

In [35]:
obj.over, obj.over_no_get, obj.non_over

(1, 2, 3)

## Methods Are Descriptors
A function defined in a class becomes a bound method because all user-defined functions have a __get__ method; therefore, they operate as descriptors when attached to a class.

In [36]:
obj = Managed()

In [37]:
obj.spam

<bound method Managed.spam of <__main__.Managed object at 0x0000022BC1BE0CF8>>

In [38]:
Managed.spam

<function __main__.Managed.spam(self)>

In [39]:
obj.spam = 7

In [40]:
obj.spam

7

Because functions do not implement __set__, they are nonoverriding descriptors.

As usual with descriptors, the __get__ of a function returns a reference to itself when the access happens through the managed class. But when the access goes through an instance, the __get__ of the function returns a bound method object: a callable that wraps the function and binds the managed instance (e.g., obj) to the first argument of the function (i.e., self).

In [44]:
import collections

class Text(collections.UserString):
    def __repr__(self):
        return 'Text({!r})'.format(self.data)

    def reverse(self):
        return self[::-1]


In [45]:
word = Text('forward')

In [46]:
word

Text('forward')

In [47]:
word.reverse()

Text('drawrof')

In [48]:
type(Text.reverse), type(word.reverse)

(function, method)

In [49]:
Text.reverse.__get__(word)

<bound method Text.reverse of Text('forward')>

In [50]:
Text.reverse.__get__(None, Text)

<function __main__.Text.reverse(self)>

In [51]:
word.reverse

<bound method Text.reverse of Text('forward')>

In [52]:
word.reverse.__self__

Text('forward')

In [53]:
word.reverse.__func__ is Text.reverse

True

The way functions are turned into bound methods is a prime example of how descriptors are used as infrastructure in the language.
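
To make the mechanism concrete, here is a rough pure-Python imitation of what a function's __get__ does, built on types.MethodType. FunctionLike and Greeter are hypothetical names; real functions implement this in C, so treat the code as a sketch of the behavior rather than the actual implementation.

```python
import types


class FunctionLike:
    """Hypothetical stand-in that mimics how a plain function binds itself."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    def __get__(self, instance, owner=None):
        if instance is None:
            return self                            # access through the class
        return types.MethodType(self, instance)    # access through an instance


class Greeter:
    def __init__(self, name):
        self.name = name

    @FunctionLike
    def greet(self):
        return 'hello, ' + self.name


g = Greeter('world')
bound = g.greet      # a bound method wrapping the FunctionLike object
result = bound()     # g is passed as the first argument
```

As with Text.reverse above, bound.__self__ is the instance and bound.__func__ is the object stored in the class.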

## Descriptor Usage Tips
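-  Use property to keep it simple. The property built-in creates overriding descriptors that implement both __set__ and __get__, even if no setter function is defined.
-  Read-only descriptors require __set__. Without it, assigning to a namesake attribute on an instance would shadow the descriptor. The __set__ method of a read-only attribute should just raise AttributeError with a suitable message.
-  In a descriptor designed only for validation, the __set__ method should check the value argument it gets and, if valid, set it directly in the instance __dict__ using the descriptor instance name as key. That way, reading the attribute with the same name from the instance will be as fast as possible.
-  Caching can be done efficiently with __get__ only. A nonoverriding descriptor can compute an expensive value once and store it in the instance under the same name, so later reads find the instance attribute and skip the descriptor.
-  Nonspecial methods can be shadowed by instance attributes, because functions implement only __get__.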
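
Two of the tips above, the read-only descriptor and caching with __get__ only, can be sketched as follows. ReadOnly, CachedAttr, and Report are hypothetical names introduced for this illustration.

```python
class ReadOnly:
    """Hypothetical read-only descriptor: __set__ exists only to refuse writes."""

    def __init__(self, value):
        self.value = value

    def __get__(self, instance, owner):
        return self.value

    def __set__(self, instance, value):
        raise AttributeError('read-only attribute')


class CachedAttr:
    """Hypothetical nonoverriding descriptor that caches a computed value."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        value = self.func(instance)
        # Store under the same name: the instance attribute now shadows
        # this descriptor, so later reads never reach __get__ again.
        instance.__dict__[self.name] = value
        return value


class Report:
    version = ReadOnly(1)

    def __init__(self, data):
        self.data = data
        self.computations = 0

    @CachedAttr
    def total(self):
        self.computations += 1       # counts how often the real work runs
        return sum(self.data)


r = Report([1, 2, 3])
first, second = r.total, r.total     # the second read comes from r.__dict__
try:
    r.version = 2
    write_error = None
except AttributeError as exc:
    write_error = str(exc)
```

The caching trick works only because CachedAttr has no __set__; an overriding descriptor would keep intercepting reads even after the value is stored in the instance.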

***