# Descriptors

## Overview

A **descriptor** is a mechanism that can be used to customize what happens when you reference a **class** or **instance attribute**. 

Normally, Python just gets and sets values on attributes without any special processing; an attribute is basic storage. Sometimes, however, you might want to do more. For example:
- You might need to validate the value that’s being assigned to an attribute.
- You may want to retrieve a value and cache it for later use, so that future references avoid the overhead of computing it again.

These are all things that would normally need to be done with a method, but if you’ve already started with a basic attribute, changing to a method would require changing all the code that uses the attribute to use a method call instead. This potential change is a primary motivation for typical Java programs to always use methods even for basic attribute access. In Java, the common pattern is to have all attributes private, and provide public access through methods, simply to accommodate potential future changes in the internals of that attribute access.

Python’s descriptors are an alternative approach. Instead of starting with methods all the time, you can start with basic attributes and write all the code you want. Then, if you ever need advanced processing to occur when you access those attributes, you can just add in a descriptor to do the work, without updating all the other code.

Descriptors are a powerful, general-purpose protocol. They are the mechanism behind **properties**, **methods**, **static methods**, **class methods**, and **super()**.

## How to create a descriptor

A **descriptor** is a Python object that is assigned as an **attribute** (data attribute or method) of a class. 

This object is an instance of a class that provides special methods (described below), and the attribute it is assigned to is the one that will have the special processing. So the actual extra code will be inside the descriptor’s class, rather than the class it will be assigned to.

The methods a descriptor should provide are:
(*note that `obj` below refers to the object on which the attribute was accessed, and `owner` is the class to which the descriptor was assigned as an attribute*):

**`__get__(self, obj, owner=None)`**
Called when the attribute is accessed on the owner class or on one of its instances. It returns the value of the attribute. When the access happens on the class itself, `obj` is `None`. If the value is invalid, the method can raise a suitable exception such as `ValueError`. If the attribute does not exist, it should raise `AttributeError`.

**`__set__(self, obj, value)`**
Called when a value is assigned to the attribute on an instance (`obj.attr = value`). It should return `None`.

**`__delete__(self, obj)`**
Called when the attribute is deleted from an instance (`del obj.attr`). It should return `None`.

**`__set_name__(self, owner, name)`**
Called when the owner class is created. This allows descriptor instances to access their own name as defined in the owner class.

Any object that defines at least one of `__get__()`, `__set__()`, or `__delete__()` implements the ***descriptor protocol*** and is considered a ***descriptor***. `__set_name__()` is an optional hook; defining it alone does not make an object a descriptor.

If an object defines `__set__()` or `__delete__()`, it is considered a **data descriptor**.

Descriptors that only define `__get__()` are called **non-data descriptors** (they are typically used for methods but other uses are possible).

**Note**: to make a **read-only data descriptor**, define both `__get__()` and `__set__()`, with `__set__()` raising an **`AttributeError`** when called. Defining the `__set__()` method with an exception-raising placeholder is enough to make it a data descriptor.

**Note**: descriptors are standard classes that just implement a specific set of methods, so they can also contain anything else used in standard Python classes. For instance, you can define `__init__()` on a descriptor class, so that you can customize descriptors with individual attributes.
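As a minimal sketch of the protocol methods described above, the hypothetical `Bounded` descriptor below validates assignments. The class names, the `minimum` and `maximum` parameters, and the choice to store the value under a private name on the instance are illustrative assumptions, not requirements of the protocol.

```python
class Bounded:
    """Data descriptor that only accepts values within [minimum, maximum]."""

    def __init__(self, minimum, maximum):
        # Per-descriptor configuration, supplied in the owner class body.
        self.minimum = minimum
        self.maximum = maximum

    def __set_name__(self, owner, name):
        # Called once, when the owner class is created.
        self.public_name = name
        self.private_name = "_" + name

    def __get__(self, obj, owner=None):
        if obj is None:
            # Accessed on the class itself: return the descriptor.
            return self
        try:
            return getattr(obj, self.private_name)
        except AttributeError:
            raise AttributeError(self.public_name) from None

    def __set__(self, obj, value):
        if not self.minimum <= value <= self.maximum:
            raise ValueError(
                f"{self.public_name} must be between "
                f"{self.minimum} and {self.maximum}, got {value!r}"
            )
        setattr(obj, self.private_name, value)

    def __delete__(self, obj):
        delattr(obj, self.private_name)


class Thermostat:
    celsius = Bounded(5, 30)

    def __init__(self, celsius):
        self.celsius = celsius  # goes through Bounded.__set__


t = Thermostat(21)
print(t.celsius)
try:
    t.celsius = 99
except ValueError as exc:
    print("rejected:", exc)
```

Because the validation lives in the descriptor class, the same `Bounded` object type can be reused for any number of attributes with different limits.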


### How attributes are accessed

The default behavior for attribute access is to get, set, or delete the attribute from an object's dictionary. For instance, `a.x` has a lookup chain starting with  `a.__dict__['x']`, then `type(a).__dict__['x']`, and continuing through the base classes of `type(a)` excluding metaclasses.

If the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods were defined.

Data and non-data descriptors differ in how overrides are calculated with respect to entries in an instance's dictionary:
- If an instance's dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence.
- If an instance's dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.
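The two precedence rules can be checked with a small experiment. The sketch below uses hypothetical classes (`DataDesc`, `NonDataDesc`, `Demo`) and writes directly into the instance dictionary so that both kinds of descriptor compete with an entry of the same name.

```python
class DataDesc:
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, owner=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only attribute")


class NonDataDesc:
    """Non-data descriptor: defines only __get__."""

    def __get__(self, obj, owner=None):
        return "from non-data descriptor"


class Demo:
    d = DataDesc()
    n = NonDataDesc()


demo = Demo()
# Write straight into the instance dictionary, bypassing __set__.
demo.__dict__["d"] = "from instance dict"
demo.__dict__["n"] = "from instance dict"

print(demo.d)  # the data descriptor wins over the instance dictionary
print(demo.n)  # the instance dictionary wins over the non-data descriptor
```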



## Descriptor Example

The code at the end of this section creates a class whose instances are data descriptors that print a message on each attribute access.

Every Python object has a namespace that’s separate from the namespace of its class, so that each object can have different values attached to it. Normally, the object’s attributes are a direct pass-through to this namespace, but descriptors short-circuit that process. 

Thankfully, Python allows another way to access the object’s namespace directly: the **`__dict__`** attribute of the object. The `__dict__` attribute is a standard Python dictionary containing mappings for the various values attached to the object.

For a descriptor, the `__dict__` attribute is the simplest place to store a value.

In the descriptor, we have to store the value in the dictionary under a name. In this example the name is supplied explicitly: the constructor takes a required `name` argument, which is used as the dictionary key. (A descriptor can also learn its name automatically through `__set_name__()`, described above.)

**Note**: if the value being retrieved isn’t set, the expected behavior is to raise an `AttributeError`.

```python
class SimpleDescriptor:
    """Data descriptor that logs every read and write of one attribute."""

    def __init__(self, name):
        # Key under which the value is stored in the instance's __dict__.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class (e.g. Point.x): return the descriptor itself.
            return self
        print(f"Call to get: {self.name}")
        if self.name not in instance.__dict__:
            raise AttributeError(self.name)
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        print(f"Call to set: {self.name}->{value}")
        instance.__dict__[self.name] = value


class Point:
    x = SimpleDescriptor("x")
    y = SimpleDescriptor("y")

    def __init__(self, x, y):
        self.x = x  # routed through SimpleDescriptor.__set__
        self.y = y

    def __repr__(self):
        return f"<{self.x},{self.y}>"


p1 = Point(10, 20)
p1.x = p1.x + 2
print("p1.y is:", p1.y)
print("p1 is:", p1)
```

Output:

```text
Call to set: x->10
Call to set: y->20
Call to get: x
Call to set: x->12
Call to get: y
p1.y is: 20
p1 is: Call to get: x
Call to get: y
<12,20>
```


## Invoking Descriptors

Descriptors are assigned to class attributes, and the special methods are called automatically when the attribute is accessed (the method used depends on what type of access is being performed).

For example, `obj.d` looks up `d` in the dictionary of `obj`. If `d` defines the method `__get__()`, then  `d.__get__(obj)` is invoked according to the precedence rules listed below.

The details of invocation depend on whether `obj` is an object or a class.

For objects, the machinery is in `object.__getattribute__()` which transforms `obj.d` into `type(obj).__dict__['d'].__get__(obj, type(obj))`. 

The implementation works through a precedence chain that gives data descriptors priority over instance variables, instance variables priority over non-data descriptors, and assigns lowest priority to `__getattr__()` if provided.

For classes, the machinery is in `type.__getattribute__()` which transforms `AClass.d` into `AClass.__dict__['d'].__get__(None, AClass)`.

The important points to remember are:

- descriptors are invoked by the `__getattribute__()` method.
- overriding `__getattribute__()` prevents automatic descriptor calls, unless the override delegates to the default implementation.
- `object.__getattribute__()` and `type.__getattribute__()` make different calls to `__get__()`.
- data descriptors always override instance dictionaries.
- non-data descriptors may be overridden by instance dictionaries.
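The difference between instance access and class access can be observed directly. In this sketch the hypothetical `Echo` descriptor simply returns the arguments that `__get__()` received, so each assertion compares the dotted lookup with the explicit call described above.

```python
class Echo:
    """Non-data descriptor that returns the arguments passed to __get__."""

    def __get__(self, obj, owner=None):
        return (obj, owner)


class Holder:
    d = Echo()


h = Holder()

# Instance access: obj is the instance, owner is its class.
assert h.d == (h, Holder)
assert h.d == type(h).__dict__["d"].__get__(h, type(h))

# Class access: obj is None, owner is the class itself.
assert Holder.d == (None, Holder)
assert Holder.d == Holder.__dict__["d"].__get__(None, Holder)

print("instance access:", h.d)
print("class access:", Holder.d)
```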

The object returned by `super()` also has a custom `__getattribute__()` method for invoking descriptors. 

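If a class `C` inherits from `B` and `A` (`class C(B, A): ...`) and `obj` is an instance of `C`, the lookup `super(B, obj).m` searches `obj.__class__.__mro__` for the base class `A` immediately following `B`, finds `m` in `A.__dict__`, and binds it to `obj`. In CPython this amounts to returning `A.__dict__['m'].__get__(obj, type(obj))`.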

If it is not a descriptor, `m` is returned unchanged. If not in the dictionary, `m` reverts to a search using `object.__getattribute__()`.
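A small experiment makes the `super()` lookup concrete. The sketch below reuses the idea of a descriptor that reports its `__get__()` arguments; the `Echo` class and its `label` parameter are hypothetical. The assertions reflect CPython's behavior, where the owner passed to `__get__()` is the type of `obj`.

```python
class Echo:
    """Non-data descriptor that reports where it was found and how it was called."""

    def __init__(self, label):
        self.label = label

    def __get__(self, obj, owner=None):
        return (self.label, obj, owner)


class A:
    m = Echo("A.m")


class B:
    m = Echo("B.m")


class C(B, A):
    pass


c = C()
assert C.__mro__ == (C, B, A, object)

# A normal lookup finds B.m first, because B precedes A in the MRO.
assert c.m == ("B.m", c, C)

# super(B, c) starts searching after B, so it finds A.m and binds it to c.
assert super(B, c).m == ("A.m", c, C)
assert super(B, c).m == A.__dict__["m"].__get__(c, type(c))
```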

The details above show that the mechanism for descriptors is embedded in the `__getattribute__()` methods for `object`, `type`, and `super()`.

Classes inherit this machinery when they derive from `object` (as every class does in Python 3) or if they have a metaclass providing similar functionality.

Likewise, classes can turn off descriptor invocation by overriding `__getattribute__()`.

## Python Features implemented by descriptors

The built-ins `classmethod()`, `staticmethod()`, and `property()`, as well as the standard-library decorator `functools.cached_property()`, are all implemented as descriptors.
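To illustrate the caching use case, here is a simplified sketch of a lazily computed attribute written as a non-data descriptor. `LazyAttribute` and `Circle` are hypothetical names, and this is not the actual implementation of `functools.cached_property()`; it only relies on the rule that an instance dictionary entry takes precedence over a non-data descriptor.

```python
import math


class LazyAttribute:
    """Non-data descriptor that computes a value once per instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        value = self.func(obj)
        # Store the result in the instance dictionary. With no __set__ defined,
        # that entry shadows this descriptor on every later access.
        obj.__dict__[self.name] = value
        return value


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.computations = 0

    @LazyAttribute
    def area(self):
        self.computations += 1
        return math.pi * self.radius ** 2


circle = Circle(2)
first = circle.area   # computed by LazyAttribute.__get__
second = circle.area  # read directly from circle.__dict__
print(first == second, circle.computations)
```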

A property attribute is one that triggers method calls when accessed. Property is implemented as a data descriptor.  
A property has an easier interface than a descriptor and a different abstraction. The methods are typically defined in the same class in which the attribute resides. 

There are two ways to link methods to a property: 

1. using the `property(fget=None, fset=None, fdel=None, doc=None)` built-in function; 
2. using decorators `@property`, `@x.setter` and `@x.deleter` where x is the name of the property and also the name of the methods so decorated. 

`property()` returns a property object that implements the descriptor protocol. It uses the parameters `fget`, `fset` and `fdel` for the actual implementation of the three corresponding methods of the protocol.
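Both ways of linking methods to a property are sketched below on a hypothetical `Celsius` class; the attribute names and the validation rule are illustrative assumptions.

```python
class Celsius:
    def __init__(self, degrees):
        self._degrees = degrees

    # Way 2 from the list above: decorators.
    @property
    def degrees(self):
        return self._degrees

    @degrees.setter
    def degrees(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._degrees = value

    # Way 1 from the list above: an explicit property() call.
    def _get_fahrenheit(self):
        return self._degrees * 9 / 5 + 32

    def _set_fahrenheit(self, value):
        self.degrees = (value - 32) * 5 / 9

    fahrenheit = property(_get_fahrenheit, _set_fahrenheit,
                          doc="Temperature in degrees Fahrenheit.")


temp = Celsius(100)
print(temp.fahrenheit)
temp.fahrenheit = 32
print(temp.degrees)

# A property object lives in the class dictionary and defines __set__,
# which is what makes it a data descriptor.
print(hasattr(Celsius.__dict__["degrees"], "__set__"))
```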

## Python Descriptors in Methods and Functions

The magic that transforms an `obj.method(*args)` call into `method(obj, *args)` lives in the `__get__()` implementation of the function object, which is in fact a non-data descriptor.

In particular, the function object implements `__get__()` so that it returns a bound method when you access it with dot notation. The `(*args)` that follows then calls that bound method, which passes `obj` as the first argument ahead of the arguments you supplied.

Dot access works the same way for class methods and static methods; only the result of `__get__()` differs. If you call a static method with `obj.method(*args)`, the call becomes `method(*args)`. Similarly, if you call a class method with `obj.method(*args)`, the call becomes `method(type(obj), *args)`.
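The binding step can be reproduced by hand. In this sketch (the `Greeter` class is a hypothetical example) the plain function is fetched from the class dictionary and its `__get__()` is called explicitly, which yields the same bound method that dot notation produces.

```python
import types


class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return f"{greeting}, {self.name}!"


g = Greeter("Ada")

# The class dictionary stores a plain function, which is a non-data descriptor.
func = Greeter.__dict__["greet"]
assert isinstance(func, types.FunctionType)
assert hasattr(func, "__get__") and not hasattr(func, "__set__")

# Calling __get__ by hand produces a bound method, just as g.greet does.
bound = func.__get__(g, Greeter)
assert isinstance(bound, types.MethodType)
assert bound.__self__ is g and bound.__func__ is func

# All three spellings make the same call.
assert g.greet("Hello") == bound("Hello") == func(g, "Hello")
print(bound("Hello"))
```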

In the official docs, you can find some examples of how `staticmethod()` and `classmethod()` would be implemented if they were written in pure Python instead of the actual C implementation. 

Static methods return the underlying function without changes. Calling either `obj.f` or `class.f` is the equivalent of a direct lookup into `object.__getattribute__(obj, "f")` or `object.__getattribute__(class, "f")`. As a result, the function becomes identically accessible from either an object or a class.

For instance, a possible `staticmethod()` implementation could be this:

```python
class StaticMethod:
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        # Return the underlying function unchanged: nothing is bound.
        return self.f

    def __call__(self, *args, **kwds):
        return self.f(*args, **kwds)
```

Once again, a non-data descriptor is used.
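A class method can be sketched in the same spirit. The `ClassMethod` class below is a simplified stand-in, not the actual C implementation of `classmethod()`; `Registry` and `SpecialRegistry` are hypothetical classes used only to exercise it.

```python
import types


class ClassMethod:
    """Simplified pure-Python stand-in for classmethod()."""

    def __init__(self, f):
        self.f = f

    def __get__(self, obj, owner=None):
        if owner is None:
            owner = type(obj)
        # Bind the function to the class rather than to the instance.
        return types.MethodType(self.f, owner)


class Registry:
    prefix = "item"

    @ClassMethod
    def make_label(cls, number):
        return f"{cls.prefix}-{number}"


class SpecialRegistry(Registry):
    prefix = "special"


print(Registry.make_label(1))         # called on the class
print(Registry().make_label(2))       # called on an instance
print(SpecialRegistry.make_label(3))  # cls is the subclass
```

In both call styles the function receives the class as its first argument, which is why the subclass call picks up the overridden `prefix`.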
