# 01 - Descriptors

Descriptors are the mechanism underpinning properties, methods, slots, and even functions!

Suppose we want a `Point2D` class whose coordinates must always be **integers**.

Since plain attributes cannot guarantee this, we typically use `property` with getter and setter methods. After adding multiple properties, we end up with a lot of very similar boilerplate code.
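
To make the boilerplate concrete, here is a minimal sketch of `Point2D` written with `property`. The private names `_x` and `_y` and the choice to raise `ValueError` are assumptions made for illustration.

```python
class Point2D:
    """Point whose coordinates must be integers, enforced with property."""

    def __init__(self, x, y):
        # These assignments go through the setters below.
        self.x = x
        self.y = y

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        # Reject anything that is not an int (an illustrative choice).
        if not isinstance(value, int):
            raise ValueError('x must be an integer')
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        # Same logic again, only the name changes.
        if not isinstance(value, int):
            raise ValueError('y must be an integer')
        self._y = value


p = Point2D(1, 2)
```

Every additional integer coordinate would need another getter and setter pair of the same shape.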

What we would like is a class, e.g. `IntegerValue`, that defines these get and set methods the way we want. We could then use several `IntegerValue` instances within our class and have them **bound** to our instances just like attributes.

The solution is the **descriptor protocol**.

Four main methods make up the protocol, and not all of them are required:
- `__get__` -> `p.x`
- `__set__` -> `p.x = 100`
- `__delete__` -> `del p.x`
- `__set_name__` -> new in Python 3.6 - we'll come back to this later.
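
As a rough illustration of when each hook fires, the sketch below uses a hypothetical `Tracer` descriptor that only records which method was called. The names `Tracer`, `calls`, and `Demo` are invented for this example.

```python
class Tracer:
    """Hypothetical descriptor that records every protocol call."""

    def __init__(self):
        self.calls = []

    def __set_name__(self, owner_class, name):
        # Called once, when the owning class is created.
        self.calls.append(('__set_name__', name))

    def __get__(self, instance, owner_class):
        self.calls.append(('__get__',))
        return self

    def __set__(self, instance, value):
        self.calls.append(('__set__', value))

    def __delete__(self, instance):
        self.calls.append(('__delete__',))


class Demo:
    x = Tracer()


demo = Demo()
tracer = demo.x   # triggers __get__
demo.x = 100      # triggers __set__
del demo.x        # triggers __delete__

print(tracer.calls)
```

The recorded list shows `__set_name__` first (at class creation), followed by one entry for each of the three attribute operations.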

There are two types of descriptors:

- Non-data descriptors: Implement `__get__` **only** (and optionally `__set_name__`).
- Data descriptors: Implement `__set__` and/or `__delete__` (and often `__get__`).

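This distinction affects how Python resolves attribute access: a data descriptor takes precedence over an entry of the same name in the instance `__dict__`, while a non-data descriptor can be shadowed by one.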
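
One concrete consequence can be sketched with two hypothetical descriptors, `NonData` and `Data`: a value placed in the instance `__dict__` hides the non-data descriptor but not the data descriptor.

```python
class NonData:
    def __get__(self, instance, owner_class):
        return 'from descriptor'


class Data:
    def __get__(self, instance, owner_class):
        return 'from descriptor'

    def __set__(self, instance, value):
        # Assignments are ignored; defining __set__ is what makes
        # this a data descriptor.
        pass


class Holder:
    a = NonData()
    b = Data()


h = Holder()
# Write directly into the instance dictionary, bypassing normal assignment.
h.__dict__['a'] = 'from instance'
h.__dict__['b'] = 'from instance'

print(h.a)  # the instance dictionary wins over the non-data descriptor
print(h.b)  # the data descriptor wins over the instance dictionary
```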

#### Example 1

Let's create a simple non-data descriptor (don't worry about `instance` and `owner_class` for now):

In [10]:
from datetime import datetime

class TimeUTC:
    def __get__(self, instance, owner_class):
        return datetime.utcnow().isoformat()

So `TimeUTC` is a class that implements the `__get__` method only, and is therefore considered a non-data descriptor.

We can now use it to create properties in other classes:

In [11]:
class Logger:
    current_time = TimeUTC()

Note that `current_time` is a class attribute:

In [12]:
l = Logger()
l.current_time

'2024-08-05T14:09:20.516021'

This should seem quite odd. We might expect `l.current_time` to simply return (the `repr` of) our `TimeUTC` instance.

Instead it **calls** the `__get__` method.

This also works when we access the class attribute through the class itself:

In [13]:
Logger.current_time

'2024-08-05T14:09:20.875894'

#### Example 2

Let's create a `Deck` class that will return a **random** card from 2 to Ace and a **random** suit from the 4 suits.

Since both attributes, card and suit, are effectively making a random choice from an iterable, we can cut down on the repeated code using a descriptor.

We'll do this example both the traditional and the descriptor way to demonstrate their similarities/differences.

##### Traditional

In [14]:
from random import seed, choice

In [22]:
seed(0)

class Deck:
    @property
    def card(self):
        return choice(tuple('23456789JQKA') + ('10',))

    @property
    def suit(self):
        return choice(('Spade', 'Heart', 'Diamond', 'Club'))

d = Deck()

for _ in range(5):
    print(d.card, d.suit)

8 Club
2 Diamond
J Club
8 Diamond
9 Diamond


##### Non-data Descriptor

In [24]:
seed(0)

class Choice:
    def __init__(self, *choices):
        self.choices = choices

    def __get__(self, instance, owner_class):
        return choice(self.choices)

class Deck:
    card = Choice(*'23456789JQKA', '10')
    suit = Choice('Spade', 'Heart', 'Diamond', 'Club')

d = Deck()

for _ in range(5):
    print(d.card, d.suit)

8 Club
2 Diamond
J Club
8 Diamond
9 Diamond


So **non-data descriptors** behave much like a read-only `property`, without a lot of the boilerplate.

Descriptors are therefore very useful if we have several attributes with very similar logic. The end result is a lot less code in the target class.

# 02 - Getters and Setters

In the previous subsection, we saw two ways of accessing the attribute: through an instance and through the class.

#### The `__get__` method

In [29]:
class TimeUTC:
    def __get__(self, instance, owner_class):
        return datetime.utcnow().isoformat()

class Logger:
    current_time = TimeUTC()

In [30]:
l = Logger()  # create an instance of the class defined above
l.current_time, Logger.current_time

('2024-08-05T14:59:26.162925', '2024-08-05T14:59:26.162925')

When `__get__` is called, we may want to know:

- which **instance** was used or `None` if called from the class.
- what class owns the `TimeUTC` (descriptor) instance. In our case, it belongs to the `Logger` class.

This is why we have the **signature**: `__get__(self, instance, owner_class)`

(Note: A **signature** refers to the structure and components of a function or method definition, detailing how it can be called.)

These arguments are passed into the `__get__` method when it's called. This means we can control the return value based on whether it was:
- called from the class.
- called from the instance.
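
To see what `instance` and `owner_class` actually receive, here is a small sketch with a hypothetical `Inspect` descriptor that simply returns its arguments. The class name `Service` is also made up for the example.

```python
class Inspect:
    """Hypothetical descriptor that returns the arguments __get__ receives."""

    def __get__(self, instance, owner_class):
        return instance, owner_class


class Service:
    info = Inspect()


service = Service()

# Accessed from an instance: `instance` is that object.
from_instance = service.info
# Accessed from the class: `instance` is None.
from_class = Service.info

print(from_instance)
print(from_class)
```

In both cases `owner_class` is `Service`; only `instance` differs.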

Very often, we choose to:
- return the descriptor `TimeUTC` **instance** when called from the **class** (`Logger`). This gives us an easy handle to the descriptor instance.
- return the attribute **value** (`datetime.utcnow().isoformat()`) when called from an instance of the class (`l`).

In [31]:
class TimeUTC:
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return datetime.utcnow().isoformat()

class Logger:
    current_time = TimeUTC()

In [32]:
l = Logger()  # create an instance of the class defined above
l.current_time, Logger.current_time

('2024-08-05T14:59:28.312074', <__main__.TimeUTC at 0x2a476fbb6a0>)

Returning the descriptor instance when called from the class is consistent with how `property` works:

In [33]:
class Logger:
    @property
    def current_time(self):
        return datetime.utcnow().isoformat()

Logger.current_time

<property at 0x2a4771dd530>

So `property` **implements** the **descriptor protocol**. It might be a little easier to see if it's not used as a decorator:

In [37]:
class Logger:
    def current_time(self):
        return datetime.utcnow().isoformat()

    current_time = property(fget=current_time)

l = Logger()

Logger.current_time, l.current_time

(<property at 0x2a4771fc900>, '2024-08-05T15:32:03.094962')
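
One way to check this, as a sketch: `property` objects expose the protocol methods themselves, and we can call `__get__` by hand on the object stored in the class dictionary. A hypothetical `Circle` class with a deterministic property is used so the results are easy to compare.

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        return 2 * self.radius


c = Circle(3)

# The raw property object lives in the class dictionary.
prop = Circle.__dict__['diameter']

# property defines all three accessor methods of the protocol.
has_protocol = all(
    hasattr(prop, name) for name in ('__get__', '__set__', '__delete__')
)

# Calling __get__ by hand mirrors what attribute access does.
manual_value = prop.__get__(c, Circle)
from_class = prop.__get__(None, Circle)

print(has_protocol, manual_value, from_class is prop)
```

Because `property` always defines `__set__`, it is a data descriptor even when no setter is supplied; assigning to such a property raises `AttributeError`.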

#### Caveat

There's an important caveat with these descriptors, although we'll soon see that it is less of an issue than it first appears.

You'll notice that these descriptors are class attributes: `current_time = TimeUTC()`.

Since only one instance of `TimeUTC` is being made, all `Logger` instances will share this one instance. 

For this particular example, where the returned value doesn't depend on the instance, there's no issue.

But what if we want to "store" and "retrieve" instance-specific data using `__set__`? After all, setting a value should be specific to the instance.

This turns out to be solvable, because both the `__get__` and `__set__` methods receive the `instance` and we can use this information to store instance-specific data.

First, let's demonstrate the issue with an example where the state lives on the descriptor itself:

In [55]:
class Countdown:
    'non-data descriptor'
    def __init__(self, start):
        self.start = start + 1

    def __get__(self, instance, owner):
        if instance is None:
            return self
        self.start -= 1
        return self.start

In [56]:
class Rocket:
    countdown = Countdown(10)

In [57]:
rocket1 = Rocket()
rocket2 = Rocket()

In [58]:
rocket1.countdown, rocket1.countdown, rocket1.countdown, rocket1.countdown, rocket1.countdown,

(10, 9, 8, 7, 6)

In [59]:
rocket2.countdown

5
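
For contrast, here is a minimal sketch of how `instance` can be used to keep the state per object instead of on the shared descriptor. It assumes instances have a writable `__dict__` and stores the counter under a hypothetical key `_countdown`; it is only one possible approach and comes with its own trade-offs.

```python
class Countdown:
    """Non-data descriptor that keeps a separate counter per instance."""

    def __init__(self, start):
        self.start = start

    def __get__(self, instance, owner):
        if instance is None:
            return self
        # Read this instance's counter, falling back to the start value.
        current = instance.__dict__.get('_countdown', self.start)
        # Store the decremented value on the instance, not on the descriptor.
        instance.__dict__['_countdown'] = current - 1
        return current


class Rocket:
    countdown = Countdown(10)


rocket1 = Rocket()
rocket2 = Rocket()

first = [rocket1.countdown for _ in range(5)]
second = rocket2.countdown

print(first, second)
```

Here `rocket2` starts its own countdown at 10, regardless of how many times `rocket1.countdown` was read.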

#### The `__set__` method

The signature is as follows: `__set__(self, instance, value)`
- `self`: this references the descriptor instance, just like we had for the `__get__` example (e.g. `TimeUTC()`).
- `instance`: the instance that the descriptor is *bound* to. Unlike in `__get__`, this is never `None`, because `__set__` is only invoked for assignments made through an instance.
- `value`: the value we want to assign to the attribute.

Why is there no `owner_class` like we have in `__get__`?

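Setters (and deleters) are **always** called from instances, so `instance` is always available and the owner class can be obtained from it if needed. Assigning through the class instead simply rebinds the class attribute and replaces the descriptor, without calling `__set__`.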

In [60]:
class IntegerValue:    
    def __set__(self, instance, value):
        print(f'__set__ called, instance={instance}, value={value}')

class Point2D:
    x = IntegerValue()

p = Point2D()
p.x = 100

__set__ called, instance=<__main__.Point2D object at 0x000002A4773ACFA0>, value=100
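
To check that `__set__` is only triggered by assignment through an instance, the sketch below records each call. The `calls` list is an addition for illustration and is not part of the example above.

```python
class IntegerValue:
    calls = []  # records every value passed to __set__ (illustrative only)

    def __set__(self, instance, value):
        IntegerValue.calls.append(value)


class Point2D:
    x = IntegerValue()


p = Point2D()
p.x = 100         # goes through IntegerValue.__set__

Point2D.x = 200   # rebinds the class attribute; __set__ is not called

print(IntegerValue.calls, Point2D.__dict__['x'])
```

After the second assignment, `Point2D.x` is the plain integer `200` and the descriptor is gone from the class.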


The reason I haven't elaborated on how to actually set the value for the instance is that it's *not* straightforward.

Currently, we are suffering from the caveat outlined earlier where different instances are sharing the same `IntegerValue` instance.

We might naively think to set the value on the instance like `instance.x = value`, but inside the descriptor we don't have access to the symbol `x`. So what symbol would we use? There are plenty of other issues to consider.
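
One of those issues can be sketched directly: even if the descriptor knew the name, assigning `instance.x = value` inside `__set__` would be routed back to `__set__` again. The hard-coded name `x` below is an assumption made purely for the demonstration.

```python
class IntegerValue:
    def __set__(self, instance, value):
        # Naive attempt: this assignment is itself handled by __set__,
        # so the method keeps calling itself.
        instance.x = value


class Point2D:
    x = IntegerValue()


p = Point2D()

try:
    p.x = 100
except RecursionError:
    outcome = 'RecursionError'
else:
    outcome = 'ok'

print(outcome)
```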

The next subsection will explore the various solutions.

# 03 - Using as Instance Properties

# 04 - Strong and Weak References

# 05 - Back to Instance Properties

# 06 - The `__set_name__` Method

# 07 - Property Lookup Resolution

# 08 - Properties and Descriptors

# 09 - Application - Example 1

# 10 - Application - Example 2

# 11 - Functions and Descriptors