### Descriptors

As we discussed earlier, Python instance properties (using `@property` for example) are based on Python descriptors which are simply classes that implement the **descriptor protocol**.

The protocol consists of the following special methods (not all are required):
- `__get__`: used to retrieve the property value
- `__set__`: used to store the property value (we'll see where we can do this in a bit)
- `__delete__`: used to delete a property from the instance (not to be confused with the `__del__` finalizer)
- `__set_name__`: new in Python 3.6, we can use this to capture the property name as it is being defined in the owner class (the class where the property is defined).

There are two types of descriptors we need to distinguish, as I explain in the video lecture:
- non-data descriptors: these are descriptors that only implement `__get__` (and optionally `__set_name__`)
- data descriptors: these implement the `__set__` method, and normally, also the `__get__` method.

As we'll see in a bit, functions in Python actually implement the (non-data) descriptor protocol as well - indeed that is how instance methods work!
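
As a rough illustration, we can classify an object by checking which protocol methods its type defines. The helper `descriptor_kind` below is a hypothetical name introduced just for this sketch, not something from the standard library. Note that Python treats a descriptor as a data descriptor if it defines `__set__` or `__delete__`.

```python
def descriptor_kind(obj):
    """Hypothetical helper: classify obj by the descriptor methods its type defines."""
    cls = type(obj)
    has_get = hasattr(cls, '__get__')
    has_set_or_delete = hasattr(cls, '__set__') or hasattr(cls, '__delete__')
    if has_set_or_delete:
        return 'data descriptor'
    if has_get:
        return 'non-data descriptor'
    return 'not a descriptor'


def plain_function():
    pass


kinds = (
    descriptor_kind(plain_function),  # functions only define __get__
    descriptor_kind(property()),      # property defines __get__, __set__ and __delete__
    descriptor_kind(42),              # int defines none of them
)
print(kinds)
```

So a plain function is reported as a non-data descriptor, while a `property` object is a data descriptor.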

Let's start with a quick example first:

In [13]:
from datetime import datetime

class TimeUTC:
    def __get__(self, instance, owner_class):
        return datetime.utcnow().isoformat()

So `TimeUTC` is a **non-data descriptor** since it only implements `__get__`.

In [14]:
class Logger:
    current_time = TimeUTC()

In [15]:
l = Logger()

In [16]:
l.current_time

'2019-03-13T18:59:49.435411'

As we discussed in the lecture, the `__get__` method will know what instance (if any) was used to call it, as well as the class that owns the instance of `TimeUTC` (the descriptor instance).

This information is passed to `__get__` when it gets called:

In [18]:
class TimeUTC:
    def __get__(self, instance, owner_class):
        print('__get__ called', instance, owner_class)
        return datetime.utcnow().isoformat()

In [19]:
class Logger:
    current_time = TimeUTC()

When accessing the `current_time` attribute from the class:

In [20]:
Logger.current_time

__get__ called None <class '__main__.Logger'>


'2019-03-13T19:17:25.700536'

and as you can see, `instance` was `None`, while `owner_class` is the `Logger` class defined in our global scope (not an instance of `Logger`, but the class itself)

Now let's create an instance of the `Logger` class:

In [21]:
l = Logger()

In [23]:
hex(id(l))

'0x10a93a860'

and call `current_time` from the instance:

In [22]:
l.current_time

__get__ called <__main__.Logger object at 0x10a93a860> <class '__main__.Logger'>


'2019-03-13T19:19:09.859898'

and as you can see here, `instance` was the instance `l`, and `owner_class` is still the same `Logger` class.

Often we choose to return the descriptor instance when called from the class, like so:

In [24]:
class TimeUTC:
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        return datetime.utcnow().isoformat()

In [25]:
class Logger:
    current_time = TimeUTC()

So now we have:

In [26]:
Logger.current_time

<__main__.TimeUTC at 0x10a93ab00>

the instance of the `TimeUTC` class, but when called from an instance:

In [27]:
l = Logger()
l.current_time

'2019-03-13T20:03:39.600736'

we get the time string returned instead.
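
One caveat with the `if not instance` test: it relies on the truthiness of the instance, so an instance that happens to be falsy (for example because its class defines `__len__` or `__bool__`) would be mistaken for "called from the class". Comparing with `is None` avoids that. Here is a minimal sketch, where `StrictCheck`, `TruthyCheck` and `Bag` are made-up names used only for illustration:

```python
class StrictCheck:
    def __get__(self, instance, owner_class):
        # only treat the access as class-level when instance really is None
        if instance is None:
            return self
        return 'value from instance'


class TruthyCheck:
    def __get__(self, instance, owner_class):
        # a falsy instance also ends up in this branch
        if not instance:
            return self
        return 'value from instance'


class Bag:
    strict = StrictCheck()
    loose = TruthyCheck()

    def __len__(self):
        # an "empty" bag is falsy
        return 0


b = Bag()
strict_result = b.strict
loose_returned_descriptor = isinstance(b.loose, TruthyCheck)
print(strict_result, loose_returned_descriptor)
```

With the `is None` check the instance access returns the value, while the truthiness check hands back the descriptor itself even though we accessed it from an instance.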

Looking at data descriptors now, we're going to implement a `__set__` method.

We'll need to decide where to store the attribute value - naively we'll store it directly in the descriptor itself (spoiler alert - that won't work the way we probably want it to work):

In [44]:
class IntegerValue:
    
    def __set__(self, instance, value):
        self.value = int(value)
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        return self.value

In [45]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [46]:
p1 = Point2D()

In [47]:
p1.x = 100.1

In [48]:
p1.x

100

Ok, that works...

In [49]:
p1.y = 200.2

In [50]:
p1.y

200

That seems to work too...

But now let's create a second point:

In [51]:
p2 = Point2D()

In [52]:
p2.x = 1.1

In [53]:
p2.x

1

In [54]:
p1.x

1

So, although we were aiming to modify the `x` value on `p2` we ended up modifying it on `p1` as well - this is because both `p1` and `p2` share the same **class level** instance of `IntegerValue`.

As you can see, when we "store" data we need to be mindful of the **instance** we are storing the data for. If we just store the data in the descriptor instance, then since all instances of our class (`Point2D`) share the same descriptor instance, we are essentially working with a **class** level property, not an instance property (unlike `@property`, which creates **instance** properties).
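
We can make the sharing explicit with a small sketch. The names `SharedInteger` and `Pair` are placeholders for this illustration; the descriptor uses the same naive storage approach as above.

```python
class SharedInteger:
    # naive approach: the value lives on the descriptor instance itself
    def __set__(self, instance, value):
        self.value = int(value)

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return self.value


class Pair:
    x = SharedInteger()


a = Pair()
b = Pair()
a.x = 1
b.x = 2

# both instances reach the very same descriptor object through their class
same_descriptor = type(a).__dict__['x'] is type(b).__dict__['x']
values = (a.x, b.x)
instance_dicts = (a.__dict__, b.__dict__)
print(same_descriptor, values, instance_dicts)
```

Both instance dictionaries stay empty, and the last value written wins for every instance.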

Since we know the instance we are dealing with in both the `__get__` and `__set__` methods, we could easily use the instance dictionary to store the attribute value:

In [63]:
class IntegerValue:
    
    def __set__(self, instance, value):
        instance.value = int(value)
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        return instance.value

In [64]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [65]:
p1 = Point2D()
p2 = Point2D()

In [69]:
p1.x = 1.1
p2.x = 10.1

In [70]:
p1.x, p2.x

(1, 10)

And let's see what's in our instance dictionaries:

In [73]:
p1.__dict__

{'value': 1}

As you can see we used `value` to store the value for `x` in that instance.

Where's the value for `y` going to get stored?

Yeah, in `value` as well!

In [74]:
p1 = Point2D()
p1.x = 10.1

In [75]:
p1.__dict__

{'value': 10}

In [77]:
p1.y = 20.2

In [78]:
p1.__dict__

{'value': 20}

In [79]:
p1.y

20

that looks good, but:

In [81]:
p1.x

20

That's not so good - we overwrote the value for `x`.

Ok, no big deal, we just have to give a different name to the dictionary key...

How do we modify this code so that every instance of `IntegerValue` uses a different symbol for storage?? 

In [82]:
class IntegerValue:
    
    def __set__(self, instance, value):
        instance.value = int(value)
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        return instance.value
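
One possible answer, shown here only as a sketch and not necessarily the approach we'll end up with, is to tell each descriptor instance which key to use when we create it. The class `NamedIntegerValue` and the `storage_name` parameter are assumptions made up for this example:

```python
class NamedIntegerValue:
    def __init__(self, storage_name):
        # hypothetical parameter: the key to use in the instance dictionary
        self.storage_name = storage_name

    def __set__(self, instance, value):
        instance.__dict__[self.storage_name] = int(value)

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__.get(self.storage_name)


class NamedPoint2D:
    x = NamedIntegerValue('_x')
    y = NamedIntegerValue('_y')


pt = NamedPoint2D()
pt.x = 10.1
pt.y = 20.2
print(pt.x, pt.y, pt.__dict__)
```

The downside is that we have to repeat a name for every attribute, and we still depend on the instance having a `__dict__`.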

Even if we figure out a way to do this, or if we only used a single `IntegerValue` in our class:

In [83]:
class Point1D:
    x = IntegerValue()

In [88]:
p1 = Point1D()
p1.x = 100.1

In [89]:
p1.x

100

In [90]:
p2 = Point1D()
p2.x = 200.2

In [91]:
p2.x

200

In [92]:
p1.x

100

That works, but what if we had defined our `Point1D` class using slots?

In [94]:
class Point1D:
    __slots__ = 'origin', 
    x = IntegerValue()
    
    def __init__(self, origin):
        self.origin = origin

In [95]:
p1 = Point1D(0)

In [96]:
p1.x = 100.1

AttributeError: 'Point1D' object has no attribute 'value'

Right, we cannot assign to `value` because it is not defined in `__slots__` and we don't have a `__dict__` available.

We could solve this in one of two ways: add `value` to slots, or add `__dict__` to slots:

In [97]:
class Point1D:
    __slots__ = 'origin', 'value'
    x = IntegerValue()
    
    def __init__(self, origin):
        self.origin = origin
    

In [99]:
p = Point1D(0)
p.x = 100.1

In [100]:
p.x

100

In [101]:
class Point1D:
    __slots__ = 'origin', '__dict__'
    x = IntegerValue()
    
    def __init__(self, origin):
        self.origin = origin

In [102]:
p1 = Point1D(0)
p1.x = 100.1
p1.x

100

But this is really not a very user-friendly thing to do to users of our data descriptor!

There are some other ways of doing this, where we do not use the instances to store the data but instead store the values somewhere else (and retrieve them from that same place).

We could try to use a dictionary for that - assuming our instances are hashable:

In [103]:
class IntegerValue:
    def __init__(self):
        self.data = {}
        
    def __set__(self, instance, value):
        self.data[instance] = int(value)
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        return self.data.get(instance)

In [104]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [105]:
p1 = Point2D()
p2 = Point2D()

In [106]:
p1.x = 10.1
p1.y = 20.2
p2.x = 100.1
p2.y = 200.2

In [107]:
p1.x, p1.y

(10, 20)

In [108]:
p2.x, p2.y

(100, 200)

But we have a potential for memory leaks!

Let's look at this closer.

In [129]:
p = Point2D()
p.x = 12345

So we have a number of objects at play here:

- the `p` instance 
- the instance of `IntegerValue` we are using for the `x` property

In [130]:
p_id = id(p)
x_attrib_id = id(Point2D.x)

In [131]:
hex(p_id), hex(x_attrib_id)

('0x10a95cc18', '0x10a95a588')

Now, let's delete `p` - this should remove all the references to that object, so it should get garbage collected (through reference counting)

In [132]:
del p

`p` is no longer in our global namespace:

In [133]:
'p' in globals()

False

But what about the data descriptor instance? It **still** has a reference to that object as a **key** in its dictionary!

In [134]:
x_data_descriptor = Point2D.x

In [135]:
x_data_descriptor.data

{<__main__.Point2D at 0x10a95aeb8>: 10,
 <__main__.Point2D at 0x10a95acf8>: 100,
 <__main__.Point2D at 0x10a95c438>: 100,
 <__main__.Point2D at 0x10a95cc18>: 12345}

As you can see we have an entry for our "deleted" point still in that dictionary! The object still exists, and we have a memory leak.
Let's make sure the object actually still exists:

In [137]:
point = list(x_data_descriptor.data.keys())[-1]

In [138]:
point.x

12345

and in fact, the `id` will match our original `id`:

In [139]:
id(point), p_id

(4472556568, 4472556568)

### Weak References

We know that with reference counting, objects are garbage collected only when all references to them are gone. This is the problem we just saw.

What we need is a way to hold a reference to an object without "affecting" the reference count - or at least letting Python know that although we have a reference to an object, we don't want our reference to "count" in its reference counting.

This is called a **weak reference** - as opposed to a normal or **strong** reference.

In [321]:
class Person:
    def __init__(self, name):
        self.name = name

In [337]:
p = Person('Guido')

Now we can read and write the `name` property of `p`:

In [338]:
p.name

'Guido'

In [339]:
p.name = 'Alex'

In [340]:
p.name

'Alex'

We can set up a second (strong) reference to our object `p`:

In [341]:
p2 = p

In [342]:
id(p2), id(p)

(4472791224, 4472791224)

And now, even though we delete `p`:

In [343]:
del p

In [344]:
'p' in globals()

False

The object still exists, as it is referenced by `p2`:

In [345]:
p2.name

'Alex'

Let's try creating a weak reference instead:

In [346]:
import weakref

p = Person('Raymond')
p2 = weakref.ref(p)

In [349]:
hex(id(p)), hex(id(p2))

('0x10a996a20', '0x10a995ae8')

Ah, not the same id!

That's because `p2` is a separate weakref object. It holds a (weak) reference to the object `p` refers to, so it points (indirectly) to that same object:

In [348]:
p2

<weakref at 0x10a995ae8; to 'Person' at 0x10a996a20>

We can get the original object back by calling the value returned from `ref`:

In [353]:
hex(id(p2()))

'0x10a996a20'

In [354]:
p2().name

'Raymond'

So now we have two references to the `Person` instance.

What's the reference count?

In [355]:
import ctypes

ctypes.c_long.from_address(id(p)).value

1

We can get the number of weak references, using the `getweakrefcount` from the `weakref` module:

In [360]:
weakref.getweakrefcount(p)

1

Now let's delete our original strong reference (`p`) - that should leave us with only a single weak reference:

In [361]:
del p

In [362]:
'p' in globals()

False

What happened to our weak reference?

In [365]:
p2, p2()

(<weakref at 0x10a995ae8; dead>, None)

As you can see, the original object is `dead` - concise terminology I guess :-), and calling `p2()` returns `None`.

Using the strong reference count is not accurate since we don't know what's stored at that memory address anymore. But the weak reference tells us that the original object is now gone. (Note that `dead` really means the object is no longer usable - because of the non-deterministic nature of the garbage collector, the actual object may hang around for a while until it is destroyed. From our viewpoint though, the object is gone, and the memory it was holding on to will, eventually, be released.)

This can be really useful for avoiding memory leaks in our data descriptors - instead of using a dictionary that contains strong references to our instances as the keys, we can use weak references instead - that way if the original object goes away (the instance that contains the property value), we can be assured that the object will be garbage collected - so no memory leak.

Not every object in Python supports weak references. Dictionaries do not for example, neither do tuples or ints. But custom classes do, and that's really all we're interested in here. Just be aware of that, and read up on `weakref` in the Python docs if you want more info.
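
A quick way to check this for ourselves is to try creating a weak reference and catch the `TypeError`. The helper `supports_weakref` and the small classes below are hypothetical names for this sketch. Note that a custom class using `__slots__` only supports weak references if `'__weakref__'` is listed in its slots.

```python
import weakref


class Plain:
    pass


class Slotted:
    __slots__ = ('value',)


class SlottedWeak:
    __slots__ = ('value', '__weakref__')


def supports_weakref(obj):
    """Hypothetical helper: True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


results = {
    'dict': supports_weakref({}),
    'tuple': supports_weakref(()),
    'int': supports_weakref(10),
    'plain class': supports_weakref(Plain()),
    'slots without __weakref__': supports_weakref(Slotted()),
    'slots with __weakref__': supports_weakref(SlottedWeak()),
}
print(results)
```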

There is a lot of functionality in the `weakref` module, more than we really need for this course, but there is one more I want to discuss: the `WeakKeyDictionary`.

It works like a standard dictionary, but all the keys are stored as weak references to the key object, instead of strong references we would have with a standard dictionary.

In [419]:
p1 = Person('Guido')
p2 = Person('Raymond')
p3 = Person('Mark')
p4 = Person('Alex')

In [420]:
d = weakref.WeakKeyDictionary()
d[p1] = 'Guido'
d[p2] = 'Raymond'
d[p3] = 'Mark'
d[p4] = 'Alex'

id_p1 = id(p1)
print(hex(id_p1))

0x10a9ab400


In [421]:
ctypes.c_long.from_address(id_p1).value

1

In [422]:
weakref.getweakrefcount(p1)

1

In [423]:
list(d.keyrefs())

[<weakref at 0x10a97ab38; to 'Person' at 0x10a9ab400>,
 <weakref at 0x10a97a9f8; to 'Person' at 0x10a9ab5c0>,
 <weakref at 0x10a97a098; to 'Person' at 0x10a9ab630>,
 <weakref at 0x10a97a868; to 'Person' at 0x10a9ab668>]

In [424]:
hex(id(p1))

'0x10a9ab400'

Now, what happens if we destroy `p1` for example, by removing the only strong reference we have to it:

In [425]:
del p1

In [426]:
list(d.keyrefs())

[<weakref at 0x10a97a9f8; to 'Person' at 0x10a9ab5c0>,
 <weakref at 0x10a97a098; to 'Person' at 0x10a9ab630>,
 <weakref at 0x10a97a868; to 'Person' at 0x10a9ab668>]

As you can see, `p1` has been removed from the `WeakKeyDictionary` automatically. That's handy too.

We can access elements in the `WeakKeyDictionary` using the original objects:

In [427]:
d[p2]

'Raymond'

In summary, it is very difficult to do low-level operations like reference counting and so on in Python - it was not really built to expose these inner workings to us, the Python developer. Just understand the difference between weak and strong references, and trust Python to do its memory management correctly - of course you need to be aware of the memory leak traps like the one we encountered with data descriptors.

### Data Descriptors and Weak References

Now that we understand weak references and WeakKeyDictionaries, let's go back to our data descriptor example, and use weak references to ensure we have no memory leaks.

In [429]:
from weakref import WeakKeyDictionary

class IntegerValue:
    def __init__(self):
        self.data = WeakKeyDictionary()
        
    def __set__(self, instance, value):
        self.data[instance] = int(value)
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        return self.data.get(instance)

In [430]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [431]:
p1 = Point2D()
p2 = Point2D()

p1.x = 10.1
p1.y = 20.2

p2.x = 100.1
p2.y = 200.2

In [433]:
p1.x, p1.y

(10, 20)

In [434]:
p2.x, p2.y

(100, 200)

In [436]:
list(Point2D.x.data.keyrefs())

[<weakref at 0x10a990598; to 'Point2D' at 0x10a9abe48>,
 <weakref at 0x10a9a0458; to 'Point2D' at 0x10a9abe10>]

In [438]:
list(Point2D.y.data.keyrefs())

[<weakref at 0x10a9a02c8; to 'Point2D' at 0x10a9abe48>,
 <weakref at 0x10a9a06d8; to 'Point2D' at 0x10a9abe10>]

And now, if we delete one of our points:

In [439]:
del p1

In [441]:
list(Point2D.x.data.keyrefs())

[<weakref at 0x10a9a0458; to 'Point2D' at 0x10a9abe10>]

In [442]:
list(Point2D.y.data.keyrefs())

[<weakref at 0x10a9a06d8; to 'Point2D' at 0x10a9abe10>]

As you can observe our weak dict no longer has weak references to the point `p1` - because the object `p1` was referencing was garbage collected, and this was picked up by the weak dict.

So, this technique will work well, but there is still one slight issue.

Our `Point2D` class was hashable - but what happens if it is not?

In [448]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()
        
    def __eq__(self, other):
        if isinstance(other, Point2D):
            return self.x == other.x and self.y == other.y
        return False

In [449]:
p1 = Point2D()

In [450]:
p1.x = 10.1

TypeError: unhashable type: 'Point2D'

The problem is that we are trying to make a non-hashable object a key in a dictionary - and that obviously cannot work, not even with weak key dictionaries!
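
Why did adding `__eq__` make the class unhashable? When a class defines `__eq__` without also defining `__hash__`, Python sets its `__hash__` to `None`. A short sketch (the class names `WithEq` and `WithoutEq` are placeholders):

```python
class WithoutEq:
    pass


class WithEq:
    def __eq__(self, other):
        return isinstance(other, WithEq)


# the default identity-based hash is still available here
default_hash_works = isinstance(hash(WithoutEq()), int)

# defining __eq__ alone removes the inherited __hash__
hash_attribute = WithEq.__hash__

try:
    hash(WithEq())
    hashable = True
except TypeError:
    hashable = False

print(default_hash_works, hash_attribute, hashable)
```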

So, we'll need some kind of work-around.

Consider what we need in order to keep the attribute values for each point instance: a way to look up a value for a specific object. But there are two constraints:
1. we need to keep weak references to the object only
2. we cannot use the object directly as a key in a dictionary

How about just using the `id` of the object?

In [451]:
class IntegerValue:
    def __init__(self):
        self.data = {}
        
    def __set__(self, instance, value):
        self.data[id(instance)] = int(value)
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        return self.data.get(id(instance))

In [452]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [455]:
p1 = Point2D()
p1.x = 10.1
p1.y = 20.2

p2 = Point2D()
p2.x = 100.1
p2.y = 200.2

In [456]:
p1.x, p1.y, p2.x, p2.y

(10, 20, 100, 200)

So that seems to work just fine. But let's look at the dictionaries used to store the values in the data descriptors before and after we finalize the objects:

In [457]:
Point2D.x.data, Point2D.y.data

({4475660944: 10, 4475658760: 10, 4475662064: 100},
 {4475660944: 20, 4475658760: 20, 4475662064: 200})

In [458]:
del p1
del p2

In [459]:
Point2D.x.data, Point2D.y.data

({4475660944: 10, 4475658760: 10, 4475662064: 100},
 {4475660944: 20, 4475658760: 20, 4475662064: 200})

As you can see the dictionary does not get cleaned up.
That's probably not going to be a problem in practice, but there is always a chance of a new object getting the same id as an old one still present in the dictionary. Although it won't affect the setter (it will just replace the old value), we could end up with a value for a getter for a property that has not been set yet.

Although unlikely, it would not be good practice to allow for that possibility.

Somehow we need to clean up the dictionary when the objects are finalized.

There are a number of ways we could do this, but remember how the `weakref.ref` is a callable that returns `None` if the object it was pointing to has been finalized?

We can use that to our advantage - instead of just storing the value in our dictionary, we are going to store both the weakref and the value. Whenever a value is requested, we'll first make sure that calling the weakref does not return `None`, i.e. that the entry does not belong to an object that has already been finalized.

In [460]:
class IntegerValue:
    def __init__(self):
        self.data = {}
        
    def __set__(self, instance, value):
        key = id(instance)
        weak_ref = weakref.ref(instance)
        self.data[key] = weak_ref, int(value)
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        key = id(instance)
        weak_ref, value = self.data.get(key, (None, None))
        if weak_ref is not None and weak_ref() is None:
            # key present in dictionary
            # but the object it was stored for has been garbage collected
            # (a new object is re-using the same id) - remove the stale entry
            del self.data[key]
            return None
        return value

In [464]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [465]:
p1 = Point2D()

In [466]:
p1.x = 10.1

In [467]:
Point2D.x.data

{4472819840: (<weakref at 0x10a99c278; to 'Point2D' at 0x10a99d080>, 10)}

In [468]:
del p1

In [470]:
Point2D.x.data

{4472819840: (<weakref at 0x10a99c278; dead>, 10)}

So, if we ever get another point object with the same memory address we should be fine.
However, we are still leaving "dead" entries behind in the dictionary.

This lazy deletion could be a problem in certain circumstances.

We could do this another way yet, using a feature the `weakref.ref` class has: a callback feature for when the weakly referenced object is about to get finalized.

Let's see how it works first

In [496]:
p = Point2D()

In [497]:
def finalize_callback(weak_ref):
    print('object being finalized...')
    print(hex(id(weak_ref)), weak_ref())

In [498]:
p_weak = weakref.ref(p, finalize_callback)

In [499]:
del p

object being finalized...
0x10ac5fea8 None


Nice!! So let's use that technique instead:

In [512]:
class IntegerValue:
    def __init__(self):
        self.data = {}
        
    def __set__(self, instance, value):
        key = id(instance)
        weak_ref = weakref.ref(instance, self.instance_finalizer)
        self.data[key] = weak_ref, int(value)
        
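    def __get__(self, instance, owner_class):
        if not instance:
            return self
        # entries are stored as (weak_ref, value) tuples - return just the value
        value_tuple = self.data.get(id(instance))
        return value_tuple[1] if value_tuple else None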
    
    def instance_finalizer(self, weak_ref):
        # now we need to find the key corresponding to that weak_ref
        # unfortunately we do not have the object being finalized
        # so we have to do a reverse lookup from the weak refs 
        # stored in the data dictionary
        reverse_lookup = [key for key, value in self.data.items()
                          if value[0] is weak_ref]
        if reverse_lookup:
            # key found
            print('Cleaning up weak ref entry!')
            key = reverse_lookup[0]
            del self.data[key]

In [513]:
class Point2D:
    x = IntegerValue()
    y = IntegerValue()

In [514]:
p = Point2D()
p.x = 10.1

In [515]:
del p

Cleaning up weak ref entry!


In [516]:
Point2D.x.data

{}

So now this also covers working with non-hashable objects.
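
The instances do still need to support weak references though. For a class that uses `__slots__`, that means including `'__weakref__'` in the slots. The sketch below is a variation on the idea above, not a replacement for it. It assumes a hypothetical descriptor name `IntegerValueById` and uses a small closure as the finalizer callback, so the key is remembered and no reverse lookup is needed. The `gc.collect()` call is only there so the example does not rely on immediate reference-count cleanup.

```python
import gc
import weakref


class IntegerValueById:
    def __init__(self):
        self.data = {}

    def __set__(self, instance, value):
        key = id(instance)

        def remove_entry(_weak_ref, key=key):
            # called when the instance is finalized; the key was captured above
            self.data.pop(key, None)

        self.data[key] = (weakref.ref(instance, remove_entry), int(value))

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        entry = self.data.get(id(instance))
        return entry[1] if entry else None


class SlottedPoint:
    # no __dict__, not hashable (defines __eq__), but weak-referenceable
    __slots__ = ('__weakref__',)
    x = IntegerValueById()

    def __eq__(self, other):
        return isinstance(other, SlottedPoint) and self.x == other.x


sp = SlottedPoint()
sp.x = 10.1
value_while_alive = sp.x
entries_while_alive = len(SlottedPoint.x.data)

del sp
gc.collect()
entries_after_delete = len(SlottedPoint.x.data)
print(value_while_alive, entries_while_alive, entries_after_delete)
```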

### `__set_name__` Descriptor Method

Let's first see how it works:

In [525]:
class Descriptor:
    def __set_name__(self, owner_class, name):
        print(f'getting property name: {name}')
        self.name = name
        
    def __set__(self, instance, value):
        print(f'set called for property {self.name}')
        
    def __get__(self, instance, owner_class):
        print(f'get called for property {self.name}')

In [526]:
class MyClass:
    my_attrib = Descriptor()

getting property name: my_attrib


See how the `__set_name__` method was called as soon as the `MyClass` class was created, before we created any instance of it?

And now we have that name captured in the descriptor instances themselves:

In [527]:
obj = MyClass()

In [528]:
obj.my_attrib

get called for property my_attrib


In [530]:
obj.my_attrib = 100

set called for property my_attrib


### Property Value Lookup Resolution

In [566]:
class DataDescriptor:
    def __get__(self, instance, owner_class):
        print('using __get__')
    
    def __set__(self, instance, value):
        print('using __set__')

In [567]:
class MyClass:
    prop = DataDescriptor()

In [568]:
m = MyClass()

In [569]:
m.prop

using __get__


In [570]:
m.prop = 100

using __set__


In [571]:
m.__dict__

{}

In [572]:
m.__dict__['prop'] = 'Some value'

In [573]:
m.__dict__

{'prop': 'Some value'}

In [574]:
m.prop

using __get__


In [575]:
m.prop = 100

using __set__


So, with data descriptors, the `__get__` and `__set__` methods are always called, even when the instance dictionary contains a key with the same name (by default anyway - that can be overridden if desired).

Now let's see how it works with non-data descriptors:

In [576]:
class NonDataDescriptor:
    def __get__(self, instance, owner_class):
        print('using __get__')

In [577]:
class MyClass:
    prop = NonDataDescriptor()

In [578]:
m = MyClass()

In [579]:
m.prop

using __get__


In [580]:
m.__dict__['prop'] = 100

In [581]:
m.__dict__

{'prop': 100}

In [582]:
m.prop

100

Aha, that returns the value stored in the instance dictionary - did not call the `__get__` method!

What about setting via dotted notation:

In [584]:
m.prop = 200

In [586]:
m.__dict__

{'prop': 200}

Ok, so that went straight to the instance dictionary too - but that should be expected, our descriptor class does not implement a `__set__` method.

How about getting the property using the `getattr` function?

In [587]:
getattr(m, 'prop')

200

That also uses the dictionary.

One last variation, what happens if we have a non-data descriptor and we try to set the same property name using dotted notation instead of using `__dict__`:

In [593]:
m = MyClass()

In [594]:
m.__dict__

{}

In [595]:
m.prop = 100

In [596]:
m.__dict__

{'prop': 100}

And now of course, when we try to retrieve `prop` the instance dictionary will take precedence:

In [597]:
m.prop

100

So the basics here are:

##### Data Descriptors
- always uses the descriptor instance

##### Non-Data Descriptors
- looks in the instance dictionary first
- falls back to the descriptor instance
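
To make that order concrete, here is a simplified sketch of the lookup written in plain Python. The function `lookup` is a hypothetical stand-in that only approximates what the default attribute lookup does (it ignores `__getattr__`, metaclasses and other details):

```python
def lookup(obj, name):
    """Simplified sketch of the default instance attribute lookup order."""
    cls = type(obj)
    missing = object()
    class_attr = missing
    for klass in cls.__mro__:
        if name in klass.__dict__:
            class_attr = klass.__dict__[name]
            break

    attr_type = type(class_attr)
    found = class_attr is not missing
    has_get = found and hasattr(attr_type, '__get__')
    is_data = found and (hasattr(attr_type, '__set__')
                         or hasattr(attr_type, '__delete__'))

    # 1. a data descriptor on the class always wins
    if has_get and is_data:
        return attr_type.__get__(class_attr, obj, cls)
    # 2. otherwise the instance dictionary is consulted
    instance_dict = getattr(obj, '__dict__', {})
    if name in instance_dict:
        return instance_dict[name]
    # 3. then non-data descriptors and plain class attributes
    if has_get:
        return attr_type.__get__(class_attr, obj, cls)
    if found:
        return class_attr
    raise AttributeError(name)


class DataDesc:
    def __get__(self, instance, owner_class):
        return 'from data descriptor'

    def __set__(self, instance, value):
        pass


class NonDataDesc:
    def __get__(self, instance, owner_class):
        return 'from non-data descriptor'


class Demo:
    d = DataDesc()
    n = NonDataDesc()


demo = Demo()
demo.__dict__['d'] = 'from instance dict'
demo.__dict__['n'] = 'from instance dict'
print(lookup(demo, 'd'), '|', lookup(demo, 'n'))
```

The sketch agrees with the real dotted access for both attributes: the data descriptor shadows the instance dictionary, while the instance dictionary shadows the non-data descriptor.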

### Validation using Descriptors

So now here is a typical example of how we can leverage data descriptors in a re-useable fashion:

In [592]:
from numbers import Integral

class IntegerRange:
    def __init__(self, min_value, max_value):
        self.min_value = min_value
        self.max_value = max_value
        
    def __set_name__(self, owner_class, name):
        self.name = name
        
    def __get__(self, instance, owner_class):
        if instance is None:
            # accessed from the class: return the descriptor itself
            return self
        return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name}: must be an integer.')
        value = int(value)
        if self.min_value <= value <= self.max_value:
            instance.__dict__[self.name] = value
        else:
            raise ValueError(f'{self.name}: must be between '
                             f'{self.min_value} and {self.max_value}.')
        

In [552]:
class Person:
    age = IntegerRange(0, 150)

In [553]:
p = Person()

In [554]:
p.age = "10"

ValueError: age: must be an integer.

In [555]:
p.age = 200

ValueError: age: must be between 0 and 150.

In [556]:
p.age = 18

In [557]:
p.age

18

In [558]:
p.age = -10

ValueError: age: must be between 0 and 150.
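
The reusable part is that the same descriptor class can back several attributes, each with its own bounds and its own name captured by `__set_name__`. A self-contained sketch along the same lines, where `BoundedInt` and `Rectangle` are hypothetical names chosen for this example:

```python
from numbers import Integral


class BoundedInt:
    """Hypothetical range-checked integer descriptor, in the spirit of IntegerRange."""

    def __init__(self, min_value, max_value):
        self.min_value = min_value
        self.max_value = max_value

    def __set_name__(self, owner_class, name):
        self.name = name

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name}: must be an integer.')
        if not self.min_value <= value <= self.max_value:
            raise ValueError(f'{self.name}: must be between '
                             f'{self.min_value} and {self.max_value}.')
        # store under the attribute's own name in the instance dictionary
        instance.__dict__[self.name] = int(value)


class Rectangle:
    width = BoundedInt(1, 100)
    height = BoundedInt(1, 50)


r = Rectangle()
r.width = 3
r.height = 4

try:
    r.height = 51
except ValueError as ex:
    error_message = str(ex)

print(r.__dict__, error_message)
```

Because this is a data descriptor, storing the value in the instance dictionary under the same name is safe: the descriptor still takes precedence on every read and write.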

### Exercise: property and Descriptors

Now that we understand descriptors, let's go back to that `property` and the related decorators we used for quick and simple property definitions in our classes:

In [610]:
from numbers import Integral

class Person:
    @property
    def age(self):
        return self._age
    
    @age.setter
    def age(self, value):
        if not isinstance(value, Integral):
            raise ValueError('age: must be an integer')
        value = int(value)
        if value < 0:
            raise ValueError('age: must be a non-negative integer.')
        self._age = value

In [611]:
p = Person()

In [612]:
p.age = -10

ValueError: age: must be a non-negative integer.

In [613]:
p.age = 10

In [614]:
p.age

10

What `property` (and the related decorators) does is essentially create the descriptor (in this case a data descriptor) for us.

Recall how the non-decorator version worked:

In [615]:
class Person:
    def get_age(self):
        return self._age
    
    def set_age(self, value):
        if not isinstance(value, Integral):
            raise ValueError('age: must be an integer')
        value = int(value)
        if value < 0:
            raise ValueError('age: must be a non-negative integer.')
        self._age = value
        
    age = property(fget=get_age, fset=set_age)

In [616]:
p = Person()

In [617]:
p.age = 10

In [618]:
p.age

10

In [619]:
p.age = -10

ValueError: age: must be a non-negative integer.

Let's see how we might create our own version of `property` - it's a good exercise and will help solidify our understanding of data descriptors. For simplicity we'll omit support for `__delete__` - but it works the same way.

First let's deal with the non-decorator version of `property`:

In [624]:
class MakeProperty:
    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset
        
    def __set_name__(self, owner_class, name):
        self.name = name
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        if not self.fget:
            raise AttributeError(f'{self.name} is not readable.')
        return self.fget(instance)
    
    def __set__(self, instance, value):
        if not self.fset:
            raise AttributeError(f'{self.name} is not writable.')
        self.fset(instance, value)

This is enough to make a generic property creator:

In [629]:
class Person:
    def get_name(self):
        return self._name
    
    def set_name(self, value):
        self._name = value
        
    name = MakeProperty(fget=get_name, fset=set_name)   

In [626]:
p = Person()

In [627]:
p.name = 'Alex'

In [628]:
p.name

'Alex'

Now, let's handle the decorators. 

Remember how the `property` decorator works - the `@property` decorator is used to specify the getter, and whatever returns from that decorator must have a `setter` attribute we can use to decorate the setter method.

In [645]:
class MakeProperty:
    def __init__(self, fget=None, fset=None):
        self.fget = fget
        self.fset = fset
        
    def __get__(self, instance, owner_class):
        if not instance:
            return self
        if not self.fget:
            raise AttributeError('attribute is not readable.')
        return self.fget(instance)
    
    def __set__(self, instance, value):
        if not self.fset:
            raise AttributeError('attribute is not writable.')
        self.fset(instance, value)
        
    def getter(self, fget):
        self.fget = fget
        return self
    
    def setter(self, fset):
        self.fset = fset
        return self

In [686]:
class Person:
    @MakeProperty
    def name(self):
        return self._name
    
    @name.setter
    def name(self, value):
        self._name = value

In [687]:
p = Person()

In [688]:
p.name = 'Alex'

In [689]:
p.name

'Alex'

Of course our implementation is simplistic and omits things, like the `__doc__` attribute of the original function, or the `__delete__` method, but this is the basic idea.
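
For completeness, here is one way those two missing pieces might be added. This is only a sketch under our own assumptions (the class name `MakePropertyWithDelete` and the `Account` example are made up), and like our version above it mutates and returns the same descriptor object from `setter` and `deleter`, whereas the built-in `property` returns a new property object from them.

```python
class MakePropertyWithDelete:
    def __init__(self, fget=None, fset=None, fdel=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        # carry over the getter's docstring, if there is one
        self.__doc__ = fget.__doc__ if fget is not None else None

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError('attribute is not readable.')
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError('attribute is not writable.')
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError('attribute is not deletable.')
        self.fdel(instance)

    def setter(self, fset):
        self.fset = fset
        return self

    def deleter(self, fdel):
        self.fdel = fdel
        return self


class Account:
    @MakePropertyWithDelete
    def owner(self):
        """Name of the account owner."""
        return self._owner

    @owner.setter
    def owner(self, value):
        self._owner = value

    @owner.deleter
    def owner(self):
        del self._owner


acct = Account()
acct.owner = 'Alex'
owner_name = acct.owner
del acct.owner
still_stored = '_owner' in acct.__dict__
doc = Account.owner.__doc__
print(owner_name, still_stored, doc)
```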

### Functions are Non-Data Descriptors

Functions as we know are objects. In fact, they implement the non-data descriptor protocol!

Yes, they have a `__get__` method. That's how functions defined in a class actually become instance methods when accessed from class instances.

Let's see this:

In [1]:
class Person:
    def __init__(self, name):
        self.name = name
        
    def say_hello(self):
        return f'{self.name} says hello'

In [2]:
say_hello_func = Person.__dict__['say_hello']

In [4]:
say_hello_func

<function __main__.Person.say_hello(self)>

So, as we can see, the `Person` class contains an attribute in its class dictionary for `say_hello` which is just a plain ordinary function.

A function is an object, and it has attributes too:

In [6]:
dir(say_hello_func)

['__annotations__',
 '__call__',
 '__class__',
 '__closure__',
 '__code__',
 '__defaults__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__get__',
 '__getattribute__',
 '__globals__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__kwdefaults__',
 '__le__',
 '__lt__',
 '__module__',
 '__name__',
 '__ne__',
 '__new__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

Notice the `__get__` attribute?

We know what the `__get__` method looks like when it is called:
```
def __get__(self, instance, owner_class)
```

What's `self` in this case? The function itself (it is the descriptor).

What will `instance` be? The instance we are calling the function from.

What is the `owner_class`? It is the `Person` class in this case.

So let's call the **function** as if it were being called using a dotted notation.

First we create an instance of `Person`:

In [7]:
p = Person('Alex')

Now we can call the method this way:

In [8]:
p.say_hello()

'Alex says hello'

But this is the same as doing this:

In [13]:
say_hello_func.__get__(p, Person)()

'Alex says hello'

(Note that we do not specify `self` ourselves since we are calling the `__get__` function as a method already bound to `say_hello_func` when we write `say_hello_func.__get__`)

And that's how functions become "automatically" bound to the instance when calling them using dotted notation.
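
To make the binding a little more concrete, the short sketch below inspects the object that `__get__` returns. It redefines the same `Person` class so that it can run on its own, and relies only on the standard `types.MethodType` type and the `__self__` and `__func__` attributes of bound methods.

```python
import types


class Person:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        return f'{self.name} says hello'


p = Person('Alex')
say_hello_func = Person.__dict__['say_hello']

# Calling __get__ with an instance returns a bound method object.
bound = say_hello_func.__get__(p, Person)

print(type(bound) is types.MethodType)    # True
print(bound.__self__ is p)                # True
print(bound.__func__ is say_hello_func)   # True
print(bound())                            # Alex says hello
```

The bound method simply remembers the instance (`__self__`) and the original function (`__func__`), and passes the instance as the first argument when it is called.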

Remember how we handled the case where `instance` was `None` in our own descriptors' `__get__` methods? We just returned the descriptor itself.

In this case, the descriptor is the function `say_hello` defined in the `Person` class - so we can recover the descriptor from the class using dotted notation too:

In [15]:
Person.say_hello

<function __main__.Person.say_hello(self)>

This is the same as doing this:

In [16]:
Person.__dict__['say_hello'].__get__(None, Person)

<function __main__.Person.say_hello(self)>
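
A small check of what this means in practice: accessed through the class, nothing gets bound, so we get the function object back and have to pass the instance ourselves. The `Person` class is repeated here only so the snippet is self-contained.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        return f'{self.name} says hello'


p = Person('Alex')

# Accessed from the class, the descriptor returns the function object itself.
print(Person.say_hello is Person.__dict__['say_hello'])   # True

# Nothing is bound, so the instance has to be passed in explicitly.
print(Person.say_hello(p))                                 # Alex says hello
```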

In fact, just out of curiosity, let's try writing our own "function" class that is both callable and implements the non-data descriptor protocol, so we can bind it to instances.

In [67]:
from functools import partial

class CustomFunc:
    def __call__(self, instance, *args, **kwargs):
        # define the body of the "function" here...
        print('instance:', instance)
        print('args:', args)
        print('kwargs:', kwargs)
        return f'{instance.name} says hello!'
    
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return partial(self.__call__, instance)

In [68]:
class Person:
    def __init__(self, name):
        self.name = name
        
    say_hello = CustomFunc()

In [69]:
Person.say_hello

<__main__.CustomFunc at 0x1089e7f60>

In [70]:
p = Person('Alex')

In [71]:
p.say_hello(1, 2, 3, a=1, b=2)

instance: <__main__.Person object at 0x1089e7c18>
args: (1, 2, 3)
kwargs: {'a': 1, 'b': 2}


'Alex says hello!'

Now don't write code like this!! Python provides us functions that implement the descriptor protocol, but I just wanted to show you how we could roughly approximate the same functionality using custom classes.

Nice to gain a better understanding of descriptors, but not at all practical!
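
One consequence of functions being *non-data* descriptors is that an entry with the same name in the instance dictionary takes precedence over the method. The sketch below illustrates this with the same `Person` class; the shadowing lambda is of course just for demonstration.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def say_hello(self):
        return f'{self.name} says hello'


p = Person('Alex')
print(p.say_hello())    # Alex says hello

# Store a plain callable in the instance dictionary under the same name.
# Functions define no __set__, so the instance entry shadows the method.
p.say_hello = lambda: 'shadowed!'
print(p.say_hello())    # shadowed!

# Removing the instance entry makes the class-level function visible again.
del p.__dict__['say_hello']
print(p.say_hello())    # Alex says hello
```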

### Exercise

Create two data descriptors to handle:
- an integer-only field, named `IntegerField`, with a min and max value (just like we did before)
- a string-only field, named `CharField`, with a min and max length

For simplicity assume this will only be used for objects (class instances) that have an available `__dict__` - in other words you can use it for instance storage.

After you have done that, use inheritance to create a base descriptor that can factor out the repetitive code from the two descriptors above.

Finally, as a small enhancement, make the `IntegerField` such that `min` and `max` can be unlimited.
For the `CharField` make it such that values can be assigned without a maximum length.

#### Solution

First let's write the two data descriptors completely independently of each other, then we'll determine what we could factor out into a base class.

In [107]:
from numbers import Integral

class IntegerField:
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum
        
    def __set_name__(self, owner_class, name):
        self.name = name
        
    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name} must be an integer number.')
        if value < self.minimum or value > self.maximum:
            raise ValueError(f'{self.name} must be between {self.minimum} and {self.maximum}')
        instance.__dict__[self.name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

Let's test this to make sure it works:

In [108]:
class Person:
    age = IntegerField(0, 200)

p = Person()
p.age = 10
print(p.age)

10


In [109]:
try:
    p.age = -100
except ValueError as ex:
    print(ex)

age must be between 0 and 200


In [110]:
try:
    p.age = 'Unknown'
except ValueError as ex:
    print(ex)

age must be an integer number.
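
Two edge cases of this first version are worth being aware of. The sketch below repeats the `IntegerField` definition so it can run on its own, and shows that reading the attribute before it has been set fails with a `KeyError`, and that `bool` values pass the `Integral` check because `bool` is a subclass of `int`.

```python
from numbers import Integral


class IntegerField:
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum

    def __set_name__(self, owner_class, name):
        self.name = name

    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name} must be an integer number.')
        if value < self.minimum or value > self.maximum:
            raise ValueError(f'{self.name} must be between {self.minimum} and {self.maximum}')
        instance.__dict__[self.name] = value

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]


class Person:
    age = IntegerField(0, 200)


p = Person()

# Reading before any value was stored fails on the dictionary lookup.
try:
    p.age
except KeyError as ex:
    print('KeyError:', ex)          # KeyError: 'age'

# bool is a subclass of int, so booleans pass the Integral check.
print(isinstance(True, Integral))   # True
p.age = True
print(p.age)                        # True
```

Whether either behavior matters depends on how the field is used, so the exercise solution simply leaves them as they are.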


OK, so now let's write the `CharField` descriptor:

In [133]:
class CharField:
    def __init__(self, min_length, max_length):
        self.min_length = min_length
        self.max_length = max_length
        
    def __set_name__(self, owner_class, name):
        self.name = name
    
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.name} must be a string.')
        if len(value) < self.min_length or len(value) > self.max_length:
            raise ValueError(f'{self.name} length must be between '
                             f'{self.min_length} and {self.max_length} chars long.')
        instance.__dict__[self.name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

In [134]:
class Person:
    age = IntegerField(0, 200)
    name = CharField(1, 20)
    
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def __repr__(self):
        return f'Person(name={self.name}, age={self.age})'

In [135]:
p = Person('Alex', 18)
print(p)

Person(name=Alex, age=18)


In [137]:
try:
    p.name = ''
except ValueError as ex:
    print(ex)

name length must be between 1 and 20 chars long.


In [139]:
try:
    p.name = 'a' * 50
except ValueError as ex:
    print(ex)

name length must be between 1 and 20 chars long.


OK, so now let's see what code appears to be common to these two validators (and would likely be common with other validators too):

In [167]:
class IntegerField:
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum
        
    def __set_name__(self, owner_class, name):
        self.name = name
        
    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name} must be an integer number.')
        if value < self.minimum or value > self.maximum:
            raise ValueError(f'{self.name} must be between {self.minimum} and {self.maximum}')
        instance.__dict__[self.name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

In [168]:
class CharField:
    def __init__(self, min_length, max_length):
        self.min_length = min_length
        self.max_length = max_length
        
    def __set_name__(self, owner_class, name):
        self.name = name
    
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.name} must be a string.')
        if len(value) < self.min_length or len(value) > self.max_length:
            raise ValueError(f'{self.name} length must be between '
                             f'{self.min_length} and {self.max_length} chars long.')
        instance.__dict__[self.name] = value
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

As we can see we have commonalities in:
- `__set_name__` (exactly the same)
- `__get__` (exactly the same)
- `__set__` (validation tests are different, but storage mechanism is the same)

In [169]:
class ValidatorField:
    def __set_name__(self, owner_class, name):
        self.name = name
        
    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

And we can now re-write our validators inheriting from this base `ValidatorField`:

In [170]:
class IntegerField(ValidatorField):
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum
        
    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name} must be an integer number.')
        if value < self.minimum or value > self.maximum:
            raise ValueError(f'{self.name} must be between {self.minimum} and {self.maximum}')
        super().__set__(instance, value)

In [171]:
class CharField(ValidatorField):
    def __init__(self, min_length, max_length):
        self.min_length = min_length
        self.max_length = max_length
        
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.name} must be a string.')
        if len(value) < self.min_length or len(value) > self.max_length:
            raise ValueError(f'{self.name} length must be between '
                             f'{self.min_length} and {self.max_length} chars long.')
        super().__set__(instance, value)

In [172]:
class Person:
    age = IntegerField(0, 200)
    name = CharField(1, 20)
    
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def __repr__(self):
        return f'Person(name={self.name}, age={self.age})'      
    

In [173]:
p = Person('Alex', 18)

In [174]:
p

Person(name=Alex, age=18)

In [175]:
try:
    p.age = -10
except ValueError as ex:
    print(ex)

age must be between 0 and 200


In [176]:
try:
    p.name = ''
except ValueError as ex:
    print(ex)

name length must be between 1 and 20 chars long.


And of course, we can re-use these validators anywhere we need them!
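
As a quick illustration of that re-use, the sketch below repeats the three classes so that it is self-contained and attaches the same validators to a second, unrelated class. The `Product` class and its `sku` and `quantity` fields are hypothetical names chosen only for this example.

```python
from numbers import Integral


class ValidatorField:
    def __set_name__(self, owner_class, name):
        self.name = name

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class IntegerField(ValidatorField):
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum

    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name} must be an integer number.')
        if value < self.minimum or value > self.maximum:
            raise ValueError(f'{self.name} must be between {self.minimum} and {self.maximum}')
        super().__set__(instance, value)


class CharField(ValidatorField):
    def __init__(self, min_length, max_length):
        self.min_length = min_length
        self.max_length = max_length

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.name} must be a string.')
        if len(value) < self.min_length or len(value) > self.max_length:
            raise ValueError(f'{self.name} length must be between '
                             f'{self.min_length} and {self.max_length} chars long.')
        super().__set__(instance, value)


# Hypothetical class re-using the same validators.
class Product:
    sku = CharField(3, 12)
    quantity = IntegerField(0, 1_000)


widget = Product()
widget.sku = 'WID-001'
widget.quantity = 25

gadget = Product()
gadget.sku = 'GAD-002'
gadget.quantity = 3

# Each instance keeps its own values in its own __dict__.
print(widget.__dict__)   # {'sku': 'WID-001', 'quantity': 25}
print(gadget.__dict__)   # {'sku': 'GAD-002', 'quantity': 3}

try:
    gadget.quantity = 5_000
except ValueError as ex:
    print(ex)            # quantity must be between 0 and 1000
```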

Now let's make a few enhancements to make our validator a bit more useful.

We'll start with `CharField` and allow unlimited max length:

In [177]:
class CharField(ValidatorField):
    def __init__(self, min_length, max_length=None):
        self.min_length = min_length
        self.max_length = max_length
        
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.name} must be a string.')
        if (len(value) < self.min_length 
            or (self.max_length is not None and len(value) > self.max_length)):
            raise ValueError(f'{self.name} length must be between '
                             f'{self.min_length} and {self.max_length} chars long.')
        super().__set__(instance, value)

In [178]:
class Person:
    name = CharField(1)

In [179]:
p = Person()

In [180]:
p.name = 'a'

In [181]:
p.name = 'a'*10_000

In [182]:
try:
    p.name = ''
except ValueError as ex:
    print(ex)

name length must be between 1 and None chars long.
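
The message now reads a little oddly when there is no maximum. One possible way to tidy it up, sketched here as an assumption rather than as part of the solution, is to test and report the two bounds separately. The base class is repeated so the snippet runs on its own, and the `nickname` field is a hypothetical addition used to show the upper bound message.

```python
class ValidatorField:
    def __set_name__(self, owner_class, name):
        self.name = name

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class CharField(ValidatorField):
    def __init__(self, min_length, max_length=None):
        self.min_length = min_length
        self.max_length = max_length

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.name} must be a string.')
        # Report each bound on its own so that None never shows up in a message.
        if len(value) < self.min_length:
            raise ValueError(f'{self.name} must be at least {self.min_length} chars long.')
        if self.max_length is not None and len(value) > self.max_length:
            raise ValueError(f'{self.name} must be at most {self.max_length} chars long.')
        super().__set__(instance, value)


class Person:
    name = CharField(1)
    nickname = CharField(1, 5)   # hypothetical second field


p = Person()
for attr, value in (('name', ''), ('nickname', 'too long')):
    try:
        setattr(p, attr, value)
    except ValueError as ex:
        print(ex)

# name must be at least 1 chars long.
# nickname must be at most 5 chars long.
```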


And now let's do something similar for `IntegerField`:

In [188]:
class IntegerField(ValidatorField):
    def __init__(self, minimum=None, maximum=None):
        self.minimum = minimum
        self.maximum = maximum
        
    def __set__(self, instance, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name} must be an integer number.')
        if ((self.minimum is not None and value < self.minimum) or 
            (self.maximum is not None and value > self.maximum)):
            raise ValueError(f'{self.name} out of bounds.')
        super().__set__(instance, value)

In [189]:
class Point2D:
    x = IntegerField(0)
    y = IntegerField(maximum=10)

In [195]:
p = Point2D()

In [196]:
p.x = 1_000_000

In [197]:
p.y = -1_000_000

In [198]:
try:
    p.x = -10
except ValueError as ex:
    print(ex)

x out of bounds.


In [199]:
try:
    p.y = 12
except ValueError as ex:
    print(ex)

y out of bounds.
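
As a possible further refactoring, and only as a sketch, the validation could be moved into a separate hook so that subclasses no longer need to override `__set__` and call `super()`. The `validate` method name is an assumption introduced for this example; it is not part of the descriptor protocol.

```python
from numbers import Integral


class ValidatorField:
    def __set_name__(self, owner_class, name):
        self.name = name

    def __get__(self, instance, owner_class):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def validate(self, value):
        # Hypothetical hook: subclasses raise ValueError for invalid values.
        pass

    def __set__(self, instance, value):
        # Validate first, then store in the instance dictionary.
        self.validate(value)
        instance.__dict__[self.name] = value


class IntegerField(ValidatorField):
    def __init__(self, minimum=None, maximum=None):
        self.minimum = minimum
        self.maximum = maximum

    def validate(self, value):
        if not isinstance(value, Integral):
            raise ValueError(f'{self.name} must be an integer number.')
        if ((self.minimum is not None and value < self.minimum) or
                (self.maximum is not None and value > self.maximum)):
            raise ValueError(f'{self.name} out of bounds.')


class CharField(ValidatorField):
    def __init__(self, min_length, max_length=None):
        self.min_length = min_length
        self.max_length = max_length

    def validate(self, value):
        if not isinstance(value, str):
            raise ValueError(f'{self.name} must be a string.')
        if (len(value) < self.min_length
                or (self.max_length is not None and len(value) > self.max_length)):
            raise ValueError(f'{self.name} length out of bounds.')


class Person:
    name = CharField(1)
    age = IntegerField(0, 200)


p = Person()
p.name = 'Alex'
p.age = 18
print(p.__dict__)    # {'name': 'Alex', 'age': 18}

try:
    p.age = 500
except ValueError as ex:
    print(ex)        # age out of bounds.
```

With this layout the storage logic lives in exactly one place, and each new field type only has to describe what a valid value looks like.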
