# Item 30: Consider @property Instead of Refactoring Attributes

- The built-in @property decorator makes it easy for simple accesses of an instance's attributes to act smarter. One advanced but common use of @property is transitioning what was once a simple numerical attribute into an on-the-fly calculation. This is extremely helpful because it lets you migrate all existing usage of a class to have new behaviors without rewriting any of the call sites. It also provides an important stopgap for improving your interfaces over time.
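
- As a minimal sketch of that transition (the Thermometer class and its attribute names are illustrative assumptions, separate from the Bucket example that follows), a value that used to be a stored number can become a calculation while callers keep reading and assigning it like a plain attribute:

```python
class Thermometer(object):
    """Hypothetical class: 'fahrenheit' used to be a plain stored attribute."""

    def __init__(self, celsius):
        self.celsius = celsius

    @property
    def fahrenheit(self):
        # Computed on the fly from the value that is actually stored
        return self.celsius * 9 / 5 + 32

    @fahrenheit.setter
    def fahrenheit(self, value):
        # Assignments keep working by converting back to the stored unit
        self.celsius = (value - 32) * 5 / 9


reading = Thermometer(100)
print(reading.fahrenheit)   # Existing call sites still read an attribute
reading.fahrenheit = 32     # ...and still assign to it
print(reading.celsius)
```

- Call sites never see the difference between the stored attribute and the computed one, which is the property the Bucket example below relies on.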

- For example, say you want to implement a leaky bucket quota using plain Python objects. Here, the Bucket class represents how much quota remains and the duration for which the quota will be available:


In [7]:
from datetime import datetime
from datetime import timedelta

In [8]:
class Bucket(object):
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.quota = 0
        
    def __repr__(self):
        return 'Bucket(quota=%d)' % self.quota

- The leaky bucket algorithm works by ensuring that, whenever the bucket is filled, the amount of quota does not carry over from one period to the next.

In [22]:
def fill(bucket, amount):
    now = datetime.now()
    if now - bucket.reset_time > bucket.period_delta:
        bucket.quota = 0
        bucket.reset_time = now
    bucket.quota += amount

- Each time a quota consumer wants to do something, it first must ensure that it can deduct the amount of quota it needs to use.


In [23]:
def deduct(bucket, amount):
    now = datetime.now()
    if now - bucket.reset_time > bucket.period_delta:
        return False
    if bucket.quota - amount < 0:
        return False
    bucket.quota -= amount
    return True

- To use this class, first I fill the bucket.

In [24]:
bucket = Bucket(60)
fill(bucket, 100)
print(bucket)

Bucket(quota=100)

- Then, I deduct the quota that I need.

In [25]:
if deduct(bucket, 99):
    print('Had 99 quota')
else:
    print('Not enough for 99 quota')
print(bucket)

Had 99 quota
Bucket(quota=1)


- Eventually, I'm prevented from making progress because I try to deduct more quota than is available. In this case, the bucket's quota level remains unchanged.

In [26]:
if deduct(bucket, 3):
    print('Had 3 quota')
else:
    print('Not enough for 3 quota')
print(bucket)

Not enough for 3 quota
Bucket(quota=1)


- The problem with this implementation is that I never know what quota level the bucket started with. The quota is deducted over the course of the period until it reaches zero. At that point, deduct will always return False. When that happens, it would be useful to know whether callers to deduct are being blocked because the Bucket ran out of quota or because the Bucket never had quota in the first place.

- To fix this, I can change the class to keep track of the max_quota issued in the period and the quota_consumed in the period.

In [29]:
class Bucket(object):
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0
        
    def __repr__(self):
        return ('Bucket(max_quota=%d, quota_consumed=%d)' %
                (self.max_quota, self.quota_consumed))
    
    @property
    def quota(self):
        return self.max_quota - self.quota_consumed
    
    @quota.setter
    def quota(self, amount):
        delta = self.max_quota - amount
        if amount == 0:
            # Quota being reset for a new period
            self.quota_consumed = 0
            self.max_quota = 0
        elif delta < 0:
            # Quota being filled for the new period
            assert self.quota_consumed == 0
            self.max_quota = amount
        else:
            # Quota being consumed during the period
            assert self.max_quota >= self.quota_consumed
            # delta is the total consumed so far, so assign rather than add
            self.quota_consumed = delta

- I use a @property method to compute the current level of quota on-the-fly using these new attributes.

- When the quota attribute is assigned, I take special action matching the current interface of the class used by fill and deduct.

- Rerunning the demo code from above produces the same results.

In [30]:
bucket = Bucket(60)
print('Initial', bucket)
fill(bucket, 100)
print('Filled', bucket)

if deduct(bucket, 99):
    print('Had 99 quota')
else:
    print('Not enough for 99 quota')
    
print('Now', bucket)

if deduct(bucket, 3):
    print('Had 3 quota')
else:
    print('Not enough for 3 quota')
    
print('Still', bucket)

Initial Bucket(max_quota=0, quota_consumed=0)
Filled Bucket(max_quota=100, quota_consumed=0)
Had 99 quota
Now Bucket(max_quota=100, quota_consumed=99)
Not enough for 3 quota
Still Bucket(max_quota=100, quota_consumed=99)


- The best part is that the code using Bucket.quota doesn't have to change or know that the class has changed. New usage of Bucket can do the right thing and access max_quota and quota_consumed directly.
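
- As a sketch of what such new usage might look like, the helper below answers the earlier question of whether a bucket ran out of quota or never had any. The why_blocked function is a hypothetical name of my own, and the Bucket here is a trimmed-down copy so the snippet runs by itself:

```python
from datetime import datetime, timedelta


class Bucket(object):
    # Trimmed-down copy of the class above so this sketch runs on its own
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0

    @property
    def quota(self):
        return self.max_quota - self.quota_consumed


def why_blocked(bucket):
    # Hypothetical helper: new code reads the new attributes directly
    if bucket.max_quota == 0:
        return 'never had quota'
    if bucket.quota_consumed >= bucket.max_quota:
        return 'ran out of quota'
    return 'quota available'


empty = Bucket(60)
spent = Bucket(60)
spent.max_quota = 100
spent.quota_consumed = 100
print(why_blocked(empty))
print(why_blocked(spent))
```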

- I especially like @property because it lets you make incremental progress toward a better data model over time. Reading the Bucket example above, you may have thought to yourself, "fill and deduct should have been implemented as instance methods in the first place." Although you're probably right, in practice there are many situations in which objects start with poorly defined interfaces or act as dumb data containers. This happens when code grows over time, scope increases, multiple authors contribute without anyone considering long-term hygiene, etc.

- @property is a tool to help you address problems you'll come across in real-world code. Don't overuse it. When you find yourself repeatedly extending @property methods, it's probably time to refactor your class instead of further paving over your code's poor design.
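
- For comparison, here is one way the eventual refactoring might look, with fill and deduct as instance methods. This is only a sketch under my own assumptions (the RefactoredBucket name and the \_period\_expired helper are hypothetical), not a prescribed design:

```python
from datetime import datetime, timedelta


class RefactoredBucket(object):
    # Hypothetical refactoring: fill and deduct become instance methods
    def __init__(self, period):
        self.period_delta = timedelta(seconds=period)
        self.reset_time = datetime.now()
        self.max_quota = 0
        self.quota_consumed = 0

    @property
    def quota(self):
        # Read-only view; no setter is needed once call sites use methods
        return self.max_quota - self.quota_consumed

    def _period_expired(self, now):
        return now - self.reset_time > self.period_delta

    def fill(self, amount):
        now = datetime.now()
        if self._period_expired(now):
            # Quota does not carry over into a new period
            self.max_quota = 0
            self.quota_consumed = 0
            self.reset_time = now
        self.max_quota += amount

    def deduct(self, amount):
        if self._period_expired(datetime.now()):
            return False
        if self.quota < amount:
            return False
        self.quota_consumed += amount
        return True


bucket = RefactoredBucket(60)
bucket.fill(100)
print(bucket.deduct(99), bucket.quota)
print(bucket.deduct(3), bucket.quota)
```

- With the behavior living in methods, the quota setter and its special cases are no longer needed.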

## Things to Remember 

- Use @property to give existing instance attributes new functionality.
- Make incremental progress toward better data models by using @property.
- Consider refactoring a class and all call sites when you find yourself using @property too heavily.

# Item 31: Use Descriptors for Reusable @property Methods

- The big problem with the @property built-in is reuse. The methods it decorates can't be reused for multiple attributes of the same class. They also can't be reused by unrelated classes.

- For example, say you want a class to validate that the grade received by a student on a homework assignment is a percentage.

In [31]:
class Homework(object):
    def __init__(self):
        self._grade = 0
    @property 
    def grade(self):
        return self._grade
    
    @grade.setter
    def grade(self, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._grade = value

In [32]:
galileo = Homework()
galileo.grade = 95

In [34]:
type(galileo)

__main__.Homework
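
- A quick check of the validation, as a sketch that repeats the Homework class so it runs without the earlier cells: assigning an out-of-range grade should raise ValueError and leave the earlier value in place.

```python
class Homework(object):
    # Same class as above, repeated so this sketch runs on its own
    def __init__(self):
        self._grade = 0

    @property
    def grade(self):
        return self._grade

    @grade.setter
    def grade(self, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._grade = value


galileo = Homework()
galileo.grade = 95
try:
    galileo.grade = 101
except ValueError as e:
    error_message = str(e)
    print('Rejected:', error_message)
print(galileo.grade)  # The earlier valid value is kept
```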

- Say you also want to give the student a grade for an exam, where the exam has multiple subjects, each with a separate grade.

In [37]:
class Exam(object):
    def __init__(self):
        self._writing_grade = 0
        self._math_grade = 0
        
    @staticmethod
    def _check_grade(value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
            
    @property 
    def writing_grade(self):
        return self._writing_grade
    
    @writing_grade.setter
    def writing_grade(self, value):
        self._check_grade(value)
        self._writing_grade = value
    
    @property
    def math_grade(self):
        return self._math_grade
    
    @math_grade.setter
    def math_grade(self, value):
        self._check_grade(value)
        self._math_grade = value

- This quickly gets tedious. Each section of the exam requires adding a new @property and related validation.

- Also, this approach is not general. If you want to reuse this percentage validation beyond homework and exams, you'd need to write the @property boilerplate and \_check\_grade repeatedly.

- The better way to do this in Python is to use a *descriptor*. The descriptor protocol defines how attribute access is interpreted by the language. A descriptor class can provide \_\_get\_\_ and \_\_set\_\_ methods that let you reuse the grade validation behavior without any boilerplate. For this purpose, descriptors are also better than mix-ins because they let you reuse the same logic for many different attributes in a single class.

- Here, I define a new class called Exam with class attributes that are Grade instances. The Grade class implements the descriptor protocol. Before I explain how the Grade class works, it's important to understand what Python will do when your code accesses such descriptor attributes on an Exam instance.

In [50]:
class Grade(object):
    def __get__(*args, **kwargs):
        # ...
        pass
        
    def __set__(*args, **kwargs):
        # ...
        pass
    
class Exam(object):
    # Class attributes
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

- When you assign a property:

In [51]:
exam = Exam()
exam.writing_grade = 40

- it will be interpreted as:

In [41]:
Exam.__dict__['writing_grade'].__set__(exam, 40)

- When you retrieve a property:

In [54]:
print(exam.writing_grade)

None


- it will be interpreted as:

In [None]:
print(Exam.__dict__['writing_grade'].__get__(exam, Exam))

- What drives this behavior is the \_\_getattribute\_\_ method of object. In short, when an Exam instance doesn't have an attribute named writing\_grade, Python will fall back to the Exam class's attribute instead. If this class attribute is an object that has \_\_get\_\_ and \_\_set\_\_ methods, Python will assume you want to follow the descriptor protocol.
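
- To watch the protocol in action, here is a small sketch with a descriptor that only records the calls Python makes. The TracingGrade class and the calls list are hypothetical names used for illustration:

```python
calls = []  # Records which protocol methods Python invoked


class TracingGrade(object):
    # Hypothetical descriptor that only logs; it stores nothing per instance
    def __get__(self, instance, instance_type):
        calls.append(('__get__', instance, instance_type))
        return 0

    def __set__(self, instance, value):
        calls.append(('__set__', instance, value))


class Exam(object):
    writing_grade = TracingGrade()


exam = Exam()
exam.writing_grade = 40          # Routed to TracingGrade.__set__
exam.writing_grade               # Routed to TracingGrade.__get__
Exam.writing_grade               # Class access: instance is None

for name, *details in calls:
    print(name, details)
print(exam.__dict__)             # Still empty: the descriptor took the assignment
```

- Accessing the attribute on the class itself passes None as the instance, which is why descriptor implementations often handle that case explicitly.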

- Knowing this behavior and how I used @property for grade validation in the Homework class, here's a reasonable first attempt at implementing the Grade descriptor.

In [57]:
class Grade(object):
    def __init__(self):
        self._value = 0
        
    def __get__(self, instance, instance_type):
        return self._value
    
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._value = value
        
        
class Exam(object):
    # Class attributes
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

In [58]:
first_exam = Exam()
first_exam.writing_grade = 82
first_exam.science_grade = 99
print('Writing', first_exam.writing_grade)
print('Science', first_exam.science_grade)

Writing 82
Science 99


- But accessing these attributes on multiple Exam instances will have unexpected behavior.

In [59]:
second_exam = Exam()
second_exam.writing_grade = 75
print('Second', second_exam.writing_grade, 'is right')
print('First', first_exam.writing_grade, 'is wrong')

Second 75 is right
First 75 is wrong


- The problem is that a single Grade instance is shared across all Exam instances for the class attribute writing_grade. The Grade instance for this attribute is constructed once in the program lifetime when the Exam class is first defined, not each time an Exam instance is created.
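
- The sharing can be made visible by inspecting the class dictionary directly. This sketch repeats the first-attempt Grade so it runs without the earlier cells:

```python
class Grade(object):
    # First-attempt descriptor from above: one value stored on the descriptor
    def __init__(self):
        self._value = 0

    def __get__(self, instance, instance_type):
        return self._value

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._value = value


class Exam(object):
    writing_grade = Grade()


first_exam = Exam()
second_exam = Exam()
first_exam.writing_grade = 82

# Both instances resolve to the single Grade object stored on the class
shared = Exam.__dict__['writing_grade']
print(shared._value)               # 82, stored on the descriptor itself
print(second_exam.writing_grade)   # 82, even though it was never assigned
print(first_exam.__dict__)         # {} because nothing lives on the instance
```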

- To solve this, I need the Grade class to keep track of its value for each unique Exam instance. I can do this by saving the per-instance state in a dictionary.

In [62]:
class Grade(object):
    def __init__(self):
        self._values = {}
        
    def __get__(self, instance, instance_type):
        if instance is None: return self
        return self._values.get(instance, 0)
    
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value
        
        
class Exam(object):
    # Class attributes
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

- This implementation is simple and works well, but there's still one gotcha: It leaks memory. The \_values dictionary will hold a reference to every instance of Exam ever passed to \_\_set\_\_ over the lifetime of the program. This causes instances to never have their reference count go to zero, preventing cleanup by the garbage collector.
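
- The leak can be observed with a weak reference used purely as a probe. This is a sketch that repeats the dictionary-based Grade; the probe variable is my own addition for checking whether the instance is still alive:

```python
import gc
import weakref


class Grade(object):
    # Dictionary-based descriptor from above
    def __init__(self):
        self._values = {}

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam(object):
    writing_grade = Grade()


exam = Exam()
exam.writing_grade = 82
probe = weakref.ref(exam)   # Lets me check whether the object is still alive

del exam                    # Drop my only reference to the instance
gc.collect()

still_alive = probe() is not None
print('Exam still alive:', still_alive)
print('Entries held:', len(Exam.writing_grade._values))
```

- Even after the only named reference is deleted, the descriptor's dictionary key keeps the Exam instance alive.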

- To fix this, I can use Python's weakref built-in module. This module provides a special class called WeakKeyDictionary that can take the place of the simple dictionary used for \_values. The unique behavior of WeakKeyDictionary is that it will remove Exam instances from its set of keys when the runtime knows it's holding the instance's last remaining reference in the program. Python will do the bookkeeping for you and ensure that the \_values dictionary will be empty when all Exam instances are no longer in use.

In [65]:
from weakref import WeakKeyDictionary

class Grade(object):
    def __init__(self):
        self._values = WeakKeyDictionary()
        
    def __get__(self, instance, instance_type):
        if instance is None: return self
        return self._values.get(instance, 0)
    
    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value
        
        
class Exam(object):
    # Class attributes
    math_grade = Grade()
    writing_grade = Grade()
    science_grade = Grade()

- Using this implementation of the Grade descriptor, everything works as expected.

In [66]:
first_exam = Exam()
first_exam.writing_grade = 82
second_exam = Exam()
second_exam.writing_grade = 75
print('First', first_exam.writing_grade, 'is right')
print('Second', second_exam.writing_grade, 'is right')

First 82 is right
Second 75 is right
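
- The cleanup behavior of WeakKeyDictionary can also be checked directly. In this sketch, the tracked_before and tracked_after names are my own; the explicit gc.collect() call is there to make the cleanup point obvious rather than because CPython requires it:

```python
import gc
from weakref import WeakKeyDictionary


class Grade(object):
    def __init__(self):
        self._values = WeakKeyDictionary()

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam(object):
    writing_grade = Grade()


exam = Exam()
exam.writing_grade = 82
tracked_before = len(Exam.writing_grade._values)

del exam          # Drop the last strong reference
gc.collect()      # Makes the cleanup point explicit
tracked_after = len(Exam.writing_grade._values)

print(tracked_before, tracked_after)
```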


## Things to Remember

- Reuse the behavior and validation of @property methods by defining your own descriptor classes.
- Use WeakKeyDictionary to ensure that your descriptor classes don't cause memory leaks.
- Don't get bogged down trying to understand exactly how \_\_getattribute\_\_ uses the descriptor protocol for getting and setting attributes.

# Item 32: Use \_\_getattr\_\_, \_\_getattribute\_\_, and \_\_setattr\_\_ for Lazy Attributes

- Python's language hooks make it easy to write generic code for gluing systems together. For example, say you want to represent the rows of your database as Python objects. Your database has its schema set. Your code that uses objects corresponding to those rows must also know what your database looks like. However, in Python, the code that connects your Python objects to the database doesn't need to know the schema of your rows; it can be generic.

- How is that possible? Plain instance attributes, @property methods, and descriptors can't do this because they all need to be defined in advance. Python makes this dynamic behavior possible with the \_\_getattr\_\_ special method. If your class defines \_\_getattr\_\_, that method is called every time an attribute can't be found in an object's instance dictionary.

In [69]:
class LazyDB(object):
    def __init__(self):
        self.exists = 5
        
    def __getattr__(self, name):
        value = 'Value for %s' % name
        setattr(self, name, value)
        return value

- Here, I access the missing property foo. This causes Python to call the \_\_getattr\_\_ method above, which mutates the instance dictionary \_\_dict\_\_:

In [70]:
data = LazyDB()
print('Before:', data.__dict__)
print('foo:   ', data.foo)
print('After: ', data.__dict__)

Before: {'exists': 5}
foo:    Value for foo
After:  {'exists': 5, 'foo': 'Value for foo'}


- Here, I add logging to LazyDB to show when \_\_getattr\_\_ is actually called. Note that I use super().\_\_getattr\_\_() to get the real property value in order to avoid infinite recursion.

In [72]:
class LoggingLazyDB(LazyDB):
    def __getattr__(self, name):
        print('Called __getattr__(%s)' % name)
        return super().__getattr__(name)
    
data = LoggingLazyDB()
print('exists:', data.exists)
print('foo:   ', data.foo)
print('foo:   ', data.foo)

exists: 5
Called __getattr__(foo)
foo:    Value for foo
foo:    Value for foo


In [73]:
print('foo:   ', data.foo)

foo:    Value for foo


- The exists attribute is present in the instance dictionary, so \_\_getattr\_\_ is never called for it. The foo attribute is not in the instance dictionary initially, so \_\_getattr\_\_ is called the first time. But the call to \_\_getattr\_\_ for foo also does a setattr, which populates foo in the instance dictionary. This is why the second time I access foo there isn't a call to \_\_getattr\_\_.

- This behavior is especially helpful for use cases like lazily accessing schemaless data. \_\_getattr\_\_ runs once to do the hard work of loading a property; all subsequent accesses retrieve the existing result.

- Say you also want transactions in this database system. The next time the user accesses a property, you want to know whether the corresponding row in the database is still valid and whether the transaction is still open. The \_\_getattr\_\_ hook won't let you do this reliably because it will use the object's instance dictionary as the fast path for existing attributes.

- To enable this use case, Python has another language hook called \_\_getattribute\_\_. This special method is called every time an attribute is accessed on an object, even in cases where it *does* exist in the attribute dictionary. This enables you to do things like check global transaction state on every property access. Here, I define ValidatingDB to log each time \_\_getattribute\_\_ is called:

In [76]:
class ValidatingDB(object):
    def __init__(self):
        self.exists = 5
        
    def __getattribute__(self, name):
        print('Called __getattribute__(%s)' % name)
        try:
            return super().__getattribute__(name)
        except AttributeError:
            value = 'Value for %s' % name
            setattr(self, name, value)
            return value

In [78]:
data = ValidatingDB()
print('exists:', data.exists)
print('foo:   ', data.foo)
print('foo:   ', data.foo)

Called __getattribute__(exists)
exists: 5
Called __getattribute__(foo)
foo:    Value for foo
Called __getattribute__(foo)
foo:    Value for foo
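
- Building on the same hook, here is a sketch of the transaction check described above. The Transaction and TransactionalRow classes are hypothetical and invented for illustration; a real system would have its own way of tracking transaction state:

```python
class Transaction(object):
    # Hypothetical transaction handle, invented for this sketch
    def __init__(self):
        self.is_open = True


class TransactionalRow(object):
    def __init__(self, transaction):
        self._transaction = transaction
        self.exists = 5

    def __getattribute__(self, name):
        # Use object's implementation to avoid re-entering this method
        transaction = super().__getattribute__('_transaction')
        if not transaction.is_open:
            raise RuntimeError('Transaction closed; cannot read %s' % name)
        return super().__getattribute__(name)


txn = Transaction()
row = TransactionalRow(txn)
print(row.exists)       # Allowed while the transaction is open

txn.is_open = False
try:
    row.exists          # Checked again even though 'exists' is already stored
except RuntimeError as e:
    closed_error = str(e)
    print(closed_error)
```

- A \_\_getattr\_\_ hook could not perform this check, because exists is already in the instance dictionary and would be returned without calling the hook.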


- In the event that a dynamically accessed property shouldn't exist, you can raise an AttributeError to cause Python's standard missing property behavior for both \_\_getattr\_\_ and \_\_getattribute\_\_.

In [81]:
class MissingPropertyDB(object):
    def __getattr__(self, name):
        if name == 'bad_name':
            # Signal that this attribute really is missing
            raise AttributeError('%s is missing' % name)
        value = 'Value for %s' % name
        setattr(self, name, value)
        return value

data = MissingPropertyDB()
data.bad_name

AttributeError: bad_name is missing

- Python code implementing generic functionality often relies on the hasattr built-in function to determine when properties exist, and the getattr built-in function to retrieve property values. These functions also look in the instance dictionary for an attribute name before calling \_\_getattr\_\_.

In [83]:
data = LoggingLazyDB()
print('Before:      ', data.__dict__)
print('foo exists:  ', hasattr(data, 'foo'))
print('After:       ', data.__dict__)
print('foo exists:  ', hasattr(data, 'foo'))

Before:       {'exists': 5}
Called __getattr__(foo)
foo exists:   True
After:        {'exists': 5, 'foo': 'Value for foo'}
foo exists:   True


- In the example above, \_\_getattr\_\_ is only called once. In contrast, classes that implement \_\_getattribute\_\_ will have that method called each time hasattr or getattr is run on an object.

In [84]:
data = ValidatingDB()
print('foo exists: ', hasattr(data, 'foo'))
print('foo exists: ', hasattr(data, 'foo'))

Called __getattribute__(foo)
foo exists:  True
Called __getattribute__(foo)
foo exists:  True
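
- These built-ins also interact with the AttributeError behavior described earlier: hasattr reports False and getattr falls back to its default when the hook raises. A minimal sketch, using a hypothetical StrictLazyDB class whose allowed tuple is my own assumption:

```python
class StrictLazyDB(object):
    # Hypothetical class: only names in 'allowed' are loaded lazily
    allowed = ('foo', 'bar')

    def __getattr__(self, name):
        if name not in self.allowed:
            raise AttributeError('%s is missing' % name)
        value = 'Value for %s' % name
        setattr(self, name, value)
        return value


data = StrictLazyDB()
print(hasattr(data, 'foo'))        # True: __getattr__ supplied a value
print(hasattr(data, 'bad_name'))   # False: AttributeError was raised
print(getattr(data, 'bad_name', 'fallback'))
```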


- Now, say you want to lazily push data back to the database when values are assigned to your Python object. You can do this with \_\_setattr\_\_, a similar language hook that lets you intercept arbitrary attribute assignments. Unlike retrieving an attribute with \_\_getattr\_\_ and \_\_getattribute\_\_, there's no need for two separate methods. The \_\_setattr\_\_ method is always called every time an attribute is assigned on an instance (either directly or through the setattr built-in function).

In [1]:
class SavingDB(object):
    def __setattr__(self, name, value):
        # Save some data to the DB log
        # ...
        super().__setattr__(name, value)

- Here, I define a logging subclass of SavingDB. Its \_\_setattr\_\_ method is always called on each attribute assignment:

In [5]:
class LoggingSavingDB(SavingDB):
    def __setattr__(self, name, value):
        print('Called __setattr__(%s, %r)' % (name, value))
        super().__setattr__(name, value)


data = LoggingSavingDB()
print('Before: ', data.__dict__)
data.foo = 5
print('After:  ', data.__dict__)
data.foo = 7
print('Finally:', data.__dict__)


Before:  {}
Called __setattr__(foo, 5)
After:   {'foo': 5}
Called __setattr__(foo, 7)
Finally: {'foo': 7}
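
- The same hook fires for the setattr built-in, while writing into the instance dictionary directly goes around it. A sketch, where RecordingDB and pending_writes are hypothetical stand-ins for a class that queues database writes:

```python
pending_writes = []   # Hypothetical stand-in for a database write log


class RecordingDB(object):
    def __setattr__(self, name, value):
        # Record the write, then store the attribute normally
        pending_writes.append((name, value))
        super().__setattr__(name, value)


data = RecordingDB()
data.foo = 5                 # Direct assignment
setattr(data, 'bar', 7)      # Built-in setattr takes the same path
data.__dict__['baz'] = 9     # Writing to __dict__ directly bypasses the hook

print(pending_writes)
print(data.__dict__)
```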


- The problem with \_\_getattribute\_\_ and \_\_setattr\_\_ is that they're called on every attribute access for an object, even when you may not want that to happen. For example, say you want attribute accesses on your object to actually look up keys in an associated dictionary.

In [6]:
class BrokenDictionaryDB(object):
    def __init__(self, data):
        self._data = data
        
    def __getattribute__(self, name):
        print('Called __getattribute__(%s)' % name)
        return self._data[name]

- This requires accessing self.\_data from the \_\_getattribute\_\_ method. However, if you actually try to do that, Python will recurse until it reaches its stack limit, and then it'll die.

In [7]:
data = BrokenDictionaryDB({'foo': 3})
data.foo

Called __getattribute__(foo)
Called __getattribute__(_data)
Called __getattribute__(_data)
...

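RecursionError: maximum recursion depth exceeded in comparison

- The problem is that \_\_getattribute\_\_ accesses self.\_data, which causes \_\_getattribute\_\_ to run again, which accesses self.\_data again, and so on. The solution is to use the super().\_\_getattribute\_\_ method on the instance to fetch values from the instance attribute dictionary. This avoids the recursion.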

In [8]:
class DictionaryDB(object):
    def __init__(self, data):
        self._data = data
    
    def __getattribute__(self, name):
        data_dict = super().__getattribute__('_data')
        return data_dict[name]

- Similarly, you'll need \_\_setattr\_\_ methods that modify attributes on an object to use super().\_\_setattr\_\_.
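
- Putting both halves together, here is a sketch of a dictionary-backed class that reads and writes through its \_data dictionary. The \_\_setattr\_\_ method and the backing variable are my own additions for illustration:

```python
class DictionaryDB(object):
    def __init__(self, data):
        # Bypass the __setattr__ below so _data lands in the instance dictionary
        super().__setattr__('_data', data)

    def __getattribute__(self, name):
        # Fetch _data through object's implementation to avoid recursion
        data_dict = super().__getattribute__('_data')
        return data_dict[name]

    def __setattr__(self, name, value):
        # Fetch _data without re-entering __getattribute__
        data_dict = super().__getattribute__('_data')
        data_dict[name] = value


backing = {'foo': 3}
data = DictionaryDB(backing)
print(data.foo)
data.foo = 4
data.bar = 10
print(backing)
```

- Note that in this sketch a missing key surfaces as KeyError rather than AttributeError, so hasattr would not treat it as a missing attribute.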

## Things to Remember

- Use \_\_getattr\_\_ and \_\_setattr\_\_ to lazily load and save attributes for an object.
- Understand that \_\_getattr\_\_ only gets called once when accessing a missing attribute, whereas \_\_getattribute\_\_ gets called every time an attribute is accessed.
- Avoid infinite recursion in \_\_getattribute\_\_ and \_\_setattr\_\_ by using methods from super() (i.e., the object class) to access instance attributes directly.