# Item 47: Use `__getattr__`, `__getattribute__`, and `__setattr__` for Lazy Attributes

Python's `object` hooks make it easy to write generic code for gluing systems together. For example, the code that connects Python objects to a database doesn't need to explicitly specify the schema of the records; it can be generic.

In [1]:
# Python makes this dynamic behavior possible with the __getattr__ special method. If a
# class defines __getattr__, that method is called every time an attribute can't be found in an
# object's instance dictionary (or on its class)
class LazyRecord:
    def __init__(self):
        self.exists = 5

    def __getattr__(self, name):
        value = f'Value for {name}'
        setattr(self, name, value)
        return value

In [2]:
# Here, we access the missing property foo. This causes Python to call the __getattr__ method, which
# mutates the instance dictionary __dict__
data = LazyRecord()
print('Before:', data.__dict__)
print('foo:   ', data.foo)
print('After: ', data.__dict__)

Before: {'exists': 5}
foo:    Value for foo
After:  {'exists': 5, 'foo': 'Value for foo'}


In [3]:
# Logging added to LazyRecord class. Note how we call super().__getattr__ to use the super class's 
# implementation of __getattr__ in order to fetch the real property value and avoid infinite recursion
class LoggingLazyRecord(LazyRecord):
    def __getattr__(self, name):
        print(f'* Called __getattr__ ({name!r}), populating instance dictionary')
        result = super().__getattr__(name)
        print(f'* Returning {result!r}')
        return result

data = LoggingLazyRecord()
print('exists:     ', data.exists)
print('First foo:  ', data.foo)
print('Second foo: ', data.foo)

exists:      5
* Called __getattr__ ('foo'), populating instance dictionary
* Returning 'Value for foo'
First foo:   Value for foo
Second foo:  Value for foo


Because `__getattr__` is called only when an attribute is not already in the instance dictionary, it is well suited to use cases like lazily accessing schemaless data. `__getattr__` runs once to do the hard work of loading a property and storing it in the instance dictionary; all subsequent accesses retrieve the existing result.
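
As a minimal sketch of that lazy-loading pattern, the example below reads fields from a plain dictionary that stands in for a schemaless data source. The names `FAKE_DATABASE`, `LazyDBRecord`, and `load_count` are hypothetical and exist only for illustration.

```python
# Hypothetical stand-in for a schemaless data source (an assumption for this sketch).
FAKE_DATABASE = {'name': 'Ada', 'email': 'ada@example.com'}


class LazyDBRecord:
    """Loads fields from FAKE_DATABASE the first time they are accessed."""

    def __init__(self):
        self.load_count = 0  # Counts how often the slow path runs

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, i.e. on first access.
        try:
            value = FAKE_DATABASE[name]
        except KeyError:
            raise AttributeError(f'{name} is missing') from None
        self.load_count += 1
        setattr(self, name, value)  # Cache the value in the instance dictionary
        return value


record = LazyDBRecord()
print(record.name)        # First access goes through __getattr__
print(record.name)        # Second access is served from __dict__
print(record.load_count)  # The slow path ran only once
```

After the first access the value lives in `record.__dict__`, so later lookups should never reach `__getattr__` again.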

Python has another `object` hook called `__getattribute__`, which is called *every* time an attribute is accessed on an object, even in cases where it *does* exist in the attribute dictionary.

**Note**: because `__getattribute__` runs on every attribute access, it adds overhead and can noticeably hurt performance, but sometimes the trade-off is worth it.

In [5]:
class ValidatingRecord:
    def __init__(self):
        self.exists = 5

    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        try:
            value = super().__getattribute__(name)
            print(f'* Found {name!r}, returning {value!r}')
            return value
        except AttributeError:
            value = f'Value for {name}'
            print(f'* Setting {name!r} to {value!r}')
            setattr(self, name, value)
            return value

data = ValidatingRecord()
print('exists:     ', data.exists)
print('First foo:  ', data.foo)
print('Second foo: ', data.foo)

* Called __getattribute__('exists')
* Found 'exists', returning 5
exists:      5
* Called __getattribute__('foo')
* Setting 'foo' to 'Value for foo'
First foo:   Value for foo
* Called __getattribute__('foo')
* Found 'foo', returning 'Value for foo'
Second foo:  Value for foo


In [8]:
# In the event that a dynamically accessed property shouldn't exist, we can raise an AttributeError to
# cause Python's standard missing property behavior for both __getattr__ and __getattribute__
class MissingPropertyRecord:
    def __getattr__(self, name):
        if name == 'bad_name':
            raise AttributeError(f'{name} is missing')

data = MissingPropertyRecord()
data.bad_name

AttributeError: bad_name is missing
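
Raising `AttributeError` from `__getattr__` is also what lets `hasattr` and the three-argument form of `getattr` behave as expected. The sketch below uses a hypothetical `StrictRecord` class whose set of allowed names (`_known`) is an assumption made for illustration.

```python
class StrictRecord:
    # Hypothetical example class; the set of allowed names is an assumption.
    _known = {'foo'}

    def __getattr__(self, name):
        # _known is a class attribute, so reading it does not re-enter __getattr__.
        if name not in self._known:
            raise AttributeError(f'{name} is missing')
        return f'Value for {name}'


data = StrictRecord()
print(hasattr(data, 'foo'))              # True
print(hasattr(data, 'bad_name'))         # False
print(getattr(data, 'bad_name', 'n/a'))  # n/a
```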

In [9]:
# Python code implementing generic functionality often relies on the hasattr and getattr built-in 
# functions, which also look at the instance dictionary for an attribute name before calling __getattr__
data = LoggingLazyRecord() # Implements __getattr__
print('Before:         ', data.__dict__)
print('Has first foo:  ', hasattr(data, 'foo'))
print('After:          ', data.__dict__)
print('Has second foo: ', hasattr(data, 'foo'))

Before:          {'exists': 5}
* Called __getattr__ ('foo'), populating instance dictionary
* Returning 'Value for foo'
Has first foo:   True
After:           {'exists': 5, 'foo': 'Value for foo'}
Has second foo:  True


In [10]:
# In the example above, __getattr__ is called only once. In contrast, classes that implement 
# __getattribute__ have that method called each time hasattr or getattr is used with an instance
data = ValidatingRecord()  # Implements __getattribute__
print('Has first foo:  ', hasattr(data, 'foo'))
print('Has second foo: ', hasattr(data, 'foo'))

* Called __getattribute__('foo')
* Setting 'foo' to 'Value for foo'
Has first foo:   True
* Called __getattribute__('foo')
* Found 'foo', returning 'Value for foo'
Has second foo:  True


Python also provides the `__setattr__` `object` hook, which lets us intercept arbitrary attribute assignments. This method is called every time an attribute is assigned on an instance (either directly or through the `setattr` built-in function).

In [11]:
class SavingRecord:
    def __setattr__(self, name, value):
        # Save some data for the record
        # ...
        super().__setattr__(name, value)

In [12]:
# Here, we define a logging subclass of SavingRecord. Its __setattr__ method is always called on each
# attribute assignment
class LoggingSavingRecord(SavingRecord):
    def __setattr__(self, name, value):
        print(f'* Called __setattr__ ({name!r}, {value!r})')
        super().__setattr__(name, value)

data = LoggingSavingRecord()
print('Before:  ', data.__dict__)
data.foo = 5
print('After:   ', data.__dict__)
data.foo = 7
print('Finally: ', data.__dict__)

Before:   {}
* Called __setattr__ ('foo', 5)
After:    {'foo': 5}
* Called __setattr__ ('foo', 7)
Finally:  {'foo': 7}
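
One possible use of this hook, sketched below under assumptions, is remembering which attributes were assigned so that only those need to be saved later. `ChangeTrackingRecord` and its `_changed` set are hypothetical names, not part of the original example.

```python
class ChangeTrackingRecord:
    """Remembers which attributes were assigned (hypothetical example)."""

    def __init__(self):
        # Bypass our own hook so the bookkeeping set is not tracked itself.
        super().__setattr__('_changed', set())

    def __setattr__(self, name, value):
        self._changed.add(name)           # Record the assignment
        super().__setattr__(name, value)  # Then store the value normally


data = ChangeTrackingRecord()
data.foo = 5
setattr(data, 'bar', 7)  # The built-in goes through the same hook
print(sorted(data._changed))
```

Both the direct assignment and the `setattr` call should end up in `_changed`, since they take the same path through `__setattr__`.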


The problem with `__getattribute__` and `__setattr__` is that they're called on every attribute access for an object, even when we may not want that to happen.

In [13]:
# For example, say that we want attribute accesses on an object to actually look up keys in an
# associated dictionary
class BrokenDictionaryRecord:
    def __init__(self, data):
        self._data = data

    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        return self._data[name]

In [14]:
# This requires accessing self._data from the __getattribute__ method. However, if we actually do that,
# Python will recurse until it reaches its stack limit, and then it will raise a RecursionError
data = BrokenDictionaryRecord({'foo': 3})
data.foo

* Called __getattribute__('foo')
* Called __getattribute__('_data')
* Called __getattribute__('_data')
...

RecursionError: maximum recursion depth exceeded while calling a Python object

In [16]:
# The problem is that __getattribute__ accesses self._data, which causes __getattribute__ to run again,
# which accesses self._data again, and so on. The solution is to use the super().__getattribute__
# method to fetch values from the instance attribute dictionary. This avoids recursion
class DictionaryRecord:
    def __init__(self, data):
        self._data = data

    def __getattribute__(self, name):
        print(f'* Called __getattribute__({name!r})')
        data_dict = super().__getattribute__('_data')
        return data_dict[name]

data = DictionaryRecord({'foo': 3})
print('foo: ', data.foo)


* Called __getattribute__('foo')
foo:  3


`__setattr__` methods that modify attributes on an object also need to use `super().__setattr__` accordingly.
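
A minimal sketch of the assignment side, assuming a hypothetical `DictionaryBackedRecord` that routes both reads and writes to the wrapped dictionary: the constructor stores `_data` through `super().__setattr__` so that the class's own `__setattr__` is not invoked before `_data` exists.

```python
class DictionaryBackedRecord:
    """Hypothetical variant of DictionaryRecord that also handles assignment."""

    def __init__(self, data):
        # Writing self._data = data would call our __setattr__, which needs
        # _data to exist already, so store it through object's implementation.
        super().__setattr__('_data', data)

    def __getattribute__(self, name):
        data_dict = super().__getattribute__('_data')
        try:
            return data_dict[name]
        except KeyError:
            raise AttributeError(f'{name} is missing') from None

    def __setattr__(self, name, value):
        # Fetch _data without triggering our own __getattribute__ lookup logic.
        data_dict = super().__getattribute__('_data')
        data_dict[name] = value


backing = {'foo': 3}
data = DictionaryBackedRecord(backing)
data.foo = 4
data.bar = 5
print(data.foo, data.bar)
print(backing)
```

Assignments land in the wrapped dictionary rather than in the instance dictionary, which is a design choice of this sketch, not something the original text prescribes.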