Python’s language hooks make it easy to write generic code for gluing systems together.
For example, say you want to represent the rows of your database as Python objects. Your
database has its schema set. Your code that uses objects corresponding to those rows must
also know what your database looks like. However, in Python, the code that connects your
Python objects to the database doesn’t need to know the schema of your rows; it can be
generic.

How is that possible? Plain instance attributes, @property methods, and descriptors
can’t do this because they all need to be defined in advance. Python makes this dynamic
behavior possible with the __getattr__ special method. If your class defines
__getattr__, that method is called every time an attribute can’t be found through the
normal lookup, that is, in the object’s instance dictionary or on its class.

In [1]:
import logging
from pprint import pprint
from sys import stdout as STDOUT


# Example 1
class LazyDB(object):
    def __init__(self):
        self.exists = 5

    def __getattr__(self, name):
        value = 'Value for %s' % name
        setattr(self, name, value)
        return value


# Example 2
data = LazyDB()
print('Before:', data.__dict__)
print('foo:   ', data.foo)
print('After: ', data.__dict__)

Before: {'exists': 5}
foo:    Value for foo
After:  {'exists': 5, 'foo': 'Value for foo'}


Here, I add logging to LazyDB to show when __getattr__ is actually called. Note that I
use super().__getattr__() to get the real property value in order to avoid infinite
recursion.


In [2]:
# Example 3
class LoggingLazyDB(LazyDB):
    def __getattr__(self, name):
        print('Called __getattr__(%s)' % name)
        return super().__getattr__(name)

data = LoggingLazyDB()
print('exists:', data.exists)
print('foo:   ', data.foo)
print('foo:   ', data.foo)

exists: 5
Called __getattr__(foo)
foo:    Value for foo
foo:    Value for foo


This behavior is especially helpful for use cases like lazily accessing schemaless data.
__getattr__ runs once to do the hard work of loading a property; all subsequent
accesses retrieve the existing result.

Say you also want transactions in this database system. The next time the user accesses a
property, you want to know whether the corresponding row in the database is still valid
and whether the transaction is still open. The __getattr__ hook won’t let you do this
reliably because Python uses the object’s instance dictionary as the fast path for
existing attributes and never calls the hook for them.

To enable this use case, Python has another language hook called __getattribute__.
This special method is called every time an attribute is accessed on an object, even in
cases where it does exist in the attribute dictionary. This enables you to do things like
check global transaction state on every property access. Here, I define ValidatingDB
to log each time __getattribute__ is called:


In [3]:
# Example 4
class ValidatingDB(object):
    def __init__(self):
        self.exists = 5

    def __getattribute__(self, name):
        print('Called __getattribute__(%s)' % name)
        try:
            return super().__getattribute__(name)
        except AttributeError:
            value = 'Value for %s' % name
            setattr(self, name, value)
            return value

data = ValidatingDB()
print('exists:', data.exists)
print('foo:   ', data.foo)
print('foo:   ', data.foo)

Called __getattribute__(exists)
exists: 5
Called __getattribute__(foo)
foo:    Value for foo
Called __getattribute__(foo)
foo:    Value for foo
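
ValidatingDB only logs each access. As a rough sketch of the transaction check described above, the version below assumes a hypothetical Transaction object with an is_open flag and a hypothetical TransactionClosedError; neither comes from a real database library. Names starting with an underscore are treated as internal and skip the check.

```python
class TransactionClosedError(Exception):
    """Hypothetical error for access after the transaction has ended."""


class Transaction:
    """Hypothetical stand-in for a real database transaction."""

    def __init__(self):
        self.is_open = True


class TransactionalRow:
    def __init__(self, transaction, **fields):
        # Assignment does not go through __getattribute__.
        self._transaction = transaction
        for key, value in fields.items():
            setattr(self, key, value)

    def __getattribute__(self, name):
        # Use object's implementation for every internal lookup so this
        # method never re-enters itself.
        get = super().__getattribute__
        if not name.startswith('_') and not get('_transaction').is_open:
            raise TransactionClosedError('%s read after close' % name)
        return get(name)


txn = Transaction()
row = TransactionalRow(txn, name='Ada')
print(row.name)

txn.is_open = False
try:
    row.name
except TransactionClosedError as e:
    print('Rejected:', e)
```

The check runs even though name already lives in the instance dictionary, which is something __getattr__ alone could not guarantee.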


In the event that a dynamically accessed property shouldn’t exist, you can raise an
AttributeError to cause Python’s standard missing property behavior for both
__getattr__ and __getattribute__.



In [4]:
# Example 5
try:
    class MissingPropertyDB(object):
        def __getattr__(self, name):
            if name == 'bad_name':
                raise AttributeError('%s is missing' % name)
            value = 'Value for %s' % name
            setattr(self, name, value)
            return value

    data = MissingPropertyDB()
    data.foo  # Test this works
    data.bad_name
except AttributeError:
    logging.exception('Expected')
else:
    assert False



ERROR:root:Expected
Traceback (most recent call last):
  File "<ipython-input-4-f121cf5c843b>", line 13, in <module>
    data.bad_name
  File "<ipython-input-4-f121cf5c843b>", line 6, in __getattr__
    raise AttributeError('%s is missing' % name)
AttributeError: bad_name is missing


Python code implementing generic functionality often relies on the hasattr built-in
function to determine when properties exist, and the getattr built-in function to
retrieve property values. These functions also look in the instance dictionary for an
attribute name before calling __getattr__.

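In the example below, the __getattr__ method of LoggingLazyDB is only called once, by
the first hasattr call, because that call stores foo in the instance dictionary. In
contrast, classes that implement __getattribute__ will have that method called each
time hasattr or getattr is run on an object.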

In [5]:
# Example 6
data = LoggingLazyDB()
print('Before:     ', data.__dict__)
print('foo exists: ', hasattr(data, 'foo'))
print('After:      ', data.__dict__)
print('foo exists: ', hasattr(data, 'foo'))


# Example 7
data = ValidatingDB()
print('foo exists: ', hasattr(data, 'foo'))
print('foo exists: ', hasattr(data, 'foo'))

Before:      {'exists': 5}
Called __getattr__(foo)
foo exists:  True
After:       {'exists': 5, 'foo': 'Value for foo'}
foo exists:  True
Called __getattribute__(foo)
foo exists:  True
Called __getattribute__(foo)
foo exists:  True
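
Raising AttributeError from the hook is also what lets hasattr report False and lets getattr fall back to its default argument. The sketch below assumes a hypothetical StrictDB class with a fixed set of column names; the names are illustrative only.

```python
class StrictDB:
    # Hypothetical set of valid column names (a class attribute, so reading
    # it inside __getattr__ does not trigger the hook again).
    _columns = {'id', 'email'}

    def __getattr__(self, name):
        if name not in self._columns:
            raise AttributeError('%s is missing' % name)
        value = 'Value for %s' % name
        setattr(self, name, value)
        return value


data = StrictDB()
print(hasattr(data, 'email'))           # True
print(hasattr(data, 'bad_name'))        # False
print(getattr(data, 'bad_name', None))  # None
```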


Now, say you want to lazily push data back to the database when values are assigned to
your Python object. You can do this with __setattr__, a similar language hook that
lets you intercept arbitrary attribute assignments. Unlike retrieving an attribute with
__getattr__ and __getattribute__, there’s no need for two separate methods.
The __setattr__ method is called every time an attribute is assigned on an
instance (either directly or through the setattr built-in function).

Here, I define a logging subclass of SavingDB. Its __setattr__ method is always
called on each attribute assignment:


In [6]:
# Example 8
class SavingDB(object):
    def __setattr__(self, name, value):
        # Save some data to the DB log
        super().__setattr__(name, value)


# Example 9
class LoggingSavingDB(SavingDB):
    def __setattr__(self, name, value):
        print('Called __setattr__(%s, %r)' % (name, value))
        super().__setattr__(name, value)

data = LoggingSavingDB()
print('Before: ', data.__dict__)
data.foo = 5
print('After:  ', data.__dict__)
data.foo = 7
print('Finally:', data.__dict__)

Before:  {}
Called __setattr__(foo, 5)
After:   {'foo': 5}
Called __setattr__(foo, 7)
Finally: {'foo': 7}
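
SavingDB leaves the actual save step as a comment. One possible way to push data back lazily is to record which attributes changed and write them out later. This is a minimal sketch under that assumption: the DirtyTrackingDB class, its _dirty set, and its flush method are all hypothetical, and flush only returns the pending values instead of talking to a real database.

```python
class DirtyTrackingDB:
    def __init__(self):
        # Bypass our own __setattr__ so the bookkeeping attribute is not
        # itself recorded as a pending change.
        super().__setattr__('_dirty', set())

    def __setattr__(self, name, value):
        self._dirty.add(name)             # Remember what needs saving
        super().__setattr__(name, value)  # Then store the value normally

    def flush(self):
        # Hypothetical save step: collect pending values, then reset.
        pending = {name: getattr(self, name) for name in sorted(self._dirty)}
        self._dirty.clear()
        return pending


data = DirtyTrackingDB()
data.foo = 5
data.foo = 7
data.bar = 'x'
print(data.flush())  # {'bar': 'x', 'foo': 7}
print(data.flush())  # {}
```

Assigning foo twice results in a single pending write, which is the point of deferring the save.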


The problem with __getattribute__ and __setattr__ is that they’re called on
every attribute access for an object, even when you may not want that to happen. For
example, say you want attribute accesses on your object to actually look up keys in an
associated dictionary.

This requires accessing self._data from the __getattribute__ method.
However, if you actually try to do that, Python will recurse until it reaches its stack limit,
and then it’ll raise a RecursionError.



In [7]:
# Example 10
class BrokenDictionaryDB(object):
    def __init__(self, data):
        self._data = data

    def __getattribute__(self, name):
        print('Called __getattribute__(%s)' % name)
        return self._data[name]


# Example 11
try:
    data = BrokenDictionaryDB({'foo': 3})
    data.foo
except RecursionError:
    logging.exception('Expected')
else:
    assert False

ERROR:root:Expected
Traceback (most recent call last):
  File "<ipython-input-7-7d6253e65def>", line 14, in <module>
    data.foo
  File "<ipython-input-7-7d6253e65def>", line 8, in __getattribute__
    return self._data[name]
  File "<ipython-input-7-7d6253e65def>", line 8, in __getattribute__
    return self._data[name]
  File "<ipython-input-7-7d6253e65def>", line 8, in __getattribute__
    return self._data[name]
  [Previous line repeated 651 more times]
  File "<ipython-input-7-7d6253e65def>", line 7, in __getattribute__
    print('Called __getattribute__(%s)' % name)
  File "/Users/godot/anaconda3/lib/python3.6/site-packages/ipykernel/iostream.py", line 350, in write
    is_child = (not self._is_master_process())
  File "/Users/godot/anaconda3/lib/python3.6/site-packages/ipykernel/iostream.py", line 285, in _is_master_process
    return os.getpid() == self._master_pid
RecursionError: maximum recursion depth exceeded in comparison


Called __getattribute__(foo)
Called __getattribute__(_data)
Called __getattribute__(_data)
[Previous line repeated until the recursion limit was reached]

The problem is that __getattribute__ accesses self._data, which causes
__getattribute__ to run again, which accesses self._data again, and so on. The
solution is to use the super().__getattribute__ method on your instance to fetch
values from the instance attribute dictionary. This avoids the recursion.

Similarly, you’ll need __setattr__ methods that modify attributes on an object to use
super().__setattr__.

In [8]:
# Example 12
class DictionaryDB(object):
    def __init__(self, data):
        self._data = data

    def __getattribute__(self, name):
        data_dict = super().__getattribute__('_data')
        return data_dict[name]

data = DictionaryDB({'foo': 3})
print(data.foo)

3
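
The same pattern applies to assignment. The sketch below extends the idea with a hypothetical WritableDictionaryDB class whose __setattr__ writes into the backing dictionary. Two choices here are assumptions rather than part of the original example: __init__ stores _data through super().__setattr__ because the override would otherwise look for _data before it exists, and a missing key is translated into AttributeError so that hasattr keeps working.

```python
class WritableDictionaryDB:
    def __init__(self, data):
        # Go straight to object.__setattr__; the override below would try
        # to read _data before it has been stored.
        super().__setattr__('_data', data)

    def __getattribute__(self, name):
        data_dict = super().__getattribute__('_data')
        try:
            return data_dict[name]
        except KeyError:
            # Keep the standard missing-attribute behavior.
            raise AttributeError('%s is missing' % name) from None

    def __setattr__(self, name, value):
        # Reading self._data here would go through __getattribute__ above
        # and fail, so fetch the dictionary from object directly.
        data_dict = super().__getattribute__('_data')
        data_dict[name] = value


backing = {'foo': 3}
data = WritableDictionaryDB(backing)
data.foo = 4
data.bar = 5
print(data.foo, backing)     # 4 {'foo': 4, 'bar': 5}
print(hasattr(data, 'baz'))  # False
```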


* Understand that __getattr__ only gets called when accessing a missing attribute (just once per name if it stores the result on the instance), whereas __getattribute__ gets called every time any attribute is accessed.
* Avoid infinite recursion in __getattribute__ and __setattr__ by using methods from super() (i.e., the object class) to access instance attributes directly.