# Chapter 8: Classes and Objects

The primary focus of this chapter is to present recipes for common programming patterns related to class definitions.  
Topics include making objects support common Python features, usage of special methods, encapsulation techniques, inheritance, memory management, and useful design patterns.  

## 8.1. Changing the String Representation of Instances

### Problem

You want to change the output produced by printing or viewing instances to something more sensible.

### Solution

To change the string representation of an instance, define the `__str__()` and `__repr__()` methods.

In [1]:
class Pair:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        # r for repr
        return "Pair({0.x!r}, {0.y!r})".format(self)
    def __str__(self):
        # s for str
        return "({0.x!s}, {0.y!s})".format(self)

The `__repr__()` method returns the code representation of an instance, and is usually the text you would type to recreate the instance.  
The built-in `repr()` function returns this text, as does the interactive interpreter when inspecting values.  
The `__str__()` method converts the instance to a string, and is the output produced by the `str()` and `print()` functions.

In [2]:
p = Pair(3, 4)
# __repr__() output:
p

Pair(3, 4)

In [3]:
#__str__() output:
print(p)

(3, 4)


The implementation of this recipe also shows how different string representations may be used during formatting.  
Specifically, the special `!r` formatting code indicates that the output of `__repr__()` should be used instead of the default `__str__()`.  
You can try this experiment with the preceding class to see this:  

In [4]:
p = Pair(3, 4)
print('p is {0!r}'.format(p))

p is Pair(3, 4)


In [5]:
print('p is {0}'.format(p))

p is (3, 4)


### Discussion

Defining `__repr__()` and `__str__()` is often good practice, as it can simplify debugging and instance output.  
For example, just by printing or logging an instance, a programmer will be shown more useful information about the instance contents.  
It is standard practice for the output of `__repr__()` to produce text such that `eval(repr(x)) == x`.  
If this is not possible or desired, then it is common to create a useful textual representation enclosed in `<` and `>` instead.

In [6]:
f = open('example.bin')
f

<_io.TextIOWrapper name='example.bin' mode='r' encoding='UTF-8'>

In [7]:
f.close()

If no `__str__()` is defined, the output of `__repr__()` is used as a fallback.  
The use of `format()` in the solution might look a little funny, but the format code `{0.x}` specifies the x-attribute of argument 0.  
So, in the following function, the 0 is actually the instance self:

In [8]:
def __repr__(self):
    return "Pair({0.x!r}, {0.y!r})".format(self)

You can also use the `%` string-formatting operator:

In [9]:
def __repr__(self):
    return "Pair(%r, %r)" % (self.x, self.y)

In [10]:
p

Pair(3, 4)

In [11]:
print(p)

(3, 4)
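
As a small illustration of two points from the discussion (the `eval(repr(x)) == x` convention and the fallback from `__str__()` to `__repr__()`), here is a minimal sketch. The `Point` class and its `__eq__()` method are assumptions introduced only for this example; equality has to be defined for the round-trip check to mean anything.

```python
class Point:
    """Hypothetical class used only for this illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return "Point({0.x!r}, {0.y!r})".format(self)

    def __eq__(self, other):
        # Without __eq__, two distinct instances never compare equal
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

pt = Point(3, 4)
round_trip = eval(repr(pt))   # rebuilds an equal Point from its repr text
fallback = str(pt)            # no __str__ defined, so __repr__ is used
```

Note that `eval()` is used here purely to demonstrate the convention; it should not be applied to untrusted text.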


## 8.2. Customizing String Formatting

### Problem

You want an object to support customized formatting through the `format()` function and string method.

### Solution

You can customize string formatting by defining the `__format__()` method on a class.

In [12]:
_formats = {
        'ymd' : '{d.year}-{d.month}-{d.day}',
        'mdy' : '{d.month}/{d.day}/{d.year}',
        'dmy' : '{d.day}/{d.month}/{d.year}'
        }

class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day
        
    def __format__(self, code):
        if code == '':
            code = 'ymd'
        fmt = _formats[code]
        return fmt.format(d=self)

Instances of the `Date` class now support operations like the ones below:

In [13]:
d = Date(2019, 3, 21)
format(d)

'2019-3-21'

In [14]:
format(d, 'mdy')

'3/21/2019'

In [15]:
"The date is {:ymd}".format(d)

'The date is 2019-3-21'

In [16]:
"The date is {:mdy}".format(d)

'The date is 3/21/2019'

### Discussion


The `__format__()` method provides a hook into Python’s string formatting functionality.  
It’s important to emphasize that the interpretation of format codes is entirely up to the class itself.  
Thus, the codes can be almost anything at all.  
For example, consider the following from the `datetime` module:

In [17]:
from datetime import date

d = date(2019, 3, 21)
format(d)

'2019-03-21'

In [18]:
format(d, '%A, %B, %d, %Y')

'Thursday, March, 21, 2019'

In [19]:
'The end is coming on {:%d %b %Y}. Farewell.'.format(d)

'The end is coming on 21 Mar 2019. Farewell.'

There are some standard conventions for the formatting of the built-in types.  
See the [documentation for the string module](https://docs.python.org/3/library/string.html) for a formal specification.
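
To make the point that format codes are entirely class-defined more concrete, here is a minimal sketch. The `Temperature` class and its `'c'` and `'f'` codes are invented for this illustration. It also shows that f-strings go through the same `__format__()` hook, and one possible way to reject unknown codes.

```python
class Temperature:
    """Hypothetical class; the 'c' and 'f' codes are made up for this sketch."""
    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, code):
        if code in ('', 'c'):
            return '{:.1f} C'.format(self.celsius)
        if code == 'f':
            return '{:.1f} F'.format(self.celsius * 9 / 5 + 32)
        # Reject anything else explicitly instead of failing obscurely
        raise ValueError('Unknown format code {!r}'.format(code))

t = Temperature(21.5)
default_text = format(t)                            # empty code
fahrenheit_text = 'It is {:f} outside'.format(t)    # str.format() passes 'f'
fstring_text = f'{t:c}'                             # f-strings pass 'c'
```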

## 8.3. Making Objects Support the Context-Management Protocol

### Problem

You want to make your objects support the context-management protocol, aka the `with` statement.

### Solution

In order to make an object compatible with the `with` statement, you need to implement `__enter__()` and `__exit__()` methods.  
For example, consider the following class, which provides a network connection:

In [20]:
from socket import socket, AF_INET, SOCK_STREAM

class LazyConnection:
    def __init__(self, address, family=AF_INET, type=SOCK_STREAM):
        self.address = address
        self.family = family
        self.type = type
        self.sock = None
        
    def __enter__(self):
        if self.sock is not None:
            raise RuntimeError('*** Already connected ***')
        self.sock = socket(self.family, self.type)
        self.sock.connect(self.address)
        return self.sock
    
    def __exit__(self, exc_ty, exc_val, tb):
        self.sock.close()
        self.sock = None

The key feature of this class is that it represents a network connection, but it doesn’t actually do anything initially (e.g., it doesn’t establish a connection).  
Instead, the connection is established and closed using the with statement (essentially on demand).

In [21]:
from functools import partial

conn = LazyConnection(('www.python.org', 80))
# Connection closed
with conn as s:
    # conn.__enter__() executes: connection open
    s.send(b'GET /index.html HTTP/1.0\r\n')
    s.send(b'Host: www.python.org\r\n')
    s.send(b'\r\n')
    resp = b''.join(iter(partial(s.recv, 8192), b''))
    # conn.__exit__() executes: connection closed

### Discussion

The main principle behind writing a context manager is that you’re writing code that’s meant to surround a block of statements as defined by the use of the `with` statement.  
When the `with` statement is first encountered, the `__enter__()` method is triggered.  
The return value of `__enter__()` (if any) is placed into the variable indicated with the `as` qualifier.  
Afterward, the statements in the body of the with statement execute.  
Finally, the `__exit__()` method is triggered to clean up.  
This control flow happens regardless of what happens in the body of the with statement, including if there are exceptions.  
In fact, the three arguments to the `__exit__()` method contain the exception type, value, and traceback for pending exceptions (if any).  
The `__exit__()` method can choose to use the exception information in some way or to ignore it by doing nothing and returning `None` as a result.  
If `__exit__()` returns `True`, the exception is cleared as if nothing happened and the program continues executing statements immediately after the with block.  
One subtle aspect of this recipe is whether or not the `LazyConnection` class allows nested use of the connection with multiple with statements.  
As shown, only a single socket connection at a time is allowed, and an exception is raised if a repeated with statement is attempted when a socket is already in use.  
You can work around this limitation with a slightly different implementation, as shown here:

In [22]:
from socket import socket, AF_INET, SOCK_STREAM

class LazyConnection:
    def __init__(self, address, family=AF_INET, type=SOCK_STREAM):
        self.address = address
        self.family = family
        self.type = type
        self.connections = []
        
    def __enter__(self):
        sock = socket(self.family, self.type)
        sock.connect(self.address)
        self.connections.append(sock)
        return sock
    
    def __exit__(self, exc_ty, exc_val, tb):
        self.connections.pop().close()

In this second version, the `LazyConnection` class serves as a kind of factory for connections.  
Internally, a list is used to keep a stack.  
Whenever `__enter__()` executes, it makes a new connection and adds it to the stack.  
The `__exit__()` method simply pops the last connection off the stack and closes it.  
It’s subtle, but this allows multiple connections to be created at once with nested `with` statements.  
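
The stack behavior with nested `with` statements can be sketched without touching the network. In the following example, `FakeResource` and `LazyResource` are hypothetical stand-ins for the socket and for `LazyConnection`; only the stack logic mirrors the second version of the recipe.

```python
class FakeResource:
    """Stand-in for a socket so that the sketch runs offline (hypothetical)."""
    def __init__(self, ident):
        self.ident = ident
        self.closed = False

    def close(self):
        self.closed = True

class LazyResource:
    def __init__(self):
        self.resources = []   # list used as a stack
        self.counter = 0

    def __enter__(self):
        self.counter += 1
        res = FakeResource(self.counter)
        self.resources.append(res)
        return res

    def __exit__(self, exc_ty, exc_val, tb):
        # Close the most recently opened resource
        self.resources.pop().close()

factory = LazyResource()
with factory as r1:
    with factory as r2:
        inner_state = (r1.closed, r2.closed, len(factory.resources))
    middle_state = (r1.closed, r2.closed, len(factory.resources))
final_state = (r1.closed, r2.closed, len(factory.resources))
```

Each `with` block gets its own resource, and the innermost one is closed first.
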
Context managers are most commonly used in programs that need to manage resources such as files, network connections, and locks.  
A key part of such resources is they have to be explicitly closed or released to operate correctly.  
For instance, if you acquire a lock, then you have to make sure you release it, or else you risk deadlock.  
By implementing `__enter__()`, `__exit__()`, and using the with statement, it is much easier to avoid such problems, since the cleanup code in the `__exit__()` method is guaranteed to run no matter what.  
An alternative formulation of context managers, based on the `contextmanager` decorator in the `contextlib` module, is found in Recipe 9.22.  
A thread-safe version of this recipe can be found in Recipe 12.6.
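
The discussion above notes that `__exit__()` receives the pending exception and can clear it by returning `True`. A minimal sketch of that behavior follows; the `Swallow` class is a hypothetical name used only here (the standard library's `contextlib.suppress` serves a similar purpose).

```python
class Swallow:
    """Hypothetical context manager that clears one exception type."""
    def __init__(self, exc_type):
        self.exc_type = exc_type
        self.caught = None

    def __enter__(self):
        return self

    def __exit__(self, exc_ty, exc_val, tb):
        # exc_ty is None when the body finished without an exception
        if exc_ty is not None and issubclass(exc_ty, self.exc_type):
            self.caught = exc_val
            return True    # exception is cleared
        return False       # any other exception propagates

reached = False
with Swallow(ZeroDivisionError) as guard:
    1 / 0
    reached = True         # skipped: the body stops at the exception
after_block = 'continued'  # execution resumes right after the with block
```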

## 8.4. Saving Memory When Creating a Large Number of Instances

### Problem

Your program creates a large number (e.g., millions) of instances and uses a large amount of memory.

### Solution

For classes that primarily serve as simple data structures, you can often greatly reduce the memory footprint of instances by adding the `__slots__` attribute to the class definition.

In [23]:
class Date:
    __slots__ = ['year', 'month', 'day']
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

When you define `__slots__`, Python uses a much more compact internal representation for instances.  
Instead of each instance consisting of a dictionary, instances are built around a small fixed-sized array, much like a tuple or list.  
Attribute names listed in the `__slots__` specifier are internally mapped to specific indices within this array.  
A side effect of using slots is that it is no longer possible to add new attributes to instances — you are restricted to only those attribute names listed in the `__slots__` specifier.
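
The side effect described above is easy to observe. The following sketch repeats the slotted `Date` class so that it runs on its own, then tries to assign an attribute that is not listed in `__slots__`.

```python
class Date:
    __slots__ = ['year', 'month', 'day']
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

d = Date(2019, 3, 21)
d.day = 22                                   # listed in __slots__, so this works
has_instance_dict = hasattr(d, '__dict__')   # slotted instances have no __dict__

slot_error = None
try:
    d.weekday = 'Friday'                     # not listed in __slots__
except AttributeError as exc:
    slot_error = exc
```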

### Discussion

The memory saved by using `__slots__` varies according to the number and type of attributes stored.  
However, in general, the resulting memory use is comparable to that of storing data in a tuple.  
To give you an idea, storing a single `Date` instance without slots requires 428 bytes of memory on a 64-bit version of Python.  
If slots is defined, it drops to 156 bytes.  
In a program that manipulated a large number of dates all at once, this would make a significant reduction in overall memory use.  
Although slots may seem like a feature that could be generally useful, you should resist the urge to use it in most code.  
There are many parts of Python that rely on the normal dictionary-based implementation.  
In addition, classes that define slots are restricted in certain ways; for example, a class cannot inherit from more than one base class that defines non-empty slots.  
For the most part, you should only use slots on classes that are going to serve as frequently used data structures in your program (like if your program created millions of instances of a particular class).  
A common misperception of `__slots__` is that it is an encapsulation tool that prevents users from adding new attributes to instances.  
Although this is a side effect of using slots, this was never the original purpose.  
Instead, `__slots__` was always [intended to be an optimization tool](http://python-history.blogspot.com/2010/06/inside-story-on-new-style-classes.html).
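
The exact byte counts depend on the Python version and platform, so it can be worth measuring on your own interpreter. Here is one rough way to do so with the standard `tracemalloc` module. The class names `PlainDate` and `SlotDate` and the helper `bytes_per_instance()` are assumptions made for this sketch, and the result includes the small per-element cost of the list holding the instances.

```python
import tracemalloc

class PlainDate:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

class SlotDate:
    __slots__ = ['year', 'month', 'day']
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

def bytes_per_instance(cls, count=10000):
    """Approximate traced bytes allocated per instance of cls."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    objs = [cls(2019, 3, 21) for _ in range(count)]   # keep instances alive
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return (after - before) / count

plain = bytes_per_instance(PlainDate)
slotted = bytes_per_instance(SlotDate)
print('plain: {:.0f} bytes, slotted: {:.0f} bytes'.format(plain, slotted))
```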

## 8.5. Encapsulating Names in a Class

### Problem

You want to encapsulate restricted data on instances of a class, but you are concerned about Python's lack of access control.

### Solution

Rather than relying on language features to encapsulate data, Python programmers are expected to observe certain naming conventions concerning the intended usage of data and methods.  
The first convention is that any name that starts with a single leading underscore `(_)` should *always* be assumed to be internal implementation.

In [24]:
class A:
    def __init__(self):
        # internal attribute
        self._internal = 0
        # public attribute
        self.public = 1
        
    def public_method(self):
        """
        Any public method will do.
        """
        pass
    
    def _internal_method(self):
        """
        Any private method will do.
        """

Python doesn’t actually prevent someone from accessing internal names.  
However, doing so is considered impolite, and may result in fragile code.  
It should be noted, too, that the leading-underscore convention also applies to module names and module-level functions.  
For example, if you ever see a module name that starts with a leading underscore (like `_socket`), it’s an internal implementation.  
Likewise, module-level functions such as `sys._getframe()` should only be used with great caution.  
You may also encounter the use of two leading underscores `(__)` on names within class definitions.

In [25]:
class B: 
    def __init__(self):
        self.__private = 0
        
    def __private_method(self):
        """
        Did you know that a docstring alone can form a complete function in Python?
        You don't even need the pass keyword.
        Thanks Guido;)
        """
    def public_method(self):
        self.__private_method()
        pass

The use of double leading underscores causes the name to be mangled to something else.  
Specifically, the private attributes in the preceding class get renamed to `_B__private` and `_B__private_method`, respectively.  
At this point, you might ask what purpose such name mangling serves.  
The answer is inheritance — such attributes cannot be overridden via inheritance.

In [26]:
class C(B):
    def __init__(self):
        super().__init__()
        self.__private = 1 # Doesn't override B.__private
    def __private_method(self):
        """
        Doesn't override B.__private_method()
        """

Here, the private names `__private` and `__private_method()` get renamed to `_C__private` and `_C__private_method`, which are different from the mangled names in the base class `B`.
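
The mangling can be observed by inspecting an instance. In this sketch, `Base` and `Derived` are hypothetical classes that follow the same pattern as `B` and `C`, with small accessor methods added so that each class can report its own value.

```python
class Base:
    def __init__(self):
        self.__private = 0           # stored as _Base__private

    def base_value(self):
        return self.__private        # mangled to self._Base__private

class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__private = 1           # stored as _Derived__private

    def derived_value(self):
        return self.__private        # mangled to self._Derived__private

obj = Derived()
attribute_names = sorted(vars(obj))  # ['_Base__private', '_Derived__private']
```

Both attributes coexist on the same instance, so the subclass assignment does not disturb the value the base class relies on.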

### Discussion

The fact that there are two different conventions (single underscore versus double underscore) for "private" attributes leads to the obvious question of which style you should use.  
For most code, you should probably just make your nonpublic names start with a single underscore.  
If, however, you know that your code will involve subclassing, and there are internal attributes that should be hidden from subclasses, use the double underscore instead.  
It should also be noted that sometimes you may want to define a variable that clashes with the name of a reserved word.  
For this, you should use a single trailing underscore.

In [27]:
lambda_ = 2.0
# so you don't clash with the lambda keyword

The reason for not using a leading underscore here is that it avoids confusion about the intended usage.  
In other words, the use of a leading underscore could be interpreted as a way to avoid a name collision rather than as an indication that the value is private.  
Using a single trailing underscore solves this problem.

## 8.6. Creating Managed Attributes

### Problem

You want to add extra processing, such as type checking or validation, to the getting or setting of an instance attribute.

### Solution

A simple way to customize access to an attribute is to define it as a [property](https://www.programiz.com/python-programming/property).  
For example, this code defines a `property` that adds simple type checking to an attribute:

In [28]:
class Person:
    def __init__(self, first_name):
        self.first_name = first_name
        
    # Getter function
    @property
    def first_name(self):
        return self._first_name
    
    # Setter function
    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value
            
    # Optional Deleter function
    @first_name.deleter
    def first_name(self):
        raise AttributeError('This attribute cannot be deleted')

In the preceding code, there are three related methods, all of which must have the same name.  
The first method is a getter function, and establishes `first_name` as being a property.  
The other two methods attach optional setter and deleter functions to the `first_name` property.  
It’s important to stress that the `@first_name.setter` and `@first_name.deleter` decorators won’t be defined unless `first_name` was already established as a property using the `@property` decorator.
A critical feature of a property is that it looks like a normal attribute, but access *automatically* triggers the getter, setter, and deleter methods.

In [29]:
a = Person('Guido')
a.first_name

'Guido'
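
The setter and deleter are triggered in the same automatic way. The following sketch repeats the `Person` class so that it is self-contained, then records what happens on an invalid assignment and on an attempted deletion.

```python
class Person:
    def __init__(self, first_name):
        self.first_name = first_name

    @property
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value

    @first_name.deleter
    def first_name(self):
        raise AttributeError('This attribute cannot be deleted')

a = Person('Guido')
errors = []
try:
    a.first_name = 42        # triggers the setter, which rejects non-strings
except TypeError as exc:
    errors.append(str(exc))
try:
    del a.first_name         # triggers the deleter
except AttributeError as exc:
    errors.append(str(exc))
```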

When implementing a property, the underlying data (if any) still needs to be stored somewhere.  
Thus, in the getter and setter methods, you see direct manipulation of a `_first_name` attribute, which is where the actual data lives.  
In addition, you may ask why the `__init__()` method sets `self.first_name` instead of `self._first_name`.  
In this example, the entire point of the property is to apply type checking when setting an attribute.  
Thus, chances are you would also want such checking to take place during initialization.  
By setting `self.first_name`, the set operation uses the setter method instead of bypassing it by accessing `self._first_name`.  
Properties can also be defined for existing get and set methods.

In [30]:
class Person:
    def __init__(self, first_name):
        self.set_first_name(first_name)
        
    # Getter function
    def get_first_name(self):
        return self._first_name
    
    # Setter function
    def set_first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value
        
    # Optional Deleter function
    def del_first_name(self):
        raise AttributeError("This attribute cannot be deleted")
        
    # Make a property from the existing get and set methods
    name = property(get_first_name, set_first_name, del_first_name)

### Discussion

A property attribute is actually a collection of methods bundled together.  
If you inspect a class with a property, you can find the raw methods in the `fget`, `fset`, and `fdel` attributes of the property itself.
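
For illustration only, the raw methods can be reached through the class like this. The compact `Person` class below is a self-contained variant of the earlier one, written without a deleter so that `fdel` is visibly empty.

```python
class Person:
    def __init__(self, first_name):
        self.first_name = first_name

    @property
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value

prop = Person.first_name          # access through the class returns the property object
p = Person('Guido')
via_fget = prop.fget(p)           # equivalent to p.first_name
prop.fset(p, 'Larry')             # equivalent to p.first_name = 'Larry'
no_deleter = prop.fdel is None    # no deleter was defined in this sketch
```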

Normally, you wouldn't call `fget` or `fset` directly, but they are triggered automatically when the property is accessed.  
Properties should only be used in cases where you actually need to perform extra processing on attribute access.  
Sometimes programmers coming from languages such as Java feel that all access should be handled by getters and setters, and that they should write code like this:

In [31]:
class Person:
    def __init__(self, first_name):
        self.first_name = first_name
    @property
    def first_name(self):
        return self._first_name
    @first_name.setter
    def first_name(self, value):
        self._first_name = value

Don’t write properties that don’t actually add anything extra like this.  
For one, it makes your code more verbose and confusing to others.  
Second, it will make your program run a lot slower.  
Lastly, it offers no real design benefit.  
Specifically, if you later decide that extra processing needs to be added to the handling of an ordinary attribute, you could promote it to a property without changing existing code.  
This is because the syntax of code that accessed the attribute would remain unchanged.
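
Here is a small sketch of that promotion. `AccountV1`, `AccountV2`, and `deposit()` are hypothetical names: the first class uses a plain attribute, the second promotes it to a validating property, and the calling code is identical for both.

```python
class AccountV1:
    def __init__(self, balance):
        self.balance = balance            # ordinary attribute

class AccountV2:
    def __init__(self, balance):
        self.balance = balance            # now goes through the setter

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError('Balance cannot be negative')
        self._balance = value

def deposit(account, amount):
    # Caller code: unchanged between the two versions
    account.balance = account.balance + amount
    return account.balance

old_result = deposit(AccountV1(10), 5)
new_result = deposit(AccountV2(10), 5)
```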

Properties can also be a way to define computed attributes.  
These are attributes that are not actually stored, but computed on demand.

In [32]:
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius
    @property
    def area(self):
        return math.pi * self.radius ** 2
    @property
    def perimeter(self):
        return 2 * math.pi * self.radius

Here, the use of properties results in a very uniform instance interface in that `radius`, `area`, and `perimeter` are all accessed as simple attributes, as opposed to a mix of simple attributes and method calls.

In [33]:
c = Circle(4.0)
c.radius

4.0

In [34]:
# notice the lack of ()
c.area

50.26548245743669

In [35]:
# again no parentheses
c.perimeter

25.132741228718345

Although properties give you an elegant programming interface, sometimes you actually may want to directly use getter and setter functions.

This often arises in situations where Python code is being integrated into a larger infrastructure of systems or programs.  
For example, perhaps a Python class is going to be plugged into a large distributed system based on remote procedure calls or distributed objects.  
In such a setting, it may be much easier to work with an explicit get/set method (as a normal method call) rather than a property that implicitly makes such calls.

Last, please don’t write Python code that features a lot of repetitive property definitions.

In [36]:
class Person:
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
        
    @property
    def first_name(self):
        return self._first_name
    
    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value
        
    # Repeated property code, but for a different name, which is confusing and not DRY
    @property
    def last_name(self):
        return self._last_name
    
    @last_name.setter
    def last_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._last_name = value

Code repetition leads to bloated, error-prone, and ugly code.  
As it turns out, there are much better ways to achieve the same thing using descriptors or closures.  
See Recipes 8.9 and 9.21.
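
To give a rough idea of the closure-based approach, here is a minimal sketch. The helper name `typed_property()` and its parameters are assumptions made for this example, not something defined earlier in this chapter; the referenced recipes cover the topic properly.

```python
def typed_property(name, expected_type):
    """Build a type-checked property that stores its value under '_' + name."""
    storage_name = '_' + name

    @property
    def prop(self):
        return getattr(self, storage_name)

    @prop.setter
    def prop(self, value):
        if not isinstance(value, expected_type):
            raise TypeError('{} must be a {}'.format(name, expected_type.__name__))
        setattr(self, storage_name, value)

    return prop

class Person:
    # One line per attribute instead of a repeated getter/setter pair
    first_name = typed_property('first_name', str)
    last_name = typed_property('last_name', str)

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

p = Person('Guido', 'van Rossum')
```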

## 8.7. Calling a Method on a Parent Class

### Problem 

You want to invoke a method in a parent class in place of a method that has been overridden in a subclass.

### Solution

To call a method in a parent (or superclass), use the `super()` function.

In [37]:
class A:
    def spam(self):
        print('A.spam')
        
class B(A):
    def spam(self):
        print('B.spam')
        # call parent spam function
        super().spam()

A very common use of `super()` is in the handling of the `__init__()` method to make sure that parents are properly initialized:

In [38]:
class A:
    def __init__(self):
        self.x = 0
        
class B(A):
    def __init__(self):
        super().__init__()
        self.y = 1

Another common use of `super()` is in code that overrides any of Python's special methods.

In [39]:
class Proxy:
    def __init__(self, obj):
        self._obj = obj
        
    # delegate attribute lookup to internal object
    def __getattr__(self, name):
        return getattr(self._obj, name)
    
    # delegate attribute assignment
    def __setattr__(self, name, value):
        if name.startswith('_'):
            # call original setattr
            super().__setattr__(name, value)
        else:
            setattr(self._obj, name, value)

In this code, the implementation of `__setattr__()` includes a name check.  
If the name starts with an underscore `(_)`, it invokes the original implementation of `__setattr__()` using `super()`.  
Otherwise, it delegates to the internally held object `self._obj`.  
It looks a little funny, but `super()` works even though there is no explicit base class listed.
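
A short usage sketch may make the delegation easier to follow. The `Proxy` class is repeated so that the example runs on its own, and `Target` is a hypothetical wrapped class introduced for this illustration.

```python
class Proxy:
    def __init__(self, obj):
        self._obj = obj                  # starts with '_', so stored on the proxy

    def __getattr__(self, name):
        # Only called when normal lookup on the proxy fails
        return getattr(self._obj, name)

    def __setattr__(self, name, value):
        if name.startswith('_'):
            super().__setattr__(name, value)
        else:
            setattr(self._obj, name, value)

class Target:
    """Hypothetical wrapped object."""
    def __init__(self, x):
        self.x = x

t = Target(2)
p = Proxy(t)
read_through = p.x    # forwarded to t by __getattr__
p.x = 10              # forwarded to t by __setattr__
p._note = 'local'     # leading underscore: stored on the proxy itself
```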

### Discussion

Correct use of the `super()` function is actually one of the most poorly understood aspects of Python.  
Occasionally, you will see code written that directly calls a method in a parent like this:

In [40]:
class Base:
    def __init__(self):
        print('Base.__init__')
        
class A(Base):
    def __init__(self):
        Base.__init__(self)
        print('A.__init__')

Although you can usually get away with this, it can lead to trouble with inheritance later on.

In [41]:
class Base:
    def __init__(self):
        print('Base.__init__')
        
class A(Base):
    def __init__(self):
        Base.__init__(self)
        print('A.__init__')
        
class B(Base):
    def __init__(self):
        Base.__init__(self)
        print('B.__init__')
        
class C(A,B):
    def __init__(self):
        A.__init__(self)
        B.__init__(self)
        print('C.__init__')

If you run this code, you will see that the `Base.__init__()` method is invoked twice:

In [42]:
c = C()

Base.__init__
A.__init__
Base.__init__
B.__init__
C.__init__


Maybe that double-invocation of `Base.__init__()` is harmless, but you won't have to worry if you write the code using `super()`.

In [43]:
class Base:
    def __init__(self):
        print('Base.__init__')
        
class A(Base):
    def __init__(self):
        super().__init__()
        print('A.__init__')
        
class B(Base):
    def __init__(self):
        super().__init__()
        print('B.__init__')
        
class C(A, B):
    def __init__(self):
        # single call to super
        super().__init__()
        print('C.__init__')

Now each `__init__` method is only called once:

In [44]:
c = C()

Base.__init__
B.__init__
A.__init__
C.__init__


To understand why it works, we need to step back for a minute and discuss how Python implements inheritance.  
For every class that you define, Python computes what’s known as a [method resolution order (MRO)](http://python-history.blogspot.com/2010/06/method-resolution-order.html) list.  
The MRO list is simply a linear ordering of all the base classes.  

In [45]:
print(C.__mro__)

(<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class '__main__.Base'>, <class 'object'>)


[As a larger example](https://en.wikipedia.org/wiki/C3_linearization#Example_demonstrated_in_Python_3), first define a metaclass that gives each class a short representation consisting of its name.

In [46]:
class Type(type):
    def __repr__(cls):
        return cls.__name__
    
class O(object, metaclass=Type): pass

Now construct the inheritance tree:

In [47]:
class A(O): pass

class B(O): pass

class C(O): pass

class D(O): pass

class E(O): pass

class K1(A, B, C): pass

class K2(D, B, E): pass

class K3(D, A): pass

class Z(K1, K2, K3): pass

In [48]:
print(Z.mro())

[Z, K1, K2, K3, D, A, B, C, E, O, <class 'object'>]


To implement inheritance, Python starts with the leftmost class and works its way left-to-right through classes on the MRO list until it finds the first attribute match.  
The actual determination of the MRO list itself is made using a technique known as [C3 Linearization](https://en.wikipedia.org/wiki/C3_linearization).  
Without getting too bogged down in the mathematics of it, it is actually a merge sort of the MROs from the parent classes subject to three constraints:  
* Child classes get checked before parents.  
* Multiple parents get checked in the order listed.  
* If there are two valid choices for the next class, pick the one from the first parent. 
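
The constraints above can be checked with a small experiment. The class names in this sketch are arbitrary. The first hierarchy lists its parents in a consistent order; the second asks for `X` before `Y` in one parent and `Y` before `X` in the other, which cannot be linearized, so Python refuses to create the class.

```python
class X: pass
class Y: pass

# Consistent: both parents list X before Y
class A(X, Y): pass
class B(X, Y): pass
class C(A, B): pass
consistent = [cls.__name__ for cls in C.__mro__]

# Inconsistent: P wants X before Y, Q wants Y before X
class P(X, Y): pass
class Q(Y, X): pass

conflict = None
try:
    class R(P, Q): pass
except TypeError as exc:
    conflict = str(exc)   # message reports that no consistent MRO exists
```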

Honestly, all you really need to know is that the order of classes in the MRO list accommodates almost any class hierarchy you are going to define.  
When you use the `super()` function, Python continues its search starting with the next class on the MRO.  
As long as every redefined method consistently uses `super()` and only calls it once, control will ultimately work its way through the entire MRO list and each method will only be called once.  
This is why you don’t get double calls to `Base.__init__()` in the second example.  
A somewhat surprising aspect of `super()` is that it doesn’t necessarily go to the direct parent of a class next in the MRO and that you can even use it in a class with no direct parent at all.  
For example, consider this class:

In [49]:
class A:
    def spam(self):
        print('A.spam')
        super().spam()

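Used on its own, this class is broken: calling `A().spam()` prints `A.spam` and then raises an `AttributeError`, because the next class in the MRO is `object`, which has no `spam()` method.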

However, watch what happens if you begin using the class with multiple inheritance:

In [50]:
class B:
    def spam(self):
        print('B.spam')
        
class C(A,B):
    pass

In [51]:
c = C()
c.spam()

A.spam
B.spam


Here you see that the use of `super().spam()` in class `A` has called the `spam()` method in class `B` -- a class that is completely unrelated to `A`.  
This is all explained by the MRO of class `C`:

In [52]:
print(C.__mro__)

(<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>)


Using `super()` in this manner is most common when defining mixin classes.  
See Recipes 8.13 and 8.18.  
However, because `super()` might invoke a method that you’re not expecting, there are a few general rules of thumb you should try to follow.  
First, make sure that all methods with the same name in an inheritance hierarchy have a compatible calling signature, including the same number of arguments and argument names.  
This ensures that `super()` won’t get tripped up if it tries to invoke a method on a class that’s not a direct parent.  
Second, it’s usually a good idea to make sure that the topmost class provides an implementation of the method so that the chain of lookups that occur along the MRO get terminated by an actual method of some sort.  
Use of `super()` is sometimes a source of debate in the Python community.  
However, all things being equal, you should probably use it in modern code.  
Raymond Hettinger has written an excellent blog post [Python’s super() Considered Super!](https://rhettinger.wordpress.com/2011/05/26/super-considered-super/) that has even more examples and reasons why `super()` might be super-awesome.
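
The two rules of thumb can be sketched as follows. The `Root` class and the `log` parameter are assumptions introduced for this example: every `spam()` takes the same argument, and the topmost class provides an implementation that ends the chain without calling `super()`.

```python
class Root:
    def spam(self, log):
        # Terminates the chain: no further super() call
        log.append('Root.spam')

class A(Root):
    def spam(self, log):
        log.append('A.spam')
        super().spam(log)

class B(Root):
    def spam(self, log):
        log.append('B.spam')
        super().spam(log)

class C(A, B):
    pass

combined = []
C().spam(combined)   # follows the MRO: A, then B, then Root

alone = []
A().spam(alone)      # A also works on its own, because Root ends the chain
```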

## 8.8. Extending a Property in a Subclass

### Problem

Within a subclass, you want to extend the functionality of a property defined in a parent class.   

### Solution

Consider the following code, which defines a property:

In [53]:
class Person:
    def __init__(self, name):
        self.name = name
        
    
    # getter function
    @property
    def name(self):
        return self._name
    
    # setter function
    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._name = value
        
    # deleter function
    @name.deleter
    def name(self):
        raise AttributeError("Can't delete attribute")

Here is an example of a class that inherits from `Person` and extends the `name` property with new functionality:

In [54]:
class SubPerson(Person):
    @property
    def name(self):
        print('Getting name')
        return super().name
    
    @name.setter
    def name(self, value):
        print('Setting name to', value)
        super(SubPerson, SubPerson).name.__set__(self, value)
        
    @name.deleter
    def name(self):
        print('Deleting name')
        super(SubPerson, SubPerson).name.__delete__(self)

Here is an example of our new class in use:

In [55]:
s = SubPerson('Guido')

Setting name to Guido


In [56]:
s.name

Getting name


'Guido'

In [57]:
s.name = 'Larry'

Setting name to Larry


If you only want to extend one of the methods of a property, use code such as the following:

In [58]:
class SubPerson(Person):
    @Person.name.getter
    def name(self):
        print('Getting name')
        return super().name

If you just want the setter, you can try this:

In [59]:
class SubPerson(Person):
    @Person.name.setter
    def name(self, value):
        print('Setting name to', value)
        super(SubPerson, SubPerson).name.__set__(self, value)

### Discussion

Extending a property in a subclass introduces a number of very subtle problems related to the fact that a property is defined as a collection of getter, setter, and deleter methods, as opposed to just a single method.  
Thus, when extending a property, you need to figure out if you will redefine all of the methods together or just one of the methods.  
In the first example, all of the property methods are redefined together.  
Within each method, `super()` is used to call the previous implementation.  
The use of `super(SubPerson, SubPerson).name.__set__(self, value)` in the setter function is no mistake.  
To delegate to the previous implementation of the setter, control needs to pass through the `__set__()` method of the previously defined name property.  
However, the only way to get to this method is to access it as a class variable instead of an instance variable.  
This is what happens with the `super(SubPerson, SubPerson)` operation.  
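
To see what `super(SubPerson, SubPerson).name` evaluates to, here is a minimal sketch using cut-down stand-ins for the classes above (the stripped-down `Person` and the empty `SubPerson` are assumptions made for brevity). Accessed this way, the lookup returns the property object itself rather than calling its getter, which is what makes `__set__()` reachable.

```python
class Person:
    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value

class SubPerson(Person):
    pass

s = SubPerson()

# Looking the attribute up on the class yields the property object itself
prop = super(SubPerson, SubPerson).name
print(type(prop).__name__)               # property
print(prop is Person.__dict__['name'])   # True

# Calling __set__() by hand runs Person's setter against the instance
prop.__set__(s, 'Guido')
print(s.name)                            # Guido
```
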
If you only want to redefine one of the methods, it’s not enough to use `@property` by itself.

In [60]:
class SubPerson(Person):
    @property
    def name(self):
        print('Getting name')
        return super().name

With this code, the setter function disappears entirely, so creating an instance raises an `AttributeError` as soon as `__init__()` tries to assign to `name`.  
With the code shown in the solution, which uses `@Person.name.getter`, all of the previously defined methods of the property are copied, and the getter function is replaced.  
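
The difference between the two approaches can be checked with a short sketch. The class names `GetterOnly` and `GetterExtended` are hypothetical, and `Person` is a trimmed copy of the class from the solution. Only the exception type is checked here, since the wording of the message varies between Python versions.

```python
class Person:
    def __init__(self, name):
        self.name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._name = value

class GetterOnly(Person):
    # A brand-new property: the inherited setter is not carried over
    @property
    def name(self):
        return super().name

class GetterExtended(Person):
    # A copy of Person.name with only the getter replaced
    @Person.name.getter
    def name(self):
        return super().name

try:
    GetterOnly('Guido')
    failed = False
except AttributeError:
    failed = True

print(failed)                                          # True
print(GetterExtended('Guido').name)                    # Guido
print(GetterExtended.name.fset is Person.name.fset)    # True
```
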
In this particular solution, there is no way to replace the hardcoded class name `Person` with something more generic.  
If you don’t know which base class defined a property, you should use the solution where all of the property methods are redefined and `super()` is used to pass control to the previous implementation.  
It’s worth noting that the first technique shown in this recipe can also be used to extend a descriptor, as described in Recipe 8.9.

In [61]:
# A descriptor
class String:
    def __init__(self, name):
        self.name = name
        
    def __get__(self, instance, cls):
        if instance is None:
            return self
        return instance.__dict__[self.name]
    
    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
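        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        # Needed so that the subclass deleter below has something to delegate to
        del instance.__dict__[self.name]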

        
# A class with a descriptor
class Person:
    name = String('name')
    def __init__(self, name):
        self.name = name

        
# Extending a descriptor with a property
class SubPerson(Person):
    @property
    def name(self):
        print('Getting name')
        return super().name
    
    @name.setter
    def name(self, value):
        print('Setting name to', value)
        super(SubPerson, SubPerson).name.__set__(self, value)
        
    @name.deleter
    def name(self):
        print('Deleting name')
        super(SubPerson, SubPerson).name.__delete__(self)

Finally, it’s worth noting that by the time you read this, subclassing of setter and deleter methods might be somewhat simplified.  
The solution shown will still work, but the [bug reported at Python’s issues page](https://bugs.python.org/issue14965) might evolve into a cleaner approach in a future Python version.

## 8.9. Creating a New Kind of Class or Instance Attribute

### Problem

You want to create a new kind of instance attribute type with some extra functionality, such as type checking.  

### Solution

If you want to create an entirely new kind of instance attribute, define its functionality in the form of a descriptor class. 

In [62]:
# Descriptor attribute for an integer type-checked attribute:
class Integer:
    def __init__(self, name):
        self.name = name
        
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return instance.__dict__[self.name]
        
    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise TypeError('Expected an integer')
        instance.__dict__[self.name] = value
        
    def __delete__(self, instance):
        del instance.__dict__[self.name]

A [descriptor](https://docs.python.org/3/howto/descriptor.html) is a class that implements the three core attribute access operations (get, set, and delete) in the form of `__get__()`, `__set__()`, and `__delete__()` special methods.  
These methods work by receiving an instance as input.  
The underlying dictionary of the instance is then manipulated as appropriate.  
To use a descriptor, instances of the descriptor are placed into a class definition as class variables.

In [63]:
class Point:
    x = Integer('x')
    y = Integer('y')
    def __init__(self, x, y):
        self.x = x
        self.y = y

When you do this, all access to the descriptor attributes x and y is captured by the `__get__()`, `__set__()`, and `__delete__()` methods.

In [64]:
p = Point(2, 3)
# calls Point.x.__get__(p, Point)
p.x

2

In [65]:
# calls Point.y.__set__(p, 5)
p.y = 5

As input, each method of a descriptor receives the instance being manipulated.  
To carry out the requested operation, the underlying instance dictionary (the `__dict__` attribute) is manipulated as appropriate.  
The `self.name` attribute of the descriptor holds the dictionary key being used to store the actual data in the instance dictionary.
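
As a quick check of this behavior, the following sketch repeats the `Integer` and `Point` definitions from above so that it runs on its own, then inspects the instance dictionary and tries an invalid assignment.

```python
class Integer:
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, cls):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise TypeError('Expected an integer')
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]

class Point:
    x = Integer('x')
    y = Integer('y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(2, 3)
print(vars(p))          # {'x': 2, 'y': 3}

try:
    p.x = 2.3           # Rejected by Integer.__set__()
except TypeError as e:
    print(e)            # Expected an integer

del p.y                 # Routed through Integer.__delete__()
print(vars(p))          # {'x': 2}
```
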

### Discussion

Descriptors provide the underlying magic for most of Python’s class features, including `@classmethod`, `@staticmethod`, `@property`, and even the `__slots__` specification.  
By defining a descriptor, you can capture the core instance operations (get, set, delete) at a very low level and completely customize what they do.  
This gives you great power, and is one of the most important tools employed by the writers of advanced libraries and frameworks.  
One common source of confusion is that descriptors can only be defined at the class level, not on a per-instance basis.  
Thus, code like this will not work:

In [66]:
class Point:
    def __init__(self, x, y):
        self.x = Integer('x')  # No. Must be a class variable.
        self.y = Integer('y')
        self.x = x
        self.y = y

The implementation of the `__get__()` method is also unusual:

In [67]:
# Descriptor attribute for an integer type-checked attribute.
class Integer:
    # ...
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return instance.__dict__[self.name]
    # ...

The reason `__get__()` looks somewhat complicated is to account for the distinction between instance variables and class variables.  
If a descriptor is accessed as a class variable, the instance argument is set to `None`.  
In this case, it is standard practice to simply return the descriptor instance itself, although any kind of custom processing is also allowed.
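
A minimal sketch of the two access paths follows. The `Integer` class here is a trimmed version of the one above, with type checking left out to keep the focus on `__get__()`.

```python
class Integer:
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, cls):
        if instance is None:
            return self           # Accessed through the class
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

class Point:
    x = Integer('x')

    def __init__(self, x):
        self.x = x

p = Point(2)
print(p.x)                              # 2
print(type(Point.x).__name__)           # Integer
print(Point.x is Point.__dict__['x'])   # True
```
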

Descriptors are often just one component of a larger programming framework involving decorators or metaclasses.  
As such, their use may be hidden just barely out of sight.  
As an example, here is some more advanced descriptor-based code involving a class decorator:

In [68]:
# Descriptor for a type-checked attribute:
class Typed:
    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type
        
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return instance.__dict__[self.name]
        
    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError('Expected ' + str(self.expected_type))
        instance.__dict__[self.name] = value
        
    def __delete__(self, instance):
        del instance.__dict__[self.name]
        
# Class decorator that is applied to selected attributes
def typeassert(**kwargs):
    def decorate(cls):
        for name, expected_type in kwargs.items():
            # Attach a typed descriptor to the class
            setattr(cls, name, Typed(name, expected_type))
        return cls
    return decorate

# Example use
@typeassert(name=str, shares=int, price=float)
class Stock:
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
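
To see the decorator at work, the sketch below repeats the definitions in condensed form (the `__delete__()` method is omitted for brevity) and then tries to create a `Stock` with a value of the wrong type.

```python
class Typed:
    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type

    def __get__(self, instance, cls):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError('Expected ' + str(self.expected_type))
        instance.__dict__[self.name] = value

def typeassert(**kwargs):
    def decorate(cls):
        for name, expected_type in kwargs.items():
            setattr(cls, name, Typed(name, expected_type))
        return cls
    return decorate

@typeassert(name=str, shares=int, price=float)
class Stock:
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

s = Stock('ACME', 50, 91.1)
print(s.shares)                 # 50

try:
    Stock('ACME', '50', 91.1)   # shares must be an int
except TypeError as e:
    print(e)                    # Expected <class 'int'>
```
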

Finally, it should be stressed that you would probably not write a descriptor if you simply want to customize the access of a single attribute of a specific class.  
For that, it’s easier to use a property instead, as described in Recipe 8.6.  
Descriptors are more useful in situations where there will be a lot of code reuse, like if you want to use the functionality provided by the descriptor in hundreds of places in your code or provide it as a library feature.

## 8.10. Using Lazily Computed Properties

### Problem

You’d like to define a read-only attribute as a property that only gets computed on access.  
However, once accessed, you’d like the value to be cached instead of re-computed.

### Solution

An efficient way to define a lazy attribute is through the use of a descriptor class, such as the following:

In [69]:
class lazyproperty:
    def __init__(self, func):
        self.func = func
        
    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            value = self.func(instance)
            setattr(instance, self.func.__name__, value)
            return value

You can use this code in a class like the following example:

In [70]:
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius
        
    @lazyproperty
    def area(self):
        print('Calculating area')
        return math.pi * self.radius ** 2
    
    @lazyproperty
    def perimeter(self):
        print('Computing perimeter')
        return 2 * math.pi * self.radius

Now for some examples:

In [71]:
c = Circle(4.0)
c.radius

4.0

In [72]:
c.area

Calculating area


50.26548245743669

In [73]:
c.area

50.26548245743669

In [74]:
c.perimeter

Computing perimeter


25.132741228718345

In [75]:
c.perimeter

25.132741228718345

Why did the messages "Calculating area" and "Computing perimeter" only appear once?

### Discussion

In many cases, the whole point of having a lazily computed attribute is to improve performance.  
For example, you avoid computing values unless you actually need them somewhere.  
The solution shown does just this, but it exploits a subtle feature of descriptors to do it in a highly efficient way.  
As shown in other recipes, like Recipe 8.9, when a descriptor is placed into a class definition, its `__get__()`, `__set__()`, and `__delete__()` methods get triggered on attribute access.  
However, if a descriptor only defines a `__get__()` method, it has a much weaker binding than usual.  
In particular, the `__get__()` method only fires if the attribute being accessed is not in the underlying instance dictionary.  
The `lazyproperty` class exploits this by having the `__get__()` method store the computed value on the instance using the same name as the property itself.  
By doing this, the value gets stored in the instance dictionary and disables further computation of the property.  
You can observe this by digging a little deeper into the example:

In [76]:
c = Circle(4.0)
# get the instance variables
vars(c)

{'radius': 4.0}

In [77]:
# calculate the area and observe the variables afterward
c.area

Calculating area


50.26548245743669

In [78]:
vars(c)

{'radius': 4.0, 'area': 50.26548245743669}

In [79]:
# notice that accessing the attribute doesn't invoke the property anymore
c.area

50.26548245743669

In [80]:
# delete the variable and see the property triggered again
del c.area
vars(c)

{'radius': 4.0}

In [81]:
c.area

Calculating area


50.26548245743669
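
The precedence rule behind this can also be shown in isolation. In the sketch below, `GetOnly` and `GetAndSet` are hypothetical descriptor classes: the first defines only `__get__()`, while the second also defines `__set__()`. An entry placed directly in the instance dictionary shadows the first but not the second.

```python
class GetOnly:
    def __get__(self, instance, cls):
        return 'from descriptor'

class GetAndSet:
    def __get__(self, instance, cls):
        return 'from descriptor'

    def __set__(self, instance, value):
        raise AttributeError('read-only')

class Demo:
    weak = GetOnly()
    strong = GetAndSet()

d = Demo()
# Write to the instance dictionary directly, bypassing attribute assignment
d.__dict__['weak'] = 'from instance'
d.__dict__['strong'] = 'from instance'

print(d.weak)     # from instance
print(d.strong)   # from descriptor
```
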

Of course there's a catch.  
The computed value becomes mutable after it's created.

In [82]:
c.area

50.26548245743669

In [83]:
c.area = 25
c.area

25

This implementation takes care of that problem, but isn't quite as efficient.

In [84]:
def lazyproperty(func):
    name = '_lazy_' + func.__name__
    @property
    def lazy(self):
        if hasattr(self, name):
            return getattr(self, name)
        else:
            value = func(self)
            setattr(self, name, value)
            return value
    return lazy

In [85]:
c = Circle(4.0)
c.area

Calculating area


50.26548245743669

In [86]:
c.area

50.26548245743669
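
The following self-contained sketch repeats the property-based `lazyproperty` together with a one-method `Circle`, and confirms that assignment to the attribute is now rejected. The cached value lives under a separate `_lazy_area` key in the instance dictionary.

```python
import math

def lazyproperty(func):
    name = '_lazy_' + func.__name__

    @property
    def lazy(self):
        if hasattr(self, name):
            return getattr(self, name)
        value = func(self)
        setattr(self, name, value)
        return value
    return lazy

class Circle:
    def __init__(self, radius):
        self.radius = radius

    @lazyproperty
    def area(self):
        print('Calculating area')
        return math.pi * self.radius ** 2

c = Circle(4.0)
c.area                      # Prints 'Calculating area' once
try:
    c.area = 25             # The property has no setter
except AttributeError:
    print('area is read-only')

print(sorted(vars(c)))      # ['_lazy_area', 'radius']
```
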

However, a disadvantage is that all get operations have to be routed through the property’s getter function.  
This is less efficient than simply looking up the value in the instance dictionary, as was done in the original solution.
For more information on properties and managed attributes, see Recipe 8.6.  
Descriptors are described in Recipe 8.9.
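
If you are on Python 3.8 or later, you may also want to look at `functools.cached_property` from the standard library, which follows the same idea as the first solution: the computed value is stored in the instance dictionary under the attribute's own name. The `Circle` class below is repeated only to keep the sketch self-contained, and the `calls` counter is a hypothetical addition used to show how often the method body runs.

```python
import math
from functools import cached_property

class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.calls = 0

    @cached_property
    def area(self):
        self.calls += 1
        return math.pi * self.radius ** 2

c = Circle(4.0)
c.area
c.area
print(c.calls)            # 1
print('area' in vars(c))  # True

del c.area                # Drop the cached value; the next access recomputes
c.area
print(c.calls)            # 2
```

As with the descriptor-based solution, the cached value can still be overwritten by plain assignment.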

## 8.11. Simplifying the Initialization of Data Structures

### Problem

You are writing a lot of classes that serve as data structures, but you are getting tired of writing highly repetitive and boilerplate `__init__()` functions.

### Solution

You can often generalize the initialization of data structures into a single `__init__()` function defined in a common base class.

In [87]:
class Structure:
    # Class variable that specifies expected fields:
    _fields = []
    def __init__(self, *args):
        if len(args) != len(self._fields):
            raise TypeError('Expected {} arguments'.format(len(self._fields)))
        # Set the arguments:
        for name, value in zip(self._fields, args):
            setattr(self, name, value)

In [88]:
# Example class definitions:
if __name__ == '__main__':

    class Stock(Structure):
        _fields = ['name', 'shares', 'price']

    class Point(Structure):
        _fields = ['x', 'y']

    class Circle(Structure):
        _fields = ['radius']
        def area(self):
            return math.pi * self.radius ** 2

This makes it easier to construct classes:

In [89]:
s = Stock('ACME', 50, 91.1)
s

<__main__.Stock at 0x107e43e48>

In [90]:
p = Point(2, 3)
c = Circle(4.5)
p,c

(<__main__.Point at 0x107e434a8>, <__main__.Circle at 0x107e43ba8>)

Should you decide to support keyword arguments, there are several design options.  
One choice is to map the keyword arguments so that they only correspond to the attribute names specified in `_fields`.

In [91]:
class Structure:
    _fields = []
    def __init__(self, *args, **kwargs):
        if len(args) > len(self._fields):
            raise TypeError('Expected {} arguments'.format(len(self._fields)))
        # Set all of the positional arguments:
        for name, value in zip(self._fields, args):
            setattr(self, name, value)
        # Set the remaining keyword arguments:
        for name in self._fields[len(args):]:
            setattr(self, name, kwargs.pop(name))
        # Check for any remaining unknown arguments:
        if kwargs:
            raise TypeError('Invalid argument(s): {}'.format(','.join(kwargs)))

In [92]:
if __name__ == '__main__':
    class Stock(Structure):
        _fields = ['name', 'shares', 'price']
        
    s1 = Stock('ACME', 50, 91.1)
    s2 = Stock('ACME', 50, price=91.1)
    s3 = Stock('ACME', shares=50, price=91.1)

Another possible choice is to use keyword arguments as a means for adding additional attributes to the structure not specified in `_fields`.

In [93]:
class Structure:
    # Class variable that specifies expected fields:
    _fields = []
    def __init__(self, *args, **kwargs):
        if len(args) != len(self._fields):
            raise TypeError('Expected {} arguments'.format(len(self._fields)))
        
        # Set the arguments:
        for name, value in zip(self._fields, args):
            setattr(self, name, value)
            
        # Set any additional arguments:
        extra_args = kwargs.keys() - self._fields
        for name in extra_args:
            setattr(self, name, kwargs.pop(name))
        if kwargs:
            raise TypeError('Duplicate values for {}'.format(','.join(kwargs)))

In [94]:
if __name__ == '__main__':
    class Stock(Structure):
        _fields = ['name', 'shares', 'price']
        
    s1 = Stock('ACME', 50, 91.1)
    s2 = Stock('ACME', 50, 91.1, date='4/3/2019')
    
s1,s2

(<__main__.Stock at 0x107e42048>, <__main__.Stock at 0x107e427b8>)

### Discussion


This technique of defining a general purpose `__init__()` method can be extremely useful if you’re ever writing a program built around a large number of small data structures.  
It leads to much less code than manually writing `__init__()` methods like this:

In [95]:
class Stock:
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
        
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
class Circle:
    def __init__(self, radius):
        self.radius = radius
        
    def area(self):
        return math.pi * self.radius ** 2

One subtle aspect of the implementation concerns the mechanism used to set values using the `setattr()` function.  
Instead of doing that, you might be inclined to directly access the instance dictionary.

In [96]:
class Structure:
    # Initialize a class variable that specifies the expected fields
    _fields = []
    def __init__(self, *args):
        if len(args) != len(self._fields):
            raise TypeError('Expected {} arguments'.format(len(self._fields)))
        # Set the arguments
        self.__dict__.update(zip(self._fields, args))

Although this works, it is often unsafe to make assumptions about the implementation of a subclass.  
If a subclass uses `__slots__` or wraps a specific attribute with something like a descriptor or property, then directly accessing the instance dictionary could wreak havoc on your program or break it altogether.  
The solution has been written to be as flexible as possible and avoid assumptions about subclasses.  
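
Here is a sketch of what can go wrong. All of the class names are hypothetical: `AttrStructure` mirrors the `setattr()` approach of the solution, `DictStructure` writes to the instance dictionary, and `CheckedShares` is a mixin that guards `shares` with a property.

```python
class AttrStructure:
    # Same idea as Structure in the solution: values go through setattr()
    _fields = []
    def __init__(self, *args):
        for name, value in zip(self._fields, args):
            setattr(self, name, value)

class DictStructure:
    # Variant that writes straight into the instance dictionary
    _fields = []
    def __init__(self, *args):
        self.__dict__.update(zip(self._fields, args))

class CheckedShares:
    # Mixin whose property validates the 'shares' attribute
    @property
    def shares(self):
        return self._shares

    @shares.setter
    def shares(self, value):
        if value < 0:
            raise ValueError('shares must be >= 0')
        self._shares = value

class GoodStock(CheckedShares, AttrStructure):
    _fields = ['shares']

class BadStock(CheckedShares, DictStructure):
    _fields = ['shares']

try:
    GoodStock(-1)
except ValueError as e:
    print(e)                 # shares must be >= 0

b = BadStock(-1)             # No error: the setter never ran
print(vars(b))               # {'shares': -1}
try:
    b.shares
except AttributeError:
    print('getter failed')   # The property looks for _shares, which was never set
```
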
One downside of this technique is that it impairs the documentation and help features of IDEs.  
For example, if a user asks for help on a specific class, the required arguments are described in an unusual way.

In [97]:
help(Stock)

Help on class Stock in module __main__:

class Stock(builtins.object)
 |  Methods defined here:
 |  
 |  __init__(self, name, shares, price)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
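
What `help()` reports depends on which `Stock` definition is current in your session. For a class built on the `Structure` base class, you can inspect the signature directly with `inspect.signature()`, as in this minimal sketch, which repeats the positional-only version of `Structure` so that it runs on its own.

```python
import inspect

class Structure:
    _fields = []
    def __init__(self, *args):
        if len(args) != len(self._fields):
            raise TypeError('Expected {} arguments'.format(len(self._fields)))
        for name, value in zip(self._fields, args):
            setattr(self, name, value)

class Stock(Structure):
    _fields = ['name', 'shares', 'price']

# The field names are invisible to introspection tools
print(inspect.signature(Stock))   # (*args)
```
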



One way to deal with this problem is to attach or enforce a type signature in the `__init__()` function, a technique covered in Recipe 9.16.  
It should be noted that it is also possible to automatically initialize instance variables using a utility function and what's commonly known as a [frame hack](http://farmdev.com/src/secrets/framehack/index.html#what-do-we-mean-by-frame-hack).

In [98]:
def init_fromlocals(self):
    import sys
    locs = sys._getframe(1).f_locals
    for k, v in locs.items():
        if k != 'self':
            setattr(self, k, v)
            
class Stock:
    def __init__(self, name, shares, price):
        init_fromlocals(self)

In this variation, the `init_fromlocals()` function uses `sys._getframe()` to peek at the local variables of the calling method.  
If used as the first step of an `__init__()` method, the local variables will be the same as the passed arguments and can be easily used to set attributes with the same names.  
Although this approach avoids the problem of getting the right calling signature in IDEs, it runs more than 50% slower than the solution provided in the recipe, requires more typing, and involves more sophisticated magic behind the scenes.  
If your code doesn’t need this extra power, the simpler solution will often work just fine.

## 8.12. Defining an Interface or Abstract Base Class

### Problem

You want to define a class that serves as an interface or abstract base class from which you can perform type checking and ensure that certain methods are implemented in subclasses.

### Solution

To define an abstract base class, use the [`abc` module](https://docs.python.org/3/library/abc.html).

In [99]:
from abc import ABCMeta, abstractmethod

class IStream(metaclass=ABCMeta):
    @abstractmethod
    def read(self, maxbytes=-1):
        pass
    @abstractmethod
    def write(self, data):
        pass

One important feature of an abstract base class is that it can't be directly instantiated.  
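
The sketch below demonstrates this. `ReadOnlyStream` is a hypothetical subclass that implements only one of the two abstract methods, which is enough to show that a partially implemented subclass cannot be instantiated either.

```python
from abc import ABCMeta, abstractmethod

class IStream(metaclass=ABCMeta):
    @abstractmethod
    def read(self, maxbytes=-1):
        pass

    @abstractmethod
    def write(self, data):
        pass

class ReadOnlyStream(IStream):
    # Hypothetical subclass that forgets to implement write()
    def read(self, maxbytes=-1):
        return b''

for cls in (IStream, ReadOnlyStream):
    try:
        cls()
    except TypeError:
        print(cls.__name__, 'cannot be instantiated')
```
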

An abstract base class is meant to be used in concert with other classes that can implement the required methods.

In [100]:
class SocketStream(IStream):
    def read(self, maxbytes=-1):
        """
        Read something
        """
    def write(self, data):
        """
        Write something
        """

A major use of abstract base classes is in code that wants to enforce an expected programming interface.  
For example, one way to view the `IStream` base class is as a high-level specification for an interface that allows reading and writing of data.  
Code that explicitly checks for this interface could be written as follows:

In [101]:
def serialize(obj, stream):
    if not isinstance(stream, IStream):
        raise TypeError('Expected an IStream')

You might think that this kind of type checking only works by subclassing the abstract base class (ABC), but ABCs allow other classes to be registered as implementing the required interface.

In [102]:
import io

# Register the built-in I/O classes so they can talk to our interface
IStream.register(io.IOBase)

# Open a normal file and type check
f = open('foo.txt')
isinstance(f, IStream)

True
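
If you want to try registration without needing a file on disk, here is a sketch that uses an in-memory `io.StringIO` object instead. It checks the same object before and after the call to `register()`. Keep in mind that registration is a declaration only: Python does not verify that the registered class actually implements the abstract methods.

```python
import io
from abc import ABCMeta, abstractmethod

class IStream(metaclass=ABCMeta):
    @abstractmethod
    def read(self, maxbytes=-1):
        pass

    @abstractmethod
    def write(self, data):
        pass

buf = io.StringIO('hello')
print(isinstance(buf, IStream))          # False

# Register the base class of the built-in I/O hierarchy
IStream.register(io.IOBase)
print(isinstance(buf, IStream))          # True
print(issubclass(io.BytesIO, IStream))   # True
```
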

It should be noted that `@abstractmethod` can also be applied to static methods, class methods, and properties.  
You just need to make sure you apply it in the proper sequence where `@abstractmethod` appears immediately before the function definition, as shown here:

In [103]:
from abc import ABCMeta, abstractmethod

class A(metaclass=ABCMeta):
    @property
    @abstractmethod
    def name(self):
        pass
    
    @name.setter
    @abstractmethod
    def name(self, value):
        pass
    
    @classmethod
    @abstractmethod
    def method1(cls):
        pass
    
    @staticmethod
    @abstractmethod
    def method2():
        pass

### Discussion

Predefined abstract base classes are found in various places in the standard library.  
The `collections.abc` module defines a variety of ABCs related to containers and iterators, such as sequences, mappings, and sets.  
The `numbers` module defines ABCs related to numeric objects including integers, floats, and rationals.  
Additionally, the `io` module defines ABCs related to I/O handling.  
You can use the predefined ABCs to perform more generalized kinds of type checking.
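
For instance, here are a few checks against ABCs from `collections.abc` and `numbers`. The sample values are arbitrary.

```python
import collections.abc
import numbers

x = [1, 2, 3]
print(isinstance(x, collections.abc.Sequence))          # True
print(isinstance(x, collections.abc.Mapping))           # False
print(isinstance({'a': 1}, collections.abc.Mapping))    # True
print(isinstance(iter(x), collections.abc.Iterator))    # True
print(isinstance(3, numbers.Integral))                  # True
print(isinstance(3.5, numbers.Integral))                # False
```
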

It should be noted that, as of this writing, certain library modules don’t make use of these predefined ABCs as you might expect.

In [104]:
from decimal import Decimal

import numbers

x = Decimal('3.4')
isinstance(x, numbers.Real)

False

Even though the value 3.4 is technically a real number, it doesn’t type check that way to help avoid inadvertent mixing of floating-point numbers and decimals.  
Thus, if you use the ABC functionality, it is wise to carefully write tests that verify that the behavior is exactly what you intended.  
Although ABCs facilitate type checking, it’s not something that you should overuse in a program.  
At its heart, Python is a dynamic language that gives you great flexibility.  
Trying to enforce type constraints everywhere tends to result in code that is more complicated than it needs to be.  
Keep it simple.

## 8.13. Implementing a Data Model or Type System

### Problem

You want to define various kinds of data structures, but want to enforce constraints on the values that are allowed to be assigned to certain attributes.

### Solution

In this problem, you are basically faced with the task of placing checks or assertions on the setting of certain instance attributes.  
To do this, you need to customize the setting of attributes on a per-attribute basis, which can be accomplished using [descriptors](https://docs.python.org/3/howto/descriptor.html).  
The following code illustrates the use of descriptors to implement a type system and value-checking framework:

In [105]:
# Base class descriptor that sets a value
class Descriptor:
    def __init__(self, name=None, **opts):
        self.name = name
        for key, value in opts.items():
            setattr(self, key, value)
            
    def __set__(self, instance, value):
        instance.__dict__[self.name] = value
        
# Descriptor for enforcing types
class Typed(Descriptor):
    expected_type = type(None)
    
    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError('expected ' + str(self.expected_type))
        super().__set__(instance, value)
        
# Descriptor for enforcing values
class Unsigned(Descriptor):
    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Expected >= 0')
        super().__set__(instance, value)
        
class MaxSized(Descriptor):
    def __init__(self, name=None, **opts):
        if 'size' not in opts:
            raise TypeError('Missing size parameter')
        super().__init__(name, **opts)
        
    def __set__(self, instance, value):
        if len(value) >= self.size:
            raise ValueError('size must be < ' + str(self.size))
        super().__set__(instance, value)

These classes should be viewed as basic building blocks from which you construct a data model or type system.  
Continuing, here is some code that implements some different kinds of data:

In [106]:
class Integer(Typed):
    expected_type = int
    
class UnsignedInteger(Integer, Unsigned):
    pass

class Float(Typed):
    expected_type = float
    
class UnsignedFloat(Float, Unsigned):
    pass

class String(Typed):
    expected_type = str
    
class SizedString(String, MaxSized):
    pass

Using these type objects, now we can define a new class.

In [107]:
class Stock:
    # Specify constraints
    name = SizedString('name', size=8)
    shares = UnsignedInteger('shares')
    price = UnsignedFloat('price')
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

With those parameters enforced, you can validate attributes as well as assign them.

In [108]:
s = Stock('ACME', 50, 91.1)
s.name

'ACME'

In [109]:
s.shares = 75
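
Assignments that violate a constraint are rejected. The sketch below repeats just enough of the descriptor classes to run on its own and uses a hypothetical one-field class, `Holding`, in place of `Stock`. It also shows how the calls chain through the MRO: the type check in `Typed` runs first, followed by the sign check in `Unsigned`.

```python
class Descriptor:
    def __init__(self, name=None, **opts):
        self.name = name
        for key, value in opts.items():
            setattr(self, key, value)

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

class Typed(Descriptor):
    expected_type = type(None)

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError('expected ' + str(self.expected_type))
        super().__set__(instance, value)

class Unsigned(Descriptor):
    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Expected >= 0')
        super().__set__(instance, value)

class Integer(Typed):
    expected_type = int

class UnsignedInteger(Integer, Unsigned):
    pass

class Holding:
    # Hypothetical one-field class, standing in for Stock
    shares = UnsignedInteger('shares')

    def __init__(self, shares):
        self.shares = shares

h = Holding(50)
for bad in (-10, 'a lot'):
    try:
        h.shares = bad
    except (TypeError, ValueError) as e:
        print(type(e).__name__, e)

print(h.shares)   # 50
```

The loop prints a `ValueError` for `-10` and a `TypeError` for `'a lot'`, and the stored value is left unchanged.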

There are some techniques that can be used to simplify the specification of constraints in classes.  
One way to do this is to use a class decorator.

In [110]:
# Class decorator to apply constraints
def check_attributes(**kwargs):
    def decorate(cls):
        for key, value in kwargs.items():
            if isinstance(value, Descriptor):
                value.name = key
                setattr(cls, key, value)
            else:
                setattr(cls, key, value(key))
        return cls
    return decorate

Now an example:

In [111]:
@check_attributes(name=SizedString(size=8),
                  shares=UnsignedInteger,
                  price=UnsignedFloat)
class Stock:
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

Another way to simplify the specification of constraints is to use a metaclass.

In [112]:
# A metaclass that applies checking
class checkedmeta(type):
    def __new__(cls, clsname, bases, methods):
        # Attach attribute names to the descriptors
        for key, value in methods.items():
            if isinstance(value, Descriptor):
                value.name = key
        return type.__new__(cls, clsname, bases, methods)
    
# Example
class Stock(metaclass=checkedmeta):
    name   = SizedString(size=8)
    shares = UnsignedInteger()
    price  = UnsignedFloat()
    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

### Discussion

This recipe involves a number of advanced techniques, including descriptors, mixin classes, the use of `super()`, class decorators, and metaclasses.  
A full introduction to all of those topics is beyond the scope of this recipe, but examples can be found in other recipes (see Recipes 8.9, 8.18, 9.12, and 9.19).  
However, there are a number of subtle points worth noting.  
First, in the `Descriptor` base class, you will notice that there is a `__set__()` method, but no corresponding `__get__()`.  
If a descriptor will do nothing more than extract an identically named value from the underlying instance dictionary, defining `__get__()` is unnecessary.  
In fact, defining `__get__()` will just make it run slower.  
Thus, this recipe only focuses on the implementation of `__set__()`.  
The overall design of the various descriptor classes is based on mixin classes.  
For example, the `Unsigned` and `MaxSized` classes are meant to be mixed with the other descriptor classes derived from `Typed`.  
To handle a specific kind of data type, multiple inheritance is used to combine the desired functionality.

You will also notice that all `__init__()` methods of the various descriptors have been programmed to have an identical signature involving keyword arguments `**opts`.  
The class for `MaxSized` looks for its required attribute in `opts`, but simply passes it along to the `Descriptor` base class, which actually sets it.  
One tricky part about composing classes like this (especially mixins), is that you don’t always know how the classes are going to be chained together or what `super()` will invoke.  
For this reason, you need to make it work with any possible combination of classes.  
The definitions of the various type classes such as `Integer`, `Float`, and `String` illustrate a useful technique of using class variables to customize an implementation.  
The `Typed` descriptor merely looks for an `expected_type` attribute that is provided by each of those subclasses.  
The use of a class decorator or metaclass is often useful for simplifying the specification by the user.  
You will notice that in those examples, the user no longer has to type the name of the attribute more than once.

In [113]:
# Normal
class Point:
    x = Integer('x')
    y = Integer('y')
    
# Metaclass
class Point(metaclass=checkedmeta):
    x = Integer()
    y = Integer()
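
In Python 3.6 and later, a descriptor can also learn its own name without a decorator or metaclass by defining the `__set_name__()` special method, which Python calls when the owning class is created. The following is a minimal sketch of that idea. The `Integer` class here is a simplified stand-in written for this example, not the `Typed`-based class from the recipe.

```python
class Integer:
    def __set_name__(self, owner, name):
        # Called automatically when the owning class is created
        self.name = name

    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise TypeError('Expected an integer')
        instance.__dict__[self.name] = value

class Point:
    x = Integer()
    y = Integer()

    def __init__(self, x, y):
        self.x = x
        self.y = y

print(Point.__dict__['x'].name)   # x
p = Point(2, 3)
print(vars(p))                    # {'x': 2, 'y': 3}
```
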

The class decorator walks its keyword arguments and the metaclass scans the class dictionary, in both cases looking for descriptors.  
When one is found, its name is filled in from the corresponding key.  
Of all the approaches, the class decorator solution provides the most flexibility and sanity.  
For one, it does not rely on any advanced machinery, such as metaclasses.  
Second, decoration is something that can easily be added or removed from a class definition as desired.  
For example, within the decorator, there could be an option to simply omit the added checking altogether.  
This might allow the checking to be something that could be turned on or off depending on demand (maybe for debugging versus production).  
As a final twist, a class decorator approach can also be used as a replacement for mixin classes, multiple inheritance, and tricky use of the `super()` function.  
Here is an alternative formulation of this recipe that uses class decorators:

In [114]:
# Base class that uses a descriptor to set a value
class Descriptor:
    def __init__(self, name=None, **opts):
        self.name = name
        for key, value in opts.items():
            setattr(self, key, value)
            
    def __set__(self, instance, value):
        instance.__dict__[self.name] = value
        
# Decorator for applying type checking
def Typed(expected_type, cls=None):
    if cls is None:
        return lambda cls: Typed(expected_type, cls)
    
    super_set = cls.__set__
    def __set__(self, instance, value):
        if not isinstance(value, expected_type):
            raise TypeError('Expected ' + str(expected_type))
        super_set(self, instance, value)
    cls.__set__ = __set__
    return cls

# Decorator for unsigned values
def Unsigned(cls):
    super_set = cls.__set__
    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Expected >= 0')
        super_set(self, instance, value)
    cls.__set__ = __set__
    return cls

# Decorator for allowing sized values
def MaxSized(cls):
    super_init = cls.__init__
    def __init__(self, name=None, **opts):
        if 'size' not in opts:
            raise TypeError('missing size option')
        super_init(self, name, **opts)
    cls.__init__ = __init__
    
    super_set = cls.__set__
    def __set__(self, instance, value):
        if len(value) >= self.size:
            raise ValueError('Size must be < ' + str(self.size))
        super_set(self, instance, value)
    cls.__set__ = __set__
    return cls

# Specialized descriptors
@Typed(int)
class Integer(Descriptor):
    pass

@Unsigned
class UnsignedInteger(Integer):
    pass

@Typed(float)
class Float(Descriptor):
    pass

@Unsigned
class UnsignedFloat(Float):
    pass

@Typed(str)
class String(Descriptor):
    pass

@MaxSized
class SizedString(String):
    pass

The classes defined in this alternative formulation work exactly the same way as before, except that the code now runs much faster.  
The class decorator approach runs almost 100% faster than the approach using mixins.  

## 8.14. Implementing Custom Containers

### Problem

You want to implement a custom class that mimics the behavior of a common built-in container type, such as a list or dictionary.  
However, you’re not entirely sure what methods need to be implemented to do it.

### Solution

The `collections.abc` module defines a variety of abstract base classes that are extremely useful when implementing custom container classes.  
To illustrate, suppose you want your class to support iteration.  
To do that, simply start by having it inherit from `collections.abc.Iterable`, as follows:

In [115]:
import collections.abc

class A(collections.abc.Iterable):
    pass

The special feature about inheriting from `collections.abc.Iterable` is that it ensures you implement all of the required special methods.  
If you don’t, you’ll get an error upon instantiation:
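
For instance, the following minimal sketch triggers that error. It assumes a Python 3 version where the abstract base classes are reached through `collections.abc`; the exact wording of the message differs between Python versions, so only the exception type and the missing method name are relied on here.

```python
import collections.abc

class A(collections.abc.Iterable):
    pass

try:
    a = A()
except TypeError as e:
    # The message names the abstract method that is still missing
    error = e
    print('TypeError:', e)
```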

To fix this error, simply give the class the required `__iter__()` method and implement it as desired (see Recipes 4.2 and 4.7).  
Other notable classes defined in `collections.abc` include `Sequence`, `MutableSequence`, `Mapping`, `MutableMapping`, `Set`, and `MutableSet`.  
Many of these classes form hierarchies with increasing levels of functionality (e.g., one such hierarchy is `Container`, `Iterable`, `Sized`, `Sequence`, and `MutableSequence`).  
Again, simply instantiate any of these classes to see what methods need to be implemented to make a custom container with that behavior:
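
As a short sketch of that experiment, the code below tries to instantiate `Sequence` directly and then reads the same information from the `__abstractmethods__` attribute that abstract base classes carry. The variable name `required` is only illustrative.

```python
import collections.abc

try:
    collections.abc.Sequence()
except TypeError as e:
    # The message lists the methods that a subclass must supply
    print(e)

# The same information, without having to trigger the error
required = sorted(collections.abc.Sequence.__abstractmethods__)
print(required)   # ['__getitem__', '__len__']
```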

Here is a simple example of a class that implements the preceding methods to create a sequence where items are always stored in sorted order:

In [116]:
import collections.abc
import bisect

class SortedItems(collections.abc.Sequence):
    def __init__(self, initial=None):
        # Sort the initial data if given; otherwise start with an empty list
        self._items = sorted(initial) if initial is not None else []
        
    # Required sequence methods
    def __getitem__(self, index):
        return self._items[index]
    
    def __len__(self):
        return len(self._items)
    
    # Method for adding an item in the right location
    def add(self, item):
        bisect.insort(self._items, item)

Instances of `SortedItems` behave exactly like a normal sequence and support all of the usual operations, including indexing, iteration, `len()`, containment (using the `in` operator), and even slicing.  
As an aside, the `bisect` module used in this recipe is a convenient way to keep items in a list sorted.  
The `bisect.insort()` function inserts an item into a list so that the list remains in order.
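
The following self-contained sketch exercises those operations. It repeats a compact copy of the class so that it runs on its own, and it assumes the abstract base class is imported from `collections.abc`.

```python
import bisect
import collections.abc

class SortedItems(collections.abc.Sequence):
    def __init__(self, initial=None):
        self._items = sorted(initial) if initial is not None else []

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)

    def add(self, item):
        bisect.insort(self._items, item)

items = SortedItems([5, 1, 3])
items.add(2)
print(list(items))          # [1, 2, 3, 5]
print(items[0], items[-1])  # 1 5
print(len(items))           # 4
print(3 in items)           # True
print(items[1:3])           # [2, 3]
```

Note that slicing returns a plain list here, because `__getitem__()` simply forwards the slice to the internal list.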

### Discussion

Inheriting from one of the abstract base classes in `collections.abc` ensures that your custom container implements all of the required methods expected of the container.  
In addition, this inheritance facilitates type checking.  
For example, your custom container will satisfy various type checks like this:

In [117]:
import collections.abc

items = SortedItems([5, 1, 3])
isinstance(items, collections.abc.Iterable)

True

In [118]:
isinstance(items, collections.abc.Sequence)

True

In [119]:
isinstance(items, collections.abc.Container)

True

In [120]:
isinstance(items, collections.abc.Sized)

True

In [121]:
isinstance(items, collections.abc.Mapping)

False

Many of the abstract base classes in `collections.abc` also provide default implementations of common container methods.  
To illustrate, suppose you have a class that inherits from `collections.abc.MutableSequence`, like this:

In [122]:
import collections.abc

class Items(collections.abc.MutableSequence):
    def __init__(self, initial=None):
        self._items = list(initial) if initial is not None else []
        
    # Required sequence methods
    def __getitem__(self, index):
        print('Getting:', index)
        return self._items[index]
    
    def __setitem__(self, index, value):
        print('Setting:', index, value)
        self._items[index] = value
        
    def __delitem__(self, index):
        print('Deleting:', index)
        del self._items[index]
        
    def insert(self, index, value):
        self._items.insert(index, value)
        
    def __len__(self):
        print('Len')
        return len(self._items)

If you create an instance of `Items`, you’ll find that it supports almost all of the core list methods such as `append()`, `remove()`, `count()`, and others.  
These methods are implemented in such a way that they only use the required ones.  
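
To make this visible, here is a sketch of a variant of `Items` that records which required methods get called instead of printing them. The `calls` attribute is a hypothetical addition for illustration only.

```python
import collections.abc

class Items(collections.abc.MutableSequence):
    def __init__(self, initial=None):
        self._items = list(initial) if initial is not None else []
        self.calls = []   # hypothetical log of required-method calls

    def __getitem__(self, index):
        self.calls.append('getitem')
        return self._items[index]

    def __setitem__(self, index, value):
        self.calls.append('setitem')
        self._items[index] = value

    def __delitem__(self, index):
        self.calls.append('delitem')
        del self._items[index]

    def insert(self, index, value):
        self.calls.append('insert')
        self._items.insert(index, value)

    def __len__(self):
        self.calls.append('len')
        return len(self._items)

a = Items([1, 2, 3])
a.append(4)      # inherited; built on len() and insert()
a.remove(3)      # inherited; built on item lookup and deletion
n = a.count(2)   # inherited; iterates using item lookup
print(a._items, n)
print(sorted(set(a.calls)))
```

None of `append()`, `remove()`, or `count()` is defined in `Items`; each is supplied by `MutableSequence` and reaches the data only through the five required methods.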

This recipe only provides a brief glimpse into Python’s abstract class functionality.  
The `numbers` module provides a similar collection of abstract classes related to numeric data types.  
See Recipe 8.12 for more information about making your own abstract base classes.  
You can also find another recipe for abstract methods and classes [here](https://github.com/ActiveState/code/tree/master/recipes/Python/266468_Abstract_methodsclasses/).

## 8.15. Delegating Attribute Access

### Problem

You want an instance to delegate attribute access to an internally held instance, possibly as an alternative to inheritance or in order to implement a proxy.

### Solution

Delegation is a programming pattern where the responsibility for implementing a particular operation is passed on to a different object.

In [123]:
class A:
    def spam(self, x):
        pass
    
    def foo(self):
        pass
    
class B:
    def __init__(self):
        self._a = A()
        
    def spam(self, x):
        # Delegate to the internal self._a instance
        return self._a.spam(x)
    
    def foo(self):
        # Delegate to the internal self._a instance
        return self._a.foo()
    
    def bar(self):
        pass

If you have many methods to delegate, an alternative approach is to define a `__getattr__()` method.

In [124]:
class A:
    def spam(self, x):
        pass
    
    def foo(self):
        pass
    
class B:
    def __init__(self):
        self._a = A()
        
    def bar(self):
        pass
    
    # Expose all of the methods defined on class A
    def __getattr__(self, name):
        return getattr(self._a, name)

The `__getattr__()` method is kind of like a catch-all for attribute lookup.  
It’s a method that gets called if code tries to access an attribute that doesn’t exist.  
In the preceding code, it would catch access to undefined methods on `B` and simply delegate them to `A`.

Implementing proxies is another example of delegation.

In [125]:
# A proxy class that wraps around another object but still exposes its public attributes
class Proxy:
    def __init__(self, obj):
        self._obj = obj
    
    # Delegate attribute lookup to internal object
    def __getattr__(self, name):
        print('getattr:', name)
        return getattr(self._obj, name)
    
    # Delegate attribute assignment
    def __setattr__(self, name, value):
        if name.startswith('_'):
            # Names with a leading underscore are stored on the proxy itself
            super().__setattr__(name, value)
        else:
            print('setattr:', name, value)
            setattr(self._obj, name, value)
            
    # Delegate attribute deletion
    def __delattr__(self, name):
        if name.startswith('_'):
            super().__delattr__(name)
        else:
            print('delattr:', name)
            delattr(self._obj, name)

You can use this proxy class by wrapping it around another instance:

In [126]:
class Spam:
    def __init__(self, x):
        self.x = x
    def bar(self, y):
        print('Spam.bar:', self.x, y)

In [127]:
# Create an instance
s = Spam(2)

In [128]:
# Create a proxy around it
p = Proxy(s)

In [129]:
# Access the proxy
print(p.x)
print(p.bar(3))
p.x = 37

getattr: x
2
getattr: bar
Spam.bar: 2 3
None
setattr: x 37


By customizing the implementation of the attribute access methods, you can control how the proxy behaves.  
For example, you could log every access or allow only read-only access.
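
As one illustration of such a customization, here is a minimal sketch of a read-only variant. The class name `ReadOnlyProxy` and its error messages are hypothetical and not part of the recipe.

```python
class ReadOnlyProxy:
    """Hypothetical variant: forwards reads, rejects writes and deletes."""
    def __init__(self, obj):
        # Bypass our own __setattr__ to store the wrapped object
        super().__setattr__('_obj', obj)

    def __getattr__(self, name):
        # Only called when normal lookup on the proxy fails
        return getattr(self._obj, name)

    def __setattr__(self, name, value):
        raise AttributeError('read-only proxy: cannot set ' + name)

    def __delattr__(self, name):
        raise AttributeError('read-only proxy: cannot delete ' + name)

class Spam:
    def __init__(self, x):
        self.x = x

p = ReadOnlyProxy(Spam(2))
print(p.x)            # 2
try:
    p.x = 37
except AttributeError as e:
    print(e)          # read-only proxy: cannot set x
```

Because `__setattr__()` rejects every assignment, `__init__()` has to store `_obj` through `super().__setattr__()`.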

### Discussion

Delegation is sometimes used as an alternative to inheritance.

In [130]:
# Inheritance
class A:
    def spam(self, x):
        print('A.spam', x)
        
    def foo(self):
        print('A.foo')
        
class B(A):
    def spam(self, x):
        print('B.spam')
        super().spam(x)
        
    def bar(self):
        print('B.bar')

In [131]:
# Delegation
class A:
    def spam(self, x):
        print('A.spam', x)
        
    def foo(self):
        print('A.foo')
        
class B:
    def __init__(self):
        self._a = A()
        
    def spam(self, x):
        print('B.spam', x)
        self._a.spam(x)
        
    def bar(self):
        print('B.bar')
        
    def __getattr__(self, name):
        return getattr(self._a, name)

This use of delegation is often useful in situations where direct inheritance might not make much sense or where you want to have more control of the relationship between objects, such as only exposing certain methods, implementing interfaces, and so forth.  
When using delegation to implement proxies, there are a few additional details to note.  
First, the `__getattr__()` method is actually a fallback method that only gets called when an attribute is not found.  
Thus, when attributes of the proxy instance itself are accessed (e.g., the `_obj` attribute), this method would not be triggered.  
Second, the `__setattr__()` and `__delattr__()` methods need a bit of extra logic added to separate attributes of the proxy instance itself from attributes of the internal object `_obj`.  
A common convention is for proxies to only delegate to attributes that don’t start with a leading underscore so that proxies only expose the public attributes of the held instance.  
It is also important to emphasize that the `__getattr__()` method usually does not apply to most special methods that start and end with double underscores.  
For example, consider this class:

In [132]:
class ListLike:
    def __init__(self):
        self._items = []
    def __getattr__(self, name):
        return getattr(self._items, name)

If you try to make a `ListLike` object, you’ll find that it supports the common list methods, such as `append()` and `insert()`.  
However, it does not support any of the operators like `len()`, item lookup, and so forth.
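
A short sketch of this behavior, repeating the class so that it runs on its own. The exact error messages are left to the interpreter.

```python
class ListLike:
    def __init__(self):
        self._items = []

    def __getattr__(self, name):
        return getattr(self._items, name)

a = ListLike()
a.append(2)           # found through __getattr__()
a.insert(0, 1)
a.sort()
print(a._items)       # [1, 2]

try:
    len(a)            # special methods are looked up on the class, not the instance
except TypeError as e:
    print('len() failed:', e)

try:
    a[0]
except TypeError as e:
    print('indexing failed:', e)
```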

To support the different operators, you have to manually delegate the associated special methods yourself.

In [133]:
class ListLike:
    def __init__(self):
        self._items = []
    def __getattr__(self, name):
        return getattr(self._items, name)
    
    # Special methods to support certain list operations
    def __len__(self):
        return len(self._items)
    
    def __getitem__(self, index):
        return self._items[index]
    
    def __setitem__(self, index, value):
        self._items[index] = value
        
    def __delitem__(self, index):
        del self._items[index]

See Recipe 11.8 for another example of using delegation in the context of creating proxy classes for remote procedure calls.

## 8.16. Defining More Than One Constructor in a Class

### Problem

You’re writing a class, but you want users to be able to create instances in more than the one way provided by `__init__()`.

### Solution

To define a class with more than one constructor, you should use a class method.

In [134]:
import time

class Date:
    # Primary constructor
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day
        
    # Alternate constructor
    @classmethod
    def today(cls):
        t = time.localtime()
        return cls(t.tm_year, t.tm_mon, t.tm_mday)

To use the alternate constructor, simply call it as a method on the class, such as `Date.today()`.

In [135]:
a = Date(2019, 4, 10)
b = Date.today()
print(a.year, a.month, a.day)
print(b.year, b.month, b.day)

2019 4 10
2019 4 25


Although both were created from the same class, `a` and `b` are distinct `Date` objects stored at different memory addresses.

In [136]:
a, b

(<__main__.Date at 0x107e8b198>, <__main__.Date at 0x107e8b128>)

Each call to a constructor, whether primary or alternate, allocates a new object, so if memory or speed is a concern and two instances would hold the same date, you may want to create only one and reuse it.

### Discussion

One of the primary uses of class methods is to define alternate constructors, as shown in this recipe.  
A critical feature of a class method is that it receives the class as the first argument (`cls`).  
You will notice that this class is used within the method to create and return the final instance.  
It is extremely subtle, but this aspect of class methods makes them work correctly with features such as inheritance.

In [137]:
class NewDate(Date):
    pass

# Creates an instance of Date (cls=Date)
c = Date.today()
# Creates an instance of NewDate (cls=NewDate)
d = NewDate.today()
print(c.year, c.month, c.day)
print(d.year, d.month, d.day)

2019 4 25
2019 4 25


When defining a class with multiple constructors, you should make the `__init__()` function as simple as possible — doing nothing more than assigning attributes from given values.  
Alternate constructors can then choose to perform advanced operations if needed.
Instead of defining a separate class method, you might be inclined to implement the `__init__()` method in a way that allows for different calling conventions.

In [138]:
class Date:
    def __init__(self, *args):
        if len(args) == 0:
            t = time.localtime()
            args = (t.tm_year, t.tm_mon, t.tm_mday)
        self.year, self.month, self.day = args

In [139]:
a = Date(2019, 4, 10)
a

<__main__.Date at 0x107e62d68>

In [140]:
a.year, a.month, a.day

(2019, 4, 10)

Although this technique works in certain cases, it often leads to code that is hard to understand and difficult to maintain.  
For example, this implementation won’t show useful help strings containing argument names.  
In addition, code that creates `Date` instances won't be as clear.  
Compare and contrast the following:

In [141]:
# Specify a date clearly
a = Date(2019, 4, 10)
# Don't specify a date? What does this do?
b = Date()

As shown in the solution, `Date.today()` invokes the regular `Date.__init__()` method by instantiating a `Date()` with suitable `year`, `month`, and `day` arguments.  
If necessary, instances can be created without ever invoking the `__init__()` method.  
This is described in the next recipe.

## 8.17. Creating an Instance Without Invoking `__init__()`

### Problem

You need to create an instance, but want to bypass the execution of the `__init__()` method.

### Solution

A bare uninitialized instance can be created by directly calling the [`__new__()`](https://www.python.org/download/releases/2.2.3/descrintro/#__new__) method of a class.

In [142]:
class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

Here's how you can create a `Date` instance without invoking `__init__()`:

In [143]:
d = Date.__new__(Date)
d

<__main__.Date at 0x107e8b2e8>

However, the resulting instance is uninitialized and the instance variables must be set manually.

In [144]:
data = {'year':2019, 'month':4, 'day':12}
for key, value in data.items():
    setattr(d, key, value)

In [145]:
d.year, d.month, d.day

(2019, 4, 12)

### Discussion

The problem of bypassing `__init__()` sometimes arises when instances are being created in a nonstandard way.  
Examples include deserializing data or the implementation of a class method that’s been defined as an alternate constructor.  
For example, on the `Date` class shown, someone might define an alternate constructor `today()` as follows:

In [146]:
from time import localtime

class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day
        
    @classmethod
    def today(cls):
        d = cls.__new__(cls)
        t = localtime()
        d.year = t.tm_year
        d.month = t.tm_mon
        d.day = t.tm_mday
        return d

Now suppose you are deserializing JSON data and the resulting dictionary looks like this:

In [147]:
data = { 'year': 2019, 'month': 4, 'day': 12 }

If you want to turn this into a `Date` instance, simply use the technique shown in the solution.  
When creating instances in a nonstandard way, it’s usually best to not make too many assumptions about their implementation.  
As such, you generally don’t want to write code that directly manipulates the underlying instance dictionary `__dict__` unless you know it’s guaranteed to be defined.  
Otherwise, the code will break if the class uses `__slots__`, properties, descriptors, or other advanced techniques.  
By using `setattr()` to set the values, your code will be as general purpose as possible.
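
To see why `setattr()` is the safer choice, consider this sketch of a `Date` variant that uses `__slots__` and therefore has no instance `__dict__`. The helper name `date_from_dict` is hypothetical.

```python
class Date:
    __slots__ = ('year', 'month', 'day')   # no per-instance __dict__

    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

def date_from_dict(cls, data):
    # Hypothetical helper: build an instance without calling __init__()
    d = cls.__new__(cls)
    for key, value in data.items():
        setattr(d, key, value)
    return d

data = {'year': 2019, 'month': 4, 'day': 12}
d = date_from_dict(Date, data)
print(d.year, d.month, d.day)     # 2019 4 12
print(hasattr(d, '__dict__'))     # False
```

Code that wrote into `d.__dict__` directly would fail here, whereas `setattr()` goes through the slot descriptors and works unchanged.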

## 8.18. Extending Classes with Mixins

### Problem

You have a collection of generally useful methods that you would like to make available for extending the functionality of other class definitions.  
However, the classes where the methods might be added aren’t necessarily related to one another via inheritance.  
Thus, you can’t just attach the methods to a common base class.

### Solution

The problem addressed by this recipe often arises in code where one is interested in the issue of class customization.  
For example, maybe a library provides a basic set of classes along with a set of optional customizations that can be applied if desired by the user.  
To illustrate, suppose you have an interest in adding various customizations like logging, set-once, and type checking to mapping objects.  
Here is a set of mixin classes that do that:

In [148]:
class LoggedMappingMixin:
    """
    Add logging to get, set, and delete operations for debugging.
    """
    __slots__ = ()
    
    def __getitem__(self, key):
        print('Getting ' + str(key))
        return super().__getitem__(key)
    
    def __setitem__(self, key, value):
        print('Setting {} = {!r}'.format(key, value))
        return super().__setitem__(key, value)
    
    def __delitem__(self, key):
        print('Deleting ' + str(key))
        return super().__delitem__(key)
    
class SetOnceMappingMixin:
    """
    Only allow a key to be set once.
    """
    __slots__ = ()
    def __setitem__(self, key, value):
        if key in self:
            raise KeyError(str(key) + ' already set')
        return super().__setitem__(key, value)
    
class StringKeysMappingMixin:
    """
    Restrict keys to strings only.
    """
    __slots__ = ()
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError('Keys must be strings')
        return super().__setitem__(key, value)

These classes, by themselves, are useless.  
In fact, if you instantiate any one of them in isolation, the only thing they do is generate exceptions.  
Instead, they are supposed to be mixed with other mapping classes through multiple inheritance.

In [149]:
class LoggedDict(LoggedMappingMixin, dict):
    pass

d = LoggedDict()
d['x'] = 23

Setting x = 23


In [150]:
d['x']

Getting x


23

In [151]:
del d['x']

Deleting x


In [152]:
from collections import defaultdict

class SetOnceDefaultDict(SetOnceMappingMixin, defaultdict):
    pass

In [153]:
d = SetOnceDefaultDict(list)

In [154]:
d['x'].append(2)
d['y'].append(3)
d['x'].append(10)

In [155]:
from collections import OrderedDict

class StringOrderedDict(StringKeysMappingMixin,
                        SetOnceMappingMixin,
                        OrderedDict):
    pass

In [156]:
d = StringOrderedDict()
d['x'] = 23

In the example, you will notice that the mixins are combined with other existing classes such as `dict`, `defaultdict`, and `OrderedDict` as well as one another.  
When combined, the classes all work together to provide the desired functionality.

### Discussion

Mixin classes appear in various places in the standard library, mostly as a means for extending the functionality of other classes in a manner similar to that shown.  
They are also one of the main uses of multiple inheritance.  
For instance, if you are writing network code, you can often use the `ThreadingMixIn` from the `socketserver` module to add thread support to other network-related classes.  
For example, here is a multithreaded XML-RPC server:

In [157]:
from xmlrpc.server import SimpleXMLRPCServer
from socketserver import ThreadingMixIn

class ThreadedXMLRPCServer(ThreadingMixIn, SimpleXMLRPCServer):
    pass

It is also common to find mixins defined in large libraries and frameworks -- again, typically to enhance the functionality of existing classes with optional features in some way.  
There is a rich history surrounding the theory of mixin classes.  
However, rather than getting into all of the details, there are a few important implementation details to keep in mind.  
First, mixin classes are never meant to be instantiated directly.  
For example, none of the classes in this recipe work by themselves.  
They have to be mixed with another class that implements the required mapping functionality.  
Similarly, the `ThreadingMixIn` from the `socketserver` library has to be mixed with an appropriate server class -- it can’t be used all by itself.  
Second, mixin classes typically have no state of their own.  
This means there is no `__init__()` method and no instance variables.  
In this recipe, the specification of `__slots__ = ()` is meant to serve as a strong hint that the mixin classes do not have their own instance data.  

If you are thinking about defining a mixin class that has an `__init__()` method and instance variables, be aware that there is significant peril associated with the fact that the class doesn’t know anything about the other classes it’s going to be mixed with.  
Thus, any instance variables created would have to be named in a way that avoids name clashes.  
In addition, the `__init__()` method would have to be programmed in a way that properly invokes the `__init__()` method of other classes that are mixed in.  
In general, this is difficult to implement since you know nothing about the argument signatures of the other classes.  
At the very least, you would have to implement something very general using `*args, **kwargs`.  
If the `__init__()` of the mixin class took any arguments of its own, those arguments should be specified by keyword only and named in such a way to avoid name collisions with other arguments.  
Here is one possible implementation of a mixin defining an `__init__()` and accepting a keyword argument:

In [158]:
class RestrictKeysMixin:
    def __init__(self, *args, _restrict_key_type, **kwargs):
        self.__restrict_key_type = _restrict_key_type
        super().__init__(*args, **kwargs)
        
    def __setitem__(self, key, value):
        if not isinstance(key, self.__restrict_key_type):
            raise TypeError('Keys must be ' + str(self.__restrict_key_type))
        super().__setitem__(key, value)

Here is one possible use for this class:

In [159]:
class RDict(RestrictKeysMixin, dict):
    pass

d = RDict(_restrict_key_type=str)
e = RDict([('name','Dave'), ('n',37)], _restrict_key_type=str)
f = RDict(name='Dave', n=37, _restrict_key_type=str)
f

{'name': 'Dave', 'n': 37}

In this example, you’ll notice that initializing an `RDict()` still takes the arguments understood by `dict()`.  
However, there is an extra keyword argument `_restrict_key_type` that is provided to the mixin class.  
Finally, use of the `super()` function is an essential and critical part of writing mixin classes.  
In the solution, the classes redefine certain critical methods, such as `__getitem__()` and `__setitem__()`.  
However, they also need to call the original implementation of those methods.  
Using `super()` delegates to the next class on the method resolution order (MRO).  
This aspect of the recipe, however, is not obvious to novices, because `super()` is being used in classes that have no parent (at first glance, it might look like an error).  
However, in a class definition such as this:

In [160]:
class LoggedDict(LoggedMappingMixin, dict):
    pass

the use of `super()` in `LoggedMappingMixin` delegates to the next class over in the multiple inheritance list.  
That is, a call such as `super().__getitem__()` in `LoggedMappingMixin` actually steps over and invokes `dict.__getitem__()`.  
Without this behavior, the mixin class wouldn’t work at all.  
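
The order that `super()` follows can be inspected through the `__mro__` attribute. Here is a short sketch using a trimmed copy of the mixin that keeps only `__getitem__()`:

```python
class LoggedMappingMixin:
    __slots__ = ()

    def __getitem__(self, key):
        print('Getting ' + str(key))
        return super().__getitem__(key)

class LoggedDict(LoggedMappingMixin, dict):
    pass

# The mixin comes before dict, so super() in the mixin reaches dict
names = [c.__name__ for c in LoggedDict.__mro__]
print(names)   # ['LoggedDict', 'LoggedMappingMixin', 'dict', 'object']

d = LoggedDict(x=23)
value = d['x']   # prints "Getting x", then dict.__getitem__() supplies the value
print(value)     # 23
```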
An alternative implementation of mixins involves the use of class decorators.  
For example, consider this code:

In [161]:
def LoggedMapping(cls):
    cls_getitem = cls.__getitem__
    cls_setitem = cls.__setitem__
    cls_delitem = cls.__delitem__


    def __getitem__(self, key):
        print('Getting ' + str(key))
        return cls_getitem(self, key)
    
    def __setitem__(self, key, value):
        print('Setting {} = {!r}'.format(key, value))
        return cls_setitem(self, key, value)
    
    def __delitem__(self, key):
        print('Deleting ' + str(key))
        return cls_delitem(self, key)
    
    cls.__getitem__ = __getitem__
    cls.__setitem__ = __setitem__
    cls.__delitem__ = __delitem__
    return cls

This function is applied as a decorator to a class definition.

In [162]:
@LoggedMapping
class LoggedDict(dict):
    pass

If you try it, you’ll find that you get the same behavior, but multiple inheritance is no longer involved.  
Instead, the decorator has simply performed a bit of surgery on the class definition to replace certain methods.  
Further details about class decorators can be found in Recipe 9.12.
See Recipe 8.13 for an advanced recipe involving both mixins and class decorators.

## 8.19. Implementing Stateful Objects or State Machines

### Problem

You want to implement a state machine or an object that operates in multiple states, and you want to avoid cluttering your code with conditional statements.

### Solution

In certain applications, you might have objects that operate differently according to some kind of internal state.  
For example, consider a simple class representing a connection:

In [163]:
class Connection:
    def __init__(self):
        self.state = 'CLOSED'
        
    def read(self):
        if self.state != 'OPEN':
            raise RuntimeError('Not open')
        print('Reading...')
        
    def write(self, data):
        if self.state != 'OPEN':
            raise RuntimeError('Not Open')
        print('Writing...')
        
    def open(self):
        if self.state == 'OPEN':
            raise RuntimeError('Already Open')
        self.state = 'OPEN'
        
    def close(self):
        if self.state == 'CLOSED':
            raise RuntimeError('Already Closed')
        self.state = 'CLOSED'

This implementation presents a couple of difficulties.  
First, the code is complicated by the introduction of many conditional checks for the state.  
Second, the performance is degraded because common operations like `read()` and `write()` always check the state before proceeding.  
A more elegant approach is to encode each operational state as a separate class and arrange for the `Connection` class to delegate to the state class.

In [164]:
class Connection:
    def __init__(self):
        self.new_state(ClosedConnectionState)
        
    def new_state(self, newstate):
        self._state = newstate
        
    # Delegate to the state class
    def read(self):
        return self._state.read(self)
    
    def write(self, data):
        return self._state.write(self, data)
    
    def open(self):
        return self._state.open(self)
    
    def close(self):
        return self._state.close(self)
    
# Connection state base class
class ConnectionState:
    @staticmethod
    def read(conn):
        raise NotImplementedError()
        
    @staticmethod
    def write(conn, data):
        raise NotImplementedError()
        
    @staticmethod
    def open(conn):
        raise NotImplementedError()
        
    @staticmethod
    def close(conn):
        raise NotImplementedError()
        
# Implementation of different states
class ClosedConnectionState(ConnectionState):
    @staticmethod
    def read(conn):
        raise RuntimeError('Not open')
        
    @staticmethod
    def write(conn, data):
        raise RuntimeError('Not open')
        
    @staticmethod
    def open(conn):
        conn.new_state(OpenConnectionState)
        
    @staticmethod
    def close(conn):
        raise RuntimeError('Already closed')
        
class OpenConnectionState(ConnectionState):
    @staticmethod
    def read(conn):
        print('Reading')
        
    @staticmethod
    def write(conn, data):
        print('Writing')
    
    @staticmethod
    def open(conn):
        raise RuntimeError('Already open')
        
    @staticmethod
    def close(conn):
        conn.new_state(ClosedConnectionState)

Now we can illustrate the use of these classes:

In [165]:
c = Connection()
c._state

__main__.ClosedConnectionState
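
To follow a connection through its states, here is a trimmed, self-contained sketch. It omits the base class and `write()` for brevity, and it assumes that calling `open()` on a closed connection switches it to the open state.

```python
class Connection:
    def __init__(self):
        self.new_state(ClosedConnectionState)

    def new_state(self, newstate):
        self._state = newstate

    # Delegate to the current state class
    def read(self):
        return self._state.read(self)

    def open(self):
        return self._state.open(self)

    def close(self):
        return self._state.close(self)

class ClosedConnectionState:
    @staticmethod
    def read(conn):
        raise RuntimeError('Not open')

    @staticmethod
    def open(conn):
        conn.new_state(OpenConnectionState)

    @staticmethod
    def close(conn):
        raise RuntimeError('Already closed')

class OpenConnectionState:
    @staticmethod
    def read(conn):
        print('Reading')

    @staticmethod
    def open(conn):
        raise RuntimeError('Already open')

    @staticmethod
    def close(conn):
        conn.new_state(ClosedConnectionState)

c = Connection()
try:
    c.read()
except RuntimeError as e:
    print(e)                    # Not open
c.open()
print(c._state.__name__)        # OpenConnectionState
c.read()                        # Reading
c.close()
print(c._state.__name__)        # ClosedConnectionState
```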

### Discussion

Code that features a large set of complicated conditionals and intertwined states is hard to maintain and explain.  
The solution presented here avoids that by splitting the individual states into their own classes.  
It might look a little weird, but each state is implemented by a class with static methods, each of which takes an instance of `Connection` as the first argument.  
This design is based on a decision to not store any instance data in the different state classes themselves.  
Instead, all instance data should be stored on the `Connection` instance.  
The grouping of states under a common base class is mostly there to help organize the code and to ensure that the proper methods get implemented.  
The `NotImplementedError` exception raised in base class methods is just there to make sure that subclasses provide an implementation of the required methods.  
As an alternative, you might consider the use of an abstract base class, as described in Recipe 8.12.  
Another implementation technique concerns direct manipulation of the `__class__` attribute of instances.

In [166]:
class Connection:
    def __init__(self):
        self.new_state(ClosedConnection)

    def new_state(self, newstate):
        self.__class__ = newstate

    def read(self):
        raise NotImplementedError()

    def write(self, data):
        raise NotImplementedError()

    def open(self):
        raise NotImplementedError()

    def close(self):
        raise NotImplementedError()
            
class ClosedConnection(Connection):
    def read(self):
        raise RuntimeError('Not open')
        
    def write(self, data):
        raise RuntimeError('Not open')
        
    def open(self):
        self.new_state(OpenConnection)
        
    def close(self):
        raise RuntimeError('Already closed')
        
class OpenConnection(Connection):
    def read(self):
        print('Reading...')
        
    def write(self, data):
        print('Writing...')
        
    def open(self):
        raise RuntimeError('Already open')
        
    def close(self):
        self.new_state(ClosedConnection)

The main feature of this implementation is that it eliminates an extra level of indirection.  
Instead of having separate `Connection` and `ConnectionState` classes, the two classes are merged together into one.  
As the state changes, the instance will change its type.
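
The type change can be observed directly. The sketch below repeats a trimmed-down version of the classes above (only `open()` and `close()`) so that it runs on its own; the `history` list is introduced here purely for illustration.

```python
class Connection:
    def __init__(self):
        self.new_state(ClosedConnection)

    def new_state(self, newstate):
        # Rebind the instance to a different class
        self.__class__ = newstate

    def open(self):
        raise NotImplementedError()

    def close(self):
        raise NotImplementedError()

class ClosedConnection(Connection):
    def open(self):
        self.new_state(OpenConnection)

    def close(self):
        raise RuntimeError('Already closed')

class OpenConnection(Connection):
    def open(self):
        raise RuntimeError('Already open')

    def close(self):
        self.new_state(ClosedConnection)

c = Connection()
history = [type(c).__name__]      # state right after construction
c.open()
history.append(type(c).__name__)  # state after open()
c.close()
history.append(type(c).__name__)  # state after close()
print(history)
```

Note that `c` remains an instance of `Connection` throughout, because both state classes inherit from it.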

Object-oriented purists might be offended by the idea of simply changing the instance `__class__` attribute.  
However, it’s technically allowed.  
Also, it might result in slightly faster code since all of the methods on the connection no longer involve an extra delegation step.  
Finally, either technique is useful in implementing more complicated state machines -- especially in code that might otherwise feature large if-elif-else blocks.

In [167]:
# Original implementation
class State:
    def __init__(self):
        self.state = 'A'
    
    def action(self, x):
        if self.state == 'A':
            # 'A' does something
            self.state = 'B'
        elif self.state == 'B':
            # 'B' does something
            self.state = 'C'
        elif self.state == 'C':
            # 'C' does something, and so forth
            self.state = 'A'

In [168]:
# Alternative implementation
class State:
    def __init__(self):
        self.new_state(State_A)
        
    def new_state(self, state):
        self.__class__ = state
        
    def action(self, x):
        raise NotImplementedError()
        
class State_A(State):
    def action(self, x):
        # State_A does something
        self.new_state(State_B)

class State_B(State):
    def action(self, x):
        # State_B does something
        self.new_state(State_C)

class State_C(State):
    def action(self, x):
        # State_C does something
        self.new_state(State_A)

This recipe is based on the state design pattern found in Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley, 1995).

## 8.20. Calling a Method on an Object Given the Name As a String

### Problem

You have the name of a method stored in a string, and you want to call that method on an object.

### Solution

For simple cases, you might use `getattr()`, like this:

In [169]:
import math

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __repr__(self):
        return 'Point({!r:},{!r:})'.format(self.x, self.y)
    
    def distance(self, x, y):
        return math.hypot(self.x - x, self.y - y)

In [170]:
p = Point(2, 3)
d = getattr(p, 'distance')(0, 0)
# Calls p.distance(0, 0)
d

3.605551275463989

An alternative approach is to use `operator.methodcaller()`.

In [171]:
import operator

operator.methodcaller('distance', 0,0)(p)

3.605551275463989

`operator.methodcaller()` may be useful if you want to look up a method by name and supply the same arguments over and over again.  
For example, if you need to sort an entire list of points:

In [172]:
points = [
    Point(1, 2),
    Point(3, 0),
    Point(10, -3),
    Point(-5, -7),
    Point(-1, 8),
    Point(3, 2)
]
# Sort in place by distance from the origin (0, 0); list.sort() returns None
points.sort(key=operator.methodcaller('distance', 0, 0))

### Discussion

Calling a method is actually two separate steps involving an attribute lookup and a function call.  
Therefore, to call a method, you simply look up the attribute using `getattr()`, as for any other attribute.  
To invoke the result as a method, simply treat the result of the lookup as a function.  
`operator.methodcaller()` creates a callable object, but also fixes any arguments that are going to be supplied to the method.  
All that you need to do is provide the appropriate self argument.

In [173]:
p = Point(3, 4)
d = operator.methodcaller('distance', 0, 0)
d(p)

5.0
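
The two steps described above can be written out separately. The following is a minimal sketch that repeats a trimmed-down `Point` class so it runs on its own; the names `meth`, `d1`, and `d2` are introduced here only for illustration.

```python
import math
import operator

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance(self, x, y):
        return math.hypot(self.x - x, self.y - y)

p = Point(3, 4)

# Step 1: the attribute lookup produces a bound method that remembers p
meth = getattr(p, 'distance')
# Step 2: an ordinary function call on the result of the lookup
d1 = meth(0, 0)

# methodcaller fixes the name and arguments up front; the object is supplied last
d2 = operator.methodcaller('distance', 0, 0)(p)
print(d1, d2)

# A name that does not exist fails at the lookup step, before any call happens
try:
    getattr(p, 'no_such_method')
except AttributeError as exc:
    print('lookup failed:', exc)
```

Because the lookup and the call are separate, the bound method can be stored and invoked later, which is what makes string-based dispatch possible.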

Invoking methods using names contained in strings is somewhat common in code that emulates case statements or variants of the visitor pattern.  
See the next recipe for a more advanced example.

## 8.21. Implementing the Visitor Pattern

### Problem

You need to write code that processes or navigates through a complicated data structure consisting of many different kinds of objects, each of which needs to be handled in a different way.  
One example would be traversing a tree structure and performing different actions depending on what kinds of nodes are encountered.

### Solution

The problem addressed by this recipe is one that often arises in programs that build data structures consisting of a wide variety and quantity of objects.  
To illustrate, suppose you are trying to write a program that represents mathematical expressions, and this program uses a number of different classes.

In [174]:
class Node:
    pass

class UnaryOperator(Node):
    def __init__(self, operand):
        self.operand = operand
        
class BinaryOperator(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right
        
class Add(BinaryOperator):
    pass

class Sub(BinaryOperator):
    pass

class Mul(BinaryOperator):
    pass

class Div(BinaryOperator):
    pass

class Negate(UnaryOperator):
    pass

class Number(Node):
    def __init__(self, value):
        self.value = value

The classes above can be used to build up nested data structures.  
The following example builds the tree for the arithmetic expression `1 + 2 * (3 - 4) / 5`.

In [175]:
# Representation of 1 + 2 * (3 - 4) / 5
t1 = Sub(Number(3), Number(4))
t2 = Mul(Number(2), t1)
t3 = Div(t2, Number(5))
t4 = Add(Number(1), t3)

In [176]:
t1

<__main__.Sub at 0x107f901d0>

In [177]:
type(t1)

__main__.Sub

In [178]:
import pprint

pprint.pprint(t4)

<__main__.Add object at 0x107f903c8>


In [179]:
pprint.pprint(vars(t4))

{'left': <__main__.Number object at 0x107f90358>,
 'right': <__main__.Div object at 0x107f90240>}


In [180]:
pprint.pprint(t4.__dict__)

{'left': <__main__.Number object at 0x107f90358>,
 'right': <__main__.Div object at 0x107f90240>}


In [181]:
dir(t4)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'left',
 'right']

The problem is not the creation of such structures, but in writing code that processes them later.  
For example, given such an expression, a program might want to do any number of things including producing output, generating instructions, or performing translations.  
To enable general-purpose processing, a common solution is to implement the [Visitor Pattern](https://en.wikipedia.org/wiki/Visitor_pattern#Python_example) using a class similar to this:

In [182]:
class NodeVisitor:
    def visit(self, node):
        methname = 'visit_' + type(node).__name__
        meth = getattr(self, methname, None)
        if meth is None:
            meth = self.generic_visit
        return meth(node)
    
    def generic_visit(self, node):
        raise RuntimeError('No {} method'.format('visit_' + type(node).__name__))

To use this class, a programmer inherits from it and implements various methods of the form `visit_Name()`, where `Name` is substituted with the node type.  
For example, this is one way that you could evaluate an expression:

In [183]:
class Evaluator(NodeVisitor):
    def visit_Number(self, node):
        return node.value
    
    def visit_Add(self, node):
        return self.visit(node.left) + self.visit(node.right)
    
    def visit_Sub(self, node):
        return self.visit(node.left) - self.visit(node.right)
    
    def visit_Mul(self, node):
        return self.visit(node.left) * self.visit(node.right)
    
    def visit_Div(self, node):
        return self.visit(node.left) / self.visit(node.right)
    
    def visit_Negate(self, node):
        return -self.visit(node.operand)

Now we can evaluate our previously generated expression:

In [184]:
e = Evaluator()
e.visit(t4)

0.6

Here is an example of a class that translates an expression into operations on a simple stack machine:

In [185]:
class StackCode(NodeVisitor):
    def generate_code(self, node):
        self.instructions = []
        self.visit(node)
        return self.instructions
    
    def visit_Number(self, node):
        self.instructions.append(('PUSH', node.value))
        
    def binop(self, node, instruction):
        self.visit(node.left)
        self.visit(node.right)
        self.instructions.append((instruction,))
        
    def visit_Add(self, node):
        self.binop(node, 'ADD')
        
    def visit_Sub(self, node):
        self.binop(node, 'SUB')
        
    def visit_Mul(self, node):
        self.binop(node, 'MUL')
        
    def visit_Div(self, node):
        self.binop(node, 'DIV')
        
    def unaryop(self, node, instruction):
        self.visit(node.operand)
        self.instructions.append((instruction,))
        
    def visit_Negate(self, node):
        self.unaryop(node, 'NEG')

In [186]:
s = StackCode()
s.generate_code(t1)

[('PUSH', 3), ('PUSH', 4), ('SUB',)]

In [188]:
s.generate_code(t2)

[('PUSH', 2), ('PUSH', 3), ('PUSH', 4), ('SUB',), ('MUL',)]

In [189]:
s.generate_code(t3)

[('PUSH', 2),
 ('PUSH', 3),
 ('PUSH', 4),
 ('SUB',),
 ('MUL',),
 ('PUSH', 5),
 ('DIV',)]

In [190]:
s.generate_code(t4)

[('PUSH', 1),
 ('PUSH', 2),
 ('PUSH', 3),
 ('PUSH', 4),
 ('SUB',),
 ('MUL',),
 ('PUSH', 5),
 ('DIV',),
 ('ADD',)]
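
To check that the generated instructions mean what they appear to mean, they can be fed to a small interpreter. The following is only a sketch under stated assumptions: the `run()` function and the `BINARY_OPS` table are hypothetical helpers that are not part of the recipe, and the instruction semantics (binary operations pop the right operand first, then the left) are inferred from the order in which `binop()` emits code.

```python
import operator

# Hypothetical mapping from instruction names to Python operators
BINARY_OPS = {
    'ADD': operator.add,
    'SUB': operator.sub,
    'MUL': operator.mul,
    'DIV': operator.truediv,
}

def run(instructions):
    """Execute a list of instruction tuples and return the final stack value."""
    stack = []
    for instr in instructions:
        op = instr[0]
        if op == 'PUSH':
            stack.append(instr[1])
        elif op == 'NEG':
            stack.append(-stack.pop())
        elif op in BINARY_OPS:
            # The right operand was pushed last, so it is popped first
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY_OPS[op](left, right))
        else:
            raise ValueError('Unknown instruction: {!r}'.format(op))
    return stack.pop()

# Instructions as produced by s.generate_code(t4) above
code = [('PUSH', 1), ('PUSH', 2), ('PUSH', 3), ('PUSH', 4),
        ('SUB',), ('MUL',), ('PUSH', 5), ('DIV',), ('ADD',)]
result = run(code)
print(result)
```

The result agrees with the value computed by the `Evaluator` visitor for the same tree.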

### Discussion

There are really two key ideas in this recipe.  
The first is a design strategy where code that manipulates a complicated data structure is decoupled from the data structure itself.  
That is, in this recipe, none of the various `Node` classes provide any implementation that does anything with the data.  
Instead, all of the data manipulation is carried out by specific implementations of the separate `NodeVisitor` class.  
This separation makes the code extremely general purpose.  
The second major idea of this recipe is in the implementation of the visitor class itself.  
In the visitor, you want to dispatch to a different handling method based on some value such as the node type.  
In a naive implementation, you might be inclined to write a huge if statement, like this:

In [191]:
class NodeVisitor:
    def visit(self, node):
        nodetype = type(node).__name__
        if nodetype == 'Number':
            return self.visit_Number(node)
        elif nodetype == 'Add':
            return self.visit_Add(node)
        elif nodetype == 'Sub':
            return self.visit_Sub(node)
        # And so on ...
        else:
            pass

However, it quickly becomes apparent that you don’t really want to take that approach.  
Aside from being incredibly verbose, it runs slowly, and it’s hard to maintain if you ever add or change the kind of nodes being handled.  
Instead, it’s much better to play a little trick where you form the name of a method and go fetch it with the `getattr()` function, as shown.  
The `generic_visit()` method in the solution is a fallback should no matching handler method be found.  
In this recipe, it raises an exception to alert the programmer that an unexpected node type was encountered.  
Within each visitor class, it is common for calculations to be driven by recursive calls to the `visit()` method.

In [192]:
class Evaluator(NodeVisitor):
    # ...
    def visit_Add(self, node):
        return self.visit(node.left) + self.visit(node.right)

This recursion is what makes the visitor class traverse the entire data structure.  
Essentially, you keep calling `visit()` until you reach some sort of terminal node, such as `Number` in the example.  
The exact order of the recursion and other operations depends entirely on the application.  
It should be noted that this particular technique of dispatching to a method is also a common way to emulate the behavior of switch or case statements from other languages.  
For example, if you are writing an HTTP framework, you might have classes that do a similar kind of dispatch:

In [193]:
class HTTPHandler:
    def handle(self, request):
        methname = 'do_' + request.request_method
        getattr(self, methname)(request)
        
    def do_GET(self, request):
        # ... 
        pass
    
    def do_POST(self, request):
        # ... 
        pass
    
    def do_HEAD(self, request):
        # ...
        pass
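
As with `generic_visit()`, this kind of dispatch usually wants a fallback for names that have no handler. The sketch below is one possible way to do that; the `Request` named tuple, the `unsupported()` method, and the returned `(status, body)` tuples are assumptions made for illustration and do not come from any real framework.

```python
from collections import namedtuple

# Hypothetical stand-in for a real request object
Request = namedtuple('Request', ['request_method', 'path'])

class HTTPHandler:
    def handle(self, request):
        methname = 'do_' + request.request_method
        # Fall back to unsupported() when no do_<METHOD> handler exists
        meth = getattr(self, methname, self.unsupported)
        return meth(request)

    def do_GET(self, request):
        return (200, 'GET ' + request.path)

    def do_HEAD(self, request):
        return (200, '')

    def unsupported(self, request):
        return (405, 'Method Not Allowed')

h = HTTPHandler()
ok = h.handle(Request('GET', '/index.html'))
bad = h.handle(Request('PATCH', '/index.html'))
print(ok, bad)
```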

One weakness of the visitor pattern is its heavy reliance on recursion.  
If you try to apply it to a deeply nested structure, it’s possible that you will hit Python’s [recursion depth limit](https://docs.python.org/3/library/sys.html#sys.getrecursionlimit).  
To avoid this problem, you can make certain choices in your data structures.  
For example, you can use normal Python lists instead of linked lists or try to aggregate more data in each node to make the data more shallow.  
You can also try to employ nonrecursive traversal algorithms using generators or iterators as discussed in Recipe 8.22.  
Use of the visitor pattern is extremely common in programs related to parsing and compiling.  
One notable implementation can be found in Python’s own `ast` module.  
In addition to allowing traversal of tree structures, it provides a variation that allows a data structure to be rewritten or transformed as it is traversed, like when nodes are added or removed.  
Look at the [source for ast](https://github.com/python/cpython/blob/master/Lib/ast.py) for more details.  
Recipe 9.24 shows an example of using the `ast` module to process Python source code.
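
For a quick taste of the `ast` version of the pattern, here is a minimal sketch that evaluates the same arithmetic expression by subclassing `ast.NodeVisitor`. It assumes Python 3.8 or later, where numeric literals are parsed as `ast.Constant` nodes; the `ExprEvaluator` class name and its `_binops` table are hypothetical and handle only the four basic operators and unary minus.

```python
import ast
import operator

class ExprEvaluator(ast.NodeVisitor):
    # Hypothetical table mapping AST operator node types to functions
    _binops = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
    }

    def visit_Expression(self, node):
        return self.visit(node.body)

    def visit_Constant(self, node):
        return node.value

    def visit_BinOp(self, node):
        func = self._binops.get(type(node.op))
        if func is None:
            raise ValueError('Unsupported operator: ' + type(node.op).__name__)
        return func(self.visit(node.left), self.visit(node.right))

    def visit_UnaryOp(self, node):
        if isinstance(node.op, ast.USub):
            return -self.visit(node.operand)
        raise ValueError('Unsupported operator: ' + type(node.op).__name__)

    def generic_visit(self, node):
        # Reject anything this sketch does not explicitly handle
        raise ValueError('Unsupported syntax: ' + type(node).__name__)

tree = ast.parse('1 + 2 * (3 - 4) / 5', mode='eval')
value = ExprEvaluator().visit(tree)
print(value)
```

The dispatch on `visit_<ClassName>` with a `generic_visit()` fallback follows the same scheme as the `NodeVisitor` class in this recipe.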

## 8.22. Implementing the Visitor Pattern Without Recursion

### Problem

You’re writing code that navigates through a deeply nested tree structure using the visitor pattern, but it blows up due to [exceeding the recursion limit](https://stackoverflow.com/questions/3323001/what-is-the-maximum-recursion-depth-in-python-and-how-to-increase-it).  
You’d like to eliminate the recursion, but keep the programming style of the visitor pattern.

### Solution

Clever use of generators can sometimes be used to eliminate recursion from algorithms involving tree traversal or searching.  
In Recipe 8.21, a visitor class was presented.  
Here is an alternative implementation of that class that drives the computation in an entirely different way using a stack and generators:

In [194]:
import types

class Node:
    pass

class NodeVisitor:
    def visit(self, node):
        stack = [node]
        last_result = None
        while stack:
            try:
                last = stack[-1]
                if isinstance(last, types.GeneratorType):
                    stack.append(last.send(last_result))
                    last_result = None
                elif isinstance(last, Node):
                    stack.append(self._visit(stack.pop()))
                else:
                    last_result = stack.pop()
            except StopIteration:
                stack.pop()
        return last_result
    
    def _visit(self, node):
        methname = 'visit_' + type(node).__name__
        meth = getattr(self, methname, None)
        if meth is None:
            meth = self.generic_visit
        return meth(node)
    
    def generic_visit(self, node):
        raise RuntimeError('No {} method'.format('visit_' + type(node).__name__))

If you use this class, you'll find that it still works with existing code that might have used recursion.  
In fact, you can use it as a drop-in replacement for the visitor pattern implementation in the prior recipe.  
For example, consider the following code that uses [expression trees](https://www.geeksforgeeks.org/expression-tree/):

In [195]:
class UnaryOperator(Node):
    def __init__(self, operand):
        self.operand = operand
        
class BinaryOperator(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right
        
class Add(BinaryOperator):
    pass

class Sub(BinaryOperator):
    pass

class Mul(BinaryOperator):
    pass

class Div(BinaryOperator):
    pass

class Negate(UnaryOperator):
    pass

class Number(Node):
    def __init__(self, value):
        self.value = value
        
# A sample visitor class that evaluates expressions
class Evaluator(NodeVisitor):
    def visit_Number(self, node):
        return node.value
    
    def visit_Add(self, node):
        return self.visit(node.left) + self.visit(node.right)
    
    def visit_Sub(self, node):
        return self.visit(node.left) - self.visit(node.right)
    
    def visit_Mul(self, node):
        return self.visit(node.left) * self.visit(node.right)
    
    def visit_Div(self, node):
        return self.visit(node.left) / self.visit(node.right)
    
    def visit_Negate(self, node):
        return -self.visit(node.operand)

In [196]:
if __name__ == "__main__":
    # 1 + 2 * (3 - 4) / 5
    t1 = Sub(Number(3), Number(4))
    t2 = Mul(Number(2), t1)
    t3 = Div(t2, Number(5))
    t4 = Add(Number(1), t3)
    # Evaluate the expression
    e = Evaluator()

In [197]:
print(e.visit(t4))

0.6


In [198]:
print(e.visit(t1))

-1


The preceding code works for simple expressions, but `Evaluator()` uses recursion and will crash if the depth is too great.

In [199]:
a = Number(0)
for n in range(1, 10):
    a = Add(a, Number(n))
e = Evaluator()
e.visit(a)

45

Just keep adding zeros to the second range argument until you see a stack trace stating `RecursionError: maximum recursion depth exceeded while calling a Python object`.
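
Rather than guessing how many zeros are needed, the failure can be triggered deterministically by building a tree deeper than the current recursion limit. The sketch below is self-contained and uses a deliberately simple recursive visitor; `RecursiveEvaluator` and the `overflowed` flag are hypothetical names used only for this demonstration.

```python
import sys

class Node:
    pass

class Number(Node):
    def __init__(self, value):
        self.value = value

class Add(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

class RecursiveEvaluator:
    def visit(self, node):
        meth = getattr(self, 'visit_' + type(node).__name__)
        return meth(node)

    def visit_Number(self, node):
        return node.value

    def visit_Add(self, node):
        return self.visit(node.left) + self.visit(node.right)

# Build a left-leaning tree that is deeper than the interpreter allows
depth = sys.getrecursionlimit() * 2
a = Number(0)
for n in range(1, depth):
    a = Add(a, Number(n))

try:
    RecursiveEvaluator().visit(a)
    overflowed = False
except RecursionError:
    overflowed = True
print('RecursionError raised:', overflowed)
```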

Now let's change the Evaluator class so that it can handle our experiments with recursion.

In [200]:
class Evaluator(NodeVisitor):
    def visit_Number(self, node):
        return node.value
    
    def visit_Add(self, node):
        yield (yield node.left) + (yield node.right)
        
    def visit_Sub(self, node):
        yield (yield node.left) - (yield node.right)
        
    def visit_Mul(self, node):
        yield (yield node.left) * (yield node.right)
        
    def visit_Div(self, node):
        yield (yield node.left) / (yield node.right)
        
    def visit_Negate(self, node):
        yield -(yield node.operand)

In [201]:
if __name__ == "__main__":
    # 1 + 2 * (3 - 4) / 5
    t1 = Sub(Number(3), Number(4))
    t2 = Mul(Number(2), t1)
    t3 = Div(t2, Number(5))
    t4 = Add(Number(1), t3)
    # Evaluate the expression
    e = Evaluator()

In [202]:
a = Number(0)
for n in range(1, 100000):
    a = Add(a, Number(n))
e = Evaluator()

In [203]:
e.visit(a)

4999950000

You can also add custom processing into any of the methods:

In [204]:
class Evaluator(NodeVisitor):
    # ...
    def visit_Add(self, node):
        print('Add:', node)
        lhs = yield node.left
        print('left=', lhs)
        rhs = yield node.right
        print('right=', rhs)
        yield lhs + rhs
        # ...

### Discussion

This recipe nicely illustrates how generators and coroutines can perform mind-bending tricks involving program control flow, often to great advantage.  
To understand this recipe, a few key insights are required.  
First, in problems related to tree traversal, a common implementation strategy for avoiding recursion is to write algorithms involving a stack or queue.  
For example, depth-first traversal can be implemented entirely by pushing nodes onto a stack when first encountered and then popping them off once processing has finished.  
The central core of the `visit()` method in the solution is built around this idea.  
The algorithm starts by pushing the initial node onto the stack list and runs until the stack is empty.  
During execution, the stack will grow according to the depth of the underlying tree structure.  
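
The stack-based idea can be seen in isolation on a simpler problem. The following sketch, which is a simplified relative of the `visit()` loop and not part of the recipe, walks a nested list depth-first with an explicit stack; the `depth_first()` function and the sample data are assumptions made for illustration.

```python
def depth_first(tree):
    """Yield the leaves of a nested list in left-to-right order."""
    stack = [tree]
    while stack:
        item = stack.pop()
        if isinstance(item, list):
            # Push children in reverse so the leftmost child is popped first
            stack.extend(reversed(item))
        else:
            yield item

nested = [1, [2, [3, 4], 5], [[6]], 7]
leaves = list(depth_first(nested))
print(leaves)
```

Since the pending work lives in an ordinary list, the nesting depth is limited by available memory and not by the interpreter's recursion limit.
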
The second insight concerns the behavior of the `yield` statement in generators.  
When `yield` is encountered, the behavior of a generator is to emit a value and to suspend.  
This recipe uses this as a replacement for recursion.  
For example, instead of writing a recursive expression like `value = self.visit(node.left)`, you replace it with `value = yield node.left`.

Behind the scenes, this sends the node in question, `node.left`, back to the `visit()` method.  
The `visit()` method then carries out the execution of the appropriate `visit_Name()` method for that node.  
In some sense, this is almost the opposite of recursion.  
That is, instead of calling `visit()` recursively to move the algorithm forward, the yield statement is being used to temporarily back out of the computation in progress.  
Thus, the yield is essentially a signal that tells the algorithm that the yielded node needs to be processed first before further progress can be made.
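
The suspend-and-resume behavior can be tried on its own, without any tree. In this small sketch, the `adder()` generator and the strings it yields are hypothetical; the driver code plays the role that `visit()` plays in the solution by calling `send()` to hand results back.

```python
def adder():
    # Each yield hands a request to the driver and suspends;
    # the value passed to send() becomes the result of that yield.
    lhs = yield 'need left'
    rhs = yield 'need right'
    yield lhs + rhs

gen = adder()
first = gen.send(None)   # start the generator; it runs to the first yield
second = gen.send(10)    # resume with lhs = 10; it runs to the second yield
total = gen.send(32)     # resume with rhs = 32; it yields the sum
print(first, second, total)
```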

The final part of this recipe concerns propagation of results.  
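When generator functions are used, a `return` statement can no longer hand a value back to the caller directly: since Python 3.3 the returned value is attached to the `StopIteration` exception, which the `visit()` method in this recipe discards (older versions reject `return` with a value in a generator as a `SyntaxError`).  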
Thus, the `yield` statement has to do double duty to cover the case.  
In this recipe, if the value produced by a yield statement is a non-Node type, it is assumed to be a value that will be propagated to the next step of the calculation.  
This is the purpose of the `last_result` variable in the code.  
Typically, this would hold the last value yielded by a visit method.  
That value would then be sent into the previously executing method, where it would show up as the return value from a yield statement.  
For example, in this code: