# Chapter 8: Classes and Objects

The primary focus of this chapter is to present recipes for common programming patterns related to class definitions.  
Topics include making objects support common Python features, usage of special methods, encapsulation techniques, inheritance, memory management, and useful design patterns.  

## 8.1. Changing the String Representation of Instances

### Problem

You want to change the output produced by printing or viewing instances to something more sensible.

### Solution

To change the string representation of an instance, define the `__str__()` and `__repr__()` methods.

In [37]:
class Pair:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        # r for repr
        return "Pair({0.x!r}, {0.y!r})".format(self)
    def __str__(self):
        # s for str
        return "({0.x!s}, {0.y!s})".format(self)

The `__repr__()` method returns the code representation of an instance, and is usually the text you would type to recreate the instance.  
The built-in `repr()` function returns this text, as does the interactive interpreter when inspecting values.  
The `__str__()` method converts the instance to a string, and is the output produced by the `str()` and `print()` functions.

In [38]:
p = Pair(3, 4)
# __repr__() output:
p

Pair(3, 4)

In [39]:
#__str__() output:
print(p)

(3, 4)


The implementation of this recipe also shows how different string representations may be used during formatting.  
Specifically, the special `!r` formatting code indicates that the output of `__repr__()` should be used instead of the default `__str__()`.  
You can try this experiment with the preceding class to see this:  

In [40]:
p = Pair(3, 4)
print('p is {0!r}'.format(p))

p is Pair(3, 4)


In [41]:
print('p is {0}'.format(p))

p is (3, 4)


### Discussion

Defining `__repr__()` and `__str__()` is often good practice, as it can simplify debugging and instance output.  
For example, just by printing or logging an instance, a programmer will be shown more useful information about the instance contents.  
It is standard practice for the output of `__repr__()` to produce text such that `eval(repr(x)) == x`.  
If this is not possible or desired, then it is common to create a useful textual representation enclosed in `<` and `>` instead.

In [42]:
f = open('example.bin')
f

<_io.TextIOWrapper name='example.bin' mode='r' encoding='UTF-8'>

In [43]:
f.close()

If no `__str__()` is defined, the output of `__repr__()` is used as a fallback.  
The use of `format()` in the solution might look a little funny, but the format code `{0.x}` specifies the x-attribute of argument 0.  
So, in the following function, the 0 is actually the instance self:

In [44]:
def __repr__(self):
    return "Pair({0.x!r}, {0.y!r})".format(self)

You can also use the modulo operator:

In [45]:
def __repr__(self):
    return "Pair(%r, %r)" % (self.x, self.y)

In [46]:
p

Pair(3, 4)

In [47]:
print(p)

(3, 4)
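
The `eval(repr(x)) == x` convention and the `__repr__()` fallback described above can be checked with a small sketch. The `Point` class and its `__eq__()` method are hypothetical additions for illustration (the `Pair` class in the solution does not define equality, so the comparison would not succeed there).

```python
class Point:
    """Hypothetical class used only to illustrate the repr round trip."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Text that recreates the instance when evaluated
        return f"Point({self.x!r}, {self.y!r})"

    def __eq__(self, other):
        # Equality is needed for eval(repr(x)) == x to be meaningful
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(3, 4)
clone = eval(repr(p))        # rebuilds an equivalent instance
round_trips = clone == p     # True
fallback = str(p)            # no __str__() defined, so __repr__() is used
```

Here `fallback` is `'Point(3, 4)'`, the same text that `repr(p)` returns.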


## 8.2. Customizing String Formatting

### Problem

You want an object to support customized formatting through the `format()` function and the `str.format()` method.

### Solution

You can customize string formatting by defining the `__format__()` method on a class.

In [48]:
_formats = {
        'ymd' : '{d.year}-{d.month}-{d.day}',
        'mdy' : '{d.month}/{d.day}/{d.year}',
        'dmy' : '{d.day}/{d.month}/{d.year}'
        }

class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day
        
    def __format__(self, code):
        if code == '':
            code = 'ymd'
        fmt = _formats[code]
        return fmt.format(d=self)

Instances of the `Date` class now support operations like the ones below:

In [49]:
d = Date(2019, 3, 21)
format(d)

'2019-3-21'

In [50]:
format(d, 'mdy')

'3/21/2019'

In [51]:
"The date is {:ymd}".format(d)

'The date is 2019-3-21'

In [52]:
"The date is {:mdy}".format(d)

'The date is 3/21/2019'
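
The same hook is also used by f-strings, which pass everything after the colon to `__format__()`. The sketch below repeats the `Date` class so that it is self-contained. Turning an unknown code into a `ValueError` is an assumption made here for illustration; the solution above would raise `KeyError` instead.

```python
_formats = {
    'ymd': '{d.year}-{d.month}-{d.day}',
    'mdy': '{d.month}/{d.day}/{d.year}',
    'dmy': '{d.day}/{d.month}/{d.year}',
}


class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    def __format__(self, code):
        # An empty code means "use the default layout"
        if code == '':
            code = 'ymd'
        try:
            fmt = _formats[code]
        except KeyError:
            # Illustrative choice: report unknown codes as ValueError
            raise ValueError(f'Unknown format code {code!r}') from None
        return fmt.format(d=self)


d = Date(2019, 3, 21)
default_text = f'{d}'        # '2019-3-21'
dmy_text = f'{d:dmy}'        # '21/3/2019'
```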

### Discussion


The `__format__()` method provides a hook into Python’s string formatting functionality.  
It’s important to emphasize that the interpretation of format codes is entirely up to the class itself.  
Thus, the codes can be almost anything at all.  
For example, consider the following from the `datetime` module:

In [53]:
from datetime import date

d = date(2019, 3, 21)
format(d)

'2019-03-21'

In [54]:
format(d, '%A, %B, %d, %Y')

'Thursday, March, 21, 2019'

In [55]:
'The end is coming on {:%d %b %Y}. Farewell.'.format(d)

'The end is coming on 21 Mar 2019. Farewell.'

There are some standard conventions for the formatting of the built-in types.  
See the [documentation for the string module](https://docs.python.org/3/library/string.html) for a formal specification.

## 8.3. Making Objects Support the Context-Management Protocol

### Problem

You want to make your objects support the context-management protocol (the `with` statement).

### Solution

In order to make an object compatible with the `with` statement, you need to implement `__enter__()` and `__exit__()` methods.  
For example, consider the following class, which provides a network connection:

In [56]:
from socket import socket, AF_INET, SOCK_STREAM

class LazyConnection:
    def __init__(self, address, family=AF_INET, type=SOCK_STREAM):
        self.address = address
        self.family = family
        self.type = type
        self.sock = None
        
    def __enter__(self):
        if self.sock is not None:
            raise RuntimeError('*** Already connected ***')
        self.sock = socket(self.family, self.type)
        self.sock.connect(self.address)
        return self.sock
    
    def __exit__(self, exc_ty, exc_val, tb):
        self.sock.close()
        self.sock = None

The key feature of this class is that it represents a network connection, but it doesn’t actually do anything initially (e.g., it doesn’t establish a connection).  
Instead, the connection is established and closed using the with statement (essentially on demand).

In [57]:
from functools import partial

conn = LazyConnection(('www.python.org', 80))
# Connection closed
with conn as s:
    # conn.__enter__() executes: connection open
    s.send(b'GET /index.html HTTP/1.0\r\n')
    s.send(b'Host: www.python.org\r\n')
    s.send(b'\r\n')
    resp = b''.join(iter(partial(s.recv, 8192), b''))
    # conn.__exit__() executes: connection closed

### Discussion

The main principle behind writing a context manager is that you’re writing code that’s meant to surround a block of statements as defined by the use of the `with` statement.  
When the `with` statement is first encountered, the `__enter__()` method is triggered.  
The return value of `__enter__()` (if any) is placed into the variable indicated with the `as` qualifier.  
Afterward, the statements in the body of the with statement execute.  
Finally, the `__exit__()` method is triggered to clean up.  
This control flow happens regardless of what happens in the body of the with statement, including if there are exceptions.  
In fact, the three arguments to the `__exit__()` method contain the exception type, value, and traceback for pending exceptions (if any).  
The `__exit__()` method can choose to use the exception information in some way or to ignore it by doing nothing and returning `None` as a result.  
If `__exit__()` returns `True`, the exception is cleared as if nothing happened and the program continues executing statements immediately after the with block.  
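
A minimal sketch of the exception handling just described is given below. The `Suppress` class is a hypothetical helper written for illustration (the standard library offers a similar tool as `contextlib.suppress`). It records a matching exception and returns `True` from `__exit__()` so that execution continues after the block.

```python
class Suppress:
    """Hypothetical context manager that swallows selected exceptions."""

    def __init__(self, *exc_types):
        self.exc_types = exc_types
        self.caught = None

    def __enter__(self):
        return self

    def __exit__(self, exc_ty, exc_val, tb):
        # exc_ty is None when the body finished without an exception
        if exc_ty is not None and issubclass(exc_ty, self.exc_types):
            self.caught = exc_val
            return True      # clear the exception
        return False         # let any other exception propagate


with Suppress(ZeroDivisionError) as guard:
    1 / 0                    # raises, but __exit__() suppresses it

reached = True               # execution continues after the with block
```

An exception of any other type, such as `KeyError`, would still propagate out of the `with` block because `__exit__()` returns `False` for it.
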
One subtle aspect of this recipe is whether or not the `LazyConnection` class allows nested use of the connection with multiple with statements.  
As shown, only a single socket connection at a time is allowed, and an exception is raised if a repeated with statement is attempted when a socket is already in use.  
You can work around this limitation with a slightly different implementation, as shown here:

In [58]:
from socket import socket, AF_INET, SOCK_STREAM

class LazyConnection:
    def __init__(self, address, family=AF_INET, type=SOCK_STREAM):
        self.address = address
        self.family = family
        self.type = type
        self.connections = []
        
    def __enter__(self):
        sock = socket(self.family, self.type)
        sock.connect(self.address)
        self.connections.append(sock)
        return sock
    
    def __exit__(self, exc_ty, exc_val, tb):
        self.connections.pop().close()

In this second version, the `LazyConnection` class serves as a kind of factory for connections.  
Internally, a list is used to keep a stack.  
Whenever `__enter__()` executes, it makes a new connection and adds it to the stack.  
The `__exit__()` method simply pops the last connection off the stack and closes it.  
It’s subtle, but this allows multiple connections to be created at once with nested `with` statements.  
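
The nesting behavior can be sketched without a network. In the example below, `FakeSocket` and `LazyResource` are hypothetical stand-ins for a real socket and for the stack-based `LazyConnection`; only the stack logic is the same as in the recipe.

```python
class FakeSocket:
    """Hypothetical stand-in for a socket; it only tracks being closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class LazyResource:
    """Same stack pattern as the second LazyConnection, minus networking."""

    def __init__(self):
        self.connections = []

    def __enter__(self):
        sock = FakeSocket()
        self.connections.append(sock)    # push the new connection
        return sock

    def __exit__(self, exc_ty, exc_val, tb):
        self.connections.pop().close()   # pop and close the newest one


conn = LazyResource()
with conn as s1:
    with conn as s2:
        # Two independent connections are open at this point
        depth_inside = len(conn.connections)
        distinct = s1 is not s2
    # Leaving the inner block closed only the inner connection
    inner_closed = s2.closed
    outer_open = not s1.closed
```

After both blocks finish, the stack is empty again and both connections are closed.
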
Context managers are most commonly used in programs that need to manage resources such as files, network connections, and locks.  
A key part of such resources is they have to be explicitly closed or released to operate correctly.  
For instance, if you acquire a lock, then you have to make sure you release it, or else you risk deadlock.  
By implementing `__enter__()`, `__exit__()`, and using the with statement, it is much easier to avoid such problems, since the cleanup code in the `__exit__()` method is guaranteed to run no matter what.  
An alternative formulation of context managers, based on the `contextmanager` decorator in the `contextlib` module, is found in Recipe 9.22.  
A thread-safe version of this recipe can be found in Recipe 12.6.

## 8.4. Saving Memory When Creating a Large Number of Instances

### Problem

Your program creates a large number (e.g., millions) of instances and uses a large amount of memory.

### Solution

For classes that primarily serve as simple data structures, you can often greatly reduce the memory footprint of instances by adding the `__slots__` attribute to the class definition.

In [59]:
class Date:
    __slots__ = ['year', 'month', 'day']
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

When you define `__slots__`, Python uses a much more compact internal representation for instances.  
Instead of each instance consisting of a dictionary, instances are built around a small fixed-sized array, much like a tuple or list.  
Attribute names listed in the `__slots__` specifier are internally mapped to specific indices within this array.  
A side effect of using slots is that it is no longer possible to add new attributes to instances — you are restricted to only those attribute names listed in the `__slots__` specifier.
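
The side effect mentioned above is easy to observe. The following sketch repeats the slotted `Date` class; the attribute name `weekday` is an arbitrary example of a name that is not listed in `__slots__`.

```python
class Date:
    __slots__ = ['year', 'month', 'day']

    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day


d = Date(2019, 3, 21)

# Slotted instances carry no per-instance dictionary
has_dict = hasattr(d, '__dict__')

# Assigning a name that is not listed in __slots__ fails
try:
    d.weekday = 'Thursday'
except AttributeError as exc:
    error = exc
else:
    error = None
```

Attributes that are listed in `__slots__`, such as `d.year`, can still be read and reassigned as usual.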

### Discussion

The memory saved by using `__slots__` varies according to the number and type of attributes stored.  
However, in general, the resulting memory use is comparable to that of storing data in a tuple.  
To give you an idea, storing a single `Date` instance without slots requires 428 bytes of memory on a 64-bit version of Python.  
If slots is defined, it drops to 156 bytes.  
In a program that manipulated a large number of dates all at once, this would make a significant reduction in overall memory use.  
Although slots may seem like a feature that could be generally useful, you should resist the urge to use it in most code.  
There are many parts of Python that rely on the normal dictionary-based implementation.  
In addition, classes that define slots restrict certain features; for example, multiple inheritance fails when more than one base class defines non-empty `__slots__`.  
For the most part, you should only use slots on classes that are going to serve as frequently used data structures in your program (for example, if your program creates millions of instances of a particular class).  
A common misperception of `__slots__` is that it is an encapsulation tool that prevents users from adding new attributes to instances.  
Although this is a side effect of using slots, this was never the original purpose.  
Instead, `__slots__` was always [intended to be an optimization tool](http://python-history.blogspot.com/2010/06/inside-story-on-new-style-classes.html).

## 8.5. Encapsulating Names in a Class

### Problem

You want to encapsulate restricted data on instances of a class, but you are concerned about Python's lack of access control.

### Solution

Rather than relying on language features to encapsulate data, Python programmers are expected to observe certain naming conventions concerning the intended usage of data and methods.  
The first convention is that any name that starts with a single leading underscore `(_)` should *always* be assumed to be internal implementation.

In [60]:
class A:
    def __init__(self):
        # internal attribute
        self._internal = 0
        # public attribute
        self.public = 1
        
    def public_method(self):
        """
        Any public method will do.
        """
        pass
    
    def _internal_method(self):
        """
        Any private method will do.
        """

Python doesn’t actually prevent someone from accessing internal names.  
However, doing so is considered impolite, and may result in fragile code.  
It should be noted, too, that the leading underscore is also used for module names and module-level functions.  
For example, if you ever see a module name that starts with a leading underscore (like `_socket`), it’s an internal implementation.  
Likewise, module-level functions such as `sys._getframe()` should only be used with great caution.  
You may also encounter the use of two leading underscores `(__)` on names within class definitions.

In [61]:
class B: 
    def __init__(self):
        self.__private = 0
        
    def __private_method(self):
        """
        Did you know that a docstring alone can form a complete function in Python?
        You don't even need the pass keyword.
        Thanks Guido;)
        """
    def public_method(self):
        self.__private_method()

The use of double leading underscores causes the name to be mangled to something else.  
Specifically, the private attributes in the preceding class get renamed to `_B__private` and `_B__private_method`, respectively.  
At this point, you might ask what purpose such name mangling serves.  
The answer is inheritance — such attributes cannot be overridden via inheritance.

In [62]:
class C(B):
    def __init__(self):
        super().__init__()
        self.__private = 1 # Doesn't override B.__private
    def __private_method(self):
        """
        Doesn't override B.__private_method()
        """

Here, the private names `__private` and `__private_method()` get renamed to `_C__private` and `_C__private_method`, which are different from the mangled names in the base class `B`.
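
The mangled names can be inspected directly. The sketch below repeats classes `B` and `C`, with return values added to the private methods (an assumption made here so that the method actually called can be identified).

```python
class B:
    def __init__(self):
        self.__private = 0               # stored as _B__private

    def __private_method(self):          # stored as _B__private_method
        return 'B'

    def public_method(self):
        # Inside B, this name is mangled to _B__private_method
        return self.__private_method()


class C(B):
    def __init__(self):
        super().__init__()
        self.__private = 1               # stored as _C__private

    def __private_method(self):          # stored as _C__private_method
        return 'C'


c = C()
names = sorted(vars(c))                  # ['_B__private', '_C__private']
called = c.public_method()               # 'B': the subclass did not override it
```

Both attributes live side by side on the instance, and `B.public_method()` keeps calling the version of the private method defined in `B`.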

### Discussion

The fact that there are two different conventions (single underscore versus double underscore) for "private" attributes leads to the obvious question of which style you should use.  
For most code, you should probably just make your nonpublic names start with a single underscore.  
If, however, you know that your code will involve subclassing, and there are internal attributes that should be hidden from subclasses, use the double underscore instead.  
It should also be noted that sometimes you may want to define a variable that clashes with the name of a reserved word.  
For this, you should use a single trailing underscore.

In [63]:
lambda_ = 2.0
# so you don't clash with the lambda keyword

The reason for not using a leading underscore here is that it avoids confusion about the intended usage.  
In other words, the use of a leading underscore could be interpreted as a way to avoid a name collision rather than as an indication that the value is private.  
Using a single trailing underscore solves this problem.

## 8.6. Creating Managed Attributes

### Problem

You want to add extra processing, such as type checking or validation, to the getting or setting of an instance attribute.

### Solution

A simple way to customize access to an attribute is to define it as a [property](https://www.programiz.com/python-programming/property).  
For example, this code defines a `property` that adds simple type checking to an attribute:

In [64]:
class Person:
    def __init__(self, first_name):
        self.first_name = first_name
        
    # Getter function
    @property
    def first_name(self):
        return self._first_name
    
    # Setter function
    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value
            
    # Optional Deleter function
    @first_name.deleter
    def first_name(self):
        raise AttributeError('This attribute cannot be deleted')

In the preceding code, there are three related methods, all of which must have the same name.  
The first method is a getter function, and establishes `first_name` as being a property.  
The other two methods attach optional setter and deleter functions to the `first_name` property.  
It’s important to stress that the `@first_name.setter` and `@first_name.deleter` decorators won’t be defined unless `first_name` was already established as a property using the `@property` decorator.
A critical feature of a property is that it looks like a normal attribute, but access *automatically* triggers the getter, setter, and deleter methods.

In [65]:
a = Person('Guido')
a.first_name

'Guido'
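
The setter and deleter are triggered in the same automatic way. The sketch below repeats the `Person` class so that it is self-contained and collects the error messages produced by an invalid assignment and by a deletion attempt.

```python
class Person:
    def __init__(self, first_name):
        self.first_name = first_name     # goes through the setter

    @property
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value

    @first_name.deleter
    def first_name(self):
        raise AttributeError('This attribute cannot be deleted')


a = Person('Guido')
a.first_name = 'Dave'                    # valid: the setter stores the value

errors = []
try:
    a.first_name = 42                    # rejected by the type check
except TypeError as exc:
    errors.append(str(exc))
try:
    del a.first_name                     # rejected by the deleter
except AttributeError as exc:
    errors.append(str(exc))
```

After this runs, `a.first_name` is still `'Dave'`, since neither failed operation changed the stored value.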

When implementing a property, the underlying data (if any) still needs to be stored somewhere.  
Thus, in the getter and setter methods, you see direct manipulation of a `_first_name` attribute, which is where the actual data lives.  
In addition, you may ask why the `__init__()` method sets `self.first_name` instead of `self._first_name`.  
In this example, the entire point of the property is to apply type checking when setting an attribute.  
Thus, chances are you would also want such checking to take place during initialization.  
By setting `self.first_name`, the set operation uses the setter method instead of bypassing it by accessing `self._first_name`.  
Properties can also be defined for existing get and set methods.

In [66]:
class Person:
    def __init__(self, first_name):
        self.set_first_name(first_name)
        
    # Getter function
    def get_first_name(self):
        return self._first_name
    
    # Setter function
    def set_first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value
        
    # Optional Deleter function
    def del_first_name(self):
        raise AttributeError("This attribute cannot be deleted")
        
    # Make a property from the existing get and set methods
    name = property(get_first_name, set_first_name, del_first_name)
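
A short usage sketch for this second form follows. It repeats the class so that it runs on its own and shows that the `name` property routes attribute access through the existing `get_first_name()` and `set_first_name()` methods.

```python
class Person:
    def __init__(self, first_name):
        self.set_first_name(first_name)

    def get_first_name(self):
        return self._first_name

    def set_first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value

    def del_first_name(self):
        raise AttributeError('This attribute cannot be deleted')

    # Make a property from the existing get and set methods
    name = property(get_first_name, set_first_name, del_first_name)


p = Person('Guido')
p.name = 'Dave'                          # calls set_first_name('Dave')
same_value = p.name == p.get_first_name()

try:
    p.name = 1                           # the setter's type check still applies
except TypeError as exc:
    message = str(exc)
```

The explicit methods remain callable, so existing code that uses `get_first_name()` and `set_first_name()` keeps working alongside the property.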

### Discussion