# 2.1. Advanced Python Constructs

## 2.1.1. Iterators, generator expressions and generators

### 2.1.1.1. Iterators

An iterator is an object adhering to the iterator protocol: basically this means that it has a `__next__` method, which, when called, returns the next item in the sequence, and when there’s nothing to return, raises the `StopIteration` exception.

An iterator object allows us to loop just once. It holds the state (position) of a single iteration; seen from the other side, each loop over a sequence requires its own iterator object. This means that we can iterate over the same sequence more than once concurrently. Separating the iteration logic from the sequence also allows us to have more than one way of iterating.
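As a small illustration of the point above, the following sketch creates two iterators over the same list and advances them independently. The variable names are chosen here for illustration only.

```python
nums = [1, 2, 3]

# Two independent iterators over the same list; each keeps its own position.
it_a = iter(nums)
it_b = iter(nums)

first_a = next(it_a)    # it_a moves to the second item
second_a = next(it_a)
first_b = next(it_b)    # it_b is still at the start

# Nested loops over one sequence work for the same reason:
# every for clause asks the list for a fresh iterator.
pairs = [(x, y) for x in nums for y in nums]
```

Here `first_a`, `second_a` and `first_b` end up as 1, 2 and 1, and `pairs` contains all nine combinations.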

Calling the `__iter__` method on a container to create an iterator object is the most straightforward way to get hold of an iterator. The `iter` function does that for us, saving a few keystrokes.

In [1]:
nums = [1, 2, 3]

In [2]:
iter(nums)

<list_iterator at 0x1a8a3484648>

In [3]:
nums.__iter__()

<list_iterator at 0x1a8a345f3c8>

In [4]:
nums.__reversed__()

<list_reverseiterator at 0x1a8a3449208>

In [7]:
it = iter(nums)

In [8]:
next(it)

1

In [9]:
next(it)

2

In [10]:
next(it)

3

In [11]:
next(it)

StopIteration: 

When used in a loop, `StopIteration` is swallowed and causes the loop to finish. But with explicit invocation, we can see that once the iterator is exhausted, accessing it raises an exception.

The `for`..`in` loop also uses the `__iter__` method. This allows us to transparently start the iteration over a sequence. But if we already have the iterator, we want to be able to use it in a `for` loop in the same way. To achieve this, iterators are required to have, in addition to `__next__`, a method called `__iter__` which returns the iterator (`self`).
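A minimal sketch of a hand-written iterator may make the protocol concrete. The `Countdown` class is a hypothetical example, not something from the standard library.

```python
class Countdown:
    """Hypothetical iterator counting down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it can be used directly in a for loop.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
first = next(it)          # consume one item by hand
rest = [n for n in it]    # the loop continues where we left off
again = list(it)          # the iterator is exhausted by now
```

Because `__iter__` returns `self`, the loop picks up the partially consumed iterator instead of starting over: `first` is 3, `rest` is `[2, 1]` and `again` is empty.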

Support for iteration is pervasive in Python: all sequences and unordered containers in the standard library allow this. The concept is also stretched to other things: e.g. file objects support iteration over lines.

### 2.1.1.2. Generator expressions

A second way in which iterator objects are created is through generator expressions, the basis for list comprehensions. To increase clarity, a generator expression must always be enclosed in parentheses; the only exception is when it is the sole argument of a function call, where the parentheses of the call suffice. If round parentheses are used, then a generator iterator is created. If square brackets are used, the process is short-circuited and we get a `list`.

In [12]:
(i for i in nums)

<generator object <genexpr> at 0x000001A8A3517048>

In [13]:
[i for i in nums]

[1, 2, 3]

In [14]:
list(i for i in nums)

[1, 2, 3]
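The following sketch shows two practical consequences of the rules above: a generator expression passed as the only argument of a call needs no parentheses of its own, and, being an iterator, it can be consumed only once.

```python
nums = [1, 2, 3]

# As the sole argument of a call, no extra parentheses are needed.
total = sum(i * i for i in nums)

# With a second argument, the generator expression needs its own parentheses.
total_from_ten = sum((i * i for i in nums), 10)

# A generator expression is lazy and can be consumed only once.
squares = (i * i for i in nums)
first_pass = list(squares)
second_pass = list(squares)
```

After this runs, `total` is 14, `total_from_ten` is 24, `first_pass` is `[1, 4, 9]` and `second_pass` is an empty list.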

The list comprehension syntax also extends to dictionary and set comprehensions. A `set` is created when the generator expression is enclosed in curly braces. A `dict` is created when the generator expression contains “pairs” of the form `key:value`:

In [15]:
{i for i in range(0, 3)}

{0, 1, 2}

In [16]:
{i:i**2 for i in range(0, 3)}

{0: 0, 1: 1, 2: 4}

One gotcha should be mentioned: in Python 2 the index variable (`i`) of a list comprehension would leak into the enclosing scope; in Python 3 this is fixed.
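A quick way to check the Python 3 behaviour is sketched below. The name `idx` is used instead of `i` only so that the check does not depend on variables defined earlier in the session.

```python
squares = [idx ** 2 for idx in range(3)]

try:
    idx  # the comprehension's loop variable is not visible here
    leaked = True
except NameError:
    leaked = False

# A plain for loop, in contrast, leaves its variable bound after the loop.
for j in range(3):
    pass
last_j = j
```

In Python 3, `leaked` is `False`, while `last_j` is 2.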

### 2.1.1.3. Generators

A third way to create iterator objects is to call a generator function. A generator is a function containing the keyword `yield`. It must be noted that the mere presence of this keyword completely changes the nature of the function: the `yield` statement doesn’t have to be invoked, or even be reachable, but it causes the function to be marked as a generator. When a normal function is called, the instructions contained in the body start to be executed. When a generator is called, the execution stops before the first instruction in the body. An invocation of a generator function creates a generator object, adhering to the iterator protocol. As with normal function invocations, concurrent and recursive invocations are allowed.
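The claim that an unreachable `yield` is enough can be checked with `inspect.isgeneratorfunction`. The two functions below are hypothetical examples written for this check.

```python
import inspect


def never_yields():
    return 'done'
    yield  # unreachable, yet it still makes this a generator function


def plain():
    return 'done'


is_gen_func = inspect.isgeneratorfunction(never_yields)
is_plain_gen_func = inspect.isgeneratorfunction(plain)

gen = never_yields()    # no body code has run yet; we only get a generator object

try:
    next(gen)
    stop_value = None
except StopIteration as exc:
    stop_value = exc.value    # the return value travels on the exception
```

The first `next` runs straight into the `return`, so `StopIteration` is raised immediately and carries `'done'` as its `value`.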

When `next` is called, the function is executed until the first `yield`. Each encountered `yield` statement gives a value that becomes the return value of `next`. After executing the `yield` statement, the execution of this function is suspended.

In [18]:
def f():
    yield 1
    yield 2
    
f()

<generator object f at 0x000001A8A34931C8>

In [19]:
gen = f()
next(gen)

1

In [20]:
next(gen)

2

In [21]:
next(gen)

StopIteration: 

Let’s go over the life of the single invocation of the generator function.

In [22]:
def f():
    print('- start')
    yield 3
    print('- middle')
    yield 4
    print('- finished')
    
gen = f()
next(gen)

- start


3

In [23]:
next(gen)

- middle


4

In [24]:
next(gen)

- finished


StopIteration: 



Contrary to a normal function, where executing `f()` would immediately cause the first `print` to be executed, `gen` is assigned without executing any statements in the function body. Only when `gen.__next__()` is invoked by `next` are the statements up to the first `yield` executed. The second `next` prints `- middle` and execution halts on the second `yield`. The third `next` prints `- finished` and falls off the end of the function. Since no `yield` was reached, a `StopIteration` exception is raised.

What happens with the function after a `yield`, when the control passes to the caller? The state of each generator is stored in the generator object. From the point of view of the generator function, it looks almost as if it was running in a separate thread, but this is just an illusion: execution is strictly single-threaded, and the interpreter saves and restores the state in between the requests for the next value.

Why are generators useful? As noted in the parts about iterators, a generator function is just a different way to create an iterator object. Everything that can be done with `yield` statements could also be done with `__next__` methods. Nevertheless, using a function and having the interpreter perform its magic to create an iterator has advantages. A function can be much shorter than the definition of a class with the required `__next__` and `__iter__` methods. What is more important, it is easier for the author of the generator to understand the state which is kept in local variables, as opposed to instance attributes, which have to be used to pass data between consecutive invocations of `__next__` on an iterator object.
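To make the comparison concrete, here is a sketch of the same iteration written twice: once as a class and once as a generator function. Both `FibClass` and `fib_gen` are hypothetical names introduced for this example.

```python
class FibClass:
    """Iterator class: the state lives in instance attributes."""

    def __init__(self, limit):
        self.limit = limit
        self.a, self.b = 0, 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.a > self.limit:
            raise StopIteration
        value = self.a
        self.a, self.b = self.b, self.a + self.b
        return value


def fib_gen(limit):
    """Generator function: the state lives in local variables."""
    a, b = 0, 1
    while a <= limit:
        yield a
        a, b = b, a + b


from_class = list(FibClass(20))
from_gen = list(fib_gen(20))
```

Both produce `[0, 1, 1, 2, 3, 5, 8, 13]`, but the generator version needs no explicit bookkeeping for where it stopped.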

A broader question is why are iterators useful? When an iterator is used to power a loop, the loop becomes very simple. The code to initialise the state, to decide if the loop is finished, and to find the next value is extracted into a separate place. This highlights the body of the loop — the interesting part. In addition, it is possible to reuse the iterator code in other places.

### 2.1.1.4. Bidirectional communication

Each `yield` statement causes a value to be passed to the caller. This is the reason for the introduction of generators by PEP 255 (implemented in Python 2.2). But communication in the reverse direction is also useful. One obvious way would be some external state, either a global variable or a shared mutable object. Direct communication is possible thanks to PEP 342 (implemented in 2.5). It is achieved by turning the previously boring `yield` statement into an expression. When the generator resumes execution after a `yield` statement, the caller can call a method on the generator object to either pass a value into the generator, which then is returned by the `yield` statement, or a different method to inject an exception into the generator.

The first of the new methods is `send(value)`, which is similar to `next()`, but passes `value` into the generator to be used for the value of the `yield` expression. In fact, `next(g)` and `g.send(None)` are equivalent.
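A typical use of `send` is a generator that accumulates what it receives. The `running_average` generator below is a hypothetical sketch. Note that the generator has to be advanced to its first `yield` before a value other than `None` can be sent.

```python
def running_average():
    """Hypothetical generator: receives numbers via send(), yields the mean so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average    # the sent value becomes the result of yield
        total += value
        count += 1
        average = total / count


avg = running_average()
primed = next(avg)    # run to the first yield; same as avg.send(None)
a1 = avg.send(10)
a2 = avg.send(20)
a3 = avg.send(30)

# Sending a real value to a generator that has not started yet is an error.
fresh = running_average()
try:
    fresh.send(1)
    unprimed_send_failed = False
except TypeError:
    unprimed_send_failed = True
```

The three `send` calls return 10.0, 15.0 and 20.0, and the attempt to send into the unstarted generator raises `TypeError`.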

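The second of the new methods is `throw(type, value=None, traceback=None)`, which is roughly equivalent to the following, written in Python 3 syntax:

```python
# type is the exception class, value the argument used to instantiate it
raise type(value).with_traceback(traceback)
```

at the point of the `yield` statement. Since Python 3.12 this three-argument form is deprecated in favour of passing a single exception instance, `throw(value)`.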

Unlike `raise` (which immediately raises an exception from the current execution point), `throw()` first resumes the generator, and only then raises the exception. The word throw was picked because it is suggestive of putting the exception in another location, and is associated with exceptions in other languages.

What happens when an exception is raised inside the generator? It can be raised explicitly, arise while executing some statement, or be injected at the point of a `yield` statement by means of the `throw()` method. In each case, the exception propagates in the standard manner: it can be intercepted by an `except` or `finally` clause, or otherwise it causes the execution of the generator function to be aborted and propagates to the caller.

For completeness’ sake, it’s worth mentioning that generator iterators also have a `close()` method, which can be used to force a generator that would otherwise be able to provide more values to finish immediately. It allows the generator's `__del__` method to destroy the objects holding the state of the generator.
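One practical consequence of `close()` is that cleanup code in a `finally` clause runs even if the generator is abandoned early. The sketch below uses a hypothetical `resource_user` generator and a plain list to record what happens.

```python
events = []


def resource_user():
    """Hypothetical generator that releases a resource when it finishes."""
    events.append('acquire')
    try:
        while True:
            yield 'working'
    finally:
        # Runs when close() raises GeneratorExit at the paused yield.
        events.append('release')


gen = resource_user()
status = next(gen)
gen.close()

try:
    next(gen)
    finished = False
except StopIteration:
    finished = True    # a closed generator behaves like an exhausted one
```

After `close()`, `events` is `['acquire', 'release']` and any further `next` raises `StopIteration`.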

Let’s define a generator which just prints what is passed in through send and throw.

In [25]:
import itertools
def g():
    print('-- start --')
    for i in itertools.count():
        print(f'-- yielding: {i} --')
        try:
            ans = yield i
        except GeneratorExit:
            print('-- closing --')
            raise
        except Exception as e:
            print(f'-- yield raised {e} --')
        else:
            print(f'-- yield returned {ans} --')

In [26]:
it = g()
next(it)

-- start --
-- yielding: 0 --


0

In [27]:
it.send(11)

-- yield returned 11 --
-- yielding: 1 --


1

In [28]:
it.throw(IndexError)

-- yield raised  --
-- yielding: 2 --


2

In [29]:
it.close()

-- closing --


### 2.1.1.5. Chaining generators

Let’s say we are writing a generator and we want to yield a number of values generated by a second generator, a subgenerator. If yielding of values is the only concern, this can be performed without much difficulty using a loop such as

```python
subgen = some_other_generator()
for v in subgen:
    yield v
```

However, if the subgenerator is to interact properly with the caller in the case of calls to `send()`, `throw()` and `close()`, things become considerably more difficult. The `yield` statement has to be guarded by a `try`..`except`..`finally` structure similar to the one defined in the previous section to “debug” the generator function. Such code is provided in PEP 380; here it suffices to say that new syntax to properly yield from a subgenerator was introduced in Python 3.3:

```python
yield from some_other_generator()
```

This behaves like the explicit loop above, repeatedly yielding values from `some_other_generator` until it is exhausted, but also forwards `send`, `throw` and `close` to the subgenerator.
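The forwarding of `send` can be seen in a small sketch. The generators `inner` and `outer` are hypothetical. The example also relies on the fact that the return value of the subgenerator becomes the value of the `yield from` expression.

```python
def inner():
    """Subgenerator: collects sent values until it receives None."""
    received = []
    while True:
        value = yield len(received)
        if value is None:
            return received    # becomes the value of the yield from expression
        received.append(value)


def outer():
    collected = yield from inner()    # send() calls reach inner() directly
    yield f'inner collected {collected}'


gen = outer()
r0 = next(gen)         # yielded by inner()
r1 = gen.send('a')
r2 = gen.send('b')
r3 = gen.send(None)    # inner() returns; outer() yields the summary
```

The first three results are 0, 1 and 2, all produced by `inner`. The last one is the string built by `outer` from the returned list.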

## 2.1.2. Decorators

Since functions and classes are objects, they can be passed around. Since they are mutable objects, they can be modified. __The act of altering a function or class object after it has been constructed but before it is bound to its name is called decorating__.

There are two things hiding behind the name “decorator” — one is the function which does the work of decorating, i.e. performs the real work, and the other one is the expression adhering to the decorator syntax, i.e. an at-symbol and the name of the decorating function.

Functions can be decorated by using the decorator syntax for functions:

```python
@decorator
def function():
    pass
```

Decorators can be applied to functions and to classes. For classes the semantics are identical — the original class definition is used as an argument to call the decorator and whatever is returned is assigned under the original name.

Before the decorator syntax was implemented (PEP 318), it was possible to achieve the same effect by defining the function or class, then invoking the decorator explicitly and assigning the return value back to the original name. This is more typing, and the name of the decorated function, which doubles as a temporary variable, must be written at least three times, which is prone to errors. Nevertheless, the example above is equivalent to:

```python
def function():
    pass
function = decorator(function)
```


### 2.1.2.1. Replacing or tweaking the original object

Decorators can either return the same function or class object or they can return a completely different object.

In the first case, the decorator can exploit the fact that function and class objects are mutable and add attributes, e.g. add a docstring to a class. A decorator might do something useful even without modifying the object, for example register the decorated class in a global registry.

In the second case, virtually anything is possible: when something different is substituted for the original function or class, the new object can be completely different. Nevertheless, such behaviour is not the purpose of decorators: they are intended to tweak the decorated object, not do something unpredictable. Therefore, when a function is “decorated” by replacing it with a different function, the new function usually calls the original function, after doing some preparatory work. Likewise, when a class is “decorated” by replacing it with a new class, the new class is usually derived from the original class.

When the purpose of the decorator is to do something “every time”, like to log every call to a decorated function, only the second type of decorator can be used. On the other hand, if the first type is sufficient, it is better to use it, because it is simpler.
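The two kinds of decorators can be sketched side by side. All names below (`REGISTRY`, `CALLS`, `register`, `log_calls`, `Plugin`, `add`) are hypothetical and serve only as an illustration.

```python
REGISTRY = {}    # hypothetical global registry of classes
CALLS = []       # hypothetical call log


def register(cls):
    """First kind: record the class and return the very same object."""
    REGISTRY[cls.__name__] = cls
    return cls


def log_calls(function):
    """Second kind: return a new function that calls the original."""
    def wrapper(*args, **kwargs):
        CALLS.append((function.__name__, args, kwargs))
        return function(*args, **kwargs)
    return wrapper


@register
class Plugin:
    pass


@log_calls
def add(a, b):
    return a + b


result = add(1, 2)
```

`register` does its work once, at decoration time, and leaves `Plugin` untouched. `log_calls` has to replace `add` with a wrapper, because its work must happen on every call.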


### 2.1.2.2. Decorators implemented as classes and as functions

The only requirement on decorators is that they can be called with a single argument. This means that decorators can be implemented as normal functions, or as classes with a `__call__` method, or in theory, even as lambda functions.

Let’s compare the function and class approaches. The decorator expression (the part after @) can be either just a name, or a call. The bare-name approach is nice (less to type, looks cleaner, etc.), but is only possible when no arguments are needed to customise the decorator. Decorators written as functions can be used in those two cases:

In [31]:
def simple_decorator(function):
    print('doing decoration')
    return function

@simple_decorator
def function():
    print('inside function')

doing decoration


In [32]:
function()

inside function


In [33]:
def decorator_with_arguments(arg):
    print('defining the decorator')
    def _decorator(function):
        # in this inner function, arg is available too
        print(f'doing decoration {arg}')
        return function
    return _decorator

@decorator_with_arguments('abc')
def function():
    print('inside function')

defining the decorator
doing decoration abc


In [34]:
function()

inside function


The two trivial decorators above fall into the category of decorators which return the original function. If they were to return a new function, an extra level of nesting would be required. In the worst case, this means three levels of nested functions.

In [37]:
def replacing_decorator_with_args(arg):
    print('defining the decorator')
    def _decorator(function):
        # in this inner function, arg is available too
        print(f'doing decoration, {arg}')
        def _wrapper(*args, **kwargs):
            print(f'inside wrapper, {args} {kwargs}')
            return function(*args, **kwargs)
        return _wrapper
    return _decorator

@replacing_decorator_with_args('abc')
def function(*args, **kwargs):
    print(f'inside function, {args} {kwargs}')
    return 14

defining the decorator
doing decoration, abc


In [38]:
function(11, 12)

inside wrapper, (11, 12) {}
inside function, (11, 12) {}


14

The `_wrapper` function is defined to accept all positional and keyword arguments. In general we cannot know what arguments the decorated function is supposed to accept, so the wrapper function just passes everything to the wrapped function. One unfortunate consequence is that the apparent argument list is misleading.
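How misleading the apparent argument list is can be seen with `inspect.signature`. The `passthrough` decorator and the `area` function are hypothetical examples.

```python
import inspect


def passthrough(function):
    def _wrapper(*args, **kwargs):
        return function(*args, **kwargs)
    return _wrapper


def area(width, height=1):
    """Return the area of a rectangle."""
    return width * height


wrapped_area = passthrough(area)

original_sig = str(inspect.signature(area))
wrapped_sig = str(inspect.signature(wrapped_area))
wrapped_name = wrapped_area.__name__
```

The original function reports `(width, height=1)`, while the wrapper reports only `(*args, **kwargs)` and is named `_wrapper`.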

Compared to decorators defined as functions, complex decorators defined as classes are simpler. When an object is created, the `__init__` method is only allowed to return None, and the type of the created object cannot be changed. This means that when a decorator is defined as a class, it doesn’t make much sense to use the argument-less form: the final decorated object would just be an instance of the decorating class, returned by the constructor call, which is not very useful. Therefore it’s enough to discuss class-based decorators where arguments are given in the decorator expression and the decorator `__init__` method is used for decorator construction.

In [45]:
class decorator_class:
    def __init__(self, arg):
        # this method is called in the decorator expression
        print(f'in decorator init, {arg}')
        self.arg = arg
    
    def __call__(self, function):
        # this method is called to do the job
        print(f'in decorator call, {self.arg}')
        return function


In [46]:
deco_instance = decorator_class('foo')

in decorator init, foo


In [47]:
@deco_instance
def function(*args, **kwargs):
    print(f'in function, {args} {kwargs}')

in decorator call, foo


In [48]:
function()

in function, () {}




Contrary to the normal rules (PEP 8), decorators written as classes behave more like functions, and therefore their names often start with a lowercase letter.

In reality, it doesn’t make much sense to create a new class just to have a decorator which returns the original function. Objects are supposed to hold state, and such decorators are more useful when the decorator returns a new object.

In [49]:
class replacing_decorator_class:
    def __init__(self, arg):
        # this method is called in the decorator expression
        print(f'in decorator init, {arg}')
        self.arg = arg
    def __call__(self, function):
        # this method is called to do the job
        print(f'in decorator call, {self.arg}')
        self.function = function
        return self._wrapper
    def _wrapper(self, *args, **kwargs):
        print(f'in the wrapper, {args} {kwargs}')
        return self.function(*args, **kwargs)

In [50]:
deco_instance = replacing_decorator_class('foo')

in decorator init, foo


In [51]:
@deco_instance
def function(*args, **kwargs):
    print(f'in function, {args} {kwargs}')

in decorator call, foo


In [52]:
function(11, 12)

in the wrapper, (11, 12) {}
in function, (11, 12) {}


### 2.1.2.3. Copying the docstring and other attributes of the original function

When a new function is returned by the decorator to replace the original function, an unfortunate consequence is that the original function name, the original docstring and the original argument list are lost. Those attributes of the original function can partially be “transplanted” to the new function by setting `__doc__` (the docstring), `__module__` and `__name__` (the full name of the function), and `__annotations__` (extra information about arguments and the return value of the function available in Python 3). This can be done automatically by using `functools.update_wrapper`.

In [53]:
# update a wrapper function to look like the wrapped function
import functools
def replacing_decorator_with_args(arg):
    print('defining the decorator')
    def _decorator(function):
        print(f'doing decoration, {arg}')
        def _wrapper(*args, **kwargs):
            print(f'inside wrapper, {args} {kwargs}')
            return function(*args, **kwargs)
        return functools.update_wrapper(_wrapper, function)
    return _decorator

In [54]:
@replacing_decorator_with_args('abc')
def function():
    """Extensive documentation
    """
    print('inside function')
    return 14

defining the decorator
doing decoration, abc


In [55]:
function

<function __main__.function()>

In [56]:
function.__doc__

'Extensive documentation\n    '

One important thing is missing from the list of attributes which can be copied to the replacement function: the argument list. The default values for arguments can be modified through the `__defaults__` and `__kwdefaults__` attributes, but unfortunately the argument list itself cannot be set as an attribute. In current Python 3 versions, `functools.update_wrapper` softens this problem by storing the original function in the `__wrapped__` attribute of the wrapper; `inspect.signature`, and with it `help(function)`, follows this attribute and reports the original argument list. Without `__wrapped__`, or for tools which look at the wrapper itself, the displayed argument list is the generic one of the wrapper, which is confusing for the user of the function. An effective but ugly way to give the wrapper itself the right signature is to create it dynamically, using `eval`. This can be automated by using the external `decorator` module. It provides support for the `decorator` decorator, which takes a wrapper and turns it into a decorator which preserves the function signature.

To sum things up, decorators should always use `functools.update_wrapper` or some other means of copying function attributes.
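In practice, `functools.update_wrapper` is most often applied through its decorator form, `functools.wraps`. The sketch below uses a hypothetical `logged` decorator. It also shows that `inspect.signature` follows the `__wrapped__` attribute set by `update_wrapper` unless asked not to.

```python
import functools
import inspect


def logged(function):
    @functools.wraps(function)    # decorator form of update_wrapper(_wrapper, function)
    def _wrapper(*args, **kwargs):
        return function(*args, **kwargs)
    return _wrapper


@logged
def area(width, height=1):
    """Return the area of a rectangle."""
    return width * height


name = area.__name__
doc = area.__doc__
reported_sig = str(inspect.signature(area))
real_sig = str(inspect.signature(area, follow_wrapped=False))
```

The name and docstring are those of the original function, and the reported signature is `(width, height=1)`. With `follow_wrapped=False` the wrapper's own `(*args, **kwargs)` becomes visible again.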

### 2.1.2.4. Examples in the standard library