https://scipy-lectures.org/advanced/advanced_python/index.html

# Iterators, generator expressions and generators

## Iterators

An iterator is an __object__ adhering to the iterator protocol: basically this means\
that it has a `__next__` method, which, when called (usually through the built-in `next` function), returns the next item in the sequence,\
and when there’s nothing to return, raises the `StopIteration` exception.

Calling the `__iter__` method on a container to create an iterator object\
is the most straightforward way to get hold of an iterator. \
The `iter` function does that for us, saving a few keystrokes.

In [3]:
nums = [1, 2, 3]
iter(nums)

<list_iterator at 0x10a89b7f0>

In [4]:
nums.__iter__()

<list_iterator at 0x10a8c9a58>

In [5]:
nums.__reversed__()

<list_reverseiterator at 0x10a8c9828>

In [8]:
it = iter(nums)
next(it)

1

In [9]:
next(it)

2

In [10]:
next(it)

3

In [11]:
next(it)

StopIteration: 

The for..in loop also calls the `__iter__` method. This allows us to transparently start the iteration over a sequence. But if we already have the iterator, we want to be able to use it in a for loop in the same way.

In order to achieve this, iterators, in addition to `__next__`, are also required to have a method called `__iter__` which returns the iterator (self).

In [13]:
f = open('print.py')
f is f.__iter__()

True

The file is an iterator itself and its `__iter__` method doesn’t create a separate object: only a single thread of sequential access is allowed.
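
The two requirements described above (a method returning the next item, spelled `__next__` in Python 3, and an `__iter__` method returning the iterator itself) can be sketched with a small hand-written iterator. The class name `Countdown` is hypothetical and only serves as an illustration.

```python
class Countdown:
    """Iterator counting down from `start` to 1 (illustrative name)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it can be used directly in a for loop.
        return self

    def __next__(self):
        if self.current <= 0:
            # Nothing left to return: signal the end of the iteration.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
first = next(it)          # consume one item by hand
rest = [n for n in it]    # the loop continues where we left off
print(first, rest)        # 3 [2, 1]
```

Because `__iter__` returns `self`, the loop does not restart the countdown: like the file object above, the iterator allows only a single pass.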

## Generator expressions

A second way in which iterator objects are created is through __generator expressions__, the basis for list comprehensions. To increase clarity, a generator expression must always be enclosed in parentheses; when it is the sole argument of a function call, the parentheses of the call are enough. If round parentheses are used, then a generator iterator is created. If square brackets are used, the process is short-circuited and we get a list.

In [14]:
(i for i in nums)   

<generator object <genexpr> at 0x10a927f48>

In [15]:
[i for i in nums]

[1, 2, 3]

In [21]:
list(i for i in nums)

[1, 2, 3]
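
As a small illustration of the rule about parentheses: when a generator expression is the only argument of a function call, the call's own parentheses are sufficient. The sketch below also shows that a generator expression produces its items lazily and can be consumed only once.

```python
nums = [1, 2, 3]

# The call parentheses double as the generator expression's parentheses.
total = sum(i * i for i in nums)

# A generator expression yields items on demand and is exhausted after one pass.
squares = (i * i for i in nums)
first_pass = list(squares)
second_pass = list(squares)   # nothing left to produce

print(total, first_pass, second_pass)   # 14 [1, 4, 9] []
```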

The list comprehension syntax also extends to __dictionary__ and __set comprehensions__. A set is created when the generator expression is enclosed in curly braces. A dict is created when the generator expression contains “pairs” of the form key:value:

In [22]:
{i for i in range(3)}  

{0, 1, 2}

In [23]:
{i:i**2 for i in range(3)}   

{0: 0, 1: 1, 2: 4}

## Generators

A third way to create iterator objects is to call a __generator function__. A generator is a function containing the keyword `yield`. It must be noted that the mere presence of this keyword completely changes the nature of the function: this yield statement doesn’t have to be invoked, or even reachable, but causes the function to be marked as a generator. When a normal function is called, the instructions contained in the body start to be executed. When a generator is called, the execution stops before the first instruction in the body. An invocation of a generator function creates a generator object, adhering to the iterator protocol. As with normal function invocations, concurrent and recursive invocations are allowed.

When `next` is called, the function is executed until the first yield. Each encountered yield statement gives a value that becomes the return value of `next`. After executing the yield statement, the execution of this function is suspended.

In [24]:
def f():
    yield 1
    yield 2
f()

<generator object f at 0x10a946408>

In [25]:
gen = f()
next(gen)

1

In [27]:
next(gen)

2

In [28]:
next(gen)

StopIteration: 

In [34]:
def f():
  print("-- start --")
  yield 3
  print("-- middle --")
  yield 4
  print("-- finished --")
gen = f()

In [35]:
next(gen)

-- start --


3

In [36]:
next(gen)

-- middle --


4

In [37]:
next(gen)   

-- finished --


StopIteration: 

What happens with the function after a yield, when the control passes to the caller? The state of each generator is stored in the generator object. From the point of view of the generator function, it looks almost as if it was running in a separate thread, but this is just an illusion: execution is strictly single-threaded, but the interpreter keeps and restores the state in between the requests for the next value.
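
To make the point about per-object state concrete, here is a minimal sketch with two generator objects created from the same generator function. The name `counter` is made up for this example.

```python
def counter(start):
    n = start
    while True:
        yield n
        n += 1


a = counter(0)
b = counter(100)

# Interleaved calls do not disturb each other: each generator object
# resumes from its own suspended state.
values = [next(a), next(b), next(a), next(b), next(a)]
print(values)   # [0, 100, 1, 101, 2]
```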

## Bidirectional communication

Each yield statement causes a value to be passed to the caller. This is the reason for the introduction of generators by PEP 255 (implemented in Python 2.2). But communication in the reverse direction is also useful. One obvious way would be some external state, either a global variable or a shared mutable object. Direct communication is possible thanks to PEP 342 (implemented in 2.5). It is achieved by turning the previously boring yield statement into an expression. When the generator resumes execution after a yield statement, the caller can call a method on the generator object to either pass a value into the generator, which then is returned by the yield statement, or a different method to inject an exception into the generator.

The first of the new methods is __send(value)__, which is similar to __next()__, but passes value into the generator to be used for the value of the yield expression. In fact, `next(g)` and `g.send(None)` are equivalent.

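The second of the new methods is throw(type, value=None, traceback=None), which is equivalent to:

`raise type, value, traceback`

at the point of the yield statement. This is the Python 2 spelling used in PEP 342; in Python 3 the corresponding statement is roughly `raise type(value).with_traceback(traceback)`, and an exception instance can also be passed to throw() as its single argument.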

Unlike raise (which immediately raises an exception from the current execution point), __throw()__ first resumes the generator, and only then raises the exception. The word throw was picked because it is suggestive of putting the exception in another location, and is associated with exceptions in other languages.

What happens when an exception is raised inside the generator? It can be raised explicitly, or while executing some statement, or it can be injected at the point of a yield statement by means of the throw() method. In each case, such an exception propagates in the standard manner: it can be intercepted by an except or finally clause, or otherwise it causes the execution of the generator function to be aborted and propagates to the caller.

For completeness’ sake, it’s worth mentioning that generator iterators also have a __close()__ method, which can be used to force a generator that would otherwise be able to provide more values to finish immediately. It allows the generator `__del__` method to destroy objects holding the state of the generator.

Let’s define a generator which just prints what is passed in through send and throw.

In [78]:
import itertools
def g():
    print('--start--')
    for i in itertools.count():
        print('--yielding %i--' % i)
        try:
            ans = yield i
        except GeneratorExit:
            print('--closing--')
            raise
        except Exception as e:
            print('--yield raised %r--' % e)
        else:
            print('--yield returned %s--' % ans)

it = g()
next(it)

--start--
--yielding 0--


0

In [79]:
it.send(11)

--yield returned 11--
--yielding 1--


1

In [80]:
it.throw(IndexError)

--yield raised IndexError()--
--yielding 2--


2

In [81]:
it.close()

--closing--
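
Beyond printing what arrives, send() can feed data into a running computation. The following is a minimal sketch of that idea; the generator `running_average` is a made-up example, not part of the original material.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        # The value passed to send() becomes the result of this yield expression.
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)   # advance to the first yield before any value can be sent
results = [avg.send(10), avg.send(20), avg.send(60)]
avg.close()
print(results)   # [10.0, 15.0, 30.0]
```

The initial `next(avg)` is needed because a value can only be sent to a generator that is already suspended at a yield expression.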


## Chaining generators

Let’s say we are writing a generator and we want to yield a number of values generated by a second generator, a subgenerator. If yielding of values is the only concern, this can be performed without much difficulty using a loop such as
```
subgen = some_other_generator()
for v in subgen:
    yield v
```
However, if the subgenerator is to interact properly with the caller in the case of calls to send(), throw() and close(), things become considerably more difficult. The yield statement has to be guarded by a try..except..finally structure similar to the one defined in the previous section to “debug” the generator function. Such code is provided in PEP 380; here it suffices to say that new syntax to properly yield from a subgenerator was introduced in Python 3.3:

```
yield from some_other_generator()
```
This behaves like the explicit loop above, repeatedly yielding values from some_other_generator until it is exhausted, but also forwards send, throw and close to the subgenerator.

In [83]:
def g2():
    yield from g()

it = g2()

In [84]:
next(it)

--start--
--yielding 0--


0

In [85]:
it.send(4)

--yield returned 4--
--yielding 1--


1

In [86]:
it.throw(IndexError)

--yield raised IndexError()--
--yielding 2--


2

In [87]:
it.close()

--closing--
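
When only values need to be passed along, `yield from` also combines naturally with recursive invocations of a generator function. A minimal sketch, with `flatten` as a hypothetical helper name:

```python
def flatten(items):
    for item in items:
        if isinstance(item, list):
            # Delegate to a recursive invocation of the same generator function.
            yield from flatten(item)
        else:
            yield item


flat = list(flatten([1, [2, [3, 4]], 5]))
print(flat)   # [1, 2, 3, 4, 5]
```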


# Decorators

Since functions and classes are objects, they can be passed around. Since they are mutable objects, they can be modified. The act of altering a function or class object after it has been constructed but before it is bound to its name is called decorating.

```
@decorator             # ②
def function():        # ①
    pass
```

A function is defined in the standard way. ①
An expression starting with @ placed before the function definition is the decorator ②. The part after @ is usually just the name of a function or class, possibly followed by a call (before Python 3.9 the grammar required such a simple expression; later versions accept any expression). This part is evaluated first, and after the function defined below is ready, the decorator is called with the newly defined function object as the single argument. The value returned by the decorator is attached to the original name of the function.

Same without special syntax:
```
def function():                  # ①
    pass
function = decorator(function)   # ②
```
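
The equivalence of the two spellings can be checked directly. In this sketch `shout` is a hypothetical decorator, used once with the @ syntax and once through explicit reassignment.

```python
def shout(function):
    """Hypothetical decorator: upper-case the result of the wrapped function."""
    def wrapper():
        return function().upper()
    return wrapper


@shout
def greet_decorated():
    return "hello"


def greet_plain():
    return "hello"
greet_plain = shout(greet_plain)   # same effect, without the @ syntax

print(greet_decorated(), greet_plain())   # HELLO HELLO
```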

## Decorators implemented as classes and as functions

The only requirement on decorators is that they can be called with a single argument. This means that decorators can be implemented as normal functions, or as classes with a `__call__` method, or in theory, even as lambda functions.

In [90]:
def simple_decorator(function):
  print("doing decoration")
  return function

@simple_decorator
def function():
  print("inside function")

doing decoration


In [91]:
function()

inside function


In [93]:
def decorator_with_arguments(arg):
  print("defining the decorator")
  def _decorator(function):
      # in this inner function, arg is available too
      print("doing decoration, %r" % arg)
      return function
  return _decorator

@decorator_with_arguments("abc")
def function():
  print("inside function")

defining the decorator
doing decoration, 'abc'


In [94]:
function()

inside function


In [95]:
def replacing_decorator_with_args(arg):
  print("defining the decorator")
  def _decorator(function):
      # in this inner function, arg is available too
      print("doing decoration, %r" % arg)
      def _wrapper(*args, **kwargs):
          print("inside wrapper, %r %r" % (args, kwargs))
          return function(*args, **kwargs)
      return _wrapper
  return _decorator
@replacing_decorator_with_args("abc")
def function(*args, **kwargs):
    print("inside function, %r %r" % (args, kwargs))
    return 14

defining the decorator
doing decoration, 'abc'


In [98]:
function(11, 12)

inside wrapper, (11, 12) {}
inside function, (11, 12) {}


14

### Class-based decorators

In [99]:
class decorator_class(object):
  def __init__(self, arg):
      # this method is called in the decorator expression
      print("in decorator init, %s" % arg)
      self.arg = arg
  def __call__(self, function):
      # this method is called to do the job
      print("in decorator call, %s" % self.arg)
      return function
deco_instance = decorator_class('foo')

in decorator init, foo


In [100]:
@deco_instance
def function(*args, **kwargs):
  print("in function, %s %s" % (args, kwargs))

in decorator call, foo


In [101]:
function()

in function, () {}


In [102]:
class replacing_decorator_class(object):
  def __init__(self, arg):
      # this method is called in the decorator expression
      print("in decorator init, %s" % arg)
      self.arg = arg
  def __call__(self, function):
      # this method is called to do the job
      print("in decorator call, %s" % self.arg)
      self.function = function
      return self._wrapper
  def _wrapper(self, *args, **kwargs):
      print("in the wrapper, %s %s" % (args, kwargs))
      return self.function(*args, **kwargs)
deco_instance = replacing_decorator_class('foo')

in decorator init, foo


In [110]:
@deco_instance
def function(*args, **kwargs):
  print("in function, %s %s" % (args, kwargs))

in decorator call, foo


In [111]:
function(11, 12)

in the wrapper, (11, 12) {}
in function, (11, 12) {}


## Copying the docstring and other attributes of the original function

When a new function is returned by the decorator to replace the original function, an unfortunate consequence is that the original function name, the original docstring, and the original argument list are lost. Those attributes of the original function can partially be “transplanted” to the new function by setting `__doc__` (the docstring), `__module__` and `__name__` (the full name of the function), and `__annotations__` (extra information about arguments and the return value of the function, available in Python 3). This can be done automatically by using `functools.update_wrapper`.

In [106]:
import functools

def replacing_decorator_with_args(arg):
  print("defining the decorator")
  def _decorator(function):
      print("doing decoration, %r" % arg)
      def _wrapper(*args, **kwargs):
          print("inside wrapper, %r %r" % (args, kwargs))
          return function(*args, **kwargs)
      return functools.update_wrapper(_wrapper, function)
  return _decorator

@replacing_decorator_with_args("abc")
def function():
    "extensive documentation"
    print("inside function")
    return 14

defining the decorator
doing decoration, 'abc'


In [107]:
function    

<function __main__.function()>

In [108]:
print(function.__doc__)

extensive documentation


To sum things up, decorators should always use `functools.update_wrapper` or some other means of copying function attributes.
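
One such means is `functools.wraps`, the decorator form of `functools.update_wrapper` from the standard library. A minimal sketch, where `logged` and `add` are made-up names:

```python
import functools


def logged(function):
    @functools.wraps(function)   # copies __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        print("calling", function.__name__)
        return function(*args, **kwargs)
    return wrapper


@logged
def add(a, b):
    "Return the sum of a and b."
    return a + b


print(add.__name__, "|", add.__doc__)
```

Writing `@functools.wraps(function)` above the wrapper has the same effect as returning `functools.update_wrapper(wrapper, function)`.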

## Examples in the standard library

### `classmethod`
`classmethod` causes a method to become a “class method”, which means that it can be invoked without creating an instance of the class. When a normal method is invoked, the interpreter inserts the instance object as the first positional parameter, self. When a class method is invoked, the class itself is given as the first parameter, often called cls.

Class methods are still accessible through the class’ namespace, so they don’t pollute the module’s namespace. Class methods can be used to provide alternative constructors:

In [114]:
import numpy

# This is cleaner than using a multitude of flags to __init__
class Array(object):
    def __init__(self, data):
        self.data = data

    @classmethod
    def fromfile(cls, file):
        data = numpy.load(file)
        return cls(data)
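
The same pattern can be tried without NumPy. The following sketch uses hypothetical classes `Temperature` and `LabTemperature` to show that `cls` is whichever class the method was called on, so subclasses get instances of their own type.

```python
class Temperature:
    def __init__(self, kelvin):
        self.kelvin = kelvin

    @classmethod
    def from_celsius(cls, celsius):
        # cls is the class the method was invoked on, not necessarily Temperature.
        return cls(celsius + 273.15)


class LabTemperature(Temperature):
    pass


t = Temperature.from_celsius(0)
lab = LabTemperature.from_celsius(100)
print(t.kelvin, type(lab).__name__)   # 273.15 LabTemperature
```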

### `staticmethod`

`staticmethod` is applied to methods to make them “static”, i.e. basically a normal function, but accessible through the class namespace. This can be useful when the function is only needed inside this class (its name would then be prefixed with _), or when we want the user to think of the method as connected to the class, despite an implementation which doesn’t require this.
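
A minimal sketch of the first use case, a helper that is only needed inside the class. `Circle` and `_square` are made-up names.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    @staticmethod
    def _square(x):
        # No self or cls is inserted: this is a plain function
        # that lives in the class namespace.
        return x * x

    def area(self):
        return math.pi * self._square(self.radius)


print(Circle._square(4))   # 16, callable through the class
print(Circle(2).area())    # also callable through an instance
```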

### `property`

`property` is the Pythonic answer to the problem of getters and setters. A method decorated with `property` becomes a getter which is automatically called on attribute access.

In [115]:
class A(object):
  @property
  def a(self):
    "an important attribute"
    return "a value"
A.a     

<property at 0x10bb7c188>

In [116]:
A().a

'a value'

In [117]:
class Rectangle(object):
    def __init__(self, edge):
        self.edge = edge

    @property
    def area(self):
        """Computed area.

        Setting this updates the edge length to the proper value.
        """
        return self.edge**2

    @area.setter
    def area(self, area):
        self.edge = area ** 0.5

In [129]:
r = Rectangle(3)
r.area

9

In [132]:
r.area = 25
r.edge

5.0

This works as follows: the property decorator replaces the getter method with a property object. This object in turn has three methods, getter, setter, and deleter, which can be used as decorators. Their job is to set the getter, setter and deleter of the property object (stored as attributes fget, fset, and fdel). The getter can be set like in the example above, when creating the object. When defining the setter, we already have the property object under area, and we add the setter to it by using the setter method. All this happens when we are creating the class.

In [118]:
class D(object):
   @property
   def a(self):
     print("getting 1")
     return 1
   @a.setter
   def a(self, value):
     print("setting %r" % value)
   @a.deleter
   def a(self):
     print("deleting")
D.a                

<property at 0x10bb7ca48>

In [119]:
D.a.fget       

<function __main__.D.a(self)>

In [120]:
D.a.fset  

<function __main__.D.a(self, value)>

In [121]:
D.a.fdel

<function __main__.D.a(self)>

In [123]:
d = D()
d.a

getting 1


1

In [124]:
d.a = 2

setting 2


In [126]:
del d.a

deleting


In [127]:
d.a

getting 1


1

## Deprecation of functions

In [150]:
class deprecated(object):
    """Print a deprecation warning once on first use of the function.

    >>> @deprecated()                    # doctest: +SKIP
    ... def f():
    ...     pass
    >>> f()                              # doctest: +SKIP
    f is deprecated
    """
    def __call__(self, func):
        self.func = func
        self.count = 0
        return self._wrapper
    def _wrapper(self, *args, **kwargs):
        self.count += 1
        if self.count == 1:
            print(self.func.__name__, 'is deprecated')
        return self.func(*args, **kwargs)

In [158]:
def deprecated(func):
    """Print a deprecation warning once on first use of the function.

    >>> @deprecated                      # doctest: +SKIP
    ... def f():
    ...     pass
    >>> f()                              # doctest: +SKIP
    f is deprecated
    """
    count = [0]
    def wrapper(*args, **kwargs):
        count[0] += 1
        if count[0] == 1:
            print(func.__name__, 'is deprecated')
        return func(*args, **kwargs)
    return wrapper

In [159]:
@deprecated
def f():
    print('hi')

f()
f()
f()

f is deprecated
hi
hi
hi
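
The examples above report the deprecation with print. As a variant (a different technique, not what the code above does), the message can be routed through the standard `warnings` module so that callers can filter or record it. This is only a sketch and `old_api` is a hypothetical function.

```python
import functools
import warnings


def deprecated(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # stacklevel=2 attributes the warning to the caller of the function.
        warnings.warn("%s is deprecated" % func.__name__,
                      DeprecationWarning, stacklevel=2)
        return func(*args, **kwargs)
    return wrapper


@deprecated
def old_api():
    return 42


# Record warnings so that the example can show what was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()

print(result, caught[0].category.__name__, caught[0].message)
```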


## A while-loop removing decorator

Let’s say we have a function which returns a list of things, and this list is created by running a loop. If we don’t know how many objects will be needed, the standard way to do this is something like:

In [160]:
def find_answers():
    answers = []
    while True:
        ans = look_for_next_answer()
        if ans is None:
            break
        answers.append(ans)
    return answers

This is fine, as long as the body of the loop is fairly compact. Once it becomes more complicated, as often happens in real code, this becomes pretty unreadable. We could simplify this by using yield statements, but then the user would have to explicitly call list(find_answers()).

We can define a decorator which constructs the list for us:

In [161]:
def vectorized(generator_func):
    def wrapper(*args, **kwargs):
        return list(generator_func(*args, **kwargs))
    return functools.update_wrapper(wrapper, generator_func)

In [162]:
@vectorized
def find_answers():
    while True:
        ans = look_for_next_answer()
        if ans is None:
            break
        yield ans
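
To try the decorator out, `look_for_next_answer` needs a concrete definition. The sketch below uses a hypothetical stand-in that hands out a few queued values and then returns None; everything else follows the code above.

```python
import functools


def vectorized(generator_func):
    def wrapper(*args, **kwargs):
        return list(generator_func(*args, **kwargs))
    return functools.update_wrapper(wrapper, generator_func)


# Hypothetical stand-in: returns queued values, then None.
_pending = [3, 1, 4]

def look_for_next_answer():
    return _pending.pop(0) if _pending else None


@vectorized
def find_answers():
    while True:
        ans = look_for_next_answer()
        if ans is None:
            break
        yield ans


answers = find_answers()
print(answers)   # [3, 1, 4]
```

The caller receives a plain list and does not need to know that the function is written as a generator.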

## A plugin registration system

In [163]:
class WordProcessor(object):
    PLUGINS = []
    def process(self, text):
        for plugin in self.PLUGINS:
            text = plugin().cleanup(text)
        return text

    @classmethod
    def plugin(cls, plugin):
        cls.PLUGINS.append(plugin)
        # Return the class so the decorated name stays bound to it, not to None.
        return plugin

@WordProcessor.plugin
class CleanMdashesExtension(object):
    def cleanup(self, text):
        return text.replace('&mdash;', u'\N{em dash}')
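
A short usage sketch of the registration system. It repeats the classes so that it is self-contained, and the registration method returns the plugin so that the decorated name stays bound to the class. `StripWhitespaceExtension` is a hypothetical second plugin added for illustration.

```python
class WordProcessor:
    PLUGINS = []

    def process(self, text):
        # Apply every registered plugin in registration order.
        for plugin in self.PLUGINS:
            text = plugin().cleanup(text)
        return text

    @classmethod
    def plugin(cls, plugin):
        cls.PLUGINS.append(plugin)
        return plugin


@WordProcessor.plugin
class CleanMdashesExtension:
    def cleanup(self, text):
        return text.replace('&mdash;', '\N{EM DASH}')


@WordProcessor.plugin
class StripWhitespaceExtension:   # hypothetical second plugin
    def cleanup(self, text):
        return text.strip()


processed = WordProcessor().process('  first&mdash;second  ')
print(repr(processed))
```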

# Context managers

A context manager is an object with `__enter__` and `__exit__` methods which can be used in the with statement:
```
with manager as var:
    do_something(var)
```
is in the simplest case equivalent to
```
var = manager.__enter__()
try:
    do_something(var)
finally:
    manager.__exit__(None, None, None)  # exception details replace the Nones if the body raised
```
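
A minimal sketch of an object implementing these two methods. `Tracker` is a made-up class that records the order in which things happen.

```python
class Tracker:
    """Hypothetical context manager recording the order of events."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self   # this value is bound to the name after `as`

    def __exit__(self, exc_type, exc_value, traceback):
        # Called on the way out, whether or not the body raised.
        self.events.append("exit")
        return False  # do not suppress exceptions


tracker = Tracker()
with tracker as t:
    t.events.append("body")

print(tracker.events)   # ['enter', 'body', 'exit']
```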