
## Classes and Class Creation

#### Behind the scenes: Relationship between Class and type

In this chapter of our tutorial, we will provide you with a deeper insight into the magic happening behind the scenes when we define a class or create an instance of a class. You may ask yourself: "Do I really have to learn these additional details on object-oriented programming in Python?" Most probably not, unless you belong to the few people who design classes at a very advanced level.

First, we will concentrate on the relationship between type and class. While defining classes so far, you may have asked yourself what is happening "behind the lines". We have already seen that applying "type" to an object returns the class of which the object is an instance:

In [1]:
x = [4, 5, 9]
y = "Hello"
print(type(x), type(y))

<class 'list'> <class 'str'>


If you apply type on the name of a class itself, you get the class "type" returned:

In [2]:
print(type(list), type(str))

<class 'type'> <class 'type'>


**A user-defined class (or the class "object") is an instance of the class "type". So we can see that classes are created from type. In Python 3 there is no difference between "classes" and "types". They are in most cases used as synonyms.**

The fact that classes are instances of the class "type" allows us to program metaclasses. We can create classes which inherit from the class "type". So, a metaclass is a subclass of the class "type".

Instead of only one argument, type can also be called with three arguments:

```python
type(classname, superclasses, attributes_dict)
```

If type is called with three arguments, it will return a new type object. This provides us with a dynamic form of the class statement.

* "classname" is a string defining the class name; it becomes the \_\_name__ attribute;

* "superclasses" is a tuple with the superclasses of our class (a list is not accepted here). This tuple becomes the \_\_bases__ attribute;

* the attributes_dict is a dictionary, functioning as the namespace of our class. It contains the definitions for the class body, and its contents become the \_\_dict__ attribute.
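As a small sketch of how the three arguments end up on the new class, the following example builds a class dynamically and inspects it. The names Base, Point and describe are made up for this illustration:

```python
# Hypothetical names: Base, Point and describe exist only for this sketch.
class Base:
    pass

def describe(self):
    return "Point at " + str(self.x)

# classname, superclasses (a tuple), attributes_dict
Point = type("Point", (Base,), {"x": 0, "describe": describe})

print(Point.__name__)             # the first argument
print(Point.__bases__ == (Base,)) # the second argument
print("x" in Point.__dict__)      # entries of the third argument
print(Point().describe())
```

Note that \_\_dict__ of a class is a read-only view, so the dictionary we passed in is copied rather than stored as is.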

Let's have a look at a simple class definition:

In [4]:
class A:
    pass

x = A()
print(type(x))

<class '__main__.A'>


We can use "type" for the previous class definition as well:

In [5]:
A = type("A", (), {})
x = A()
print(type(x))

<class '__main__.A'>


Generally speaking, this means, that we can define a class A with

```python
type(classname, superclasses, attributedict)
```

When we call "type", the \_\_call__ method of type is called. The \_\_call__ method runs two other methods, \_\_new__ and \_\_init__:

```python
type.__new__(typeclass, classname, superclasses, attributedict)
type.__init__(cls, classname, superclasses, attributedict)
```

The \_\_new__ method creates and returns the new class object, and after this, the \_\_init__ method initializes the newly created object ([more](https://www.geeksforgeeks.org/__new__-in-python/) and [more](https://howto.lintel.in/python-__new__-magic-method-explained/) about \_\_new__ in Python):

In [7]:
class Robot:

    counter = 0

    def __init__(self, name):
        self.name = name

    def sayHello(self):
        return "Hi, I am " + self.name


def Rob_init(self, name):
    self.name = name

Robot2 = type("Robot2", 
              (), 
              {"counter":0, 
               "__init__": Rob_init,
               "sayHello": lambda self: "Hi, I am " + self.name})

In [8]:
x = Robot2("Marvin")
print(x.name)
print(x.sayHello())

Marvin
Hi, I am Marvin


In [10]:
y = Robot("Marvin")
print(y.name)
print(y.sayHello())

Marvin
Hi, I am Marvin


In [11]:
print(x.__dict__)
print(y.__dict__)

{'name': 'Marvin'}
{'name': 'Marvin'}


The class definitions for Robot and Robot2 are syntactically completely different, but they implement logically the same class.

What Python actually does in the first example, i.e. the "usual way" of defining classes, is the following: Python processes the complete class statement of Robot, collects its methods and attributes, and adds them to the attributes_dict of the type call. So Python will call type in a similar or the same way as we did for Robot2.
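The processing of a class statement can be imitated by hand. The following is only a rough, simplified sketch: it executes the class body as ordinary code to fill a namespace dictionary and then passes that dictionary to type. The real machinery does more (for example, it also sets \_\_module__ and \_\_qualname__), and the name Robot3 is made up for this illustration:

```python
# Simplified model of a class statement; Robot3 is a hypothetical name.
class_body = '''
counter = 0

def __init__(self, name):
    self.name = name

def sayHello(self):
    return "Hi, I am " + self.name
'''

namespace = {}
# run the body; every name it defines lands in "namespace"
exec(class_body, {}, namespace)

# hand the collected namespace to type, just like for Robot2
Robot3 = type("Robot3", (), namespace)

z = Robot3("Marvin")
print(z.sayHello())
print(sorted(namespace))
```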

## On the Road to Metaclasses

#### Motivation for Metaclasses

In this chapter of our tutorial we want to provide some incentives or motivation for the use of metaclasses. To demonstrate some design problems which can be solved by metaclasses, we will introduce and design a bunch of philosopher classes. Each philosopher class (Philosopher1, Philosopher2, and so on) needs the same "set" of methods (in our example just one, i.e. "the_answer") as the basics for his or her pondering and brooding. A naive way to implement the classes would be to have the same code in every philosopher class:

In [12]:
class Philosopher1(object): 
    
    def the_answer(self, *args):              
        return 42

class Philosopher2(object): 

    def the_answer(self, *args):              
        return 42

class Philosopher3: 

    def the_answer(self, *args):              
        return 42

plato = Philosopher1()
print(plato.the_answer())

kant = Philosopher2()
# let's see what Kant has to say :-)
print(kant.the_answer())

42
42


We can see that we have multiple copies of the method "the_answer". This is error prone and tedious to maintain, of course.

From what we know so far, the easiest way to accomplish our goal without creating redundant code would be designing a base class which contains "the_answer" as a method. Now each Philosopher class inherits from this base class:

In [14]:
class Answers(object):

    def the_answer(self, *args):              
        return 42
    
class Philosopher1(Answers): 
    pass

class Philosopher2(Answers): 
    pass

class Philosopher3(Answers): 
    pass

plato = Philosopher1()
print(plato.the_answer())

kant = Philosopher2()
# let's see what Kant has to say :-)
print(kant.the_answer())


42
42


The way we have designed our classes, each Philosopher class will always have a method "the_answer". Let's assume we don't know a priori whether we want or need this method. Let's assume that the decision whether the classes have to be augmented can only be made at runtime. This decision might depend on configuration files, user input or some calculations.

In [18]:
# the following variable would be set as the result of a runtime calculation:
x = input("Do you need 'the answer'? (y/n): ")
if x=="y":
    required = True
else:
    required = False

def the_answer(self, *args):              
        return 42

class Philosopher1(object): 
    pass
if required:
    Philosopher1.the_answer = the_answer

class Philosopher2(object): 
    pass
if required:
    Philosopher2.the_answer = the_answer

class Philosopher3(object): 
    pass
if required:
    Philosopher3.the_answer = the_answer


plato = Philosopher1()
kant = Philosopher2()

# let's see what Plato and Kant have to say :-)
if required:
    print(kant.the_answer())
    print(plato.the_answer())
else:
    print("The silence of the philosophers")

Do you need 'the answer'? (y/n): y
42
42


Even though this is another solution to our problem, there are still some serious drawbacks. It's error-prone, because we have to add the same code to every class and it seems likely that we might forget it. Besides, it becomes hard to manage and maybe even confusing if there are many methods we want to add.

We can improve our approach by defining a manager function and avoiding redundant code this way. The manager function will be used to augment the classes conditionally:

In [20]:
# the following variable would be set as the result of a runtime calculation:
x = input("Do you need 'the answer'? (y/n): ")
if x=="y":
    required = True
else:
    required = False
    
    
def the_answer(self, *args):              
        return 42

# manager function
def augment_answer(cls):                      
    if required:
        cls.the_answer = the_answer
        
    
class Philosopher1(object): 
    pass
augment_answer(Philosopher1)

class Philosopher2(object): 
    pass
augment_answer(Philosopher2)

class Philosopher3(object): 
    pass
augment_answer(Philosopher3)
    
    
plato = Philosopher1()
kant = Philosopher2()

# let's see what Plato and Kant have to say :-)
if required:
    print(kant.the_answer())
    print(plato.the_answer())
else:
    print("The silence of the philosophers")

Do you need 'the answer'? (y/n): y
42
42


This again solves our problem, but we, i.e. the class designers, must be careful not to forget to call the manager function "augment_answer". The code should be executed automatically. We need a way to make sure that "some" code is executed automatically after the end of a class definition. Here is the same code as above, written with a class decorator:

In [22]:
# the following variable would be set as the result of a runtime calculation:
x = input("Do you need 'the answer'? (y/n): ")
if x=="y":
    required = True
else:
    required = False
def the_answer(self, *args):              
        return 42

def augment_answer(cls):                      
    if required:
        cls.the_answer = the_answer
    # we have to return the class now:
    return cls
        
@augment_answer
class Philosopher1(object): 
    pass

@augment_answer
class Philosopher2(object): 
    pass

@augment_answer
class Philosopher3(object): 
    pass
    
plato = Philosopher1()
kant = Philosopher2()
    
# let's see what Plato and Kant have to say :-)
if required:
    print(kant.the_answer())
    print(plato.the_answer())
else:
    print("The silence of the philosophers")

Do you need 'the answer'? (y/n): y
42
42


## Now let's do it with metaclasses

A metaclass is a class whose instances are classes. Just as an "ordinary" class defines the behavior of the instances of the class, a metaclass defines the behavior of classes and their instances.

Metaclasses are not supported by every object-oriented programming language. The programming languages which support metaclasses vary considerably in the way they implement them. Python supports them.

Some programmers see metaclasses in Python as "solutions waiting or looking for a problem".

There are numerous use cases for metaclasses. Just to name a few:

* logging and profiling
* interface checking
* registering classes at creation time
* automatically adding new methods
* automatic property creation
* [proxies](https://refactoring.guru/design-patterns/proxy/python/example)
* automatic resource locking/synchronization
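To make one item of this list concrete, here is a minimal sketch of registering classes at creation time. The names ClassRegistry, CsvReader and JsonReader are hypothetical and only serve the illustration:

```python
class ClassRegistry(type):
    """Hypothetical metaclass that records every class it creates."""
    registry = {}

    def __init__(cls, clsname, superclasses, attributedict):
        super().__init__(clsname, superclasses, attributedict)
        # runs once per class statement, right after the class exists
        ClassRegistry.registry[clsname] = cls


class CsvReader(metaclass=ClassRegistry):
    pass


class JsonReader(metaclass=ClassRegistry):
    pass


print(sorted(ClassRegistry.registry))
```

No explicit registration call is needed after the class definitions. That is the kind of automatic execution the manager function and the decorator could not guarantee on their own.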

#### Defining Metaclasses

In principle, metaclasses are defined like any other Python class, but they are classes that inherit from "type". Another difference is that a metaclass is called automatically when the class statement using it ends. In other words: if no "metaclass" keyword is passed after the base classes (there may be no base classes either) in the class header, and none of the base classes has a metaclass of its own, type() (i.e. \_\_call__ of type) will be called. If a metaclass keyword is used, on the other hand, the class assigned to it will be called instead of type.

Now we create a very simple metaclass. It's good for nothing, except that it prints the content of its arguments in the \_\_new__ method and returns the result of the type.\_\_new__ call:

In [58]:
class LittleMeta(type):
    def __new__(cls, clsname, superclasses, attributedict):
        print(cls)
        print("clsname: ", clsname)
        print("superclasses: ", superclasses)
        print("attributedict: ", attributedict)
        return type.__new__(cls, clsname, superclasses, attributedict)

We will use the metaclass "LittleMeta" in the following example:

In [59]:
class S:
    pass

class A(S, metaclass=LittleMeta):
    pass


# A.__metaclass__ = LittleMeta

<class '__main__.LittleMeta'>
clsname:  A
superclasses:  (<class '__main__.S'>,)
attributedict:  {'__module__': '__main__', '__qualname__': 'A'}


In [97]:
a = A()

We can see LittleMeta.\_\_new__ has been called and not type.\_\_new__.

Resuming our thread from the last chapter: we define a metaclass EssentialAnswers which is capable of automatically adding our the_answer method:

In [65]:
x = input("Do you need the answer? (y/n): ")
if x.lower() == "y":
    required = True
else:
    required = False

    
def the_answer(self, *args):              
        return 42

    
class EssentialAnswers(type):
    
    def __init__(cls, clsname, superclasses, attributedict):
        if required:
            cls.the_answer = the_answer
                           
    
class Philosopher1(metaclass=EssentialAnswers): 
    pass


class Philosopher2(metaclass=EssentialAnswers): 
    pass


class Philosopher3(metaclass=EssentialAnswers): 
    pass
    
    
plato = Philosopher1()
print(plato.the_answer())


kant = Philosopher2()
# let's see what Kant has to say :-)
print(kant.the_answer())

Do you need the answer? (y/n): y
42
42


We have learned in our chapter "Type and Class Relationship" that after the class definition has been processed, Python calls

```python
type(classname, superclasses, attributes_dict)
```

This is not the case if a metaclass has been declared in the header. That is what we have done in our previous example. Our classes Philosopher1, Philosopher2 and Philosopher3 have been hooked to the metaclass EssentialAnswers. That's why EssentialAnswers will be called instead of type:

```python
EssentialAnswers(classname, superclasses, attributes_dict)
```

To be precise, the arguments of the call will be set to the following values:

```python
EssentialAnswers('Philosopher1', 
                 (), 
                 {'__module__': '__main__', '__qualname__': 'Philosopher1'})
```

The other philosopher classes are treated in an analogous way.

#### Creating Singletons using Metaclasses

The singleton pattern is a design pattern that restricts the instantiation of a class to one object. It is used in cases where exactly one object is needed. The concept can be generalized to restrict the instantiation to a certain or fixed number of objects. The term stems from mathematics, where a singleton (also called a unit set) is a set with exactly one element.

In [180]:
class Singleton(type):
    _instances = {}
    _x = 7

    def __new__(cls, clsname, superclasses, attributedict):
        print('--------------------------- NEW of meta is called', cls)
        return type.__new__(cls, clsname, superclasses, attributedict)

    def __init__(cls, clsname, superclasses, attributedict):
        print('--------------------------- INIT of meta is called', cls)

    def __call__(cls, *args, **kwargs):
        print('--------------------------- CALL of meta is called', cls)
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]


class SingletonClass(metaclass=Singleton):
    def __init__(self):
        print('--------------------------- INIT of general is called')
    pass


class RegularClass():
    pass

--------------------------- NEW of meta is called <class '__main__.Singleton'>
--------------------------- INIT of meta is called <class '__main__.SingletonClass'>


In [102]:
x = SingletonClass()
y = SingletonClass()
print(x == y)

x = RegularClass()
y = RegularClass()
print(x == y)

--------------------------- CALL of meta is called <class '__main__.SingletonClass'>
--------------------------- INIT of general is called
--------------------------- CALL of meta is called <class '__main__.SingletonClass'>
True
False


#### Creating Singletons without metaclass

In [175]:
class Singleton(object):
    _instance = None
    def __new__(cls, *args, **kwargs):
        # object.__new__ accepts no extra arguments, so pass only cls
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    
class SingletonClass(Singleton):
    pass

class RegularClass():
    pass


x = SingletonClass()
y = SingletonClass()
print(x == y)


x = RegularClass()
y = RegularClass()
print(x == y)

True
False


[More use cases of metaclasses](https://medium.com/@dan.gittik/metaphysics-2036b38fa711)

[About \_\_prepare__ method of metaclasses](https://stackoverflow.com/questions/46827708/metaclass-and-prepare)

## Count Method Calls Using a Metaclass

#### Introduction

You may have asked yourself about the possible use cases for metaclasses. There are some interesting use cases, and metaclasses are not, as some say, a solution waiting for a problem. We have already mentioned some examples.

In this chapter of our tutorial on Python, we want to elaborate an example metaclass which decorates the methods of every class that uses it. The decorated function returned by the decorator makes it possible to count the number of times each method of the class has been called.

This is usually one of the tasks, we expect from a profiler. So we can use this metaclass for simple profiling purposes. Of course, it will be easy to extend our metaclass for further profiling tasks.

#### Preliminary Remarks

Before we actually dive into the problem, we want to recall how we can access the attributes of a class or a module. We will demonstrate this with the random module. We can get the list of all its attributes whose names do not start with a double underscore with the following construct:


In [31]:
import random
cls = "random"  # name of the module as a string
all_attributes = [x for x in dir(eval(cls)) if not x.startswith("__") ]
print(all_attributes)

['BPF', 'LOG4', 'NV_MAGICCONST', 'RECIP_BPF', 'Random', 'SG_MAGICCONST', 'SystemRandom', 'TWOPI', '_BuiltinMethodType', '_MethodType', '_Sequence', '_Set', '_acos', '_bisect', '_ceil', '_cos', '_e', '_exp', '_inst', '_itertools', '_log', '_os', '_pi', '_random', '_sha512', '_sin', '_sqrt', '_test', '_test_generator', '_urandom', '_warn', 'betavariate', 'choice', 'choices', 'expovariate', 'gammavariate', 'gauss', 'getrandbits', 'getstate', 'lognormvariate', 'normalvariate', 'paretovariate', 'randint', 'random', 'randrange', 'sample', 'seed', 'setstate', 'shuffle', 'triangular', 'uniform', 'vonmisesvariate', 'weibullvariate']


Now we filter for the callable attributes, i.e. the functions and classes of the module:

In [32]:
methods = [x for x in dir(eval(cls)) if not x.startswith("__") 
                              and callable(eval(cls + "." + x))]
print(methods)

['Random', 'SystemRandom', '_BuiltinMethodType', '_MethodType', '_Sequence', '_Set', '_acos', '_ceil', '_cos', '_exp', '_log', '_sha512', '_sin', '_sqrt', '_test', '_test_generator', '_urandom', '_warn', 'betavariate', 'choice', 'choices', 'expovariate', 'gammavariate', 'gauss', 'getrandbits', 'getstate', 'lognormvariate', 'normalvariate', 'paretovariate', 'randint', 'random', 'randrange', 'sample', 'seed', 'setstate', 'shuffle', 'triangular', 'uniform', 'vonmisesvariate', 'weibullvariate']


Getting the non-callable attributes of the module can easily be achieved by negating callable, i.e. adding "not":

In [33]:
non_callable_attributes = [x for x in dir(eval(cls)) if not x.startswith("__") 
                              and not callable(eval(cls + "." + x))]
print(non_callable_attributes)

['BPF', 'LOG4', 'NV_MAGICCONST', 'RECIP_BPF', 'SG_MAGICCONST', 'TWOPI', '_bisect', '_e', '_inst', '_itertools', '_os', '_pi', '_random']
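The same filtering can be written without eval by using the built-in getattr. This is only a sketch of an alternative: the variable names module and public_callables are made up here, and the filter is stricter than the one above because it skips every name starting with a single underscore:

```python
import random

module = random  # work with the object itself instead of its name

# names without a leading underscore whose values can be called
public_callables = [name for name in dir(module)
                    if not name.startswith("_")
                    and callable(getattr(module, name))]

print("randint" in public_callables)
print("TWOPI" in public_callables)   # a plain float constant, not callable
```

Avoiding eval is usually preferable, because getattr only looks up an attribute, whereas eval executes arbitrary code.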


In normal Python programming it is neither recommended nor necessary to apply methods in the following way, but it is possible:

In [34]:
lst = [3,4]
list.__dict__["append"](lst, 42)
lst

[3, 4, 42]

#### Please note the remark from the Python documentation:

"Because dir() is supplied primarily as a convenience for use at an interactive prompt, it tries to supply an interesting set of names more than it tries to supply a rigorously or consistently defined set of names, and its detailed behavior may change across releases. For example, metaclass attributes are not in the result list when the argument is a class."

#### A Decorator for Counting Function Calls

Finally, we will begin designing the metaclass which we mentioned as our target at the beginning of this chapter. It will decorate all the methods of the classes it creates with a decorator which counts the number of calls. We have defined such a decorator previously:

In [35]:
def call_counter(func):
    def helper(*args, **kwargs):
        helper.calls += 1
        return func(*args, **kwargs)
    helper.calls = 0
    helper.__name__= func.__name__

    return helper

We can use it in the usual way:

In [36]:
@call_counter
def f():
    pass

print(f.calls)
for _ in range(10):
    f()
    
print(f.calls)

0
10


We should also recall the alternative notation for decorating a function. We will need it in our final metaclass:

In [37]:
def f():
    pass

f = call_counter(f)
print(f.calls)
for _ in range(10):
    f()
    
print(f.calls)

0
10


#### The "Count Calls" Metaclass

Now we have all the necessary "ingredients" together to write our metaclass. We will include our call_counter decorator as a staticmethod:

In [38]:
class FuncCallCounter(type):
    """ A Metaclass which decorates all the methods of the 
        subclass using call_counter as the decorator
    """
    
    @staticmethod
    def call_counter(func):
        """ Decorator for counting the number of function 
            or method calls to the function or method func
        """
        def helper(*args, **kwargs):
            helper.calls += 1
            return func(*args, **kwargs)
        helper.calls = 0
        helper.__name__= func.__name__
    
        return helper
    
    
    def __new__(cls, clsname, superclasses, attributedict):
        """ Every method gets decorated with the decorator call_counter,
            which will do the actual call counting
        """
        for attr in attributedict:
            if callable(attributedict[attr]) and not attr.startswith("__"):
                attributedict[attr] = cls.call_counter(attributedict[attr])
        
        return type.__new__(cls, clsname, superclasses, attributedict)
    

class A(metaclass=FuncCallCounter):
    
    def foo(self):
        pass
    
    def bar(self):
        pass

if __name__ == "__main__":
    x = A()
    print(x.foo.calls, x.bar.calls)
    x.foo()
    print(x.foo.calls, x.bar.calls)
    x.foo()
    x.bar()
    print(x.foo.calls, x.bar.calls)

0 0
1 0
2 1


## Iterators vs iterables

The Python forums and other question-and-answer websites like Quora and Stackoverflow are full of questions concerning 'iterators' and 'iterable'. Some want to know how they are defined and others want to know if there is an easy way to check, if an object is an iterator or an iterable. We will provide a function for this purpose.

We have seen that we can loop or iterate over various Python objects like lists, tuples and strings. For example:

In [103]:
for city in ["Berlin", "Vienna", "Zurich"]:
    print(city)
for language in ("Python", "Perl", "Ruby"):
    print(language)
for char in "Iteration is easy":
    print(char)

Berlin
Vienna
Zurich
Python
Perl
Ruby
I
t
e
r
a
t
i
o
n
 
i
s
 
e
a
s
y


This form of looping can be seen as iteration. Iteration is not restricted to explicit for loops. If you call the function sum, e.g. on a list of integer values, you do iteration as well.

So what is the difference between an iterable and an iterator?

On the one hand, they behave alike: you can iterate over both iterators and iterables with a for loop. Every iterator is also an iterable, but not every iterable is an iterator. For example, a list is iterable, but a list is not an iterator! An iterator can be created from an iterable by using the function 'iter'. To make this possible, the class of an object needs either a method '\_\_iter__', which returns an iterator, or a '\_\_getitem__' method that accepts sequential indexes starting with 0.

Iterators are objects with a '\_\_next__' method, which is used when the function 'next()' is called. An iterator also has an '\_\_iter__' method that returns the iterator itself, which is why every iterator is an iterable.
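The distinction can be checked with the abstract base classes Iterable and Iterator from collections.abc. The following sketch also shows the '\_\_getitem__' fallback; the class Squares is a made-up example:

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
numbers_iter = iter(numbers)

# a list is iterable, but it is not an iterator
list_checks = (isinstance(numbers, Iterable), isinstance(numbers, Iterator))
# the object returned by iter() is both
iter_checks = (isinstance(numbers_iter, Iterable), isinstance(numbers_iter, Iterator))
print(list_checks, iter_checks)


class Squares:
    """Hypothetical class that is iterable only through __getitem__."""

    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)  # signals the end of the sequence
        return index * index


squares = list(Squares())
print(squares)
```

Note that the isinstance check against Iterable only looks for '\_\_iter__', so it does not recognize objects like Squares, although iter() and for loops accept them.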

So what is going on behind the scenes when a for loop is executed? The for statement calls iter() on the object over which it is supposed to loop (which should be an iterable, e.g. a so-called container object). If this call is successful, the iter call will return an iterator object that defines the method \_\_next__(), which accesses elements of the object one at a time. The \_\_next__() method will raise a StopIteration exception if there are no further elements available. The for loop will terminate as soon as it catches a StopIteration exception. You can call the \_\_next__() method using the next() built-in function. This is how it works:

In [104]:
cities = ["Berlin", "Vienna", "Zurich"]
iterator_obj = iter(cities)
print(iterator_obj)

print(next(iterator_obj))
print(next(iterator_obj))
print(next(iterator_obj))

<list_iterator object at 0x7fa4268c8210>
Berlin
Vienna
Zurich


If we called 'next(iterator_obj)' one more time, it would raise a 'StopIteration' exception.
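A short sketch of what happens at the end of the data: the exception can be caught explicitly, or next() can be given a second argument that is returned instead of raising. The variable names are chosen only for this illustration:

```python
cities = ["Berlin", "Vienna"]
city_iter = iter(cities)
next(city_iter)  # consumes "Berlin"
next(city_iter)  # consumes "Vienna"

try:
    next(city_iter)
    exhausted = False
except StopIteration:
    # no further elements are available
    exhausted = True
print(exhausted)

# with a default value, next() does not raise
fallback = next(city_iter, "no more cities")
print(fallback)
```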

The following function 'iterable' will return True, if the object 'obj' is an iterable and False otherwise:

In [105]:
def iterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:
        return False
        
for element in [34, [4, 5], (4, 5), {"a":4}, "dfsdf", 4.5]:
    print(element, "iterable: ", iterable(element))

34 iterable:  False
[4, 5] iterable:  True
(4, 5) iterable:  True
{'a': 4} iterable:  True
dfsdf iterable:  True
4.5 iterable:  False


We have described how an iterator works. So if you want to add an iterator behavior to your class, you have to add the \_\_iter__ and the \_\_next__ method to your class. The \_\_iter__ method returns an iterator object. If the class contains a \_\_next__, it is enough for the \_\_iter__ method to return self, i.e. a reference to itself:

In [106]:
class Reverse:
    """
    Creates Iterators for looping over a sequence backwards.
    """
    
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

lst = [34, 978, 42]
lst_backwards = Reverse(lst)
for el in lst_backwards:
    print(el)

42
978
34
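An object whose \_\_iter__ returns self is its own iterator, so it can be traversed only once. If repeated loops are needed, one option is to let \_\_iter__ return a fresh iterator on every call. The following is a minimal sketch with a made-up class ReverseView, which delegates to the built-in reversed:

```python
class ReverseView:
    """Hypothetical iterable: every loop gets a new iterator."""

    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # reversed() builds a new iterator each time it is called
        return reversed(self.data)


view = ReverseView([34, 978, 42])
first_pass = list(view)
second_pass = list(view)
print(first_pass, second_pass)
```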


## Generators and Iterators

#### Introduction

What is an iterator? Iterators are objects that can be iterated over, as we do in a for loop. We can also say that an iterator is an object which returns data one element at a time. That is, iterators do not do any work until we explicitly ask for their next item. They work on a principle which is known in computer science as lazy evaluation. Lazy evaluation is an evaluation strategy which delays the evaluation of an expression until its value is really needed. Due to their laziness, Python iterators are a great way to deal with infinity, i.e. iterators which can go on for ever. You can hardly find Python programs that are not teeming with iterators.

Iterators are a fundamental concept of Python. You already learned in your first Python programs that you can iterate over container objects such as lists and strings. To do this, Python creates an iterator version of the list or string. In this case, an iterator can be seen as a pointer to a container, which enables us to iterate over all the elements of this container. An iterator is an abstraction, which enables the programmer to access all the elements of an iterable object (a set, a string, a list etc.) without any deeper knowledge of the data structure of this object.

Generators are a special kind of function, which enable us to implement or generate iterators.

Mostly, iterators are implicitly used, like in the for-loop of Python. We demonstrate this in the following example. We are iterating over a list, but you shouldn't be mistaken: A list is not an iterator, but it can be used like an iterator:

In [114]:
cities = ["Paris", "Berlin", "Hamburg", 
          "Frankfurt", "London", "Vienna", 
          "Amsterdam", "Den Haag"]
for location in cities:
    print("location: " + location)

location: Paris
location: Berlin
location: Hamburg
location: Frankfurt
location: London
location: Vienna
location: Amsterdam
location: Den Haag


What is really going on when a for loop is executed? The function 'iter' is applied to the object following the 'in' keyword, e.g. for i in o:. Two cases are possible: o is either iterable or not. If o is not iterable, an exception will be raised, saying that the type of the object is not iterable. On the other hand, if o is iterable, the call iter(o) will return an iterator; let us call it iterator_obj. The for loop uses this iterator to iterate over the object o by calling next on it. The for loop stops when iterator_obj is exhausted, which means that next(iterator_obj) raises a StopIteration exception. We demonstrate this behaviour in the following code example:

In [115]:
expertises = ["Python Beginner", 
              "Python Intermediate", 
              "Python Proficient", 
              "Python Advanced"]
expertises_iterator = iter(expertises)
print("Calling 'next' for the first time: ", next(expertises_iterator))
print("Calling 'next' for the second time: ", next(expertises_iterator))

Calling 'next' for the first time:  Python Beginner
Calling 'next' for the second time:  Python Intermediate


We could have called `next` two more times; a fifth call would raise a StopIteration exception.

We can simulate the iteration behavior of the for loop with a while loop. The one thing we must not forget is the end of the data: we have to catch the StopIteration exception:

In [118]:
other_cities = ["Strasbourg", "Freiburg", "Stuttgart", 
                "Vienna / Wien", "Hannover", "Berlin", 
                "Zurich"]

city_iterator = iter(other_cities)
while True:
    try:
        city = next(city_iterator)
        print(city)
    except StopIteration:
        break

Strasbourg
Freiburg
Stuttgart
Vienna / Wien
Hannover
Berlin
Zurich


The sequential base types as well as the majority of the classes of the standard library of Python support iteration. The dictionary data type dict supports iterators as well. In this case the iteration runs over the keys of the dictionary:

In [127]:
capitals = { 
    "France":"Paris", 
    "Netherlands":"Amsterdam", 
    "Germany":"Berlin", 
    "Switzerland":"Bern", 
    "Austria":"Vienna"}

for country in capitals:
     print("The capital city of " + country + " is " + capitals[country])

The capital city of France is Paris
The capital city of Netherlands is Amsterdam
The capital city of Germany is Berlin
The capital city of Switzerland is Bern
The capital city of Austria is Vienna


Off-topic: Some readers may be confused to learn from our example that the capital of the Netherlands is not Den Haag (The Hague) but Amsterdam. Amsterdam is the capital of the Netherlands according to the constitution, even though the Dutch parliament and the Dutch government are situated in The Hague, as well as the Supreme Court and the Council of State.

#### Implementing an Iterator as a Class

One way to create iterators in Python is defining a class which implements the methods \_\_iter__ and \_\_next__. We show this by implementing a class Cycle, which can be used to cycle over an iterable object forever. In other words, an instance of this class returns the elements of an iterable until it is exhausted. Then it repeats the sequence indefinitely.

In [128]:
class Cycle(object):
    
    def __init__(self, iterable):
        self.iterable = iterable
        self.iter_obj = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                next_obj = next(self.iter_obj)
                return next_obj
            except StopIteration:
                self.iter_obj = iter(self.iterable)

x = Cycle("abc")

for i in range(10):
    print(next(x), end=", ")

a, b, c, a, b, c, a, b, c, a, 

Even though the object-oriented approach to creating an iterator may be very interesting, this is not the pythonic method.

The usual and easiest way to create an iterator in Python consists in using a generator function. You will learn this in the following chapter.
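
As a preview, and only as a sketch: the behaviour of the Cycle class can be reproduced with a few lines using a generator function. The name `cycle_gen` is an assumption introduced for this illustration; the standard library offers `itertools.cycle` for the same purpose.

```python
from itertools import cycle, islice

def cycle_gen(iterable):
    """Hypothetical generator counterpart of the Cycle class."""
    while True:
        # Restart the iteration each time the iterable is exhausted.
        # Like Cycle, this never yields anything if the iterable is empty.
        for element in iterable:
            yield element

x = cycle_gen("abc")
first_ten = [next(x) for _ in range(10)]
print(", ".join(first_ten))

# itertools.cycle from the standard library produces the same sequence
same_as_itertools = first_ten == list(islice(cycle("abc"), 10))
print(same_as_itertools)
```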

## Generators

On the surface, generators in Python look like functions, but there is both a syntactic and a semantic difference. The distinguishing characteristic is the yield statement: it turns a function into a generator. A generator is a function which returns a generator object. This generator object can be seen as a function which produces a sequence of results instead of a single object. The sequence of values is produced by iterating over it, e.g. with a for loop, and each value is the one following the yield keyword.

When a yield statement is reached, the execution of the code stops and the value behind the yield is returned. The execution of the generator is now interrupted. As soon as "next" is called again on the generator object, the generator function resumes execution right after the yield statement where it was suspended. The execution continues in the state in which the generator was left after the last yield. In other words, all the local variables still exist, because they are automatically saved between calls. This is a fundamental difference to functions: functions always start their execution at the beginning of the function body, regardless of where they had left in previous calls. They don't have any static or persistent values.

There may be more than one yield statement in the code of a generator, or the yield statement might be inside the body of a loop. If a return statement is executed in the code of a generator, the generator ends and a StopIteration exception is raised in the caller. The word "generator" is sometimes used ambiguously to mean both the generator function itself and the objects which are generated by a generator function.

Everything which can be done with a generator can also be implemented with a class-based iterator. However, the crucial advantage of generators is that the methods \_\_iter__() and \_\_next__() are created automatically. Generators provide a very neat way of producing data which is huge or even infinite.
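
That a generator object comes with the iterator protocol built in can be checked directly. The following is a small sketch; the generator `letters` is a made-up example, not part of the tutorial's code:

```python
def letters():
    yield "a"
    yield "b"

gen_obj = letters()

# The generator object implements both halves of the iterator protocol
has_iter = hasattr(gen_obj, "__iter__")
has_next = hasattr(gen_obj, "__next__")

# iter() on an iterator returns the iterator itself
same_object = iter(gen_obj) is gen_obj

print(has_iter, has_next, same_object)
print(list(gen_obj))
```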

The following is a simple example of a generator, which is capable of producing various city names.

It's possible to create a generator object with this generator, which generates all the city names, one after the other.

In [141]:
def city_generator():
    yield "Hamburg"
    yield "Konstanz"
    yield "Berlin"
    yield "Zurich"
    yield "Schaffhausen"
    yield "Stuttgart"

We created an iterator by calling city_generator():

In [142]:
city = city_generator()

In [143]:
while True:
    print(next(city))

Hamburg
Konstanz
Berlin
Zurich
Schaffhausen
Stuttgart


StopIteration

As we can see, we have generated an iterator city. Every call of next(city) returns another city. After the last city, i.e. Stuttgart, has been yielded, another call of next(city) raises a StopIteration exception, signalling that the iteration has stopped.

"Can we send a reset to an iterator, so that it can start the iteration all over again?" is a frequently asked question. There is no reset, but it's possible to create another generator object, e.g. by executing the statement "city = city_generator()" again.

Though at first sight the yield statement looks like the return statement of a function, we can see in this example that there is a huge difference. If we had return statements instead of yield statements in the previous example, city_generator would be an ordinary function. This function would always return the first city "Hamburg" and never any of the other cities, i.e. "Konstanz", "Berlin", "Zurich", "Schaffhausen", and "Stuttgart".
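
Both points can be illustrated with a shortened sketch. The function `city_function` is a hypothetical counterpart that uses return instead of yield; it is introduced here only for comparison:

```python
def city_generator():
    yield "Hamburg"
    yield "Konstanz"

def city_function():
    # A plain function: every call starts from the top,
    # so only "Hamburg" is ever returned
    return "Hamburg"
    return "Konstanz"  # never reached

city = city_generator()
first_run = list(city)          # exhausts the generator object
after_exhaustion = list(city)   # an exhausted generator yields nothing more

city = city_generator()         # "reset": create a fresh generator object
second_run = list(city)

print(first_run, after_exhaustion, second_run)
print(city_function(), city_function())
```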

#### Method of Operation

As we have elaborated in the introduction of this chapter, generators offer a comfortable method to generate iterators, and that's why they are called generators.

Method of working:

* A generator is called like a function. Its return value is an iterator, i.e. a generator object. The code of the generator will not be executed at this stage.

* The iterator can be used by calling next on it. The first time, the execution starts like a function, i.e. with the first line of code within the body of the generator. The code is executed until a yield statement is reached.

* **yield** returns the value of the expression which follows the keyword yield. This is like a function, but Python keeps track of the position of this yield, and the state of the local variables is stored for the next call. At the next call, the execution continues with the statement following the yield statement, and the variables have the same values as they had in the previous call.

* The iterator is finished if the generator body is completely worked through or if the program flow encounters a return statement.
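
The steps above can be traced with a small sketch. The generator `traced` and the list `log` are assumptions made for this illustration; the log records the order in which things happen:

```python
log = []

def traced():
    log.append("body started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("body finished")

it = traced()
log.append("generator object created")   # nothing from the body has run yet

log.append(f"got {next(it)}")             # runs the body up to the first yield
log.append(f"got {next(it)}")             # continues after the first yield

try:
    next(it)                              # body runs to its end -> StopIteration
except StopIteration:
    log.append("StopIteration")

print("\n".join(log))
```

The entry "generator object created" appears before "body started", which confirms that calling the generator function does not execute its body.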

##### Example 1

We will illustrate this behaviour in the following example. The generator count creates an iterator which creates a sequence of values by counting from the start value 'firstval' and using 'step' as the increment for counting:

In [144]:
def count(firstval=0, step=1):
    x = firstval
    while True:
        yield x
        x += step
        
counter = count() # count will start with 0
for i in range(10):
    print(next(counter), end=", ")

start_value = 2.1
step_value = 0.3
print("\nNew counter:")
counter = count(start_value, step_value)
for i in range(10):
    new_value = next(counter)
    print(f"{new_value:2.2f}", end=", ")

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
New counter:
2.10, 2.40, 2.70, 3.00, 3.30, 3.60, 3.90, 4.20, 4.50, 4.80, 

##### Example 2. Fibonacci as a Generator

The Fibonacci sequence is named after Leonardo of Pisa, who was known as Fibonacci (a contraction of filius Bonacci, "son of Bonaccio"). In his textbook Liber Abaci, which appeared in the year 1202, he had an exercise about rabbits and their breeding: It starts with a newly-born pair of rabbits, i.e. a male and a female. It takes one month until they can mate. At the end of the second month the female gives birth to a new pair of rabbits. Now let's suppose that every female rabbit will bring forth another pair of rabbits every month after the end of the first month. We have to mention that Fibonacci's rabbits never die. The question is how large the population will be after a certain period of time.

This produces a sequence of numbers: 0, 1, 1, 2, 3, 5, 8, 13, ...

This sequence can be defined in mathematical terms like this:

$F_n = F_{n - 1} + F_{n - 2}$ with the seed values: $F_0 = 0$ and $F_1 = 1$

In [146]:
def fibonacci(n):
    """ A generator for creating the Fibonacci numbers F_0 up to F_n """
    a, b, counter = 0, 1, 0
    while True:
        if counter > n:
            return
        yield a
        a, b = b, a + b
        counter += 1

f = fibonacci(5)
for x in f:
    print(x, " ", end="")
print()

0  1  1  2  3  5  


#### Using a 'return' in a Generator

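Since Python 3.3, a return statement in a generator may also carry a value, but a generator still needs at least one yield statement to be a generator! When return is executed inside a generator, the caller sees a StopIteration exception, and the returned value is stored in that exception. Explicitly raising StopIteration inside the generator body is a different matter: since Python 3.7 (PEP 479) it is converted into a RuntimeError.

Let's have a look at a generator in which we raise StopIteration explicitly: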

In [147]:
def gen():
    yield 1
    raise StopIteration(42)
    yield 2

In [148]:
g = gen()

In [149]:
next(g)

1

In [150]:
next(g)

RuntimeError: generator raised StopIteration

We demonstrate now that return is only "nearly" equivalent to raising the 'StopIteration' exception: with return, the caller receives a regular StopIteration instead of a RuntimeError:

In [151]:
def gen():
    yield 1
    return 42
    yield 2

In [152]:
g = gen()
next(g)

1

In [153]:
next(g)

StopIteration: 42
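
The returned value is not lost: it is stored in the `value` attribute of the StopIteration exception. A minimal sketch of how a caller could retrieve it (a for loop, by contrast, swallows the exception and discards the value):

```python
def gen():
    yield 1
    return 42

g = gen()
first = next(g)

try:
    next(g)
except StopIteration as exc:
    # The return value travels in the exception's `value` attribute
    returned = exc.value

print(first, returned)

# A for loop or list() only sees the yielded values
yielded_only = list(gen())
print(yielded_only)
```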

#### send Method / Coroutines

Generators can not only send objects but also receive objects. Sending a message, i.e. an object, into the generator can be achieved by applying the send method to the generator object. Be aware of the fact that send both sends a value to the generator and returns the value yielded by the generator. We will demonstrate this behavior in the following simple example of a coroutine:

In [154]:
def simple_coroutine():
    print("coroutine has been started!")
    while True:
        x = yield "foo"
        print("coroutine received: ", x)
     
 
cr = simple_coroutine()
cr

<generator object simple_coroutine at 0x7fa421fd89d0>

In [155]:
next(cr)

coroutine has been started!


'foo'

In [156]:
ret_value = cr.send("Hi")
print("'send' returned: ", ret_value)

coroutine received:  Hi
'send' returned:  foo


We had to call next on the generator first, because the generator needed to be started. Calling send with a value other than None on a generator which hasn't been started raises a TypeError.
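
The following sketch shows this error and the way around it. The generator `echo` is a made-up example for this illustration:

```python
def echo():
    while True:
        received = yield "ready"
        print("received:", received)

cr = echo()
try:
    cr.send("Hi")          # the generator has not reached its first yield yet
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Sending None is allowed and has the same effect as next(cr):
# it starts the generator and runs it up to the first yield
first = cr.send(None)
second = cr.send("Hi")
print(first, second)
```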

To use the send method, the generator must be suspended at a yield expression, so that the data sent can be processed or assigned to the variable on the left. What we haven't said so far: a next call also sends and receives. It always sends a None object. The values sent by "next" and "send" are assigned to a variable within the generator: this variable is called new_counter_val in the following example.

The following example modifies the generator 'count' from the previous subchapter by adding a send feature:

In [157]:
def count(firstval=0, step=1):
    counter = firstval
    while True:
        new_counter_val = yield counter
        if new_counter_val is None:
            counter += step
        else:
            counter = new_counter_val
            
start_value = 2.1
step_value = 0.3
counter = count(start_value, step_value)
for i in range(10):
    new_value = next(counter)
    print(f"{new_value:2.2f}", end=", ")
 
print("set current count value to another value:")
counter.send(100.5)
for i in range(10):
    new_value = next(counter)
    print(f"{new_value:2.2f}", end=", ")

2.10, 2.40, 2.70, 3.00, 3.30, 3.60, 3.90, 4.20, 4.50, 4.80, set current count value to another value:
100.80, 101.10, 101.40, 101.70, 102.00, 102.30, 102.60, 102.90, 103.20, 103.50, 
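
Note that the value 100.50 itself does not appear in the output above: it is the return value of the send call, which was discarded. The sketch below repeats the generator with integer arguments (chosen here only to keep the arithmetic exact) and keeps every return value, which also shows that next behaves like send(None):

```python
def count(firstval=0, step=1):
    counter = firstval
    while True:
        new_counter_val = yield counter
        if new_counter_val is None:
            counter += step
        else:
            counter = new_counter_val

counter = count(10, 5)
a = next(counter)          # starts the generator and yields the first value
b = counter.send(None)     # same effect as next(counter)
c = counter.send(100)      # sets the counter; send returns the value yielded next
d = next(counter)          # counting continues from the new value
print(a, b, c, d)
```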

#### Another Example for send

In [159]:
from random import choice

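def song_generator(song_list):
    new_song = None
    while True:
        if new_song is not None:
            # a song was sent in: add it to the list if it is new and play it
            if new_song not in song_list:
                song_list.append(new_song)
            new_song = yield new_song
        else:
            # nothing was sent: play a random song from the list
            new_song = yield choice(song_list)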

songs = ["Her Şeyi Yak - Sezen Aksu", 
         "Bluesette - Toots Thielemans",
         "Six Marimbas - Steve Reich",
         "Riverside - Agnes Obel",
         "Not for Radio - Nas",
         "What's going on - Taste",
         "On Stream - Nils Petter Molvær",
         "La' Inta Habibi - Fayrouz",
         "Ik Leef Niet Meer Voor Jou - Marco Borsato",
         "Δέκα λεπτά - Αθηνά Ανδρεάδη"]

In [160]:
radio_program = song_generator(songs)

In [161]:
next(radio_program)

'Her Şeyi Yak - Sezen Aksu'

In [162]:
for i in range(3):
    print(next(radio_program))

What's going on - Taste
Ik Leef Niet Meer Voor Jou - Marco Borsato
Bluesette - Toots Thielemans


In [163]:
radio_program.send("Distorted Angels - Archive")

'Distorted Angels - Archive'

In [164]:
songs

['Her Şeyi Yak - Sezen Aksu',
 'Bluesette - Toots Thielemans',
 'Six Marimbas - Steve Reich',
 'Riverside - Agnes Obel',
 'Not for Radio - Nas',
 "What's going on - Taste",
 'On Stream - Nils Petter Molvær',
 "La' Inta Habibi - Fayrouz",
 'Ik Leef Niet Meer Voor Jou - Marco Borsato',
 'Δέκα λεπτά - Αθηνά Ανδρεάδη',
 'Distorted Angels - Archive']

So far, we could change the behavior of the song iterator by sending a new song, i.e. a title plus the performer. We will extend the generator now, so that it is also possible to send a new song list. We have to send a tuple to the new iterator, either

`(<title>, <performer>)` or

`("-songlist-", <list of songs>)`


In [165]:
from random import choice
def song_generator(song_list):
    new_song = None
    while True:
        if new_song is not None:
            if new_song[0] == "-songlist-":
                # a new song list was sent: switch to it
                song_list = new_song[1]
                new_song = yield choice(song_list)
            else:
                # a (title, performer) tuple was sent: add the song and play it
                title, performer = new_song
                new_song = title + " - " + performer
                if new_song not in song_list:
                    song_list.append(new_song)
                new_song = yield new_song
        else:
            new_song = yield choice(song_list)

In [166]:
songs1 = ["Après un Rêve - Gabriel Fauré",
          "On Stream - Nils Petter Molvær",
          "Der Wanderer Michael - Michael Wollny",
          "Les barricades mystérieuses - Barbara Thompson",
          "Monday - Ludovico Einaudi"]

songs2 = ["Dünyadan Uszak - Pinhani",
          "Again - Archive",
          "If I Had a Heart - Fever Ray",
          "Every you, every me - Placebo",
          "Familiar - Agnes Obel"]

In [167]:
radio_prog = song_generator(songs1)

In [168]:
for i in range(5):
    print(next(radio_prog))

Après un Rêve - Gabriel Fauré
Monday - Ludovico Einaudi
Les barricades mystérieuses - Barbara Thompson
Les barricades mystérieuses - Barbara Thompson
Der Wanderer Michael - Michael Wollny


We change the radio program now by exchanging the song list:

In [169]:
radio_prog.send(("-songlist-", songs2))

'Again - Archive'

In [170]:
for i in range(5):
    print(next(radio_prog))

Again - Archive
Again - Archive
Again - Archive
Dünyadan Uszak - Pinhani
Dünyadan Uszak - Pinhani


#### The throw Method

The throw() method raises an exception at the point where the generator was paused, and returns the next value yielded by the generator. It raises StopIteration if the generator exits without yielding another value. The generator has to catch the passed-in exception, otherwise the exception will be propagated to the caller.

The infinite loop of our count generator from our previous example keeps yielding values, but we have no information about the state of its local variables. We can get this information, i.e. the state of the variables in count, by throwing an exception with the throw method. We catch this exception inside of the generator and yield the state of its variables:

In [171]:
def count(firstval=0, step=1):
    counter = firstval
    while True:
        try:
            new_counter_val = yield counter
            if new_counter_val is None:
                counter += step
            else:
                counter = new_counter_val
        except Exception:
            yield (firstval, step, counter)

In the following code block, we will show how to use this generator:

In [172]:
c = count()
for i in range(6):
    print(next(c))
print("Let us see what the state of the iterator is:")
state_of_count = c.throw(Exception)
print(state_of_count)
print("now, we can continue:")
for i in range(3):
    print(next(c))

0
1
2
3
4
5
Let us see what the state of the iterator is:
(0, 1, 5)
now, we can continue:
5
6
7


We can improve the previous example by defining our own exception class StateOfGenerator:

In [173]:
class StateOfGenerator(Exception):
    def __init__(self, message=None):
        super().__init__(message)
        self.message = message

def count(firstval=0, step=1):
    counter = firstval
    while True:
        try:
            new_counter_val = yield counter
            if new_counter_val is None:
                counter += step
            else:
                counter = new_counter_val
        except StateOfGenerator:
            yield (firstval, step, counter)

We can use the previous generator like this:

In [174]:
c = count()
for i in range(3):
    print(next(c))
print("Let us see what the state of the iterator is:")
i = c.throw(StateOfGenerator)
print(i)
print("now, we can continue:")
for i in range(3):
    print(next(c))

0
1
2
Let us see what the state of the iterator is:
(0, 1, 2)
now, we can continue:
2
3
4
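
What happens if the generator does not catch the exception can be sketched as follows. The generator `plain_count` is a hypothetical variant without any try/except, introduced only for this illustration:

```python
def plain_count():
    n = 0
    while True:
        yield n          # no try/except around the yield
        n += 1

g = plain_count()
next(g)

try:
    g.throw(ValueError("not handled inside the generator"))
except ValueError as exc:
    # The exception passes through the generator and reaches the caller
    outcome = f"caller got: {exc}"

# The unhandled exception terminated the generator, so it is now exhausted
try:
    next(g)
    finished = False
except StopIteration:
    finished = True

print(outcome, finished)
```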


#### yield from

"yield from" is available since Python 3.3. The yield from \<expr\> statement can be used inside the body of a generator. \<expr\> has to be an expression evaluating to an iterable, from which an iterator will be extracted. The iterator is run to exhaustion, i.e. until it encounters a StopIteration exception. This iterator yields values directly to, and receives values directly from, the caller of the generator which contains the yield from statement.

We can learn from the following example by looking at the two generators 'gen1' and 'gen2' that the yield from statements of 'gen2' replace the for loops of 'gen1':

In [183]:
def gen1():
    for char in "Python":
        yield char
    for i in range(5):
        yield i

def gen2():
    yield from "Python"
    yield from range(5)

g1 = gen1()
g2 = gen2()
print("g1: ", end=", ")
for x in g1:
    print(x, end=", ")
print("\ng2: ", end=", ")
for x in g2:
    print(x, end=", ")
print()

g1: , P, y, t, h, o, n, 0, 1, 2, 3, 4, 
g2: , P, y, t, h, o, n, 0, 1, 2, 3, 4, 


We can see from the output that both generators produce the same sequence.

The benefit of a yield from statement can be seen as a way to split a generator into multiple generators. That's what we have done in our previous example and we will demonstrate this more explicitly in the following example:

In [184]:
def cities():
    for city in ["Berlin", "Hamburg", "Munich", "Freiburg"]:
        yield city

def squares():
    for number in range(10):
        yield number ** 2
        
def generator_all_in_one():
    for city in cities():
        yield city
    for number in squares():
        yield number
        
def generator_splitted():
    yield from cities()
    yield from squares()
    
lst1 = [el for el in generator_all_in_one()]
lst2 = [el for el in generator_splitted()]
print(lst1 == lst2)

True


The previous code returns True because the generators generator_all_in_one and generator_splitted yield the same elements. This means that if the \<expr\> from the yield from is another generator, the effect is the same as if the body of the sub‐generator were inlined at the point of the yield from statement. Furthermore, the subgenerator is allowed to execute a return statement with a value, and that value becomes the value of the yield from expression. We demonstrate this with the following little script:

In [186]:
def subgenerator():
    yield 1
    return 42
    yield 6

def delegating_generator():
    x = yield from subgenerator()
    print(x)

for x in delegating_generator():
    print(x)

1
42


The full semantics of the yield from expression is described in six points in "PEP 380 -- Syntax for Delegating to a Subgenerator" in terms of the generator protocol:

* Any values that the iterator yields are passed directly to the caller
* Any values sent to the delegating generator using send() are passed directly to the iterator. If the sent value is None, the iterator's \_\_next__() method is called. If the sent value is not None, the iterator's send() method is called. If the call raises StopIteration, the delegating generator is resumed. Any other exception is propagated to the delegating generator
* Exceptions other than GeneratorExit thrown into the delegating generator are passed to the throw() method of the iterator. If the call raises StopIteration, the delegating generator is resumed. Any other exception is propagated to the delegating generator
* If a GeneratorExit exception is thrown into the delegating generator, or the close() method of the delegating generator is called, then the close() method of the iterator is called if it has one. If this call results in an exception, it is propagated to the delegating generator. Otherwise, GeneratorExit is raised in the delegating generator
* The value of the yield from expression is the first argument to the StopIteration exception raised by the iterator when it terminates
* return expr in a generator causes StopIteration(expr) to be raised upon exit from the generator
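
Several of these points can be observed in one small sketch. The names `averager` and `delegator` are assumptions made for this illustration: values sent to the delegating generator reach the subgenerator, and the subgenerator's return value becomes the value of the yield from expression.

```python
def averager():
    """Subgenerator: receives numbers via send(), returns their mean when None arrives."""
    total, n = 0, 0
    while True:
        value = yield
        if value is None:
            break
        total += value
        n += 1
    return total / n if n else None

def delegator(results):
    # yield from forwards send() calls to averager() and collects its return value
    mean = yield from averager()
    results.append(mean)

results = []
d = delegator(results)
next(d)            # prime: runs up to the first yield inside averager()
d.send(10)
d.send(20)
try:
    d.send(None)   # averager() returns, delegator resumes, appends, then finishes
except StopIteration:
    pass

print(results)
```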

#### Recursive Generators

The following example is a generator to create all the permutations of a given list of items.

For those who don't know what permutations are, we have a short introduction:

Formal definition: The term "permutation" stems from the Latin verb "permutare". A permutation is a rearrangement of the elements of an ordered list. In other words: every arrangement of n elements is called a permutation.

In the following lines we show you all the permutations of the letters a, b and c:

```
a b c 
a c b 
b a c 
b c a 
c a b 
c b a 
```

The number of permutations on a set of n elements is given by n!

$n! = n \cdot (n-1) \cdot (n-2) \cdots 2 \cdot 1$

n! is called the factorial of n.

The permutation generator can be called with an arbitrary list of objects. The iterator returned by this generator generates all the possible permutations:

In [187]:
def permutations(items):
    n = len(items)
    if n == 0:
        # the empty list has exactly one permutation: the empty list
        yield []
    else:
        for i in range(n):
            # fix items[i] as the first element and permute the remaining items
            for cc in permutations(items[:i] + items[i+1:]):
                yield [items[i]] + cc

for p in permutations(['r', 'e', 'd']):
    print(''.join(p))
for p in permutations(list("game")):
    print(''.join(p) + ", ", end="")

red
rde
erd
edr
dre
der
game, gaem, gmae, gmea, geam, gema, agme, agem, amge, ameg, aegm, aemg, mgae, mgea, mage, maeg, mega, meag, egam, egma, eagm, eamg, emga, emag, 

The previous example can be hard to understand for newbies. Like always, Python offers a convenient solution. We need the module **itertools** for this purpose. Itertools is a very handy tool to create and operate on iterators.

Creating permutations with itertools:

In [188]:
import itertools
perms = itertools.permutations(['r','e','d'])
list(perms)

[('r', 'e', 'd'),
 ('r', 'd', 'e'),
 ('e', 'r', 'd'),
 ('e', 'd', 'r'),
 ('d', 'r', 'e'),
 ('d', 'e', 'r')]

The term "permutations" can sometimes be used in a weaker meaning. Permutations can denote in this weaker meaning a sequence of elements, where each element occurs just once, but without the requirement to contain all the elements of a given set. So in this sense (1,3,5,2) is a permutation of the set of digits {1,2,3,4,5,6}. We can build, for example, all the sequences of a fixed length k of elements taken from a given set of size n with k ≤ n.

These atypical permutations are also known as sequences without repetition. By using this term we can avoid confusion with the term "permutation". The number of such k-permutations of n is denoted by $P_{n,k}$ and its value is calculated by the product:

$n \cdot (n - 1) \cdots (n - k + 1)$

By using the factorial notation, the above mentioned expression can be written as:

$P_{n,k} = n! / (n - k)!$

A generator for the creation of k-permutations of n objects looks very similar to our previous permutations generator:

In [190]:
def k_permutations(items, k):
    if k == 0:
        yield []
    else:
        for item in items:
            for kp in k_permutations(items, k-1):
                if item not in kp:
                    yield [item] + kp

These are all the 3-permutations of the set {"a","b","c","d"}:

In [191]:
for kp in k_permutations("abcd", 3):
    print(kp)

['a', 'b', 'c']
['a', 'b', 'd']
['a', 'c', 'b']
['a', 'c', 'd']
['a', 'd', 'b']
['a', 'd', 'c']
['b', 'a', 'c']
['b', 'a', 'd']
['b', 'c', 'a']
['b', 'c', 'd']
['b', 'd', 'a']
['b', 'd', 'c']
['c', 'a', 'b']
['c', 'a', 'd']
['c', 'b', 'a']
['c', 'b', 'd']
['c', 'd', 'a']
['c', 'd', 'b']
['d', 'a', 'b']
['d', 'a', 'c']
['d', 'b', 'a']
['d', 'b', 'c']
['d', 'c', 'a']
['d', 'c', 'b']
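
The output above contains 24 lists, which agrees with the formula: $P_{4,3} = 4! / (4 - 3)! = 24$. As a cross-check, and as a sketch of the standard-library alternative, `itertools.permutations` accepts the length k as an optional second argument:

```python
import itertools
import math

n, k = 4, 3
expected = math.factorial(n) // math.factorial(n - k)   # 4! / 1!

# itertools.permutations yields tuples of length k without repeated elements
k_perms = list(itertools.permutations("abcd", k))

print(expected, len(k_perms))
print(k_perms[:3])
```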


#### A Generator of Generators

A Fibonacci generator without an upper limit, like the parameterless fibonacci() defined further below, generates an iterator which can theoretically produce all the Fibonacci numbers, i.e. an infinite number. But you shouldn't try to produce all these numbers in a list with the following line:

```python
list(fibonacci())
```

This will very quickly show you the limits of your computer.

In most practical applications, we only need the first n elements of these and other "endless" iterators. We can use another generator, in our example firstn, to create the first n elements of a generator:

In [192]:
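def firstn(generator, n):
    g = generator()
    for i in range(n):
        try:
            yield next(g)
        except StopIteration:
            # source exhausted early: end cleanly instead of a RuntimeError (PEP 479)
            return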

The following script returns the first 10 elements of the Fibonacci sequence:

In [193]:
def fibonacci():
    """ A Fibonacci number generator """
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

print(list(firstn(fibonacci, 10))) 

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
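
The standard library already provides a tool for this task: `itertools.islice` takes the first n elements of any iterator. A minimal sketch with the same Fibonacci generator:

```python
from itertools import islice

def fibonacci():
    """ A Fibonacci number generator """
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 10 elements, so the endless generator is never exhausted
first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```

Unlike firstn, islice expects an iterator (here the generator object returned by fibonacci()) rather than the generator function itself.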
