# The `yield` keyword in Python

I had long wondered what exactly `yield` does and what it is for. What confused me most was that it shows up in functions where a `return` statement would normally go, yet it clearly does something different.

After a quick Google search, I ended up with two very good pages on the topic: as usual, one on [realpython.com](https://realpython.com/introduction-to-python-generators/) and one on [stackoverflow](https://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do). This notebook is my summary of what I took away from the realpython article and the suggested answer on stackoverflow, originally posted by user [e-satis](https://stackoverflow.com/users/9951/e-satis).

## The TL;DR

`yield` is used to write a **generator function**. This is a special kind of Python function that returns a [**generator object**](https://docs.python.org/3/c-api/gen.html). A generator object is an **iterable** (take a look at the Python docs glossary [here](https://docs.python.org/3/glossary.html)), but it differs from iterables such as lists in two important ways:
- it can only be iterated over **once**
- its values are not stored in memory before iteration; each value is computed only when the iteration asks for it

This makes generators extremely useful if we need to iterate over a lot of values that we do not want to store in memory. Let's have a look at some very simple code!

## Generators 101

In the code below, we start with a simple **iterable** and then show the equivalent code with a **generator** object:

In [None]:
# let's create a simple list
mylist = [1, 2, 3]

# Several objects in python are "iterables" (according to the documentation,
# these are objects of any class that contain an __iter__() method or 
# a __getitem__() method which implements sequence semantics), which 
# means we can iterate over them and read them however we like:
for i in mylist:
    print(i)

# A downside of this is that all values must be stored in memory
# beforehand, as with this list.
# Here that is not a big problem, but with millions of values
# it would be.

In [None]:
# Now, let's create a `generator` object. This can be done like this:
mygenerator = (x for x in range(1, 4))
print(type(mygenerator))


We can see that we created an object of the `generator` class. Let's try to iterate over it:

In [None]:
for i in mygenerator:
    print(i)

We see that we can iterate over `generator` objects. This is because generators are **iterators**, a kind of iterable that can be iterated over only once. Let's try to iterate over it again:

In [None]:
for i in mygenerator:
    print(i)

As we can see, the second iteration did not print any values, because we aren't able to iterate over our generator a second time, which is specific to **iterators** (and thus to `generator` objects).
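
Here is a minimal sketch of why the second loop stays silent, using only built-in behaviour. A generator is its own iterator, so there is no fresh starting point to go back to. The names `numbers` and `squares` are illustrative and not part of the notebook above.

```python
numbers = [1, 2, 3]
squares = (x * x for x in numbers)

# A list hands out a fresh iterator on every request; a generator hands out itself.
list_is_own_iterator = iter(numbers) is numbers   # False
gen_is_own_iterator = iter(squares) is squares    # True

first_pass = list(squares)    # consumes the generator
second_pass = list(squares)   # nothing is left to produce

# Asking an exhausted generator for another value raises StopIteration.
# A `for` loop catches this internally and simply ends.
try:
    next(squares)
    raised = False
except StopIteration:
    raised = True

# To iterate again, we have to build a new generator.
squares = (x * x for x in numbers)
third_pass = list(squares)

print(first_pass, second_pass, raised, third_pass)
```

If the values are needed more than once, the usual options are to recreate the generator as above or to store the results in a list.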

## Generator functions and `yield`

Okay, we now have a basic understanding of what `generator` objects are, but how does `yield` come into play? `yield` is used in **generator functions**, which are functions that produce `generator` objects. Let's take the following function as an example:

In [None]:
def create_generator():
    print("I'm in the create_generator function!")
    mylist = range(3)
    for i in mylist:
        yield i*i

my_new_generator = create_generator()
print(type(my_new_generator))

Okay, what the heck happened here!? We can make some interesting observations:
- even though our `create_generator` function does not have a `return` statement, calling it still returned something: an object of type `generator` (note that this also means that `yield` is NOT simply a replacement for a `return` statement, since otherwise the call would have returned `i*i`)
- we called the function `create_generator` BUT the body of the function did not run! Otherwise we would have seen the printout of "I'm in the create_generator function!", but we didn't

The reason is the special behaviour of generator functions: their body defines what should happen when the generator is **iterated** over, while calling the function only creates and returns the `generator` object itself. So the real magic happens when we iterate over the object created by our generator function, for example in a loop:

In [None]:
for i in my_new_generator:
    print(i)

So the code is executed **only once the `for` loop starts iterating**. Here is how it works in detail:
When the `for` loop requests the first value from the `generator` object that our function returned, Python runs the code in said function until it hits the first `yield`, hands back the respective value, and pauses the function at that point. Each subsequent iteration *resumes* the function from where it paused and runs until it hits the next `yield`, which again hands back the currently computed value of `i*i`, and so on. This continues until the `generator` is exhausted, that is, until the function body has run to its end without hitting another `yield`. `yield` can sit inside a loop (like our `for` loop over `range(3)`) or inside an `if/else` statement that `yield`s different values depending on whether a condition is `True` or `False`. This allows you to use `generator` objects in very smart and versatile ways.
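
The pause-and-resume behaviour described above can be made visible by stepping through a generator by hand with `next`. This is only a small sketch: `traced_squares` and the `log` list are hypothetical names introduced here to record when each part of the function body runs.

```python
log = []

def traced_squares(n):
    log.append("start")                    # runs only on the first next()
    for i in range(n):
        log.append(f"before yield {i}")
        yield i * i                        # pause here and hand back a value
        log.append(f"after yield {i}")     # runs when the generator is resumed
    log.append("end")

gen = traced_squares(2)
log_after_call = list(log)     # still empty: calling the function ran no body code

first = next(gen)              # runs the body up to the first yield
log_after_first = list(log)

second = next(gen)             # resumes right after the first yield
log_after_second = list(log)

# No yield is left, so the body runs to its end and StopIteration is raised.
try:
    next(gen)
    finished = False
except StopIteration:
    finished = True

print(first, second, finished)
print(log)
```

A `for` loop does the same thing under the hood: it calls `next` repeatedly and stops quietly when `StopIteration` is raised.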

## Example with `if/else` logic: The ATM

Let's take a look at an example very similar to the one described in the stackoverflow post above. We are building an ATM that always pays out 100 bucks if there is no crisis, but when there is a crisis, the ATM will hand out no money at all! No withdrawals allowed!

In [None]:
# define Bank class
class Bank():
    def __init__(self):
        self.crisis = False
    
    # create an ATM that yields 100$ when there is no crisis and 0$ when there
    # actually is a crisis
    def create_atm(self):
        while True:
            if self.crisis:
                yield "$0"
            else:
                yield "$100"

In [None]:
super_bank = Bank()
new_atm = super_bank.create_atm()
print(type(new_atm))

We defined a generator function as a method of our `Bank` class, and calling it gave us a `generator` object! Now let's do some iterations. In this example, I use the built-in `next` function, which advances an iterator (such as our `generator` object) by one step and returns the value it yields.

In [None]:
print("withdraw 100 bucks")
print(next(new_atm))

print("withdraw 100 bucks again")
print(next(new_atm))

print("withdraw 100 bucks three times in a row")
for i in range(3):
    print(next(new_atm))

print("Oh no! A crisis happens!")
super_bank.crisis = True

print("Try to withdraw")
print(next(new_atm))

print("Try again to withdraw")
print(next(new_atm))

print("Try to withdraw three times in a row")
for i in range(3):
    print(next(new_atm))

print("Phew! Crisis is over")
super_bank.crisis = False

print("Try to withdraw 100 bucks again")
print(next(new_atm))
print("Yay! It works! Generator functions are cool!")

Okay, I guess we now have a pretty good feeling for what `generator` objects and functions are and why they are useful!
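
One detail of the ATM example is worth spelling out: because of `while True`, the generator never runs out, so a plain `for` loop over it would never end. A possible way to take a fixed number of values is `itertools.islice` from the standard library. The sketch below redefines a compact `Bank` class so that it runs on its own; the variable names `bank`, `atm`, `normal` and `during_crisis` are illustrative.

```python
from itertools import islice

class Bank:
    def __init__(self):
        self.crisis = False

    def create_atm(self):
        # Infinite generator: there is always a next value.
        while True:
            yield "$0" if self.crisis else "$100"

bank = Bank()
atm = bank.create_atm()

# islice stops after three values, so looping over the infinite generator is safe.
normal = list(islice(atm, 3))

# The generator reads self.crisis again each time it is resumed.
bank.crisis = True
during_crisis = list(islice(atm, 2))

print(normal, during_crisis)
```

Calling `next` by hand, as in the cells above, works just as well; `islice` is mainly convenient when the number of values is known in advance.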

**Note:** Generators are useful if we are **memory limited**, that is, if we need to loop over a huge number of values or deal with a huge flow of incoming files that we would otherwise have to hold in memory all at once with a more "classical" approach using `list`s, for example. The trade-off is speed: producing each value through a generator carries a small overhead, so when the data fits comfortably in memory, and especially when we need to iterate over it more than once, a list is often the faster choice. If we are **computation power limited** rather than memory limited, generators are usually not where the gains are.
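
The memory side of this trade-off can be checked with the standard library's `tracemalloc` module. The following is a rough sketch, not a benchmark: `peak_memory` is a hypothetical helper written for this notebook, `N` is an arbitrary size, and the exact byte counts depend on the Python version and platform.

```python
import tracemalloc

N = 200_000  # arbitrary size, chosen only for illustration

def peak_memory(make_iterable):
    """Sum the iterable and return (total, peak bytes allocated meanwhile)."""
    tracemalloc.start()
    total = sum(make_iterable())
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return total, peak

# The list comprehension builds all N values before sum() sees the first one.
list_total, list_peak = peak_memory(lambda: [x * x for x in range(N)])

# The generator expression produces one value at a time.
gen_total, gen_peak = peak_memory(lambda: (x * x for x in range(N)))

print(f"list:      peak {list_peak:>12,} bytes")
print(f"generator: peak {gen_peak:>12,} bytes")
```

Both variants compute the same sum, but the list version should show a much higher peak. Timing the two (for example with `timeit`) is a separate measurement and is best run on the actual workload.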

There are also some more advanced uses and additional useful methods one can call on `generator` objects, which are nicely described and explained in the [realpython article](https://realpython.com/introduction-to-python-generators/#creating-data-pipelines-with-generators), which I would recommend everyone interested in Generators to read!