<center><img src=https://github.com/komisarzGiT/gai/blob/main/GU_Python/course/img/MScAI_brand.png?raw=1 width=70%></center>

# Generators

Python aims to allow quite *lego-like* programming - small pieces, flexibly combinable. Generators are a nice tool for this. They're also important for saving memory.

Generators
---

A generator is like a function -- but instead of returning a
single value, it *yields* one value at a time.

* `yield` keyword
* Usually consumed by a `for` loop
* Saves memory
* "Lazy" versus "eager" execution.


After the caller has "used" the `yield`ed value -- usually,
inside a `for` loop or comprehension or similar --
control comes back into the generator, and it can then proceed to yield
another value, and so on.

Generators are very useful in situations where we would normally have
to create a huge (potentially infinite) list as our return value,
which would use up all our RAM. A generator allows us to create and
yield just one item from the huge list at a time, so the huge list is
never actually formed.

A generator is called "lazy", because it doesn't do all its work
immediately when you call it. It does just enough for now, then
stops, and can resume later. A normal function is "eager", the opposite of lazy: it does all of its work before returning.
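To make the lazy/eager distinction concrete, here is a minimal sketch. The names `eager_squares` and `lazy_squares` are made up for this illustration. It compares the size of a fully built list with the size of a generator object that would produce the same values.

```python
import sys

def eager_squares(n):
    # eager: builds the whole list before returning anything
    return [i**2 for i in range(n)]

def lazy_squares(n):
    # lazy: yields one square at a time, storing none of them
    for i in range(n):
        yield i**2

eager = eager_squares(100_000)
lazy = lazy_squares(100_000)

# the list holds 100,000 items; the generator object holds only its own state
print(sys.getsizeof(eager) > sys.getsizeof(lazy))

# both give the same values when consumed
print(sum(eager) == sum(lazy))
```

Note that `sys.getsizeof` reports only the size of the container itself, not of the integers inside it, so the real gap in memory use is larger than this comparison suggests.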


### `yield`

* `yield` is like `return`
* But it gives back just one value, and suspends the generator
* When the caller asks for the **next** value, the generator **resumes**
* Resuming is not like calling the generator again from the start.
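
The suspend-and-resume behaviour can be observed directly by driving a generator by hand with the built-in `next()`. This is a small sketch; `countdown` is a hypothetical generator written only for this illustration.

```python
def countdown(n):
    print("generator started")
    while n > 0:
        yield n  # suspend here, handing n to the caller
        print("generator resumed")
        n -= 1

g = countdown(2)   # nothing is printed yet: the body has not started running
first = next(g)    # prints "generator started", then yields 2
second = next(g)   # prints "generator resumed", then yields 1
print(first, second)
```

"generator started" is printed only once, because the second `next(g)` continues from the `yield` line rather than from the top of the function.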

In [None]:
def gen_squares(start, stop):
    for i in range(start, stop):
        print('hello')
        yield i**2  # hand back one square, then suspend until asked for the next

In [None]:
gen_squares(0, 100000000000)

<generator object gen_squares at 0x107d98e40>

There's no law that says the consumer has to *use* all the items yielded by the generator.

This allows generators to be infinite, for example:
```python
def all_the_ints():
    i = 0
    while True:
        yield i
        i = i+1

# the consumer decides when to stop
for i in all_the_ints():
    if i > 100:
        break
    print(i)
```

The `itertools` module provides several infinite iterators of this kind, such as `count` and `cycle`, along with tools for consuming only part of them.
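
As a small sketch of how `itertools` works with infinite sequences, the following uses `count`, `islice` and `takewhile` (all standard-library functions) to take a finite piece of an infinite stream. The variable names are chosen just for this example.

```python
from itertools import count, islice, takewhile

# count(0) yields 0, 1, 2, ... forever; islice takes just the first 5 items
first_five = list(islice(count(0), 5))

# an infinite generator comprehension of squares: 1, 4, 9, ...
squares = (i**2 for i in count(1))

# takewhile stops consuming as soon as the condition fails
small_squares = list(takewhile(lambda s: s < 50, squares))

print(first_five)
print(small_squares)
```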

**Example**: One place where generators are very important is when reading files. A file on disk could be, say, 10 GB, which is enough to use up all the RAM on our machine if we read it all at once. So it's good practice, where possible, to open the file and then read and process one line at a time.

```python
# nice example from
# https://realpython.com/introduction-to-python-generators/
def csv_reader(file_name):
    file = open(file_name)
    result = file.read().split("\n") # MemoryError
    return result

row_count = 0
for row in csv_reader('some_enormous_file.csv'):
    row_count += 1
print(row_count)
```

We get a `MemoryError` because we tried to load a lot of data into memory at once. Instead we should read one line at a time, and `yield` it:

```python
def csv_reader(file_name):
    # the file object is itself lazy: iterating over it reads one line at a time
    with open(file_name, "r") as file:
        for row in file:
            yield row

row_count = 0
for row in csv_reader('some_enormous_file.csv'):
    row_count += 1
print(row_count)
```
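
The same pattern can be tried without an enormous file. The sketch below writes a tiny temporary CSV and counts its rows lazily. The helper `read_lines` and the file contents are invented for this illustration.

```python
import os
import tempfile

def read_lines(file_name):
    # hypothetical helper: yields one stripped line at a time
    with open(file_name, "r") as f:
        for line in f:
            yield line.rstrip("\n")

# create a small throwaway file standing in for the enormous one
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("name,score\nada,10\ngrace,9\n")
    path = tmp.name

# only one line is held in memory at any time
row_count = sum(1 for _ in read_lines(path))
print(row_count)

os.remove(path)  # clean up the temporary file
```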

### `yield from`

`yield from` is a useful shorthand which can be used to `yield` each item from a sub-generator, one-by-one.

In [1]:
def subgen1():
    yield 1
    yield 2

def subgen2():
    yield 3
    yield 4

def gen():
    yield from subgen1()
    yield from subgen2()

for item in gen():
    print(item)

1
2
3
4


When would this be useful? One example is when writing **depth-first traversal of a tree**, which we'll see elsewhere in the module.
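
As a preview, here is a minimal sketch of such a traversal. It assumes a tree represented as nested `(value, children)` tuples. That representation is chosen only for this illustration and may differ from the one used elsewhere in the module.

```python
def depth_first(node):
    # node is assumed to be a (value, list_of_child_nodes) pair
    value, children = node
    yield value
    for child in children:
        # delegate to the recursive call, re-yielding each of its items
        yield from depth_first(child)

#       a
#      / \
#     b   c
#     |
#     d
tree = ("a", [("b", [("d", [])]), ("c", [])])

print(list(depth_first(tree)))
```

Without `yield from`, the recursive call would need its own inner loop: `for item in depth_first(child): yield item`.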

### Generator comprehensions

A **generator comprehension** (officially called a *generator expression*) is like a list comprehension, but with round brackets `()` instead of square ones. It creates a generator rather than a list.

In [9]:
gc = (x for x in range(20) if x % 2 == 0)
print(type(gc))
for x in gc:
    print(x)


<class 'generator'>
0
2
4
6
8
10
12
14
16
18


After a generator comprehension has been used, it is **exhausted**. Nothing is left in it to be yielded:

In [4]:
for x in gc:
    print(x)
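
Exhaustion can also be checked without a loop. In this sketch, `evens` and `make_evens` are names invented for the example. It shows that a second pass over the same generator gives nothing, and that the usual remedy is to create a fresh generator.

```python
evens = (x for x in range(10) if x % 2 == 0)

first_pass = list(evens)    # consumes every item
second_pass = list(evens)   # the generator is now exhausted
print(first_pass, second_pass)

# next() with a default avoids a StopIteration exception on an exhausted generator
print(next(evens, "nothing left"))

def make_evens():
    # calling this again gives a brand-new, unexhausted generator
    return (x for x in range(10) if x % 2 == 0)

print(list(make_evens()) == list(make_evens()))
```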

**Example** (adapted from Prof Michael Madden): Use a generator to generate Pythagorean triples, i.e. integers $(x, y, z)$ such that $x^2 + y^2 = z^2$, with $1 \le x \le y \le z < 30$.


In [7]:
def pythagorean_triples(n):
    # we *canonicalise* on the ordering x <= y <= z to avoid duplicates
    for x in range(1, n):
        for y in range(x, n):
            for z in range(y, n):
                if x**2 + y**2 == z**2:
                    yield (x, y, z)

In [8]:
for x, y, z in pythagorean_triples(30):
    print(x, y, z)

3 4 5
5 12 13
6 8 10
7 24 25
8 15 17
9 12 15
10 24 26
12 16 20
15 20 25
20 21 29


Now do the same again, but this time using a generator comprehension instead:

In [37]:
triples = ((x,y,z)
           for x in range(1,30)
           for y in range(x,30)
           for z in range(y,30)
           if x**2 + y**2 == z**2)

print('Pythagorean triples:')
for x, y, z in triples:
    print(x, y, z)


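L = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# eager: the set comprehension builds every 5-tuple in memory at once
setos = {(a, b, c, d, e) for a in L for b in L for c in L for d in L for e in L}
print(len(setos))

# lazy: the generator comprehension produces one 5-member team at a time
team = ({a, b, c, d, e} for a in L for b in L for c in L for d in L for e in L
        if len({a, b, c, d, e}) == 5)
print(type(team))

teams = []
i = 0
for a in team:
    teams.append(a)  # `teams + team` would fail: a list cannot be added to a generator
    i += 1

print(i)
print(len(teams))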

