# Iterables and Generators

Behind the scenes, Python's lists, dictionaries and other types are doing a lot of work so you can (relatively) efficiently iterate through them in loops (like `for`).

Often we need to create our own sequence of numbers.

One way to do this is to create a really long list or tuple, since these have order (0 before 1 before 2...etc). 

And often that is the right way to go.

However, there are other times when you do not actually need the WHOLE sequence; you just need one item at a time. In those cases there can be huge efficiency gains from creating **generators**. These spit out one output at a time, not the whole list.

This is an example of lazy evaluation. Python will only evaluate a generator when it needs to.

For example a list of the ints 0 to 3 would be:

`a = [0, 1, 2, 3]`

All 4 ints are in memory. 
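
To get a feel for the difference in memory, here is a minimal sketch that compares a large list with a generator over the same numbers. It uses the parenthesized generator syntax covered further down, and the names `big_list` and `big_gen` are only illustrative. Note that `sys.getsizeof` reports the size of the container object itself, not of the integers it refers to, so treat the numbers as a rough indication.

```python
import sys

# A list stores a reference to every element up front.
big_list = [i for i in range(1_000_000)]

# A generator only stores what it needs to produce the next value.
big_gen = (i for i in range(1_000_000))

list_bytes = sys.getsizeof(big_list)  # grows with the number of elements
gen_bytes = sys.getsizeof(big_gen)    # small, regardless of the range length

print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")
```

The exact byte counts depend on the Python version and platform, but the list should be several orders of magnitude larger.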

A generator would instead be an instruction to yield back **the next** item.

Let's implement this to see how it works.

In [2]:
def generate_range(n):
    """generate integers"""
    i = 0
    while i < n:
        yield i
        i += 1

Two things should pop out at first.

- We are using a `while` loop. This will stop when the condition becomes false. Here the condition is `i < n`, where `n` is an argument. `i` is just a single counter, not a list.

- Instead of `return` we have `yield`. This is what tells Python that calling this function does not run the body and return a value. Instead, the call creates a generator object that produces values on demand.

In [3]:
generate_range(4)

<generator object generate_range at 0x10922f5e8>

Notice that if we just call the generator function, we do not get the values. We get a generator object in memory, waiting to be asked for its next value.

We usually consume a generator in some type of loop.

In [4]:
for i in generate_range(4):
    print(f"i: {i}")

i: 0
i: 1
i: 2
i: 3
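
Under the hood, a `for` loop repeatedly asks the generator for its next value with the built-in `next()` and stops when the generator raises `StopIteration`. A minimal sketch of doing this by hand, redefining `generate_range` so the block runs on its own (the names `gen`, `first`, `second` and `finished` are only illustrative):

```python
def generate_range(n):
    """Yield the integers 0, 1, ..., n - 1 one at a time."""
    i = 0
    while i < n:
        yield i
        i += 1

gen = generate_range(2)
first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes right after the yield and loops once more

try:
    next(gen)        # i < n is now false, so the generator finishes
    finished = False
except StopIteration:
    finished = True  # a for loop catches this signal and ends quietly

print(first, second, finished)  # 0 1 True
```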


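Python already has a built-in that behaves much like this and that we use a great deal, `range`. Strictly speaking, `range` returns a lazy sequence object rather than a generator, but it also hands out its values one at a time when you loop over it.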

In [5]:
for i in range(4):
    print(f"i: {i}")

i: 0
i: 1
i: 2
i: 3
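
One way to see how `range` differs from a generator made with `yield` or parentheses: a `range` object can report its length, be indexed, and be looped over more than once. A minimal sketch (the names `r` and `g` are only illustrative):

```python
r = range(4)

print(len(r))    # a range knows its length without building a list
print(r[2])      # it supports indexing
print(list(r))   # first pass over the values
print(list(r))   # second pass gives the same values again

g = (i for i in range(4))
first_pass = list(g)   # consumes the generator
second_pass = list(g)  # nothing left to produce
print(first_pass, second_pass)  # [0, 1, 2, 3] []
```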


One gotcha with generators is that if you do not re-create them, they reach the end and stop.

In [6]:
myGen = generate_range(4)
for i in myGen:
    print(f"i: {i}")

i: 0
i: 1
i: 2
i: 3


In [5]:
for i in myGen:
    print(f"i: {i}")

The second loop prints nothing because the generator was already exhausted by the first one. Generators save their state so they know where to go next, but this means they can reach an end.
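
To go through the values again, call the generator function again to get a fresh generator. A minimal sketch, redefining `generate_range` so the block runs on its own (`my_gen` and the `*_pass` names are only illustrative):

```python
def generate_range(n):
    """Yield the integers 0, 1, ..., n - 1 one at a time."""
    i = 0
    while i < n:
        yield i
        i += 1

my_gen = generate_range(3)
first_pass = list(my_gen)             # consumes every value
second_pass = list(my_gen)            # same object, already at its end
fresh_pass = list(generate_range(3))  # a new generator starts from scratch

print(first_pass, second_pass, fresh_pass)  # [0, 1, 2] [] [0, 1, 2]
```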

Infinite generators can be useful but you have to be careful because it is easy to tell a computer to do something forever, and being the trusted friend that it is... it will keep going.... forever.
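
As a sketch of how to use an infinite generator safely, the standard library's `itertools.islice` can take just the first few values and then stop. The generator `count_up` below is a hypothetical example written for this illustration (the standard library offers `itertools.count` for the same job):

```python
from itertools import islice

def count_up(start=0):
    """Hypothetical infinite generator: yields start, start + 1, ... forever."""
    i = start
    while True:  # never becomes false, so the generator never ends on its own
        yield i
        i += 1

# islice stops asking for values after 5 items, so this terminates.
first_five = list(islice(count_up(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Calling `list(count_up())` directly would never finish, because `list` keeps asking for the next value until the generator ends.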

## for comprehensions and generators

You will often see generators used not just with functions and `yield` but with parentheses and **for comprehensions**.

You should have seen in the book that we can write a list comprehension like so:

`myDoubleList = [2 * i for i in [1, 2, 3]]`

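We can do the same thing lazily by swapping the square brackets for parentheses, which gives a generator expression. Here we use it with `range`.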

In [18]:
myDoubleGenerator = (2 * i for i in range(4))

In [19]:
myDoubleGenerator

<generator object <genexpr> at 0x10922f750>

Then we can use that generator, one value at a time.

In [20]:
for i in myDoubleGenerator:
    print(i)

0
2
4
6


In [21]:
for i in myDoubleGenerator:
    print(i)
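
Generator expressions are often passed straight to a function that consumes values one at a time, such as the built-ins `sum` or `max`, so no intermediate list is ever built. A minimal sketch (the names `total` and `largest` are only illustrative):

```python
# When the generator expression is the only argument,
# the extra pair of parentheses can be dropped.
total = sum(2 * i for i in range(4))    # 0 + 2 + 4 + 6
largest = max(2 * i for i in range(4))

print(total, largest)  # 12 6
```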

## creating lists from generators

We can use generators to fill lists by using the `list()` constructor. A list comprehension `[...]` is effectively a shortcut for this.

In [22]:
myDoubleGenerator = (2 * i for i in range(4))
newList = list(myDoubleGenerator)
newList

[0, 2, 4, 6]

Or, more directly, with a list comprehension instead of the generator:

In [23]:
newList = [2 * i for i in range(4)]
newList

[0, 2, 4, 6]

## Iteration for lists and dictionaries

As mentioned at the beginning, lists are also iterable. Use the `enumerate` function to get both the index and the item at each step.

In [24]:
names = ["Alice", "Bob", "Charlie", "Debbie"]
for i, name in enumerate(names):
    print(f"item {i} is {name}")

item 0 is Alice
item 1 is Bob
item 2 is Charlie
item 3 is Debbie


If you only need the item, and not the index, you can just loop over the list directly, without `enumerate`.

In [25]:
for name in names:
    print(f"The name is {name}, {name} Bond")

The name is Alice, Alice Bond
The name is Bob, Bob Bond
The name is Charlie, Charlie Bond
The name is Debbie, Debbie Bond


If you only want the index, there are a few ways to do this. I prefer:

In [25]:
for i, _ in enumerate(names):
    print(f"item {i}")

item 0
item 1
item 2
item 3
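
For comparison, another common way to get only the index is to loop over `range(len(...))`. A minimal sketch, redefining `names` so the block runs on its own (the list `indices` is only there to record what the loop saw):

```python
names = ["Alice", "Bob", "Charlie", "Debbie"]

indices = []
for i in range(len(names)):  # len(names) is 4, so i takes the values 0..3
    indices.append(i)
    print(f"item {i}")

print(indices)  # [0, 1, 2, 3]
```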


In [53]:
myDict = {"exam1": 95, "exam2": 99, "participation": 45}

In [54]:
for key, val in myDict.items():
    print(f"key is {key}, and value is {val}")

key is exam1, and value is 95
key is exam2, and value is 99
key is participation, and value is 45


A good example was brought up in class: how do you search a dictionary by value?

If the dictionary is huge, something like the following is slow, by design. Dictionaries are made to look up values from keys, not the reverse, so the loop has to check every entry.


In [58]:
got95 = []
for key, val in myDict.items():
    if val == 95:
        got95.append(key)
print(got95)

['exam1']


If you only do this once, it is no big deal. But if you have to look up keys by the values they point to, then it is best to rearrange the information into a more efficient format once and then use that.

One way is to flip the dictionary, then search. But this only works if every value is unique (and hashable), so that each one can serve as a key.

In [62]:
revDict = {val: key for key, val in myDict.items()}
revDict[95]

'exam1'
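
To see why uniqueness matters, here is a minimal sketch with a repeated value. The dictionary `scores` is a made-up example for this illustration. When two keys share a value, the later one silently overwrites the earlier one in the flipped dictionary:

```python
scores = {"exam1": 95, "exam2": 99, "exam3": 95}

# Both exam1 and exam3 map to 95, so they compete for the same new key.
flipped = {val: key for key, val in scores.items()}

print(flipped)       # {95: 'exam3', 99: 'exam2'}
print(len(flipped))  # 2, one entry was lost
```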

A more robust way is to map each value to a list of the keys that pointed to it.

In [63]:
myDict = {"exam1": 95, "exam2": 99, "exam3": 95, "participation": 45}
revDict = {}
for key, val in myDict.items():
    if val not in revDict:
        revDict[val] = [key]
    else:
        revDict[val].append(key)
revDict[95]

['exam1', 'exam3']

In [64]:
revDict

{95: ['exam1', 'exam3'], 99: ['exam2'], 45: ['participation']}
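
The same grouping can be written a little more compactly with `collections.defaultdict` from the standard library, which creates the empty list automatically the first time a value is seen. This is an alternative sketch, not the approach used above:

```python
from collections import defaultdict

myDict = {"exam1": 95, "exam2": 99, "exam3": 95, "participation": 45}

revDict = defaultdict(list)  # missing keys start out as an empty list
for key, val in myDict.items():
    revDict[val].append(key)

print(revDict[95])    # ['exam1', 'exam3']
print(dict(revDict))  # {95: ['exam1', 'exam3'], 99: ['exam2'], 45: ['participation']}
```

One thing to keep in mind: looking up a value that is not present, such as `revDict[0]`, quietly adds an empty list for it, so use `in` or `.get()` when you only want to check.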