<a href="https://colab.research.google.com/github/4dsolutions/clarusway_data_analysis/blob/main/python_warm_up/warmup_object_sql.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open and Execute in Google Colaboratory"></a><br/>
[![nbviewer](https://raw.githubusercontent.com/jupyter/design/master/logos/Badges/nbviewer_badge.svg)](https://nbviewer.org/github/4dsolutions/clarusway_data_analysis/blob/main/python_warm_up/warmup_object_sql.ipynb)

# Iterators and Generators

<a data-flickr-embed="true" href="https://www.flickr.com/photos/kirbyurner/52563704012/in/album-72177720296706479/" title="LMS Dashboard"><img src="https://live.staticflickr.com/65535/52563704012_71ef4beb8a_b.jpg" width="1024" height="354" alt="LMS Dashboard"></a>

Python Warm-up Notebooks:

*  [Introduction to Python](warmup_python_intro.ipynb)
*  [3rd Party Libraries](warmup_3rd_party_datascience.ipynb)
*  [Object Types](warmup_data_structures.ipynb)
*  [Object Oriented Paradigm](warmup_object_oriented.ipynb)
*  [Calling Callables and Type Checking](warmup_callables.ipynb)
*  [Class and Static Methods, Properties](warmup_object_oriented2.ipynb)
*  [SQLite3 and Context Managers](warmup_object_sql.ipynb) 
*  [Iterators and Generators](warmup_generators.ipynb) (you are here)

## Iterators

In Python, an iterator is any object with a `__next__` method as well as an `__iter__` method. The latter may simply return `self`, because the object in question is already an iterator.

The list type, for example, implements `__iter__`, which is triggered by calling `iter(L)`, where `L` is some list.  The `for` loop construct does this implicitly: it first turns whatever it needs to loop over into an iterator.  Strings, used in the cells below, work the same way.

In [1]:
foods = '🍕🥯🍿'   # a string of food emoji

In [2]:
"__iter__" in dir(foods)

True

In [3]:
"__next__" in dir(foods)

False

In [4]:
obj = iter(foods)

In [5]:
type(obj)

str_iterator

In [6]:
"__next__" in dir(obj)

True

In [7]:
next(obj)

'🍕'

In [8]:
next(obj)

'🥯'

In [9]:
next(obj)

'🍿'

In [10]:
try:
    next(obj)
except StopIteration:
    print("Sorry, I've reached the end of the line")

Sorry, I've reached the end of the line


The `StopIteration` exception is what stops a `for` loop, causing it not to re-enter the `for` block.  The loop below explicitly turns `foods` into an iterator, although Python would do this for you anyway, simply because you are using a `for` loop.

In [11]:
for food in iter(foods):
    print(food)

🍕
🥯
🍿
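To make the role of `StopIteration` concrete, here is a rough sketch of what a `for` loop does behind the scenes, written out with `iter()` and `next()`. The helper name `manual_for` is made up for this illustration; it is not part of Python.

```python
foods = '🍕🥯🍿'

def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next() directly."""
    collected = []
    iterator = iter(iterable)       # what the for loop does first
    while True:
        try:
            item = next(iterator)   # fetch the next item
        except StopIteration:       # raised once the iterator is exhausted
            break                   # the loop ends quietly, no error shown
        collected.append(item)      # stands in for the body of the for block
    return collected

print(manual_for(foods))
```

The `try` / `except` pair is the part a real `for` loop hides from you.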


In [12]:
"__next__" in dir(obj)

True

In [13]:
"__next__" in dir(foods)

False

What you might be thinking, now that we have seen how the `__ribs__` (special methods such as `__iter__` and `__next__`) work, is that we might define our own iterators, using the keyword `class`.  For example, suppose we want an iterator that returns even numbers ad infinitum.  We could write something like this:

In [14]:
class Evens:
    
    def __init__(self):
        self.value = 0
        
    def __next__(self):
        self.value += 2
        return self.value
    
    def __iter__(self):
        return self

In [15]:
evens = Evens()

In [16]:
next(evens)

2

In [17]:
next(evens)

4

In [18]:
print([next(evens) for _ in range(20)])

[6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44]


That works!
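`Evens` never runs out, so looping over it directly would never finish. As a sketch of the finite case, the hypothetical class `EvensUpTo` below (a name invented for this example) raises `StopIteration` itself once it passes a limit, which lets a `for` loop end normally.

```python
class EvensUpTo:
    """Hypothetical bounded variant of Evens: gives 2, 4, ... up to a limit."""

    def __init__(self, limit):
        self.value = 0
        self.limit = limit

    def __iter__(self):
        return self  # already an iterator

    def __next__(self):
        if self.value + 2 > self.limit:
            raise StopIteration  # signal that there is nothing more to give
        self.value += 2
        return self.value

for n in EvensUpTo(10):
    print(n)
```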

## Generator Functions

However useful classes such as the one above may be (think of writing `Primes()`, for example, which always gives the next higher prime number), Python gives us another way to define iterators: function syntax (`def`) together with the keyword `yield`.

In [19]:
def evens():
    value = 0
    while True:
        value += 2
        yield value

In [20]:
the_evens = evens()  # nothing runs yet

In [21]:
next(the_evens)      # next means "run until a yield"

2

In [22]:
next(the_evens)

4

In [23]:
next(the_evens)

6

In [24]:
"__next__" in dir(the_evens)  # it's in there, even though we didn't explicitly define it

True

In [25]:
"__iter__" in dir(the_evens)  # the generator object returned by a generator function is an iterator

True

In [26]:
type(the_evens)

generator
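The `Primes()` idea mentioned above could also be written as a generator function. What follows is only one possible sketch, using simple trial division against the primes found so far; the names `primes`, `found` and `candidate` are choices made for this example.

```python
def primes():
    """Yield prime numbers in increasing order, using trial division."""
    found = []      # primes discovered so far
    candidate = 2
    while True:
        # candidate is prime if no earlier prime up to its square root divides it
        if all(candidate % p for p in found if p * p <= candidate):
            found.append(candidate)
            yield candidate
        candidate += 1

the_primes = primes()
first_ten = [next(the_primes) for _ in range(10)]
print(first_ten)
```

Note that this version remembers every prime it has produced, so it trades some memory for simplicity.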

## Generator Expressions

What is the point of iterators and generators?  For one thing, they can save memory.  There are infinitely many primes in principle, but if all we need is the next one, then why bother storing many, let alone "all" (an impossibility)?  A generator lets us compute the next result on the fly.

In [27]:
from itertools import count  # more about this library below

In [28]:
it = count()   # an iterator that yields consecutive integers

In [29]:
next(it)       # ... starting at 0

0

In [30]:
next(it)       # ... then 1; call count() again to start over

1

In [31]:
it = count()
anyseq = 'ABCDEFG '  # 8 characters including space
thegen = (anyseq[n % 8] for n in it)  # as n increases n % 8 will cycle 0 1 2 3 4 5 6 7 0 1 2 3 4... 

In [32]:
next(thegen)

'A'

In [33]:
next(thegen)

'B'

In [34]:
it = count()  # reinitialize
letters = [ ]
thegen = (anyseq[n % 8] for n in it)  # generator expression

for c in thegen:  # keep getting the next letter
    if len(letters) > 30:
        break
    letters.append(c)

print(''.join(letters))

ABCDEFG ABCDEFG ABCDEFG ABCDEFG
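The memory point can be checked roughly with `sys.getsizeof`, which reports the size of the container object itself (not of the integers it refers to). This is a minimal sketch; the exact byte counts depend on the Python version and platform, so none are shown here.

```python
import sys

squares_list = [n * n for n in range(100_000)]   # all values stored at once
squares_gen = (n * n for n in range(100_000))    # values computed on demand

print(sys.getsizeof(squares_list))  # size of the list object, in bytes
print(sys.getsizeof(squares_gen))   # size of the generator object, far smaller

total = sum(squares_gen)            # consuming the generator still visits every value
print(total == sum(squares_list))
```

Keep in mind that the generator expression is used up after `sum` has consumed it, whereas the list can be traversed again.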


## itertools in the Standard Library

The art of using iterators wisely takes some thought and practice.  [The itertools library](https://docs.python.org/3/library/itertools.html) provides a set of common iterators worth playing with, as a means of familiarizing yourself with what's in the toolbox.

A better way to cycle through the characters shown above would involve using `cycle`.

In [35]:
from itertools import cycle, islice

letters = [ ]  # start with an empty list; otherwise the loop below breaks immediately
thegen = cycle('ABCDEFG ')

for c in thegen:  # keep getting the next letter
    if len(letters) > 30:
        break
    letters.append(c)
    
print(''.join(letters))

ABCDEFG ABCDEFG ABCDEFG ABCDEFG


In [36]:
print(''.join(islice(cycle('ABCDEFG '), 31)))  # even simpler, using islice

ABCDEFG ABCDEFG ABCDEFG ABCDEFG
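As one more illustration of combining these tools, the even numbers from earlier in this notebook can be produced with `count`, which accepts optional start and step arguments, and then trimmed with `islice` or `takewhile`. The variable names here are just for this sketch.

```python
from itertools import count, islice, takewhile

# count(2, 2) gives 2, 4, 6, ... without end; islice keeps the first five
first_evens = list(islice(count(2, 2), 5))

# takewhile stops at the first value that fails the test
small_evens = list(takewhile(lambda n: n <= 10, count(2, 2)))

print(first_evens)
print(small_evens)
```

Both approaches put a bound on an otherwise endless iterator: `islice` by position, `takewhile` by value.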
