## Iterators

In [1]:
import dis

dis.dis("for x in xs: do_something(name)")

  1           0 SETUP_LOOP              20 (to 22)
              2 LOAD_NAME                0 (xs)
              4 GET_ITER
        >>    6 FOR_ITER                12 (to 20)
              8 STORE_NAME               1 (x)
             10 LOAD_NAME                2 (do_something)
             12 LOAD_NAME                3 (name)
             14 CALL_FUNCTION            1
             16 POP_TOP
             18 JUMP_ABSOLUTE            6
        >>   20 POP_BLOCK
        >>   22 LOAD_CONST               0 (None)
             24 RETURN_VALUE


- `GET_ITER` calls `iter()` on the object being looped over (here `xs`), which invokes its `__iter__` method and returns an iterator.
- `FOR_ITER` calls the iterator's `__next__` method on each pass until it raises `StopIteration`.
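
The two opcodes above correspond roughly to the hand-written loop below. This is only a sketch of the equivalent Python code, not what the interpreter literally executes; the names `iterator` and `seen` are chosen for illustration.

```python
xs = [1, 2, 3]
seen = []

iterator = iter(xs)          # what GET_ITER does
while True:
    try:
        x = next(iterator)   # what FOR_ITER does on each pass
    except StopIteration:
        break                # the loop ends when the iterator is exhausted
    seen.append(x)           # loop body

print(seen)  # [1, 2, 3]
```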

### Iterator Protocol

- `__iter__` returns an object that implements the iterator protocol. An iterator typically returns `self`.
- `__next__` returns the next element of the iterator. If there are no more elements, it raises `StopIteration`.

NOTE: Once `__next__` has raised `StopIteration`, every subsequent call should raise it as well.
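
As a minimal sketch of the protocol, the hypothetical `Countdown` class below (not part of the original notes) returns itself from `__iter__` and keeps raising `StopIteration` once it is exhausted.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it also works in a `for` loop.
        return self

    def __next__(self):
        if self.current <= 0:
            # Once exhausted, every further call raises StopIteration again.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(list(countdown))  # [3, 2, 1]
print(list(countdown))  # [] because the iterator is already exhausted
```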

⚠️ When the `__iter__` method is not defined, the interpreter falls back to the `__getitem__` method and calls it with the indices 0, 1, 2, and so on.

The `__getitem__` method takes a single argument (the index of an element) and either returns the element at that index or raises `IndexError` if no such element exists, which ends the iteration.

In [2]:
class Identity:
    def __getitem__(self, idx):
        if idx > 5:
            raise IndexError(idx)
        return idx

In [3]:
list(Identity())

[0, 1, 2, 3, 4, 5]

### Iterable vs Iterator

**Iterators** are very restricted objects: unlike iterables, they **are not meant to be reused**.

In [4]:
names = ['Andrey', 'Maria', 'Mark']

iterator1 = iter(names)      # same as names.__iter__()
iterator2 = names.__iter__() # returns a new! iterator

id(iterator1), id(iterator2)

(4393830096, 4393830288)

In [5]:
type(iterator1), type(iterator2)

(list_iterator, list_iterator)

In [6]:
iterator3 = iter(names)

id(iterator3), id(iter(iterator3)), id(iterator3.__iter__())

(4393894352, 4393894352, 4393894352)

ℹ️ An _iterator_ in Python is also an _iterable_ object.
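
The practical difference can be sketched as follows: a list hands out a fresh iterator on every pass, while an iterator is consumed by its first pass. The variable names are illustrative only.

```python
names = ['Andrey', 'Maria', 'Mark']

# An iterable such as a list can be traversed any number of times.
first_pass = list(names)
second_pass = list(names)

# An iterator is consumed by the first traversal and yields nothing afterwards.
iterator = iter(names)
consumed = list(iterator)
leftover = list(iterator)

print(first_pass == second_pass)  # True
print(consumed)                   # ['Andrey', 'Maria', 'Mark']
print(leftover)                   # []
```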

### Collections and Iterators

```python
class BinaryTree:
    def __iter__(self):
        return self.inorder_iter()
    
    def preorder_iter(self):
        ...  # implementation omitted

    def inorder_iter(self):
        return InOrderIterator(self)

    def postorder_iter(self):
        ...  # implementation omitted
```
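
The notes do not show `InOrderIterator`, so the following is only one possible sketch of such a class. It assumes that each tree node has `value`, `left`, and `right` attributes and that a missing child is `None`; the helper `_push_left` is a hypothetical name.

```python
class InOrderIterator:
    """Hypothetical in-order iterator that walks the tree with an explicit stack."""

    def __init__(self, root):
        self._stack = []
        self._push_left(root)

    def _push_left(self, node):
        # Descend along left children, remembering the path on the stack.
        while node is not None:
            self._stack.append(node)
            node = node.left

    def __iter__(self):
        return self

    def __next__(self):
        if not self._stack:
            raise StopIteration
        node = self._stack.pop()
        # After visiting a node, continue with its right subtree.
        self._push_left(node.right)
        return node.value


class BinaryTree:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left, self.right = left, right

    def __iter__(self):
        return InOrderIterator(self)


tree = BinaryTree(2, BinaryTree(1), BinaryTree(3))
print(list(tree))  # [1, 2, 3]
```

Because `__iter__` builds a new `InOrderIterator` each time, the tree itself stays reusable while every individual iterator is single-use.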

### `iter` and `next`

`iter`:
- takes an iterable and calls its `__iter__` method
- takes a function and a sentinel value, and calls the function until it returns that value:

```python
from functools import partial

with open(path, "rb") as handle:
    read_block = partial(handle.read, 64)
    for block in iter(read_block, b""):  # a file opened in "rb" mode returns b"" at EOF
        do_something(block)
```
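
The two-argument form can be tried without a real file. In this sketch an `io.BytesIO` object stands in for a file opened in binary mode (an assumption made for the example), and the block size of 4 is arbitrary.

```python
import io
from functools import partial

# io.BytesIO behaves like a binary file that already holds some data.
handle = io.BytesIO(b"abcdefghij")
read_block = partial(handle.read, 4)

# read() returns b"" once the data is exhausted; that value is the sentinel.
blocks = list(iter(read_block, b""))
print(blocks)  # [b'abcd', b'efgh', b'ij']
```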

The `next` function takes an iterator and calls its `__next__` method. We can also pass a default value that `next` returns instead of raising `StopIteration` when the iterator is exhausted.

In [7]:
next(iter([1, 2, 3]))

1

In [8]:
next(iter([]), 42)

42

[Transforming Code into Beautiful, Idiomatic Python](https://youtu.be/OSGv2VnC0go)

### `in` and `not in` operators

These operators use the magic method `__contains__`, which returns `True` if its argument is contained in the object.

The big picture:

```python
class object:
    # ...
    
    def __contains__(self, target):
        for item in self:  # default behavior
            if item == target:
                return True
        return False
```
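
Both paths can be seen in a small sketch. The classes `Evens` and `Letters` are hypothetical: the first defines `__contains__` directly, the second defines only `__iter__`, so `in` has to search by iterating.

```python
class Evens:
    """Hypothetical container of all even integers; answers `in` directly."""

    def __contains__(self, target):
        return isinstance(target, int) and target % 2 == 0


class Letters:
    """Hypothetical iterable without __contains__; `in` falls back to __iter__."""

    def __iter__(self):
        return iter("abc")


print(10 ** 100 in Evens())   # True, answered without any iteration
print("b" in Letters())       # True, found by iterating
print("z" not in Letters())   # True
```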

## Generators

* [PEP 255 -- Simple Generators](https://www.python.org/dev/peps/pep-0255/)
* [PEP 289 -- Generator Expressions](https://www.python.org/dev/peps/pep-0289/)
* [Introduction to Python Generators](https://realpython.com/introduction-to-python-generators/)
* [Python Generators Tutorial](https://www.dataquest.io/blog/python-generators-tutorial/)
* [On demand data in Python, Part 1: iterators and generators](https://www.ibm.com/developerworks/library/ba-on-demand-data-python-1/index.html)
* [2 great benefits of Python generators (and how they changed me forever)](https://www.oreilly.com/ideas/2-great-benefits-of-python-generators-and-how-they-changed-me-forever)

A **generator** is an object that behaves like an _iterator_: it produces a value on each call of its `__next__()` method until `StopIteration` is raised. Because values are generated on the fly, generators make it possible to process large data sets with little memory and without computing values that are never requested.

### Generator Function

In [9]:
# usual function
def func():
    return 42

In [10]:
# generator function
def gen():
    yield 42

Python detects the `yield` keyword when it compiles the function and treats it as a generator function: calling it returns a generator object instead of running the body.

In [11]:
func()

42

In [12]:
gen()

<generator object gen at 0x105e5c9d0>

In [13]:
# Another example

def g():
    print("Started")
    x = 42
    yield x
    x += 1
    yield x
    print("Done")

In [14]:
type(g)

function

In [15]:
gen = g()
type(gen)

generator

In [16]:
next(gen)

Started


42

Generators keep a reference to their stack frame when the function yields, and they resume that frame on the next call to `next()`.

In [17]:
next(gen)

43

In [18]:
next(gen)

Done


StopIteration: 

In [19]:
dir(gen)

['__class__',
 '__del__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__name__',
 '__ne__',
 '__new__',
 '__next__',
 '__qualname__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'close',
 'gi_code',
 'gi_frame',
 'gi_running',
 'gi_yieldfrom',
 'send',
 'throw']

⬆️ As we see, `__iter__` and `__next__` are in the list.

💡 _Generators also help to write more readable code because we don't need to define classes and magic methods._

In [20]:
def unique(iterable, seen=None):
    seen = set(seen or [])
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

In [21]:
xs = [1, 1, 2, 3]

In [22]:
unique(xs)

<generator object unique at 0x105eccdd0>

In [23]:
list(unique(xs))

[1, 2, 3]

In [24]:
1 in unique(xs)

True

### `map` generator

In [25]:
def map(func, iterable, *rest):
    for args in zip(iterable, *rest):
        yield func(*args)

In [26]:
xs = range(5)

In [27]:
map(lambda x: x * x, xs)

<generator object map at 0x105eccc50>

In [28]:
list(map(lambda x: x * x, xs))

[0, 1, 4, 9, 16]

In [29]:
9 in map(lambda x: x * x, xs)

True

### `chain` generator

In [30]:
def chain(*iterables):
    for iterable in iterables:
        for item in iterable:
            yield item

In [31]:
xs = range(3)

In [32]:
ys = [42]

In [33]:
chain(xs, ys)

<generator object chain at 0x105eccb50>

In [34]:
list(chain(xs, ys))

[0, 1, 2, 42]

In [35]:
42 in chain(xs, ys)

True

### `count` and `enumerate` generators

In [36]:
def count(start=0):
    while True:
        yield start
        start += 1

In [37]:
next(count())

0

In [38]:
counter = count()

In [39]:
next(counter)

0

In [40]:
next(counter)

1

In [41]:
def enumerate(iterable, start=0):
    pass  # how?!

In [42]:
# next(enumerate(count(42)))
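
One possible answer to the exercise above, written as a sketch rather than a copy of the built-in. The name `my_enumerate` is hypothetical and is used here only to avoid shadowing the built-in `enumerate`.

```python
from itertools import count


def my_enumerate(iterable, start=0):
    """Yield (index, item) pairs, with the index starting at `start`."""
    index = start
    for item in iterable:
        yield index, item
        index += 1


# Works with infinite iterators too, because items are produced lazily.
print(next(my_enumerate(count(42))))         # (0, 42)
print(list(my_enumerate("ab", start=1)))     # [(1, 'a'), (2, 'b')]
```

An equivalent one-liner would be `zip(count(start), iterable)`, which pairs a running counter with each item.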

### Reusing Generators

❌ Don't do this!

In [43]:
def g():
    yield 42

In [44]:
gen = g()

In [45]:
list(gen)

[42]

In [46]:
list(gen)

[]

If you really have to reuse a generator, use the `tee` function from the `itertools` module.

### Better Implementation Of Iterable Collection

In [47]:
class BinaryTree:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left, self.right = left, right
        
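    def __iter__(self):  # inorder
        if self.left is not None:      # a missing child is None and cannot be iterated
            for value in self.left:    # iterating a subtree yields values, not nodes
                yield value
        yield self.value
        if self.right is not None:
            for value in self.right:
                yield value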

As we see, there is no need for additional classes like `InOrderIterator`.

### Generator Expression

In [48]:
gen = (x ** 2 for x in range(10**42) if x % 2 == 1)

In [49]:
gen

<generator object <genexpr> at 0x105ecc850>

In [50]:
next(gen)

1

In [51]:
list(filter(lambda x: x % 2 == 1,
            (x ** 2 for x in range(10))))

[1, 9, 25, 49, 81]

We can omit the parentheses when a generator expression is the only argument of a function call:

In [52]:
sum(x ** 2 for x in range(10) if x % 2 == 1)

165
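
Because a generator expression computes its items lazily, it does not store them. The sketch below compares it with the equivalent list comprehension; exact sizes reported by `sys.getsizeof` depend on the Python build, so only the comparison is shown.

```python
import sys

squares_list = [x * x for x in range(100_000)]
squares_gen = (x * x for x in range(100_000))

# The list stores every element; the generator only stores its paused state.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Both produce the same values when consumed.
print(sum(squares_gen) == sum(squares_list))  # True
```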

### `yield` Expression

In [53]:
def g():
    res = yield
    print("Got {!r}".format(res))
    res = yield 42
    print("Got {!r}".format(res))

In [54]:
gen = g()

In [55]:
next(gen)  # go to the first yield

In [56]:
next(gen)  # go to the second yield

Got None


42

In [57]:
next(gen)

Got None


StopIteration: 

### generator.send()

Using `yield` and `send()` in this fashion allows Python generators to act like the _coroutines_ found in other languages.

The `send` method resumes the generator and "sends" its argument into it: the value becomes the result of the `yield` expression at which the generator was paused.

In [58]:
gen = g()

In [59]:
gen.send("foobar")

TypeError: can't send non-None value to a just-started generator

To start a generator we first have to "send" `None` to it. This is exactly what the function `next` does:

In [60]:
gen = g()

In [61]:
next(gen)

---

In [62]:
gen = g()

In [63]:
gen.send(None)  # == next(gen)

In [64]:
gen.send("foobar")

Got 'foobar'


42

### generator.throw()

[What is generator.throw() good for?](https://stackoverflow.com/questions/11485591/what-is-generator-throw-good-for)

Raises the specified exception at the point where the generator was paused and returns the next value yielded by the generator function.

In [65]:
def g():
    try:
        yield 42
    except Exception as e:
        yield e

In [66]:
gen = g()

In [67]:
next(gen)

42

In [68]:
gen.throw(ValueError("something is wrong"))  # the (type, value) form is deprecated since Python 3.12

ValueError('something is wrong')

In [69]:
gen.throw(RuntimeError("another error"))

RuntimeError: another error

### generator.close()

Raises a `GeneratorExit` at the point where the generator function was paused.

In [70]:
def g():
    try:
        yield 42
    finally:
        print("Done")

In [71]:
gen = g()

In [72]:
next(gen)

42

In [73]:
gen.close()

Done


### Inspect Generators

In [74]:
def foo():
    yield 0
    
import inspect
inspect.isgeneratorfunction(foo)

True

In [75]:
gen = foo()
inspect.getgeneratorstate(gen)

'GEN_CREATED'

In [76]:
next(gen)
inspect.getgeneratorstate(gen)

'GEN_SUSPENDED'

In [77]:
next(gen)

StopIteration: 

In [78]:
inspect.getgeneratorstate(gen)

'GEN_CLOSED'

### Coroutines

In [79]:
def grep(pattern):
    print("Looking for {!r}".format(pattern))
    while True:
        line = yield
        if pattern in line:
            print(line)

In [80]:
gen = grep("Gotcha!")

In [81]:
next(gen)

Looking for 'Gotcha!'


In [82]:
gen.send("This line doesn't have what we're looking for")

In [83]:
gen.send("This one does. Gotcha!")

This one does. Gotcha!


[A Curious Course on Coroutines and Concurrency](http://dabeaz.com/coroutines/)
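
Since every generator-based coroutine has to be advanced to its first `yield` before it accepts `send()`, a small decorator is often used to do this priming automatically. The sketch below assumes a hypothetical decorator named `coroutine` and a variant of `grep` that collects matches in a caller-supplied list instead of printing them.

```python
from functools import wraps


def coroutine(func):
    """Hypothetical decorator that advances a new generator to its first yield."""
    @wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # prime the generator so it is ready for send()
        return gen
    return start


@coroutine
def grep(pattern, matches):
    # `matches` is a list supplied by the caller to collect matching lines.
    while True:
        line = yield
        if pattern in line:
            matches.append(line)


found = []
gen = grep("Gotcha!", found)  # no explicit next(gen) needed
gen.send("This line doesn't have what we're looking for")
gen.send("This one does. Gotcha!")
print(found)  # ['This one does. Gotcha!']
```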

### `yield from`

[PEP 380: Syntax for Delegating to a Subgenerator](https://docs.python.org/3/whatsnew/3.3.html#pep-380)

In [84]:
def chain(*iterables):
    for iterable in iterables:
        yield from iterable

### The `return` Statement and `StopIteration` exception

In [85]:
def g():
    yield 42
    return []

In [86]:
gen = g()

In [87]:
next(gen)

42

In [88]:
next(gen)

StopIteration: []

⚠️ `return value` != `raise StopIteration(value)`

In [89]:
def g():
    try:
        yield 42
        raise StopIteration([])  # != return []
    except Exception as e:
        pass
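
A further difference appears when the exception is not caught inside the generator. On Python 3.7 and later (PEP 479), a `StopIteration` raised in a generator body is converted into a `RuntimeError` instead of silently ending the iteration. The sketch below shows this; the variable `caught` is only there to keep the exception for inspection.

```python
def g():
    yield 42
    raise StopIteration([])  # not equivalent to `return []`


gen = g()
print(next(gen))  # 42

caught = None
try:
    next(gen)
except RuntimeError as error:
    # The stray StopIteration has been replaced by a RuntimeError.
    caught = error

print(type(caught).__name__)  # RuntimeError
```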

### `yield from` expression

In [90]:
def f():
    yield 42
    return []

def g():
    res = yield from f()
    print("Got {!r}".format(res))

In [91]:
gen = g()

In [92]:
next(gen)

42

In [93]:
next(gen, None)

Got []


### @contextlib.contextmanager

The `contextmanager` decorator is applied to a generator function. It builds the `__enter__` and `__exit__` methods from the code around the generator's `yield` statement: the code before `yield` runs on entry, and the code after it runs on exit.

In [94]:
import os
from contextlib import contextmanager

@contextmanager
def cd(path):               # __init__
    old_path = os.getcwd()  # __enter__
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_path)  # __exit__

In [95]:
import tempfile
import shutil

In [96]:
@contextmanager
def tempdir():                   # __init__
    outdir = tempfile.mkdtemp()  # __enter__
    try:
        yield outdir
    finally:
        shutil.rmtree(outdir)    # __exit__

In [97]:
with tempdir() as path:
    print(path)

/var/folders/l8/dy97dgx559740931rq6jjytr0000gn/T/tmpiqon235n
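
For comparison, here is roughly what the same context manager looks like when written by hand. The class name `TempDir` is hypothetical, and this is a sketch of equivalent behavior, not the code that `contextmanager` actually generates.

```python
import os
import shutil
import tempfile


class TempDir:
    """Hypothetical class-based counterpart of the `tempdir` generator."""

    def __enter__(self):
        # Corresponds to the code before `yield`.
        self.outdir = tempfile.mkdtemp()
        return self.outdir

    def __exit__(self, exc_type, exc_value, traceback):
        # Corresponds to the `finally` block after `yield`.
        shutil.rmtree(self.outdir)
        return False  # do not suppress exceptions raised in the `with` body


with TempDir() as path:
    existed_inside = os.path.isdir(path)

print(existed_inside)       # True
print(os.path.isdir(path))  # False, the directory was removed on exit
```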


## itertools

Functions creating iterators for efficient looping.

### `itertools.islice`

Make an iterator that returns selected elements from the iterable.

In [98]:
from itertools import islice

In [99]:
xs = range(10)

In [100]:
list(islice(xs, 3))  # == xs[:3]

[0, 1, 2]

In [101]:
list(islice(xs, 3, None))  # == xs[3:]

[3, 4, 5, 6, 7, 8, 9]

In [102]:
list(islice(xs, 3, 8, 2))  # == xs[3:8:2]

[3, 5, 7]

### Infinite Iterators

In [105]:
def take(n, iterable):
    return list(islice(iterable, n))

In [106]:
from itertools import count, cycle, repeat

In [107]:
take(3, count(0, 5))

[0, 5, 10]

In [108]:
take(3, cycle([1, 2, 3]))

[1, 2, 3]

In [109]:
take(3, repeat(42))

[42, 42, 42]

In [110]:
take(3, repeat(42, 2))

[42, 42]

### `itertools.dropwhile` and `itertools.takewhile`

In [112]:
from itertools import dropwhile, takewhile

In [113]:
list(dropwhile(lambda x: x < 5, range(10)))

[5, 6, 7, 8, 9]

In [114]:
it = takewhile(lambda x: x < 5, range(10))

In [115]:
it

<itertools.takewhile at 0x1062d95f0>

In [116]:
list(it)

[0, 1, 2, 3, 4]

### `itertools.chain`

In [117]:
from itertools import chain

In [118]:
take(5, chain(range(2), range(5, 10)))

[0, 1, 5, 6, 7]

In [120]:
it = (range(x, x ** x) for x in range(2, 4))

In [121]:
take(5, chain.from_iterable(it))

[2, 3, 3, 4, 5]

### `itertools.tee`

In [122]:
from itertools import tee

In [123]:
it = range(3)

In [124]:
a, b, c = tee(it, 3)

In [125]:
list(a), list(b), list(c)

([0, 1, 2], [0, 1, 2], [0, 1, 2])
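
`tee` is most useful with one-shot iterators such as generators. The sketch below uses it to build overlapping pairs from a single pass; once `tee` has been called, the original iterator should not be used directly any more. The helper names `pairwise` and `numbers` are illustrative (Python 3.10 and later also ship `itertools.pairwise`).

```python
from itertools import tee


def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), ..."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one item
    return zip(first, second)


def numbers():
    yield from [1, 2, 3, 4]


print(list(pairwise(numbers())))  # [(1, 2), (2, 3), (3, 4)]
```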

### More on `itertools`

In [128]:
import itertools

In [129]:
list(itertools.product("AB", repeat=2))

[('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]

In [130]:
list(itertools.product("AB", repeat=3))

[('A', 'A', 'A'),
 ('A', 'A', 'B'),
 ('A', 'B', 'A'),
 ('A', 'B', 'B'),
 ('B', 'A', 'A'),
 ('B', 'A', 'B'),
 ('B', 'B', 'A'),
 ('B', 'B', 'B')]

In [131]:
list(itertools.permutations("AB"))

[('A', 'B'), ('B', 'A')]

In [133]:
list(itertools.combinations("ABC", 2))

[('A', 'B'), ('A', 'C'), ('B', 'C')]

In [135]:
list(itertools.combinations_with_replacement("ABC", 2))

[('A', 'A'), ('A', 'B'), ('A', 'C'), ('B', 'B'), ('B', 'C'), ('C', 'C')]