<center>
    <img src="https://upload.wikimedia.org/wikipedia/commons/a/a8/%D0%9B%D0%9E%D0%93%D0%9E_%D0%A8%D0%90%D0%94.png" width=500px/>
    <font>Python 2023</font><br/>
    <br/>
    <br/>
    <b style="font-size: 2em">Iterators and Generators</b><br/>
    <br/>
    <font>Вадим Мазаев</font><br/>
</center>

# Iterators

In [None]:
for i in range(5):
    pass
    
for line in open('file.txt'):
    pass

for key in {'A' : 1, 'B' : 2, 'C' : 3}:
    pass
    
for letter in 'Hello, World':
    pass

In [1]:
iterable = [1, 2, 3]
iterator = iterable.__iter__()
iterator

<list_iterator at 0x7fc41e383c40>

In [2]:
iterator.__next__()

1

In [3]:
iterator.__next__()

2

In [4]:
iterator.__next__()

3

In [5]:
iterator.__next__()

StopIteration: 

### [Iterable](https://docs.python.org/3/glossary.html#term-iterable)

An object that defines an `__iter__` method

returning an **iterator**.

Examples: `list`, `dict`, `range`

### [Iterator](https://docs.python.org/3/glossary.html#term-iterator)

An object that defines both `__iter__` and `__next__`.

The `__iter__` method must return the iterator itself (`self`).

The `__next__` method must return the next element,

or raise `StopIteration` when no elements remain.
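
The two definitions can be checked with the abstract base classes from `collections.abc`. This is a minimal sketch; the variable names are illustrative only.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
numbers_iterator = iter(numbers)

# A list is iterable, but it is not its own iterator.
print(isinstance(numbers, Iterable), isinstance(numbers, Iterator))

# A list iterator satisfies both protocols.
print(isinstance(numbers_iterator, Iterable), isinstance(numbers_iterator, Iterator))

# As required of an iterator, __iter__ returns the iterator itself.
print(iter(numbers_iterator) is numbers_iterator)
```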

## [iter](https://docs.python.org/3/library/functions.html#iter) & [next](https://docs.python.org/3/library/functions.html#next)

In [6]:
iterator = iter([1])
iterator

<list_iterator at 0x7fc41c2c6350>

In [7]:
next(iterator)

1

In [8]:
next(iterator)

StopIteration: 

In [9]:
next(iterator, 'some value')

'some value'

## The second form of the iter function

In [10]:
import io

stream = io.StringIO('abcdefghi')

def read3() -> str:
    return stream.read(3)

In [11]:
iter(read3, '')  # every __next__ translates to __call__

<callable_iterator at 0x7fc41c2c5030>

In [12]:
for chunk in iter(read3, ''):  # iter(callable, sentinel)
    print(chunk, end=' ')

abc def ghi 

In [13]:
from collections.abc import Callable

def make_timer(ticks: int) -> Callable[[], int]:
    
    def timer() -> int:
        nonlocal ticks
        ticks -= 1
        return ticks

    return timer

In [14]:
timer = make_timer(2)

In [15]:
timer()

1

In [16]:
timer()

0

In [17]:
for i in iter(make_timer(10), -1):
    print(i, end=' ')

9 8 7 6 5 4 3 2 1 0 
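
The same `iter(callable, sentinel)` form works with any zero-argument callable, so a wrapper function is not strictly needed. A small sketch, assuming `functools.partial` is an acceptable way to bind the chunk size:

```python
import io
from functools import partial

stream = io.StringIO('abcdefgh')

# partial(stream.read, 3) is a zero-argument callable;
# '' is what read() returns at end of stream, so it serves as the sentinel.
chunks = list(iter(partial(stream.read, 3), ''))
print(chunks)
```

The last chunk may be shorter than the requested size; iteration stops only when the callable returns exactly the sentinel value.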

## Implementing a for loop with while

In [None]:
for value in sequence:
    ...

In [None]:
iterator = iter(sequence)
while True:
    try:
        value = next(iterator)
    except StopIteration:
        break
    else:
        ...

## Iterators keep state

In [18]:
iterable = range(10)
iterable

range(0, 10)

In [19]:
iterator = iter(iterable)
iterator

<range_iterator at 0x7fc41c2c8180>

In [20]:
zip_iterator = zip(iterator, iterator)
zip_iterator

<zip at 0x7fc40c5e20c0>

In [21]:
for pair in zip_iterator:
    print(pair, end=' ')

(0, 1) (2, 3) (4, 5) (6, 7) (8, 9) 
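
Because every argument of `zip` here is the same stateful iterator, the trick generalizes to groups of any size. The helper name `grouped` below is hypothetical, and this sketch silently drops an incomplete trailing group.

```python
from collections.abc import Iterable, Iterator
from typing import TypeVar

T = TypeVar('T')

def grouped(iterable: Iterable[T], size: int) -> Iterator[tuple[T, ...]]:
    """Yield consecutive tuples of `size` elements taken from one shared iterator."""
    iterator = iter(iterable)
    # The list holds `size` references to the SAME iterator,
    # so each tuple position pulls the next element from it.
    return zip(*[iterator] * size)

triples = list(grouped(range(10), 3))
print(triples)
```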

## An object and its iterator

In [22]:
iterable = open('IteratorsGenerators.ipynb')
iterable

<_io.TextIOWrapper name='IteratorsGenerators.ipynb' mode='r' encoding='UTF-8'>

In [23]:
iter(iterable)

<_io.TextIOWrapper name='IteratorsGenerators.ipynb' mode='r' encoding='UTF-8'>

In [24]:
iterable is iter(iterable)

True

In [None]:
class TextIOWrapper:
    ...
    
    def __iter__(self) -> 'TextIOWrapper':
        return self

    ...

In [25]:
class FilelikeRange:
    def __init__(self, start: int, stop: int) -> None:
        self._index = start
        self._stop = stop

    def __iter__(self) -> 'FilelikeRange':
        return self

    def __next__(self) -> int:
        if self._index >= self._stop:
            raise StopIteration()
        value = self._index
        self._index += 1
        return value

In [26]:
for i in FilelikeRange(0, 5):
    print(i, end=' ')

0 1 2 3 4 

In [27]:
iterable = [1, 2, 3, 4, 5]
iterable

[1, 2, 3, 4, 5]

In [28]:
iter(iterable)

<list_iterator at 0x7fc41c2c5180>

In [29]:
iterable is iter(iterable)

False

In [30]:
class ListlikeRange:
    class Iterator:
        def __init__(self, start: int, stop: int) -> None:
            self._index = start
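            self._stop = stop

        def __iter__(self) -> 'ListlikeRange.Iterator':
            return self  # an iterator must return itself from __iter__

        def __next__(self) -> int: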
            if self._index >= self._stop:
                raise StopIteration()
            value = self._index
            self._index += 1
            return value

    def __init__(self, start: int, stop: int) -> None:
        self._start = start
        self._stop = stop

    def __iter__(self) -> Iterator:
        return self.Iterator(self._start, self._stop)

In [31]:
for i in ListlikeRange(0, 5):
    print(i, end=' ') 

0 1 2 3 4 

## Exhaustion

In [32]:
filelike_range = FilelikeRange(1, 5)
listlike_range = ListlikeRange(1, 5)

In [33]:
for elem in filelike_range:
    print(elem, end=' ')

1 2 3 4 

In [34]:
for elem in filelike_range:
    print(elem, end=' ')

In [35]:
for elem in listlike_range:
    print(elem, end=' ')

1 2 3 4 

In [36]:
for elem in listlike_range:
    print(elem, end=' ')

1 2 3 4 
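
One way to tell the two kinds of objects apart is to check whether `iter(obj)` returns the object itself. The helper `is_one_shot` is a hypothetical name, and the check is only a heuristic: it assumes the object follows the usual iterator convention.

```python
def is_one_shot(obj) -> bool:
    """Return True if obj is its own iterator and so can be traversed only once."""
    return iter(obj) is obj

values = [1, 2, 3]
print(is_one_shot(values))             # a list creates a fresh iterator each time
print(is_one_shot(iter(values)))       # a list iterator is its own iterator
print(is_one_shot(x for x in values))  # so is a generator
```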

## [Sequence](https://docs.python.org/3/glossary.html#term-sequence) as an iterable

In [37]:
from typing import TypeVar

T = TypeVar('T')

class Sequence:
    def __init__(self, *args: T) -> None:
        self._args = args
        
    def __len__(self) -> int:
        return len(self._args)

    def __getitem__(self, index: int) -> T:
        if index < 0 or index >= len(self):
            raise IndexError(index)  # the for loop relies on IndexError to detect the end of the sequence
        return self._args[index]

In [38]:
seq = Sequence(1, 2, 3, 4, 5)
seq[0], seq[2], seq[4]

(1, 3, 5)

In [39]:
for i in seq:
    print(i, end=' ')

1 2 3 4 5 

## [\_\_contains__](https://docs.python.org/3/reference/datamodel.html#object.__contains__)

In [40]:
3 in range(5)

True

In [None]:
# https://docs.python.org/3.11/reference/expressions.html#membership-test-details
# without a user-defined __contains__, the membership test behaves roughly like this
from typing import Any

def __contains__(self, value: Any) -> bool:
    for item in self:
        if item is value or item == value:
            return True
    return False

In [None]:
class MyRange:
    def __contains__(self, value: int) -> bool:
        return 0 <= value < self._stop
    
    ...

In [41]:
seq = Sequence(2, 3, 5, 8, 13, 21)

In [42]:
for i in seq:
    print(i, end=' ')

2 3 5 8 13 21 

In [43]:
8 in seq  # object has no __contains__, so "in" uses iteration over __getitem__

True

### Some functions for working with iterators

### [enumerate](https://docs.python.org/3/library/functions.html#enumerate)

In [44]:
for i, char in enumerate('sample'):  # enumerate(iterable, start=index)
    print(i, char)

0 s
1 a
2 m
3 p
4 l
5 e


### [zip](https://docs.python.org/3/library/functions.html#zip)

In [45]:
for left, right in zip('ABCD', 'xy'):
    print(left + right)

Ax
By


In [46]:
from itertools import zip_longest

for left, right in zip_longest('ABCD', 'xy', fillvalue='-'):
    print(left + right)

Ax
By
C-
D-


### Inverting zip

In [47]:
x = [1, 2, 3]
y = [4, 5, 6]
zipped = zip(x, y)

In [48]:
for item in zipped:
    print(item, end=' ')

(1, 4) (2, 5) (3, 6) 

In [49]:
zipped = zip(x, y)

In [50]:
x2, y2 = zip(*zipped)
print(x2, y2)

(1, 2, 3) (4, 5, 6)


### [map](https://docs.python.org/3/library/functions.html#map) & [filter](https://docs.python.org/3/library/functions.html#filter)

In [51]:
for squared in map(lambda x: x ** 2, range(5)):
    print(squared, end=' ')

0 1 4 9 16 

In [52]:
for filtered in filter(lambda x: x % 2 == 0, range(10)):
    print(filtered, end=' ')

0 2 4 6 8 

### [itertools](https://docs.python.org/3/library/itertools.html) & [more](https://more-itertools.readthedocs.io/en/stable/)

### [itertools.chain](https://docs.python.org/3/library/itertools.html#itertools.chain)

In [53]:
from itertools import chain

In [54]:
for elem in chain(range(5), [10, 20], 'sample', [[i] for i in range(5)]):
    print(elem, end=' ')

0 1 2 3 4 10 20 s a m p l e [0] [1] [2] [3] [4] 

In [55]:
from typing import Any

def repeat(times: int, obj: Any) -> list[Any]:
    return [obj] * times

In [56]:
list(repeat(5, [1, 2, 3]))

[[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]

In [57]:
list(chain.from_iterable(repeat(5, [1, 2, 3])))

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

### How to check whether an object is iterable (reliably)

In [None]:
try:
    iter(object_to_test)
except TypeError:
    # not an iterable
    ...
else:
    # iterable
    ...
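
The `try`/`iter` check matters because `isinstance(obj, Iterable)` only looks for `__iter__` and misses classes that are iterable through `__getitem__` alone. A sketch with a hypothetical `Legacy` class and a hypothetical `is_iterable` helper:

```python
from collections.abc import Iterable

class Legacy:
    """Hypothetical class that supports iteration only through __getitem__."""

    def __getitem__(self, index: int) -> int:
        if index >= 3:
            raise IndexError(index)
        return index * 10

def is_iterable(obj: object) -> bool:
    """Ask iter() directly instead of inspecting the type."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True

legacy = Legacy()
print(isinstance(legacy, Iterable))       # the ABC check does not see __getitem__
print(is_iterable(legacy), list(legacy))  # iter() still works
print(is_iterable(42))
```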

### [itertools.tee](https://docs.python.org/3/library/itertools.html#itertools.tee)

In [58]:
from itertools import tee

![tee](http://4.bp.blogspot.com/-u_KYBwIUyF4/UUR5cvbv6PI/AAAAAAAAAXs/hPJT0ZR5iBc/s1600/tee_diagram.png)

In [59]:
iterator1, iterator2 = tee(range(3), 2)

In [60]:
for elem in iterator1:
    print(elem, end=' ')

for elem in iterator2:
    print(elem, end=' ')

0 1 2 0 1 2 
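
A typical use of `tee` is walking one iterable with two cursors at different offsets. The function below is only a sketch named `pairwise_sketch`; since Python 3.10 the standard library provides `itertools.pairwise` for this purpose.

```python
from itertools import tee

def pairwise_sketch(iterable):
    """Yield overlapping pairs (s0, s1), (s1, s2), ... from any iterable."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one element
    return zip(first, second)

pairs = list(pairwise_sketch('ABCD'))
print(pairs)
```

After calling `tee`, the original iterator should not be advanced directly, otherwise the copies would miss elements.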

### [itertools.groupby](https://docs.python.org/3/library/itertools.html#itertools.groupby)

In [61]:
from itertools import groupby

In [62]:
for key, group in groupby('AABBCCDAAB'): 
    print(key, list(group))

A ['A', 'A']
B ['B', 'B']
C ['C', 'C']
D ['D']
A ['A', 'A']
B ['B']


In [63]:
words = ['cab', 'face', 'cafe', 'abc', 'goo']
words = sorted(words, key=sorted)
words

['cab', 'abc', 'face', 'cafe', 'goo']

In [64]:
for key, group in groupby(words, key=sorted): 
    print(','.join(key), list(group))

a,b,c ['cab', 'abc']
a,c,e,f ['face', 'cafe']
g,o,o ['goo']
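
`groupby` only merges adjacent elements with equal keys, which is why the word list was sorted by the same key first. The sketch below compares both cases on an illustrative word list:

```python
from itertools import groupby

words = ['cab', 'face', 'goo', 'abc', 'cafe']

# Without sorting, equal keys that are not adjacent end up in separate groups.
unsorted_groups = [
    (''.join(key), list(group))
    for key, group in groupby(words, key=sorted)
]

# Sorting by the grouping key first puts equal keys next to each other.
sorted_groups = [
    (''.join(key), list(group))
    for key, group in groupby(sorted(words, key=sorted), key=sorted)
]

print(len(unsorted_groups), len(sorted_groups))
```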


# Generators

In [1]:
from collections.abc import Iterator

def countdown(n: int) -> Iterator[int]:
    print(f'Counting down from {n}')
    for i in range(n, 0, -1):
        yield i
    print('Done')

In [2]:
for i in countdown(5):
    print(i)

Counting down from 5
5
4
3
2
1
Done


In [3]:
countdown

<function __main__.countdown(n: int) -> collections.abc.Iterator[int]>

In [4]:
counter = countdown(10)
counter

<generator object countdown at 0x7f0f49e3b5a0>

In [5]:
iter(counter) is counter

True

In [6]:
counter = countdown(2)

In [7]:
next(counter)

Counting down from 2


2

In [8]:
next(counter)

1

In [9]:
next(counter)

Done


StopIteration: 

### [Generator](https://docs.python.org/3/glossary.html#term-generator)

A special kind of iterator obtained

by calling a function that contains the `yield` keyword.

The sequence of values the generator produces

is determined by the order in which the `yield` expressions in the function body are executed.
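
Since values are produced only on demand, a generator function may even describe an infinite sequence. A minimal sketch (the `fibonacci` function is an illustration, not part of the lecture code), using `itertools.islice` to take a finite prefix:

```python
from collections.abc import Iterator
from itertools import islice

def fibonacci() -> Iterator[int]:
    """Yield Fibonacci numbers forever; the caller decides when to stop."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

first_ten = list(islice(fibonacci(), 10))
print(first_ten)
```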

### Examples

In [10]:
def squares(size: int) -> Iterator[int]:
    for i in range(size):
        yield i ** 2

In [11]:
generator = squares(5)

In [12]:
for elem in generator:
    print(elem, end=' ')

0 1 4 9 16 

In [13]:
for elem in generator:
    print(elem, end=' ')

Generators get exhausted!

In [14]:
from collections.abc import Iterable, Iterator
from typing import TypeVar

T = TypeVar('T')

def unique_ordered(elements: Iterable[T]) -> Iterator[T]:
    seen = set()
    for elem in elements:
        if elem in seen:
            continue
        seen.add(elem)
        yield elem

In [15]:
for elem in unique_ordered([1, 2, 3, 1, 2, 4]):
    print(elem, end=' ')

1 2 3 4 

### Chaining generators

In [16]:
def sum_of_squares_of_even(iterable: Iterable[int]) -> int:
    sum_ = 0
    for i in iterable:
        if i % 2 != 0:
            continue
        sum_ += i ** 2
    return sum_

In [17]:
sum_of_squares_of_even(range(10))

120

In [18]:
def even(iterable: Iterable[int]) -> list[int]:
    result = []
    for i in iterable:
        if i % 2 != 0:
            continue
        result.append(i)
    return result

In [19]:
def squares(iterable: Iterable[int]) -> list[int]:
    result = []
    for i in iterable:
        result.append(i ** 2)
    return result

In [20]:
sum(squares(even(range(10))))

120

In [21]:
def even(iterable: Iterable[int]) -> Iterator[int]:
    for elem in iterable:
        if elem % 2 == 0:
            yield elem

In [22]:
def squares(iterable: Iterable[int]) -> Iterator[int]:
    for elem in iterable:
        yield elem ** 2

In [23]:
sum(squares(even(range(10))))

120

Chaining generators makes it easy to decompose an algorithm

without significant memory overhead.
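
The memory savings come from the fact that each element travels through the whole chain before the next one is read. The sketch below records the order of events in a hypothetical `log` list to make this visible:

```python
from collections.abc import Iterable, Iterator

log: list[str] = []

def even(iterable: Iterable[int]) -> Iterator[int]:
    for elem in iterable:
        if elem % 2 == 0:
            log.append(f'even passes {elem}')
            yield elem

def squares(iterable: Iterable[int]) -> Iterator[int]:
    for elem in iterable:
        log.append(f'squares gets {elem}')
        yield elem ** 2

total = sum(squares(even(range(5))))
print(total)
print(log)  # the stages interleave: no intermediate list is ever built
```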

### Generator expressions ([generator expression](https://docs.python.org/3/glossary.html#term-generator-expression))

In [24]:
squares = (x ** 2 for x in range(5))
squares

<generator object <genexpr> at 0x7f0f4810ecf0>

In [25]:
for square in squares:
    print(square, end=' ')

0 1 4 9 16 

In [26]:
max(x for x in range(10_000_000_000) if x % 11 == 0)

9999999999

In [None]:
max([x for x in range(10_000_000_000) if x % 11 == 0])  # ~20G RAM

In [27]:
import sys

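int_size_bytes = sys.getsizeof(0)  # size of a small int object; larger ints take more
int_count = 10_000_000_000 / 11  # approximate number of multiples of 11 in the range
list_size_bytes = int_size_bytes * int_count  # rough lower bound: ignores the list's own pointer array
list_size_gigabytes = list_size_bytes / (1024 ** 3)
list_size_gigabytes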

20.319765264337715
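
The difference can also be observed directly on a smaller range. This is a rough sketch: `sys.getsizeof` reports only the container itself (for a list, its pointer array, not the integers it refers to), and exact numbers depend on the interpreter version.

```python
import sys

as_list = [x ** 2 for x in range(100_000)]
as_generator = (x ** 2 for x in range(100_000))

list_bytes = sys.getsizeof(as_list)            # grows with the number of elements
generator_bytes = sys.getsizeof(as_generator)  # small and independent of the range length

print(list_bytes, generator_bytes)
print(sum(as_generator) == sum(as_list))  # both produce the same values
```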

### Generators as iterators

In [26]:
from collections.abc import Iterator
from dataclasses import dataclass

@dataclass
class BinaryTreeNode:
    value: int
    left: 'BinaryTreeNode | None' = None
    right: 'BinaryTreeNode | None' = None

    def __iter__(self) -> Iterator[int]:  # in-order
        for value in (self.left or ()):
            yield value

        yield self.value

        for value in (self.right or ()):
            yield value

In [27]:
tree = BinaryTreeNode(
    left=BinaryTreeNode(
        left=BinaryTreeNode(value=1),
        value=2,
    ),
    value=3,
    right=BinaryTreeNode(
        value=4,
        right=BinaryTreeNode(value=5),
    ),
)

In [28]:
for value in tree:
    print(value, end=' ')

1 2 3 4 5 

<img src="https://i.imgur.com/mM6OdDr.png" alt="agents-of-yield" width=900/>

#### Image source: https://speakerdeck.com/hollodotme/marvelous-agents-of-yield

## Generators: advanced usage

#### Inspired by: http://dabeaz.com/finalgenerator/

In [1]:
from collections.abc import Iterator, Generator

In [2]:
def create_generator() -> Iterator[int]:
    yield 5

In [3]:
def create_generator() -> Generator[int, None, None]:
    yield 5

In [4]:
def create_duplicator() -> Generator[int, int, None]:
    print('Give me a value, please')
    value = yield
    print(f'Got value: {value}')
    yield value * 2
    print('Finished')

In [5]:
duplicator = create_duplicator()
next(duplicator)

Give me a value, please


In [6]:
duplicator.send(21)

Got value: 21


42

In [7]:
duplicator.send(100500)

Finished


StopIteration: 
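
A slightly more practical use of `send` is a generator that keeps state between calls. The `running_average` function below is a hypothetical illustration; as with the example above, the generator has to be primed with `next` before the first `send`.

```python
from collections.abc import Generator
from typing import Optional

def running_average() -> Generator[Optional[float], float, None]:
    """Receive numbers via send() and yield the average of everything seen so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average  # hand out the current average, wait for the next number
        total += value
        count += 1
        average = total / count

averager = running_average()
next(averager)  # prime: run the generator up to its first yield
averages = [averager.send(value) for value in (10, 20, 60)]
print(averages)
```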

### [yield as an expression](https://docs.python.org/3/reference/simple_stmts.html#yield)

![yield-expr](https://i0.wp.com/storage.googleapis.com/ssivart/super9-blog/priming-generator.png?w=1200&ssl=1)

In [8]:
def jumping_counter(upto: int) -> Generator[int, int, None]:
    count = 1
    while count <= upto:
        jump = yield count
        count += jump or 1

In [9]:
generator = jumping_counter(3)

In [10]:
next(generator)  # equivalent to .send(None)

1

In [11]:
generator.send(2)

3

In [12]:
next(generator)

StopIteration: 

### [throw](https://docs.python.org/3/reference/expressions.html#generator.throw)

In [13]:
generator = jumping_counter(5)

In [14]:
next(generator)

1

In [15]:
generator.throw(Exception('Good luck!'))

Exception: Good luck!

### [close](https://docs.python.org/3/reference/expressions.html#generator.close)

In [16]:
generator = jumping_counter(5)

In [17]:
next(generator)

1

In [18]:
generator.close()

In [19]:
next(generator)

StopIteration: 

### Handling close

In [20]:
def create_generator() -> Iterator[int]:
    while True:
        try:
            yield 42
        except GeneratorExit:  # must not be swallowed: yielding again here raises RuntimeError
            print('Exiting...')
            return

In [21]:
generator = create_generator()

In [22]:
next(generator)

42

In [23]:
generator.close()

Exiting...
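
When the goal is only to release a resource, `try`/`finally` is usually enough and `GeneratorExit` need not be caught explicitly. A sketch with a hypothetical `read_lines` generator that records its lifecycle in an `events` list:

```python
events: list[str] = []

def read_lines(lines):
    """Hypothetical generator that 'opens' a resource and always cleans it up."""
    events.append('open')
    try:
        for line in lines:
            yield line
    finally:
        # Runs on normal exhaustion and also when close() raises GeneratorExit inside.
        events.append('cleanup')

reader = read_lines(['a', 'b', 'c'])
first = next(reader)
reader.close()  # stop early; the finally block still runs
print(first, events)
```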


### [@contextmanager](https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager)

In [24]:
from contextlib import contextmanager
import tempfile
import shutil

In [25]:
@contextmanager
def tempdir():
    dirname = tempfile.mkdtemp()
    try:
        yield dirname
    finally:
        shutil.rmtree(dirname)

In [26]:
with tempdir() as path:
    print(path)

/tmp/tmplslncylo


In [None]:
# Simplified sketch; for the precise implementation see:
# https://github.com/python/cpython/blob/b3f0ceae919c1627094ff628c87184684a5cedd6/Lib/contextlib.py#L142

class _GeneratorContextManager:
    def __init__(self, func, args, kwargs):
        self.gen = func(*args, **kwargs)
    
    def __enter__(self):
        return next(self.gen)

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if exc_type is None:
            try:
                next(self.gen)
            except StopIteration:
                return False
            raise RuntimeError("generator didn't stop")
        else:
            try:
                # the three-argument throw(type, value, traceback) form is deprecated since Python 3.12
                self.gen.throw(exc_value)
            except StopIteration as stop:
                # the generator handled the exception and finished: suppress it
                return stop is not exc_value
            except BaseException as exc:
                if exc is not exc_value:
                    raise  # the generator raised a different exception
                return False
            raise RuntimeError("generator didn't stop after throw()")

def contextmanager(func):
    def helper(*args, **kwargs):
        return _GeneratorContextManager(func, args, kwargs)
    return helper
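
The exception path can be observed with the real `contextlib.contextmanager`. In the sketch below, the hypothetical `tracked` context manager catches a `ValueError` thrown into it at the `yield`, so the `with` block suppresses the exception.

```python
from contextlib import contextmanager

events: list[str] = []

@contextmanager
def tracked(name: str):
    events.append(f'enter {name}')
    try:
        yield name
    except ValueError as error:
        # The exception raised inside the with block is re-raised here, at the yield.
        events.append(f'suppressed {error}')
    finally:
        events.append(f'exit {name}')

with tracked('block'):
    raise ValueError('boom')

print(events)
```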

### [yield from](https://peps.python.org/pep-0380/)

In [27]:
from collections.abc import Iterable, Iterator
from typing import TypeVar

T = TypeVar('T')

def repeat(times: int, iterable: Iterable[T]) -> Iterator[T]:
    for _ in range(times):
        yield from iterable  # https://www.python.org/dev/peps/pep-0380/

In [28]:
for elem in repeat(5, [1, 2, 3]):
    print(elem, end=' ')

1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 

In [29]:
def repeat(times: int, iterable: T) -> Iterator[T]:
    for _ in range(times):
        yield iterable

In [30]:
for elem in repeat(5, [1, 2, 3]):
    print(elem, end=' ')

[1, 2, 3] [1, 2, 3] [1, 2, 3] [1, 2, 3] [1, 2, 3] 

In [31]:
from collections.abc import Iterator
from dataclasses import dataclass

@dataclass
class BinaryTreeNode:
    value: int
    left: 'BinaryTreeNode | None' = None
    right: 'BinaryTreeNode | None' = None

    def __iter__(self) -> Iterator[int]:
        yield from self.left or ()
        yield self.value
        yield from self.right or ()

In [32]:
tree = BinaryTreeNode(
    left=BinaryTreeNode(
        left=BinaryTreeNode(value=1),
        value=2,
    ),
    value=3,
    right=BinaryTreeNode(
        value=4,
        right=BinaryTreeNode(value=5),
    ),
)

In [35]:
for value in tree:
    print(value, end=' ')

1 2 3 4 5 
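
The same recursive delegation works for any nested structure. A sketch with a hypothetical `flatten` function that treats only `list` instances as nested containers:

```python
from collections.abc import Iterator
from typing import Any

def flatten(nested: list[Any]) -> Iterator[Any]:
    """Yield the leaves of arbitrarily nested lists, left to right."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # delegate to the generator for the sub-list
        else:
            yield item

flat = list(flatten([1, [2, [3, 4]], 5]))
print(flat)
```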

### [return in generators](https://peps.python.org/pep-0255/#specification-return)

In [33]:
def create_generator() -> Generator[int, None, int]:
    yield 42
    return 21

In [34]:
generator = create_generator()
next(generator)
next(generator)

StopIteration: 21

In [35]:
def generator_wrapper():
    result = yield from create_generator()
    print(result)

In [36]:
list(generator_wrapper())

21


[42]
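
Without `yield from`, the returned value can still be read from the `value` attribute of the `StopIteration` exception. A minimal sketch that drives the generator by hand:

```python
from collections.abc import Generator

def create_generator() -> Generator[int, None, int]:
    yield 42
    return 21

generator = create_generator()
yielded = []
while True:
    try:
        yielded.append(next(generator))
    except StopIteration as stop:
        returned = stop.value  # the value passed to return
        break

print(yielded, returned)
```

A plain `for` loop or `list()` discards this value, which is why it has to be captured explicitly.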

## Generators as coroutines

### Multitasking

![multitasking](https://miro.medium.com/max/1580/1*r94wLYporfXxgaIakEfBIA.png)

What if we use a generator as a coroutine?

A generator already has the interface we need:

In [37]:
def pinger() -> Generator[None, None, None]:
    while True:
        print('ping')
        yield

In [38]:
def ponger() -> Generator[None, None, None]:
    while True:
        print('pong')
        yield

Let's write a simple task scheduler:

In [39]:
from collections import deque
from time import sleep

In [None]:
def run(*tasks: Generator[None, None, None]) -> None:
    try:
        ...  # scheduler code goes here
    except KeyboardInterrupt:
        return

In [None]:
def run(*tasks: Generator[None, None, None]) -> None:
    task_queue = deque(tasks)
    try:
        while task_queue:
            sleep(0.5)
            task = task_queue.popleft()
            try:
                task.send(None)
            except StopIteration:
                continue
            task_queue.append(task)
    except KeyboardInterrupt:
        return

In [None]:
run(pinger(), ponger())
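
The round-robin behaviour is easier to inspect with tasks that finish. The sketch below is a variant of the scheduler without `sleep`; the names `worker`, `run_until_done` and `output` are hypothetical, and the interleaving is recorded in a list instead of being printed.

```python
from collections import deque
from collections.abc import Generator

output: list[str] = []

def worker(name: str, steps: int) -> Generator[None, None, None]:
    """A finite task: do one step, then give control back to the scheduler."""
    for step in range(steps):
        output.append(f'{name}:{step}')
        yield

def run_until_done(*tasks: Generator[None, None, None]) -> None:
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # run the task until its next yield
        except StopIteration:
            continue    # the task has finished, do not reschedule it
        queue.append(task)

run_until_done(worker('a', 2), worker('b', 3))
print(output)
```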

This idea can be taken further: a coroutine can yield

a special object describing an action that the scheduler

should perform.

For example, create a new task:

In [None]:
def spawner() -> Generator[Task, Any, None]:
    print('Spawn new task')
    task_id = yield NewTask(pinger())
    for _ in range(5):
        print('tick')
        yield
    yield KillTask(task_id)
    print('Task killed')

This is exactly what you will need to implement in the `pyos` homework assignment.

# Thank you for your attention!