# Python Iterator Tutorial
References: 
- https://www.datacamp.com/community/tutorials/python-iterator-tutorial

Iterators are very useful when we need to process a large amount of data but do not need access to all of it at the same time.

Keywords:
- Lazy factory

![](images/iterator.png)

## Iterator & Iterable
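An iterable is an object that can be iterated over (a list, a set, etc.). It usually defines `__iter__`, which returns an iterator that hands back the iterable's elements one at a time in a memory-efficient way. An iterator is an object with a `__next__` method (the full iterator protocol also expects an `__iter__` that returns the iterator itself). A common pattern is therefore for a class to define both `__iter__` and `__next__` and have `__iter__` return `self`, so that the object serves as its own iterator.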

In [1]:
a_set = {1, 2, 3}
b_iterator = iter(a_set) 
next(b_iterator)

1

In [2]:
type(a_set)

set

In [3]:
type(b_iterator)

set_iterator
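
To make the roles of `__iter__` and `__next__` more concrete, here is a rough sketch of what a `for` loop does with them. The helper name `manual_for` is hypothetical and exists only for illustration.

```python
def manual_for(iterable):
    """Collect the items of `iterable` the way a for loop walks it (hypothetical helper)."""
    collected = []
    iterator = iter(iterable)        # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)    # calls iterator.__next__()
        except StopIteration:        # raised once the iterator is exhausted
            break
        collected.append(item)
    return collected


numbers = [1, 2, 3]
numbers_iterator = iter(numbers)

# A list is iterable but is not itself an iterator: it has no __next__.
list_is_iterator = hasattr(numbers, '__next__')

# An iterator returns itself from __iter__.
iterator_returns_self = iter(numbers_iterator) is numbers_iterator

print(manual_for(numbers))
print(list_is_iterator, iterator_returns_self)
```

The list can be walked any number of times because each `iter()` call hands out a fresh iterator, whereas an iterator only ever hands out itself.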

## Iterators can save a lot of memory

In [1]:
from sys import getsizeof

In [6]:
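l_gen = (x for x in range(100000))
# Note: star-unpacking consumes the generator and collects the middle values into the list `_`
head, *_, tail = l_gen
print(f'head: {head}, tail: {tail}')
# getsizeof reports only the generator object itself, not the values it produces
print(f'Size of l_gen: {getsizeof(l_gen)} bytes')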

head: 0, tail: 99999
Size of l_gen: 88 bytes


In [8]:
l = [x for x in range(100000)]
head, *_, tail = l
print(f'head: {head}, tail: {tail}')
print(f'Size of l: {getsizeof(l)} bytes')

head: 0, tail: 99999
Size of l: 824464 bytes
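
The saving only holds if the values are consumed one at a time. The following is a minimal sketch of consuming a generator without ever building a list; the variable names are illustrative, and the exact byte counts depend on the Python version and platform, so none are shown.

```python
from sys import getsizeof

n = 100_000
lazy_squares = (x * x for x in range(n))    # nothing is computed yet
eager_squares = [x * x for x in range(n)]   # all n values are stored at once

gen_size = getsizeof(lazy_squares)
list_size = getsizeof(eager_squares)

# sum() pulls one value at a time, so only a single square is alive at any moment.
total = sum(lazy_squares)

# The generator is now exhausted and produces nothing more.
leftover = list(lazy_squares)

print(f'generator object is smaller than the list: {gen_size < list_size}')
print(f'values left after sum(): {leftover}')
```

The trade-off is that a generator can be walked only once, while the list can be indexed and reused.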


## Custom iterable

In [10]:
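class Series:
    """Iterable over the integers from low to high (inclusive) that is its own iterator."""

    def __init__(self, low, high):
        self.current = low
        self.high = high

    def __iter__(self):
        # The object serves as its own iterator
        return self

    def __next__(self):
        if self.current > self.high:
            # Signal that the series is exhausted
            raise StopIteration
        value = self.current
        self.current += 1
        return value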

In [11]:
n_list = Series(1, 10)
list(n_list)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

The iterable `n_list` was already exhausted by the `list()` call above, so calling `next()` on it now raises a `StopIteration` error.

In [14]:
iterator = iter(n_list)
next(iterator)

StopIteration: 

Creating a new `Series` iterable lets us iterate from the start again.

In [15]:
iterator = iter(Series(1, 10))
next(iterator)

1
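
If re-creating the object every time is inconvenient, one option is to keep the iterable and its iterator as separate classes, so that every `iter()` call starts from the beginning. This is a minimal sketch rather than part of the original tutorial; the class names `SeriesIterator` and `ReusableSeries` are hypothetical.

```python
class SeriesIterator:
    """Holds the position of one pass over the series."""

    def __init__(self, low, high):
        self.current = low
        self.high = high

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.high:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


class ReusableSeries:
    """Iterable only: it stores the bounds and hands out a fresh iterator each time."""

    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __iter__(self):
        # A new iterator per call, so one pass never affects the next.
        return SeriesIterator(self.low, self.high)


series = ReusableSeries(1, 5)
first_pass = list(series)
second_pass = list(series)
print(first_pass)
print(second_pass)
```

Both passes produce the same values because the position lives in the iterator, not in the iterable.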

## Containers
Containers support membership tests, i.e., checking whether a value is present in, for example, a list.

```
1 in [1, 2, 3]  # True
```

In [16]:
if 1 in [1, 2, 3]:
    print('List')

if 4 not in (1, 2, 3):
    print('Tuple')

if 'apple' in 'pineapple':
    print('String')  # a string contains all of its substrings

List
Tuple
String
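
Membership tests also work on iterators, with one caveat: an object without a `__contains__` method falls back to iteration, so the test consumes items up to and including the match. A small sketch, with illustrative variable names:

```python
numbers = [1, 2, 3, 4]
numbers_iterator = iter(numbers)

found_in_list = 2 in numbers                # the list is left untouched
found_in_iterator = 2 in numbers_iterator   # advances the iterator past 2
remaining = list(numbers_iterator)          # only the items after the match are left

print(found_in_list, found_in_iterator, remaining)
```

This is why containers, rather than iterators, are the natural fit for repeated membership checks.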


## Itertools
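`itertools` helps us build many kinds of iterators quickly. Note that many of the iterators it creates can keep producing values forever, so we have to limit how many values we take from them ourselves.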

In [17]:
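from itertools import count
sequence = count(start=0, step=1)
# Call next() only once per value (here via the for loop) so that no number is skipped
for number in sequence:
    if number > 10:
        break
    print(number)

0
1
2
3
4
5
6
7
8
9
10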


In [19]:
from itertools import cycle
dessert = cycle(['Icecream','Cake'])
for _ in range(4):
    print('Q. What do we have for dessert? A: ' + next(dessert))

Q. What do we have for dessert? A: Icecream
Q. What do we have for dessert? A: Cake
Q. What do we have for dessert? A: Icecream
Q. What do we have for dessert? A: Cake
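
Rather than counting calls by hand, the limit can also be expressed with other `itertools` functions. The sketch below uses `islice` and `takewhile`, both part of the standard library; the variable names are illustrative.

```python
from itertools import count, cycle, islice, takewhile

# islice takes a fixed number of items from an infinite iterator.
first_five = list(islice(count(start=0, step=1), 5))

# takewhile stops at the first item that fails the condition.
up_to_ten = list(takewhile(lambda n: n <= 10, count(start=0, step=1)))

# The same idea applied to cycle.
desserts = list(islice(cycle(['Icecream', 'Cake']), 4))

print(first_five)
print(up_to_ten)
print(desserts)
```

`islice` fits when the number of items is known in advance, and `takewhile` when the stopping point depends on the values themselves.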


## Generators
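With `yield`, we no longer need to write the `__iter__` and `__next__` methods ourselves as we did for `class Series` above. `yield` works much like a regular `return`, except that the function's local variables are not discarded after the value is handed back; the next call resumes where the function left off, with those values intact.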

### Generator function

In [21]:
def series_generator(low, high):
    while low <= high:
        yield low
        low += 1

n_list = []
for num in series_generator(1,10):
    n_list.append(num)

print(n_list)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
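
To see directly that a generator keeps its local variables between calls, it can be driven by hand with `next()`. The sketch below uses a hypothetical `countdown` generator that is not part of the original tutorial.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1 (hypothetical example generator)."""
    remaining = start
    while remaining > 0:
        yield remaining    # pause here and hand back the value
        remaining -= 1     # execution resumes here on the next call to next()


gen = countdown(3)
first = next(gen)
second = next(gen)
third = next(gen)

# Once the function body finishes, the generator raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```

Each `next()` call runs the body only until the next `yield`, which is why `remaining` carries over from one call to the next.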


### Generator expression

In [22]:
squares = (x * x for x in range(1,10))
print(type(squares))
print(list(squares))

<class 'generator'>
[1, 4, 9, 16, 25, 36, 49, 64, 81]
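
Like any generator, a generator expression can be consumed only once, and it can be passed straight to a function that accepts an iterable. A short sketch with illustrative variable names:

```python
squares = (x * x for x in range(1, 10))

total = sum(squares)          # consumes the generator
second_pass = list(squares)   # nothing is left for a second pass

# When the expression is the only argument, the extra parentheses can be dropped.
largest = max(x * x for x in range(1, 10))

print(total, second_pass, largest)
```

If the values are needed more than once, a list comprehension or a fresh generator expression is the safer choice.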


A function that builds up a list and returns it, such as
```
def some_function():
    result = []
    for ... in ...:
        result.append(x)
    return result
```
can be replaced with a generator function:
```
def iterate_over():
    for ... in ...:
        yield x
```
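
As one concrete instance of this rewrite, the sketch below shows the same logic in both styles. The function names `even_squares_list` and `even_squares` are hypothetical and chosen only for this example.

```python
def even_squares_list(limit):
    """List-building version: all results are stored before returning."""
    result = []
    for n in range(limit):
        if n % 2 == 0:
            result.append(n * n)
    return result


def even_squares(limit):
    """Generator version: each result is produced only when requested."""
    for n in range(limit):
        if n % 2 == 0:
            yield n * n


from_list = even_squares_list(10)
from_generator = list(even_squares(10))
print(from_list)
print(from_generator)
```

Callers that really need a list can still wrap the generator in `list()`, while callers that only loop over the values avoid holding them all in memory.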