# Iterators

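In Python everything is an object. A class is the abstraction of an object, and a container is a collection of objects.

Lists (list: [0, 1, 2]), tuples (tuple: (0, 1, 2)), dictionaries (dict: {0:0, 1:1, 2:2}), and sets (set: set([0, 1, 2])) are all containers.

All of these containers are iterable.

Strictly speaking, an iterator is an object that provides a `__next__` method, which is called through the built-in next(). Each call either returns the next object from the container or raises a StopIteration exception (the apples are sold out).

An iterable object returns an iterator when passed to iter(); calling next() on that iterator repeatedly then walks through the elements.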
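
The two steps above are roughly what a `for` loop does behind the scenes. A minimal sketch, where `fruits`, `it`, and `seen` are illustrative names that do not come from the original notes:

```python
fruits = ['apple', 'banana', 'cherry']  # hypothetical sample data

it = iter(fruits)        # ask the iterable for an iterator
seen = []
while True:
    try:
        item = next(it)  # fetch the next element
    except StopIteration:
        break            # the iterator is exhausted
    seen.append(item)

print(seen)
```

Once the iterator is exhausted, every further next(it) call raises StopIteration again; to traverse the list a second time, call iter(fruits) to get a fresh iterator.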

In [1]:
def is_iterable(param):
    try: 
        iter(param)  # an alternative is isinstance(param, Iterable), with Iterable from collections.abc
        return True
    except TypeError:
        return False

params = [
    1234,
    '1234',
    [1, 2, 3, 4],
    set([1, 2, 3, 4]),
    {1:1, 2:2, 3:3, 4:4},
    (1, 2, 3, 4)
]
    
for param in params:
    print('{} is iterable? {}'.format(param, is_iterable(param)))

1234 is iterable? False
1234 is iterable? True
[1, 2, 3, 4] is iterable? True
{1, 2, 3, 4} is iterable? True
{1: 1, 2: 2, 3: 3, 4: 4} is iterable? True
(1, 2, 3, 4) is iterable? True
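
The isinstance check mentioned in the comment is not fully equivalent to calling iter(). As a sketch of the difference, the hypothetical class `Countdown` below defines only `__getitem__`: iter() accepts it through the older sequence protocol, while `isinstance(..., Iterable)` looks for an `__iter__` method and reports False.

```python
from collections.abc import Iterable

class Countdown:
    """Hypothetical class that supports indexing only, via __getitem__."""
    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        if index >= self.start:
            raise IndexError(index)  # signals the end of the sequence
        return self.start - index

c = Countdown(3)
by_isinstance = isinstance(c, Iterable)  # False: Countdown has no __iter__
by_iter = list(iter(c))                  # works: iter() falls back to __getitem__

print(by_isinstance, by_iter)
```

For the built-in types tested above, both approaches give the same answer, so the try/except version is simply the more permissive of the two.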


# Generators

In [2]:
import os
import psutil

# Show how much memory the current Python process is using
def show_memory_info(hint):
    pid = os.getpid()
    p = psutil.Process(pid)
    
    info = p.memory_full_info()
    memory = info.uss / 1024. / 1024
    print('{} memory used: {} MB'.format(hint, memory))

def test_iterator():
    show_memory_info('initing iterator')
    list_1 = [i for i in range(100000000)]
    show_memory_info('after iterator initiated')
    print(sum(list_1))
    show_memory_info('after sum called')

def test_generator():
    show_memory_info('initing generator')
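    # Unlike the fully built list in test_iterator(), a generator does not hold
    # all of its elements in memory; each value is produced only when requested.
    # Creating the generator also does no work up front, so test_generator()
    # skips the step of materializing 100 million elements. In the run below
    # this shows up as far lower memory use and a shorter wall time.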
    list_2 = (i for i in range(100000000)) 
    show_memory_info('after generator initiated')
    print(sum(list_2))
    show_memory_info('after sum called')

%time test_iterator()
%time test_generator()



initing iterator memory used: 149.546875 MB
after iterator initiated memory used: 1686.28125 MB
4999999950000000
after sum called memory used: 3755.953125 MB
CPU times: user 1.7 s, sys: 1.16 s, total: 2.86 s
Wall time: 3.31 s
initing generator memory used: 30.140625 MB
after generator initiated memory used: 30.140625 MB
4999999950000000
after sum called memory used: 30.34375 MB
CPU times: user 2.18 s, sys: 15.9 ms, total: 2.2 s
Wall time: 2.21 s
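
When psutil is not available, a rougher comparison is possible with the standard library alone. The sketch below uses `sys.getsizeof`, which reports only the size of the object itself (for a list, its internal pointer array, not the integers it refers to), so it understates the list's real footprint and is not a replacement for the process-level measurement above. The smaller `n` is an assumption chosen to keep the example quick.

```python
import sys

n = 1_000_000  # far smaller than the 100,000,000 used above

as_list = [i for i in range(n)]   # every element is created immediately
as_gen = (i for i in range(n))    # nothing is computed yet

list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small and independent of n

print(list_bytes, gen_bytes)
print(sum(as_list) == sum(as_gen))   # both produce the same values
```

Note that the generator is exhausted after the sum() call, whereas the list can be summed again.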


In [6]:
def generator(k):  # calling generator(k) returns a generator object
    i = 1
    while True:
        yield i ** k
        i += 1

gen_1 = generator(1)
gen_3 = generator(3)
print(gen_1)
print(gen_3)

def get_sum(n):
    sum_1, sum_3 = 0, 0
    for i in range(n):
        next_1 = next(gen_1)
        next_3 = next(gen_3)
        print('next_1 = {}, next_3 = {}'.format(next_1, next_3))
        sum_1 += next_1
        sum_3 += next_3
    print(sum_1 * sum_1, sum_3)

get_sum(8)

<generator object generator at 0x106ec0f40>
<generator object generator at 0x106ec0280>
next_1 = 1, next_3 = 1
next_1 = 2, next_3 = 8
next_1 = 3, next_3 = 27
next_1 = 4, next_3 = 64
next_1 = 5, next_3 = 125
next_1 = 6, next_3 = 216
next_1 = 7, next_3 = 343
next_1 = 8, next_3 = 512
1296 1296
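
The two equal numbers illustrate the identity (1 + 2 + ... + n)^2 = 1^3 + 2^3 + ... + n^3. Because gen_1 and gen_3 are created once at module level, a second call to get_sum would continue from where the first one stopped. A possible variant, sketched with the hypothetical names `powers` and `check_identity`, creates fresh generators on every call and uses `itertools.islice` to take the first n values:

```python
from itertools import islice

def powers(k):
    """Yield 1**k, 2**k, 3**k, ... without end (same idea as generator(k))."""
    i = 1
    while True:
        yield i ** k
        i += 1

def check_identity(n):
    # Fresh generators per call, so the result does not depend on earlier calls.
    sum_1 = sum(islice(powers(1), n))
    sum_3 = sum(islice(powers(3), n))
    return sum_1 ** 2, sum_3

print(check_identity(8))  # (1296, 1296)
print(check_identity(8))  # same result on a second call
```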


A container such as a list holds a finite collection of elements, whereas a generator can represent an infinite sequence.
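
An endless generator cannot be converted to a list directly, but it can be consumed lazily and cut off by a condition. A minimal sketch, assuming a hypothetical generator `squares` and using `itertools.takewhile` from the standard library:

```python
from itertools import takewhile

def squares():
    """Hypothetical endless generator: 1, 4, 9, 16, ..."""
    i = 1
    while True:
        yield i * i
        i += 1

# list(squares()) would never finish; takewhile stops at the first value >= 50.
small = list(takewhile(lambda x: x < 50, squares()))
print(small)  # [1, 4, 9, 16, 25, 36, 49]
```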

In [50]:
def is_subsequence(a, b):
    b = iter(b)
    return all(i in b for i in a)

print(is_subsequence([1, 3, 5], [1, 2, 3, 4, 5]))
print(is_subsequence([1, 4, 3], [1, 2, 3, 4, 5]))

True
False


In [81]:
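def generator1(a, b):
    # Expanded form of all(i in b for i in a): for each element of a,
    # advance the shared iterator b until a match is found.
    b = iter(b)
    for i in a:
        while True:
            try:
                if next(b) == i:
                    break         # found i; move on to the next element of a
            except StopIteration:
                return False      # b ran out before i was found
    return True

def is_subsequence1(a, b):
    return generator1(a, b)

print(is_subsequence1([1, 3, 5], [1, 2, 3, 4, 5]))
print(is_subsequence1([1, 4, 3], [1, 2, 3, 4, 5]))

True
False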


In [54]:
a = [1, 3, 1, 4, 3, 5]
a = iter(a)


print(3 in a, next(a))
print(4 in a)
print(5 in a)


True 1
True
True
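
The output above follows from the fact that `x in a` on an iterator advances it until it finds x or runs out, and the consumed elements are not revisited. As a sketch, the same steps with each result stored in an illustrative variable and the leftover elements inspected at the end:

```python
a = iter([1, 3, 1, 4, 3, 5])

first = 3 in a     # scans 1, 3 and stops; both are now consumed
second = next(a)   # the next unconsumed element is the second 1
third = 4 in a     # consumes 4
fourth = 5 in a    # consumes 3, 5
leftover = list(a) # nothing remains

print(first, second, third, fourth, leftover)  # True 1 True True []
```

This is the behavior is_subsequence relies on: each membership test resumes where the previous one stopped, so the elements of the first sequence must appear in the second in the same order.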
