In [1]:
import sys, time, wmi, psutil
SYSTEM_INFO = wmi.WMI().Win32_OperatingSystem()[0]
"system: {0}, {1}, {2}".format(SYSTEM_INFO.Caption, SYSTEM_INFO.BuildNumber, SYSTEM_INFO.OSArchitecture) 
"memory: {}G".format(round(psutil.virtual_memory().total / 1024**3, 2))
"cpu: {}".format(psutil.cpu_count())
"python: {}".format(sys.version)
time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time()))

'system: Microsoft Windows 10 教育版, 18363, 64 位'

'memory: 15.86G'

'cpu: 4'

'python: 3.7.1 (default, Oct 28 2018, 08:39:03) [MSC v.1912 64 bit (AMD64)]'

'2020-10-04 22:12:31'

- **@author**: run_walker
- **@references**:
    1. [Python3之迭代器，生成器](https://www.cnblogs.com/zhangyingai/p/7097944.html)
    2. [Python 迭代器与生成器](http://python.jobbole.com/84527/)
    3. [知乎 > 如何更好地理解Python迭代器和生成器？——赖明星](https://www.zhihu.com/question/20829330/answer/133606850)
    4. [知乎 > 如何更好地理解Python迭代器和生成器？——刘志军](https://www.zhihu.com/question/20829330/answer/213544776)

<img src="imgs/关系图.png" style="width:500px;height:300px;float:left">

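# Iterable objects
An object that implements the `__iter__` method is an iterable, and a `for` loop can traverse any iterable. (Objects that implement only the sequence-style `__getitem__` are also accepted by `iter()`, but `__iter__` is the usual route.)

**Common iterables**:
* Built-in data types: list, dict, tuple, set, str
* File objects
* Iterators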

In [2]:
hasattr(list, '__iter__')
hasattr(dict, '__iter__')
hasattr(tuple, '__iter__')
hasattr(set, '__iter__')
hasattr(str, '__iter__') 

True

True

True

True

True

In [3]:
# Note: Python 2's xrange was removed in Python 3, and range no longer returns a list but a lazy iterable object
hasattr(range, '__iter__') 

True
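The `hasattr` checks above can also be expressed with the abstract base classes in `collections.abc`. The sketch below is only an illustration: `Countdown` is a hypothetical class that is not part of the original notebook. It shows that an object with `__iter__` but no `__next__` counts as an iterable without being an iterator, and that `range` behaves the same way.

```python
from collections.abc import Iterable, Iterator


class Countdown:
    """Hypothetical iterable: defines __iter__ only, so it is not an iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Hand out a fresh iterator on every call
        return iter(range(self.start, 0, -1))


cd = Countdown(3)
is_iterable = isinstance(cd, Iterable)   # has __iter__
is_iterator = isinstance(cd, Iterator)   # lacks __next__
first_pass = list(cd)
second_pass = list(cd)                   # an iterable can be traversed again

r = range(5)
range_is_iterator = isinstance(r, Iterator)
range_info = (len(r), r[2], list(r), list(r))
print(is_iterable, is_iterator, first_pass, second_pass)
print(range_is_iterator, range_info)
```

Because `range` is an iterable rather than an iterator, it still supports `len()`, indexing, and repeated traversal.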

In [4]:
help(range)

Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |  
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |  
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __hash__(self, /)
 |

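# Iterator
An object that implements both `__iter__` and `__next__` is called an iterator. `__iter__` returns the iterator itself, and `__next__` returns the next item; when no items remain, it raises a `StopIteration` exception.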
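As a rough illustration of the protocol just described, a `for` loop can be written out by hand with `iter()`, `next()`, and a `StopIteration` handler. `manual_for` is a hypothetical helper name used only for this sketch; the real loop is implemented inside the interpreter.

```python
def manual_for(iterable):
    """Roughly what `for item in iterable:` does, written out by hand."""
    collected = []
    iterator = iter(iterable)        # calls iterable.__iter__()
    while True:
        try:
            item = next(iterator)    # calls iterator.__next__()
        except StopIteration:
            break                    # exhausted: the loop ends quietly
        collected.append(item)
    return collected


result = manual_for("abc")
print(result)
```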

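**Common ways to obtain an iterator**:
* Python's built-in `iter()` function
* An iterable's `__iter__()` method
* The standard-library `itertools` module, which makes it easy to build a variety of efficient iterators
* Generators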

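**Advantages:**
1. Provides a way to iterate that does not depend on indices
2. Saves memory

**Disadvantages:**
1. The length of an iterator object cannot be obtained
2. Less flexible than sequence types: an iterator is single-use and can only move forward, never backward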
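A small sketch of the points above, using `itertools` and `sys.getsizeof`. The exact byte counts depend on the Python build, so only the comparison matters. Note that `sys.getsizeof` measures the container object itself, not the integers it refers to.

```python
import itertools
import sys

# count() is an endless iterator; islice() takes a bounded window from it
evens = itertools.islice(itertools.count(0, 2), 5)
first_evens = list(evens)
leftover = list(evens)             # empty: the iterator is now used up

# chain() walks several iterables in turn without copying them into one list
chained = list(itertools.chain([1, 2], (3, 4), "ab"))

# The list stores 100,000 references; the iterator stores only its position
list_size = sys.getsizeof(list(range(100_000)))
iter_size = sys.getsizeof(iter(range(100_000)))
print(first_evens, leftover, chained)
print(list_size > iter_size)
```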

## Example 1
Data types such as list are only iterables, not iterators.

In [7]:
for type0 in [str, list, set, tuple, dict]:
    print(type0, hasattr(type0, '__iter__'), hasattr(type0, '__next__'))

<class 'str'> True False
<class 'list'> True False
<class 'set'> True False
<class 'tuple'> True False
<class 'dict'> True False


## Example 2
Create an iterator with `iter()`.

In [17]:
a1 = iter([1, 2, 3])

type(a1)
a1

list_iterator

<list_iterator at 0x1b41bcdbc50>

In [18]:
hasattr(a1, '__iter__')
hasattr(a1, '__next__')

True

True

In [19]:
# An iterator's __iter__() method returns the iterator itself
a1.__iter__() is a1

True

In [20]:
# An iterator's __next__() method returns the items in order and raises StopIteration once none remain
for _ in range(4):
    try:
        print(a1.__next__())
    except StopIteration as e:
        print(repr(e))

1
2
3


StopIteration()

In [22]:
help(a1)

Help on list_iterator object:

class list_iterator(object)
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __length_hint__(...)
 |      Private method returning an estimate of len(list(it)).
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |      Return state information for pickling.
 |  
 |  __setstate__(...)
 |      Set state information for unpickling.



## Example 3
Create an iterator from a dict object.

<div class="alert alert-block alert-info">
    <i class="fa fa-list-alt" aria-hidden="true"><b> Todo:</b></i>
    Can the dictionary's values be retrieved from a dict_keyiterator?
</div>

In [24]:
b = {'name': 'zhangsan', 'age': 18}
b1 = iter(b)

b1
type(b1)

<dict_keyiterator at 0x1b41b1cfcc8>

dict_keyiterator

In [25]:
b1.__next__()
b1.__next__()

'name'

'age'
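One possible answer to the Todo above, offered as a sketch rather than a definitive statement: through its public interface a `dict_keyiterator` yields only keys, so the values have to come from the dict itself, either by looking up each key or by iterating over `values()` or `items()`.

```python
b = {'name': 'zhangsan', 'age': 18}

keys_only = list(iter(b))                  # the key iterator yields keys only
via_lookup = [b[key] for key in iter(b)]   # values reached through the dict itself
values = list(iter(b.values()))            # dedicated value iterator
pairs = list(iter(b.items()))              # (key, value) tuples

value_iter_type = type(iter(b.values())).__name__
print(keys_only, via_lookup, values, pairs, value_iter_type)
```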

## Example 4

<div class="alert alert-block alert-warning">
    <i class="fa fa-sticky-note" aria-hidden="true"><b> Note:</b></i>
    An iterator can be traversed only once; converting it with list() takes out every remaining element.
</div>

In [27]:
c = [1, 2, 3]
c1 = iter(c)

In [28]:
list(c1)

try:
    c1.__next__()
except StopIteration as e:
    # c1 was already drained by list(c1) above
    print(repr(e))

[1, 2, 3]

StopIteration()
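A brief sketch of how to work with this one-shot behavior: the iterable itself is not used up, so each call to `iter()` returns a new, independent iterator, and the two-argument form of `next()` returns a default instead of raising `StopIteration`.

```python
c = [1, 2, 3]

it1 = iter(c)
it2 = iter(c)               # a second, independent iterator over the same list

first = next(it1)           # advances it1 only
drained = list(it1)         # consumes the rest of it1
again = list(it1)           # nothing left in it1
other = list(it2)           # it2 still starts from the beginning
default = next(it1, None)   # default value instead of StopIteration
print(first, drained, again, other, default)
```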

## Example 5
An iterable's own `__iter__()` method has the same effect as `iter()`.

In [30]:
a = [1, 2, 3]
a1 = iter(a)
a2 = a.__iter__()

type(a1), type(a2)

a2.__iter__() is a2

(list_iterator, list_iterator)

True

## Example 6
Python's built-in `next()` function has the same effect as the iterator's `__next__()` method.

In [31]:
a = iter(range(5))

next(a)
a.__next__()

0

1

## Example 7
Iterators have no `__len__()` method.

In [33]:
d = [1, 2, 3]
d1 = iter(d)

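try:
    print(len(d1))
except TypeError as e:
    # len() needs __len__, which iterators do not define
    print(e)
    
try:
    print(d1.__len__())
except AttributeError as e:
    # Calling the missing method directly fails at attribute lookup
    print(e)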

object of type 'list_iterator' has no len()
'list_iterator' object has no attribute '__len__'
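Although `len()` is unavailable, the `help(a1)` output earlier lists a `__length_hint__` method. The sketch below shows `operator.length_hint`, which uses that method when it exists. The result is an estimate by contract, even though for a list iterator it matches the number of remaining items.

```python
import operator

d1 = iter([1, 2, 3])
hint_before = operator.length_hint(d1)   # uses __length_hint__ when available
next(d1)
hint_after = operator.length_hint(d1)    # one item fewer remains

# A generator offers no hint, so the supplied default is returned
gen = (x for x in range(3))
hint_gen = operator.length_hint(gen, -1)

# Counting by consumption always works, but it uses the iterator up
count = sum(1 for _ in d1)
print(hint_before, hint_after, hint_gen, count)
```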


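# Generator
A generator is a special kind of iterator.

The `yield` statement is the key to how a generator implements `__next__()`. It marks the point where the generator's execution pauses and resumes: the result of a `yield` expression can be assigned to a variable (it receives whatever the caller passes in through `send()`), and the value written after `yield` is handed back to the caller. In other words, `yield` is syntactic sugar: the interpreter supplies the iterator protocol for the function, and the generator behaves like a state machine that keeps track of where it was suspended and how to resume.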

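**What `yield` does**:
1. Calling the function returns an object that already has `__iter__` and `__next__` wrapped up for it
2. `return` can hand back a value only once, after which the function terminates, whereas `yield` can hand back values many times; each `yield` pauses the function, and the next `next()` call resumes from the paused position with the saved state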
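To make the "assigning a `yield` expression" point concrete, here is a minimal sketch. `running_average` is a hypothetical example that is not part of the original notebook. Each `send()` resumes the generator, the sent value becomes the result of the `yield` expression, and the generator pauses again at the next `yield`.

```python
def running_average():
    """Hypothetical generator: receives numbers and yields their running mean."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Pause here; the value passed to send() becomes the result of `yield`
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)            # run to the first yield ("priming" the generator)
a = avg.send(10)
b = avg.send(20)
c = avg.send(30)
avg.close()          # stop the generator explicitly
print(a, b, c)
```

The local variables `total` and `count` survive between calls, which is the saved state mentioned in the list above.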

## Building generators with `yield`

### Example 1

In [35]:
def gen():
    yield 5
    yield "Hello"
    yield "World"
    yield 4

In [36]:
type(gen())

generator

In [37]:
for i in gen():
    print(i)

5
Hello
World
4


### Example 2

In [38]:
def container(start, end):
    while start < end:
        yield start
        start += 1
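The cell above only defines `container`. A short usage sketch follows, with the definition repeated so that the block runs on its own.

```python
def container(start, end):
    # Yield start, start + 1, ..., end - 1, one value per next() call
    while start < end:
        yield start
        start += 1


nums = container(0, 5)
first = next(nums)               # runs the body up to the first yield
rest = list(nums)                # resumes and collects what is left
empty = list(container(3, 3))    # the loop body never runs
print(first, rest, empty)
```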

### Example 3

In [41]:
def f1(num):
    # Regular function: builds the whole list, then returns it once
    result = []
    for i in range(num):
        result.append(i)
    return result

f1(5)

[0, 1, 2, 3, 4]

In [42]:
def f2(num):
    # Generator function: yields the values one at a time
    for i in range(num):
        yield i 
        
results = f2(5)
results 

<generator object f2 at 0x000001B41AF4BA98>

In [43]:
next(results)
next(results)
results.__next__()
list(results)

0

1

2

[3, 4]

## Generator expressions
Replacing the square brackets of a list comprehension with parentheses gives a generator expression.

### Example 1

In [46]:
l = [x**2 for x in range(5)]
g = (x**2 for x in range(5))

type(l)
type(g)

l
g 

list

generator

[0, 1, 4, 9, 16]

<generator object <genexpr> at 0x000001B41AF4BC00>

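### Example 2
A list comprehension builds the entire list in memory at once, whereas a generator evaluates lazily and produces one result at a time, so its memory footprint is small and roughly constant regardless of the number of items. The example below suggests that it may also take somewhat less time.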

In [44]:
%timeit sum([i for i in range(1000000)])

122 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [45]:
%timeit sum(i for i in range(1000000))

88.8 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
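The timings above say nothing about memory. One way to check the memory claim, sketched here with the standard-library `tracemalloc` module, is to compare peak allocations. `peak_bytes` is a hypothetical helper, the input is smaller than in the cells above, and the exact figures will differ between machines and Python versions.

```python
import tracemalloc


def peak_bytes(func):
    """Return the peak number of bytes allocated while func() runs."""
    tracemalloc.start()
    func()
    _, peak = tracemalloc.get_traced_memory()   # (current, peak)
    tracemalloc.stop()
    return peak


n = 200_000
peak_list = peak_bytes(lambda: sum([i for i in range(n)]))   # full list held in memory
peak_gen = peak_bytes(lambda: sum(i for i in range(n)))      # one item at a time
print(f"list comprehension peak: {peak_list} bytes")
print(f"generator expression peak: {peak_gen} bytes")
```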


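# Generators vs. iterators
A generator is a special kind of iterator whose construction syntax makes the code more concise, that is, more Pythonic.

## Implementing the Fibonacci sequence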

In [50]:
class Fib:
    # Iterator: the state lives in instance attributes
    def __init__(self, n):
        self.prev = 0
        self.cur = 1
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n > 0:
            value = self.cur
            self.cur = self.cur + self.prev
            self.prev = value
            self.n -= 1
            return value
        else:
            raise StopIteration()

f = Fib(10)
type(f)
list(f)

__main__.Fib

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

In [51]:
def fib(n):
    # Generator: the state lives in local variables
    prev, curr = 0, 1
    while n > 0:
        n -= 1
        yield curr
        prev, curr = curr, curr + prev

g = fib(10)
type(g)
list(g)

generator

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
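Because a generator computes values only on demand, the length limit can also be moved out of the function. The sketch below is a variation on `fib` above: `fib_forever` is a hypothetical name, and `itertools.islice` is one of several ways to bound an endless generator.

```python
from itertools import islice


def fib_forever():
    # Endless Fibonacci generator: the caller decides how much to take
    prev, curr = 0, 1
    while True:
        yield curr
        prev, curr = curr, curr + prev


first_ten = list(islice(fib_forever(), 10))

# Stop at the first value above 1000 without building any list
first_big = next(x for x in fib_forever() if x > 1000)
print(first_ten, first_big)
```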

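# Summary

## Efficiency comparison
Comparing the approaches below: if the end goal is a list that can then be sliced freely, preallocating the list to its final length and assigning by index was the fastest list-building approach in these runs. The gaps between all five approaches are small, however (roughly 480 to 550 ms), and the generator and `map` variants never materialize a list, so they fit cases where only the aggregated result is needed.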

In [52]:
size = 1000000

### for + append

In [53]:
%%timeit
a1 = []
for x in range(size):
    a1.append(x**2)
sum(a1)

547 ms ± 22 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### for + preallocated list

In [54]:
%%timeit
a2 = [0] * size
for x in range(size):
    a2[x] = x**2
sum(a2)

484 ms ± 5.72 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### List comprehension

In [55]:
%%timeit
a3 = sum([i**2 for i in range(size)])

520 ms ± 7.26 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### Generator expression

In [56]:
%%timeit
a4 = sum(i**2 for i in range(size))

485 ms ± 10.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


### map

In [57]:
%%timeit
a5 = sum(map(lambda x: x**2, range(size)))

518 ms ± 10.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
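The `%%timeit` magic only works inside IPython or Jupyter. As a sketch under that assumption, the same comparison can be run from a plain script with the standard-library `timeit` module. The helper names are hypothetical, `size` is reduced so the script finishes quickly, and the resulting numbers and ranking are not guaranteed to match those above.

```python
import timeit

size = 100_000  # smaller than the notebook's 1,000,000 so the sketch runs quickly


def with_append():
    out = []
    for x in range(size):
        out.append(x**2)
    return sum(out)


def with_prealloc():
    out = [0] * size
    for x in range(size):
        out[x] = x**2
    return sum(out)


candidates = {
    "for + append": with_append,
    "preallocated": with_prealloc,
    "list comprehension": lambda: sum([i**2 for i in range(size)]),
    "generator": lambda: sum(i**2 for i in range(size)),
    "map": lambda: sum(map(lambda x: x**2, range(size))),
}

# Best of 3 repeats with 3 calls each; min() is less sensitive to background noise
timings = {
    name: min(timeit.repeat(func, number=3, repeat=3))
    for name, func in candidates.items()
}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:<20} {seconds:.4f} s")
```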
