# Iterators and Comprehensions

## Part 1

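Python's iteration protocol: an object with a \__next__ method advances to the next result each time the method is called, and raises StopIteration when it reaches the end of the series of results. In Python, any such object is considered an iterator, and iterators that also return themselves from \__iter__ (as the built-in ones do) are iterable. Such objects can be stepped through with a for loop or any other iteration tool, because every iteration tool works internally by calling \__next__ on each iteration and catching the StopIteration exception to decide when to stop.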
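
The protocol described above can be sketched with a small user-defined class. `Countdown` is a hypothetical name invented for this illustration; it is not part of the standard library, and this is only one minimal way to satisfy the protocol.

```python
class Countdown:
    """Hypothetical iterator that produces start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so iter(obj) is obj.
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal the end of the series of results.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# A for loop (here inside a comprehension) calls __next__ until StopIteration.
collected = [n for n in Countdown(3)]
print(collected)  # [3, 2, 1]

# The same protocol driven by hand.
it = Countdown(1)
first_value = next(it)  # 1
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
print(first_value, exhausted)  # 1 True
```

Because `Countdown` returns itself from `__iter__`, it can be consumed only once, which is the same behavior file objects show.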

### File iterators
#### The \__next__ method

In [14]:
f = open("script1.py")
print(f.__next__(), end="")
print(f.__next__(), end="")
print(f.__next__(), end="")
print(f.__next__(), end="")  # the file has only four lines, so this fifth call raises StopIteration
print(f.__next__(), end="")  # never reached
print(f.__next__(), end="")

import sys
print(sys.path)
x = 2
print(2 ** 33)

StopIteration: 

#### for loop version

In [13]:
for line in open("script1.py"):
    print(line.upper(), end="") # end="" stops print from adding an extra \n

IMPORT SYS
PRINT(SYS.PATH)
X = 2
PRINT(2 ** 33)

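#### The readlines() method
readlines() loads the entire file into memory at once. If the file is too large for the memory available on the machine, this version may not work at all.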

In [16]:
for line in open("script1.py").readlines():
    print(line.upper(), end="")

IMPORT SYS
PRINT(SYS.PATH)
X = 2
PRINT(2 ** 33)

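#### while loop version
Iterators run at C speed inside Python, whereas the while loop version runs Python bytecode through the Python virtual machine, so it is usually the slower of the two.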

In [20]:
f = open("script1.py")
while True:
    line = f.readline()
    if not line: break
    print(line.upper(), end="")

IMPORT SYS
PRINT(SYS.PATH)
X = 2
PRINT(2 ** 33)
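
The speed difference between the file iterator and the readline() loop can be measured with `timeit`. The sketch below writes its own throwaway file (an assumption made so that it does not depend on script1.py), and the function names are invented for this illustration. Actual timings vary with the Python version, the file size, and the machine, so no numbers are quoted here.

```python
import os
import tempfile
import timeit

# Build a temporary file with many short lines.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("x = 2\n" * 2000)
    path = tmp.name


def read_with_for():
    result = []
    with open(path) as f:
        for line in f:  # uses the file iterator
            result.append(line.upper())
    return result


def read_with_while():
    result = []
    with open(path) as f:
        while True:
            line = f.readline()  # explicit call on every pass
            if not line:
                break
            result.append(line.upper())
    return result


# Both versions must produce the same lines before timing means anything.
same_result = read_with_for() == read_with_while()
for_time = timeit.timeit(read_with_for, number=20)
while_time = timeit.timeit(read_with_while, number=20)
print(f"same result: {same_result}")
print(f"for: {for_time:.4f}s  while: {while_time:.4f}s")

os.remove(path)  # clean up the temporary file
```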

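### Manual iteration: iter and next
1. next(X) is equivalent to X.\__next__()
2. A file object is its own iterator and has its own \__next__ method
3. A list object is not its own iterator; call iter on it to start an iteration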

In [47]:
f = open("script1.py")
print(iter(f) is f)
f.__next__()

True


'import sys\n'

In [54]:
L = [1, 2, 3]
print(iter(L) is L)
next(L)  # a list is iterable but is not itself an iterator

False


TypeError: 'list' object is not an iterator

In [56]:
L = [1, 2, 3]
I = iter(L)
print(I.__next__())
print(next(I))
# Technically, a for loop internally calls the equivalent of I.__next__()

1
2
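
One practical consequence of the difference between lists and files can be sketched as follows. `io.StringIO` stands in for an open text file here (an assumption made so that the sketch runs without script1.py); like a real file object, it returns itself from iter().

```python
import io

# A list hands out a fresh, independent iterator on every iter() call.
L = [1, 2, 3]
first = iter(L)
second = iter(L)
next(first)  # advances only `first`
list_positions = (next(first), next(second))
print(list_positions)  # (2, 1)

# A file-like object is its own iterator, so there is a single shared position.
f = io.StringIO("import sys\nx = 2\n")
g = iter(f)
is_same_object = g is f
next(f)  # consumes the first line through f
shared_line = next(g)  # g continues where f left off
print(is_same_object, repr(shared_line))  # True 'x = 2\n'
```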


In [59]:
L = [1, 2, 3]
for x in L:
    print(x ** 2, end="")
print("\n", end="")
I = iter(L)
while True:
    try:
        X = next(I)
    except StopIteration:
        break
    print(X ** 2, end="")

149
149

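### List comprehensions
#### List comprehension basics
1. A list comprehension produces a new list object
2. It is more concise to write
3. A list comprehension often runs faster than a manual for loop statement (frequently about twice as fast), because its iterations are performed at C speed inside the interpreter; the exact ratio depends on the Python version and on the code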

In [26]:
L = [1, 2, 3, 4, 5]
L = [x + 10 for x in L]
print(L)

[11, 12, 13, 14, 15]


In [25]:
L = [1, 2, 3, 4, 5]
res = []
for x in L:
    res.append(x + 10)
print(res)

[11, 12, 13, 14, 15]
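
The speed claim for list comprehensions can be checked with `timeit`. This is only a rough sketch: the function names and the input size are arbitrary choices made for the illustration, and the measured ratio differs between Python versions and machines.

```python
import timeit

data = list(range(10_000))


def with_loop():
    res = []
    for x in data:
        res.append(x + 10)
    return res


def with_comprehension():
    return [x + 10 for x in data]


# Confirm both versions build the same list before comparing their speed.
results_match = with_loop() == with_comprehension()
loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)
print(f"results match: {results_match}")
print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```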


#### Using list comprehensions on files

In [28]:
f = open("script1.py")
lines = f.readlines()
lines

['import sys\n', 'print(sys.path)\n', 'x = 2\n', 'print(2 ** 33)']

In [29]:
# lines still holds the result of readlines() from the previous cell
lines = [line.rstrip() for line in lines]
lines

['import sys', 'print(sys.path)', 'x = 2', 'print(2 ** 33)']

In [37]:
lines = [line.rstrip() for line in open("script1.py")]
print(lines)
lines = [line.upper() for line in open("script1.py")]
print(lines)
lines = [line.rstrip().upper() for line in open("script1.py")]
print(lines)
lines = [line.split() for line in open("script1.py")]
print(lines)
lines = [line.replace(" ", "!") for line in open("script1.py")]
print(lines)
[("sys" in line, line[0]) for line in open("script1.py")]

['import sys', 'print(sys.path)', 'x = 2', 'print(2 ** 33)']
['IMPORT SYS\n', 'PRINT(SYS.PATH)\n', 'X = 2\n', 'PRINT(2 ** 33)']
['IMPORT SYS', 'PRINT(SYS.PATH)', 'X = 2', 'PRINT(2 ** 33)']
[['import', 'sys'], ['print(sys.path)'], ['x', '=', '2'], ['print(2', '**', '33)']]
['import!sys\n', 'print(sys.path)\n', 'x!=!2\n', 'print(2!**!33)']


[(True, 'i'), (True, 'p'), (False, 'x'), (False, 'p')]

Use an if clause to filter out result items for which the test is not true.

In [40]:
lines = [line.rstrip() for line in open("script1.py") if line[0] == "p"]
print(lines)
# equivalent to
res = []
for line in open("script1.py"):
    if line[0] == "p":
        res.append(line.rstrip())
print(res)

['print(sys.path)', 'print(2 ** 33)']
['print(sys.path)', 'print(2 ** 33)']


In [43]:
res = [x + y for x in "abc" for y in "lmn"]
print(res)
# equivalent to
res = []
for x in "abc":
    for y in "lmn":
        res.append(x + y)
print(res)

['al', 'am', 'an', 'bl', 'bm', 'bn', 'cl', 'cm', 'cn']
['al', 'am', 'an', 'bl', 'bm', 'bn', 'cl', 'cm', 'cn']
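
The if clause and nested for clauses shown above can be combined in a single comprehension. The sketch below uses made-up ranges for illustration; each if filters the for clause immediately before it, and the expanded loops nest in the same left-to-right order.

```python
# Keep even x values and odd y values.
pairs = [(x, y) for x in range(4) if x % 2 == 0
                for y in range(4) if y % 2 == 1]
print(pairs)  # [(0, 1), (0, 3), (2, 1), (2, 3)]

# equivalent to
expanded = []
for x in range(4):
    if x % 2 == 0:
        for y in range(4):
            if y % 2 == 1:
                expanded.append((x, y))
print(expanded)  # [(0, 1), (0, 3), (2, 1), (2, 3)]
```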
