# Be Defensive When Iterating Over Arguments

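**Example:** Sum all the inputs to get the total number of visitors per year, then divide each city's visitor count by that total to get the city's share as a percentage.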

In [1]:
def normalize(numbers):
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

In [2]:
visits = [15, 35, 80]
percentages = normalize(visits)
print(percentages)

[11.538461538461538, 26.923076923076923, 61.53846153846154]


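To make the function more widely applicable, the per-city visitor counts now go into a file, and the data is read back from that file with a generator.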

In [3]:
path = 'my_numbers.txt'
with open(path, 'w') as f:
    for i in (15, 35, 80):
        f.write('%d\n' % i)

In [4]:
def read_visits(data_path):
    with open(data_path) as f:
        for line in f:
            yield int(line)

In [5]:
it = read_visits('my_numbers.txt')
percentages = normalize(it)
print(percentages)

[]


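The cause is that an iterator produces its results only once. Here `sum` consumes the generator, so the `for` loop in `normalize` has nothing left to iterate over. Iterating a second time over an iterator or generator that has already raised `StopIteration` yields no results.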

In [6]:
it = read_visits('my_numbers.txt')
print(list(it))
print(list(it))

[15, 35, 80]
[]
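To see why `normalize` printed an empty list, it can help to trace the two passes by hand. The sketch below uses an in-memory generator, `fake_visits` (a hypothetical stand-in for `read_visits`, so no file is needed). `sum` consumes every value, so the second pass finds nothing left.

```python
def fake_visits():
    # Hypothetical stand-in for read_visits: yields the same three counts.
    for count in (15, 35, 80):
        yield count

it = fake_visits()
total = sum(it)        # first pass consumes every value
leftover = list(it)    # second pass: the generator is already exhausted
print(total, leftover)

# Calling next() on the exhausted generator raises StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)
```

No error is raised during the second pass itself, because `for` loops, `list`, and `sum` all treat `StopIteration` as the normal end of iteration. That is why the bug shows up as a silent empty result.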


**Fix 1:** Build a list from the iterator: exhaust it once, copying everything it yields into the list, and then iterate over that copy as many times as needed.

In [7]:
def normalize_copy(numbers):
    numbers = list(numbers)
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

In [8]:
it = read_visits('my_numbers.txt')
percentages = normalize_copy(it)
print(percentages)

[11.538461538461538, 26.923076923076923, 61.53846153846154]


**Problem:** The iterator being copied may yield a large amount of input data, so the program could run out of memory and crash while making the copy.
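A rough way to see the memory cost is to compare the size of a generator with the size of the list copied from it. This is only a sketch: `count_up` is a hypothetical generator standing in for a large data source, and `sys.getsizeof` reports just the object's own footprint, not the items a list refers to.

```python
import sys

def count_up(n):
    # Hypothetical generator standing in for a large data source.
    for i in range(n):
        yield i

lazy = count_up(1_000_000)
copied = list(count_up(1_000_000))

# The generator's size stays small regardless of n; the list grows with n.
lazy_size = sys.getsizeof(lazy)
copied_size = sys.getsizeof(copied)
print(lazy_size, copied_size)
```

The exact numbers depend on the Python version and platform, but the list's size scales with the number of items while the generator's does not.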

**Fix 2:** Accept a function as the parameter instead, one that returns a new iterator each time it is called.

In [9]:
def normalize_func(get_iter):
    total = sum(get_iter())
    result = []
    for value in get_iter():
        percent = 100 * value / total
        result.append(percent)
    return result

In [10]:
percentages = normalize_func(lambda: read_visits(path))

**Problem:** Passing a lambda function like this is clumsy.

**Fix 3:** Write a new container class that implements the iterator protocol.

In [11]:
class ReadVisits:
    def __init__(self, data_path):
        self.data_path = data_path
    
    def __iter__(self):
        with open(self.data_path) as f:
            for line in f:
                yield int(line)

In [12]:
visits = ReadVisits(path)
percentages = normalize(visits)
print(percentages)

[11.538461538461538, 26.923076923076923, 61.53846153846154]
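The container works with the unmodified `normalize` because `sum` and the `for` loop each call `iter` on their argument, and every such call invokes `__iter__` and gets a fresh iterator. The sketch below makes this visible with `CountingVisits`, a hypothetical in-memory variant of `ReadVisits` that counts those calls. The `normalize` here is a condensed version of the one above so that the block is self-contained.

```python
class CountingVisits:
    # Hypothetical variant of ReadVisits: holds data in memory and
    # records how many times __iter__ is called.
    def __init__(self, data):
        self.data = data
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return iter(self.data)  # a new iterator on every call

def normalize(numbers):
    total = sum(numbers)  # first pass: calls iter(numbers)
    # second pass: the comprehension calls iter(numbers) again
    return [100 * value / total for value in numbers]

visits = CountingVisits([15, 35, 80])
percentages = normalize(visits)
print(visits.iter_calls)
```

The trade-off of this approach is that the input is read twice, once per pass.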


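The function can also check its input defensively. Under the iterator protocol, calling `iter` on an iterator returns that same iterator, whereas calling `iter` on a container returns a new iterator object each time. So if two calls to `iter(numbers)` return the same object, `numbers` is an iterator and the function can reject it.

In [13]: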
def normalize_defensive(numbers):
    if iter(numbers) is iter(numbers):
        raise TypeError('Must supply a container')
    total = sum(numbers)
    result = []
    for value in numbers:
        percent = 100 * value / total
        result.append(percent)
    return result

In [14]:
visits = [15, 35, 80]
normalize_defensive(visits)
visits = ReadVisits(path)
normalize_defensive(visits)

[11.538461538461538, 26.923076923076923, 61.53846153846154]

If the argument is an iterator rather than a container, the function raises an exception.

In [15]:
it = iter(visits)
normalize_defensive(it)

TypeError: Must supply a container
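Another way to write the same check, shown here only as a possible alternative and not as the approach used above, is to test against the `collections.abc.Iterator` abstract base class from the standard library. The function name `normalize_defensive_abc` is hypothetical.

```python
from collections.abc import Iterator

def normalize_defensive_abc(numbers):
    # Hypothetical variant: reject iterators with isinstance
    # instead of comparing the results of two iter() calls.
    if isinstance(numbers, Iterator):
        raise TypeError('Must supply a container')
    total = sum(numbers)
    return [100 * value / total for value in numbers]

ok = normalize_defensive_abc([15, 35, 80])

try:
    normalize_defensive_abc(iter([15, 35, 80]))
    rejected = False
except TypeError:
    rejected = True
print(ok, rejected)
```

Both checks rest on the same protocol. The `isinstance` form states the intent more directly, while the `iter(numbers) is iter(numbers)` form needs no import.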