**1.1 Unpacking a Sequence into Separate Variables**

1.1.1 Problem  
We have a tuple or sequence of N elements and want to unpack it into N separate variables.

1.1.2 Solution

In [1]:
p = (4, 5)

In [2]:
x, y = p

In [3]:
x

4

In [4]:
y

5

In [5]:
data = ['ACME', 50, 91.1, (2012, 12, 21)]
name, shares, price, date = data

In [6]:
name

'ACME'

In [7]:
date

(2012, 12, 21)

In [8]:
name, shares, price, (year, mon, day) = data

In [9]:
name

'ACME'

In [10]:
year

2012

In [11]:
mon

12

In [12]:
day

21

In [13]:
p = (4, 5)
x, y , z = p

ValueError: not enough values to unpack (expected 3, got 2)

1.1.3 Discussion

In [3]:
s = 'Hello'
a, b, c, d, e = s

In [4]:
a

'H'

In [5]:
b

'e'

In [6]:
e

'o'

In [7]:
data = ['ACME', 50, 91.1, (2012, 12, 21)]

In [8]:
_, shares, price, _ = data

In [9]:
_

(2012, 12, 21)

In [10]:
shares

50

In [11]:
price

91.1
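
Unpacking is not limited to tuples, lists, and strings: it works with any iterable, and the number of targets must match exactly. The sketch below uses a hypothetical generator, `read_fields`, as a stand-in for an arbitrary data source.

```python
def read_fields():
    """Hypothetical generator standing in for any iterable data source."""
    yield 'ACME'
    yield 50
    yield 91.1

# Three targets, three yielded values: the assignment succeeds
name, shares, price = read_fields()

# A mismatched count raises ValueError, just as with a tuple
try:
    a, b = read_fields()
except ValueError as exc:
    error = str(exc)
```

Here `error` holds the "too many values to unpack" message, the counterpart of the "not enough values" error shown in the solution.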

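**1.2 Unpacking Elements from Iterables of Arbitrary Length**

1.2.1 Problem  
We need to unpack N elements from an iterable, but the iterable may contain more than N elements, which raises a "too many values to unpack" exception.

1.2.2 Solution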

In [12]:
from statistics import mean

def drop_first_last(grades):
    # Drop the first and last grades, then average the remaining ones
    first, *middle, last = grades
    return mean(middle)

In [13]:
user_record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')

In [14]:
name, email, *phone_numbers = user_record

In [15]:
name

'Dave'

In [16]:
email

'dave@example.com'

In [17]:
phone_numbers

['773-555-1212', '847-555-1212']
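
One property worth checking is that a starred target is always bound to a list, even when nothing is left for it to capture. A minimal sketch, where `record_no_phone` is an illustrative record that has no phone numbers:

```python
record_no_phone = ('Dave', 'dave@example.com')
name, email, *phone_numbers = record_no_phone

# The starred name is still a list, just an empty one
no_phones = phone_numbers

# A starred target in the middle collects everything between the ends
first, *middle, last = range(5)
```

Because `phone_numbers` is a list in every case, code that loops over it needs no special handling for records without phone numbers.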

1.2.3 Discussion

In [18]:
records = [
    ('foo', 1, 2),
    ('bar', 'hello'),
    ('foo', 3, 4),
]

def do_foo(x, y):
    print('foo', x, y)
    
def do_bar(s):
    print('bar', s)
    
for tag, *args in records:
    if tag == 'foo':
        do_foo(*args)
    elif tag == 'bar':
        do_bar(*args)

foo 1 2
bar hello
foo 3 4


In [19]:
line = 'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'
uname, *fields, homedir, sh = line.split(':')

In [20]:
uname

'nobody'

In [21]:
homedir

'/var/empty'

In [22]:
sh

'/usr/bin/false'

In [23]:
fields

['*', '-2', '-2', 'Unprivileged User']

In [24]:
record = ('ACME', 50, 123.45, (12, 18, 2012))
name, *_, (*_, year) = record

In [25]:
name

'ACME'

In [26]:
year

2012

In [27]:
items = [1, 10, 7, 4, 5, 9]
head, *tail = items

In [28]:
head

1

In [29]:
tail

[10, 7, 4, 5, 9]

In [30]:
def sum(items): # recursive sum (note: this shadows the built-in sum)
    head, *tail = items
    return head + sum(tail) if tail else head

In [31]:
sum(items)

36

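**1.3 Keeping the Last N Items**

1.3.1 Problem  
We want to keep a limited history of the last few items seen during iteration or some other kind of processing.

1.3.2 Solution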

In [32]:
from collections import deque

def search(lines, pattern, history=5):
    previous_lines = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            yield line, previous_lines
        # Record every line, matched or not, so the history holds the last N lines seen
        previous_lines.append(line)
            
if __name__ == '__main__':
    with open('somefile.txt') as f:
        for line, prevlines in search(f, 'python', 5):
            for pline in prevlines:
                print(pline, end='')
            print(line, end='')
            print('-'*20)

1+python=2
--------------------
1+python=2
python is my favourite language
--------------------
1+python=2
python is my favourite language
The world needs python
--------------------
1+python=2
python is my favourite language
The world needs python
I hope python will help me
--------------------
1+python=2
python is my favourite language
The world needs python
I hope python will help me
if python is dead
--------------------
1+python=2
python is my favourite language
The world needs python
I hope python will help me
if python is dead
python python
--------------------
python is my favourite language
The world needs python
I hope python will help me
if python is dead
python python
12 * python
--------------------
The world needs python
I hope python will help me
if python is dead
python python
12 * python
final python
--------------------


1.3.3 Discussion

In [33]:
q = deque(maxlen=3)

In [34]:
q.append(1)
q.append(2)
q.append(3)
q

deque([1, 2, 3])

In [35]:
q.append(4)
q

deque([2, 3, 4])

In [36]:
q.append(5)
q

deque([3, 4, 5])

In [37]:
q = deque()

In [38]:
q.append(1)
q.append(2)
q.append(3)
q

deque([1, 2, 3])

In [39]:
q.appendleft(4)
q

deque([4, 1, 2, 3])

In [40]:
q.pop()

3

In [41]:
q

deque([4, 1, 2])

In [42]:
q.popleft()

4
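
Since a bounded deque discards its oldest item whenever a new one arrives, passing an iterable straight to the constructor keeps only its tail. The helper name `tail` below is an illustrative choice, not something defined earlier in these notes.

```python
from collections import deque

def tail(iterable, n=3):
    """Return the last n items of any iterable as a list."""
    # deque consumes the iterable and silently drops older items once full
    return list(deque(iterable, maxlen=n))

last_three = tail(range(10))
short_input = tail('ab', n=5)   # fewer than n items: everything is kept
```

This works for iterables that cannot be sliced, such as generators or open files.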

**1.4 Finding the Largest or Smallest N Items**

1.4.1 Problem  
We want to find the N largest or smallest items in a collection.

1.4.2 Solution

In [43]:
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
print(heapq.nlargest(3, nums))

[42, 37, 23]


In [44]:
print(heapq.nsmallest(3, nums))

[-4, 1, 2]


In [45]:
portfolio = [
    {'name': 'IBM', 'shares': 100, 'price': 91.1},
    {'name': 'AAPL', 'shares': 50, 'price': 543.22},
    {'name': 'FB', 'shares': 200, 'price': 21.09},
    {'name': 'HPQ', 'shares': 35, 'price': 31.75},
    {'name': 'YHOO', 'shares':45, 'price': 16.35},
    {'name': 'ACME', 'shares': 75, 'price': 115.65}
]

In [46]:
cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])

In [47]:
cheap

[{'name': 'YHOO', 'price': 16.35, 'shares': 45},
 {'name': 'FB', 'price': 21.09, 'shares': 200},
 {'name': 'HPQ', 'price': 31.75, 'shares': 35}]

In [48]:
expensive

[{'name': 'AAPL', 'price': 543.22, 'shares': 50},
 {'name': 'ACME', 'price': 115.65, 'shares': 75},
 {'name': 'IBM', 'price': 91.1, 'shares': 100}]

1.4.3 Discussion

In [49]:
nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

In [50]:
import heapq

heap = list(nums)
heapq.heapify(heap) # turn the list into a heap in place; heap[0] is the smallest item

In [51]:
heap

[-4, 2, 1, 23, 7, 2, 18, 23, 42, 37, 8]

In [52]:
heapq.heappop(heap)

-4

In [53]:
heapq.heappop(heap)

1

In [54]:
heap

[2, 2, 8, 23, 7, 37, 18, 23, 42]
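
Which tool fits depends on N relative to the size of the collection. As a rough guide, `min()` and `max()` are the natural choice for a single item, `heapq.nlargest()` and `heapq.nsmallest()` suit a small N, and sorting followed by slicing is usually simpler when N is close to the collection size. The sketch below only shows that the approaches agree on the result; it makes no timing claims.

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

# N == 1: a single pass with min() or max()
smallest = min(nums)
largest = max(nums)

# Small N: heap-based selection
top3_heap = heapq.nlargest(3, nums)

# N close to len(nums): sort once, then slice
top3_sorted = sorted(nums, reverse=True)[:3]
```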

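**1.5 Implementing a Priority Queue**

1.5.1 Problem  
We want to implement a queue that sorts items by a given priority and always returns the item with the highest priority on each pop operation.

1.5.2 Solution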

In [15]:
import heapq

class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0
    
    def push(self, item, priority):
        heapq.heappush(self._queue, (-priority, self._index, item))
        self._index += 1
    
    def pop(self):
        return heapq.heappop(self._queue)[-1]
    
class Item:
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return 'Item({!r})'.format(self.name) # !s, !a, and !r format the value with str(), ascii(), and repr() respectively

In [31]:
q = PriorityQueue()

In [32]:
q.push(Item('foo'), 1)

In [33]:
q.push(Item('bar'), 5)

In [34]:
q.push(Item('spam'), 4)

In [35]:
q.push(Item('grok'), 1)

In [36]:
q.pop()

Item('bar')

In [37]:
q.pop()

Item('spam')

In [38]:
q.pop()

Item('foo')

In [39]:
q.pop()

Item('grok')

1.5.3 Discussion

In [40]:
a = Item('foo')
b = Item('bar')
a < b

TypeError: '<' not supported between instances of 'Item' and 'Item'

In [41]:
a = (1, Item('foo'))
b = (5, Item('bar'))
a < b

True

In [42]:
c = (1, Item('grok'))
a < c

TypeError: '<' not supported between instances of 'Item' and 'Item'

In [43]:
a = (1, 0, Item('foo'))
b = (5, 1, Item('bar'))
c = (1, 2, Item('grok'))
a < b

True

In [45]:
a < c

True

In [46]:
print(Item('foo'))

Item('foo')
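
The comparisons above suggest why the queue stores `(-priority, index, item)` triples: the unique index settles every tie before Python ever tries to compare the items themselves. The sketch below restates that idea with plain functions and `itertools.count`, using dicts as deliberately unorderable payloads. The names `push`, `pop`, and `order` are illustrative.

```python
import heapq
from itertools import count

heap = []
order = count()  # ever-increasing index used as a tie-breaker

def push(item, priority):
    # Negate the priority so the highest priority sorts first in the min-heap
    heapq.heappush(heap, (-priority, next(order), item))

def pop():
    return heapq.heappop(heap)[-1]

for name, priority in [('foo', 1), ('bar', 5), ('spam', 4), ('grok', 1)]:
    push({'name': name}, priority)  # dicts do not support '<'

popped = [pop()['name'] for _ in range(4)]
```

Items with equal priority ('foo' and 'grok') come back in insertion order, and no `TypeError` is raised because the dicts are never compared.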


**1.6 Mapping Keys to Multiple Values in a Dictionary**

1.6.1 Problem  
We want a dictionary that maps each key to more than one value (a so-called multidict).

1.6.2 Solution

In [1]:
d = {
    'a': [1, 2, 3],
    'b': [4, 5]
}

e = {
    'a': {1, 2, 3},
    'b': {4, 5}
}

In [2]:
from collections import defaultdict

d = defaultdict(list)
d['a'].append(1)
d['a'].append(2)
d['b'].append(4)
d

defaultdict(list, {'a': [1, 2], 'b': [4]})

In [3]:
d['a']

[1, 2]

In [4]:
d = defaultdict(set)
d['a'].add(1)
d['a'].add(2)
d['a'].add(1)
d['b'].add(4)
d

defaultdict(set, {'a': {1, 2}, 'b': {4}})

In [5]:
d['a']

{1, 2}

In [7]:
d = {}
d.setdefault('a', []).append(1)
d.setdefault('a', []).append(2)
d.setdefault('b', []).append(4)
d

{'a': [1, 2], 'b': [4]}

1.6.3 Discussion

In [10]:
d = {}
pairs = [('a', 1), ('a', 2), ('b', 4)]
for key, value in pairs:
    if key not in d:
        d[key] = []
    d[key].append(value)
d

{'a': [1, 2], 'b': [4]}

In [11]:
d = defaultdict(list)
for key, value in pairs:
    d[key].append(value)
d

defaultdict(list, {'a': [1, 2], 'b': [4]})
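
One caveat when choosing between the two approaches: a `defaultdict` creates an entry whenever a missing key is looked up with `[]`, even if nothing is appended afterwards. A short sketch of the difference, with `'zzz'` as an arbitrary missing key:

```python
from collections import defaultdict

d = defaultdict(list)
d['a'].append(1)

in_before = 'zzz' in d        # membership tests do not create entries
_ = d['zzz']                  # indexing does: 'zzz' now maps to []
keys_default = sorted(d)

plain = {}
plain.setdefault('a', []).append(1)
_ = plain.get('zzz', [])      # get() returns a default without storing it
keys_plain = sorted(plain)
```

If stray empty entries matter, for example when the dictionary is serialized later, `setdefault()` on a regular dict or a membership test before indexing avoids them.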

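**1.7 Keeping Dictionaries in Order**

1.7.1 Problem  
We want to create a dictionary and also control the order of its items when iterating over it or serializing it.

1.7.2 Solution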

In [12]:
from collections import OrderedDict

d = OrderedDict()
d['foo'] = 1
d['bar'] = 2
d['spam'] = 3
d['grok'] = 4

for key in d:
    print(key, d[key])

foo 1
bar 2
spam 3
grok 4


In [13]:
import json
json.dumps(d)

'{"foo": 1, "bar": 2, "spam": 3, "grok": 4}'

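1.7.3 Discussion  
An OrderedDict internally maintains a doubly linked list that orders the keys by insertion, so it is more than twice the size of a regular dictionary.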
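
Since Python 3.7 a regular dict also preserves insertion order, so the extra memory is mostly justified when the additional ordering features of OrderedDict are needed. The sketch below shows three of them: `move_to_end()`, `popitem(last=False)`, and order-sensitive equality.

```python
from collections import OrderedDict

d = OrderedDict([('foo', 1), ('bar', 2), ('spam', 3)])

d.move_to_end('foo')             # reposition an existing key at the end
order_after_move = list(d)

oldest = d.popitem(last=False)   # remove and return the first (oldest) entry

# Equality between two OrderedDicts takes order into account
x = OrderedDict([('a', 1), ('b', 2)])
y = OrderedDict([('b', 2), ('a', 1)])
ordered_equal = (x == y)
plain_equal = (dict(x) == dict(y))
```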

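**1.8 Calculating with Dictionaries**

1.8.1 Problem  
We want to perform various calculations on dictionary data, such as finding the minimum or maximum value or sorting.

1.8.2 Solution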

In [16]:
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}

In [17]:
min_price = min(zip(prices.values(), prices.keys()))
min_price

(10.75, 'FB')

In [18]:
max_price = max(zip(prices.values(), prices.keys()))
max_price

(612.78, 'AAPL')

In [19]:
prices_sorted = sorted(zip(prices.values(), prices.keys()))
prices_sorted

[(10.75, 'FB'),
 (37.2, 'HPQ'),
 (45.23, 'ACME'),
 (205.55, 'IBM'),
 (612.78, 'AAPL')]

In [20]:
prices_and_names = zip(prices.values(), prices.keys())
print(min(prices_and_names))
print(max(prices_and_names))

(10.75, 'FB')


ValueError: max() arg is an empty sequence

1.8.3 Discussion

In [21]:
min(prices)

'AAPL'

In [22]:
max(prices)

'IBM'

In [23]:
min(prices.values())

10.75

In [24]:
max(prices.values())

612.78

In [25]:
min(prices, key=lambda k: prices[k])

'FB'

In [26]:
max(prices, key=lambda k: prices[k])

'AAPL'

In [28]:
# find min value
min_value = prices[min(prices, key=lambda k: prices[k])]
min_value

10.75

In [29]:
prices = {'AAA': 45.23, 'ZZZ': 45.23} # when values are equal the key decides the result; keys are always unique

In [30]:
min(zip(prices.values(), prices.keys()))

(45.23, 'AAA')

In [31]:
max(zip(prices.values(), prices.keys()))

(45.23, 'ZZZ')
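
An alternative to the `zip()` inversion, not used in the cells above, is to compare the `(key, value)` pairs from `items()` with `operator.itemgetter(1)` as the key function. This returns the name and the price together without swapping their positions. The `prices` data is repeated here so the sketch runs on its own.

```python
from operator import itemgetter

prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75,
}

# Compare pairs by their second element, the price
cheapest = min(prices.items(), key=itemgetter(1))
priciest = max(prices.items(), key=itemgetter(1))
by_price = sorted(prices.items(), key=itemgetter(1))
```

Unlike the `zip()` version, ties are not broken by key here: `min()` and `max()` return the first qualifying pair in iteration order.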

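**1.9 Finding Commonalities in Two Dictionaries**

1.9.1 Problem  
We have two dictionaries and want to find out what they have in common (same keys, same values, and so on).

1.9.2 Solution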

In [32]:
a = {
    'x': 1,
    'y': 2,
    'z': 3
}

b = {
    'w': 10,
    'x': 11,
    'y': 2
}

In [33]:
# Find keys in common
a.keys() & b.keys()

{'x', 'y'}

In [35]:
# Find keys in a that are not in b
a.keys() - b.keys()

{'z'}

In [36]:
# Find (key, value) pairs in common
a.items() & b.items()

{('y', 2)}

In [38]:
# Make a new dictionary with certain keys removed
c = {key: a[key] for key in a.keys() - {'z', 'w'}}
c

{'x': 1, 'y': 2}

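1.9.3 Discussion  
The objects returned by a dictionary's keys() and items() methods support set operations such as union, intersection, and difference. The object returned by values() does not, because values are not guaranteed to be unique.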
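
The remaining set operators work on key views in the same way, and values can still be compared after an explicit conversion to a set. A brief sketch using the same two dictionaries (the conversion assumes the values are hashable):

```python
a = {'x': 1, 'y': 2, 'z': 3}
b = {'w': 10, 'x': 11, 'y': 2}

all_keys = a.keys() | b.keys()       # union
only_in_one = a.keys() ^ b.keys()    # symmetric difference

# values() offers no set operators, so convert to sets first
common_values = set(a.values()) & set(b.values())
```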

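**1.10 Removing Duplicates from a Sequence While Maintaining Order**

1.10.1 Problem  
We want to eliminate the duplicate items in a sequence while keeping the remaining items in their original order.

1.10.2 Solution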

In [39]:
def dedupe(items):
    seen = set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)

In [40]:
a = [1, 5, 2, 1, 9, 1, 5, 10]
list(dedupe(a))

[1, 5, 2, 9, 10]

In [41]:
# Remove duplicates among unhashable items (such as dicts or lists) by way of a key function
def dedupe(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

In [42]:
a = [{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 1, 'y': 2}, {'x': 2, 'y': 4}]

list(dedupe(a, key=lambda d: (d['x'], d['y'])))

[{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 2, 'y': 4}]

In [43]:
list(dedupe(a, key=lambda d: d['x']))

[{'x': 1, 'y': 2}, {'x': 2, 'y': 4}]

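1.10.3 Discussion  
Building a set is a simple way to remove duplicates, but it does not preserve the original order. Writing dedupe as a generator keeps the function as general as possible: it works on any iterable, not only on lists.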

In [47]:
a = [1, 5, 2, 1, 9, 1, 5, 10]
set(a)

{1, 2, 5, 9, 10}

In [49]:
somefile = 'somefile.txt'
with open(somefile, 'r') as f:
    for line in dedupe(f):
        ...
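
For hashable items there is also a shortcut that relies on dictionaries preserving insertion order, which the language guarantees from Python 3.7 on. It is an alternative to the `dedupe` generator rather than a replacement: it builds the whole result at once and accepts no key function.

```python
a = [1, 5, 2, 1, 9, 1, 5, 10]

# dict.fromkeys keeps the first occurrence of each key, in insertion order
unique_in_order = list(dict.fromkeys(a))
```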