
---
title: Differences between apply/apply_async and map/imap/starmap in Python multiprocessing
tags: 小书匠,python,multiprocessing,multiprocess,apply,map,imap,starmap,map_async,async
grammar_cjkRuby: true
# renderNumberedHeading: true
---

[toc!]

# Differences between apply/apply_async and map/imap/starmap in Python multiprocessing

The `Pool` object in Python's `multiprocessing` module has several methods that are easy to confuse, such as:
1. map/imap/starmap/imap_unordered
2. xx/xx_async, e.g. map/map_async, apply/apply_async, starmap/starmap_async
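
As a quick orientation for the first group, here is a minimal sketch of what each call hands back. It uses `multiprocessing.pool.ThreadPool`, which exposes the same methods as `Pool`. That choice is an assumption made only so the snippet runs without an `if __name__ == "__main__":` guard. `square` is a hypothetical example function.

```python
from multiprocessing.pool import ThreadPool  # same interface as Pool (assumption for portability)

def square(x):
    # Hypothetical task used only for illustration.
    return x * x

with ThreadPool(4) as pool:
    # map blocks and returns a complete list, in input order.
    as_list = pool.map(square, range(5))
    # imap returns a lazy iterator that yields results in input order.
    in_order = list(pool.imap(square, range(5)))
    # imap_unordered yields results as they finish, so the order may vary;
    # sorting makes the comparison deterministic.
    any_order = sorted(pool.imap_unordered(square, range(5)))

print(as_list)
print(in_order)
print(any_order)
```

The iterators are consumed inside the `with` block because leaving the block terminates the pool.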

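## Difference between xx and xx_async

The conclusion first:
1. `pool.xx` blocks and returns the result of the function it was given.
2. `pool.xx_async` returns immediately with a `multiprocessing.pool.ApplyResult` object (`map_async` and `starmap_async` return its `MapResult` subclass); call `.get()` on that object to obtain the result.

Take `pool.apply` and `pool.apply_async` as an example: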

### pool.apply

```python
from multiprocessing import Pool

def func(x):
    print(x)
    return x

pool = Pool()
rets = []
for i in range(5):
    ret = pool.apply(func, args=(i, ))
    rets.append(ret)

pool.close()
pool.join()
```

Output:

```
0
1
2
3
4
```


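`pool.apply` returns the result of `func` directly. Each call blocks until its result is ready, so the five tasks run one after another and print in order.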

### pool.apply_async

```python
from multiprocessing import Pool

def func(x):
    print(x)
    return x

pool = Pool()
rets = []
for i in range(5):
    ret = pool.apply_async(func, args=(i, ))
    rets.append(ret)

pool.close()
pool.join()
for ret in rets:
    print(ret)
    print(ret.get())
```

Output:

```
1
2
0
3
4
<multiprocessing.pool.ApplyResult object at 0x10a03db00>
0
<multiprocessing.pool.ApplyResult object at 0x109fd0a20>
1
<multiprocessing.pool.ApplyResult object at 0x10a1107b8>
2
<multiprocessing.pool.ApplyResult object at 0x10a110b70>
3
<multiprocessing.pool.ApplyResult object at 0x10a1126a0>
4
```


In fact, `pool.apply` is nothing more than `pool.apply_async` followed by a call to `.get()`; see the [source](https://github.com/python/cpython/blob/2d1cbe4193499914ccc9d217ea63eb17ff927c91/Lib/multiprocessing/pool.py#L352):

```python
def apply(self, func, args=(), kwds={}):
    '''
    Equivalent of `func(*args, **kwds)`.
    Pool must be running.
    '''
    return self.apply_async(func, args, kwds).get()
```
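
The equivalence can be checked with a minimal sketch. It uses `multiprocessing.pool.ThreadPool`, which offers the same `apply`/`apply_async` methods as `Pool`. That is an assumption made only so the snippet runs without an `if __name__ == "__main__":` guard. `square` is a hypothetical example function.

```python
from multiprocessing.pool import ThreadPool  # same interface as Pool (assumption for portability)

def square(x):
    # Hypothetical task used only for illustration.
    return x * x

with ThreadPool(4) as pool:
    # apply: each call blocks until its own result is ready.
    blocking = [pool.apply(square, args=(i,)) for i in range(5)]

    # apply_async: submit everything first, then collect the results.
    handles = [pool.apply_async(square, args=(i,)) for i in range(5)]
    non_blocking = [h.get(timeout=10) for h in handles]

print(blocking)
print(non_blocking)
```

Both lists hold the same values. The practical difference is that the `apply_async` version lets all five tasks be in flight at once, while the `apply` loop finishes each task before submitting the next.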

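## map and starmap

This one is simpler: `map` passes each item of the iterable to the function as a single argument, whereas `starmap` unpacks each item into multiple positional arguments.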

```python
def func(x, y):
    print("x: {} y: {}".format(x, y))

pool = Pool()
pool.starmap(func, zip(range(5), range(5)))  # key point: the second argument is an iterable of argument tuples
pool.close()
pool.join()
```

Output:

```
x: 1 y: 1
x: 0 y: 0
x: 4 y: 4
x: 2 y: 2
x: 3 y: 3
```


```python
def func(x):
    print(x)

pool = Pool()
ret = pool.map(func, range(5))  # key point: the second argument is an iterable
pool.close()
pool.join()
```

Output:

```
0
1
2
3
4
```
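
The two examples above only print, so here is a minimal sketch that compares return values instead. It uses `multiprocessing.pool.ThreadPool` (same methods as `Pool`, chosen as an assumption so the snippet runs without an `if __name__ == "__main__":` guard). `add` and `add_pair` are hypothetical helpers.

```python
from multiprocessing.pool import ThreadPool  # same interface as Pool (assumption for portability)

def add(x, y):
    # Two-argument function: a natural fit for starmap.
    return x + y

def add_pair(pair):
    # Single-argument wrapper so the same work can go through map.
    x, y = pair
    return add(x, y)

pairs = list(zip(range(5), range(5)))

with ThreadPool(4) as pool:
    via_starmap = pool.starmap(add, pairs)  # each tuple is unpacked into add(x, y)
    via_map = pool.map(add_pair, pairs)     # each tuple is passed as one argument

print(via_starmap)
print(via_map)
```

Both calls return a list in input order, even though the tasks themselves may finish in any order.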


In fact, `starmap` is built on the same internal helper as `map`; see the [source](https://github.com/python/cpython/blob/2d1cbe4193499914ccc9d217ea63eb17ff927c91/Lib/multiprocessing/pool.py#L374):

```python
def starmap(self, func, iterable, chunksize=None):
    '''
    Like `map()` method but the elements of the `iterable` are expected to
    be iterables as well and will be unpacked as arguments. Hence
    `func` and (a, b) becomes func(a, b).
    '''
    return self._map_async(func, iterable, starmapstar, chunksize).get()
```

It relies on a helper function, `starmapstar`, which is defined [here](https://github.com/python/cpython/blob/2d1cbe4193499914ccc9d217ea63eb17ff927c91/Lib/multiprocessing/pool.py#L50):

```python
def starmapstar(args):
    return list(itertools.starmap(args[0], args[1]))
```
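
To see what this helper does with its argument, here is a minimal sketch that calls a local copy of it directly, without any pool. `add` and the sample `chunk` are hypothetical; `args` is assumed to be a `(func, chunk)` pair, which matches how the helper indexes it.

```python
import itertools

def starmapstar(args):
    # Local copy of the helper quoted above: args[0] is the function,
    # args[1] is an iterable of argument tuples.
    return list(itertools.starmap(args[0], args[1]))

def add(x, y):
    # Hypothetical two-argument task.
    return x + y

chunk = [(0, 0), (1, 1), (2, 2)]
result = starmapstar((add, chunk))
print(result)  # [0, 2, 4]
```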

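`starmap` also relies on `self._map_async`, the internal helper behind `map`, `starmap`, and their `_async` variants (`imap` is implemented separately). Its signature is: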

```python
def _map_async(self, func, iterable, mapper, chunksize=None, callback=None,
        error_callback=None):
    '''
    Helper function to implement map, starmap and their async counterparts.
    '''
    pass
```

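As the signature shows, `mapper` is itself a function: the pool splits `iterable` into chunks, and the mapper (`mapstar` for `map`, `starmapstar` for `starmap`) is applied to each `(func, chunk)` task.

I have not yet worked through the rest of this implementation in detail.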
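
The role of `mapper` can be illustrated with a simplified, single-process model. This is a sketch based on my reading of `pool.py`, not the actual implementation: `mapstar` and the chunk splitting mirror helpers in that file, while `get_tasks` and `simulate_map_async` are hypothetical names, and nothing here is sent to a worker process.

```python
import itertools

def mapstar(args):
    # Mapper used by map: args is a (func, chunk) pair.
    return list(map(*args))

def starmapstar(args):
    # Mapper used by starmap: each chunk item is unpacked into func.
    return list(itertools.starmap(args[0], args[1]))

def get_tasks(func, iterable, chunksize):
    # Split the iterable into (func, chunk) tasks of at most chunksize items.
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, chunksize))
        if not chunk:
            return
        yield (func, chunk)

def simulate_map_async(func, iterable, mapper, chunksize):
    # Hypothetical in-process stand-in: a real pool would hand each task
    # to a worker, which would call mapper(task) there.
    per_chunk = [mapper(task) for task in get_tasks(func, iterable, chunksize)]
    # Flatten the per-chunk lists back into one ordered result list.
    return list(itertools.chain.from_iterable(per_chunk))

def add(x, y):
    return x + y

squares = simulate_map_async(lambda x: x * x, range(5), mapstar, chunksize=2)
sums = simulate_map_async(add, zip(range(5), range(5)), starmapstar, chunksize=2)
print(squares)  # [0, 1, 4, 9, 16]
print(sums)     # [0, 2, 4, 6, 8]
```

The only thing that changes between the `map` and `starmap` paths in this model is which mapper is passed in, which is consistent with the two one-line method bodies quoted earlier.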

# References

1. http://localhost:8888/lab/workspaces/auto-K/tree/learnPython/Python%20tqdm%20multiprocessing.ipynb
2. [python - multiprocessing.Pool: What's the difference between map_async and imap? - Stack Overflow](https://stackoverflow.com/questions/26520781/multiprocessing-pool-whats-the-difference-between-map-async-and-imap/26521507#26521507)