# How should an application choose among these parallelism approaches

## Summary of the parallelism approaches
<img src="../asset/parallelism.jpg" width="100%" >

## How to choose among these approaches?
### I/O-bound
![](../asset/IOBound.png)

### CPU-bound
![](../asset/CPUBound.png)
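
One rough way to tell which category a piece of code falls into is to compare its CPU time with its wall-clock time. The sketch below is only an illustration under stated assumptions: `measure`, `classify`, `fake_io`, and `busy` are hypothetical helpers that do not appear in the original material, and the 0.5 threshold is an arbitrary choice.

```python
import time


def measure(func, *args):
    """Return (wall_seconds, cpu_seconds) spent running func(*args)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def classify(wall, cpu, threshold=0.5):
    """Label a workload from its CPU-time / wall-time ratio.

    The 0.5 threshold is an assumption made for this illustration.
    """
    if wall <= 0:
        return "unknown"
    return "CPU-bound" if cpu / wall >= threshold else "I/O-bound"


def fake_io(seconds):
    # Stand-in for a network or disk wait: the process sleeps and uses almost no CPU.
    time.sleep(seconds)


def busy(number):
    # A pure-Python computation: the CPU is working the whole time.
    return sum(i * i for i in range(number))


if __name__ == "__main__":
    for name, func, arg in [("fake_io", fake_io, 0.2), ("busy", busy, 2_000_000)]:
        wall, cpu = measure(func, arg)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s -> {classify(wall, cpu)}")
```

A workload that spends most of its wall-clock time waiting is a candidate for threads or asyncio; one that keeps the CPU busy is a candidate for multiple processes.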

### How to speed up an I/O-bound program

In [14]:
import requests
import time


def download_site(url, session):
    with session.get(url) as response:
        print(f"Read {len(response.content)} from {url}")


def download_all_sites(sites):
    with requests.Session() as session:
        for url in sites:
            download_site(url, session)


if __name__ == "__main__":
    sites = [
        "https://www.baidu.com",
    ] * 80
    start_time = time.time()
    download_all_sites(sites)
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} in {duration} seconds")

Read 2443 from https://www.baidu.com

Rewritten as a multithreaded version

In [16]:
import concurrent.futures
import requests
import threading
import time


thread_local = threading.local()


def get_session():
    if not hasattr(thread_local, "session"):
        thread_local.session = requests.Session()
    return thread_local.session


def download_site(url):
    session = get_session()
    with session.get(url) as response:
        print(f"Read {len(response.content)} from {url}")


def download_all_sites(sites):
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        executor.map(download_site, sites)


if __name__ == "__main__":
    sites = [
        "https://www.baidu.com",
    ] * 80
    start_time = time.time()
    download_all_sites(sites)
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} in {duration} seconds")

Read 2443 from https://www.baidu.com
Read 2443 from https://www.baidu.com
Read 2443 from https://www.baidu.com
Read 2443 from https://www.baidu.comRead 2443 from https://www.baidu.com


![](../asset/Threading.png)
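
The thread-pool pattern above can also be tried without touching the network. The following is a minimal sketch, assuming a hypothetical `fake_download` that sleeps in place of `session.get(url)`; the `example.invalid` URLs are placeholders and are never contacted.

```python
import concurrent.futures
import threading
import time


def fake_download(url):
    # Stand-in for session.get(url): sleeping releases the GIL,
    # much like waiting on a socket does.
    time.sleep(0.05)
    return url, threading.current_thread().name


def download_all(urls, max_workers=5):
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in input order, even if tasks finish out of order.
        return list(executor.map(fake_download, urls))


if __name__ == "__main__":
    urls = [f"https://example.invalid/page/{i}" for i in range(20)]
    start = time.perf_counter()
    results = download_all(urls)
    duration = time.perf_counter() - start
    workers = {name for _, name in results}
    print(f"{len(results)} fake downloads on {len(workers)} threads in {duration:.2f} seconds")
```

Consuming the iterator returned by `executor.map` (here with `list(...)`) also re-raises any exception thrown inside a worker. When the results are never retrieved, such exceptions pass unnoticed.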

Rewritten as an asyncio version

In [17]:
# %load src/asyncio-aiohttp.py
import asyncio
import time
import aiohttp


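async def download_site(session, url):
    async with session.get(url) as response:
        # content_length is the value of the Content-Length header (it may be
        # None); the response body itself is not read here.
        print("Read {0} from {1}".format(response.content_length, url))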


async def download_all_sites(sites):
    async with aiohttp.ClientSession() as session:
        tasks = []
        for url in sites:
            task = asyncio.ensure_future(download_site(session, url))
            tasks.append(task)
        await asyncio.gather(*tasks, return_exceptions=True)

if __name__ == "__main__":
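    sites = [
        "https://www.baidu.com",
    ] * 80
    start_time = time.time()
    # Inside Jupyter an event loop is already running, so this call raises
    # RuntimeError there; run the file as a script instead. In a script on
    # recent Python versions, asyncio.run(download_all_sites(sites)) is the
    # usual entry point.
    asyncio.get_event_loop().run_until_complete(download_all_sites(sites))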
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} sites in {duration} seconds")


RuntimeError: This event loop is already running

Read 227 from https://www.baidu.com

In [11]:
!python src/asyncio-aiohttp.py

Read 3549 from https://www.jython.org

![](../asset/Asyncio.webp)
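
The same idea can be sketched without `aiohttp` or a network connection. In the example below, `fake_download`, the `limit` parameter, and the `example.invalid` URLs are assumptions made for illustration; `asyncio.sleep` stands in for the real request, and a semaphore caps how many coroutines are in flight at once.

```python
import asyncio


async def fake_download(url, delay=0.05):
    # Stand-in for an aiohttp request: awaiting hands control back to the event loop.
    await asyncio.sleep(delay)
    return f"Read from {url}"


async def download_all(urls, limit=10):
    # The semaphore caps how many fake requests are in flight at once.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:
            return await fake_download(url)

    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(bounded(url) for url in urls))


if __name__ == "__main__":
    urls = [f"https://example.invalid/page/{i}" for i in range(20)]
    results = asyncio.run(download_all(urls))
    print(f"Finished {len(results)} fake downloads")
```

`asyncio.run` is meant for scripts. Inside a notebook cell, where a loop is already running, `await download_all(urls)` can be used directly instead.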

Rewritten as a multiprocessing version

In [None]:
# %load src/multiprocessing-http.py
import requests
import multiprocessing
import time

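# Each worker process gets its own copy of this module-level variable.
session = None


def set_global_session():
    # Runs once in every worker process (see Pool(initializer=...)),
    # so each process creates and reuses a single Session.
    global session
    if not session:
        session = requests.Session()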


def download_site(url):
    with session.get(url) as response:
        name = multiprocessing.current_process().name
        print(f"{name}:Read {len(response.content)} from {url}")


def download_all_sites(sites):
    with multiprocessing.Pool(initializer=set_global_session) as pool:
        pool.map(download_site, sites)


if __name__ == "__main__":
    sites = [
        "https://www.jython.org",
        "https://olympus.realpython.org/dice",
    ] * 10
    start_time = time.time()
    download_all_sites(sites)
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} in {duration} seconds")

In [12]:
!python src/multiprocessing-http.py

ForkPoolWorker-9:Read 10267 from https://www.jython.org
ForkPoolWorker-17:Read 10267 from https://www.jython.org
ForkPoolWorker-3:Read 10267 from https://www.jython.org
ForkPoolWorker-5:Read 10267 from https://www.jython.org
ForkPoolWorker-23:Read 10267 from https://www.jython.org
ForkPoolWorker-9:Read 10267 from https://www.jython.org
ForkPoolWorker-20:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-26:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-7:Read 10267 from https://www.jython.org
ForkPoolWorker-32:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-34:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-56:Read 10267 from https://www.jython.org
ForkPoolWorker-44:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-2:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-37:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-40:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-5:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-17:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-23:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-21:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-46:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-13:Read 10267 from https://www.jython.org
ForkPoolWorker-25:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-24:Read 10267 from https://www.jython.org
ForkPoolWorker-38:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-63:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-43:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-50:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-56:Read 278 from http://olympus.realpython.org/dice
ForkPoolWorker-37:Read 10267 from https://www.jython.org
ForkPoolWorker-14:Read 10267 from https://www.jython.org

![](../asset/MProc.webp)

### How to speed up a CPU-bound program

In [18]:
import time


def cpu_bound(number):
    return sum(i * i for i in range(number))


def find_sums(numbers):
    for number in numbers:
        cpu_bound(number)


if __name__ == "__main__":
    numbers = [5_000_000 + x for x in range(20)]

    start_time = time.time()
    find_sums(numbers)
    duration = time.time() - start_time
    print(f"Duration {duration} seconds")

Duration 37.74378275871277 seconds


![](../asset/CPUBound.webp)

Rewritten as a multiprocessing version

In [19]:
# %load src/multiprocessing-cpu.py
import multiprocessing
import time


def cpu_bound(number):
    return sum(i * i for i in range(number))


def find_sums(numbers):
    with multiprocessing.Pool() as pool:
        pool.map(cpu_bound, numbers)


if __name__ == "__main__":
    numbers = [5_000_000 + x for x in range(20)]

    start_time = time.time()
    find_sums(numbers)
    duration = time.time() - start_time
    print(f"Duration {duration} seconds")

Duration 3.974608898162842 seconds


In [20]:
!python src/multiprocessing-cpu.py

Duration 4.311970472335815 seconds


![](../asset/CPUMP.webp)
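
The multiprocessing version above discards the values returned by `pool.map`. The sketch below is one possible variation, not the original script: it keeps the results and checks them against a sequential run. The smaller input sizes and the `processes` parameter are assumptions chosen so that it finishes quickly.

```python
import multiprocessing
import time


def cpu_bound(number):
    return sum(i * i for i in range(number))


def find_sums(numbers, processes=None):
    # processes=None lets Pool pick a worker count based on the available CPUs.
    with multiprocessing.Pool(processes=processes) as pool:
        # pool.map returns the results as a list, in input order.
        return pool.map(cpu_bound, numbers)


if __name__ == "__main__":
    numbers = [200_000 + x for x in range(8)]

    start = time.perf_counter()
    sequential = [cpu_bound(n) for n in numbers]
    sequential_time = time.perf_counter() - start

    start = time.perf_counter()
    parallel = find_sums(numbers)
    parallel_time = time.perf_counter() - start

    print(f"Same results: {sequential == parallel}")
    print(f"Sequential {sequential_time:.2f} s, parallel {parallel_time:.2f} s")
```

With inputs this small, the cost of starting the worker processes can outweigh the gain, so the parallel run is not guaranteed to be faster. The speedup shown in the timings above comes from much larger inputs.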

> Premature optimization is the root of all evil (or at least most of it) in programming.
>
> -- Donald Knuth

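# Interview questions
## Explain the differences between processes, threads, and coroutines, and give examples of the criteria for choosing among them in practice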