# A Guided Tour of Ray Core: Multiprocessing Pool

![Anyscale Academy](../images/AnyscaleAcademyLogo.png)

© 2019-2022, Anyscale. All Rights Reserved

[*Distributed multiprocessing.Pool*](https://docs.ray.io/en/latest/multiprocessing.html) makes it easy to scale existing Python applications that use [`multiprocessing.Pool`](https://docs.python.org/3/library/multiprocessing.html) from a single node to a cluster. Ray implements the `multiprocessing.Pool` API with Ray *actors*, each running on a [worker node](https://docs.ray.io/en/latest/ray-core/actors.html#faq-actors-workers-and-resources), instead of local processes, so existing pool-based code can run distributed without being restructured.
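Because the two pools expose the same interface, code written against one can be pointed at the other. The following is a minimal sketch of that idea, not part of the original notebook: the helper names `square` and `square_all` are hypothetical, and the helper assumes only that the pool class accepts a `processes` argument and provides `map` and `terminate`.

```python
def square(x):
    """Stand-in for real work; defined at module level so it can be pickled."""
    return x ** 2


def square_all(pool_factory, values, processes=None):
    """Map `square` over `values` with any Pool-compatible class (hypothetical helper)."""
    pool = pool_factory(processes)
    try:
        # map() blocks until every result is ready and preserves input order.
        return pool.map(square, values)
    finally:
        # Release the workers even if map() raises.
        pool.terminate()
```

For example, `square_all(multiprocessing.Pool, range(10))` would run on local processes, while `square_all(ray.util.multiprocessing.Pool, range(10))` would run the same call on Ray actors. With the standard-library pool, place the call under an `if __name__ == "__main__":` guard when running it as a script.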

First, let's import the modules we need…

In [1]:
import multiprocessing as mp
import time
import logging
import ray

## Multiprocessing Pool example

The following is a simple Python function with a slight delay added (to make it behave like a more complex calculation)...

In [2]:
# this could be some complicated and compute intensive task
def func(x):
    time.sleep(1.5)
    return x ** 2

Next, start Ray so that we can use its drop-in replacement for [multiprocessing pool](https://docs.ray.io/en/latest/multiprocessing.html):

In [3]:
ray.init(
    ignore_reinit_error=True,
    logging_level=logging.ERROR,
)

{'node_ip_address': '127.0.0.1',
 'raylet_ip_address': '127.0.0.1',
 'redis_address': None,
 'object_store_address': '/tmp/ray/session_2022-03-15_10-11-12_801942_29045/sockets/plasma_store',
 'raylet_socket_name': '/tmp/ray/session_2022-03-15_10-11-12_801942_29045/sockets/raylet',
 'webui_url': '127.0.0.1:8265',
 'session_dir': '/tmp/ray/session_2022-03-15_10-11-12_801942_29045',
 'metrics_export_port': 61227,
 'gcs_address': '127.0.0.1:59920',
 'address': '127.0.0.1:59920',
 'node_id': 'de2c95b087773266ed8332cbe26bddd759ed6ffd36ddb5f2e271d73d'}

Now we'll create a *Pool* and distribute its tasks across a cluster (or across the available cores on a laptop):

In [4]:
%%time

from ray.util.multiprocessing import Pool

pool = Pool()

for result in pool.map(func, range(10)):
    print(result)

0
1
4
9
16
25
36
49
64
81
CPU times: user 181 ms, sys: 75.6 ms, total: 257 ms
Wall time: 1.98 s


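The distributed version incurs some additional overhead, but in exchange it can scale out horizontally across a cluster. Here, the ten 1.5-second calls complete in about 2 seconds of wall time rather than the roughly 15 seconds they would take if run one after another. The benefits would be more pronounced with a more computationally expensive calculation.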
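To put the measured wall time in perspective, the ideal wall time of a pool can be estimated with simple arithmetic. The sketch below is a simplified model, not Ray's scheduler: it assumes equal-length tasks that run in "waves" of one task per worker and ignores chunking and start-up costs. The function name `ideal_wall_time` and the worker counts are hypothetical; the 1.98 s figure is the wall time reported in the output above.

```python
import math


def ideal_wall_time(n_tasks, n_workers, task_seconds):
    """Lower bound on wall time when equal tasks run in waves of n_workers."""
    waves = math.ceil(n_tasks / n_workers)
    return waves * task_seconds


serial = ideal_wall_time(10, 1, 1.5)     # one worker: ten waves
parallel = ideal_wall_time(10, 10, 1.5)  # assumed ten workers: one wave
measured = 1.98                          # wall time from the notebook output

print(f"serial estimate:   {serial:.2f} s")
print(f"parallel estimate: {parallel:.2f} s")
print(f"overhead estimate: {measured - parallel:.2f} s")
```

Under these assumptions, the gap between the measured time and the one-wave estimate gives a rough sense of the pool's overhead for this run.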

In [5]:
pool.terminate()

Finally, shut down Ray:

In [6]:
ray.shutdown()