# 1. Scenario: Calculate Squares of Numbers

### 1.1 Solution without Process Pool:
### Issue with Traditional Multiprocessing (without Pool):

- Manual Process Creation: You must manually create, start, and join each process.
- Overhead: Every process has a startup and memory cost, and launching more processes than there are CPU cores adds scheduling overhead without adding parallelism.
- No Automatic Work Distribution: Tasks need to be divided manually among processes.

In [None]:
from multiprocessing import Process
import time
import os

def square_number(number):
    pid = os.getpid()  # Get process ID
    result = number ** 2
    print(f"Process {pid}: Square of {number} is {result}")

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5, 6]

    processes = []

    start_time = time.time()  # Start timing

    for number in numbers:
        p = Process(target=square_number, args=(number,))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

    end_time = time.time()  # End timing
    print(f"Traditional Multiprocessing Time: {end_time - start_time:.4f} seconds")


Traditional Multiprocessing Time: 0.5574 seconds


### Issues:
- Manual Management:
Processes are created, started, and joined manually.
- No Task Limitation:
For 6 numbers, 6 processes are created. The process count grows with the input, so a large input can oversubscribe the CPU.
- No Automatic Work Distribution:
The task division is not handled automatically; all processes are launched simultaneously.
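To make the manual effort concrete, here is a minimal sketch of what limiting the process count by hand might look like. The helpers `split_into_chunks` and `square_chunk` are hypothetical names introduced only for this illustration, and a `Queue` is assumed as the way to send results back to the parent.

```python
from multiprocessing import Process, Queue


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def square_chunk(chunk, queue):
    # Each worker handles a whole slice and reports its results via the queue.
    queue.put([n ** 2 for n in chunk])


if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5, 6]
    queue = Queue()

    # Cap the process count at 3 by hand instead of one process per number.
    processes = [
        Process(target=square_chunk, args=(chunk, queue))
        for chunk in split_into_chunks(numbers, 3)
    ]
    for p in processes:
        p.start()

    # Drain the queue before joining; chunks arrive in completion order.
    results = [queue.get() for _ in processes]
    for p in processes:
        p.join()

    print("Squares (chunks in completion order):", results)
```

The chunking, the result queue, and the start/join loops are all bookkeeping that the caller has to write and maintain.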


### 1.2 Solution with Process Pool:
- The Pool class in Python's multiprocessing module is used for parallel processing. It allows you to manage a pool of worker processes, enabling you to execute tasks concurrently across multiple processes. It abstracts away the complexity of manually managing processes and distributing work.


In [None]:
from multiprocessing import Pool
import os
import time

def square_number(number):
    pid = os.getpid()  # Get process ID
    result = number ** 2
    print(f"Process {pid}: Square of {number} is {result}")
    return result

if __name__ == "__main__":  # Required where child processes are spawned (e.g. Windows, macOS)
    numbers = [1, 2, 3, 4, 5, 6]

    start_time = time.time()  # Start timing

    # Create a process pool with 3 processes
    with Pool(processes=3) as pool:
        results = pool.map(square_number, numbers)

    end_time = time.time()  # End timing
    print("Squares:", results)
    print(f"Pool Multiprocessing Time: {end_time - start_time:.4f} seconds")


### How multiprocessing.Pool addresses these issues:
- Automatic Creation and Management:
The pool creates a fixed number of worker processes and reuses them across tasks.
- Limits Processes:
You can specify the maximum number of processes, avoiding CPU overload.
- Automatic Work Distribution:
The input is split into chunks that are handed to workers as they become free, so no manual division is needed.

### Common Methods of multiprocessing.Pool:
#### i. map(func, iterable, chunksize=None)
- This method applies the func function to every item of the iterable, distributing the work to the worker processes in the pool.
- Usage: It is similar to Python’s built-in map() but distributed across processes.

Parameters:
- func: The function to apply to each item of the iterable.
- iterable: The input iterable (like a list or range).
- chunksize (optional): The approximate number of items sent to a worker per task. If omitted, Pool picks a value based on the input length and the pool size.

Returns:
- A list of results (same order as the input iterable).

In [None]:
from multiprocessing import Pool

def square(x):
    return x ** 2

if __name__ == "__main__":
    with Pool(4) as p:
        results = p.map(square, [1, 2, 3, 4, 5])
    print("map results:", results)  # Output: [1, 4, 9, 16, 25]
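The effect of `chunksize` can be sketched as follows. The values `Pool(2)` and `chunksize=5` are arbitrary choices for illustration; larger chunks generally mean fewer, bigger messages between the parent and the workers, which tends to help when the function itself is cheap.

```python
from multiprocessing import Pool


def square(x):
    return x ** 2


if __name__ == "__main__":
    numbers = list(range(10))
    with Pool(2) as p:
        # chunksize=5 submits the input in batches of about five items,
        # so each worker receives one batch instead of many single items.
        results = p.map(square, numbers, chunksize=5)
    # The result order still matches the input order, regardless of chunking.
    print("chunked map results:", results)
```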


#### ii. apply(func, args=(), kwds={})
- The apply() method applies the function (func) to the given arguments (args) and keyword arguments (kwds) in a single worker process. This method blocks until the function completes execution.

- Usage:
Use this method when you need to execute a function on a single set of arguments and wait for it to complete.


Parameters:
- func: The function to apply.
- args: A tuple of arguments to pass to the function.
- kwds: A dictionary of keyword arguments to pass to the function.

Returns:
- The result of the function call.

In [None]:
from multiprocessing import Pool

def multiply(x, y):
    return x * y

if __name__ == "__main__":
    with Pool(4) as p:
        result = p.apply(multiply, args=(5, 6))
    print("apply result:", result)  # Output: 30


#### iii. apply_async(func, args=(), kwds={}, callback=None, error_callback=None)
- The apply_async() method is the asynchronous version of apply(). It submits the function to a worker process and returns immediately, so the main program is not blocked. You can specify a callback to be executed when the result is ready, and an error_callback to be executed if the call fails.

- Usage:
Use apply_async() when you need to execute a function in parallel and don’t need to wait for it to finish immediately.

Parameters:
- func: The function to apply.
- args: A tuple of arguments to pass to the function.
- kwds: A dictionary of keyword arguments to pass to the function.
- callback: A function to call with the result once the task finishes. It runs in the parent process and should return quickly.
- error_callback: A function to call with the exception instance if the task fails.

Returns:
- An AsyncResult object, which allows you to check the status of the task or get the result when ready.

In [None]:
from multiprocessing import Pool

def square(x):
    return x ** 2

def print_result(result):
    print(f"apply_async result: {result}")

if __name__ == "__main__":
    with Pool(4) as p:
        async_result = p.apply_async(square, args=(4,), callback=print_result)

        # Get the result later
        async_result.get()  # This blocks until the task is completed


#### iv. map_async(func, iterable, chunksize=None, callback=None, error_callback=None)
- map_async() is the asynchronous version of map(). It executes func on each item in the iterable concurrently and returns an AsyncResult object. This allows you to process the results asynchronously, and you can specify a callback or an error_callback for handling results and errors.

- Usage:
Use map_async() when you want to apply a function to multiple items in parallel and get the results asynchronously.

Parameters:
- func: The function to apply to each item.
- iterable: The input iterable.
- chunksize: The size of each chunk of data to send to each worker.
- callback: A function to call with the result once the task finishes.
- error_callback: A function to call if an error occurs during execution.

Returns:
- An AsyncResult object.



In [None]:
from multiprocessing import Pool

def square(x):
    return x ** 2

def print_results(results):
    print(f"map_async results: {results}")

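if __name__ == "__main__":
    with Pool(4) as p:
        async_result = p.map_async(square, [1, 2, 3, 4, 5], callback=print_results)
        # Wait inside the block: leaving it calls terminate(), which could
        # stop the workers before the callback fires.
        async_result.wait()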
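The error_callback parameter can be illustrated with a task that fails for one input. This is a minimal sketch; `reciprocal`, `on_success`, and `on_error` are hypothetical names chosen for the example.

```python
from multiprocessing import Pool


def reciprocal(x):
    return 1 / x  # raises ZeroDivisionError for x == 0


def on_success(results):
    print(f"map_async results: {results}")


def on_error(exc):
    print(f"map_async failed: {exc!r}")


if __name__ == "__main__":
    with Pool(2) as p:
        async_result = p.map_async(
            reciprocal, [1, 2, 0, 4],
            callback=on_success, error_callback=on_error,
        )
        # Stay inside the block until the job is finished and a callback has run.
        async_result.wait()
        # successful() is False because one of the calls raised an exception.
        print("successful:", async_result.successful())
```

Because one input raises, only `on_error` is called here; `on_success` would be called only if every item succeeded.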


#### v. close()
- The close() method prevents any more tasks from being submitted to the pool. Tasks that were already submitted still run, and the worker processes exit once those tasks are finished.

- Usage:
Use close() when you are done adding tasks to the pool and want to prevent further tasks from being submitted.


#### vi. join()
- The join() method blocks the main program until all worker processes have exited. You must call close() or terminate() first; otherwise join() raises an error.

- Usage:
Call join() when you want to wait for all worker processes to finish executing before continuing with the program.


#### vii. terminate()
- The terminate() method immediately terminates all worker processes in the pool. This method can be used if you need to stop all processes prematurely.

- Usage:
Use terminate() to forcefully stop all worker processes. Outstanding tasks are abandoned, so their results are lost.
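When the pool is managed without a `with` block, the three lifecycle methods above are typically combined along these lines. This is a sketch of one reasonable arrangement, not the only valid one.

```python
from multiprocessing import Pool


def square(x):
    return x ** 2


if __name__ == "__main__":
    pool = Pool(2)
    try:
        pending = [pool.apply_async(square, args=(n,)) for n in range(5)]
        pool.close()      # no further tasks may be submitted
    except Exception:
        pool.terminate()  # stop the workers immediately, dropping queued work
        raise
    finally:
        pool.join()       # wait for the worker processes to exit

    # The results remain available after the pool has been joined.
    print("results:", [r.get() for r in pending])
```

Calling `close()` on the normal path lets queued tasks finish, while `terminate()` is reserved for the failure path.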


#### viii. get() (Used with apply_async() or map_async())
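- The get() method is used with asynchronous methods (apply_async(), map_async()) to retrieve the result of the operation once it is complete. It accepts an optional timeout in seconds, and if the task raised an exception, get() re-raises it in the caller.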

- Usage:
Use get() when you want to retrieve the result of an asynchronous task.

In [None]:
from multiprocessing import Pool

def square(x):
    return x ** 2

if __name__ == "__main__":
    with Pool(4) as p:
        async_result = p.apply_async(square, args=(4,))
        result = async_result.get()  # Block until result is ready
    print("Async result:", result)  # Output: 16


### 1.3 Process Pool Executor:
- ProcessPoolExecutor from Python's concurrent.futures module provides a simple interface for parallel processing using multiple processes. It allows you to execute tasks concurrently, utilizing multiple CPU cores efficiently.

### Methods of ProcessPoolExecutor:
#### i. submit(fn, *args, **kwargs)
- Submits a function (fn) to be executed asynchronously in a separate process.
- Returns a Future object, which can be used to retrieve the result later.
- Non-blocking: Allows you to submit multiple tasks and retrieve results when needed.
#### ii. map(func, *iterables, timeout=None, chunksize=1)
- Applies a function (func) to each item in iterables concurrently.
- Returns results in the same order as the input.
- Lazy results: The tasks are submitted right away and map() returns an iterator; iterating over it blocks until each result is ready.
#### iii. shutdown(wait=True)
- It is used to cleanly shut down the ProcessPoolExecutor and free up the resources that were used by the worker processes.

How it works:
- wait=True: Blocks the main program until all submitted tasks are completed and all worker processes are cleaned up (terminated).
- wait=False: Initiates the shutdown process, but doesn't block the program. The program can continue running while the executor is shutting down in the background.
#### iv. result(timeout=None) (from Future)
- Blocks and returns the result of the task once it’s completed. If the task raised an exception, result() re-raises it.
- Accepts a timeout in seconds; concurrent.futures.TimeoutError is raised if the result is not available in time.
#### v. done() (from Future)
- Returns True if the task is finished, otherwise False.
#### vi. cancel() (from Future)
- Attempts to cancel the task. Returns True if the task was cancelled, and False if it is already running or finished.
#### vii. exception(timeout=None) (from Future)
- If the task raised an exception, it returns the exception; otherwise, it returns None.
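The Future methods listed above can be tried out with one task that succeeds and one that fails. A minimal sketch, where `divide` is a hypothetical helper used only for this illustration:

```python
from concurrent.futures import ProcessPoolExecutor


def divide(a, b):
    return a / b  # raises ZeroDivisionError when b == 0


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        ok = executor.submit(divide, 10, 2)
        bad = executor.submit(divide, 1, 0)

        # exception() waits for the task, then returns the exception or None.
        print("ok raised:", ok.exception())
        print("bad raised:", repr(bad.exception()))

        # Both tasks have finished by now, so done() reports True.
        print("ok done:", ok.done())
        print("ok result:", ok.result(timeout=5))

        # cancel() returns False because the task has already finished.
        print("cancel after completion:", ok.cancel())
```

Inspecting `exception()` avoids a try/except around `result()` when the caller only wants to know whether a task failed.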

In [None]:
from concurrent.futures import ProcessPoolExecutor
import time

# Function to square a number
def square(x):
    time.sleep(2)  # Simulate a time-consuming task
    return x ** 2

if __name__ == "__main__":
    # Create ProcessPoolExecutor with 2 worker processes
    with ProcessPoolExecutor(max_workers=2) as executor:
        future1 = executor.submit(square, 10)
        future2 = executor.submit(square, 20)

        print(f"Result of future1: {future1.result()}")
        print(f"Result of future2: {future2.result()}")

    # The executor is automatically shut down when the 'with' block is exited.
    # However, we can explicitly call shutdown() if needed:
    # executor.shutdown(wait=True)
    print("Executor has been shut down and resources are freed.")
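For comparison with submit(), the executor's map() method might be used as in the sketch below. The values `max_workers=2` and `chunksize=2` are arbitrary choices for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def square(x):
    return x ** 2


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        # map() returns an iterator; each step blocks until that result is ready.
        results_iter = executor.map(square, range(6), chunksize=2)
        # Consuming the iterator collects the results in input order.
        results = list(results_iter)
    print("executor.map results:", results)
```

Unlike `Pool.map()`, which returns a finished list, the iterator lets the caller start processing early results while later tasks are still running.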


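- Functionality: Most tasks that you can perform with ProcessPoolExecutor can also be done with multiprocessing.Pool. Both can parallelize tasks, report errors, and collect results. One difference is per-task cancellation: Future.cancel() can cancel a task that has not started yet, whereas Pool offers only terminate() for the whole pool.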
- Real Advantage: The advantage of ProcessPoolExecutor lies in its simplicity and flexibility in managing parallel tasks. It makes it easier to handle individual task status, error handling, and result collection with less boilerplate code compared to multiprocessing.Pool.

- But: Both can achieve the same outcomes. The choice comes down to how much manual work you want to handle and how much abstraction you prefer.