
Multiprocessing in Python means running multiple processes at the same time so that work can be spread across multiple CPU cores. The standard library's multiprocessing module provides the tools to create and manage these processes, allowing developers to use parallelism in their programs.

Here are some reasons why multiprocessing in Python is useful:

Parallelism: Multiprocessing allows you to execute multiple tasks simultaneously, taking advantage of multiple CPU cores. This can significantly speed up the execution of CPU-bound tasks.

Improved Performance: By distributing tasks across multiple processes, you can reduce the overall execution time of your program, especially when dealing with computationally intensive tasks.

Concurrency: Multiprocessing also lets tasks run concurrently, which can help with I/O-bound operations where the program spends a lot of time waiting for external resources (such as disk I/O or network I/O), although threads are usually a lighter-weight choice for that kind of work.

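Isolation: Each process has its own memory space, which provides better isolation compared to threading. Because processes do not share ordinary Python objects, accidental race conditions on shared data are less likely. Resources that are shared explicitly (shared memory, locks, queues) still need synchronization and can still deadlock.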

Fault Tolerance: If one process crashes due to an error, it typically does not affect other processes, ensuring better fault tolerance in multiprocessing compared to multithreading.

Scalability: Multiprocessing allows you to scale your application across multiple CPU cores, making it suitable for handling large workloads and increasing throughput.
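
As a rough illustration of the parallelism and performance points above, the sketch below splits a CPU-bound job into chunks and hands them to worker processes through multiprocessing.Pool. The function names count_primes and count_primes_parallel, the chunk boundaries, and the choice of two workers are made up for this example. Any actual speedup depends on the number of cores and the size of the workload.

```python
import math
import multiprocessing


def count_primes(bounds):
    """Count primes in the half-open range [start, stop) by trial division."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def count_primes_parallel(chunks, processes=2):
    """Count primes for each chunk in a worker process and sum the counts."""
    with multiprocessing.Pool(processes=processes) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    # Four hypothetical chunks covering 0..20000.
    chunks = [(0, 5000), (5000, 10000), (10000, 15000), (15000, 20000)]
    sequential = sum(count_primes(c) for c in chunks)
    parallel = count_primes_parallel(chunks)
    print(sequential, parallel)  # the two totals should be equal
```

For a workload this small, the cost of starting processes may outweigh the gain, so the parallel version is only worth it when each chunk takes noticeable CPU time.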

Multiprocessing and multithreading both provide concurrency, but they differ in how they run code and how they use system resources. Here are the key differences between multiprocessing and multithreading:

Execution Model:

Multiprocessing: In multiprocessing, multiple processes are created, each with its own memory space and resources. These processes run independently and can execute different parts of the program concurrently. Communication between processes typically involves inter-process communication (IPC) mechanisms such as pipes, queues, or shared memory.
Multithreading: In multithreading, multiple threads are created within a single process. Threads share the same memory space and resources, allowing them to access shared data directly. Threads within the same process can execute different parts of the program concurrently.

Isolation:

Multiprocessing: Processes are isolated from each other, with their own memory space. This provides better protection against memory corruption and crashes, as one process crashing generally does not affect others.
Multithreading: Threads within the same process share the same memory space, which can lead to issues like race conditions and data corruption if not properly synchronized.

Resource Usage:

Multiprocessing: Each process has its own memory space and resources, and the operating system can schedule different processes on different CPU cores at the same time. Multiprocessing can therefore utilize multiple CPU cores effectively and is suitable for CPU-bound tasks.
Multithreading: Threads within the same process share that process's memory and resources. Multithreading is more suitable for I/O-bound tasks where threads spend a lot of time waiting for external resources (e.g., disk I/O, network I/O), as one thread can block on I/O while the other threads keep running.

Overhead:

Multiprocessing: Creating and managing separate processes incurs more overhead compared to multithreading. Inter-process communication can also introduce additional overhead.
Multithreading: Creating and managing threads within the same process has less overhead compared to multiprocessing. However, synchronization between threads can introduce overhead and complexity.

Scalability:

Multiprocessing: Multiprocessing can scale across multiple CPU cores, making it suitable for parallel execution of CPU-bound tasks.
Multithreading: Due to the Global Interpreter Lock (GIL) in CPython, multithreading may not scale well for CPU-bound tasks in Python, as only one thread can execute Python bytecode at a time. However, it can still be effective for I/O-bound tasks and situations where the GIL is not a bottleneck.
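
The difference in memory sharing described above can be seen in a small experiment. In this sketch, the module-level list shared_items and the helper names are invented for illustration. A thread appends to the list and the change is visible in the main program, while a child process only changes its own copy.

```python
import multiprocessing
import threading

# Module-level list used to observe which workers can modify the parent's data.
shared_items = []


def append_item(value):
    """Append a value to the module-level list in whichever process runs this."""
    shared_items.append(value)


def run_in_thread(value):
    """Run append_item in a thread of the current process."""
    thread = threading.Thread(target=append_item, args=(value,))
    thread.start()
    thread.join()


def run_in_process(value):
    """Run append_item in a separate process and return its exit code."""
    process = multiprocessing.Process(target=append_item, args=(value,))
    process.start()
    process.join()
    return process.exitcode


if __name__ == "__main__":
    run_in_thread("from thread")
    run_in_process("from process")
    # Only the thread's change is visible; the child process changed its own copy.
    print(shared_items)  # ['from thread']
```

To get data back from a child process, it has to be sent explicitly through an IPC mechanism such as a queue or a pipe.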

In [1]:
import multiprocessing
import os
import time

def worker():
    """Function to be executed by the process."""
    print(f"Worker process ID: {os.getpid()}")
    time.sleep(2)  # Simulate some task
    print("Worker process finished execution.")

if __name__ == "__main__":
    # Create a multiprocessing Process object
    process = multiprocessing.Process(target=worker)

    # Start the process
    process.start()

    print(f"Main process ID: {os.getpid()}")
    print("Main process waiting for the worker process to finish...")

    # Wait for the process to finish
    process.join()

    print("Main process resumed execution.")


Worker process ID: 214
Main process ID: 77
Main process waiting for the worker process to finish...
Worker process finished execution.
Main process resumed execution.
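
Building on the example above, a worker usually needs input and has to hand a result back to the parent. The following is a minimal sketch of one way to do that, passing arguments with args and returning the result through a multiprocessing.Queue. The names square_worker and square_in_child are hypothetical.

```python
import multiprocessing


def square_worker(number, result_queue):
    """Compute a square in the child process and send it back to the parent."""
    result_queue.put((number, number * number))


def square_in_child(number):
    """Start one child process for `number` and return the result it sends back."""
    result_queue = multiprocessing.Queue()
    process = multiprocessing.Process(
        target=square_worker, args=(number, result_queue)
    )
    process.start()
    result = result_queue.get()  # blocks until the child puts a result
    process.join()
    return result


if __name__ == "__main__":
    print(square_in_child(7))  # (7, 49)
```

The result is read from the queue before join() is called, so the child is never left waiting on a parent that has not yet read its data.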


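A multiprocessing pool is a fixed group of worker processes that execute the tasks submitted to it. Working with a pool involves the following steps:

Initialization: You create a multiprocessing pool by instantiating the multiprocessing.Pool class, optionally specifying the desired number of worker processes (if not specified, it defaults to the number of CPUs the system reports).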

Task Distribution: Once the pool is created, you can submit tasks to the pool using its apply, map, apply_async, or map_async methods. These methods distribute the tasks among the worker processes in the pool.

Task Execution: The worker processes in the pool execute the submitted tasks concurrently. Each process picks up a task from the task queue, executes it, and returns the result (if any) to the parent process.

Result Retrieval: The blocking methods apply and map return their results directly once the work is done. The asynchronous methods apply_async and map_async return result objects immediately, and calling get on such an object blocks until the result is ready and then returns it.
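
The four steps above can be sketched in a few lines. The task function cube, the helper run_pool_steps, the pool size of two, and the 10-second timeout are choices made for this example only.

```python
import multiprocessing


def cube(x):
    """Task executed by a worker process."""
    return x ** 3


def run_pool_steps(values, processes=2):
    """Walk through initialization, distribution, execution, and retrieval."""
    # Initialization: a pool with a fixed number of worker processes.
    with multiprocessing.Pool(processes=processes) as pool:
        # Blocking calls: map and apply return their results directly.
        mapped = pool.map(cube, values)
        single = pool.apply(cube, (10,))

        # Asynchronous calls return AsyncResult objects right away.
        async_single = pool.apply_async(cube, (4,))
        async_many = pool.map_async(cube, values)

        # Result retrieval: get() blocks until the result is ready.
        return (
            mapped,
            single,
            async_single.get(timeout=10),
            async_many.get(timeout=10),
        )


if __name__ == "__main__":
    print(run_pool_steps([1, 2, 3]))  # ([1, 8, 27], 1000, 64, [1, 8, 27])
```

Both map and map_async return results in the same order as the inputs, regardless of which worker finished first.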

Multiprocessing pools are useful for several reasons:

Parallelism: Pools allow you to execute multiple tasks concurrently, leveraging the processing power of multiple CPU cores. This can significantly speed up the execution of CPU-bound tasks.

Simplified Task Management: Pools abstract away the complexities of managing multiple processes, such as process creation, synchronization, and communication. You can focus on defining the tasks and let the pool handle the process management.

Efficient Resource Utilization: Pools manage a fixed number of worker processes, which helps in efficiently utilizing system resources without overwhelming the system with too many processes.

Scalability: Pools can scale across multiple CPU cores, making them suitable for parallel execution of computationally intensive tasks on multicore systems.

Asynchronous Execution: Pools support asynchronous task execution, allowing you to submit tasks asynchronously and continue with other tasks while waiting for the results.
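
To make the asynchronous point concrete, here is a small sketch in which the parent submits tasks with apply_async, carries on with its own work, and only waits when it needs the results. The task slow_double, its 0.2-second sleep, and the helper submit_and_continue are hypothetical stand-ins for real work.

```python
import multiprocessing
import time


def slow_double(x):
    """Hypothetical task: pretend to work for a moment, then double the input."""
    time.sleep(0.2)
    return 2 * x


def submit_and_continue(values, processes=2):
    """Submit tasks asynchronously, do other work in the parent, then collect."""
    with multiprocessing.Pool(processes=processes) as pool:
        pending = [pool.apply_async(slow_double, (v,)) for v in values]

        # The parent is free to do unrelated work while the workers run.
        parent_work = sum(range(1000))

        # get() blocks only now, when the results are actually needed.
        results = [p.get(timeout=10) for p in pending]
    return parent_work, results


if __name__ == "__main__":
    print(submit_and_continue([1, 2, 3]))  # (499500, [2, 4, 6])
```

The results are collected inside the with block because leaving the block terminates the pool, which would discard any tasks that had not finished.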