## Creating processes in Python

In this notebook, we will explore how to create processes.

### Parent and child processes

In computer science and operating systems, the terms “parent” and “child” processes are used to describe the relationship between two processes.

A process is an instance of a program that is being executed. It represents a running program along with its current state, including variables, memory, and other resources. When a program is executed, an operating system creates a process to run the program.

A parent process is a process that creates another process, known as the child process. The parent process typically initiates the creation of the child process by using a system call, such as “fork” in Unix-like operating systems. The parent process is responsible for managing and controlling the child process.

When a parent process creates a child process, the child inherits certain attributes from the parent, such as its environment variables and open files. With `fork`, the child also starts with a copy of the parent's memory rather than sharing it. **The child process has its own unique process identifier (PID) and runs independently of the parent process**.

The relationship between a parent process and a child process is often referred to as a “parent-child relationship.” The parent process can monitor the execution of the child process, communicate with it, and perform other operations such as terminating or signaling it.
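
As a minimal sketch of monitoring and terminating a child from the parent, the example below uses the `is_alive()`, `terminate()`, and `exitcode` members of `multiprocessing.Process`. The names `sleeper` and `start_and_terminate` are made up for this illustration.

```python
import multiprocessing
import time

def sleeper():
    """Child task that would run for a long time if left alone."""
    time.sleep(60)

def start_and_terminate():
    """Start a child, stop it early, and report what the parent observed."""
    child = multiprocessing.Process(target=sleeper)
    child.start()
    alive_before = child.is_alive()   # the child is running at this point
    child.terminate()                 # ask the operating system to stop it
    child.join()                      # wait until it has actually exited
    return alive_before, child.is_alive(), child.exitcode

if __name__ == '__main__':
    before, after, code = start_and_terminate()
    print(f"Alive before terminate: {before}")
    print(f"Alive after join: {after}")
    print(f"Exit code: {code}")
```

A terminated child reports a non-zero exit code. On Unix-like systems it is the negative number of the signal that stopped the process.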

It's worth noting that a parent process can have multiple child processes, forming a hierarchical structure. This allows for the creation of complex systems and the execution of concurrent tasks.

<img src="./pics/parent_child_process.svg" alt="Parent and child processes" width="800" height="400">
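
The following sketch illustrates one parent with several children, each with its own PID. It relies on `os.getpid()`, `os.getppid()`, and the `pid` attribute of `multiprocessing.Process`; the helper names `report` and `spawn_children` are assumptions made for this example.

```python
import multiprocessing
import os

def report():
    """Runs in each child: print this child's PID and its parent's PID."""
    print(f"Child PID: {os.getpid()}, parent PID: {os.getppid()}")

def spawn_children(count):
    """Start `count` children, wait for all of them, and return their PIDs."""
    children = [multiprocessing.Process(target=report) for _ in range(count)]
    for child in children:
        child.start()
    # The pid attribute is available in the parent once start() has returned
    pids = [child.pid for child in children]
    for child in children:
        child.join()
    return pids

if __name__ == '__main__':
    print(f"Parent PID: {os.getpid()}")
    print("Child PIDs:", spawn_children(3))
```

With the `fork` and `spawn` start methods, the parent PID printed by each child should match the PID printed by the parent. With `forkserver`, children are created by a helper process, so the reported parent PID can differ.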

### Creating processes in Python

In Python, you can create processes using the `multiprocessing` module, which provides a way to spawn child processes and communicate with them. Here's an example that demonstrates how to create processes in Python:

```python
import multiprocessing
import os

def worker():
    """Function to be executed by the child process"""
    print(f'Worker process executing with PID: {os.getpid()}')


if __name__ == '__main__':
    # Create a new process
    process = multiprocessing.Process(target=worker)
    
    # Start the process
    process.start()
    
    print(f'Parent process executing with PID: {os.getpid()}')
    
    # Wait for the process to complete
    process.join()
```

In this example, we define a function called `worker()` that represents the task to be executed by the child process. The function simply prints a message indicating that it is executing.

Inside the `if __name__ == '__main__':` block, we create a new process using the `multiprocessing.Process` constructor. We pass the `target` argument with the function `worker`, which specifies the function to be executed by the child process.

Next, we start the process by calling the `start()` method on the process object. This will spawn a new process that executes the `worker` function.

After starting the process, the parent prints its own message, which includes its PID. Because the two processes run concurrently, this can happen before or after the child prints its message.

Finally, we call the `join()` method, which blocks the parent process until the child process has finished its execution.

When you run this Python script, you'll see output similar to the following. The PIDs will differ from run to run, and the two lines can appear in either order:

```
Worker process executing with PID: 5092
Parent process executing with PID: 3868
```

This demonstrates the basic process creation and execution flow in Python using the `multiprocessing` module. You can extend this example to perform more complex tasks and communication between processes using various features provided by the module.
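
One detail worth illustrating is how the parent learns whether a child succeeded. The sketch below assumes two made-up tasks, `succeed` and `fail`, and a helper `run_and_get_exitcode`; it reads the `exitcode` attribute of `Process` after `join()` returns.

```python
import multiprocessing
import sys

def succeed():
    """Child task that finishes normally."""
    pass

def fail():
    """Child task that exits with a custom status code."""
    sys.exit(3)

def run_and_get_exitcode(target):
    """Run `target` in a child process and return the child's exit code."""
    process = multiprocessing.Process(target=target)
    process.start()
    process.join()           # exitcode is None until the child has finished
    return process.exitcode

if __name__ == '__main__':
    print("succeed ->", run_and_get_exitcode(succeed))
    print("fail    ->", run_and_get_exitcode(fail))
```

Note that `join()` does not raise an error when the child fails, so checking `exitcode` is one way for the parent to detect a problem.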



### Passing arguments to Process functions

When creating processes in Python using the `multiprocessing` module, you can pass arguments to the target function of the process. There are a few different ways to pass arguments, depending on your specific requirements. Here are a few examples:

1. **Positional Arguments**:
You can pass positional arguments to the target function by supplying them as a tuple through the `args` parameter when creating the `Process` object. Here's an example:

```python
import multiprocessing

def worker(name, age):
    """Function to be executed by the child process"""
    print(f"Worker process executing with name: {name} and age: {age}")

if __name__ == '__main__':
    # Create a new process with arguments
    process = multiprocessing.Process(target=worker, args=('John', 25))

    # Start the process
    process.start()

    # Wait for the process to complete
    process.join()

    print("Parent process executing")
```

In this example, we pass two positional arguments, `'John'` and `25`, to the `worker()` function by specifying them in the `args` parameter when creating the `Process` object.

2. **Keyword Arguments**:
You can also pass keyword arguments to the target function using the `kwargs` parameter when creating the `Process` object. Here's an example:

```python
import multiprocessing

def worker(name, age):
    """Function to be executed by the child process"""
    print(f"Worker process executing with name: {name} and age: {age}")

if __name__ == '__main__':
    # Create a new process with keyword arguments
    process = multiprocessing.Process(target=worker, kwargs={'name': 'John', 'age': 25})

    # Start the process
    process.start()

    # Wait for the process to complete
    process.join()

    print("Parent process executing")
```

In this example, we pass the arguments `'name': 'John'` and `'age': 25` to the `worker()` function using the `kwargs` parameter.

3. **Combination of Positional and Keyword Arguments**:
You can combine both positional and keyword arguments when creating the `Process` object. Here's an example:

```python
import multiprocessing

def worker(name, age):
    """Function to be executed by the child process"""
    print(f"Worker process executing with name: {name} and age: {age}")

if __name__ == '__main__':
    # Create a new process with positional and keyword arguments
    process = multiprocessing.Process(target=worker, args=('John',), kwargs={'age': 25})

    # Start the process
    process.start()

    # Wait for the process to complete
    process.join()

    print("Parent process executing")
```

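In this example, we pass the positional argument `'John'` and the keyword argument `'age': 25` to the `worker()` function. The trailing comma in `('John',)` is required: it makes `args` a one-element tuple rather than a plain string.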

By using these methods, you can pass arguments to the target function of the child process when creating processes in Python using the `multiprocessing` module.

Running the positional-argument example prints:

```
Worker process executing with name: John and age: 25
Parent process executing
```
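
The same `args` and `kwargs` parameters can be used to start several processes with different inputs. This is a small sketch; the function `greet`, its default `greeting` parameter, and the helper `greet_all` are hypothetical names chosen for illustration.

```python
import multiprocessing

def greet(name, greeting='Hello'):
    """Function to be executed by each child process."""
    print(f"{greeting}, {name}!")

def greet_all(people):
    """Start one process per (args, kwargs) pair and return the exit codes."""
    processes = [
        multiprocessing.Process(target=greet, args=args, kwargs=kwargs)
        for args, kwargs in people
    ]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return [process.exitcode for process in processes]

if __name__ == '__main__':
    people = [
        (('Ann',), {}),                  # uses the default greeting
        (('Bob',), {'greeting': 'Hi'}),  # overrides it by keyword
    ]
    print("Exit codes:", greet_all(people))
```

Because the children run concurrently, the greetings may be printed in any order.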


### Communicating with another process and sharing data

Remember that processes are typically **isolated from each other and run in separate memory spaces**. This isolation ensures that each process has its own independent memory and does not interfere with the memory of other processes. This isolation is a fundamental feature of operating systems and is key to ensuring stability, security, and reliability.

Due to this process isolation, you cannot directly share data between processes in Python. Each process has its own memory space, and modifications made to variables in one process do not affect the variables in another process.
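
The isolation described above can be observed with a short experiment. In this sketch, the module-level variable `counter` and the helper names are made up for illustration: the child changes its own copy of the variable, and the parent's copy stays untouched.

```python
import multiprocessing

counter = 0  # each process works with its own copy of this variable

def increment():
    """Runs in the child and modifies the child's copy only."""
    global counter
    counter += 1
    print(f"Child sees counter = {counter}")

def run_child_and_read_counter():
    """Run the child to completion, then return the parent's value."""
    child = multiprocessing.Process(target=increment)
    child.start()
    child.join()
    return counter

if __name__ == '__main__':
    print(f"Parent sees counter = {run_child_and_read_counter()}")
```

The child reports `1`, while the parent still reports `0` after the child has finished.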

To communicate or share data between processes, you need to use the mechanisms provided by Python's `multiprocessing` module, such as pipes, queues, or shared memory. These inter-process communication (IPC) mechanisms exchange data or share memory regions between processes without breaking process isolation, and they handle much of the required synchronization for you.

Here are a few commonly used methods:

#### 1. Pipes:

Pipes provide a way to establish a communication channel between two processes. In Python, you can use the `multiprocessing.Pipe()` function to create a pipe. `Pipe()` returns two connection objects, one for each end of the pipe. You can send data between processes through these connections. Here's a simple example:

```python
from multiprocessing import Process, Pipe, connection
from typing import List

def sender(conn: connection.Connection, messages: List[str]) -> None:
    for msg in messages:
        # Send message
        conn.send(msg)
        while True:
            response = conn.recv()
            # Check if message is received
            if response == 'ack':
                break
            else:
                # Retry sending the message
                conn.send(msg)

    conn.close()

def receiver(conn: connection.Connection) -> None:
    while True:
        # Receive message
        message = conn.recv()
        # Send acknowledgment
        conn.send('ack')
        print("Received:", message)
        if message == 'Bye':
            print('Closing connection...')
            conn.close()
            break

if __name__ == '__main__':
    messages = [
        'Hi',
        'How are you?',
        'Bye'
    ]
    
    # Create a pipe for communication
    parent_conn, child_conn = Pipe()
    
    # Create sender and receiver processes
    sender_process = Process(target=sender, args=(parent_conn, messages))
    receiver_process = Process(target=receiver, args=(child_conn,))

    # Start the sender and receiver processes
    sender_process.start()
    receiver_process.start()

    # Wait for the sender and receiver processes to finish
    sender_process.join()
    receiver_process.join()
```

This code demonstrates inter-process communication using pipes in Python. It involves two processes: a sender process and a receiver process.

The sender process, defined in the `sender` function, takes a connection object (`conn`) and a list of messages as arguments. It iterates over each message in the list and sends it through the connection using `conn.send(msg)`. After sending a message, it enters a loop to wait for a response from the receiver process. It calls `conn.recv()` to receive the response, expecting an acknowledgment message (`'ack'`). If the response is `'ack'`, indicating that the message was received successfully, it breaks out of the loop and proceeds to send the next message. If the response is not `'ack'`, it assumes there was an issue and retries sending the message. This loop continues until all messages have been sent. Finally, it closes the connection using `conn.close()`.

The receiver process, defined in the `receiver()` function, takes a connection object (`conn`) as an argument. It enters an infinite loop, continuously waiting to receive messages through the connection using `conn.recv()`. After receiving a message, it sends an acknowledgment message (`'ack'`) back to the sender process using `conn.send('ack')`. It also prints the received message using `print("Received:", message)`. If the received message is `'Bye'`, indicating the end of communication, it prints a closing message, closes the connection using `conn.close()`, and breaks out of the loop.

> The `Pipe()` function returns a pair of connection objects connected by a pipe, which by default is duplex (two-way).

> The two connection objects returned by `Pipe()` represent the two ends of the pipe. Each connection object has `send()` and `recv()` methods (among others). 

Running the sender and receiver example prints:

```
Received: Hi
Received: How are you?
Received: Bye
Closing connection...
```
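
A pipe can also be one-way. The sketch below assumes a made-up task, `square_sum`, that computes a result in the child and sends it back to the parent. It uses `Pipe(duplex=False)`, which returns a receive-only connection followed by a send-only connection.

```python
from multiprocessing import Process, Pipe

def square_sum(conn, numbers):
    """Runs in the child: send the sum of squares back through the pipe."""
    conn.send(sum(n * n for n in numbers))
    conn.close()

def square_sum_in_child(numbers):
    """Compute the sum of squares in a child process and return it."""
    # With duplex=False the first connection can only receive
    # and the second can only send
    recv_end, send_end = Pipe(duplex=False)
    child = Process(target=square_sum, args=(send_end, numbers))
    child.start()
    send_end.close()          # the parent never sends, so it closes its copy
    result = recv_end.recv()  # blocks until the child has sent the value
    child.join()
    recv_end.close()
    return result

if __name__ == '__main__':
    print("Sum of squares:", square_sum_in_child([1, 2, 3]))
```

Anything passed to `send()` must be picklable, which holds for the integer used here.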


#### 2. Queues:
Queues provide a process-safe way to exchange data between processes. The `multiprocessing.Queue()` class can be used to create a shared queue. Multiple processes can put items into the queue and retrieve them. Here's an example:

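```python
from multiprocessing import Process, Queue

def producer(queue):
    queue.put("Item 1")
    queue.put("Item 2")
    queue.put("Item 3")
    # Sentinel: tells the consumer that no more items will arrive
    queue.put(None)

def consumer(queue):
    while True:
        # get() blocks until an item is available
        item = queue.get()
        if item is None:
            break
        print("Consumed:", item)

if __name__ == '__main__':
    queue = Queue()

    producer_process = Process(target=producer, args=(queue,))
    consumer_process = Process(target=consumer, args=(queue,))

    producer_process.start()
    consumer_process.start()

    producer_process.join()
    consumer_process.join()
```

In this example, the `producer` function puts items into the queue, and the `consumer` function retrieves and prints them. The producer finishes by putting `None` into the queue as a sentinel, and the consumer stops when it receives it. Checking `queue.empty()` in the loop condition instead would be unreliable, because the consumer might run before the producer has put anything into the queue and exit immediately.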

Running the producer and consumer example prints:

```
Consumed: Item 1
Consumed: Item 2
Consumed: Item 3
```
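
Queues are also a common way to distribute work across several processes and collect the results. The following is a sketch under stated assumptions: the names `worker` and `parallel_square` are hypothetical, one `None` sentinel is sent per worker, and the input numbers are assumed to be distinct so they can serve as dictionary keys.

```python
from multiprocessing import Process, Queue

def worker(tasks, results):
    """Take numbers from `tasks` until a None sentinel arrives."""
    while True:
        number = tasks.get()
        if number is None:
            break
        results.put((number, number * number))

def parallel_square(numbers, n_workers=2):
    """Square distinct numbers using `n_workers` child processes."""
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for process in workers:
        process.start()
    for number in numbers:
        tasks.put(number)
    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    # Collect every result before joining, so no worker is left
    # blocked while flushing its queued data
    squares = dict(results.get() for _ in numbers)
    for process in workers:
        process.join()
    return [squares[number] for number in numbers]

if __name__ == '__main__':
    print(parallel_square([1, 2, 3, 4]))
```

Results can arrive in any order, which is why each one is returned together with its input and reordered at the end.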


#### 3. Shared Memory:
Shared memory allows multiple processes to access the **same region of memory**.

In the `multiprocessing` module of Python, the `Value` and `Array` classes are provided to create shared memory objects that can be accessed and modified by multiple processes.

1. `multiprocessing.Value`: This lets you create a shared variable of a specific type. Its first two arguments are the data type of the variable and its initial value. The type is either a ctypes type or a one-character typecode of the kind used by the `array` module, for example `'b'` (signed char), `'i'` (signed integer), `'d'` (double-precision float), or `'c'` (character).

   Here's an example that creates a shared integer variable using `Value`:

```python
from multiprocessing import Process, Value

def increment_counter(counter):
    # `counter.value += 1` is a read followed by a write, so hold the
    # lock to keep another process from updating in between
    with counter.get_lock():
        counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)

    processes = []
    for _ in range(5):
        process = Process(target=increment_counter, args=(counter,))
        processes.append(process)
        process.start()

    for process in processes:
        process.join()

    print("Counter value:", counter.value)
```

In this example, a shared integer variable named `counter` is created with an initial value of 0. The `increment_counter` function increments `counter` by 1 while holding the lock returned by `counter.get_lock()`, because `counter.value += 1` is not atomic on its own. Multiple processes are created, and each process calls the `increment_counter` function, passing the shared `counter` variable as an argument. After all the processes finish, the final value of `counter` is printed.

2. `multiprocessing.Array`: This lets you create a shared array of a specific type. Its first two arguments are the data type of the array and either a sequence of initial values or an integer length, in which case the array is zero-initialized. The supported data types for the array are the same as those for `Value`.

   Here's an example that demonstrates the usage of `Array` to create a shared array:

```python
from multiprocessing import Process, Array

def square_numbers(numbers):
    # Hold the array's lock for the whole pass so that the two
    # processes cannot interleave their read-modify-write updates
    with numbers.get_lock():
        for i in range(len(numbers)):
            numbers[i] = numbers[i] ** 2

if __name__ == '__main__':
    numbers = [1, 2, 3, 4, 5]
    shared_numbers = Array('i', numbers)

    processes = []
    for _ in range(2):
        process = Process(target=square_numbers, args=(shared_numbers,))
        processes.append(process)
        process.start()

    for process in processes:
        process.join()

    print("Shared numbers:", shared_numbers[:])
```

In this example, a regular Python list named `numbers` is created with some initial values. The `Array` class is then used to create a shared array named `shared_numbers` of type integer (`'i'`), initialized with the values from `numbers`. The `square_numbers` function squares each element of the shared array in place while holding the array's lock. Two processes are created, and each calls `square_numbers` on the same shared array, so every element is squared twice and ends up as the fourth power of its initial value. After the processes finish, the final contents of the shared array are printed.

Both `Value` and `Array` provide a convenient way to share data between processes. By default each is created with a lock, but that lock only protects individual reads and writes. When multiple processes perform read-modify-write operations on shared memory, hold the lock returned by `get_lock()` around the whole operation to avoid race conditions or data corruption.

Running the `Value` example prints:

```
Counter value: 5
```


Running the `Array` example prints the following. Each value is the fourth power of the input because both processes square every element:

```
Shared numbers: [1, 16, 81, 256, 625]
```
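
Lost updates become more likely when each process performs many updates. The sketch below, with the made-up helper names `add_many` and `count_in_parallel`, shows one way to keep the total exact by wrapping each increment in the lock returned by `get_lock()`.

```python
from multiprocessing import Process, Value

def add_many(counter, times):
    """Increment the shared counter `times` times, one locked step at a time."""
    for _ in range(times):
        with counter.get_lock():
            counter.value += 1

def count_in_parallel(n_processes=4, times=1000):
    """Run `n_processes` children that each add `times` to a shared counter."""
    counter = Value('i', 0)
    processes = [Process(target=add_many, args=(counter, times))
                 for _ in range(n_processes)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return counter.value

if __name__ == '__main__':
    print("Counter value:", count_in_parallel())
```

With the lock in place, the result equals `n_processes * times`. Removing the `with` statement can produce a smaller total, because concurrent increments may overwrite each other.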
