# **Assignment: Files & Exception Handling (Module 8)**

### **Question 1: Discuss the scenarios where multithreading is preferable to multiprocessing and scenarios where multiprocessing is a better choice.**

## Solution :

### Multithreading is Preferable

Multithreading is typically better suited for I/O-bound tasks, where tasks spend significant time waiting for external resources, like file I/O or network requests, rather than performing heavy computations.

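This is because a CPython thread releases the Global Interpreter Lock (GIL) while it blocks on I/O, so other threads can run in the meantime. Threads also share the same memory space, which makes communication between them cheap.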

### Ideal Scenarios for Multithreading:

**I/O-Bound Tasks:** When tasks involve waiting for input or output operations, such as reading files, querying databases, or making network requests. While one thread waits for its I/O to complete, the interpreter can switch to another thread that is ready to run.

**Lightweight Tasks:** When tasks are relatively lightweight and don’t require extensive CPU resources, since threads are cheaper to create and to switch between than processes.

**Shared Memory Requirements:** When tasks need to share data or variables with each other frequently, threads can be more efficient because they share the same memory space and avoid the need for inter-process communication (IPC).

**Limited Memory Resources:** Since threads share the same memory, they typically use less memory than multiple processes. For applications with limited memory resources, multithreading can be more efficient.

**User Interface Applications:** In applications with a graphical user interface (GUI), multithreading keeps the application responsive (e.g., background tasks run without freezing the UI).

**Example:** Web servers handle multiple network requests simultaneously, often using threads. Since each request may spend time waiting on a network response, threads can manage multiple requests without needing excessive CPU time.
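
The I/O-bound case can be sketched as follows. This is a minimal illustration, not a real client: `fake_download` and `fetch_all` are hypothetical names, and `time.sleep` stands in for a network request (an assumption made so the example runs without a network).

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(item_id):
    """Stand-in for a network request: sleeps instead of doing real I/O."""
    time.sleep(0.2)  # the GIL is released while this thread sleeps
    return f"payload-{item_id}"

def fetch_all(item_ids, max_workers=5):
    """Run fake_download for every id on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fake_download, item_ids))

start = time.perf_counter()
payloads = fetch_all(range(5))
elapsed = time.perf_counter() - start

print(payloads)
print(f"5 simulated requests took {elapsed:.2f}s (about 1.0s if run one by one)")
```

Because the five waits overlap, the total time should be close to a single 0.2-second wait rather than the sum of all five.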

### Multiprocessing is Preferable
Multiprocessing is generally better for CPU-bound tasks that require intensive computation. Each process runs in its own memory space, allowing for parallel execution on multiple CPU cores without the limitations of the Global Interpreter Lock (GIL) in Python.

### Ideal Scenarios for Multiprocessing:

**CPU-Bound Tasks:** When tasks involve heavy computation (like mathematical calculations or data processing) that can be done independently, multiprocessing can effectively use multiple CPU cores.

**Avoiding the GIL:** In Python, the Global Interpreter Lock (GIL) prevents multiple threads from executing Python bytecode in parallel on multiple cores. Multiprocessing bypasses the GIL since each process has its own Python interpreter instance.

**High Isolation Requirements:** When tasks require strong isolation (e.g., separate memory spaces), multiprocessing can be more stable because each process is separate, so a crash or memory leak in one process does not affect the others.

**Large or Independent Tasks:** For tasks that are large and do not need to share memory frequently, multiprocessing is more efficient. Since processes don’t share memory, they communicate through IPC methods like pipes or message queues.

**Background Workers for Heavy Tasks:** For tasks that need to run in the background but involve significant computation, like data analytics or machine learning, multiprocessing can handle them without slowing down the main program.

**Example:** Image processing, where large batches of images need to be processed independently (e.g., resizing, filtering), can leverage multiprocessing to run parallel processes, allowing each CPU core to process a different image simultaneously.
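
The CPU-bound case can be sketched in the same spirit. This is an illustration under stated assumptions: `count_primes` is a hypothetical, deliberately slow function standing in for heavy work such as image filtering, and the limits are arbitrary.

```python
from multiprocessing import Pool

def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for candidate in range(2, limit):
        # A number is prime if no divisor up to its square root divides it
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limits = [20_000, 30_000, 40_000, 50_000]

    # Pool() starts one worker process per available CPU core by default
    with Pool() as pool:
        counts = pool.map(count_primes, limits)

    for limit, count in zip(limits, counts):
        print(f"primes below {limit}: {count}")
```

Each call to `count_primes` runs in a separate process with its own interpreter, so the calls can execute on different cores at the same time.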

### **Question 2: Describe what a process pool is and how it helps in managing multiple processes efficiently.**

## Solution :


A **process pool** is a programming construct used to manage and execute multiple processes concurrently (at the same time), typically when performing tasks that can benefit from parallel processing.

In Python, a process pool is commonly managed through libraries like multiprocessing or concurrent.futures.

A process pool allows for the creation of a fixed number of worker processes that can run tasks in parallel, balancing the workload across multiple processes without requiring the programmer to explicitly handle process creation, termination, and task distribution.

It helps manage multiple processes efficiently through the following key features:

**Efficient Resource Management:** By limiting the number of concurrent processes, a process pool avoids the overhead and system strain that can come from creating too many processes at once. This is especially useful when tasks involve heavy computations, which can exhaust system resources quickly.

**Automatic Task Distribution:** A process pool handles task distribution by assigning tasks to available worker processes automatically. This reduces the overhead of manually assigning tasks to processes and tracking their states.

**Reduced Process Creation Overhead:** Creating and destroying processes repeatedly can be computationally expensive. A process pool mitigates this by reusing a set number of worker processes, which minimizes the overhead involved with repeatedly creating and terminating processes.

**Simplified Parallel Execution:** Process pools abstract much of the complexity of parallel processing, allowing developers to focus on the tasks at hand rather than managing the low-level details of process lifecycles, inter-process communication, and synchronization.

In [1]:
# Example of Using a Process Pool in Python

from multiprocessing import Pool

def square(n):
    """
    Computes the square of a given number.

    Parameters:
        n (int): The number to be squared.

    Returns:
        int: The square of the input number.
    """
    # Calculate the square of the input number
    return n * n

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]

    # Create a process pool with 3 worker processes
    with Pool(processes=3) as pool:
        results = pool.map(square, numbers)
    # Print the results collected from the worker processes
    print(results)

[1, 4, 9, 16, 25]
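
The same pattern is available through `concurrent.futures`, the other library mentioned above. The following is a minimal sketch of an equivalent pool; `cube` is a hypothetical task function chosen only for illustration.

```python
from concurrent.futures import ProcessPoolExecutor

def cube(n):
    """Return n raised to the third power."""
    return n ** 3

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]

    # max_workers plays the same role as Pool(processes=...)
    with ProcessPoolExecutor(max_workers=3) as executor:
        results = list(executor.map(cube, numbers))

    print(results)  # [1, 8, 27, 64, 125]
```

`executor.map` returns results in input order, just like `Pool.map`, so the two APIs are largely interchangeable for simple fan-out work.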


### **Question 3: Explain what multiprocessing is and why it is used in Python programs.**

## Solution :

**Multiprocessing** is a technique that allows a program to use multiple CPU cores by running several processes simultaneously.  

Each process runs independently and can execute code in parallel with other processes, enabling programs to perform tasks concurrently.

**Benefits of Using Multiprocessing in Python:**

**Overcoming the GIL** : In Python, the GIL only allows one thread to execute Python bytecode at a time within a single process. This can limit the effectiveness of multithreading, especially for CPU-bound tasks (tasks that use a lot of CPU resources). Multiprocessing bypasses this limitation by creating separate processes, each with its own interpreter and memory space, allowing true parallelism.

**Parallel Execution** : Multiprocessing enables true parallelism, which is ideal for CPU-intensive tasks like data processing, scientific computing, image processing, and machine learning computations. Each process can run on a separate CPU core, distributing the workload across multiple cores and significantly speeding up execution.

**Isolation Between Processes** : Unlike threads, processes have their own memory space, so they don’t share memory by default. This reduces the risk of data corruption and simplifies debugging since there’s less concern about race conditions or locks that are common in shared-memory threading models.

**Scalability for Resource-Intensive Tasks** : For large datasets or complex algorithms, multiprocessing helps by dividing the workload among processes. Each process can independently perform part of the computation, making it scalable and efficient, especially on multi-core machines.

In [2]:
# Example of Multiprocessing in Python
"""
Below code demonstrates creating multiple processes to perform
a CPU-bound task (calculating the square of a number) in parallel.
"""

from multiprocessing import Process

def square(number):
    """
    Function to compute the square of a number.
    """
    result = number * number
    print(f"The square of {number} is {result}")

if __name__ == "__main__":
    # List of numbers to compute squares
    numbers = [1, 2, 3, 4, 5]

    # Create a list to hold the process objects
    processes = []

    # Create and start a separate process for each number
    for number in numbers:
        process = Process(target=square, args=(number,))
        processes.append(process)
        process.start()

    # Wait for all processes to complete
    for process in processes:
        process.join()

The square of 1 is 1The square of 2 is 4

The square of 3 is 9
The square of 4 is 16The square of 5 is 25
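
The isolation between processes described above can be observed directly. This is a small sketch, with `counter` and `bump` as hypothetical names: the child changes its own copy of a module-level variable, and the parent's copy is untouched.

```python
from multiprocessing import Process

counter = 0  # module-level variable owned by the parent process

def bump():
    """Runs in the child process and changes the child's own copy of counter."""
    global counter
    counter += 1
    print("child sees counter =", counter)

if __name__ == "__main__":
    child = Process(target=bump)
    child.start()
    child.join()

    # The parent's variable is unchanged: memory is not shared by default
    print("parent still sees counter =", counter)
```

This is why data must be passed explicitly between processes (for example through a queue or shared memory) rather than through ordinary variables.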



### **Question 4: Write a Python program using multithreading where one thread adds numbers to a list, and another thread removes numbers from the list. Implement a mechanism to avoid race conditions using threading.Lock.**

## Solution :

In [3]:
import threading
import time
import random

# Shared list
shared_list = []

# Lock to avoid race condition
list_lock = threading.Lock()

# Function for adding numbers to the list
def add_numbers():
    for i in range(10):
        time.sleep(random.uniform(0.1, 0.5))  # Simulate work by sleeping for a random time
        with list_lock:  # Acquire the lock before modifying the list
            num = random.randint(1, 100)
            shared_list.append(num)
            print(f"Added {num}, list now: {shared_list}")

# Function for removing numbers from the list
def remove_numbers():
    for i in range(10):
        time.sleep(random.uniform(0.1, 0.5))  # Simulate work by sleeping for a random time
        with list_lock:  # Acquire the lock before modifying the list
            if shared_list:
                num = shared_list.pop(0)  # Remove the first element
                print(f"Removed {num}, list now: {shared_list}")
            else:
                print("List is empty, nothing to remove")

# Create threads for adding and removing numbers
add_thread = threading.Thread(target=add_numbers)
remove_thread = threading.Thread(target=remove_numbers)

# Start the threads
add_thread.start()
remove_thread.start()

# Wait for both threads to finish
add_thread.join()
remove_thread.join()

print("Final list:", shared_list)

List is empty, nothing to remove
Added 19, list now: [19]
Added 1, list now: [19, 1]
Removed 19, list now: [1]
Added 73, list now: [1, 73]
Removed 1, list now: [73]
Added 55, list now: [73, 55]
Removed 73, list now: [55]
Added 33, list now: [55, 33]
Removed 55, list now: [33]
Added 79, list now: [33, 79]
Removed 33, list now: [79]
Added 33, list now: [79, 33]
Removed 79, list now: [33]
Added 31, list now: [33, 31]
Added 2, list now: [33, 31, 2]
Removed 33, list now: [31, 2]
Added 91, list now: [31, 2, 91]
Removed 31, list now: [2, 91]
Removed 2, list now: [91]
Final list: [91]


### **Question 5: Describe the methods and tools available in Python for safely sharing data between threads and processes.**

## Solution :

Python provides several methods and tools to safely share data between threads and processes, helping avoid race conditions, deadlocks, and data corruption.

**Threads**: Use Lock, RLock, Semaphore, Condition, Event, and Queue for thread-safe access and coordination.

**Processes**: Use Queue, Pipe, Value, Array, Manager, and synchronization primitives like Lock and Event from multiprocessing for safe inter-process communication and data sharing.

Let me explain the tools and methods used to safely share data between threads and processes.

**For Threads**


---


**Lock (threading.Lock)**

A Lock is the simplest and most commonly used mechanism for mutual exclusion, ensuring that only one thread can access a shared resource at a time.

In [4]:
# Example

import threading

lock = threading.Lock()
shared_resource = 0

def increment():
    global shared_resource
    for _ in range(1000):
        with lock:
            shared_resource += 1

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()

print("Final value:", shared_resource)

Final value: 2000


**RLock (threading.RLock)**

An RLock (reentrant lock) is similar to a Lock, but it allows the same thread to acquire it multiple times without blocking itself. This is useful when a thread needs to access a shared resource recursively.

In [5]:
# Example

import threading

rlock = threading.RLock()
shared_resource = 0

def increment():
    global shared_resource
    for _ in range(1000):
        with rlock:
            with rlock:  # Can acquire RLock again
                shared_resource += 1

threads = [threading.Thread(target=increment) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()

print("Final value:", shared_resource)

Final value: 2000
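
A more typical reason to reach for an RLock is one locked method calling another locked method on the same object. The sketch below assumes a hypothetical `Account` class; with a plain `Lock`, `deposit_with_bonus` would block forever on the nested acquisition.

```python
import threading

class Account:
    """Toy account whose methods all guard the balance with one RLock."""

    def __init__(self, balance=0):
        self.balance = balance
        self._lock = threading.RLock()

    def deposit(self, amount):
        with self._lock:
            self.balance += amount

    def deposit_with_bonus(self, amount, bonus):
        with self._lock:          # first acquisition by this thread
            self.deposit(amount)  # re-acquires the same RLock without blocking
            self.deposit(bonus)

account = Account()
threads = [
    threading.Thread(target=account.deposit_with_bonus, args=(10, 1))
    for _ in range(4)
]
for t in threads: t.start()
for t in threads: t.join()

print("Balance:", account.balance)  # Balance: 44
```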


**Semaphore (threading.Semaphore)**

A Semaphore limits access to a shared resource to a fixed number of threads.

For instance, a semaphore with a value of 3 allows up to three threads to access the resource simultaneously.

In [6]:
# Example

import threading
import time

semaphore = threading.Semaphore(2)

def access_resource():
    with semaphore:
        print(f"{threading.current_thread().name} accessing")
        time.sleep(1)

threads = [threading.Thread(target=access_resource) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()

Thread-19 (access_resource) accessing
Thread-20 (access_resource) accessing
Thread-21 (access_resource) accessing
Thread-22 (access_resource) accessing
Thread-23 (access_resource) accessing
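
The printed output alone does not show the limit being enforced. One way to check it, sketched here with hypothetical bookkeeping variables `active` and `max_active`, is to record how many threads are inside the guarded section at the same time.

```python
import threading
import time

semaphore = threading.Semaphore(2)
state_lock = threading.Lock()  # protects the two counters below
active = 0
max_active = 0

def worker():
    global active, max_active
    with semaphore:
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.1)  # simulate using the limited resource
        with state_lock:
            active -= 1

threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()

print("Most threads inside at once:", max_active)
```

With a semaphore value of 2, `max_active` can never exceed 2, no matter how many threads are started.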


**Condition (threading.Condition)**

A Condition is used to coordinate access to shared resources, especially when one thread needs to wait for a specific condition before proceeding.

In [7]:
# Example

import threading

condition = threading.Condition()
shared_list = []

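def consumer():
    with condition:
        # Re-check the predicate in a loop: this avoids waiting forever if the
        # producer has already notified, and guards against spurious wakeups
        while not shared_list:
            condition.wait()
        print("Consumed:", shared_list.pop())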

def producer():
    with condition:
        shared_list.append(42)
        print("Produced:", shared_list[0])
        condition.notify()

t1 = threading.Thread(target=consumer)
t2 = threading.Thread(target=producer)

t1.start()
t2.start()
t1.join()
t2.join()

Produced: 42
Consumed: 42


**Event (threading.Event)**

An Event is a simple mechanism for signaling between threads.

One thread can wait for an event to be set, while another thread can set it, indicating that an action should proceed.

In [8]:
# Example

import threading
import time

event = threading.Event()

def wait_for_event():
    print("Waiting for event to be set...")
    event.wait()
    print("Event is set, proceeding!")

def set_event():
    time.sleep(1)
    print("Setting event.")
    event.set()

t1 = threading.Thread(target=wait_for_event)
t2 = threading.Thread(target=set_event)

t1.start()
t2.start()
t1.join()
t2.join()

Waiting for event to be set...
Setting event.
Event is set, proceeding!


**Queue (queue.Queue)**

A Queue provides a thread-safe way to store and share data.

It's especially useful in producer-consumer scenarios, as it handles synchronization internally.

In [9]:
from queue import Queue
import threading

queue = Queue()

def producer():
    for i in range(5):
        queue.put(i)
        print("Produced:", i)

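def consumer():
    # empty() works as a stop condition only because the producer thread
    # has already finished (t1.join() runs before t2.start())
    while not queue.empty():
        item = queue.get()
        print("Consumed:", item)
        queue.task_done()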

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)

t1.start()
t1.join()
t2.start()
t2.join()

Produced: 0
Produced: 1
Produced: 2
Produced: 3
Produced: 4
Consumed: 0
Consumed: 1
Consumed: 2
Consumed: 3
Consumed: 4
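
When the producer and consumer run at the same time, `queue.empty()` is no longer a safe stop condition, because the queue can be momentarily empty while the producer is still working. A common alternative, sketched here with a hypothetical `SENTINEL` marker, is to have the producer send a special value that tells the consumer to stop.

```python
import threading
from queue import Queue

SENTINEL = None  # marker meaning "no more items"
queue = Queue()
consumed = []

def producer():
    for i in range(5):
        queue.put(i)
    queue.put(SENTINEL)  # signal the consumer to stop

def consumer():
    while True:
        item = queue.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        consumed.append(item)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)

# Both threads run concurrently; start order does not matter
t2.start()
t1.start()
t1.join()
t2.join()

print("Consumed:", consumed)  # Consumed: [0, 1, 2, 3, 4]
```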


**For Processes**


---



**Queue (multiprocessing.Queue)**

A Queue in multiprocessing is similar to a thread Queue, but it enables process-safe communication between processes.

In [10]:
# Example

from multiprocessing import Queue, Process

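def producer(queue):
    for i in range(5):
        queue.put(i)
        print("Produced:", i)

def consumer(queue):
    # empty() is a reliable stop condition here only because the producer
    # process has already exited by the time the consumer starts
    while not queue.empty():
        item = queue.get()
        print("Consumed:", item)

if __name__ == "__main__":
    # Pass the queue explicitly so that, when run as a script, the example
    # also works with the "spawn" start method (default on Windows and macOS)
    queue = Queue()
    p1 = Process(target=producer, args=(queue,))
    p2 = Process(target=consumer, args=(queue,))

    p1.start()
    p1.join()
    p2.start()
    p2.join()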

Produced: 0
Produced: 1
Produced: 2
Produced: 3
Produced: 4
Consumed: 0
Consumed: 1
Consumed: 2
Consumed: 3
Consumed: 4


**Pipe (multiprocessing.Pipe)**

A Pipe allows for bidirectional communication between two processes. It returns a pair of connection objects, which each process can use to send or receive data.

In [11]:
# Example

from multiprocessing import Pipe, Process

def sender(conn):
    conn.send("Hello from child")
    conn.close()

def receiver(conn):
    print("Received:", conn.recv())

if __name__ == "__main__":
    # Each process gets its own end of the pipe as an argument
    parent_conn, child_conn = Pipe()

    p1 = Process(target=sender, args=(child_conn,))
    p2 = Process(target=receiver, args=(parent_conn,))

    p1.start()
    p2.start()
    p1.join()
    p2.join()

Received: Hello from child


**Value and Array (multiprocessing.Value, multiprocessing.Array)**

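Value and Array are mutable objects allocated in shared memory. By default each one comes with a lock (available through get_lock()), but compound operations such as += are not atomic and should be performed while holding that lock.

Value holds a single shared value, while Array holds a fixed-size array of values of one C type.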


In [12]:
# Example

from multiprocessing import Value, Array, Process

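def modify(num, arr):
    num.value = 42
    for i in range(len(arr)):
        arr[i] += 1

if __name__ == "__main__":
    num = Value('i', 0)  # 'i' is the typecode for a signed C int
    arr = Array('i', [1, 2, 3])

    # Shared objects are passed to the child process as arguments
    p = Process(target=modify, args=(num, arr))
    p.start()
    p.join()

    print("Number:", num.value)
    print("Array:", arr[:])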

Number: 42
Array: [2, 3, 4]


**Manager (multiprocessing.Manager)**

A Manager provides a high-level interface for sharing complex data structures (like dictionaries, lists, etc.) between processes.

In [13]:
from multiprocessing import Manager, Process

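def modify(shared_dict, shared_list):
    shared_dict['key'] = 'value'
    shared_list.append(42)

if __name__ == "__main__":
    # The context manager shuts the manager's server process down on exit
    with Manager() as manager:
        shared_dict = manager.dict()
        shared_list = manager.list()

        p = Process(target=modify, args=(shared_dict, shared_list))
        p.start()
        p.join()

        print("Shared dict:", dict(shared_dict))
        print("Shared list:", list(shared_list))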

Shared dict: {'key': 'value'}
Shared list: [42]


**Lock, RLock, Semaphore, Event, Condition (from multiprocessing)**

Similar to the threading module, multiprocessing provides these synchronization primitives specifically for inter-process synchronization.
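
As a minimal sketch of these primitives in use, the example below protects a shared `Value` with a `multiprocessing.Lock`. The helper name `add_many` and the worker count are assumptions made for illustration.

```python
from multiprocessing import Lock, Process, Value

def add_many(counter, lock, times):
    """Increment the shared counter `times` times, one locked step at a time."""
    for _ in range(times):
        with lock:  # += is a read-modify-write, so it needs the lock
            counter.value += 1

if __name__ == "__main__":
    counter = Value('i', 0)
    lock = Lock()

    workers = [
        Process(target=add_many, args=(counter, lock, 1000))
        for _ in range(4)
    ]
    for w in workers: w.start()
    for w in workers: w.join()

    print("Final value:", counter.value)  # Final value: 4000
```

Without the lock, increments from different processes could overwrite each other and the final value could end up below 4000.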

### **Question 6:  Discuss why it’s crucial to handle exceptions in concurrent programs and the techniques available for doing so.**

## Solution :

Handling exceptions in concurrent programs is crucial because unhandled exceptions in one thread or process can affect the stability and functionality of the entire program.

Here are the primary reasons for handling exceptions in concurrent programming:

**Stability and Reliability:** An unhandled exception can cause a thread or process to terminate unexpectedly, leaving resources in an inconsistent state or causing data corruption.

**Resource Management:** In concurrent programs, resources like locks, file handles, or network connections are often shared. If an exception is not handled, resources may not be properly released, leading to deadlocks, memory leaks, or other resource issues.

**Debugging and Monitoring:** Exceptions provide insight into issues within concurrent code. If exceptions are handled appropriately, developers can log them for diagnosis without disrupting the application flow.

**Graceful Shutdown**: Without handling exceptions, a program may terminate abruptly, making it difficult to clean up resources, log errors, or inform users.

**Techniques for Handling Exceptions in Concurrent Programs**

---



**Try-Except Blocks in Threads and Processes:**

Wrapping critical code sections in try-except blocks ensures that exceptions are caught at the thread or process level.

This allows logging of errors or cleanup actions before termination.

In [14]:
# Example
import threading

def task():
    try:
        # Simulate work that might raise an exception
        result = 1 / 0
    except Exception as e:
        print("Exception in thread:", e)

thread = threading.Thread(target=task)
thread.start()
thread.join()


Exception in thread: division by zero


**Using a Wrapper Function:**

To ensure that all threads or processes handle exceptions consistently, you can create a wrapper function that includes exception handling and pass it as the target function.

In [15]:
# Example

import threading

def safe_task(task, *args):
    try:
        task(*args)
    except Exception as e:
        print("Handled exception:", e)

def task():
    # This will raise an exception
    result = 1 / 0

thread = threading.Thread(target=safe_task, args=(task,))
thread.start()
thread.join()

Handled exception: division by zero
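
A related option, available from Python 3.8, is `threading.excepthook`, which is called for any exception that escapes a thread's `run()` method. The sketch below installs a hypothetical handler `record_exception` that stores the details in a list instead of printing a traceback, then restores the original hook.

```python
import threading

caught = []

def record_exception(args):
    """Called by threading for any exception that escapes Thread.run()."""
    caught.append((args.thread.name, args.exc_type.__name__, str(args.exc_value)))

original_hook = threading.excepthook
threading.excepthook = record_exception

def task():
    return 1 / 0  # raises ZeroDivisionError, which is not caught here

thread = threading.Thread(target=task, name="worker-1")
thread.start()
thread.join()

threading.excepthook = original_hook  # restore the default behaviour

print(caught)  # [('worker-1', 'ZeroDivisionError', 'division by zero')]
```

Unlike the wrapper approach, this needs no change to the target functions, but it applies to every thread in the program while the hook is installed.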


**Queue-Based Error Handling:**

For thread pools or producer-consumer setups, using a Queue to collect exceptions allows a centralized place to log and handle errors.

In [16]:
# Example

from queue import Queue
import threading

error_queue = Queue()

def task():
    try:
        result = 1 / 0
    except Exception as e:
        error_queue.put(e)

threads = [threading.Thread(target=task) for _ in range(2)]
for t in threads: t.start()
for t in threads: t.join()

while not error_queue.empty():
    print("Exception caught:", error_queue.get())

Exception caught: division by zero
Exception caught: division by zero


**Timeouts and Cancellation:**

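Setting timeouts or implementing cancellation mechanisms keeps the caller from waiting indefinitely on a "hanging" thread or process. Note that a timeout on future.result() only stops the wait; the task itself keeps running in its worker thread.

This can be particularly useful when using concurrent.futures or when you want a task to fail gracefully.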

In [17]:
# Example

import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def task():
    time.sleep(2)
    return "Completed"

with ThreadPoolExecutor() as executor:
    future = executor.submit(task)
    try:
        result = future.result(timeout=1)  # Set a timeout
    except TimeoutError:
        print("Task timed out")

Task timed out
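
`concurrent.futures` also carries exceptions back to the caller: an exception raised inside a task is stored on its future and re-raised when `result()` is called. A minimal sketch, using a hypothetical `reciprocal` task:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def reciprocal(n):
    """Return 1/n; raises ZeroDivisionError when n is 0."""
    return 1 / n

outcomes = {}
with ThreadPoolExecutor(max_workers=3) as executor:
    # Map each future back to the input that produced it
    futures = {executor.submit(reciprocal, n): n for n in [1, 0, 4]}

    for future in as_completed(futures):
        n = futures[future]
        try:
            outcomes[n] = future.result()  # re-raises the task's exception
        except ZeroDivisionError as exc:
            outcomes[n] = f"failed: {exc}"

print(outcomes)
```

One failing task does not stop the others, and each failure is handled in the main thread where logging or retry logic usually lives.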


**Using Finalizers or finally Blocks:**

Always use finally blocks or context managers to ensure resources are released, even if an exception occurs.

This is especially important for releasing locks and closing files.

In [18]:
# Example

import threading

lock = threading.Lock()

def task():
    try:
        with lock:
            result = 1 / 0  # Simulate an exception
    finally:
        print("Lock released or cleanup done")

thread = threading.Thread(target=task)
thread.start()
thread.join()

Exception in thread Thread-34 (task):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "<ipython-input-18-3d0ec0d04fe5>", line 10, in task
ZeroDivisionError: division by zero


Lock released or cleanup done


**Exception Handling in Processes:**

Exceptions in multiprocessing.Process do not propagate to the parent process.

Using a multiprocessing.Queue or Pipe to send exceptions back to the main process allows centralized error handling.

In [19]:
from multiprocessing import Process, Queue

def task(error_queue):
    try:
        result = 1 / 0  # This will raise an exception
    except Exception as e:
        error_queue.put(e)

if __name__ == "__main__":
    error_queue = Queue()
    process = Process(target=task, args=(error_queue,))
    process.start()
    process.join()

    # The child has exited, so everything it queued is now available
    while not error_queue.empty():
        print("Exception caught:", error_queue.get())

Exception caught: division by zero


### **Question 7: Create a program that uses a thread pool to calculate the factorial of numbers from 1 to 10 concurrently. Use concurrent.futures.ThreadPoolExecutor to manage the threads.**

## Solution :

In [20]:
from concurrent.futures import ThreadPoolExecutor
import math

def factorial(n):
    """
    Calculate the factorial of a given number.

    Args:
        n (int): The number to calculate the factorial for.

    Returns:
        int: The factorial of the number n.
    """
    result = math.factorial(n)
    print(f"Factorial of {n} is {result}")
    return result

def main():
    """
    Main function to calculate factorials for numbers from 1 to 10
    concurrently using a thread pool.
    """
    numbers = range(1, 11)  # Numbers from 1 to 10

    # Create a thread pool executor to manage concurrent threads
    with ThreadPoolExecutor() as executor:
        # Calculate factorials concurrently
        results = executor.map(factorial, numbers)

    # Collect and print results as a list
    results = list(results)
    print("All factorials:", results)

if __name__ == "__main__":
    main()

Factorial of 1 is 1
Factorial of 2 is 2
Factorial of 3 is 6
Factorial of 4 is 24
Factorial of 5 is 120
Factorial of 6 is 720
Factorial of 7 is 5040
Factorial of 8 is 40320
Factorial of 9 is 362880
Factorial of 10 is 3628800
All factorials: [1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800]


### **Question 8: Create a Python program that uses multiprocessing.Pool to compute the square of numbers from 1 to 10 in parallel. Measure the time taken to perform this computation using a pool of different sizes (e.g., 2, 4, 8 processes).**

## Solution :

In [21]:
import multiprocessing
import time

def square(n):
    """
    Compute the square of a number.

    Args:
        n (int): The number to be squared.

    Returns:
        int: The square of the number n.
    """
    return n * n

def main():
    numbers = range(1, 11)  # Numbers from 1 to 10

    pool_sizes = [2, 4, 8]  # Different pool sizes to test
    for pool_size in pool_sizes:
        # Create a multiprocessing Pool with the specified number of processes
        with multiprocessing.Pool(processes=pool_size) as pool:
            start_time = time.perf_counter()  # Start the timer (pool start-up is excluded)
            results = pool.map(square, numbers)  # Compute squares in parallel
            end_time = time.perf_counter()  # Stop the timer

        # Print the results and the time taken
        print(f"Pool Size: {pool_size}, Results: {results}, Time Taken: {end_time - start_time:.4f} seconds")

if __name__ == "__main__":
    main()

Pool Size: 2, Results: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100], Time Taken: 0.0121 seconds
Pool Size: 4, Results: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100], Time Taken: 0.0023 seconds
Pool Size: 8, Results: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100], Time Taken: 0.0033 seconds
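
Squaring ten small integers takes almost no CPU time, so the timings above mostly reflect the cost of sending tasks to the workers and collecting results. The effect of pool size becomes easier to see with heavier tasks. The sketch below is one way to try that, assuming a hypothetical CPU-heavy function `busy_sum` and an arbitrary workload size; actual timings depend on the machine.

```python
import time
from multiprocessing import Pool

def busy_sum(n):
    """Sum of squares below n, computed in a plain loop to keep the CPU busy."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def time_pool(pool_size, workloads):
    """Return the seconds taken to map busy_sum over workloads with one pool."""
    with Pool(processes=pool_size) as pool:
        start = time.perf_counter()
        pool.map(busy_sum, workloads)
        return time.perf_counter() - start

if __name__ == "__main__":
    workloads = [2_000_000] * 8  # eight equally heavy tasks

    for pool_size in (1, 2, 4):
        elapsed = time_pool(pool_size, workloads)
        print(f"Pool size {pool_size}: {elapsed:.2f} seconds")
```

On a multi-core machine the time should drop as the pool grows, up to roughly the number of available cores.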




---

