Question 1: What is the difference between multithreading and multiprocessing?
- Difference between Multithreading and Multiprocessing

Multithreading and multiprocessing are both techniques used to perform tasks concurrently, but they differ in how they achieve this and how they manage system resources.

**Multithreading** involves executing multiple threads within a single process. Threads share the memory space and resources of their parent process, which makes communication between them easy. Threads are lightweight: creating them and switching between them costs less than doing the same with processes, largely because a thread switch does not change the address space. The trade-off is that shared data must be protected against race conditions (for example with locks), and a fatal error in one thread can bring down the entire process. Multithreading is especially useful for I/O-bound tasks, such as reading or writing files, network operations, or handling multiple user requests in a web server.
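
As a minimal sketch of the shared-memory point, the following example lets four threads update one dictionary that lives in the parent process. The names `state`, `state_lock`, and `add_many` are illustrative assumptions, not part of any library.

```python
import threading

# Shared state: every thread sees the same dictionary object
state = {"count": 0}
state_lock = threading.Lock()

def add_many(times):
    """Increment the shared counter `times` times, holding the lock for each update."""
    for _ in range(times):
        with state_lock:  # protects the read-modify-write from race conditions
            state["count"] += 1

# Start four threads that all work on the same counter
threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait for every thread to finish

print(state["count"])  # 40000
```

No queue or pipe is needed here because the threads share memory directly; the lock is the price paid for that convenience.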

On the other hand, **multiprocessing** involves executing multiple processes simultaneously, with each process having its own separate memory space. Processes are isolated from each other, so a crash in one process normally does not bring down the others. Communication between processes is more complex and requires inter-process communication (IPC) techniques like queues, pipes, or shared memory. Creating processes involves more overhead than creating threads, and context switching is slower. Multiprocessing is particularly suitable for CPU-bound tasks, such as performing heavy computations or running multiple independent programs in parallel. It allows true parallel execution on multiple CPU cores, whereas threads in standard CPython are limited by the Global Interpreter Lock (GIL), which lets only one thread execute Python bytecode at a time.

Key points:

1. Multithreading is lightweight, shares memory, and is ideal for tasks that need frequent communication.

2. Multiprocessing is heavier, has isolated memory spaces, and is ideal for computationally intensive tasks that can run independently.

3. The choice between the two depends on the nature of the task: I/O-bound tasks benefit from multithreading, while CPU-bound tasks benefit from multiprocessing.

Conclusion:
Multithreading provides concurrent execution within a single process with shared memory, while multiprocessing provides parallel execution across multiple isolated processes. In Python, the GIL makes this distinction especially important for CPU-bound work. Both have their advantages and specific use cases, making them important concepts in modern programming and operating systems.

Question 2: What are the challenges associated with memory management in Python?
- Challenges Associated with Memory Management in Python

Memory management is an important aspect of any programming language because efficient use of memory ensures programs run smoothly and do not crash due to excessive resource consumption. Python provides automatic memory management, including garbage collection, but there are still several challenges associated with it.

**Automatic Garbage Collection Limitations**

Python uses automatic memory management through reference counting and a garbage collector for cyclic references. However, reference counting alone cannot handle circular references, where two or more objects reference each other. Although Python’s garbage collector can detect and clean up such cycles, it may not always run immediately, which can lead to temporary memory usage spikes.
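
The following sketch illustrates the circular-reference case, assuming CPython. The `Node` class is a hypothetical helper defined only for this demonstration, and automatic collection is paused so the result is deterministic.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only to build a reference cycle."""
    def __init__(self):
        self.partner = None

gc.disable()  # pause automatic cycle collection for a deterministic demo

first, second = Node(), Node()
first.partner = second   # first -> second
second.partner = first   # second -> first (the cycle)
probe = weakref.ref(first)  # lets us observe the object without keeping it alive

del first, second
# Reference counts never reach zero, so the objects are still in memory
alive_after_del = probe() is not None

collected = gc.collect()  # the cyclic collector finds and frees the cycle
alive_after_gc = probe() is not None

gc.enable()
print(alive_after_del, alive_after_gc)  # True False
```

Between `del` and `gc.collect()` the two objects are unreachable but still occupy memory, which is the temporary spike described above.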

**High Memory Overhead**

Python objects use more memory than equivalent values in lower-level languages like C or C++. Each object in Python carries additional information, such as its type, reference count, and other metadata. For large-scale applications or programs that handle huge amounts of data, this overhead can lead to high memory consumption.
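
A rough way to see this overhead is to compare a list of integer objects with a packed `array` from the standard library. This is only a sketch: the exact byte counts depend on the Python version and platform, and summing `sys.getsizeof` slightly overstates the list because small integers are cached and shared.

```python
import sys
from array import array

values = list(range(1000))

# A list stores pointers; each integer is a separate object with its own header
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# array("q") stores raw 8-byte signed integers in one contiguous buffer
packed = array("q", values)
array_bytes = sys.getsizeof(packed)

print(f"list of ints: {list_bytes} bytes")
print(f"packed array: {array_bytes} bytes")
```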

**Fragmentation**

Frequent allocation and deallocation of memory can cause memory fragmentation. Fragmentation occurs when free memory is broken into many small, non-contiguous blocks, so a large request cannot be satisfied even though enough memory is free in total. The process may then hold on to more memory than it actively uses, and allocation can become slower.

**Uncontrolled Object Creation**

In Python, developers can create objects dynamically and easily, which can sometimes lead to the creation of unnecessary or duplicate objects. If objects are not deleted properly or references are maintained unintentionally, it can lead to memory leaks, where memory is consumed but not freed.

**Inefficient Handling of Large Data Structures**

Python’s dynamic typing and high-level abstractions make it convenient to work with complex data structures like lists, dictionaries, and sets. However, these data structures consume more memory than equivalent structures in languages like C. Managing large datasets in memory can therefore be challenging.
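
One common way to reduce this cost, sketched here under the assumption that the values are needed only once, is to stream them with a generator instead of building the whole list in memory.

```python
import sys

# The list materializes all 100,000 results at once
squares_list = [n * n for n in range(100_000)]

# The generator produces one value at a time and stores no results
squares_gen = (n * n for n in range(100_000))

list_size = sys.getsizeof(squares_list)  # size of the list's pointer array only
gen_size = sys.getsizeof(squares_gen)    # small, independent of the range length

total = sum(squares_gen)  # consumes the generator without building a list
print(list_size, gen_size, total)
```

The generator can be iterated only once, so this approach fits single-pass computations such as sums or filters.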

**Dependency on Garbage Collector Timing**

The cyclic garbage collector does not run on a fixed schedule. In CPython it is triggered by allocation thresholds, so the moment at which a cycle is reclaimed is hard to predict and memory can be held longer than necessary. For applications requiring real-time memory control or working with limited resources, this can be a challenge.
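
The standard `gc` module exposes this behaviour. The short sketch below, assuming CPython, inspects the current settings and forces a collection at a moment chosen by the program.

```python
import gc

# Three thresholds, one per generation, control when automatic collection is triggered
thresholds = gc.get_threshold()

# Current collection counters for the three generations
pending = gc.get_count()

# Force a full collection now instead of waiting for a threshold to be crossed
found = gc.collect()

print("thresholds:", thresholds)
print("counters:", pending)
print("unreachable objects found:", found)
```

Calling `gc.collect()` at known quiet points (for example after a large batch job) is one way to make memory release more predictable.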

Conclusion:

While Python simplifies memory management through automatic garbage collection and dynamic allocation, it presents challenges such as high memory overhead, fragmentation, circular references, and potential memory leaks. Developers must be aware of these issues and use techniques like efficient data structures, manual deletion of unnecessary objects, and memory profiling to manage memory effectively in Python applications.
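
As a small illustration of memory profiling, the sketch below uses the standard `tracemalloc` module to measure how much memory a temporary list holds and to confirm that deleting it releases that memory. The variable names are illustrative only.

```python
import tracemalloc

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

# Allocate a temporary data structure
data = [str(i) * 10 for i in range(10_000)]
during, peak = tracemalloc.get_traced_memory()

# Drop the only reference so reference counting can free it immediately
del data
after, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"before: {before} B, during: {during} B, after: {after} B, peak: {peak} B")
```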

Question 3: Write a Python program that logs an error message to a log file when a division by zero exception occurs.



In [1]:
import logging

# Configure logging
logging.basicConfig(
    filename="error_log.txt",  # Log file name
    level=logging.ERROR,       # Log only errors
    format="%(asctime)s - %(levelname)s - %(message)s",
    # Notebooks often pre-install a root handler, which makes basicConfig a no-op
    # and sends the message to stderr instead of the file; force=True (Python 3.8+)
    # replaces any existing handlers so the file handler is actually used.
    force=True,
)

def divide_numbers(a, b):
    try:
        result = a / b
        print(f"Result: {result}")
    except ZeroDivisionError as e:
        # Write the error to error_log.txt and tell the user where to look
        logging.error("Division by zero occurred: %s", e)
        print("Error: Division by zero. Check the log file for details.")

# Example usage
divide_numbers(10, 0)   # This will cause a division by zero error
divide_numbers(10, 2)   # This will work normally


Error: Division by zero. Check the log file for details.
Result: 5.0
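
A variation worth knowing, sketched here as an assumption rather than a requirement of the question, uses a dedicated logger and `logger.exception()`, which records the full traceback along with the message. The logger name `division_demo`, the function `safe_divide`, and the temporary log path are all hypothetical.

```python
import logging
import os
import tempfile

# Hypothetical log location in a fresh temporary directory
log_path = os.path.join(tempfile.mkdtemp(), "division_errors.log")

# A dedicated logger avoids depending on how the root logger is configured
logger = logging.getLogger("division_demo")
handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.ERROR)
logger.propagate = False  # keep the message out of any root handlers

def safe_divide(a, b):
    """Return a / b, or None after logging the error if b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        # exception() logs at ERROR level and appends the traceback
        logger.exception("Division by zero while dividing %s by %s", a, b)
        return None

result = safe_divide(10, 0)

# Release the file handle, then read back what was written
handler.close()
logger.removeHandler(handler)
with open(log_path, encoding="utf-8") as log_file:
    log_text = log_file.read()
print(log_text)
```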


Question 4: Write a Python program that reads from one file and writes its content to another file.

In [2]:
# File names
source_file = "input.txt"
destination_file = "output.txt"

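try:
    # Open the source file in read mode with an explicit encoding
    with open(source_file, "r", encoding="utf-8") as src:
        content = src.read()  # Read the entire content into memory

    # Open the destination file in write mode (overwrites an existing file)
    with open(destination_file, "w", encoding="utf-8") as dest:
        dest.write(content)   # Write the content to the destination file

    print(f"Content copied from {source_file} to {destination_file} successfully.")

except FileNotFoundError as e:
    # e.filename is the path that could not be found (source or destination folder)
    print(f"Error: The file {e.filename} does not exist.")
except (OSError, UnicodeError) as e:
    # Covers permission problems, disk errors, and undecodable file content
    print(f"An error occurred: {e}")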
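
For larger files, reading everything into memory is not necessary. As an alternative sketch, the standard library's `shutil.copyfile` copies the bytes directly; the helper name `copy_file` and the temporary file names below are illustrative assumptions.

```python
import shutil
import tempfile
from pathlib import Path

def copy_file(source, destination):
    """Copy source to destination; return True on success, False if copying fails."""
    try:
        shutil.copyfile(source, destination)
        return True
    except OSError as error:  # includes FileNotFoundError and PermissionError
        print(f"Could not copy {source}: {error}")
        return False

# Demonstration inside a temporary directory so no real files are touched
with tempfile.TemporaryDirectory() as folder:
    src = Path(folder) / "input.txt"
    dst = Path(folder) / "output.txt"
    src.write_text("hello\n", encoding="utf-8")

    copied = copy_file(src, dst)
    copied_text = dst.read_text(encoding="utf-8")

    # A missing source file is reported instead of raising
    missing = copy_file(Path(folder) / "missing.txt", dst)
```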


Question 5: Write a program that handles both IndexError and KeyError using a try-except block.

In [None]:
# Sample list and dictionary
my_list = [10, 20, 30]
my_dict = {"a": 1, "b": 2, "c": 3}

def safe_lookup(container, key):
    """Look up key in container, handling IndexError and KeyError separately."""
    try:
        print("Accessing element:", container[key])
    except IndexError as ie:
        print(f"IndexError occurred: {ie}")
    except KeyError as ke:
        print(f"KeyError occurred: {ke}")
    finally:
        print("Lookup completed.")

# Each lookup runs in its own try block. If both accesses shared one try block,
# the IndexError would end the block before the dictionary lookup ever ran,
# so the KeyError handler would never be exercised.
safe_lookup(my_list, 5)     # Invalid index -> IndexError
safe_lookup(my_dict, "z")   # Non-existent key -> KeyError
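
When both errors should be handled the same way, they can be caught together. `IndexError` and `KeyError` both inherit from the built-in `LookupError`, so a single handler covers lists and dictionaries alike. The helper name `lookup` is a hypothetical choice for this sketch.

```python
def lookup(container, key, default=None):
    """Return container[key], or default if the index or key does not exist."""
    try:
        return container[key]
    except LookupError:  # base class of both IndexError and KeyError
        return default

print(lookup([10, 20, 30], 5, "no such index"))  # no such index
print(lookup({"a": 1}, "z", "no such key"))      # no such key
print(lookup([10, 20, 30], 0))                   # 10
```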


Question 6: What are the differences between NumPy arrays and Python lists?
- Differences between NumPy Arrays and Python Lists

NumPy arrays and Python lists are both used to store collections of data, but they differ significantly in terms of functionality, performance, and use cases.

**Homogeneity vs Heterogeneity**

NumPy arrays are homogeneous, meaning all elements must be of the same data type (e.g., all integers or all floats).

Python lists can be heterogeneous: a single list may hold elements of different types, such as integers, strings, and other objects.

**Performance**

NumPy arrays are much faster for numerical operations because they are implemented in C and support vectorized operations, which allows performing element-wise calculations without explicit loops.

Python lists are slower for mathematical operations because they are high-level, flexible containers and require iteration through elements for computation.

**Memory Efficiency**

NumPy arrays are more memory-efficient because they store elements in contiguous memory blocks and use fixed-size data types.

Python lists use more memory because a list stores references to separately allocated Python objects, and each of those objects carries its own type and reference-count overhead.

**Functionality and Operations**

NumPy arrays support vectorized operations, broadcasting, and a large collection of mathematical, statistical, and linear algebra functions.

Python lists require explicit loops or list comprehensions to perform most operations, which makes them less convenient for large-scale numerical computations.

**Indexing and Slicing**

NumPy arrays support advanced indexing, boolean indexing, and multidimensional slicing (e.g., selecting rows/columns in a 2D array).

Python lists support only basic indexing and slicing, and do not directly support multidimensional operations.

**Dimensionality**

NumPy arrays can be multidimensional, allowing creation of matrices, tensors, and higher-dimensional arrays.

Python lists are inherently one-dimensional. Nested lists can mimic multidimensional arrays, but they are less efficient and more cumbersome to use.

**Type Conversion**

NumPy arrays automatically convert elements to a single common type if different types are provided.

Python lists preserve the type of each element, allowing mixed data types.
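
The short sketch below puts several of these differences side by side. It assumes NumPy is installed and imported as `np`; the variable names are illustrative.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]

# List: an explicit loop (here a comprehension) is needed for element-wise math
doubled_list = [p * 2 for p in prices]

# Array: the operation is vectorized and applies to every element at once
arr = np.array(prices)
doubled_arr = arr * 2

# Type conversion: mixed ints and floats are stored as a single common dtype
mixed = np.array([1, 2.5, 3])

# Boolean indexing: keep only the elements that satisfy a condition
above_15 = arr[arr > 15]

# Multidimensional slicing: take the second column of a 2x3 array
grid = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
second_column = grid[:, 1]

print(doubled_arr)    # [20. 40. 60.]
print(mixed.dtype)    # float64
print(above_15)       # [20. 30.]
print(second_column)  # [1 4]
```

Note that `prices * 2` on the plain list would repeat the list rather than double each value, which is another practical difference between the two types.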

Question 7: Explain the difference between apply() and map() in Pandas.
- Difference Between apply() and map() in Pandas

Both apply() and map() are used in Pandas to apply a function to elements of a DataFrame or Series, but they have differences in scope, flexibility, and usage.

**Scope of Operation**

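map() is primarily a Series method (a single column): it applies a function element-wise to each value in the Series. Recent pandas versions (2.1 and later) also provide DataFrame.map() for element-wise operations on a whole DataFrame, replacing the older applymap().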

apply() can be used on both a Series and a DataFrame. On a DataFrame, it can apply a function row-wise (axis=1) or column-wise (axis=0).

**Flexibility**

map() is simpler and primarily designed for element-wise transformations. It can also accept dictionaries or Series to map values.

apply() is more flexible, as it can handle more complex operations, including aggregation, custom functions, or applying functions across rows or columns.

**Return Type**

Series.map() always returns a Series with the same index as the input.

apply() can return a Series, DataFrame, or scalar, depending on the function applied and whether it’s used on a Series or DataFrame.
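
A small sketch of the difference, assuming pandas is installed and imported as `pd`. The DataFrame and its column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "grade": ["a", "b", "a"],
    "score": [90, 75, 82],
    "bonus": [5, 0, 3],
})

# map() on a Series: element-wise, here with a dictionary lookup
labels = df["grade"].map({"a": "excellent", "b": "good"})

# map() on a Series with a function
curved = df["score"].map(lambda s: s + 2)

# apply() on a DataFrame, column-wise (axis=0): one result per column
column_max = df[["score", "bonus"]].apply(lambda col: col.max())

# apply() on a DataFrame, row-wise (axis=1): one result per row
row_total = df[["score", "bonus"]].apply(lambda row: row["score"] + row["bonus"], axis=1)

print(labels.tolist())      # ['excellent', 'good', 'excellent']
print(curved.tolist())      # [92, 77, 84]
print(column_max["score"])  # 90
print(row_total.tolist())   # [95, 75, 85]
```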

Question 8: Create a histogram using Seaborn to visualize a distribution.

In [None]:
# Import necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt

data = [12, 15, 20, 21, 22, 22, 25, 25, 25, 30, 32, 35, 35, 36, 40, 42, 45, 50]

sns.histplot(data, bins=8, kde=True, color='skyblue')

# Add titles and labels
plt.title("Histogram of Sample Data")
plt.xlabel("Value")
plt.ylabel("Frequency")

# Show the plot
plt.show()


Question 9: Use Pandas to load a CSV file and display its first 5 rows.

In [None]:
import pandas as pd

# Load the CSV file
file_path = "data.csv"
df = pd.read_csv(file_path)

print(df.head())
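
If `data.csv` is not available, the same calls can be tried on in-memory text. This sketch assumes pandas is installed; the CSV content and column names are invented for illustration, and `io.StringIO` stands in for the file.

```python
import io
import pandas as pd

# Made-up CSV content standing in for a real file
csv_text = """name,score
Ada,90
Ben,75
Cara,82
Dev,68
Eli,95
Fay,71
Gus,88
"""

# read_csv accepts any file-like object as well as a path
df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default; head(n) returns the first n
preview = df.head()
print(preview)
```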

Question 10: Calculate the correlation matrix using Seaborn and visualize it with a heatmap.


In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {
    'A': [10, 20, 30, 40, 50],
    'B': [5, 15, 25, 35, 45],
    'C': [2, 4, 6, 8, 10],
    'D': [100, 90, 80, 70, 60]
}

df = pd.DataFrame(data)

# Calculate the correlation matrix
corr_matrix = df.corr()

# Display the correlation matrix
print("Correlation Matrix:\n", corr_matrix)

# Visualize the correlation matrix with a heatmap
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title("Correlation Heatmap")
plt.show()
