In [2]:
from IPython.display import display, Markdown, Code

# Section A: Theory Questions

# NumPy Questions
display(Markdown("### **Section A: Theory Questions**"))
display(Markdown("#### **NumPy**"))

# Question 1
display(Markdown("**1. Explain the differences between a Python list and a NumPy array.**"))
display(Markdown("""
**Answer:**

- **Python List:**
  - A Python list can hold elements of different data types (integers, strings, floats, etc.).
  - Lists store references to separate Python objects, so numerical work on them requires explicit loops and is comparatively slow.
  - Lists are a built-in Python type and need no import.

- **NumPy Array:**
  - A NumPy array holds elements of a single data type (its `dtype`), which makes it more compact and efficient for numerical operations.
  - Array operations are vectorized: they apply element-wise in compiled code, which is usually much faster than looping over a list.
  - NumPy provides a wide range of functions for operations on arrays, including mathematical, logical, shape manipulation, and more.
"""))

# Question 2
display(Markdown("**2. Describe how broadcasting works in NumPy. Provide an example.**"))
display(Markdown("""
**Answer:**

- **Broadcasting** lets NumPy operate on arrays of different but compatible shapes. Shapes are compared from the trailing dimension backward, and two dimensions are compatible when they are equal or one of them is 1. Dimensions of size 1 are stretched virtually to match the other operand, without copying data.

- **Example:**
"""))
display(Code("""
import numpy as np

# Array of shape (3,)
a = np.array([1, 2, 3])

# Array of shape (3, 1)
b = np.array([[10], [20], [30]])

# Both operands are broadcast to the common shape (3, 3)
result = a + b
print(result)
""", language="python"))

# Pandas Questions
display(Markdown("#### **Pandas**"))

# Question 3
display(Markdown("**3. What is a DataFrame in Pandas? How does it differ from a NumPy array?**"))
display(Markdown("""
**Answer:**

- **DataFrame:**
  - A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. It’s similar to a SQL table or a spreadsheet.

- **Differences from NumPy Array:**
  - DataFrames can hold different data types across columns, while a NumPy array must have elements of the same data type.
  - DataFrames have labeled axes (rows and columns), making it easier to manipulate and access data.
  - DataFrames come with a wide array of functions for handling missing data, grouping, and joining datasets, among others.
"""))

# Question 4
display(Markdown("**4. Explain the role of indexing in Pandas. Discuss different types of indexing supported by Pandas.**"))
display(Markdown("""
**Answer:**

- **Role of Indexing:**
  - Indexing in Pandas allows for fast and flexible data selection and manipulation.

- **Types of Indexing:**
  - **Label-based Indexing:** Using `loc[]`, you can access rows and columns by labels.
  - **Position-based Indexing:** Using `iloc[]`, you can access rows and columns by integer positions.
  - **Boolean Indexing:** Allows for selection of data based on conditions.
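  - **Mixed Indexing:** The old `.ix` indexer was removed in pandas 1.0. To mix labels and positions, translate one into the other, e.g. `df.loc[df.index[0], 'Age']` or `df.iloc[0, df.columns.get_loc('Age')]`.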
"""))

# Matplotlib Questions
display(Markdown("#### **Matplotlib**"))

# Question 5
display(Markdown("**5. What is the role of `plt.show()` in Matplotlib? Why is it necessary?**"))
display(Markdown("""
**Answer:**

- **Role of `plt.show()`:**
  - `plt.show()` displays all figures created by the preceding plotting commands. With a GUI backend it opens a window and, in a script, blocks until that window is closed.

- **Necessity:**
  - In a non-interactive script, nothing appears on screen unless `plt.show()` is called. In Jupyter notebooks with the inline backend, figures are rendered automatically at the end of the cell, so the call is optional there.
"""))

# Question 6
display(Markdown("**6. Explain the concept of subplots in Matplotlib. How can you create multiple plots in a single figure?**"))
display(Markdown("""
**Answer:**

- **Concept of Subplots:**
  - Subplots allow multiple plots to be displayed in a single figure. This is useful for comparing datasets or visualizing multiple aspects of the same data.

- **Creating Multiple Plots:**
"""))
display(Code("""
import matplotlib.pyplot as plt

# Create a figure with 2 rows and 1 column of subplots
fig, axs = plt.subplots(2, 1)

# First subplot
axs[0].plot([1, 2, 3], [4, 5, 6])

# Second subplot
axs[1].plot([1, 2, 3], [10, 20, 30])

plt.show()
""", language="python"))

# Seaborn Questions
display(Markdown("#### **Seaborn**"))

# Question 7
display(Markdown("**7. What are the advantages of using Seaborn over Matplotlib?**"))
display(Markdown("""
**Answer:**

- **Advantages of Seaborn:**
  - **Aesthetic Defaults:** Seaborn has more visually appealing default styles.
  - **Complex Plots:** It simplifies the creation of complex plots like violin plots, box plots, and heatmaps.
  - **Integration with Pandas:** Seaborn works well with Pandas DataFrames, making it easier to visualize data directly from DataFrames.
"""))

# Question 8
display(Markdown("**8. Explain the importance of color palettes in Seaborn and how they can be customized.**"))
display(Markdown("""
**Answer:**

- **Importance of Color Palettes:**
  - Color palettes in Seaborn enhance the readability and visual appeal of plots. They help in distinguishing different data groups.

- **Customization:**
"""))
display(Code("""
import seaborn as sns
import matplotlib.pyplot as plt

# Load the built-in "coolwarm" palette as a continuous colormap
custom_palette = sns.color_palette("coolwarm", as_cmap=True)

# Use it in a heatmap
sns.heatmap([[1, 2], [3, 4]], cmap=custom_palette)
plt.show()
""", language="python"))

# Section B: Practical Questions

# NumPy Practical Questions
display(Markdown("### **Section B: Practical Questions**"))
display(Markdown("#### **NumPy**"))

# Question 9
display(Markdown("**9. Create a 3x3 matrix with values ranging from 1 to 9 using NumPy.**"))
display(Code("""
import numpy as np

# Create the matrix
matrix = np.arange(1, 10).reshape(3, 3)
print(matrix)
""", language="python"))

# Question 10
display(Markdown("**10. Given a NumPy array `[1, 2, 3, 4, 5]`, calculate the mean, standard deviation, and variance.**"))
display(Code("""
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

mean = np.mean(arr)
std_dev = np.std(arr)    # population standard deviation (ddof=0 by default)
variance = np.var(arr)   # population variance (ddof=0 by default)

print("Mean:", mean)
print("Standard Deviation:", std_dev)
print("Variance:", variance)
""", language="python"))

# Question 11
display(Markdown("**11. Generate a random 5x5 matrix and replace all values greater than 0.5 with 1, and the rest with 0.**"))
display(Code("""
import numpy as np

# Generate random matrix
matrix = np.random.rand(5, 5)

# Replace values
matrix = np.where(matrix > 0.5, 1, 0)

print(matrix)
""", language="python"))

# Question 12
display(Markdown("**12. Create a function that computes the inverse of a matrix using NumPy without using the built-in `numpy.linalg.inv()` function.**"))
display(Code("""
import numpy as np

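def inverse_matrix(matrix):
    # Solve A @ X = I for X; the solution X is the inverse of A.
    # np.linalg.solve raises LinAlgError if the matrix is singular.
    return np.linalg.solve(matrix, np.identity(matrix.shape[0]))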

# Example
matrix = np.array([[1, 2], [3, 4]])
inverse = inverse_matrix(matrix)
print(inverse)
""", language="python"))

# Pandas Practical Questions
display(Markdown("#### **Pandas**"))

# Question 13
display(Markdown("**13. Create a DataFrame from a dictionary containing lists and display the first 5 rows.**"))
display(Code("""
import pandas as pd

# Create a dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Edward'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']}

# Create DataFrame
df = pd.DataFrame(data)

# Display first 5 rows
print(df.head())
""", language="python"))

# Question 14
display(Markdown("**14. Filter a DataFrame to select rows where a specific column value is greater than a threshold.**"))
display(Code("""
import pandas as pd

# Sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Edward'],
        'Age': [25, 30, 35, 40, 45]}

df = pd.DataFrame(data)

# Filter rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]

print(filtered_df)
""", language="python"))

# Question 15
display(Markdown("**15. Group the data by a column and calculate the mean for each group. Provide the code and output.**"))
display(Code("""
import pandas as pd

# Sample DataFrame
data = {'Department': ['Sales', 'Sales', 'HR', 'HR', 'IT', 'IT'],
        'Salary': [50000, 55000, 60000, 62000, 70000, 72000]}

df = pd.DataFrame(data)

# Group by Department and calculate mean Salary
grouped_df = df.groupby('Department')['Salary'].mean()

print(grouped_df)
""", language="python"))

# Matplotlib Practical Questions
display(Markdown("#### **Matplotlib**"))

# Question 16
display(Markdown("**16. Plot a line graph for x = `[1, 2, 3, 4]` and y = `[10, 20, 25, 30]`. Label the axes and add a title.**"))
display(Code("""
import matplotlib.pyplot as plt

# Data
x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

# Plotting
plt.plot(x, y)

# Labeling
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Graph Example')

# Show plot
plt.show()
""", language="python"))

# Question 17
display(Markdown("**17. Create a bar chart using Matplotlib to compare the sales of three products across four quarters. Label the bars and add a legend.**"))
display(Code("""
import matplotlib.pyplot as plt

# Data
products = ['Product A', 'Product B', 'Product C']
quarters = ['Q1', 'Q2', 'Q3', 'Q4']
sales = [[5000, 7000, 8000, 9000], [6000, 8000, 9000, 10000], [5500, 7500, 8500, 9500]]

# Plotting
x = range(len(quarters))
width = 0.2

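bars_a = plt.bar([p - width for p in x], sales[0], width, label=products[0])
bars_b = plt.bar(x, sales[1], width, label=products[1])
bars_c = plt.bar([p + width for p in x], sales[2], width, label=products[2])

# Label each bar with its value (plt.bar_label requires Matplotlib 3.4+)
for bars in (bars_a, bars_b, bars_c):
    plt.bar_label(bars)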

# Labeling
plt.xlabel('Quarters')
plt.ylabel('Sales')
plt.title('Product Sales per Quarter')
plt.xticks(x, quarters)
plt.legend()

# Show plot
plt.show()
""", language="python"))

# Seaborn Practical Questions
display(Markdown("#### **Seaborn**"))

# Question 18
display(Markdown("**18. Create a scatter plot using Seaborn with the `tips` dataset where `total_bill` is on the x-axis and `tip` is on the y-axis. Color the points based on the time of day (Lunch/Dinner).**"))
display(Code("""
import seaborn as sns
import matplotlib.pyplot as plt

# Load tips dataset
tips = sns.load_dataset('tips')

# Scatter plot
sns.scatterplot(data=tips, x='total_bill', y='tip', hue='time')

plt.title('Scatter Plot of Total Bill vs Tip by Time of Day')
plt.show()
""", language="python"))

# Question 19
display(Markdown("**19. Create a heatmap using Seaborn to display the correlation matrix of the `iris` dataset.**"))
display(Code("""
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
import pandas as pd

# Load iris dataset
iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)

# Compute correlation matrix
correlation_matrix = iris_df.corr()

# Heatmap
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')

plt.title('Correlation Matrix of Iris Dataset')
plt.show()
""", language="python"))


### **Section A: Theory Questions**

#### **NumPy**

**1. Explain the differences between a Python list and a NumPy array.**


**Answer:**

- **Python List:**
  - A Python list can hold elements of different data types (integers, strings, floats, etc.).
  - Lists store references to separate Python objects, so numerical work on them requires explicit loops and is comparatively slow.
  - Lists are a built-in Python type and need no import.

- **NumPy Array:**
  - A NumPy array holds elements of a single data type (its `dtype`), which makes it more compact and efficient for numerical operations.
  - Array operations are vectorized: they apply element-wise in compiled code, which is usually much faster than looping over a list.
  - NumPy provides a wide range of functions for operations on arrays, including mathematical, logical, shape manipulation, and more.
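
The contrast can be seen in a minimal sketch like the one below; the variable names are illustrative only.

```python
import numpy as np

# A list may mix types, and '+' concatenates rather than adds.
py_list = [1, "two", 3.0]
doubled_list = [1, 2, 3] + [1, 2, 3]

# An array has a single dtype, and '+' adds element-wise.
arr = np.array([1, 2, 3])
doubled_arr = arr + arr

# Mixing ints and floats upcasts everything to one common dtype.
mixed = np.array([1, 2.5])

print(doubled_list)   # [1, 2, 3, 1, 2, 3]
print(doubled_arr)    # [2 4 6]
print(mixed.dtype)    # float64
```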


**2. Describe how broadcasting works in NumPy. Provide an example.**


**Answer:**

- **Broadcasting** lets NumPy operate on arrays of different but compatible shapes. Shapes are compared from the trailing dimension backward, and two dimensions are compatible when they are equal or one of them is 1. Dimensions of size 1 are stretched virtually to match the other operand, without copying data.

- **Example:**
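
A minimal sketch of broadcasting, using the same two arrays as the notebook cell above. Note that both operands are stretched here, not only the smaller one.

```python
import numpy as np

a = np.array([1, 2, 3])            # shape (3,)
b = np.array([[10], [20], [30]])   # shape (3, 1)

# a is treated as shape (1, 3); both operands are then
# stretched virtually to the common shape (3, 3).
result = a + b

print(result.shape)  # (3, 3)
print(result)
# [[11 12 13]
#  [21 22 23]
#  [31 32 33]]
```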


#### **Pandas**

**3. What is a DataFrame in Pandas? How does it differ from a NumPy array?**


**Answer:**

- **DataFrame:**
  - A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. It’s similar to a SQL table or a spreadsheet.

- **Differences from NumPy Array:**
  - DataFrames can hold different data types across columns, while a NumPy array must have elements of the same data type.
  - DataFrames have labeled axes (rows and columns), making it easier to manipulate and access data.
  - DataFrames come with a wide array of functions for handling missing data, grouping, and joining datasets, among others.
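
As a rough illustration of the per-column typing, the sketch below builds a small DataFrame (sample values are made up) and converts it to a NumPy array, which forces one common dtype.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# Each column keeps its own dtype.
print(df.dtypes)

# Converting to NumPy forces a single dtype for the whole array;
# mixing strings and integers falls back to generic Python objects.
arr = df.to_numpy()
print(arr.dtype)  # object

# Labeled access by row label and column name.
print(df.loc[0, "Name"])  # Alice
```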


**4. Explain the role of indexing in Pandas. Discuss different types of indexing supported by Pandas.**


**Answer:**

- **Role of Indexing:**
  - Indexing in Pandas allows for fast and flexible data selection and manipulation.

- **Types of Indexing:**
  - **Label-based Indexing:** Using `loc[]`, you can access rows and columns by labels.
  - **Position-based Indexing:** Using `iloc[]`, you can access rows and columns by integer positions.
  - **Boolean Indexing:** Allows for selection of data based on conditions.
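  - **Mixed Indexing:** The old `.ix` indexer was removed in pandas 1.0. To mix labels and positions, translate one into the other, e.g. `df.loc[df.index[0], 'Age']` or `df.iloc[0, df.columns.get_loc('Age')]`.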
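
The indexing styles listed above might look like this in practice. The DataFrame and its labels are hypothetical, and the last line is one possible way to combine a positional row with a column label.

```python
import pandas as pd

df = pd.DataFrame(
    {"Age": [25, 30, 35]},
    index=["Alice", "Bob", "Charlie"],
)

by_label = df.loc["Bob", "Age"]     # label-based: row "Bob", column "Age"
by_position = df.iloc[2, 0]         # position-based: third row, first column
over_28 = df[df["Age"] > 28]        # boolean mask keeps rows where Age > 28

# Mixing: positional row, column given by label and translated to a position
mixed = df.iloc[0, df.columns.get_loc("Age")]

print(by_label, by_position, mixed)  # 30 35 25
print(over_28)
```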


#### **Matplotlib**

**5. What is the role of `plt.show()` in Matplotlib? Why is it necessary?**


**Answer:**

- **Role of `plt.show()`:**
  - `plt.show()` displays all figures created by the preceding plotting commands. With a GUI backend it opens a window and, in a script, blocks until that window is closed.

- **Necessity:**
  - In a non-interactive script, nothing appears on screen unless `plt.show()` is called. In Jupyter notebooks with the inline backend, figures are rendered automatically at the end of the cell, so the call is optional there.


**6. Explain the concept of subplots in Matplotlib. How can you create multiple plots in a single figure?**


**Answer:**

- **Concept of Subplots:**
  - Subplots allow multiple plots to be displayed in a single figure. This is useful for comparing datasets or visualizing multiple aspects of the same data.

- **Creating Multiple Plots:**
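
One common way to do this is `plt.subplots()`, sketched below. The `Agg` backend line is an assumption made so the sketch runs without a display; it can be dropped in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

# One figure holding a 2-row, 1-column grid of axes with a shared x-axis
fig, axs = plt.subplots(2, 1, sharex=True)

axs[0].plot([1, 2, 3], [4, 5, 6])
axs[0].set_title("First subplot")

axs[1].plot([1, 2, 3], [10, 20, 30])
axs[1].set_title("Second subplot")

fig.tight_layout()  # keep titles and tick labels from overlapping
```

In a script with a GUI backend, a final `plt.show()` would open the window; `fig.savefig(...)` writes the figure to a file instead.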


#### **Seaborn**

**7. What are the advantages of using Seaborn over Matplotlib?**


**Answer:**

- **Advantages of Seaborn:**
  - **Aesthetic Defaults:** Seaborn has more visually appealing default styles.
  - **Complex Plots:** It simplifies the creation of complex plots like violin plots, box plots, and heatmaps.
  - **Integration with Pandas:** Seaborn works well with Pandas DataFrames, making it easier to visualize data directly from DataFrames.


**8. Explain the importance of color palettes in Seaborn and how they can be customized.**


**Answer:**

- **Importance of Color Palettes:**
  - Color palettes in Seaborn enhance the readability and visual appeal of plots. They help in distinguishing different data groups.

- **Customization:**
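
A short sketch of two customization routes, assuming a reasonably recent Seaborn; the hex codes are arbitrary example colors.

```python
import seaborn as sns

# Five discrete colors sampled from the built-in "coolwarm" palette
palette = sns.color_palette("coolwarm", n_colors=5)

# A palette can also be built from explicit hex codes or color names
custom = sns.color_palette(["#1f77b4", "#ff7f0e", "#2ca02c"])

# Make the custom palette the default color cycle for later plots
sns.set_palette(custom)

print(len(palette), len(custom))  # 5 3
```

Most Seaborn plotting functions also accept a `palette=` argument, which applies a palette to a single plot without changing the global default.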


### **Section B: Practical Questions**

#### **NumPy**

**9. Create a 3x3 matrix with values ranging from 1 to 9 using NumPy.**
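
A minimal sketch, mirroring the approach in the notebook cell above:

```python
import numpy as np

# arange(1, 10) yields 1..9 (the stop value is excluded); reshape lays it out as 3x3
matrix = np.arange(1, 10).reshape(3, 3)
print(matrix)
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
```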

**10. Given a NumPy array `[1, 2, 3, 4, 5]`, calculate the mean, standard deviation, and variance.**
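
A possible solution is sketched below. NumPy's defaults give the population statistics (`ddof=0`); the sample variance with `ddof=1` is included for comparison and is an addition not asked for in the question.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

mean = np.mean(arr)
std_dev = np.std(arr)               # population standard deviation (ddof=0)
variance = np.var(arr)              # population variance (ddof=0)
sample_var = np.var(arr, ddof=1)    # sample variance, divides by n - 1

print("Mean:", mean)                # Mean: 3.0
print("Standard Deviation:", std_dev)
print("Variance:", variance)        # Variance: 2.0
print("Sample Variance:", sample_var)  # Sample Variance: 2.5
```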

**11. Generate a random 5x5 matrix and replace all values greater than 0.5 with 1, and the rest with 0.**
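
One way to do this is with `np.where`. The seeded generator is an assumption added here so the result is reproducible.

```python
import numpy as np

rng = np.random.default_rng(0)   # assumption: fixed seed for reproducibility
matrix = rng.random((5, 5))      # uniform values in [0, 1)

# 1 where the value exceeds 0.5, otherwise 0
binary = np.where(matrix > 0.5, 1, 0)
print(binary)
```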

**12. Create a function that computes the inverse of a matrix using NumPy without using the built-in `numpy.linalg.inv()` function.**
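
The notebook cell above solves `A @ X = I` with `np.linalg.solve`. As an alternative sketch that avoids `numpy.linalg` entirely, the function below uses Gauss-Jordan elimination with partial pivoting; it is illustrative and not tuned for large or ill-conditioned matrices.

```python
import numpy as np

def inverse_matrix(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    a = np.array(matrix, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("matrix must be square")
    n = a.shape[0]
    aug = np.hstack([a, np.identity(n)])  # augmented matrix [A | I]

    for col in range(n):
        # Choose the row with the largest absolute value in this column as pivot
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is singular")
        aug[[col, pivot]] = aug[[pivot, col]]   # swap rows
        aug[col] /= aug[col, col]               # scale the pivot row so the pivot is 1
        for row in range(n):
            if row != col:
                aug[row] -= aug[row, col] * aug[col]  # clear the rest of the column

    return aug[:, n:]  # the right half now holds the inverse

matrix = np.array([[1, 2], [3, 4]])
inverse = inverse_matrix(matrix)
print(inverse)
print(np.allclose(matrix @ inverse, np.identity(2)))  # True
```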

#### **Pandas**

**13. Create a DataFrame from a dictionary containing lists and display the first 5 rows.**
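
A minimal sketch using the same sample data as the notebook cell above:

```python
import pandas as pd

# Each key becomes a column; the lists must all have the same length
data = {
    "Name": ["Alice", "Bob", "Charlie", "David", "Edward"],
    "Age": [25, 30, 35, 40, 45],
    "City": ["New York", "Los Angeles", "Chicago", "Houston", "Phoenix"],
}

df = pd.DataFrame(data)

# head() returns the first 5 rows by default
first_rows = df.head()
print(first_rows)
```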

**14. Filter a DataFrame to select rows where a specific column value is greater than a threshold.**
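
A possible solution with a boolean mask; the `threshold` variable is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", "Edward"],
    "Age": [25, 30, 35, 40, 45],
})

threshold = 30  # illustrative value

# The comparison yields a boolean Series; indexing with it keeps the True rows
filtered_df = df[df["Age"] > threshold]
print(filtered_df)
```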

**15. Group the data by a column and calculate the mean for each group. Provide the code and output.**
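
A sketch using the sample salary data from the notebook cell above, with the resulting output shown as comments:

```python
import pandas as pd

df = pd.DataFrame({
    "Department": ["Sales", "Sales", "HR", "HR", "IT", "IT"],
    "Salary": [50000, 55000, 60000, 62000, 70000, 72000],
})

# Group rows by Department, then average the Salary within each group
grouped = df.groupby("Department")["Salary"].mean()
print(grouped)
# Department
# HR       61000.0
# IT       71000.0
# Sales    52500.0
# Name: Salary, dtype: float64
```

Group keys are sorted by default, which is why `HR` appears before `Sales`.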

#### **Matplotlib**

**16. Plot a line graph for x = `[1, 2, 3, 4]` and y = `[10, 20, 25, 30]`. Label the axes and add a title.**
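
A minimal sketch using Matplotlib's object-oriented interface. The `Agg` backend line is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

fig, ax = plt.subplots()
ax.plot(x, y)

ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
ax.set_title("Line Graph Example")
```

In a script with a GUI backend, finish with `plt.show()` to display the figure.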

**17. Create a bar chart using Matplotlib to compare the sales of three products across four quarters. Label the bars and add a legend.**
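
One possible grouped bar chart is sketched below with the sales figures from the notebook cell above. It assumes Matplotlib 3.4 or newer for `Axes.bar_label`, and the `Agg` backend line is an assumption for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

products = ["Product A", "Product B", "Product C"]
quarters = ["Q1", "Q2", "Q3", "Q4"]
sales = [
    [5000, 7000, 8000, 9000],
    [6000, 8000, 9000, 10000],
    [5500, 7500, 8500, 9500],
]

x = np.arange(len(quarters))  # one group of bars per quarter
width = 0.25

fig, ax = plt.subplots()
for i, (product, values) in enumerate(zip(products, sales)):
    # Shift each product's bars so the three sit side by side within a group
    bars = ax.bar(x + (i - 1) * width, values, width, label=product)
    ax.bar_label(bars, fontsize=7)  # write each bar's value above it

ax.set_xlabel("Quarters")
ax.set_ylabel("Sales")
ax.set_title("Product Sales per Quarter")
ax.set_xticks(x)
ax.set_xticklabels(quarters)
ax.legend()
```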

#### **Seaborn**

**18. Create a scatter plot using Seaborn with the `tips` dataset where `total_bill` is on the x-axis and `tip` is on the y-axis. Color the points based on the time of day (Lunch/Dinner).**

**19. Create a heatmap using Seaborn to display the correlation matrix of the `iris` dataset.**