1. What is NumPy, and why is it widely used in Python?
   NumPy is a Python library for numerical computing. It provides a multi-dimensional array type (`ndarray`) and a collection of mathematical functions that operate on it. It is widely used because its array operations run in compiled code, its API is concise, and libraries such as Pandas and SciPy build on it.

2. How does broadcasting work in NumPy?
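   Broadcasting lets NumPy perform arithmetic on arrays of different but compatible shapes. Shapes are compared from the trailing dimension backward, and two dimensions are compatible when they are equal or one of them is 1. The size-1 (or missing) dimension is virtually stretched to match the other, without copying data.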
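
As a minimal sketch of the rule described above (the array names and values are made up for illustration), a row vector and a column vector can each be added to a 2×3 matrix:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)   -> treated as (1, 3)
col = np.array([[100], [200]])      # shape (2, 1)

row_result = matrix + row   # row is virtually repeated for each of the 2 rows
col_result = matrix + col   # col is virtually repeated across the 3 columns

print(row_result)
print(col_result)

# np.broadcast_to exposes the stretched array as a read-only view.
# A stride of 0 along the first axis shows that no data was duplicated.
stretched = np.broadcast_to(row, matrix.shape)
print(stretched.shape, stretched.strides[0])
```

Shapes that are not compatible, such as `(2, 3)` and `(2,)`, raise a `ValueError` instead of being stretched.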

3. What is a Pandas DataFrame?
   A Pandas DataFrame is a 2-dimensional, labeled data structure with columns that can hold different data types, similar to a table in SQL or Excel.

4. Explain the use of the `groupby()` method in Pandas.
   The `groupby()` method in Pandas is used to split data into groups based on some criteria, perform operations (like sum, mean, etc.) on each group, and combine the results.
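
The split, apply, and combine steps can be sketched on a small, made-up sales table (the column names `region` and `amount` are assumptions for this example):

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["North", "South", "North", "South", "East"],
    "amount": [100, 150, 200, 50, 300],
})

# Split by region, sum each group's amounts, combine into one Series
totals = sales.groupby("region")["amount"].sum()
print(totals)

# Several aggregations at once produce a DataFrame with one column per statistic
summary = sales.groupby("region")["amount"].agg(["sum", "mean", "count"])
print(summary)
```

The result is indexed by the group keys, so `totals["North"]` looks up a single group's total.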

5. Why is Seaborn preferred for statistical visualizations?
   Seaborn is built on top of Matplotlib and provides high-level functions for statistical graphics. It works directly with Pandas DataFrames and applies polished default styles, so common statistical plots take less code than the equivalent Matplotlib calls.

6. What are the differences between NumPy arrays and Python lists?
   NumPy arrays are faster, more memory-efficient, and support vectorized operations, while Python lists are more flexible but slower and don't support element-wise operations directly.
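
A short illustration of the element-wise difference, using made-up values: the same `*` operator repeats a list but scales an array.

```python
import numpy as np

py_list = [1, 2, 3]
np_array = np.array([1, 2, 3])

list_times_two = py_list * 2      # repetition: the list is concatenated with itself
array_times_two = np_array * 2    # element-wise: every value is doubled

# With a list, element-wise work needs an explicit loop or comprehension
list_scaled = [x * 2 for x in py_list]

print(list_times_two)
print(array_times_two)
print(list_scaled)
```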

7. What is a heatmap, and when should it be used?
   A heatmap is a data visualization technique that shows the magnitude of a phenomenon as color in two dimensions. It is best used to visualize correlation matrices or large-scale data patterns.

8. What does the term “vectorized operation” mean in NumPy?
   Vectorized operations allow you to apply an operation to an entire array without using loops, leading to cleaner and faster code execution.
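
To make the contrast concrete, here is a hedged sketch that converts a few illustrative Celsius readings to Fahrenheit, first with an explicit loop and then with a single vectorized expression:

```python
import numpy as np

celsius = np.array([0.0, 25.0, 100.0])

# Loop version: one Python-level iteration per element
fahrenheit_loop = np.empty_like(celsius)
for i in range(len(celsius)):
    fahrenheit_loop[i] = celsius[i] * 9 / 5 + 32

# Vectorized version: the whole array is processed in compiled code
fahrenheit_vec = celsius * 9 / 5 + 32

print(fahrenheit_loop)
print(fahrenheit_vec)
```

Both versions give the same values; the vectorized form is shorter and, on large arrays, typically much faster.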

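9. How does Matplotlib differ from Plotly?
   Matplotlib primarily produces static figures and offers fine-grained control over every plot element, which suits publication-style output. Plotly renders interactive, browser-based charts with built-in zooming, panning, and hover tooltips, and it can drive live-updating charts in web apps.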

10. What is the significance of hierarchical indexing in Pandas?
  Hierarchical indexing lets you have multiple levels (or indexes) on an axis, allowing for more sophisticated data representation and easier data selection.
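
As an illustration, the sketch below builds a two-level index from made-up year and quarter labels and selects data at each level (`revenue` and its values are hypothetical):

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1"), ("2024", "Q2")],
    names=["year", "quarter"],
)
revenue = pd.Series([10, 12, 15, 18], index=index)

# Select on the outer level: all quarters of 2024
year_2024 = revenue.loc["2024"]

# Select on the inner level: Q1 across all years
q1_all_years = revenue.xs("Q1", level="quarter")

# Select a single value with a full (year, quarter) key
single_value = revenue.loc[("2024", "Q2")]

print(year_2024)
print(q1_all_years)
print(single_value)
```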

11. What is the role of Seaborn’s `pairplot()` function?
   `pairplot()` visualizes pairwise relationships in a dataset. By default it draws a scatter plot for each pair of numeric features and a histogram of each feature on the diagonal.

12. What is the purpose of the `describe()` function in Pandas?
   The `describe()` function returns summary statistics for the numerical columns of a DataFrame by default: count, mean, standard deviation, minimum, the 25th, 50th, and 75th percentiles, and maximum.

13. Why is handling missing data important in Pandas?
  Handling missing data is crucial for maintaining data integrity, avoiding computation errors, and improving model performance during analysis or machine learning tasks.
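
A minimal sketch of the usual first steps, on a made-up table: count the missing values, then either drop incomplete rows or fill the gaps. Filling with the column mean and the placeholder `"Unknown"` are illustrative choices, not recommendations for every dataset.

```python
import numpy as np
import pandas as pd

people = pd.DataFrame({
    "age": [25, np.nan, 40],
    "city": ["Paris", "Rome", None],
})

# Count missing values per column
missing_counts = people.isna().sum()

# Option 1: drop every row that has at least one missing value
dropped = people.dropna()

# Option 2: fill gaps with a per-column replacement value
filled = people.fillna({"age": people["age"].mean(), "city": "Unknown"})

print(missing_counts)
print(dropped)
print(filled)
```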

14. What are the benefits of using Plotly for data visualization?
  Plotly allows for interactive, publication-quality graphs, supports 3D and web-based visualizations, and integrates well with Dash for building data apps.

15. How does NumPy handle multidimensional arrays?
  NumPy uses `ndarray` objects that can represent arrays of any number of dimensions, and it provides methods for reshaping, indexing, and performing operations efficiently.
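
For example, a flat range of 24 numbers can be reshaped into a 3-dimensional array and then indexed or reduced along any axis. The shape `(2, 3, 4)` is an arbitrary choice for this sketch.

```python
import numpy as np

flat = np.arange(24)             # values 0..23, shape (24,)
cube = flat.reshape(2, 3, 4)     # 2 blocks, 3 rows, 4 columns

print(cube.ndim, cube.shape)

# One index per dimension selects a single element
element = cube[1, 2, 3]          # last element of the last row of the last block

# A slice on one axis keeps that axis: column 1 of every row in block 0
column = cube[0, :, 1]

# Reductions take an axis argument: summing over axis 0 collapses the blocks
block_sum = cube.sum(axis=0)

print(element)
print(column)
print(block_sum.shape)
```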

16. What is the role of Bokeh in data visualization?
  Bokeh is a Python library for creating interactive and real-time visualizations in web browsers, ideal for dashboards and live data updates.

17. Explain the difference between `apply()` and `map()` in Pandas.
  `map()` is used for element-wise operations on Series, while `apply()` works on both Series and DataFrames and can apply functions to rows or columns.
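
The difference can be sketched on a small, made-up score table: `map()` transforms each value of one Series, while `apply()` on a DataFrame passes a whole column or row to the function. The pass mark of 70 is an assumption for the example.

```python
import pandas as pd

scores = pd.DataFrame(
    {"math": [85, 60], "physics": [70, 95]},
    index=["Alice", "Bob"],
)

# Series.map: the function receives one value at a time
grades = scores["math"].map(lambda s: "pass" if s >= 70 else "fail")

# DataFrame.apply with axis=0: the function receives each column as a Series
column_max = scores.apply(lambda col: col.max(), axis=0)

# DataFrame.apply with axis=1: the function receives each row as a Series
row_mean = scores.apply(lambda row: row.mean(), axis=1)

print(grades)
print(column_max)
print(row_mean)
```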

18. What are some advanced features of NumPy?
  Advanced features include broadcasting, masked arrays, structured arrays, linear algebra routines, Fourier transforms, and integration with C code.

19. How does Pandas simplify time series analysis?
  Pandas offers powerful features like datetime indexing, resampling, frequency conversion, and rolling statistics that make time series analysis easier and more intuitive.
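
A brief sketch of three of these features on six days of made-up values: date-based slicing, resampling to a coarser frequency, and a rolling mean.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=dates)

# Datetime indexing: slice by date strings (both endpoints are included)
first_days = ts.loc["2024-01-02":"2024-01-03"]

# Resampling: aggregate daily values into 2-day bins
two_day_totals = ts.resample("2D").sum()

# Rolling statistics: mean over a sliding 3-day window
rolling_mean = ts.rolling(window=3).mean()

print(first_days)
print(two_day_totals)
print(rolling_mean)
```

The first two entries of `rolling_mean` are `NaN` because a full 3-value window is not yet available there.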

20. What is the role of a pivot table in Pandas?
  Pivot tables summarize data by grouping and aggregating, making it easier to analyze relationships and patterns within a dataset.
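
As a small illustration with a made-up orders table (the column names are assumptions), `pivot_table()` can place one category on the rows, another on the columns, and aggregate the values in between:

```python
import pandas as pd

orders = pd.DataFrame({
    "region": ["North", "North", "South", "South", "North"],
    "product": ["A", "B", "A", "B", "A"],
    "amount": [10, 20, 30, 40, 50],
})

# Rows are regions, columns are products, cells hold the summed amount
pivot = orders.pivot_table(
    index="region",
    columns="product",
    values="amount",
    aggfunc="sum",
)

print(pivot)
```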

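21. Why is NumPy’s array slicing faster than Python’s list slicing?
   The main reason is that basic slicing of a NumPy array returns a view onto the same memory buffer, so no elements are copied, whereas slicing a Python list builds a new list and copies a reference for every element in the slice. NumPy arrays also store homogeneous values in a single memory block and run their operations in compiled C code, which speeds up any work done on the slice afterward.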
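
One part of this that is easy to check: a basic NumPy slice is a view that shares memory with the original array, while a list slice is an independent copy. The sketch below uses made-up values and writes through each slice to show the difference.

```python
import numpy as np

arr = np.arange(10)
arr_slice = arr[2:5]        # view: no elements are copied
arr_slice[0] = 99           # writes through to the original array

lst = list(range(10))
lst_slice = lst[2:5]        # new list: references are copied
lst_slice[0] = 99           # the original list is unchanged

shares = np.shares_memory(arr, arr_slice)

print(arr[2], lst[2], shares)
```

When an independent array is needed, `arr[2:5].copy()` makes the copy explicit.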

22. What are some common use cases for Seaborn?
  Common use cases include correlation heatmaps, categorical data visualization (like boxplots, violin plots), regression analysis, and pairwise feature plots.

In [None]:
# Shared imports for all of the exercises below
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

# 1. Create a 2D NumPy array and calculate the sum of each row
arr = np.array([[1, 2, 3], [4, 5, 6]])
row_sums = np.sum(arr, axis=1)
print("Row sums:", row_sums)

# 2. Find the mean of a specific column in a DataFrame
df1 = pd.DataFrame({'Math': [85, 90, 78, 92]})
mean_math = df1['Math'].mean()
print("Mean of Math column:", mean_math)

# 3. Create a scatter plot using Matplotlib
x = [1, 2, 3, 4]
y = [10, 15, 13, 17]
plt.scatter(x, y)
plt.title("Simple Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()

# 4. Correlation matrix using Seaborn with heatmap
df2 = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [4, 5, 6, 7],
    'C': [7, 6, 5, 4]
})
corr = df2.corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.title("Correlation Matrix")
plt.show()

# 5. Generate a bar plot using Plotly
df3 = pd.DataFrame({'Fruits': ['Apple', 'Banana', 'Mango'], 'Count': [10, 20, 15]})
fig1 = px.bar(df3, x='Fruits', y='Count', title='Fruit Count')
fig1.show()

# 6. Create a DataFrame and add a new column based on an existing column
df4 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Score': [85, 90]})
df4['Passed'] = df4['Score'] > 80
print(df4)

# 7. Element-wise multiplication of two NumPy arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
elementwise = arr1 * arr2
print("Element-wise multiplication:", elementwise)

# 8. Line plot with multiple lines using Matplotlib
x = [1, 2, 3, 4]
y1 = [10, 20, 25, 30]
y2 = [5, 15, 20, 25]
plt.plot(x, y1, label='Line 1')
plt.plot(x, y2, label='Line 2')
plt.title("Multiple Lines Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.legend()
plt.show()

# 9. Filter rows where a column value is greater than a threshold
df5 = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Score': [85, 60, 95]})
filtered_df = df5[df5['Score'] > 80]
print("Filtered rows:\n", filtered_df)

# 10. Create a histogram using Seaborn
data = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
sns.histplot(data, kde=True)
plt.title("Distribution Histogram")
plt.show()

# 11. Perform matrix multiplication using NumPy
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
matrix_product = A @ B  # the @ operator performs matrix multiplication (same result as np.dot for 2D arrays)
print("Matrix multiplication result:\n", matrix_product)

# 12. Load a CSV file and display first 5 rows
# Replace 'your_file.csv' with your actual file path
# df6 = pd.read_csv('your_file.csv')
# print(df6.head())
print("Skipped CSV loading for now (uncomment and provide file to use)")

# 13. Create a 3D scatter plot using Plotly
fig2 = go.Figure(data=[go.Scatter3d(
    x=[1, 2, 3],
    y=[4, 5, 6],
    z=[7, 8, 9],
    mode='markers',
    marker=dict(size=8, color='blue')
)])
fig2.update_layout(title='3D Scatter Plot')
fig2.show()
