# Part 1: Theory

### 1. What is NumPy, and why is it widely used in Python?
NumPy (Numerical Python) is a fundamental library for numerical computing in Python. It provides support for arrays, matrices, and many mathematical functions. It is widely used due to its high performance, efficient memory usage, and ability to interface with C/C++ code.

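### 2. How does broadcasting work in NumPy?
Broadcasting lets NumPy perform element-wise operations on arrays of different shapes. Shapes are compared from the trailing dimension backward; two dimensions are compatible when they are equal or when one of them is 1, and size-1 dimensions are stretched virtually, without copying data.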
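
As a small illustration of broadcasting, the sketch below adds a column-shaped array to a row-shaped array. The names `col`, `row`, and `grid` are made up for this example.

```python
import numpy as np

# Shapes are aligned from the trailing dimension; a dimension of size 1
# (or a missing leading dimension) is stretched to match the other array.
col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

grid = col + row                    # broadcast to shape (3, 4)
print(grid.shape)
print(grid)
```

Every element of `col` is paired with every element of `row`, so no explicit loop or manual tiling is needed.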

### 3. What is a Pandas DataFrame?
A DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. It is similar to a table in SQL or Excel.
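
A minimal sketch of a DataFrame whose columns hold different types. The column names and values are invented for illustration.

```python
import pandas as pd

# Each column is a labeled Series with its own dtype.
people = pd.DataFrame({
    "name": ["Ada", "Linus"],      # text
    "age": [36, 54],               # integers
    "height_m": [1.65, 1.80],      # floats
})

print(people.dtypes)   # one dtype per column
print(people.shape)    # (rows, columns)
```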

### 4. Explain the use of the groupby() method in Pandas
`groupby()` is used to split the data into groups based on some criteria, apply a function to each group independently, and then combine the results.
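
The split-apply-combine pattern can be sketched as follows. The `sales` table, its columns, and its values are hypothetical.

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "amount": [100, 150, 200, 50],
})

# Split by region, apply sum() to each group's amounts, combine into a Series.
totals = sales.groupby("region")["amount"].sum()
print(totals)
```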

### 5. Why is Seaborn preferred for statistical visualizations?
Seaborn provides a high-level interface for drawing attractive and informative statistical graphics, making it easier to visualize complex datasets.

### 6. What are the differences between NumPy arrays and Python lists?
- NumPy arrays are more efficient for numerical operations.
- Arrays are homogeneous, whereas lists can contain mixed data types.
- NumPy provides broadcasting and vectorized operations.
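
A short sketch contrasting the two types. Variable names are illustrative only.

```python
import numpy as np

values_list = [1, 2, 3]
values_arr = np.array(values_list)

# The same operator means different things for each type.
repeated = values_list * 2   # list repetition
doubled = values_arr * 2     # element-wise arithmetic

# Lists can mix types; an array picks one dtype for all elements.
mixed_list = [1, "two", 3.0]
upcast = np.array([1, 2.5])  # the integer is stored as a float

print(repeated, doubled, upcast.dtype)
```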

### 7. What is a heatmap, and when should it be used?
A heatmap is a data visualization technique that encodes the magnitude of each value in a grid as color. It suits matrix-shaped data, such as a correlation matrix, where patterns across rows and columns matter more than exact values.

### 8. What does the term “vectorized operation” mean in NumPy?
Vectorized operations are operations that are applied to entire arrays rather than individual elements, leading to faster execution.
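
To make the idea concrete, here is the same computation written as a Python-level loop and as a vectorized expression. This is a sketch with illustrative names; it shows equivalence of results, not a benchmark.

```python
import numpy as np

x = np.arange(5)

# Loop version: Python evaluates the expression once per element.
looped = np.array([v ** 2 + 1 for v in x])

# Vectorized version: one expression over the whole array,
# with the per-element work done inside NumPy's compiled code.
vectorized = x ** 2 + 1

print(np.array_equal(looped, vectorized))
```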

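### 9. How does Matplotlib differ from Plotly?
Matplotlib primarily produces static figures and gives fine-grained control over every plot element. Plotly produces interactive figures (hover, zoom, pan) rendered in the browser, which makes it better suited to web-based visualizations.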

### 10. What is the significance of hierarchical indexing in Pandas?
Hierarchical indexing allows multiple (two or more) index levels on an axis, enabling advanced data representation and operations.
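
A minimal sketch of a two-level index. The `year`/`quarter` labels and the revenue figures are invented for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1"), ("2024", "Q2")],
    names=["year", "quarter"],
)
revenue = pd.Series([10, 12, 15, 18], index=index)

# Select everything under one outer label, or a single (outer, inner) entry.
year_2024 = revenue.loc["2024"]
single = revenue.loc[("2023", "Q2")]

# Aggregate over one level of the index.
by_year = revenue.groupby(level="year").sum()
print(by_year)
```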

### 11. What is the role of Seaborn’s pairplot() function?
It visualizes pairwise relationships in a dataset: a scatter plot for each pair of numeric variables, with each variable's univariate distribution (a histogram by default) on the diagonal.

### 12. What is the purpose of the describe() function in Pandas?
It returns summary statistics for numeric columns by default: count, mean, std, min, max, and the 25th, 50th (median), and 75th percentiles.
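
A quick sketch of what `describe()` returns for a single numeric column. The `score` column is a made-up example.

```python
import pandas as pd

df = pd.DataFrame({"score": [1, 2, 3, 4, 5]})

summary = df.describe()
print(summary)

# The median is reported as the "50%" row rather than under its own name.
median = summary.loc["50%", "score"]
```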

### 13. Why is handling missing data important in Pandas?
Missing data can lead to incorrect analysis. Pandas provides tools to detect, remove, or fill missing values.
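
The detect, remove, and fill steps can be sketched on a small Series. Filling with the mean is just one possible strategy, chosen here for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

mask = s.isna()              # detect: boolean mask of missing entries
dropped = s.dropna()         # remove: keep only the non-missing values
filled = s.fillna(s.mean())  # fill: mean() skips NaN by default

print(mask.tolist(), dropped.tolist(), filled.tolist())
```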

### 14. What are the benefits of using Plotly for data visualization?
Plotly provides interactive, web-ready, and publication-quality visualizations with support for 3D and dashboards.

### 15. How does NumPy handle multidimensional arrays?
NumPy uses `ndarray` to handle n-dimensional arrays and provides efficient indexing, slicing, and operations.
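
A small sketch of indexing, slicing, and reducing a 3-dimensional `ndarray`. The name `cube` and the chosen shape are arbitrary.

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)   # 2 blocks, 3 rows, 4 columns

print(cube.ndim, cube.shape)

element = cube[1, 2, 3]        # one index per axis
column = cube[0, :, 1]         # column 1 of the first block
collapsed = cube.sum(axis=0)   # sum across the two blocks
```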

### 16. What is the role of Bokeh in data visualization?
Bokeh is a Python library for creating interactive and real-time streaming visualizations in web browsers.

### 17. Explain the difference between apply() and map() in Pandas
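- `map()` is primarily a Series method and operates element-wise. (pandas 2.1 and later also provide `DataFrame.map()`, the replacement for `applymap()`.)
- `apply()` works on both Series and DataFrames; on a DataFrame it passes each column (`axis=0`) or each row (`axis=1`) to the function.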
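
The difference can be sketched on a two-column DataFrame. The column names and lambdas are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# map(): the function receives one element of the Series at a time.
squared = df["a"].map(lambda v: v ** 2)

# apply() with the default axis=0: the function receives each column.
col_ranges = df.apply(lambda col: col.max() - col.min())

# apply() with axis=1: the function receives each row.
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)
```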

### 18. What are some advanced features of NumPy?
- Broadcasting
- Memory-mapped files
- Linear algebra and FFT
- Masked arrays
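
Two of the features listed above, masked arrays and linear algebra, can be sketched briefly. The sentinel value `-999.0` and the small system of equations are assumptions made for this example.

```python
import numpy as np

# Masked arrays: exclude a sentinel value from computations.
readings = np.array([1.0, -999.0, 3.0])
masked = np.ma.masked_equal(readings, -999.0)
valid_mean = masked.mean()   # computed over the unmasked entries only

# Linear algebra: solve the system A @ x = b.
A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(A, b)

print(valid_mean, x)
```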

### 19. How does Pandas simplify time series analysis?
Pandas provides date range generation, resampling, shifting, and time zone handling to work with time series data.
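
A minimal sketch touching each of these tools on a six-day series. The dates and values are invented.

```python
import pandas as pd

# Date range generation: six consecutive days.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

two_day = ts.resample("2D").sum()   # resampling into 2-day bins
lagged = ts.shift(1)                # shifting values forward by one step
localized = ts.tz_localize("UTC")   # attaching a time zone to the index

print(two_day)
```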

### 20. What is the role of a pivot table in Pandas?
Pivot tables summarize data by applying aggregate functions such as mean, sum, or count across one or more row and column keys.
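
A small sketch of a pivot table with one row key and one column key. The `sales` data is hypothetical.

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["north", "north", "north", "south", "south"],
    "product": ["a", "a", "b", "a", "b"],
    "amount": [1, 5, 2, 3, 4],
})

# Rows are regions, columns are products, cells hold summed amounts.
table = pd.pivot_table(
    sales, values="amount", index="region", columns="product", aggfunc="sum"
)
print(table)
```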

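### 21. Why is NumPy’s array slicing faster than Python’s list slicing?
NumPy arrays store their elements in a contiguous block of memory and are implemented in optimized C. A basic slice also returns a view onto the existing data, whereas slicing a list copies references to the selected elements into a new list.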
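
One reason slicing is cheap in NumPy is that a basic slice is a view on the original buffer rather than a copy. The sketch below demonstrates this by writing through a slice; names are illustrative.

```python
import numpy as np

arr = np.arange(5)
view = arr[1:4]      # no data copied; shares memory with arr
view[0] = 99         # the write is visible through arr as well

lst = [0, 1, 2, 3, 4]
copied = lst[1:4]    # list slicing builds a new list
copied[0] = 99       # the original list is unaffected

print(arr, lst, np.shares_memory(arr, view))
```

Because views alias the original data, call `.copy()` on a slice when an independent array is needed.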

### 22. What are some common use cases for Seaborn?
- Visualizing distribution (hist, KDE)
- Categorical plots
- Heatmaps and correlation matrices
- Pairwise comparisons

# Part 2: Practical

In [None]:
# 1 Create a 2D NumPy array and calculate the sum of each row
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
row_sums = arr.sum(axis=1)
print(row_sums)

In [None]:
# 2  Pandas script to find the mean of a specific column in a DataFrame
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
mean_val = df['B'].mean()
print(mean_val)

In [None]:
# 3  Create a scatter plot using Matplotlib
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [10, 20, 25, 30]
plt.scatter(x, y)
plt.show()

In [None]:
# 4 Calculate the correlation matrix and visualize it with a heatmap using Seaborn
import seaborn as sns
data = pd.DataFrame(np.random.rand(5, 5), columns=list('ABCDE'))
corr = data.corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')

In [None]:
# 5 Generate a bar plot using Plotly
import plotly.express as px
df = pd.DataFrame({'x': ['A', 'B', 'C'], 'y': [10, 20, 30]})
fig = px.bar(df, x='x', y='y')
fig.show()

In [None]:
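# 6 Create a DataFrame and add a new column based on an existing column
df = pd.DataFrame({'x': ['A', 'B', 'C'], 'y': [10, 20, 30]})  # same data as cell 5, recreated so this cell stands alone
df['C'] = df['y'] * 2  # new column derived from 'y'
print(df)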

In [None]:
# 7 Element-wise multiplication of two NumPy arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
product = a * b
print(product)

In [None]:
# 8 Create a line plot with multiple lines using Matplotlib
plt.plot([1, 2, 3], [1, 4, 9], label='x^2')
plt.plot([1, 2, 3], [1, 8, 27], label='x^3')
plt.legend()
plt.show()

In [None]:
# 9 Generate a Pandas DataFrame and filter rows where a column value is greater than a threshold
df = pd.DataFrame({'A': [10, 20, 30, 40]})
filtered = df[df['A'] > 25]
print(filtered)

In [None]:
# 10 Create a histogram using Seaborn to visualize a distribution
sns.histplot(data=df, x='A', bins=5, kde=True)

In [None]:
# 11 Perform matrix multiplication using NumPy
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
result = mat1 @ mat2  # equivalent to np.dot(mat1, mat2) for 2-D arrays
print(result)

In [None]:
# 12 Use Pandas to load a CSV file and display its first 5 rows
df = pd.read_csv('data.csv')  # 'data.csv' is a placeholder; point this at an existing file
print(df.head())

In [None]:
# 13 Create a 3D scatter plot using Plotly
import plotly.graph_objects as go
fig = go.Figure(data=[go.Scatter3d(
    x=[1, 2, 3], y=[4, 5, 6], z=[7, 8, 9],
    mode='markers',
    marker=dict(size=5)
)])
fig.show()