1. What is NumPy, and why is it widely used in Python?
-NumPy (Numerical Python) is a fundamental package for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on them efficiently.

2. How does broadcasting work in NumPy?
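-Broadcasting lets NumPy apply arithmetic to arrays of different shapes without copying data. Shapes are compared from the trailing dimension backwards; two dimensions are compatible when they are equal or one of them is 1, and a size-1 dimension is stretched virtually to match the other.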
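
As a minimal sketch of the rule (the array names `col`, `row`, and `grid` are made up for illustration), adding a column to a row produces a full grid, while two 1D arrays of different lengths are rejected:

```python
import numpy as np

# Hypothetical example arrays: a (3, 1) column and a (4,) row.
col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

# The size-1 axis of `col` and the missing leading axis of `row`
# are stretched virtually, so the result has shape (3, 4).
grid = col + row
print(grid.shape)
print(grid)

# Shapes (3,) and (4,) differ in the trailing dimension and neither is 1,
# so NumPy raises a ValueError instead of guessing.
try:
    np.array([1, 2, 3]) + row
    broadcast_failed = False
except ValueError as exc:
    broadcast_failed = True
    print("incompatible shapes:", exc)
```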

3. What is a Pandas DataFrame?
-A Pandas DataFrame is a two-dimensional, size-mutable, and heterogeneous data structure with labeled axes (rows and columns), ideal for data manipulation and analysis.

4. Explain the use of the groupby() method in Pandas.
-The groupby() method is used to split data into groups based on some criteria, apply a function to each group, and then combine the results.
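
The split-apply-combine steps can be sketched as follows; the `sales` table and its column names are invented for this example:

```python
import pandas as pd

# Hypothetical sales data, used only for illustration.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south", "north"],
    "amount": [100, 80, 150, 120, 50],
})

# Split by region, apply sum() to each group, combine into one Series.
totals = sales.groupby("region")["amount"].sum()
print(totals)

# Several aggregations can be applied to each group at once.
summary = sales.groupby("region")["amount"].agg(["mean", "max"])
print(summary)
```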

5. Why is Seaborn preferred for statistical visualizations?
-Seaborn is built on top of Matplotlib and provides a high-level interface for drawing attractive and informative statistical graphics with minimal code.

6. What are the differences between NumPy arrays and Python lists?
-NumPy arrays store elements of a single data type in a compact block of memory, support vectorized operations, and are optimized for numerical computation. Python lists can hold mixed types and grow or shrink cheaply, but they use more memory per element and need explicit loops for element-wise math.

7. What is a heatmap, and when should it be used?
-A heatmap is a data visualization technique that shows the magnitude of a phenomenon as color in two dimensions. It’s used for showing correlation matrices or frequency tables.

8. What does the term “vectorized operation” mean in NumPy?
-A vectorized operation refers to performing operations on entire arrays without using explicit loops, leading to faster execution and more concise code.
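
A small comparison, assuming an illustrative array named `values`, shows the same computation written as an explicit loop and as a vectorized expression:

```python
import numpy as np

values = np.arange(1, 6)   # [1 2 3 4 5], hypothetical input

# Explicit Python loop: one interpreted step per element.
squared_loop = np.empty_like(values)
for i in range(len(values)):
    squared_loop[i] = values[i] ** 2

# Vectorized form: the loop runs inside NumPy's compiled code.
squared_vec = values ** 2

print(squared_vec)
print(np.array_equal(squared_loop, squared_vec))
```

Both forms give the same result; the vectorized one is shorter and typically much faster on large arrays.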

9. How does Matplotlib differ from Plotly?
-Matplotlib primarily produces static, highly customizable figures suited to publication-quality charts, while Plotly produces interactive, web-based visualizations (zoom, pan, hover) out of the box.

10. What is the significance of hierarchical indexing in Pandas?
-Hierarchical indexing (MultiIndex) allows multiple levels of indexing in a single DataFrame or Series, enabling more complex data representations and operations.
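
As a rough illustration, the sketch below builds a two-level index (the `year` and `quarter` levels and the revenue figures are invented) and selects data by one level or by both:

```python
import pandas as pd

# Hypothetical two-level index: year (outer) and quarter (inner).
index = pd.MultiIndex.from_tuples(
    [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1"), ("2024", "Q2")],
    names=["year", "quarter"],
)
revenue = pd.Series([10, 12, 15, 18], index=index, name="revenue")

# Select every row under one outer-level label.
print(revenue.loc["2024"])

# Select a single value with a full (year, quarter) key.
q2_2023 = revenue.loc[("2023", "Q2")]

# Aggregate over one level, or take a cross-section of the inner level.
yearly = revenue.groupby(level="year").sum()
q1 = revenue.xs("Q1", level="quarter")
print(yearly)
print(q1)
```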

11. What is the role of Seaborn’s pairplot() function?
-pairplot() draws a grid of plots for every pair of numeric variables in a dataset: scatter plots off the diagonal and each variable's distribution on the diagonal, so relationships and distributions can be inspected together.

12. What is the purpose of the describe() function in Pandas?
-describe() returns summary statistics (count, mean, std, min, quartiles, and max) for the numeric columns of a DataFrame by default, aiding quick data exploration.
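
For instance, with a small made-up DataFrame that mixes a numeric and a text column, the default call summarizes only the numeric one:

```python
import pandas as pd

# Hypothetical data: one numeric column and one text column.
df = pd.DataFrame({
    "height": [150, 160, 170, 180],
    "name": ["a", "b", "c", "d"],
})

# By default only numeric columns are summarized.
stats = df.describe()
print(stats)
```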

13. Why is handling missing data important in Pandas?
-Handling missing data ensures the accuracy of analysis and prevents errors in operations or misleading statistical summaries.
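
A minimal sketch of the usual options (count, drop, or fill), using an invented DataFrame with one gap per column:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in each column.
df = pd.DataFrame({
    "temp": [20.0, np.nan, 24.0],
    "city": ["Oslo", "Rome", None],
})

# Count the missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill gaps, here with the column mean and a placeholder label.
filled = df.fillna({"temp": df["temp"].mean(), "city": "unknown"})
print(filled)
```

Which option is appropriate depends on the analysis; filling with the mean is only one of several possible strategies.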

14. What are the benefits of using Plotly for data visualization?
-Plotly provides interactive, web-based visualizations with minimal configuration. It supports zooming, panning, and hover tooltips, and it works well for dashboards.

15. How does NumPy handle multidimensional arrays?
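-NumPy represents multidimensional arrays with the ndarray object. Each array has a shape tuple giving its size along every axis, elements are addressed with one index per axis, and most operations accept an axis argument.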
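
To make this concrete, here is a short sketch with a hypothetical 3D array named `cube`:

```python
import numpy as np

# Hypothetical 3D array: 2 blocks, each with 3 rows and 4 columns.
cube = np.arange(24).reshape(2, 3, 4)

print(cube.ndim)    # number of axes
print(cube.shape)   # size along each axis

# One index per axis selects a single element.
last = cube[1, 2, 3]

# A slice on some axes keeps those axes in the result.
first_rows = cube[:, 0, :]

# Reductions take an axis argument; summing over axis 0 removes that axis.
block_sum = cube.sum(axis=0)
```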

16. What is the role of Bokeh in data visualization?
-Bokeh is used to create interactive visualizations for modern web browsers, focusing on creating dashboards and web apps with real-time data.

17. Explain the difference between apply() and map() in Pandas.
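-map() is a Series method for element-wise transformations and accepts a function, a dict, or another Series. apply() is more general: on a Series it calls a function on each value, and on a DataFrame it calls a function on each column (axis=0) or each row (axis=1).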
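
The difference can be sketched on a small invented table of grades and scores:

```python
import pandas as pd

# Hypothetical data, used only to contrast the two methods.
df = pd.DataFrame({
    "grade": ["a", "b", "a"],
    "math": [70, 85, 90],
    "art": [80, 75, 95],
})

# Series.map with a dict: look up each value.
points = df["grade"].map({"a": 4, "b": 3})

# Series.map with a function: transform each value.
upper = df["grade"].map(str.upper)

# DataFrame.apply with axis=1: the function receives one row at a time.
row_mean = df[["math", "art"]].apply(lambda row: row.mean(), axis=1)

# DataFrame.apply with the default axis=0: one column at a time.
col_range = df[["math", "art"]].apply(lambda col: col.max() - col.min())
```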

18. What are some advanced features of NumPy?
-Advanced features include broadcasting, memory mapping, linear algebra operations, Fourier transforms, and integration with C/C++/Fortran code.

19. How does Pandas simplify time series analysis?
-Pandas provides built-in support for datetime indexing, frequency conversion, resampling, and time zone handling, making time series manipulation easy.
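
A brief sketch of these features, assuming a made-up daily series named `readings`:

```python
import pandas as pd

# Hypothetical daily readings indexed by date.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
readings = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

# Datetime indexing: label-based date slices include both endpoints.
jan_2_to_3 = readings.loc["2024-01-02":"2024-01-03"]

# Resampling: aggregate the daily values into 2-day bins.
two_day_totals = readings.resample("2D").sum()

# Time zone handling: attach a time zone to a naive index.
utc = readings.tz_localize("UTC")

print(two_day_totals)
```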

20. What is the role of a pivot table in Pandas?
-A pivot table summarizes data by grouping and aggregating it based on specified keys, offering a flexible way to rearrange data.
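
As an illustration, the sketch below pivots an invented `orders` table so that regions become rows, products become columns, and units are summed in each cell:

```python
import pandas as pd

# Hypothetical order data in long format.
orders = pd.DataFrame({
    "region": ["north", "north", "south", "south", "north"],
    "product": ["pen", "ink", "pen", "ink", "pen"],
    "units": [5, 2, 3, 4, 1],
})

# Rows are regions, columns are products, cells hold summed units.
table = pd.pivot_table(
    orders, index="region", columns="product", values="units", aggfunc="sum"
)
print(table)
```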

21. Why is NumPy’s array slicing faster than Python’s list slicing?
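-Basic slicing of a NumPy array returns a view onto the same memory, so no elements are copied, whereas slicing a Python list builds a new list. NumPy arrays also store their data in a typed memory buffer processed by compiled C code, which keeps operations on the slice fast.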
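
One concrete difference worth seeing is that a basic NumPy slice is a view onto the original data, while a list slice is a new list. A minimal sketch (variable names are illustrative):

```python
import numpy as np

arr = np.arange(10)
view = arr[2:5]          # basic slice: a view, no elements copied
view[0] = 99             # writes through to the original array
print(arr[2])
print(np.shares_memory(arr, view))

lst = list(range(10))
piece = lst[2:5]         # list slice: a new list is built
piece[0] = 99            # the original list is unaffected
print(lst[2])
```

Use `arr[2:5].copy()` when an independent array is needed.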

22. What are some common use cases for Seaborn?
-Seaborn is often used for visualizing statistical relationships, correlation matrices, distribution plots, categorical data comparisons, and regression plots.

1. How do you create a 2D NumPy array and calculate the sum of each row?
Use np.array() to create the 2D array, then np.sum(arr, axis=1) (or arr.sum(axis=1)) to sum along each row.

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
row_sums = np.sum(arr, axis=1)
print(row_sums)

2. Write a Pandas script to find the mean of a specific column in a DataFrame.
Use the mean() function on the desired column.

import pandas as pd

df = pd.DataFrame({'scores': [85, 90, 78, 92]})
mean_score = df['scores'].mean()
print(mean_score)

3. Create a scatter plot using Matplotlib.
Use plt.scatter() to plot data points.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

plt.scatter(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
plt.show()

4. How do you calculate the correlation matrix using Seaborn and visualize it with a heatmap?
Seaborn does not compute correlations itself: use the pandas method df.corr() to compute the matrix, then pass it to sns.heatmap() to visualize it.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [5, 6, 7],
    'C': [8, 9, 10]
})

# pandas computes the pairwise correlations; Seaborn only draws them
corr = df.corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()

5. Generate a bar plot using Plotly.
Use plotly.express.bar() for a quick bar chart.

import pandas as pd
import plotly.express as px

data = {'Fruits': ['Apple', 'Banana', 'Orange'], 'Quantity': [10, 15, 7]}
df = pd.DataFrame(data)

fig = px.bar(df, x='Fruits', y='Quantity', title='Fruit Sales')
fig.show()

6. Create a DataFrame and add a new column based on an existing column.
You can create a new column by applying a function or operation to an existing one.

import pandas as pd

df = pd.DataFrame({'Price': [100, 200, 300]})
df['Discounted'] = df['Price'] * 0.9
print(df)

7. Write a program to perform element-wise multiplication of two NumPy arrays.
Use the * operator for element-wise multiplication.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
result = a * b
print(result)

8. Create a line plot with multiple lines using Matplotlib.
Use plt.plot() multiple times for different lines.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y1 = [1, 4, 9, 16]
y2 = [2, 5, 10, 17]

plt.plot(x, y1, label='Line 1')
plt.plot(x, y2, label='Line 2')
plt.legend()
plt.title('Multiple Lines')
plt.show()

9. Generate a Pandas DataFrame and filter rows where a column value is greater than a threshold.
Use conditional filtering on a column.

import pandas as pd

df = pd.DataFrame({'Age': [22, 30, 17, 25]})
filtered_df = df[df['Age'] > 20]
print(filtered_df)

10. Create a histogram using Seaborn to visualize a distribution.
Use sns.histplot() for histogram visualization.

import matplotlib.pyplot as plt
import seaborn as sns

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]
# kde=True overlays a kernel density estimate on the histogram
sns.histplot(data, bins=5, kde=True)
plt.show()

11. Perform matrix multiplication using NumPy.
Use np.dot() or the @ operator for matrix multiplication.

import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
result = np.dot(a, b)
print(result)

12. Use Pandas to load a CSV file and display its first 5 rows.
Use pd.read_csv() and head().

import pandas as pd

# 'data.csv' is a placeholder path; point it at your own file
df = pd.read_csv('data.csv')
print(df.head())

13. Create a 3D scatter plot using Plotly.
Use plotly.express.scatter_3d() for 3D plots.

import plotly.express as px
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 3],
    'y': [4, 5, 6],
    'z': [7, 8, 9]
})

fig = px.scatter_3d(df, x='x', y='y', z='z', title='3D Scatter Plot')
fig.show()