1. What is NumPy, and why is it widely used in Python?  
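NumPy is a Python library for fast numerical computing. Its core is the ndarray, an N-dimensional array type, along with functions that perform mathematical operations on whole arrays efficiently. It is widely used because other libraries such as Pandas build on it.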

2. How does broadcasting work in NumPy?  
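Broadcasting lets NumPy perform element-wise operations on arrays of different shapes. Shapes are compared from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1, and the size-1 dimension is stretched virtually (without copying data) to match the other.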
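
As a minimal sketch of broadcasting (the array names here are illustrative), a column of shape (3, 1) and a row of shape (4,) combine into a (3, 4) result:

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

# The size-1 axis of `col` and the missing leading axis of `row`
# are stretched so that both operands behave like (3, 4) arrays.
grid = col + row
print(grid.shape)   # (3, 4)
print(grid)
```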

3. What is a Pandas DataFrame?  
A Pandas DataFrame is a two-dimensional, labeled table with rows and columns, where each column can hold a different data type. It’s used to store and work with structured data easily.

4. Explain the use of the groupby() method in Pandas.  
The groupby() method splits rows into groups based on the values of one or more columns, applies an operation such as sum or mean to each group, and combines the results.
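
A small sketch of this, using a made-up `sales` table (the column names and values are assumptions for illustration):

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "amount": [100, 200, 150, 300],
})

# Group rows by region, then sum the amounts within each group.
totals = sales.groupby("region")["amount"].sum()
print(totals)
```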

5. Why is Seaborn preferred for statistical visualizations?  
Seaborn is preferred because it is built on Matplotlib, works directly with Pandas DataFrames, and produces attractive, easy-to-read statistical charts in fewer lines of code.

6. What are the differences between NumPy arrays and Python lists?  
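NumPy arrays are generally faster and use less memory than Python lists for numerical data, because they store elements of a single data type in a contiguous block, while lists can hold items of different types. Arithmetic on an array is element-wise, whereas operators like * and + repeat or concatenate a list.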
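
The difference in behavior can be sketched as follows (variable names are illustrative):

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

doubled_list = values * 2   # list repetition: [1, 2, 3, 1, 2, 3]
doubled_arr = arr * 2       # element-wise multiplication: [2 4 6]

mixed = [1, "two", 3.0]     # a list may mix types freely
print(doubled_list)
print(doubled_arr)
print(arr.dtype)            # one integer dtype for the whole array (platform-dependent width)
```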

7. What is a heatmap, and when should it be used?  
A heatmap shows data using colors in a matrix form. It’s helpful when you want to quickly see patterns, like in correlation matrices.

8. What does the term “vectorized operation” mean in NumPy?  
It means applying an operation to a whole array at once instead of writing an explicit Python loop. The looping happens inside NumPy's compiled code, which makes the code faster and shorter.
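
For example, a temperature conversion (an illustrative case, not from the original notes) can be written with a loop or as a single vectorized expression:

```python
import numpy as np

temps_c = np.array([0.0, 25.0, 100.0])

# Loop version: one Python-level operation per element.
temps_f_loop = [c * 9 / 5 + 32 for c in temps_c]

# Vectorized version: one expression over the whole array.
temps_f = temps_c * 9 / 5 + 32
print(temps_f)   # [ 32.  77. 212.]
```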

9. How does Matplotlib differ from Plotly?  
Matplotlib is mainly used for static, highly customizable plots, while Plotly produces interactive charts that render in the browser and is better suited to web-based visualizations.

10. What is the significance of hierarchical indexing in Pandas?  
Hierarchical indexing lets you use multiple levels of indexes in your data, making it easier to organize and access complex datasets.
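
A minimal sketch with a two-level index (the `year`/`quarter` labels and revenue figures are invented for illustration):

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1"), ("2024", "Q2")],
    names=["year", "quarter"],
)
revenue = pd.Series([10, 12, 15, 18], index=index)

print(revenue.loc["2024"])           # every quarter of 2024
print(revenue.loc[("2023", "Q2")])   # a single value: 12

# Select by the inner level across all years.
q1 = revenue.xs("Q1", level="quarter")
print(q1)
```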

11. What is the role of Seaborn’s pairplot() function?  
The pairplot() function shows the pairwise relationships between the numeric columns of a dataset in a single grid, with scatterplots for each pair of columns and, by default, a histogram of each column on the diagonal.

12. What is the purpose of the describe() function in Pandas?  
The describe() function gives a quick summary of numerical columns: count, mean, standard deviation, minimum, maximum, and the 25th, 50th (median), and 75th percentiles.
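
A short sketch on a single made-up column shows which statistics are reported:

```python
import pandas as pd

scores = pd.DataFrame({"score": [50, 60, 70, 80, 90]})

summary = scores.describe()
print(summary)

# The median is reported under the "50%" row label.
print(summary.loc["50%", "score"])   # 70.0
```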

13. Why is handling missing data important in Pandas?  
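It’s important because missing values can bias statistics and cause errors in later steps of an analysis. You need to detect them (for example with isna()) and then either drop them with dropna() or fill them with fillna() to get correct results.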
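
One possible way to inspect and handle missing values, sketched on a small invented DataFrame (filling with the column mean is just one choice among many):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"temp": [20.0, np.nan, 24.0], "city": ["A", "B", None]})

missing_counts = df.isna().sum()                       # missing values per column
dropped = df.dropna()                                  # keep only complete rows
filled = df.fillna({"temp": df["temp"].mean()})        # fill temp with its mean

print(missing_counts)
print(dropped)
print(filled)
```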

14. What are the benefits of using Plotly for data visualization?  
Plotly makes interactive charts that users can zoom, hover, and explore, which is useful for better understanding of data.

15. How does NumPy handle multidimensional arrays?  
NumPy uses ndarrays to handle multidimensional data efficiently, allowing operations across many axes at once.
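
As an illustration, a three-dimensional array can be reduced along any axis (the shape chosen here is arbitrary):

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)   # 2 blocks, 3 rows, 4 columns
print(cube.ndim, cube.shape)            # 3 (2, 3, 4)

sum_over_blocks = cube.sum(axis=0)      # collapses axis 0, result shape (3, 4)
mean_over_cols = cube.mean(axis=-1)     # collapses the last axis, result shape (2, 3)
print(sum_over_blocks.shape, mean_over_cols.shape)
```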

16. What is the role of Bokeh in data visualization?  
Bokeh is used for making interactive visualizations for web browsers. It’s good for building dashboards and live charts.

17. Explain the difference between apply() and map() in Pandas.  
apply() passes whole rows or columns of a DataFrame (or the values of a Series) to a function, while map() works on a single column or Series and transforms it element by element using a function or a dictionary.
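
The contrast can be sketched like this (column names and the label mapping are illustrative assumptions):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# apply(): the function receives an entire column (default) or row (axis=1).
col_ranges = df.apply(lambda col: col.max() - col.min())
row_sums = df.apply(lambda row: row["a"] + row["b"], axis=1)

# map(): each element of one Series is looked up or transformed individually.
labels = df["a"].map({1: "low", 2: "mid", 3: "high"})

print(col_ranges)
print(row_sums)
print(labels)
```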

18. What are some advanced features of NumPy?  
Some advanced features include linear algebra functions, random number generation, broadcasting, and handling masked arrays.
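
A brief sketch touching three of these features; the matrix, seed, and the -999.0 sentinel value are arbitrary choices for illustration:

```python
import numpy as np

# Linear algebra: solve the system A @ x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Random number generation with a seeded generator for reproducibility.
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=3)

# Masked arrays: the masked entry is ignored by reductions such as mean().
readings = np.ma.masked_array([1.0, 2.0, -999.0, 4.0],
                              mask=[False, False, True, False])
print(x)
print(samples)
print(readings.mean())
```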

19. How does Pandas simplify time series analysis?  
Pandas helps with time series by offering tools like date indexing, resampling, shifting, and rolling calculations.
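
These tools can be sketched on a small daily series (the dates and values are made up):

```python
import pandas as pd

idx = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

two_day = ts.resample("2D").sum()       # aggregate into 2-day bins
lagged = ts.shift(1)                    # previous day's value; first entry is NaN
rolling = ts.rolling(window=3).mean()   # 3-day moving average

print(ts.loc["2024-01-03"])             # date-based lookup: 3
print(two_day)
print(lagged)
print(rolling)
```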

20. What is the role of a pivot table in Pandas?  
A pivot table summarizes data by grouping and aggregating values, which helps in quickly analyzing and comparing data.
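
A minimal sketch, assuming an invented `sales` table with `region`, `product`, and `amount` columns:

```python
import pandas as pd

sales = pd.DataFrame({
    "region": ["East", "East", "East", "West", "West"],
    "product": ["A", "A", "B", "A", "B"],
    "amount": [100, 25, 150, 200, 50],
})

# Rows become regions, columns become products, cells hold summed amounts.
table = sales.pivot_table(index="region", columns="product",
                          values="amount", aggfunc="sum")
print(table)
```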

21. Why is NumPy’s array slicing faster than Python’s list slicing?  
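NumPy slicing is faster because a basic slice returns a view onto the same contiguous memory buffer instead of copying elements, whereas slicing a list builds a new list. The slicing itself is also handled internally in C.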
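
A basic NumPy slice is a view on the original data, while a list slice is a new list. The sketch below (names are illustrative) makes that visible by modifying each slice:

```python
import numpy as np

arr = np.arange(5)
lst = list(range(5))

arr_slice = arr[1:4]   # view: shares memory with arr
lst_slice = lst[1:4]   # new list: independent of lst

arr_slice[0] = 99
lst_slice[0] = 99

print(arr)   # [ 0 99  2  3  4]
print(lst)   # [0, 1, 2, 3, 4]
print(np.shares_memory(arr, arr_slice))   # True
```

Use arr[1:4].copy() when an independent array is needed.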

22. What are some common use cases for Seaborn?  
Seaborn is commonly used for making plots like heatmaps, distribution plots, box plots, and visualizing relationships in data.


In [None]:
# 1. Create a 2D NumPy array and calculate the sum of each row
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
row_sums = np.sum(arr, axis=1)
print(row_sums)

[ 6 15]


In [None]:
# 2. Find the mean of a specific column in a DataFrame
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
mean_b = df['B'].mean()
print(mean_b)

In [None]:
# 3. Create a scatter plot using Matplotlib
import matplotlib.pyplot as plt
x = [1, 2, 3, 4]
y = [10, 15, 13, 18]
plt.scatter(x, y)
plt.show()

In [None]:
# 4. Calculate the correlation matrix and visualize with heatmap (Seaborn)
import seaborn as sns
data = pd.DataFrame(np.random.rand(5, 3), columns=['A', 'B', 'C'])
corr = data.corr()
sns.heatmap(corr, annot=True)
plt.show()

In [None]:
# 5. Generate a bar plot using Plotly
import plotly.express as px
data = pd.DataFrame({'Fruits': ['Apple', 'Banana', 'Orange'], 'Count': [10, 20, 15]})
fig = px.bar(data, x='Fruits', y='Count')
fig.show()

In [None]:
# 6. Create DataFrame and add a new column based on existing column
df = pd.DataFrame({'Price': [100, 200, 300]})
df['Price with Tax'] = df['Price'] * 1.1
print(df)

In [None]:
# 7. Element-wise multiplication of two NumPy arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
result = a * b
print(result)

In [None]:
# 8. Create line plot with multiple lines using Matplotlib
x = [1, 2, 3, 4]
y1 = [10, 20, 25, 30]
y2 = [5, 15, 20, 25]
plt.plot(x, y1, label='Line 1')
plt.plot(x, y2, label='Line 2')
plt.legend()
plt.show()

In [None]:
# 9. Filter DataFrame rows by threshold
df = pd.DataFrame({'A': [5, 10, 15, 20]})
filtered_df = df[df['A'] > 10]
print(filtered_df)

In [None]:
# 10. Create a histogram using Seaborn
data = np.random.randn(100)
sns.histplot(data, kde=True)
plt.show()

In [None]:
# 11. Matrix multiplication using NumPy
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[5, 6], [7, 8]])
product = mat1 @ mat2  # matrix product; same result as np.dot(mat1, mat2) for 2D arrays
print(product)


In [None]:
# 12. Load CSV and display first 5 rows
df = pd.read_csv('your_file.csv')  # Replace 'your_file.csv' with actual filename
print(df.head())

In [1]:
# 13. Create a 3D scatter plot using Plotly
import plotly.graph_objects as go
fig = go.Figure(data=[go.Scatter3d(
    x=[1, 2, 3], y=[4, 5, 6], z=[7, 8, 9],
    mode='markers',
    marker=dict(size=5)
)])
fig.show()