#1. What is NumPy, and why is it widely used in Python?
Answer:
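NumPy (Numerical Python) is the core Python library for numerical computing. It provides the ndarray, an efficient N-dimensional array object, along with functions for element-wise math, linear algebra, and random number generation. It is widely used because its array operations run in compiled code, it stores large datasets compactly, and libraries such as Pandas, SciPy, and Matplotlib build on it.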

#2. How does broadcasting work in NumPy?
Answer:
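Broadcasting lets NumPy apply element-wise operations to arrays of different shapes. Shapes are compared from the trailing dimension backward; two dimensions are compatible when they are equal or one of them is 1, and a size-1 dimension is stretched virtually (without copying data) to match the other. This simplifies array arithmetic without explicit loops.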
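
As a minimal sketch of broadcasting (the array values are invented for illustration), a column of shape (3, 1) and a row of shape (4,) combine into a (3, 4) result, while shapes that cannot be aligned raise an error:

```python
import numpy as np

# Illustrative values only: a (3, 1) column and a (4,) row.
col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])        # shape (4,)

# The size-1 axis of `col` and the missing leading axis of `row`
# are stretched so both operands behave like (3, 4) arrays.
table = col + row
print(table.shape)
print(table)

# Trailing dimensions 2 and 3 are neither equal nor 1, so this fails.
try:
    np.ones((3, 2)) + np.ones(3)
except ValueError as exc:
    print("Incompatible shapes:", exc)
```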

#3. What is a Pandas DataFrame?
Answer:
A DataFrame is a 2D labeled data structure in Pandas, similar to a database table or an Excel spreadsheet. Both rows and columns carry labels, and each column can hold a different data type.

#4. Explain the use of the groupby() method in Pandas.
Answer:
The groupby() method is used to group data based on one or more keys and apply aggregation functions (like sum, mean). It's useful for summarizing large datasets.
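
A small sketch of grouping and aggregating, assuming a hypothetical `sales` table whose column names and values are made up for this example:

```python
import pandas as pd

# Hypothetical sales records, invented for illustration.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South", "East"],
    "amount": [100, 150, 200, 50, 75],
})

# Split the rows by region, then sum the amounts within each group.
totals = sales.groupby("region")["amount"].sum()
print(totals)

# Several aggregations can be computed in one pass with agg().
summary = sales.groupby("region")["amount"].agg(["sum", "mean", "count"])
print(summary)
```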

#5. Why is Seaborn preferred for statistical visualizations?
Answer:
Seaborn provides high-level functions for attractive, informative statistical graphics. It integrates well with Pandas and offers plots like boxplots, heatmaps, and violin plots.

#6. What are the differences between NumPy arrays and Python lists?
Answer:

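NumPy arrays are more memory-efficient because elements are stored as raw values in one contiguous buffer.

They support vectorized operations.

They have a single fixed element type (dtype), whereas a Python list can mix types.

They are faster for numerical computations.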
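
The differences listed above can be seen in a short sketch (values chosen only for illustration):

```python
import numpy as np

values = [1, 2, 3, 4]
arr = np.array(values)

# Multiplying a list repeats it; multiplying an array scales each element.
doubled_list = values * 2
doubled_arr = arr * 2
print(doubled_list)
print(doubled_arr)

# An array has one dtype for all elements, so mixed input is upcast.
mixed = np.array([1, 2.5, 3])
print(arr.dtype, mixed.dtype)
```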

#7. What is a heatmap, and when should it be used?
Answer:
A heatmap is a graphical representation of data where individual values are represented by color. It's useful for visualizing correlation matrices or showing concentration of values.

#8. What does the term “vectorized operation” mean in NumPy?
Answer:
A vectorized operation applies to a whole array at once rather than one element at a time in a Python loop. The looping happens inside NumPy's compiled code, which makes the code both more concise and faster.

#9. How does Matplotlib differ from Plotly?
Answer:

Matplotlib: Scriptable plotting library that mainly produces static figures (PNG, PDF, SVG).

Plotly: Interactive, browser-based plots with built-in zoom, pan, and hover tooltips.

#10. What is the significance of hierarchical indexing in Pandas?
Answer:
Hierarchical indexing (MultiIndex) allows multiple levels of indexing on rows or columns, enabling complex data representation and efficient querying.
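
A minimal sketch of a two-level index, using a hypothetical revenue series (labels and numbers are invented):

```python
import pandas as pd

# Hypothetical quarterly figures indexed by (year, quarter).
index = pd.MultiIndex.from_tuples(
    [("2023", "Q1"), ("2023", "Q2"), ("2024", "Q1"), ("2024", "Q2")],
    names=["year", "quarter"],
)
revenue = pd.Series([10, 12, 15, 18], index=index, name="revenue")

# Selecting an outer label returns every row beneath it.
print(revenue.loc["2024"])

# Supplying both levels selects a single value.
print(revenue.loc[("2023", "Q2")])

# xs() takes a cross-section on an inner level.
q1 = revenue.xs("Q1", level="quarter")
print(q1)
```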

#11. What is the role of Seaborn’s pairplot() function?
Answer:
pairplot() draws a grid of pairwise scatter plots for the numeric columns of a DataFrame, with each variable's own distribution on the diagonal. It is useful for spotting relationships and distributions across several variables at once.

#12. What is the purpose of the describe() function in Pandas?
Answer:
describe() returns summary statistics (count, mean, std, min, quartiles, max) for the numeric columns of a DataFrame by default, which makes it a quick first step in data exploration.
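
As a quick sketch, here is describe() on a small two-column DataFrame (the scores are illustrative):

```python
import pandas as pd

# Illustrative exam scores.
df = pd.DataFrame({"Math": [80, 90, 70], "Science": [88, 76, 95]})

# One row per statistic, one column per numeric input column.
stats = df.describe()
print(stats)

# Individual statistics can be read back by label.
print(stats.loc["mean", "Math"])
```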

#13. Why is handling missing data important in Pandas?
Answer:
Missing data can bias analysis or cause errors. Pandas offers methods (fillna(), dropna()) to handle missing values appropriately.
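
A minimal sketch of the two approaches, assuming a hypothetical table with one gap per column (the fill values are an arbitrary choice for this example, not a recommendation):

```python
import numpy as np
import pandas as pd

# Hypothetical readings with one missing value in each column.
df = pd.DataFrame({
    "temp": [21.0, np.nan, 23.0],
    "city": ["Oslo", "Lima", None],
})

# Count missing entries per column before deciding how to handle them.
print(df.isna().sum())

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill gaps, here with the column mean and a placeholder label.
filled = df.fillna({"temp": df["temp"].mean(), "city": "unknown"})
print(filled)
```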

#14. What are the benefits of using Plotly for data visualization?
Answer:

Interactive charts

Web-friendly

Beautiful built-in themes

Easy integration with Dash for dashboards

#15. How does NumPy handle multidimensional arrays?
Answer:
NumPy represents multidimensional arrays with the ndarray object. Attributes such as shape, ndim, and size describe an array's layout, and most operations accept an axis argument so they can run along a chosen dimension.
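
A short sketch with a 3-D array (the shape is chosen arbitrarily) shows the attributes and axis-wise reduction:

```python
import numpy as np

# 24 consecutive integers arranged as 2 blocks of 3 rows by 4 columns.
cube = np.arange(24).reshape(2, 3, 4)
print(cube.ndim, cube.shape, cube.size)

# Reducing along axis 0 collapses the two blocks into one (3, 4) array.
per_cell = cube.sum(axis=0)
print(per_cell.shape)

# Reducing along axes 1 and 2 leaves one total per block.
per_block = cube.sum(axis=(1, 2))
print(per_block)
```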

#16. What is the role of Bokeh in data visualization?
Answer:
Bokeh creates interactive visualizations for the web using Python. It supports zooming, panning, and live streaming of data.

#17. Explain the difference between apply() and map() in Pandas.
Answer:

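map() is a Series method for element-wise transformations; it accepts a function, a dict, or another Series as the mapping.

apply() works with both Series and DataFrames. On a Series it calls the function on each value; on a DataFrame it passes each column (axis=0, the default) or each row (axis=1) to the function.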
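
The distinction can be sketched on a small DataFrame (the scores and the pass mark of 75 are assumptions made for this example):

```python
import pandas as pd

# Illustrative exam scores.
df = pd.DataFrame({"Math": [80, 90, 70], "Science": [88, 76, 95]})

# Series.map: transform each value of one column independently.
grades = df["Math"].map(lambda score: "pass" if score >= 75 else "fail")

# DataFrame.apply with the default axis=0: the function sees each column.
col_range = df.apply(lambda col: col.max() - col.min())

# DataFrame.apply with axis=1: the function sees each row.
row_total = df.apply(lambda row: row["Math"] + row["Science"], axis=1)

print(grades.tolist())
print(col_range)
print(row_total.tolist())
```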

#18. What are some advanced features of NumPy?
Answer:

Broadcasting

Structured arrays

Memory mapping

Universal functions (ufuncs)

Masked arrays
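
Three of the features listed above can be sketched briefly. The data, the -1 sentinel, and the field names are hypothetical and chosen only for illustration:

```python
import numpy as np

# Universal function: np.sqrt is applied to every element in compiled code.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))
print(roots)

# Masked array: treat -1 as an invalid reading and exclude it from statistics.
readings = np.ma.masked_equal([3, -1, 5, -1, 7], -1)
print(readings.mean())

# Structured array: each record has named, typed fields.
people = np.array(
    [("Ana", 31), ("Ben", 25)],
    dtype=[("name", "U10"), ("age", "i4")],
)
print(people["age"].mean())
```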

#19. How does Pandas simplify time series analysis?
Answer:
Pandas provides powerful tools like date_range, resample, and time-aware indexing for easy manipulation and analysis of time series data.
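
A minimal sketch of these tools on six daily observations (the dates and values are invented):

```python
import pandas as pd

# Six daily observations starting on a made-up date.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

# Time-aware indexing: slice by date strings (both endpoints included).
first_three = ts["2024-01-01":"2024-01-03"]
print(first_three)

# Resample into 2-day bins and sum the values in each bin.
two_day = ts.resample("2D").sum()
print(two_day)
```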

#20. What is the role of a pivot table in Pandas?
Answer:
A pivot table summarizes data by aggregating values for each combination of category labels. It is used to reshape a dataset into a compact grid for analysis.
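
As a sketch, assuming a hypothetical `orders` table (column names and values are made up), pivot_table() can total sales per region and product:

```python
import pandas as pd

# Hypothetical order records, invented for illustration.
orders = pd.DataFrame({
    "region": ["North", "North", "South", "South", "North"],
    "product": ["A", "B", "A", "B", "A"],
    "sales": [100, 150, 200, 50, 25],
})

# Regions become rows, products become columns, and sales are summed per cell.
table = pd.pivot_table(
    orders, values="sales", index="region", columns="product", aggfunc="sum"
)
print(table)
```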

#21. Why is NumPy’s array slicing faster than Python’s list slicing?
Answer:
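Basic slicing of a NumPy array returns a view onto the same memory block, so no elements are copied regardless of slice size. Slicing a Python list builds a new list and copies a reference for every element in the slice. NumPy's contiguous storage and C-level operations also make follow-up work on the slice fast.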
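
A short sketch contrasting the two slicing behaviors (values are illustrative):

```python
import numpy as np

arr = np.arange(10)
view = arr[2:5]                     # basic slice: a view, no data copied
view[0] = 99                        # writes through to the original array
print(arr[2])
print(np.shares_memory(arr, view))

items = list(range(10))
part = items[2:5]                   # list slice: a new, independent list
part[0] = 99                        # the original list is unaffected
print(items[2])
```

Because a slice is a view, call `.copy()` on it when an independent array is needed.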

#22. What are some common use cases for Seaborn?
Answer:

Visualizing distributions (histplot, kdeplot)

Exploring relationships (scatterplot, pairplot)

Statistical comparisons (boxplot, violinplot)

Heatmaps for correlation or matrix data

In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
row_sum = arr.sum(axis=1)
print("Sum of each row:", row_sum)


In [None]:
import pandas as pd

df = pd.DataFrame({'Math': [80, 90, 70], 'Science': [88, 76, 95]})
mean_math = df['Math'].mean()
print("Mean of Math:", mean_math)


In [None]:
import matplotlib.pyplot as plt

x = [1, 2, 3]
y = [4, 1, 9]
plt.scatter(x, y)
plt.title("Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()


In [None]:
import seaborn as sns

df = pd.DataFrame({'A': [1, 2, 3], 'B': [3, 2, 1], 'C': [2, 3, 4]})
corr = df.corr()
sns.heatmap(corr, annot=True, cmap="coolwarm")
plt.title("Correlation Heatmap")
plt.show()


In [None]:
import plotly.express as px

data = {'Fruits': ['Apple', 'Banana', 'Orange'], 'Quantity': [10, 15, 7]}
df = pd.DataFrame(data)
fig = px.bar(df, x='Fruits', y='Quantity', title="Fruit Quantity")
fig.show()


In [None]:
df = pd.DataFrame({'Sales': [100, 200, 150]})
df['Tax'] = df['Sales'] * 0.1
print(df)


In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
result = a * b
print("Element-wise multiplication:", result)


In [None]:
x = [1, 2, 3]
y1 = [1, 4, 9]
y2 = [2, 3, 5]

plt.plot(x, y1, label='Line 1')
plt.plot(x, y2, label='Line 2')
plt.legend()
plt.title("Multiple Lines")
plt.show()


In [None]:
df = pd.DataFrame({'Score': [55, 82, 91, 45]})
filtered = df[df['Score'] > 60]
print(filtered)


In [None]:
import seaborn as sns

data = [10, 20, 20, 30, 40, 40, 40, 50]
sns.histplot(data, bins=5, kde=True)
plt.title("Histogram")
plt.show()


In [None]:
A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [1, 2]])
result = A @ B  # matrix product; equivalent to np.dot(A, B) for 2-D arrays
print("Matrix Multiplication:\n", result)


In [None]:
# Assuming a file named "data.csv" exists in your directory
df = pd.read_csv("data.csv")
print(df.head())


In [None]:
import plotly.express as px
import pandas as pd

df = pd.DataFrame({
    'x': [1, 2, 3],
    'y': [4, 5, 6],
    'z': [7, 8, 9]
})

fig = px.scatter_3d(df, x='x', y='y', z='z', title='3D Scatter Plot')
fig.show()
