In [None]:
"""
1.What is NumPy, and why is it widely used in Python?
ans. NumPy (Numerical Python) is a powerful library for numerical computing in Python. It provides support for large, 
multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on them efficiently.
NumPy is widely used due to its high performance, vectorized operations, and ability to integrate with other scientific computing libraries.

2.How does broadcasting work in NumPy?
ans. Broadcasting lets NumPy perform element-wise operations on arrays of different shapes without explicitly copying or reshaping data.
Shapes are compared from the trailing dimension backwards: two dimensions are compatible when they are equal or one of them is 1, and the size-1 dimension is virtually stretched to match the other, so the expansion allocates no extra memory.
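A minimal sketch of broadcasting, assuming a small made-up matrix and two hypothetical vectors (`row` and `column`) chosen only for illustration:

```python
import numpy as np

# Hypothetical example data: a (3, 3) matrix and a (3,) row vector.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
row = np.array([10, 20, 30])

# (3, 3) + (3,): the row is treated as shape (1, 3) and stretched along axis 0.
added = matrix + row

# A (3, 1) column is stretched along axis 1 instead.
column = np.array([[100], [200], [300]])
shifted = matrix + column

print(added)
print(shifted)
```

In both cases NumPy does not materialize the stretched operand; it only iterates as if the smaller array had been repeated.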

3.What is a Pandas DataFrame?
ans. A Pandas DataFrame is a two-dimensional, tabular data structure with labeled axes (rows and columns). 
It is similar to a spreadsheet or SQL table and provides powerful data manipulation and analysis capabilities.

4.Explain the use of the groupby() method in Pandas?
ans.The groupby() method is used to split a dataset into groups based on a given criterion, apply a function to each group,
and then combine the results. It is useful for aggregating, summarizing, and analyzing data.
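The split-apply-combine pattern might look like the following sketch. The `sales` DataFrame and its column names are hypothetical and exist only for this example:

```python
import pandas as pd

# Hypothetical sales data; the column names are illustrative only.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South", "East"],
    "amount": [100, 150, 200, 50, 300],
})

# Split by region, apply sum() to each group, combine into one Series.
totals = sales.groupby("region")["amount"].sum()

# Several aggregations can be computed at once with agg().
summary = sales.groupby("region")["amount"].agg(["sum", "mean", "count"])

print(totals)
print(summary)
```

By default the group keys become the index of the result, sorted in ascending order.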

5.Why is Seaborn preferred for statistical visualizations?
ans.Seaborn is preferred because it provides high-level, aesthetically pleasing statistical graphics with built-in support for data relationships,
categorical variables, and complex visualizations like violin plots and pair plots. It also integrates well with Pandas.

6.What are the differences between NumPy arrays and Python lists?
ans. NumPy arrays are faster and more memory-efficient than Python lists.
They support vectorized operations, whereas lists require loops.
Arrays must have homogeneous data types, whereas lists can store mixed data types.
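A short sketch contrasting the two containers, using made-up values. Note that `*` means repetition for a list but element-wise multiplication for an array:

```python
import numpy as np

values_list = [1, 2, 3, 4]
values_arr = np.array(values_list)

# A list needs an explicit loop (here a comprehension) to double each item.
doubled_list = [v * 2 for v in values_list]

# The array does the same in a single vectorized expression.
doubled_arr = values_arr * 2

# On a list, * repeats the list instead of multiplying its elements.
repeated_list = values_list * 2

# Mixed numeric input is coerced to one common dtype in an array.
mixed_arr = np.array([1, 2.5, 3])

print(doubled_list, doubled_arr, repeated_list, mixed_arr.dtype)
```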

7.What is a heatmap, and when should it be used?
ans. A heatmap is a graphical representation of data where individual values are represented using color gradients. 
It is commonly used to visualize correlation matrices, feature importance, or large datasets to identify patterns.

8.How does Matplotlib differ from Plotly?
ans. Matplotlib primarily produces static figures and offers detailed customization and low-level control over every plot element.
Plotly produces interactive, web-based figures with built-in zooming, panning, and hover tooltips.

9.What is the significance of hierarchical indexing in Pandas?
ans.Hierarchical indexing allows multiple levels of row or column labels, enabling the representation of multi-dimensional data within a DataFrame. 
It is useful for organizing complex datasets efficiently.
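As an illustration, a two-level row index could be built and queried as sketched below. The (year, quarter) data is hypothetical:

```python
import pandas as pd

# Hypothetical data: revenue indexed by (year, quarter).
index = pd.MultiIndex.from_tuples(
    [(2023, "Q1"), (2023, "Q2"), (2024, "Q1"), (2024, "Q2")],
    names=["year", "quarter"],
)
sales = pd.DataFrame({"revenue": [10, 12, 15, 18]}, index=index)

# Selecting on the outer level returns every row for that year.
sales_2024 = sales.loc[2024]

# A full (year, quarter) tuple selects a single value.
q2_2023 = sales.loc[(2023, "Q2"), "revenue"]

# Aggregating by one level collapses the other.
by_year = sales.groupby(level="year")["revenue"].sum()

print(sales_2024)
print(q2_2023)
print(by_year)
```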

10.What is the role of Seaborn’s pairplot() function?
ans.The pairplot() function visualizes pairwise relationships in a dataset, making it useful for exploring correlations between 
multiple numerical variables.

11.What is the purpose of the describe() function in Pandas?
ans. The describe() function returns summary statistics for the numerical columns of a DataFrame by default: count, mean, standard deviation, minimum, the 25th, 50th (median), and 75th percentiles, and maximum.
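A quick sketch of what describe() returns for a single numeric column, using made-up values:

```python
import pandas as pd

# Hypothetical single-column DataFrame.
df = pd.DataFrame({"A": [10, 20, 30, 40, 50]})

# describe() returns a DataFrame whose index holds the statistic names.
stats = df.describe()

print(stats)
print(stats.loc["50%", "A"])  # the median appears as the 50% row
```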

12.Why is handling missing data important in Pandas?
ans. Handling missing data ensures data integrity, prevents errors in analysis, and improves the reliability of insights derived from datasets.
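Typical options are to count, drop, or fill missing values. The sketch below assumes a small made-up DataFrame; filling with the column mean is only one possible strategy and may not suit every dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [4.0, 5.0, np.nan]})

# Count missing values per column.
missing_counts = df.isna().sum()

# Option 1: drop every row that contains a missing value.
dropped = df.dropna()

# Option 2: fill each gap with the mean of its column.
filled = df.fillna(df.mean())

print(missing_counts)
print(dropped)
print(filled)
```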

13.What are the benefits of using Plotly for data visualization?
ans.Plotly offers interactive plots, support for web-based dashboards, real-time updates, and integration with various data analysis tools.

14.How does NumPy handle multidimensional arrays?
ans.NumPy supports multi-dimensional arrays (ndarrays) that allow efficient storage and operations across multiple dimensions using indexing, 
slicing, and broadcasting.

15.What is the role of Bokeh in data visualization?
ans. Bokeh is a Python library for creating interactive visualizations for web applications, focusing on interactivity and real-time streaming data.

16.Explain the difference between apply() and map() in Pandas?
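ans. map() applies an element-wise transformation to a Series and accepts a function, a dict, or another Series as the mapping.
apply() on a DataFrame passes each column (axis=0, the default) or each row (axis=1) to a function; on a Series it also works element-wise.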
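The difference can be sketched on a small made-up DataFrame. The lambdas are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Series.map: transform each element of a single column.
squared = df["A"].map(lambda v: v ** 2)

# DataFrame.apply with the default axis=0: the function receives each column.
col_ranges = df.apply(lambda col: col.max() - col.min())

# DataFrame.apply with axis=1: the function receives each row.
row_sums = df.apply(lambda row: row["A"] + row["B"], axis=1)

print(squared)
print(col_ranges)
print(row_sums)
```

For simple arithmetic like this, a vectorized expression such as `df["A"] + df["B"]` is usually faster than apply() with axis=1.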

17.What are some advanced features of NumPy?
ans. Broadcasting
Fancy indexing
Universal functions (ufuncs)
Linear algebra functions
Memory-mapped arrays
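A brief sketch touching three of the features listed above (fancy indexing, ufuncs, and linear algebra), with made-up values:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Fancy indexing: select arbitrary positions with a list of indices.
picked = arr[[0, 2, 4]]

# Boolean masks are a related form of advanced indexing.
masked = arr[arr > 25]

# Universal function (ufunc): applied element-wise without a Python loop.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))

# Linear algebra: determinant and inverse of a small diagonal matrix.
A = np.array([[2.0, 0.0], [0.0, 4.0]])
det = np.linalg.det(A)
inv = np.linalg.inv(A)

print(picked, masked, roots, det)
print(inv)
```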

18.How does Pandas simplify time series analysis?
ans.Pandas provides built-in time series functionality such as resampling, shifting, rolling windows, and datetime indexing, 
making time-based data manipulation easy.
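The operations named above might be sketched as follows, assuming a hypothetical daily series of six observations:

```python
import pandas as pd

# Six hypothetical daily observations.
dates = pd.date_range("2024-01-01", periods=6, freq="D")
ts = pd.Series([1, 2, 3, 4, 5, 6], index=dates)

# Resampling: aggregate into 2-day bins.
two_day = ts.resample("2D").sum()

# Shifting: move values forward by one period (the first value becomes NaN).
lagged = ts.shift(1)

# Rolling window: 3-day moving average (the first two values are NaN).
rolling_mean = ts.rolling(window=3).mean()

# Datetime indexing: slice by date strings, inclusive on both ends.
first_three = ts["2024-01-01":"2024-01-03"]

print(two_day)
print(lagged)
print(rolling_mean)
print(first_three)
```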

19.What is the role of a pivot table in Pandas?
ans.A pivot table summarizes data by grouping and aggregating values, making it useful for analyzing trends and patterns.
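A minimal pivot table sketch, assuming a hypothetical `sales` table in which one (region, product) pair occurs twice so that the aggregation is visible:

```python
import pandas as pd

# Hypothetical sales records; (North, A) appears twice.
sales = pd.DataFrame({
    "region": ["North", "North", "North", "South", "South"],
    "product": ["A", "A", "B", "A", "B"],
    "amount": [100, 25, 200, 150, 50],
})

# Rows are regions, columns are products, cells are summed amounts.
table = pd.pivot_table(
    sales, values="amount", index="region", columns="product", aggfunc="sum"
)

print(table)
```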

20.Why is NumPy’s array slicing faster than Python’s list slicing?
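ans. A basic NumPy slice returns a view onto the same contiguous block of memory, so no elements are copied, while a Python list slice
builds a new list and copies a reference for each selected element. Slicing an array therefore takes roughly constant time regardless of the slice length.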
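One way to see the difference is to check whether a slice shares memory with its source: a basic NumPy slice is a view, whereas a list slice is a new list. A small sketch with made-up data:

```python
import numpy as np

arr = np.arange(10)
lst = list(range(10))

# A basic array slice is a view: no elements are copied.
arr_slice = arr[2:5]

# A list slice is a new list holding copied references.
lst_slice = lst[2:5]

# Writing through the view changes the original array...
arr_slice[0] = 99
# ...but writing to the list slice leaves the original list untouched.
lst_slice[0] = 99

shares = np.shares_memory(arr, arr_slice)

print(arr[2], lst[2], shares)
```

When an independent array is needed, calling `.copy()` on the slice avoids this aliasing.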

21.What are some common use cases for Seaborn?
ans.Statistical analysis
Exploratory data analysis
Correlation matrices
Distribution visualizations
Categorical data analysis
"""

In [None]:
#1. Create a 2D NumPy array and calculate the sum of each row
import numpy as np

# Creating a 2D NumPy array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Calculating the sum of each row
row_sums = np.sum(arr, axis=1)

print(row_sums)  # Output: [ 6 15 24]


In [None]:
#2. Write a Pandas script to find the mean of a specific column in a DataFrame.

import pandas as pd

# Creating a DataFrame
data = {'A': [10, 20, 30, 40, 50], 'B': [5, 15, 25, 35, 45]}
df = pd.DataFrame(data)

# Finding the mean of column 'A'
mean_value = df['A'].mean()

print(mean_value)  # Output: 30.0


In [None]:
#3. Create a scatter plot using Matplotlib

import matplotlib.pyplot as plt
import numpy as np

# Generating data
x = np.random.rand(50)
y = np.random.rand(50)

# Creating scatter plot
plt.scatter(x, y, color='blue', alpha=0.6)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot Example')
plt.show()


In [None]:
#4. Create a heatmap of a correlation matrix using Seaborn
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Creating a DataFrame with random values
data = np.random.rand(10, 4)
df = pd.DataFrame(data, columns=['A', 'B', 'C', 'D'])

# Calculating the correlation matrix
correlation_matrix = df.corr()

# Creating the heatmap
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Matrix Heatmap')
plt.show()


In [None]:

#5. Create a bar plot using Plotly
import plotly.express as px
import pandas as pd

# Creating a DataFrame
data = {'Category': ['A', 'B', 'C', 'D'], 'Values': [10, 20, 15, 25]}
df = pd.DataFrame(data)

# Creating the bar plot
fig = px.bar(df, x='Category', y='Values', title="Bar Plot Example")
fig.show()


In [None]:
# 6. Create a DataFrame and add a new column based on an existing column
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame({'A': [10, 20, 30, 40]})

# Adding a new column based on column 'A'
df['B'] = df['A'] * 2

print(df)


In [None]:
# 7. Write a program to perform element-wise multiplication of two NumPy arrays
import numpy as np

# Creating two NumPy arrays
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Element-wise multiplication
result = arr1 * arr2

print(result)  # Output: [ 5 12 21 32]


In [None]:
# 8. Create a line plot with multiple lines using Matplotlib

import matplotlib.pyplot as plt
import numpy as np

# Generating data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)

# Creating the plot
plt.plot(x, y1, label="sin(x)", color="blue")
plt.plot(x, y2, label="cos(x)", color="red")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Line Plot with Multiple Lines")
plt.legend()
plt.show()


In [None]:
# 9. Generate a Pandas DataFrame and filter rows where a column value is greater than a threshold
import pandas as pd

# Creating a DataFrame
df = pd.DataFrame({'A': [10, 20, 30, 40, 50]})

# Filtering rows where column 'A' is greater than 25
filtered_df = df[df['A'] > 25]

print(filtered_df)


In [None]:
# 10. Create a histogram using Seaborn to visualize a distribution
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

# Generating random data
data = np.random.randn(1000)

# Creating the histogram
sns.histplot(data, bins=30, kde=True, color='purple')
plt.title("Histogram of Random Data")
plt.show()


In [None]:
#11. Perform matrix multiplication using NumPy
import numpy as np

# Creating two matrices
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Performing matrix multiplication (for 2D arrays, A @ B gives the same result as np.dot(A, B))
result = A @ B

print(result)
# Output:
# [[19 22]
#  [43 50]]


In [None]:
# 12. Use Pandas to load a CSV file and display its first 5 rows
import pandas as pd

# Loading the CSV file (assuming 'data.csv' exists)
df = pd.read_csv('data.csv')

# Displaying the first 5 rows
print(df.head())


In [None]:
#13. Create a 3D scatter plot using Plotly
import plotly.express as px
import pandas as pd
import numpy as np

# Generating random data
df = pd.DataFrame({
    'X': np.random.rand(100),
    'Y': np.random.rand(100),
    'Z': np.random.rand(100)
})

# Creating the 3D scatter plot
fig = px.scatter_3d(df, x='X', y='Y', z='Z', title="3D Scatter Plot Example")
fig.show()
