Q1. How do you load a CSV file into a Pandas DataFrame?     
Ans.    
To load a CSV file into a Pandas DataFrame, you can use the pandas.read_csv function. This function takes a file path (or URL) and various optional parameters to control how the file is read.     
import pandas as pd

--Load the CSV file
df = pd.read_csv('file.csv')

--Print the contents of the DataFrame
print(df)
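
The optional parameters mentioned above can be sketched as follows. This is a minimal example, not a complete tour of read_csv: the CSV text, the column names, and the choice of ';' as delimiter are assumptions made up for illustration, and io.StringIO stands in for a real file path.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'file.csv'
csv_text = "id;name;score\n1;Ann;9.5\n2;Bob;7.0\n"

df = pd.read_csv(
    io.StringIO(csv_text),    # a path, URL, or file-like object
    sep=";",                  # field delimiter (the default is ',')
    usecols=["id", "score"],  # read only these columns
    dtype={"id": "int64"},    # force a dtype for selected columns
)

print(df)
```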

Q2. How do you check the data type of a column in a Pandas DataFrame?       
Ans.    
To check the data types in a Pandas DataFrame, you can use the dtypes attribute of the DataFrame. This attribute returns a Pandas Series holding the data type of each column. For a single column, use the dtype attribute of that column, for example df['col1'].dtype.
--Load the CSV file
df = pd.read_csv('file.csv')

--Print the data types of the columns
print(df.dtypes)
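
As a small sketch of checking one column rather than all of them, the example below reads the dtype of a single column and tests whether it is numeric. The DataFrame and its column names are assumptions made up for illustration.

```python
import pandas as pd

# Sample data (hypothetical)
df = pd.DataFrame({"col1": [1, 2, 3], "col2": ["a", "b", "c"]})

# dtype of a single column (a Series has dtype, a DataFrame has dtypes)
col1_dtype = df["col1"].dtype
print(col1_dtype)

# Helper from pandas.api.types to test the kind of dtype
is_numeric = pd.api.types.is_numeric_dtype(df["col1"])
print(is_numeric)
```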


Q3. How do you select rows from a Pandas DataFrame based on a condition?
Ans.    
To select rows from a Pandas DataFrame based on a condition, you can pass a Boolean mask to the loc indexer. The mask is a Series of True/False values, and loc keeps only the rows where it is True.
--Load the CSV file     
df = pd.read_csv('file.csv')

--Select rows where the 'col1' column is greater than 0     
df = df.loc[df['col1'] > 0]

--Print the contents of the DataFrame       
print(df)
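
A self-contained sketch of the same idea, using made-up sample data instead of a CSV file. It also shows that loc can select rows and a column in one step.

```python
import pandas as pd

# Sample data (hypothetical)
df = pd.DataFrame({"col1": [-1, 0, 2, 5], "col2": ["a", "b", "c", "d"]})

# Boolean mask: True where 'col1' is greater than 0
mask = df["col1"] > 0

# Keep only the rows where the mask is True
positive = df.loc[mask]

# Filter rows and pick a single column at the same time
positive_col2 = df.loc[mask, "col2"]

print(positive)
print(positive_col2)
```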


Q4. How do you rename columns in a Pandas DataFrame?    
Ans.    
To rename columns in a Pandas DataFrame, you can use the rename method. Its columns parameter takes a dictionary with the old column names as keys and the new column names as values, and the method returns a new DataFrame with the renamed columns.
--Load the CSV file     
df = pd.read_csv('file.csv')

--Rename the 'col1' column to 'new_name'        
df = df.rename(columns={'col1': 'new_name'})


Q5. How do you drop columns in a Pandas DataFrame?      
Ans.    
To drop columns in a Pandas DataFrame, you can use the drop method. Passing a list of column names to its columns parameter returns a new DataFrame with those columns removed.
--Load the CSV file     
df = pd.read_csv('file.csv')

--Drop the 'col1' column    
df = df.drop(columns=['col1'])

Q6. How do you find the unique values in a column of a Pandas DataFrame?    
Ans.    
To find the unique values in a column of a Pandas DataFrame, you can use the unique method of the Series object.    

--Find the unique values in the 'col1' column
uniq_vals = df['col1'].unique()

Q7. How do you find the number of missing values in each column of a Pandas DataFrame?      
Ans.    
To find the number of missing values in each column of a Pandas DataFrame, you can use the isnull method (an alias of isna) to create a Boolean mask indicating which values are missing, and then use the sum method to count the number of True values in each column.
--Count the number of missing values in each column     
null_counts = df.isnull().sum()
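
A runnable sketch of the mask-then-sum approach on made-up data. The column names and values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Sample data with missing values (hypothetical)
df = pd.DataFrame({"col1": [1.0, np.nan, 3.0], "col2": ["x", None, None]})

# Missing values per column: True counts as 1 when summed
null_counts = df.isna().sum()

# Total number of missing values in the whole DataFrame
total_missing = int(null_counts.sum())

print(null_counts)
print(total_missing)
```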

Q8. How do you fill missing values in a Pandas DataFrame with a specific value?     
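Ans.
To fill missing values with a specific value, you can use the fillna method, which returns a new object in which the missing values are replaced.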
--Fill missing values in the 'col1' column with 0       
df['col1'] = df['col1'].fillna(0)
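
fillna also accepts a dictionary, which allows a different fill value per column. The sketch below assumes made-up column names and fill values.

```python
import numpy as np
import pandas as pd

# Sample data with missing values (hypothetical)
df = pd.DataFrame({"col1": [1.0, np.nan, 3.0], "col2": ["x", None, "z"]})

# Map each column name to the value used for its missing entries
filled = df.fillna({"col1": 0, "col2": "unknown"})

print(filled)
```

The original df is left unchanged here, because fillna returns a new DataFrame.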




Q9. How do you concatenate two Pandas DataFrames?   
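Ans.
To concatenate two Pandas DataFrames, you can use the pd.concat function, which by default stacks the DataFrames row-wise and keeps their original index labels.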
--Load the first CSV file       
df1 = pd.read_csv('file1.csv')

--Load the second CSV file      
df2 = pd.read_csv('file2.csv')

--Concatenate the two DataFrames        
df = pd.concat([df1, df2])

--Print the contents of the resulting DataFrame         
print(df)
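
A short sketch of two common variations of pd.concat, on made-up DataFrames: renumbering the index with ignore_index=True and placing DataFrames side by side with axis=1.

```python
import pandas as pd

# Sample data (hypothetical)
df1 = pd.DataFrame({"A": [1, 2]})
df2 = pd.DataFrame({"A": [3, 4]})

# Default: rows are stacked and the original index labels are kept
stacked = pd.concat([df1, df2])

# ignore_index=True assigns a fresh 0..n-1 index
renumbered = pd.concat([df1, df2], ignore_index=True)

# axis=1 places the DataFrames side by side, aligned on the index
side_by_side = pd.concat([df1, df2.rename(columns={"A": "B"})], axis=1)

print(stacked)
print(renumbered)
print(side_by_side)
```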


Q10. How do you merge two Pandas DataFrames on a specific column?   
Ans.    
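To merge two Pandas DataFrames on a specific column, you can use the merge function and name the shared column with the on parameter. By default, merge performs an inner join, so only keys present in both DataFrames are kept.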
import pandas as pd

df1 = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})

df2 = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'C': ['C0', 'C1', 'C2', 'C3'],
                     'D': ['D0', 'D1', 'D2', 'D3']})

result = pd.merge(df1, df2, on='key')   
print(result)
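
When the keys do not match exactly, the how parameter decides which rows survive. The sketch below uses made-up DataFrames whose keys only partly overlap.

```python
import pandas as pd

# Sample data with partly overlapping keys (hypothetical)
df1 = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
df2 = pd.DataFrame({"key": ["K1", "K2", "K3"], "C": ["C1", "C2", "C3"]})

# Inner join (default): only keys found in both DataFrames
inner = pd.merge(df1, df2, on="key")

# Left join: every row of df1, with missing values where df2 has no match
left = pd.merge(df1, df2, on="key", how="left")

# Outer join: all keys; indicator=True adds a '_merge' column with the row origin
outer = pd.merge(df1, df2, on="key", how="outer", indicator=True)

print(inner)
print(left)
print(outer)
```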



Q11. How do you group data in a Pandas DataFrame by a specific column and apply an aggregation function?    
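Ans.
To group data by a specific column and aggregate it, you can use the groupby method followed by agg, passing a dictionary that maps each column to its aggregation function.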
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                   'col1': [1, 2, 3, 4],
                   'col2': [5, 6, 7, 8]})

result = df.groupby('key').agg({'col1': 'sum', 'col2': 'mean'})     
print(result)



Q12. How do you pivot a Pandas DataFrame?       
Ans.    
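To pivot a Pandas DataFrame, you can use the pivot method. It takes three arguments: index (the column whose values become the row labels), columns (the column whose values become the new column labels), and values (the column used to fill the cells). pivot raises a ValueError if an index/columns pair occurs more than once; in that case pivot_table, which aggregates duplicates, is the usual alternative.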
import pandas as pd

--Load the data     
df = pd.read_csv('data.csv')

--Pivot the data        
pivoted = df.pivot(index='index_column_name', columns='column_name', values='value_column_name')
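
A self-contained sketch of pivot on made-up long-format data. The column names date, city, and temp are assumptions for illustration.

```python
import pandas as pd

# Long-format sample data: one row per (date, city) pair (hypothetical)
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"],
    "city": ["Paris", "Rome", "Paris", "Rome"],
    "temp": [5, 12, 6, 13],
})

# One row per date, one column per city, cells filled from 'temp'
pivoted = df.pivot(index="date", columns="city", values="temp")

print(pivoted)
```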



Q13. How do you change the data type of a column in a Pandas DataFrame?     
Ans.    
To change the data type of a column in a Pandas DataFrame, you can use the astype() method. It takes the data type you want to convert the column to and returns the converted column. Note that converting to int raises an error if the column contains missing values.
import pandas as pd 

--Load the data     
df = pd.read_csv('data.csv')

--Convert the 'age' column to an integer        
df['age'] = df['age'].astype(int)
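
If the column contains missing values, a plain integer cast fails. One possible workaround, sketched below on made-up data, is the nullable integer dtype "Int64" (capital I), which can hold missing values.

```python
import numpy as np
import pandas as pd

# Sample 'age' column with a missing value (hypothetical)
df = pd.DataFrame({"age": [25.0, 31.0, np.nan]})

# A plain int cast cannot represent NaN and raises a ValueError
try:
    df["age"].astype(int)
    cast_failed = False
except ValueError:
    cast_failed = True

# The nullable 'Int64' dtype stores the missing value as <NA>
nullable_age = df["age"].astype("Int64")

print(cast_failed)
print(nullable_age)
```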


Q14. How do you sort a Pandas DataFrame by a specific column?       
Ans.       
To sort a Pandas DataFrame by a specific column, you can use the sort_values() method. Its by parameter specifies the name of the column to sort by. The method sorts in ascending order by default and returns a new DataFrame, so the result must be assigned to a variable.

import pandas as pd 

--Load the data     
df = pd.read_csv('data.csv')

--Sort the DataFrame by the 'age' column (sort_values returns a new DataFrame)
df = df.sort_values(by='age')



Q15. How do you create a copy of a Pandas DataFrame?        
Ans.    
To create a copy of a Pandas DataFrame, you can use the copy() method. By default it makes a deep copy: a new object with its own copy of the data and index, so changes to the copy do not affect the original DataFrame.

import pandas as pd

--Load the data     
df = pd.read_csv('data.csv')

--Create a copy of the DataFrame        
df_copy = df.copy()
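
The difference between a copy and a plain assignment can be sketched as follows, using a made-up DataFrame. A plain assignment only creates a second name for the same object.

```python
import pandas as pd

# Sample data (hypothetical)
df = pd.DataFrame({"col1": [1, 2, 3]})

# Plain assignment: 'alias' and 'df' are the same object
alias = df

# copy(): an independent DataFrame
df_copy = df.copy()

# Changing the copy leaves the original untouched
df_copy.loc[0, "col1"] = 100

# Changing the alias changes the original, because it is the same object
alias.loc[1, "col1"] = 200

print(df)
print(df_copy)
```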



Q16. How do you filter rows of a Pandas DataFrame by multiple conditions?   
Ans.    
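To filter rows of a Pandas DataFrame by multiple conditions, you can combine Boolean masks with the & (and) and | (or) operators, and negate a mask with ~. Each condition must be wrapped in parentheses, because & and | bind more tightly than comparison operators such as >.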

import pandas as pd     

--Load the data     
df = pd.read_csv('data.csv')

--Filter rows where the 'age' column is greater than 30 and the 'name' column starts with 'A'       
df = df[(df['age'] > 30) & (df['name'].str.startswith('A'))]
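
A self-contained sketch of the three operators on made-up data. The names and ages are assumptions for illustration.

```python
import pandas as pd

# Sample data (hypothetical)
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Anna", "Carl"],
    "age": [35, 42, 28, 19],
})

# AND: older than 30 and name starts with 'A'
older_a = df[(df["age"] > 30) & (df["name"].str.startswith("A"))]

# OR: older than 40 or name starts with 'A'
either = df[(df["age"] > 40) | (df["name"].str.startswith("A"))]

# NOT: name does not start with 'A'
not_a = df[~df["name"].str.startswith("A")]

print(older_a)
print(either)
print(not_a)
```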


Q17. How do you calculate the mean of a column in a Pandas DataFrame?
Ans.    
To calculate the mean of a column in a Pandas DataFrame, you can use the mean() method of the column. It returns the arithmetic mean of the values in the column and skips missing values by default.
import pandas as pd

--Load a sample DataFrame   
df = pd.read_csv("data.csv")

--Calculate the mean of the "Score" column      
mean_score = df["Score"].mean()

print(mean_score)


Q18. How do you calculate the standard deviation of a column in a Pandas DataFrame?
Ans.    
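You can calculate the standard deviation of a column in a Pandas DataFrame using the std() method. By default it returns the sample standard deviation (ddof=1); pass ddof=0 for the population standard deviation.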
import pandas as pd

--Create a sample DataFrame     
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6]})

--Calculate the standard deviation of column 'col1'     
col1_std = df['col1'].std()

print(col1_std)


Q19. How do you calculate the correlation between two columns in a Pandas DataFrame?
Ans.        
You can calculate the correlation between two columns in a Pandas DataFrame by calling the corr() method on one column and passing the other. It computes the Pearson correlation coefficient by default.
import pandas as pd

--Create a sample DataFrame     
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6]})

--Calculate the correlation between column 'col1' and 'col2'        
corr = df['col1'].corr(df['col2'])

print(corr)


Q20. How do you select specific columns in a DataFrame using their labels?
Ans.    
***Using the [] operator with a single label (returns a Series):
import pandas as pd

--Create a sample DataFrame     
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6], 'col3': [3, 4, 5, 6, 7]})

--Select column 'col1'      
col1 = df['col1']

print(col1)


***Using the [] operator with a list of labels (returns a DataFrame):
--Select columns 'col1' and 'col3'
col1_and_col3 = df[['col1', 'col3']]

print(col1_and_col3)


Q21. How do you select specific rows in a DataFrame using their indexes?
Ans.    
***Using the loc[] indexer (label-based):
import pandas as pd

--Create a sample DataFrame         
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6], 'col3': [3, 4, 5, 6, 7]}, index=['a', 'b', 'c', 'd', 'e'])

--Select row 'b'    
row_b = df.loc['b']

print(row_b)

***Using the iloc[] indexer (position-based):
--Select rows 2 and 4 (indices 1 and 3)         
rows_2_and_4 = df.iloc[[1, 3]]

print(rows_2_and_4)



Q22. How do you sort a DataFrame by a specific column?  
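Ans.
To sort a DataFrame by a specific column, you can use the sort_values method; pass ascending=False to sort in descending order.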
import pandas as pd

--Create a sample DataFrame     
df = pd.DataFrame({'col1': [5, 2, 3, 1, 4], 'col2': [2, 3, 4, 5, 6], 'col3': [3, 4, 5, 6, 7]})

--Sort the DataFrame by column 'col1' in ascending order        
df_sorted = df.sort_values(by='col1')

print(df_sorted)


Q23. How do you create a new column in a DataFrame based on the values of another column?       
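Ans.
To create a new column based on the values of another column, you can use the assign method, which returns a new DataFrame with the extra column.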
import pandas as pd

--Create a sample DataFrame     
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6]})

--Create a new column 'col3' based on the values in 'col1'      
df = df.assign(col3=df['col1'] * 2)

print(df)



Q24. How do you remove duplicates from a DataFrame?     
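Ans.
To remove duplicate rows from a DataFrame, you can use the drop_duplicates method, which by default keeps the first occurrence of each duplicated row.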
import pandas as pd

--Create a sample DataFrame with duplicates     
df = pd.DataFrame({'col1': [1, 2, 3, 3, 4, 4, 5], 'col2': [2, 3, 4, 4, 5, 5, 6]})

--Remove duplicates     
df_deduped = df.drop_duplicates()

print(df_deduped)



Q25. What is the difference between .loc and .iloc in Pandas?   
Ans.    
In Pandas, .loc and .iloc are both used to select rows and columns from a DataFrame. The primary difference between the two is that .loc selects by label, whereas .iloc selects by integer position. Slices also differ: .loc includes the end label, while .iloc excludes the end position.

import pandas as pd

--Create a sample DataFrame     
df = pd.DataFrame({'col1': [1, 2, 3, 4, 5], 'col2': [2, 3, 4, 5, 6], 'col3': [3, 4, 5, 6, 7]}, index=['a', 'b', 'c', 'd', 'e'])

--Select row 'b' using .loc     
row_b = df.loc['b']

print(row_b)

--Select row 2 (index 1) using .iloc        
row_2 = df.iloc[1]

print(row_2)
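
The slicing behavior of the two indexers can be sketched as follows, on a made-up DataFrame with string labels.

```python
import pandas as pd

# Sample data with string index labels (hypothetical)
df = pd.DataFrame({"col1": [1, 2, 3, 4, 5]}, index=["a", "b", "c", "d", "e"])

# Label-based slice: the end label 'd' is included
by_label = df.loc["b":"d"]

# Position-based slice: the end position 3 is excluded
by_position = df.iloc[1:3]

print(by_label)
print(by_position)
```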
