# <b>CaRM Module: Advanced Topics in Data Preparation Using Python (2024/2025)</b>
## <b>Exercises for Session 03</b>

### Exercise 1: Handling Missing Values

1.	Create a DataFrame with missing values.<br> 
2.	Display the DataFrame and count the number of missing values in each column.<br>
3.	Fill the missing values in the 'Age' column with the mean age.<br>
4.	Drop rows where the 'Score' column has missing values.<br>


In [None]:
import pandas as pd
import numpy as np

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, np.nan, 22, np.nan, 29],
    'Score': [85, 90, np.nan, 88, 92]
}
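
One possible way to work through the four steps is sketched below. The variable names `df`, `missing_counts`, and `df_clean` are illustrative choices, not part of the exercise.

```python
import numpy as np
import pandas as pd

# Same data as in the exercise cell.
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, np.nan, 22, np.nan, 29],
    'Score': [85, 90, np.nan, 88, 92]
}

# Step 1: build the DataFrame.
df = pd.DataFrame(data)

# Step 2: display it and count missing values per column.
print(df)
missing_counts = df.isna().sum()
print(missing_counts)

# Step 3: fill missing ages with the mean age.
# mean() skips NaN by default, so only the known ages are averaged.
df['Age'] = df['Age'].fillna(df['Age'].mean())

# Step 4: keep only the rows that have a Score.
df_clean = df.dropna(subset=['Score'])
print(df_clean)
```

Note that `dropna(subset=['Score'])` returns a new DataFrame and leaves `df` unchanged, which makes it easy to compare the two.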

### Exercise 2: Dropping Duplicates

1.	Create a DataFrame with duplicate rows.<br>
2.	Display the DataFrame and identify duplicate rows using the .duplicated() method.<br>
3.	Drop duplicate rows and display the DataFrame after dropping duplicates.<br>

In [None]:
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Eva', 'Bob'],
    'Age': [24, 27, 22, 24, 29, 27],
    'Score': [85, 90, 78, 85, 92, 90]
}
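
A minimal sketch of one possible solution follows. The names `dup_mask` and `df_unique` are hypothetical, and the default `keep='first'` behaviour is assumed, so the first occurrence of each repeated row is kept.

```python
import pandas as pd

# Same data as in the exercise cell.
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Alice', 'Eva', 'Bob'],
    'Age': [24, 27, 22, 24, 29, 27],
    'Score': [85, 90, 78, 85, 92, 90]
}

df = pd.DataFrame(data)
print(df)

# duplicated() marks a row as True when an identical row appeared earlier.
dup_mask = df.duplicated()
print(df[dup_mask])

# drop_duplicates() keeps the first occurrence of each row by default.
df_unique = df.drop_duplicates()
print(df_unique)
```

The original row labels are preserved after dropping; chaining `.reset_index(drop=True)` would renumber them if that is preferred.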

### Exercise 3: Filling Missing Values

1.	Create a DataFrame with missing values.<br>
2.	Use the .fillna() method to fill missing values in the 'Score' column with a specified value (e.g., 0).<br>
3.	Use the .fillna() method to fill missing values in the 'Age' column with the median age.<br>

In [None]:
import pandas as pd
import numpy as np

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, np.nan, 22, 32, 29],
    'Score': [85, 90, np.nan, 88, 92]
}
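
The two `.fillna()` calls could look roughly like this. This is a sketch only; the fill value 0 is the example value suggested in the task, and `df` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Same data as in the exercise cell.
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, np.nan, 22, 32, 29],
    'Score': [85, 90, np.nan, 88, 92]
}

df = pd.DataFrame(data)

# Fill missing scores with a fixed value (0 here).
df['Score'] = df['Score'].fillna(0)

# Fill missing ages with the median of the known ages.
# median() ignores NaN by default.
df['Age'] = df['Age'].fillna(df['Age'].median())

print(df)
```

Assigning the result back to the column, rather than relying on `inplace=True`, keeps the behaviour explicit.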


### Exercise 4: Replacing Values

1.	Create a DataFrame with some invalid values.<br>
2.	Replace the invalid age (-1) with the mean of the valid ages (i.e., the mean computed without the -1).<br>
3.	Use the .fillna() method to replace missing values in the 'Score' column with the mean score.<br>

In [None]:
import pandas as pd
import numpy as np

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, -1, 22, 32, 29],
    'Score': [85, 90, 78, np.nan, 92]
}
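
One way to approach this is sketched below, under the assumption that the mean age should be computed from the valid ages only. To achieve that, the sketch first turns -1 into `NaN` so that it does not distort the mean; `df` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Same data as in the exercise cell.
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, -1, 22, 32, 29],
    'Score': [85, 90, 78, np.nan, 92]
}

df = pd.DataFrame(data)

# Assumption: -1 is a placeholder for "unknown", so mark it as missing first.
df['Age'] = df['Age'].replace(-1, np.nan)

# Now the mean is based on the four valid ages only.
df['Age'] = df['Age'].fillna(df['Age'].mean())

# Fill the missing score with the mean of the known scores.
df['Score'] = df['Score'].fillna(df['Score'].mean())

print(df)
```

Computing the mean directly on the original column would include the -1 and pull the result down, which is why the placeholder is converted to `NaN` before averaging.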

### Exercise 5: Using concat() to Combine DataFrames Along Rows

1.	Create two DataFrames.<br>
2.	Use the pd.concat() function to combine the DataFrames along rows.

In [None]:
import pandas as pd

data1 = {
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
}
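
A possible solution is sketched here. Passing `ignore_index=True` is an optional choice made for this sketch so that the combined rows are renumbered 0 to 7; without it, the labels 0 to 3 would appear twice. The names `df1`, `df2`, and `combined` are illustrative.

```python
import pandas as pd

# Same data as in the exercise cell.
data1 = {
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# Stack df2 below df1 (axis=0 is the default) and renumber the rows.
combined = pd.concat([df1, df2], ignore_index=True)
print(combined)
```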

### Exercise 6: Using concat() to Combine DataFrames Along Columns

1.	Create two DataFrames.<br>
2.	Use the pd.concat() function to combine the DataFrames along columns.

In [None]:
import pandas as pd

data1 = {
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}
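
A short sketch of one possible solution follows; `df1`, `df2`, and `combined` are illustrative names. With `axis=1`, rows are aligned on the index, which works cleanly here because both DataFrames get the same default index 0 to 3.

```python
import pandas as pd

# Same data as in the exercise cell.
data1 = {
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# Place the columns of df2 to the right of df1, matching rows by index.
combined = pd.concat([df1, df2], axis=1)
print(combined)
```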

### Exercise 7: Using .join() to Combine DataFrames

1.	Create two DataFrames with a common index.<br>
2.	Use the .join() method to combine the DataFrames based on the index.

In [None]:
import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}, index=['K0', 'K1', 'K2', 'K3'])

df2 = pd.DataFrame({
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}, index=['K0', 'K1', 'K2', 'K3'])
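
The join itself can be sketched in a single line; `joined` is an illustrative name. By default `.join()` performs a left join on the index, which here keeps all four rows because both indexes are identical.

```python
import pandas as pd

# Same DataFrames as in the exercise cell.
df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}, index=['K0', 'K1', 'K2', 'K3'])

df2 = pd.DataFrame({
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}, index=['K0', 'K1', 'K2', 'K3'])

# Combine on the shared index labels K0..K3.
joined = df1.join(df2)
print(joined)
```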

### Exercise 8: Using merge() to Combine DataFrames on a Common Column

1.	Create two DataFrames with a common column.<br>
2.	Use the merge() method to combine the DataFrames on the 'Key' column.

In [None]:
import pandas as pd

data1 = {
    'Key': ['K0', 'K1', 'K2', 'K3'],
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'Key': ['K0', 'K1', 'K2', 'K3'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}
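
One possible solution is sketched below, using the top-level `pd.merge()` function; `df1.merge(df2, on='Key')` would be an equivalent method call. The names `df1`, `df2`, and `merged` are illustrative.

```python
import pandas as pd

# Same data as in the exercise cell.
data1 = {
    'Key': ['K0', 'K1', 'K2', 'K3'],
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'Key': ['K0', 'K1', 'K2', 'K3'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# Match rows whose 'Key' values are equal; 'Key' appears once in the result.
merged = pd.merge(df1, df2, on='Key')
print(merged)
```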


### Exercise 9: Merging DataFrames with Different Keys

1.	Create two DataFrames whose 'Key' columns only partially overlap.<br>
2.	Use the merge() method to join the DataFrames on the keys they have in common.

In [None]:
data1 = {
    'Key': ['K0', 'K1', 'K2', 'K3'],
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'Key': ['K0', 'K1', 'K3', 'K4'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}
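
A sketch of one possible solution follows. It assumes that "common keys" means an inner join, which is the default for `merge()`. An outer join is added only for comparison and is not required by the exercise; `inner` and `outer` are illustrative names.

```python
import pandas as pd

# Same data as in the exercise cell.
data1 = {
    'Key': ['K0', 'K1', 'K2', 'K3'],
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}

data2 = {
    'Key': ['K0', 'K1', 'K3', 'K4'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)

# Inner join: keep only keys present in both DataFrames (K0, K1, K3).
inner = df1.merge(df2, on='Key', how='inner')
print(inner)

# For comparison: an outer join keeps every key and fills the gaps with NaN.
outer = df1.merge(df2, on='Key', how='outer')
print(outer)
```

In the inner result, K2 (only in the first DataFrame) and K4 (only in the second) are dropped. In the outer result they are kept, with `NaN` in the columns that come from the other DataFrame.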