In [1]:
# Title: Data Cleaning using Pandas
# Description: Check for missing values and handle them by imputing the median.
import pandas as pd
import numpy as np

# Sample DataFrame with missing values
data = {
    'A': [1, 2, np.nan, 4, 5],
    'B': [10, np.nan, 30, 40, 50],
    'C': [100, 200, 300, np.nan, 500]
}
df = pd.DataFrame(data)

# 1. Check for missing values
print("Missing values before imputation:")
print(df.isnull().sum())

# 2. Impute missing values with column medians
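for column in df.columns:
    median_value = df[column].median()
    # Assign the result back to the column. The chained form
    # df[column].fillna(median_value, inplace=True) triggers the pandas
    # warning quoted after the output and will stop modifying df in pandas 3.0.
    df[column] = df[column].fillna(median_value)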

# 3. Verify no missing values remain
print("\nMissing values after imputation:")
print(df.isnull().sum())

# Show the cleaned DataFrame
print("\nCleaned DataFrame:")
print(df)

Missing values before imputation:
A    1
B    1
C    1
dtype: int64

Missing values after imputation:
A    0
B    0
C    0
dtype: int64

Cleaned DataFrame:
     A     B      C
0  1.0  10.0  100.0
1  2.0  35.0  200.0
2  3.0  30.0  300.0
3  4.0  40.0  250.0
4  5.0  50.0  500.0


The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  df[column].fillna(median_value, inplace=True)
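
The two alternatives that the warning suggests can be sketched as follows, reusing the sample data from the cell above. This is a minimal illustration rather than the only way to do it; the names `by_column`, `by_mapping`, and `medians` are hypothetical and do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

# Same sample data as in the cell above
df = pd.DataFrame({
    'A': [1, 2, np.nan, 4, 5],
    'B': [10, np.nan, 30, 40, 50],
    'C': [100, 200, 300, np.nan, 500],
})

# Alternative 1: df[col] = df[col].method(value)
# Work on a copy so both alternatives start from the same input.
by_column = df.copy()
for column in by_column.columns:
    by_column[column] = by_column[column].fillna(by_column[column].median())

# Alternative 2: df.method({col: value})
# Build a {column: median} mapping and fill every column in one call.
medians = df.median().to_dict()
by_mapping = df.fillna(medians)

# Both approaches should fill the same cells with the same values.
print(by_column.equals(by_mapping))
print(by_mapping.isnull().sum())
```

Neither form calls an inplace method on the intermediate `df[column]` object, so neither relies on the behavior the warning says will change. The mapping form avoids the explicit loop, which may be preferable when every column uses the same imputation rule.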