# Data Cleaning 
## Automating Missing Value Handling with Python Functions

Missing data is one of the most common issues in datasets, and it can distort summary statistics and break downstream code if not handled properly. Depending on your dataset and problem, you might drop rows with missing values, fill them with a default value, or impute them from the data itself, for example with the column mean.
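Before choosing among these options, it often helps to see how much data is actually missing. The snippet below is a minimal sketch of that check; the `sample` DataFrame and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
sample = pd.DataFrame({
    "Name": ["Ada", None, "Grace"],
    "Age": [36.0, None, 45.0],
})

# Number of missing values in each column
missing_counts = sample.isna().sum()

# Fraction of rows that contain at least one missing value
missing_row_share = sample.isna().any(axis=1).mean()

print(missing_counts)
print(missing_row_share)
```

If only a small share of rows is affected, dropping them may be acceptable; when many rows are affected, filling or imputing usually preserves more of the data.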

In [10]:
# Code example: Handling Missing Values

import pandas as pd
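# Define a reusable function to handle missing values
def handle_missing_values(df, method='mean', fill_value=None):
    """Return a copy of df with missing values handled by the chosen method."""
    if method == 'drop':
        # Remove every row that contains at least one missing value
        return df.dropna()
    elif method == 'fill':
        # fillna(None) fails with an unhelpful error, so check up front
        if fill_value is None:
            raise ValueError("fill_value is required when method='fill'")
        return df.fillna(fill_value)
    elif method == 'mean':
        # Work on a copy so the caller's DataFrame is not modified in place
        df = df.copy()
        numeric_cols = df.select_dtypes(include=['number']).columns
        # Fill numeric columns with their column mean, rounded to 2 decimals
        df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean().round(2))
        return df
    else:
        raise ValueError(f"Invalid method provided: {method!r}")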

# Example dataset
data = {"Name": ['Joshua', 'Judith', None, 'Jude'],
        "Age": [25, None, 30, 22],
        "Salary": [500000, 60000, None, 450000]}
df = pd.DataFrame(data)

# Use the function to handle missing values by filling with the mean
cleaned_df = handle_missing_values(df, method='mean')
print(cleaned_df)

     Name    Age     Salary
0  Joshua  25.00  500000.00
1  Judith  25.67   60000.00
2    None  30.00  336666.67
3    Jude  22.00  450000.00
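Note that `Name` in row 2 is still `None`: mean filling only applies to numeric columns. One possible way to cover text columns as well is sketched below. The helper name `fill_by_dtype` and the `"Unknown"` placeholder are assumptions made for illustration, not part of the function above.

```python
import pandas as pd

def fill_by_dtype(df, text_placeholder="Unknown"):
    """Hypothetical helper: column mean for numeric columns, placeholder otherwise."""
    fill_values = {}
    for col in df.columns:
        if pd.api.types.is_numeric_dtype(df[col]):
            # Numeric column: use its mean, rounded to 2 decimals
            fill_values[col] = round(df[col].mean(), 2)
        else:
            # Non-numeric column: use the placeholder (an assumed default)
            fill_values[col] = text_placeholder
    # fillna with a dict fills each column with its own value and returns a new DataFrame
    return df.fillna(fill_values)

example = pd.DataFrame({
    "Name": ["Joshua", "Judith", None, "Jude"],
    "Age": [25, None, 30, 22],
    "Salary": [500000, 60000, None, 450000],
})

filled = fill_by_dtype(example)
print(filled)
```

Whether a placeholder is appropriate depends on the column: for an identifier such as a name, dropping the row or leaving the value missing may be the more honest choice.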
