# `MissingValueHandler` Example

This notebook shows how to use `MissingValueHandler` to impute missing values in a dataset. The handler supports different strategies; this example uses `'mean'`.

In [4]:
import sys
# Add the parent directory to the import path so the local `transfory` package can be found
sys.path.insert(0, '..')

import pandas as pd
import numpy as np
from transfory.missing import MissingValueHandler

## 1. Create Sample Data

Let's create a DataFrame with missing values in both numeric and categorical columns.

In [5]:
df = pd.DataFrame({
    'Age': [22, 35, np.nan, 19, 40],
    'City': ["New York", "London", "London", np.nan, "Paris"]
})

print("Original DataFrame:")
df

Original DataFrame:


|   | Age  | City     |
|---|------|----------|
| 0 | 22.0 | New York |
| 1 | 35.0 | London   |
| 2 | NaN  | London   |
| 3 | 19.0 | NaN      |
| 4 | 40.0 | Paris    |
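
Before imputing, it can help to confirm where the gaps are. The following is a minimal sketch using only pandas (it does not use `transfory`); it rebuilds the same sample frame so that it runs on its own.

```python
import numpy as np
import pandas as pd

# Same sample data as above, rebuilt so this cell is self-contained
df = pd.DataFrame({
    "Age": [22, 35, np.nan, 19, 40],
    "City": ["New York", "London", "London", np.nan, "Paris"],
})

# Count missing entries per column
missing_counts = df.isna().sum()
print(missing_counts)

# Select the rows that contain at least one missing value
rows_with_gaps = df[df.isna().any(axis=1)]
print(rows_with_gaps)
```

Each column has exactly one missing entry here: `Age` in row 2 and `City` in row 3.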


## 2. Apply the MissingValueHandler

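We'll use the `'mean'` strategy. The handler is intended to apply the mean to the numeric `Age` column and fall back to the mode (most frequent value) for the categorical `City` column. In the output below, the missing `Age` in row 2 is filled with 29.0, the mean of 22, 35, 19, and 40, while the `City` value in row 3 still appears as missing.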

In [6]:
imputer = MissingValueHandler(strategy="mean")

imputed_df = imputer.fit_transform(df)

print("Imputed DataFrame:")
imputed_df

Imputed DataFrame:


|   | Age  | City     |
|---|------|----------|
| 0 | 22.0 | New York |
| 1 | 35.0 | London   |
| 2 | 29.0 | London   |
| 3 | 19.0 | NaN      |
| 4 | 40.0 | Paris    |
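
For comparison, the mean-for-numeric, mode-for-categorical behavior described in section 2 can be sketched in plain pandas. This is not `transfory`'s implementation, only an illustration under stated assumptions: the helper name `impute_mean_or_mode` is hypothetical, and numeric columns are detected with `pandas.api.types.is_numeric_dtype`.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype


def impute_mean_or_mode(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fill numeric columns with the mean, others with the mode."""
    result = frame.copy()  # leave the caller's DataFrame untouched
    for col in result.columns:
        if not result[col].isna().any():
            continue  # nothing to fill in this column
        if is_numeric_dtype(result[col]):
            fill_value = result[col].mean()
        else:
            modes = result[col].mode(dropna=True)
            if modes.empty:
                continue  # column is entirely missing, so there is no mode
            fill_value = modes.iloc[0]  # on ties, mode() is sorted; take the first
        result[col] = result[col].fillna(fill_value)
    return result


df = pd.DataFrame({
    "Age": [22, 35, np.nan, 19, 40],
    "City": ["New York", "London", "London", np.nan, "Paris"],
})

filled = impute_mean_or_mode(df)
print(filled)
```

In this sketch, `Age` in row 2 becomes 29.0 and `City` in row 3 becomes `London`, the most frequent city in the column.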
