# Example Usage of ExampleDatasetLoader with IonosphereDatasetLoader

This notebook demonstrates how to use the `ExampleDatasetLoader` class through one of its implementations, `IonosphereDatasetLoader`. We will:
1. Load the dataset
2. Perform default preprocessing
3. Apply custom preprocessing, such as imputing missing values, scaling, and encoding categorical data


In [None]:
!pip install rocelib

In [None]:
# Import the helper that returns an example dataset loader
from rocelib.datasets.ExampleDatasets import get_example_dataset

# Instantiate the dataset loader with get_example_dataset,
# which accepts the following dataset names:
#                - iris
#                - ionosphere
#                - adult
#                - titanic
# Alternatively, instantiate a loader class directly, e.g. IonosphereDatasetLoader()
ionosphere_loader = get_example_dataset("ionosphere")

# Load the dataset
ionosphere_loader.load_data()

# Display the first 5 rows of the dataset
ionosphere_loader.data.head()

## Preprocessing the Data Using the Default Method
The `default_preprocess` method applies the default pipeline, which in this case includes standard scaling.

In [None]:
# Apply default preprocessing
ionosphere_loader.default_preprocess()

# Show the preprocessed data
ionosphere_loader.data.head()
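
To make "standard scaling" concrete, the sketch below standardizes a single toy column using only the Python standard library. The function name `standard_scale` and the toy values are illustrative assumptions and are not part of rocelib. The sketch also assumes the population standard deviation, as scikit-learn's `StandardScaler` does; the loader's exact pipeline is not shown in this notebook.

```python
from statistics import fmean, pstdev


def standard_scale(values):
    """Shift values to mean 0 and rescale to unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros instead of dividing by 0
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy column, invented for illustration only
column = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = standard_scale(column)
print([round(v, 3) for v in scaled])
```

After scaling, the column has mean 0 and standard deviation 1, so features measured on different scales become comparable.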

## Custom Preprocessing
We can also customize the preprocessing by defining different imputation strategies, scaling methods, or encoding choices.

In [None]:
# Custom preprocessing
ionosphere_loader.preprocess(
    impute_strategy_numeric='mean',  # Impute missing numeric values with mean
    scale_method='minmax',           # Apply min-max scaling
    encode_categorical=False         # No categorical encoding needed (the dataset has no categorical features)
)

# Show preprocessed data after custom preprocessing
ionosphere_loader.data.head()
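
As a rough illustration of what the `'mean'` imputation strategy and `'minmax'` scaling compute, the sketch below applies both to one toy column in plain Python. The helper names `impute_mean` and `minmax_scale` are hypothetical and not rocelib functions. The sketch assumes missing values are represented as NaN and that at least one value in the column is observed.

```python
import math


def impute_mean(values):
    """Replace NaN entries with the mean of the observed (non-NaN) entries."""
    observed = [v for v in values if not math.isnan(v)]
    mean = sum(observed) / len(observed)  # assumes at least one observed value
    return [mean if math.isnan(v) else v for v in values]


def minmax_scale(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no range; map it to zeros instead of dividing by 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy column with one missing value, invented for illustration only
column = [2.0, float("nan"), 4.0, 6.0]
imputed = impute_mean(column)   # the NaN becomes 4.0, the mean of 2, 4, and 6
scaled = minmax_scale(imputed)
print(imputed)  # [2.0, 4.0, 4.0, 6.0]
print(scaled)   # [0.0, 0.5, 0.5, 1.0]
```

Imputation runs before scaling here so that the minimum and maximum are computed on a complete column.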

If you do not want to alter the state of the loader, you can instead retrieve the preprocessed features directly using the methods below:

In [None]:
ionosphere_loader = get_example_dataset("ionosphere")

default_preprocessed = ionosphere_loader.get_default_preprocessed_features()

print(default_preprocessed.head())

custom_preprocessed = ionosphere_loader.get_preprocessed_features(
    impute_strategy_numeric='mean',  # Impute missing numeric values with mean
    scale_method='minmax',           # Apply min-max scaling
    encode_categorical=False         # No categorical encoding needed (the dataset has no categorical features)
)

print(custom_preprocessed.head())
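
The difference between the state-altering and non-altering calls can be sketched with a toy class. `ToyLoader` is a hypothetical stand-in written for this illustration; it is not rocelib's implementation, and it only mirrors the method names used above under the assumption that `preprocess` overwrites `data` while `get_preprocessed_features` returns a new object.

```python
class ToyLoader:
    """Hypothetical stand-in for a dataset loader; not rocelib's implementation."""

    def __init__(self, values):
        self.data = values

    def _minmax(self):
        # Min-max scale the stored values into [0, 1]; assumes they are not all equal
        lo, hi = min(self.data), max(self.data)
        return [(v - lo) / (hi - lo) for v in self.data]

    def preprocess(self):
        # State-altering variant: overwrites self.data with the scaled values
        self.data = self._minmax()

    def get_preprocessed_features(self):
        # Non-altering variant: returns a new list and leaves self.data untouched
        return self._minmax()


loader = ToyLoader([1.0, 2.0, 3.0])

features = loader.get_preprocessed_features()
print(loader.data)  # [1.0, 2.0, 3.0], the raw values are still available
print(features)     # [0.0, 0.5, 1.0]

loader.preprocess()
print(loader.data)  # [0.0, 0.5, 1.0], the raw values have been replaced
```

Keeping the raw data intact is convenient when you want to compare several preprocessing configurations against the same loader.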