## MIPHA test on a real use case

This notebook tests the MIPHA framework on a real use case. A proper "example" package will be provided in a future version, once the framework has stabilized.

The data used in this example was extracted from the [MIMIC-IV database](https://physionet.org/content/mimiciv/2.2/). The extraction code will also be added to the example package.

In [None]:
import sys
from importlib import reload

import pandas as pd

import tests.test_real_data.real_data_manager as data
# Import as a module: necessary for pickle to work (`from ... import *` cannot be used).
import tests.test_real_data.simple_implementation as impl
from mipha.framework import MiphaPredictor

In [None]:
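# Reload the project modules to pick up source changes without restarting the kernel.
# The key matches the import path used above (`mipha.framework`); a key that was never imported raises KeyError.
reload(sys.modules['mipha.framework'])
reload(sys.modules['tests.test_real_data.simple_implementation'])
reload(sys.modules['tests.test_real_data.real_data_manager'])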

### Framework implementation

We test a simple implementation of the framework, applied to predicting stage 5 chronic kidney disease (CKD) from one year of patient history, up to 15 months in advance.
The data sources used in this example are:
- The evolution of creatinine over time.
- The age and gender of the patient.

The framework is implemented as follows:
- Feature extraction for the first data source (creatinine over time) is performed using the `tsfel` package.
- Aggregation is a simple concatenation of the extracted features.
- The machine learning model is a simple CNN.
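
The aggregation step above can be sketched with plain NumPy. This is a minimal illustration rather than the code in `impl.SimpleAggregator`: the array names and the feature counts (4 and 2) are hypothetical, and the extra axis only mirrors the `rows=1` argument passed to the model in this notebook.

```python
import numpy as np

# Hypothetical per-source feature matrices, one row per patient.
# The feature counts (4 and 2) are illustrative only.
creatinine_features = np.arange(12, dtype=float).reshape(3, 4)  # e.g. time-series statistics
demographic_features = np.array([[67.0, 1.0], [54.0, 0.0], [71.0, 1.0]])  # e.g. age, encoded gender

# Aggregation by concatenation: join the feature blocks along the feature axis.
aggregated = np.concatenate([creatinine_features, demographic_features], axis=1)

# Add an explicit "rows" axis so each patient becomes a (rows, columns) grid.
cnn_input = aggregated[:, np.newaxis, :]

print(aggregated.shape)  # (3, 6)
print(cnn_input.shape)   # (3, 1, 6)
```

With concatenation, the number of columns seen by the model is the sum of the feature counts of all sources, which is why the model's input width has to be chosen to match the aggregator's output.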

In [None]:
data_sources_train, labels_train, data_sources_test, labels_test = data.load_stage_5_ckd(random_state=25)

In [None]:
mipha = MiphaPredictor(
    feature_extractors=[
        impl.BiologyFeatureExtractor(["Creatinine"]),
        impl.DemographicsFeatureExtractor(["Demographics"])],
    aggregator=impl.SimpleAggregator(),
    # Input dimensions are chosen to match the aggregator's output; the output is binary.
    model=impl.SimpleCnnModel(rows=1, columns=142, output_dim=1, n_filters=3),
    evaluator=impl.SimpleEvaluator(),
)

In [None]:
mipha.fit(data_sources_train, labels_train, epochs=3)

In [None]:
mipha.evaluate(data_sources=data_sources_test, test_labels=pd.DataFrame(labels_test), threshold=0.5)
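
The `threshold=0.5` argument suggests that predicted probabilities are turned into binary labels before scoring. The sketch below shows one common way to do this. The `binarize` helper is hypothetical, and whether `impl.SimpleEvaluator` uses `>=` or `>` at the boundary is an assumption.

```python
import numpy as np

def binarize(probabilities, threshold=0.5):
    """Map predicted probabilities to 0/1 labels; values >= threshold become 1."""
    return (np.asarray(probabilities) >= threshold).astype(int)

# Illustrative probabilities, not model output.
probabilities = [0.12, 0.50, 0.93, 0.49]
predictions = binarize(probabilities, threshold=0.5)
print(predictions.tolist())  # [0, 1, 1, 0]
```

Lowering the threshold trades precision for recall, which can matter when the positive class (here, progression to stage 5 CKD) is rare.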

In [None]:
mipha.save("out/mipha_real_data")

In [None]:
from datetime import datetime

# Save a timestamped copy so that successive runs do not overwrite one another.
formatted_time = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
file_path = f"out/mipha_real_data_{formatted_time}"
mipha.save(file_path)