# Exploring dimlpfidex with OmniXAI for heart attack prediction

**Introduction:**

Welcome to HES-Xplain, our interactive platform designed to make explainable artificial intelligence (XAI) techniques easier to use. In this use case, we work with the [Heart Attack Analysis & Prediction Dataset](https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset). This notebook shows how to use the MLXplain (OmniXAI) integration of [DimlpBT](https://hes-xplain.github.io/documentation/dimlpfidex/dimlp/dimlpbt/), [Fidex](https://hes-xplain.github.io/documentation/dimlpfidex/fidex/fidex/) & [FidexGloRules](https://hes-xplain.github.io/documentation/dimlpfidex/fidex/fidexglorules/).

**Objectives:**

    1. Observe a different use case where XAI can be used.
    2. Understand how to pre-process data.
    3. Understand how to use Dimlp and Fidex with OmniXAI.
    4. Showcase the versatility of HES-Xplain using a different dataset.
    5. Provide practical insights into applying Dimlp and Fidex to heart attack prediction through an interactive notebook.
    6. Foster a community of XAI enthusiasts and practitioners.

**Outline:**

    1. Dataset and Problem Statement.
    2. Load and pre-process the dataset.
    3. Train the Model.
    4. Local and global rules generation & OmniXAI dashboard display.
    5. References.

Through this use case, we aim to show users the potential of Dimlp and Fidex as tools for transparent and interpretable classification. With HES-Xplain, we make XAI accessible, helping users build trust in their models and make informed decisions.

# Dataset and Problem Statement
The dataset we'll be working with is called the [Heart Attack Analysis & Prediction Dataset](https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset) and is accessible on [Kaggle](https://www.kaggle.com). It comprises 303 anonymized data records containing health information. In this notebook, our focus is predicting the risk of heart attack based on all given factors. By leveraging deep learning techniques and Fidex algorithms, we aim not only to achieve high classification performance but also to gain insights into the attributes (patient features here) that contribute to the classification decisions.

# Load and pre-process the dataset
Let's start by preprocessing the data. We one-hot encode the `cp` attribute and the classes:

In [None]:
import pandas as pd
import pathlib as pl
from random import randint
from omnixai_community.data.tabular import Tabular
from omnixai_community.visualization.dashboard import Dashboard
from omnixai_community.explainers.tabular.auto import TabularExplainer
from mlxplain.explainers.tabular.specific.dimlpfidex import DimlpBTModel


dataset = pd.read_csv("heart.csv")

cp = pd.get_dummies(
    dataset["cp"], prefix="cp", prefix_sep="_", columns=["cp"], dtype="int8"
)
dataset = pd.concat([dataset.iloc[:, :2], cp, dataset.iloc[:, 3:]], axis=1)
dataset = dataset.rename(
    columns={
        "cp_0": "cp_typical",
        "cp_1": "cp_atypical",
        "cp_2": "cp_nonanginal",
        "cp_3": "cp_asymptomatic",
    }
)

output = pd.get_dummies(
    dataset["output"], prefix="risk", prefix_sep="_", columns=["output"], dtype="int8"
)

dataset = pd.concat([dataset.iloc[:, :-1], output], axis=1)
dataset = dataset.rename(columns={"risk_0": "risk_no", "risk_1": "risk_yes"})
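
The cell above inserts the one-hot columns by position (`iloc[:, :2]` and `iloc[:, 3:]`), which relies on `cp` being the third column of `heart.csv`. As a minimal sketch, assuming a toy frame with made-up values, the hypothetical helper `one_hot_at_position` below looks the column position up by name instead, so the same steps keep working if the column order changes. It is an illustration, not part of the notebook's pipeline.

```python
import pandas as pd

# Toy frame mimicking three columns of heart.csv (values are made up).
toy = pd.DataFrame({"age": [63, 37, 41], "cp": [3, 2, 1], "output": [1, 1, 0]})


def one_hot_at_position(frame: pd.DataFrame, column: str, prefix: str) -> pd.DataFrame:
    """Return a copy of `frame` where `column` is replaced by its one-hot columns.

    Hypothetical helper: the new columns are placed where `column` used to be.
    """
    position = frame.columns.get_loc(column)  # integer index of the column
    dummies = pd.get_dummies(frame[column], prefix=prefix, dtype="int8")
    return pd.concat(
        [frame.iloc[:, :position], dummies, frame.iloc[:, position + 1:]],
        axis=1,
    )


encoded = one_hot_at_position(toy, "cp", "cp")
print(list(encoded.columns))
```

Each row of the resulting `cp_*` columns contains exactly one `1`, which is what the rename step in the cell above then labels with readable names.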

Now we must split the dataset into train and test subsets and write the feature names to a file:

In [None]:
dataset = dataset.sample(frac=1)
split = int(dataset.shape[0] * 0.8)

features = dataset.columns

nb_classes = 2
nb_features = len(features) - nb_classes

train_dataset = Tabular(
    data=dataset.iloc[:split,:]
)

test_dataset = Tabular(
    data=dataset.iloc[split:,:]
)

root_dir = pl.Path("out")
root_dir.mkdir(parents=True, exist_ok=True)
features_filename = "attributes.txt"

with open(root_dir.joinpath(features_filename), "w") as file:
    for feature in features:
        file.write(f"{feature}\n")
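
Note that `dataset.sample(frac=1)` draws a new shuffle on every run, so the train/test split, and with it the trained model and the extracted rules, can differ between runs. If reproducibility matters, one option is to pass pandas' `random_state` argument. The sketch below shows this on a toy frame; the helper name `shuffle_and_split` and the seed value are illustrative assumptions, not part of the notebook.

```python
import pandas as pd


def shuffle_and_split(frame: pd.DataFrame, train_fraction: float = 0.8, seed: int = 42):
    """Shuffle rows with a fixed seed, then cut into train and test parts."""
    shuffled = frame.sample(frac=1, random_state=seed)
    split = int(shuffled.shape[0] * train_fraction)
    return shuffled.iloc[:split, :], shuffled.iloc[split:, :]


# Toy data with the same layout as the notebook: features first, classes last.
toy = pd.DataFrame({"x": range(10), "risk_no": [1, 0] * 5, "risk_yes": [0, 1] * 5})

train_a, test_a = shuffle_and_split(toy)
train_b, test_b = shuffle_and_split(toy)

print(len(train_a), len(test_a))
print(train_a.equals(train_b))  # same seed, same split
```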

# Train the Model
Let's train the [DimlpBT](https://hes-xplain.github.io/documentation/dimlpfidex/dimlp/dimlpbt/) model with our datasets:

In [None]:
# WARNING: verbose_console cannot be used when running in a notebook

model = DimlpBTModel(
    root_dir,
    train_dataset,
    test_dataset,
    nb_features,
    nb_classes,
    attributes_file=features_filename,
)

model.train()


# Local and global rules generation & OmniXAI dashboard display
The rule extraction must now be initialized and executed with Fidex & FidexGloRules to generate local & global explanations. Then, we display the explanations on the OmniXAI dashboard:

In [None]:
max_iterations = 10
min_covering = 2
max_failed_attempts = 15
min_fidelity = 1.0
lowest_fidelity_allowed = 0.9
use_minimal_version = True

explainer = TabularExplainer(
    explainers=["fidex", "fidexGloRules"],
    data=train_dataset,
    model=model,
    mode="classification",
    params={
        "fidex": {
            "max_iterations": max_iterations,
            "attributes_file": features_filename,
            "min_covering": min_covering,
            "max_failed_attempts": max_failed_attempts,
            "min_fidelity": min_fidelity,
            "lowest_min_fidelity": lowest_fidelity_allowed,
        },
        "fidexGloRules": {
            "heuristic": 1,
            "with_fidexGlo": True,
            "fidexGlo": {
                "attributes_file": features_filename,
                "with_minimal_version": use_minimal_version,
                "max_iterations": max_iterations,
                "min_covering": min_covering,
                "covering_strategy": True,
                "max_failed_attempts": max_failed_attempts,
                "min_fidelity": min_fidelity,
                "lowest_min_fidelity": lowest_fidelity_allowed,
            },
            "attributes_file": features_filename,
            "nb_threads": 4,
            "with_minimal_version": use_minimal_version,
            "max_iterations": max_iterations,
            "min_covering": min_covering,
        },
    },
)

# Pick a random test sample to explain
sample_to_test = test_dataset[randint(0, len(test_dataset)-1)]

local_explanations = explainer.explain(X=sample_to_test)
global_explanations = explainer.explain_global()

dashboard = Dashboard(
    instances=sample_to_test,
    local_explanations=local_explanations,
    global_explanations=global_explanations,
    class_names=features[-2:],
)
dashboard.show()

In the "Local Explanation" tab, you can observe the randomly chosen sample along with the prediction made by the model. The rule explaining the model's decision is displayed on the right. To see details about the rule, hover over the blue bar.

In the "Global Explanation" tab, there is a bar chart with all the computed global rules. They are ranked by their covering size (number of samples covered).
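
To make "covering size" and the `min_fidelity` parameter more concrete, here is a simplified, self-contained sketch. It assumes that a rule is a conjunction of threshold tests on attributes, that its covering size is the number of samples satisfying every test, and that its fidelity is the share of covered samples for which the model's prediction matches the rule's class. The class `Antecedent`, the functions, and all values below are hypothetical illustrations, not the dimlpfidex implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Antecedent:
    """One threshold test of a rule (hypothetical structure)."""
    attribute: str
    greater_or_equal: bool  # True: value >= threshold, False: value < threshold
    threshold: float

    def holds(self, sample: dict) -> bool:
        value = sample[self.attribute]
        if self.greater_or_equal:
            return value >= self.threshold
        return value < self.threshold


def covered_samples(antecedents, samples):
    """Indices of the samples that satisfy every antecedent of the rule."""
    return [i for i, s in enumerate(samples) if all(a.holds(s) for a in antecedents)]


def fidelity(covered, model_predictions, rule_class):
    """Share of covered samples where the model agrees with the rule's class."""
    if not covered:
        return 0.0
    agreeing = sum(1 for i in covered if model_predictions[i] == rule_class)
    return agreeing / len(covered)


# Made-up samples and model predictions, for illustration only.
samples = [
    {"age": 63, "thalachh": 120},
    {"age": 45, "thalachh": 130},
    {"age": 58, "thalachh": 160},
    {"age": 70, "thalachh": 140},
    {"age": 52, "thalachh": 149},
]
model_predictions = ["risk_no", "risk_yes", "risk_yes", "risk_no", "risk_yes"]

# Rule: age >= 50 AND thalachh < 150 -> risk_no
rule = [Antecedent("age", True, 50), Antecedent("thalachh", False, 150)]

covered = covered_samples(rule, samples)
print("covering size:", len(covered))
print("fidelity:", fidelity(covered, model_predictions, "risk_no"))
```

Under these assumptions, this toy rule would be rejected with `min_fidelity = 1.0`, because the model disagrees with it on one of the three covered samples.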

# References

HES-XPLAIN: [website](https://hes-xplain.github.io/), [GitHub page](https://github.com/HES-XPLAIN)

Dataset: [source](https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset)

Dimlpfidex: [GitHub repository](https://github.com/HES-XPLAIN/dimlpfidex), [documentation](https://hes-xplain.github.io/documentation/overview/)

Algorithms: [DimlpBT](https://hes-xplain.github.io/documentation/dimlpfidex/dimlp/dimlpbt/), [Fidex](https://hes-xplain.github.io/documentation/dimlpfidex/fidex/fidex/), [FidexGloRules](https://hes-xplain.github.io/documentation/dimlpfidex/fidex/fidexglorules/)