GitHub - SquareResearchCenter-AI/BEExAI: Benchmark to Evaluate EXplainable AI

BEExAI

Benchmark to Evaluate EXplainable AI
Explore the docs »

Report Bug · Request Feature

Table of Contents

About The Project
Getting Started
- Prerequisites
- Installation
Usage
Technical aspects
- Content description
- Supported models
- Supported explainability methods
- Implemented metrics
- Disclaimer
Contributing
License
Contact
Acknowledgments

About The Project

This project provides simple tools to benchmark multiple explainable AI methods on multiple machine learning models with customizable datasets, and to compute metrics that evaluate these methods.


Getting Started

Prerequisites

The project is written entirely in Python 3.9 and was tested on 64-bit Windows 11. Both CPU and GPU are supported with PyTorch 2.0.1.

Installation

BEExAI can be installed from PyPI with:

pip install beexai

You can also install the project from source using:

Clone the repo

git clone https://github.com/SquareResearchCenter-AI/BEExAI.git

Install the requirements

cd BEExAI
pip install -r requirements.txt


Usage

Set up a config

To train a model and compute explanation attributions and evaluation metrics on tabular data, you need a config file for each dataset. Several examples are provided in config/ in the following format:

path: "data/my_dataset.csv"
target_col: "class"
datetime_cols: 
    - "date"
cols_to_delete:
    - "ID"
cleaned_data_path: "output/data/my_dataset_cleaned.csv"
task: "classification"

The options are as follows:

path: path to the dataset, usually placed in a data/ folder
target_col: target column for training
datetime_cols: columns in a datetime format, which are split into several integer columns (year, month, day, hour)
cols_to_delete: columns to drop (for example ID columns)
cleaned_data_path: path where the preprocessed dataset is saved for repeated use, usually in output/data
task: classification or regression
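
As a rough illustration of what datetime_cols and cols_to_delete describe, the sketch below applies both options to a single row using only the standard library. The preprocess_row helper and the generated column names (date_year, date_month, and so on) are assumptions made for this example; they are not BEExAI's actual implementation or naming.

```python
from datetime import datetime

# Hypothetical config values mirroring the example above (not read from a real file).
config = {
    "target_col": "class",
    "datetime_cols": ["date"],
    "cols_to_delete": ["ID"],
}


def preprocess_row(row, config):
    """Illustrative only: drop unwanted columns and expand datetime columns."""
    out = {}
    for name, value in row.items():
        if name in config["cols_to_delete"]:
            continue  # e.g. ID columns carry no predictive signal
        if name in config["datetime_cols"]:
            dt = datetime.fromisoformat(value)
            # Column naming below is an assumption for this sketch.
            out[f"{name}_year"] = dt.year
            out[f"{name}_month"] = dt.month
            out[f"{name}_day"] = dt.day
            out[f"{name}_hour"] = dt.hour
        else:
            out[name] = value
    return out


row = {"ID": 17, "date": "2023-05-04 13:30:00", "amount": 250.0, "class": 1}
cleaned = preprocess_row(row, config)
print(cleaned)
```

The ID column disappears and the single date column becomes four integer columns, which is the shape of data a tabular model can consume directly.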

Other operations, such as deriving new columns from existing ones or deleting specific values, must be done when instantiating the dataset in the notebooks or scripts.

Notebooks

Several notebooks are available in notebooks/ for simple use cases:

The numbered series can be run in order with your own dataset or with the provided examples (the kickstarter and boston-credit datasets).
all_in_one.ipynb combines the 3 notebooks into a single one, without the detailed explanations.

Load data and train model

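import numpy as np

from beexai.dataset.dataset import Dataset
from beexai.dataset.load_data import load_data
from beexai.training.train import Trainer

DATA_NAME = "configname"  # name of the config file in config/, without extension
MODEL_NAME = "NeuralNetwork"
CONFIG_PATH = f"config/{DATA_NAME}.yml"

# Load the raw dataset described by the config, then split it
df, target_col, task, _ = load_data(from_cleaned=False, config_path=CONFIG_PATH)
data = Dataset(df, target_col)
X_train, X_test, y_train, y_test = data.get_train_test()

# Assumption: output_dim is the number of distinct classes for classification
# and 1 for regression.
num_labels = len(np.unique(y_train)) if task == "classification" else 1
NN_PARAMS = {"input_dim": X_train.shape[1], "output_dim": num_labels}
trainer = Trainer(MODEL_NAME, task, NN_PARAMS)
trainer.train(X_train, y_train)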

Compute explainability metrics

from beexai.explaining import CaptumExplainer
from beexai.metrics.get_results import get_all_metrics

METHOD = "IntegratedGradients"
exp = CaptumExplainer(trainer.model,task=task,method=METHOD,sklearn=False)
exp.init_explainer()

LABEL=0

get_all_metrics(X_test.values,LABEL,trainer.model,exp)

For more examples, please refer to the Documentation


Download datasets

The datasets used in this benchmark come from several OpenML suites.

Those from the paper "Why do tree-based models still outperform deep learning on typical tabular data?" are the suites with IDs 297, 298, 299 and 304.

Those for multiclass classification come from tasks 12, 14, 16, 18, 22, 23, 28 and 32.

A simplified script that downloads them with the OpenML API and creates their configuration files is available in the root folder.

python openml_download.py
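
For a downloaded dataset, a config file in the format described earlier could be generated along these lines. make_config_text is a hypothetical helper written for this illustration; it is not the code in openml_download.py, and the data/ and output/data/ locations simply follow the conventions mentioned in the config section.

```python
def make_config_text(name, target_col, task, datetime_cols=(), cols_to_delete=()):
    """Render a config in the format shown in the config section (illustrative helper)."""

    def yaml_list(key, values):
        # An empty YAML list is written inline; otherwise one indented item per line.
        if not values:
            return [f"{key}: []"]
        return [f"{key}:"] + [f'    - "{v}"' for v in values]

    lines = [f'path: "data/{name}.csv"', f'target_col: "{target_col}"']
    lines += yaml_list("datetime_cols", datetime_cols)
    lines += yaml_list("cols_to_delete", cols_to_delete)
    lines.append(f'cleaned_data_path: "output/data/{name}_cleaned.csv"')
    lines.append(f'task: "{task}"')
    return "\n".join(lines) + "\n"


# Example values are placeholders, not a real OpenML dataset.
text = make_config_text("my_dataset", "class", "classification", ["date"], ["ID"])
print(text)
```

Writing the returned text to config/my_dataset.yml would give a file that matches the example config shown earlier.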

Run benchmarks

Benchmarks can be run with the benchmetrics.py script, which takes several arguments:

python benchmetrics.py --config_path config_folder --save_path output/my_benchmark --seed 42 --n_sample 1000

For comparison with the benchmarks in the benchmark_results folder, we used 1000 samples from the test set.
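
To make the role of --seed and --n_sample concrete, here is a minimal sketch of how such arguments could be parsed and used to draw a reproducible subset of the test set. It only mirrors the flags shown in the command above; build_parser, sample_test_indices, and the default values are assumptions for illustration and do not reproduce benchmetrics.py.

```python
import argparse
import random


def build_parser():
    """Parser mirroring the flags of the command above (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Illustrative benchmark CLI")
    parser.add_argument("--config_path", type=str, required=True)
    parser.add_argument("--save_path", type=str, default="output/my_benchmark")
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--n_sample", type=int, default=1000)
    return parser


def sample_test_indices(n_test, n_sample, seed):
    """Pick at most n_sample distinct test-row indices, reproducibly for a given seed."""
    rng = random.Random(seed)
    n = min(n_sample, n_test)
    return sorted(rng.sample(range(n_test), n))


args = build_parser().parse_args(
    ["--config_path", "config_folder", "--seed", "42", "--n_sample", "1000"]
)
# n_test=5000 is a made-up test-set size for the example.
indices = sample_test_indices(n_test=5000, n_sample=args.n_sample, seed=args.seed)
print(len(indices))
```

Fixing the seed means the same rows are evaluated on every run, which is what makes results comparable across explainability methods.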

Technical aspects

Content description

benchmark_results: Complete benchmark results from our paper (insert_link), averaged over 5 random seeds
config: Config files holding basic information about your data. More complex operations on your data must be done directly in the notebooks or scripts
data: boston and kickstarter datasets from Kaggle
notebooks: Simple use cases in notebook format
output: Store outputs such as cleaned datasets, saved models and computed attributions
src: Python scripts with main classes

Supported models

Linear Regression, Logistic Regression
Random Forest
Decision Tree
Gradient Boosting
XGBoost
Dense Neural Network

Supported explainability methods

Perturbation based: FeatureAblation, Lime, ShapleyValueSampling, KernelShap
Gradient based: Integrated Gradients, Saliency, DeepLift, InputXGradient
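
To give an intuition for the gradient-based family, the following sketch approximates Integrated Gradients for a toy function with a midpoint Riemann sum. It is a standalone illustration of the general technique in pure Python, not Captum's implementation; the function f, its gradient grad_f, and the all-zero baseline are assumptions chosen for the example.

```python
def integrated_gradients(grad_fn, x, baseline, n_steps=50):
    """Approximate integrated gradients along the straight path baseline -> x."""
    dim = len(x)
    avg_grad = [0.0] * dim
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        for i in range(dim):
            avg_grad[i] += grad[i] / n_steps
    # Attribution = (input - baseline) * average gradient along the path
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


def f(x):
    """Toy model: quadratic in the first feature, linear in the second."""
    return x[0] ** 2 + 3.0 * x[1]


def grad_f(x):
    """Analytical gradient of f."""
    return [2.0 * x[0], 3.0]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
print(attributions, sum(attributions), f(x) - f(baseline))
```

The attributions sum to f(x) - f(baseline) up to numerical error, which is the completeness property that distinguishes Integrated Gradients from plain Saliency.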

Implemented metrics

Robustness: Sensitivity
Faithfulness: Infidelity, Comprehensiveness, Sufficiency, Faithfulness Correlation, AUC-TP, Monotonicity
Complexity: Complexity, Sparseness
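
The two complexity metrics can be sketched as follows, using definitions commonly found in the explainability literature: Complexity as the entropy of the normalized absolute attributions, and Sparseness as their Gini index. This is a sketch under those assumptions; BEExAI's own implementation may differ in normalization or other details.

```python
import math


def complexity(attributions):
    """Entropy of the normalized absolute attributions (lower = simpler explanation)."""
    total = sum(abs(a) for a in attributions)
    if total == 0:
        return 0.0
    probs = [abs(a) / total for a in attributions]
    return -sum(p * math.log(p) for p in probs if p > 0)


def sparseness(attributions):
    """Gini index of the absolute attributions (higher = sparser explanation)."""
    values = sorted(abs(a) for a in attributions)  # ascending order
    total = sum(values)
    d = len(values)
    if total == 0:
        return 0.0
    weighted = sum(
        (v / total) * ((d - k + 0.5) / d) for k, v in enumerate(values, start=1)
    )
    return 1.0 - 2.0 * weighted


uniform = [1.0, 1.0, 1.0, 1.0]        # importance spread evenly over 4 features
concentrated = [0.0, 0.0, 0.0, 5.0]   # all importance on a single feature

print(complexity(uniform), sparseness(uniform))
print(complexity(concentrated), sparseness(concentrated))
```

An explanation that spreads importance evenly has maximal entropy and zero Gini index, while one that relies on a single feature has zero entropy and a Gini index approaching 1 as the number of features grows.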

Disclaimer

The proposed pipeline might not cover every possible customization (especially for data preprocessing); feel free to add your own processing within the example notebooks.


Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request


License
