
AMLBID: An auto-explained Automated Machine Learning tool for Big Industrial Data. To cite this Original Software Publication: https://www.sciencedirect.com/science/article/pii/S2352711021001631


ElsevierSoftwareX/SOFTX-D-21-00111

 
 


Transparent and Auto-explainable AutoML


AMLBID

AMLBID stands for Automating Machine-Learning model selection and configuration with Big Industrial Data.

Currently, AMLBID is a Python package that implements a meta-learning-based framework for automating algorithm selection and hyperparameter tuning in supervised machine learning. Being meta-learning based, the framework can simulate the role of a machine learning expert and act as a decision support system. In particular, AMLBID is presented as the first complete, transparent, and auto-explainable AutoML system: it recommends the most adequate ML configuration for the problem at hand, explains the rationale behind the recommendation, and analyzes the predictive results in an interpretable and faithful manner through an interactive multi-view artifact.

AMLBID is an interactive and user-guided framework for improving the utility and usability of the AutoML process with the following main features:

  • The framework provides end users (Industry 4.0 actors and researchers) with a user-friendly control panel that helps non-technical users and domain experts (e.g., physicians, researchers) overcome the challenges of building and configuring predictive machine-learning models according to their own preferences.

  • It is the first framework to automate the building and configuration of predictive machine-learning models for big industrial data.

  • The framework includes a recommendation engine that suggests the most appropriate pipelines (classifiers with their hyperparameter configurations) using a collaborative knowledge base that grows over time as more users use the tool.

  • AMLBID automates the most tedious part of machine learning by intelligently exploring more than 3,000,000 possible pipelines to find the best one for your data in a negligible amount of time and without requiring a large computational budget.

  • It automatically selects ML algorithms and hyperparameter configurations for a given machine-learning problem more quickly than current methods, with a computational complexity near O(1).

  • It provides a multi-level interactive visualization artifact that makes it easier to inspect how the models work and perform, addressing the problem of trusting a "black box".
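
The recommendation idea in the list above can be illustrated with a toy sketch. It is not AMLBID's actual implementation: the meta-features, the knowledge-base entries, and the `recommend_pipeline` helper are all hypothetical. The sketch describes a new dataset by a small vector of meta-features and returns the pipeline stored with the most similar dataset already in the knowledge base, so no candidate pipeline is trained at recommendation time.

```python
import math

# Hypothetical knowledge base: meta-features of past datasets
# (n_samples, n_features, n_classes) and the pipeline that performed
# best on each one. All values are made up for illustration.
KNOWLEDGE_BASE = [
    {"meta": (1000, 20, 2), "pipeline": ("RandomForest", {"n_estimators": 200})},
    {"meta": (150, 4, 3), "pipeline": ("DecisionTree", {"max_depth": 5})},
    {"meta": (50000, 100, 2), "pipeline": ("SGDClassifier", {"alpha": 1e-4})},
]


def meta_features(rows):
    """Describe a dataset by (n_samples, n_features, n_classes).

    Each row holds the feature values followed by the class label.
    """
    n_samples = len(rows)
    n_features = len(rows[0]) - 1
    n_classes = len({row[-1] for row in rows})
    return (n_samples, n_features, n_classes)


def distance(a, b):
    """Euclidean distance between log-scaled meta-feature vectors."""
    return math.sqrt(sum((math.log1p(x) - math.log1p(y)) ** 2 for x, y in zip(a, b)))


def recommend_pipeline(rows):
    """Return the pipeline of the most similar dataset in the knowledge base."""
    query = meta_features(rows)
    nearest = min(KNOWLEDGE_BASE, key=lambda entry: distance(query, entry["meta"]))
    return nearest["pipeline"]


# Toy dataset: 120 rows, 4 features, class label in the last column.
dataset = [
    [5.1, 3.5, 1.4, 0.2, "a"],
    [6.2, 2.9, 4.3, 1.3, "b"],
    [7.7, 3.0, 6.1, 2.3, "c"],
] * 40

print(recommend_pipeline(dataset))
```

In this sketch the lookup cost depends on the size of the knowledge base rather than on the number of candidate pipelines, which is one plausible reading of the near-constant-time claim above.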

--


Usage

The framework will help you with:

  • Explaining and understanding your data.
  • Automating the algorithm selection and hyperparameter tuning process.
  • Providing analysis reports with details about all models (automatic explanation).
  • Interactively inspecting the inner workings of the models without depending on a data scientist to generate tables and plots.
  • Providing guidance, when AutoML returns unsatisfying results, on how to improve predictive performance.
  • Increasing the transparency, controllability, and acceptance of AutoML.

It has two built-in modes of work:

  • Recommender mode, for recommending and building highly tuned ML pipelines to use in production.
  • Recommender_Explainer mode, which allows users to inspect the recommended model's inner workings and decision-generation process at several levels of explanation, such as feature importances, feature contributions to individual predictions, "what if" analysis, SHAP (interaction) values, visualisation of individual decision trees, and hyperparameter importance and correlation.
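
One of the explanation levels named above, feature importance, can be illustrated with permutation importance. This is a generic technique chosen for illustration; the source does not say how AMLBID computes importances. The `toy_model` classifier and the synthetic data below are hypothetical.

```python
import random


def toy_model(row):
    """Hypothetical stand-in for a fitted classifier: it uses only feature 0."""
    return 1 if row[0] > 0.5 else 0


def accuracy(predict, X, y):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, feature, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    drops = []
    for _ in range(n_repeats):
        column = [row[feature] for row in X]
        rng.shuffle(column)
        # Rebuild every row with the shuffled value in place of the original.
        shuffled = [
            row[:feature] + [value] + row[feature + 1:]
            for row, value in zip(X, column)
        ]
        drops.append(baseline - accuracy(predict, shuffled, y))
    return sum(drops) / n_repeats


# Synthetic data: two random features, labels depend on feature 0 only.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]

for feature in range(2):
    print(feature, round(permutation_importance(toy_model, X, y, feature), 3))
```

Shuffling feature 0 breaks the link between inputs and labels, so its importance is clearly positive, while feature 1 is ignored by the model and its importance is zero.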

Currently, AMLBID supports 8 Scikit-Learn classification algorithms: AdaBoost, Support Vector Classifier, Extra Trees, Gradient Boosting, Decision Tree, Logistic Regression, Random Forest, and Stochastic Gradient Descent Classifier.
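
For orientation, the names above most likely correspond to the scikit-learn classes below. The mapping is an assumption based on the algorithm names, not taken from the AMLBID source, and `load_classifier` is a hypothetical helper.

```python
import importlib

# Assumed mapping from the algorithm names listed above to scikit-learn
# class paths. AMLBID's internal naming may differ.
SUPPORTED_CLASSIFIERS = {
    "AdaBoost": "sklearn.ensemble.AdaBoostClassifier",
    "Support Vector Classifier": "sklearn.svm.SVC",
    "Extra Trees": "sklearn.ensemble.ExtraTreesClassifier",
    "Gradient Boosting": "sklearn.ensemble.GradientBoostingClassifier",
    "Decision Tree": "sklearn.tree.DecisionTreeClassifier",
    "Logistic Regression": "sklearn.linear_model.LogisticRegression",
    "Random Forest": "sklearn.ensemble.RandomForestClassifier",
    "Stochastic Gradient Descent Classifier": "sklearn.linear_model.SGDClassifier",
}


def load_classifier(name):
    """Import and return the scikit-learn class for a supported name.

    scikit-learn is imported lazily, so it is only required when this
    function is called.
    """
    module_path, class_name = SUPPORTED_CLASSIFIERS[name].rsplit(".", 1)
    return getattr(importlib.import_module(module_path), class_name)


# Example (requires scikit-learn):
#   model = load_classifier("Decision Tree")(criterion="entropy")
```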

Installation

AMLBID is built on top of several existing Python libraries. Most of the necessary Python packages can be installed via the PyPI package index or the Anaconda Python distribution.

# Install additional Python requirements
pip install -r requirements.txt

Finally, to install AMLBID itself along with its required dependencies, run the following command:

# Install AMLBID
pip install AMLBID
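
To confirm that the installation succeeded, a quick check such as the following can be used. It relies only on the standard library; `is_installed` is a hypothetical helper, and the import name `AMLBID` is taken from the usage examples in this README.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


print("AMLBID installed:", is_installed("AMLBID"))
```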

Examples of use

A working example is deployed in: AMLBID

Mode Recommender:

Below is a minimal working example of the Recommender mode.

from AMLBID.Recommender import AMLBID_Recommender
from AMLBID.Explainer import AMLBID_Explainer
from AMLBID.loader import load_data

# Load the dataset and its train/test split
Data, X_train, Y_train, X_test, Y_test = load_data("TestData.csv")

# Generate the optimal configuration according to a desired predictive metric
AMLBID = AMLBID_Recommender.recommend(Data, metric="Accuracy", mode="Recommender")

# Train the recommended pipeline and evaluate it on the held-out test set
AMLBID.fit(X_train, Y_train)
print("obtained score:", AMLBID.score(X_test, Y_test))

The corresponding Python code of the recommended pipeline is exported to the Recommended_pipeline.py file and should look similar to the following. Note that the import statements are generated automatically and dynamically according to the recommended ML pipeline.

import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

data = pd.read_csv("Evaluation/Dataset.csv")

X = data.drop('class', axis=1)
Y = data['class']

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=42)

model= DecisionTreeClassifier(criterion='entropy', max_features=0.5672564318672457,
                       min_samples_leaf=5, min_samples_split=20)
                       
model.fit(X_train, Y_train)

Y_pred = model.predict(X_test)
score = model.score(X_test, Y_test)

print(classification_report(Y_test, Y_pred))
print(' Pipeline test accuracy:  %.3f' % score)

Mode Recommender_Explainer:

Below is a minimal working example of the Recommender_Explainer mode.

from AMLBID.Recommender import AMLBID_Recommender
from AMLBID.Explainer import AMLBID_Explainer
from AMLBID.loader import load_data

# Load the dataset and its train/test split
Data, X_train, Y_train, X_test, Y_test = load_data("TestData.csv")

# Generate the optimal configurations according to a desired predictive metric
AMLBID, Config = AMLBID_Recommender.recommend(Data, metric="Accuracy", mode="Recommender_Explainer")
AMLBID.fit(X_train, Y_train)

# Generate and launch the interactive explanatory dashboard
Explainer = AMLBID_Explainer.explain(AMLBID, Config, X_test, Y_test)
Explainer.run()

Demonstration of the explanatory artifact:

https://github.com/LeMGarouani/AMLBID/blob/main/media/Demo.gif

AMLBID was developed in the LISIC Lab at the ULCO University with funding from the ULCO, HESTIM, and CNRST.
