
Automated Learning for Insightful Comparison and Evaluation (ALICE)

Automated Learning for Insightful Comparison and Evaluation (ALICE) merges conventional feature selection and the concept of inter-rater agreeability in a simple, user-friendly manner to seek insights into black-box Machine Learning models. It currently supports (and has been tested on) Scikit-Learn and Keras models.
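As a rough illustration of the underlying idea (not the ALICE API itself), inter-rater agreeability can be measured by treating two trained models as "raters" of the same test set and computing an agreement statistic such as Cohen's kappa over their predictions. The data and estimator settings below are assumptions for the sketch:

```python
# Illustrative sketch of the inter-rater agreeability idea, not the ALICE API.
# Uses a synthetic binary-classification problem as stand-in data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two "raters": a black-box model and a simpler benchmark.
rfc = RandomForestClassifier(random_state=0).fit(X_train, y_train)
logit = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Agreement between the two models' predictions on the same observations.
kappa = cohen_kappa_score(rfc.predict(X_test), logit.predict(X_test))
print(f"Cohen's kappa between RFC and Logit: {kappa:.3f}")
```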

Authors: Bachana Anasashvili, Vahidin Jeleskovic

Paper: arXiv

Framework Architecture and Experiment Results

ALICE Framework Visualized

The results included in the repository come from three experiments on the Telco Customer Churn dataset:

  • Multi-Layer Perceptron (MLP) vs. Logistic Regression (Logit)
  • Multi-Layer Perceptron (MLP) vs. Random Forest Classifier (RFC)
  • Random Forest Classifier (RFC) vs. Logistic Regression (Logit)

ALICE Experiment Results


Main directory:

Notebooks

  • customer_churn_test.ipynb - Jupyter Notebook for the experiments, with usage demonstrations and instructions
  • results_analysis.ipynb - Jupyter Notebook demonstrating experiment results and plots
  • customer_churn_dataprocessing.ipynb - Jupyter Notebook documenting the data cleaning and manipulation, for transparency

Folders

  • alice - Code modules for the framework
  • clean_data - Saved train-test sets
  • test_results - Saved experiment results
    • test_results/experiment_results_20240301_1/experiment_results_20240301_1.json - MLP vs. Logit Experiment
    • test_results/experiment_results_20240302_1/experiment_results_20240302_1.json - MLP vs. RFC Experiment
    • test_results/experiment_results_20240302_2/experiment_results_20240302_2.json - RFC vs. Logit Experiment
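The exact schema of these JSON files is defined by the framework, so the key structure is not guaranteed; assuming standard JSON, a saved experiment can be inspected along these lines (the path is one of the files listed above):

```python
import json

# Load one of the saved experiment result files. The contents' structure
# is an assumption -- inspect the top-level keys first.
path = "test_results/experiment_results_20240301_1/experiment_results_20240301_1.json"
with open(path) as f:
    results = json.load(f)

print(type(results))
print(list(results)[:10] if isinstance(results, dict) else results[:2])
```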

Files

  • class_telco.pkl - Processed and cleaned Telecom customer churn dataset for classification
  • reg_telco.pkl - Processed and cleaned Telecom customer churn dataset for regression
  • Telco_customer_churn.xlsx - Raw data
  • requirements.txt - Required python libraries and their versions
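Assuming the .pkl files were written with the standard pickle protocol, they can be loaded as shown below; what the pickle actually contains (a DataFrame, a dict of splits, etc.) should be checked after loading:

```python
import pickle

# Load the processed classification dataset. The object type inside the
# pickle is an assumption -- verify it after loading.
with open("class_telco.pkl", "rb") as f:
    class_telco = pickle.load(f)

print(type(class_telco))
```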

Reproducibility:

Note that the results may not be exactly reproducible due to the stochastic nature of neural networks, random forests, and their optimization.
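Fixing the random seeds narrows, but does not eliminate, this run-to-run variation. A typical setup (assuming NumPy, TensorFlow/Keras, and Scikit-Learn are in play) looks like this:

```python
import random

import numpy as np
import tensorflow as tf

SEED = 42
random.seed(SEED)          # Python's built-in RNG
np.random.seed(SEED)       # NumPy (also used internally by Scikit-Learn)
tf.random.set_seed(SEED)   # TensorFlow/Keras weight init and shuffling

# Scikit-Learn estimators additionally take an explicit random_state, e.g.
# RandomForestClassifier(random_state=SEED). Even then, parallelism and
# floating-point non-determinism can leave small differences between runs.
```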

For re-running the experiments, or testing the framework:

  1. Set up a virtual Python environment and install the required packages:
! pip install -r requirements.txt
  2. Run the customer_churn_test.ipynb Notebook.
    • Re-running sections 1, 2, 3, and 4 is mandatory before the experiments can be re-run.
    • The experiments are contained in section 5. Given the computational cost of the models, section 7 includes two simpler models, Logistic Regression and a Decision Tree Classifier, for those who want to quickly test the framework's functionality (see the sketch after this list).
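For a rough sense of what that quicker test involves (the actual calls are in section 7 of the notebook; the hyperparameters here are assumptions), the two cheap benchmark models can be built directly with Scikit-Learn:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Two computationally cheap raters for a fast end-to-end test of the
# framework. Hyperparameters are illustrative, not those in the notebook.
logit = LogisticRegression(max_iter=1000)
dtc = DecisionTreeClassifier(max_depth=5, random_state=42)
# Both can then be passed to the feature-selection / agreeability routines
# demonstrated in customer_churn_test.ipynb.
```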
