AutoML library for automatic uplift modeling
- Description
- Key Features
- Installation
- Quick Start
- Project Structure
- License
AUF (Automatic Uplift Framework) is an AutoML library that provides a complete pipeline for building uplift models. The library automates all stages: data validation, statistical significance testing of treatment effects, feature selection and ranking, model training (S-, T-, X-learners, uplift trees and forests), optimal model selection, and generation of detailed quality analytics with visualization.
AUF supports multi-treatment experiments and integrates with MLflow for experiment tracking, which makes it suitable for both rapid prototyping and production-ready solutions for personalized interventions.
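To make the significance-testing stage concrete, here is a minimal sketch of a percentile bootstrap for the difference in conversion rates between the treatment and control groups. It uses only the Python standard library and is not AUF's implementation. The function name bootstrap_uplift_ci, its parameters, and the toy data are assumptions made for illustration.

```python
import random


def bootstrap_uplift_ci(treated, control, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(treated) - mean(control).

    Hypothetical helper for illustration only, not part of AUF.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        diffs.append(sum(t) / len(t) - sum(c) / len(c))
    diffs.sort()
    lower = diffs[int(n_boot * alpha / 2)]
    upper = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Toy binary outcomes: 30% conversion with treatment, 20% without
treated = [1] * 300 + [0] * 700
control = [1] * 200 + [0] * 800

lower, upper = bootstrap_uplift_ci(treated, control)
# The effect is treated as significant if the interval excludes zero
significant = lower > 0 or upper < 0
print(f"95% CI for uplift: [{lower:.3f}, {upper:.3f}], significant: {significant}")
```

If the interval contained zero, building an uplift model on this sample would be hard to justify, which is why a check of this kind usually comes before training.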
- Complete AutoML pipeline from raw data to production-ready model
- Statistical effect validation via bootstrap significance testing
- Automatic feature selection with 5 ranking strategies (filter, importance, permutation, stepwise, straightforward)
- Support for the major uplift methods: S-Learner, T-Learner, X-Learner, uplift trees, and uplift forests
- Comprehensive visualization: Qini curves, uplift curves, conversion by buckets, discrete and continuous plots
- MLflow integration for automatic logging of metrics, artifacts, and models
- Multi-treatment support
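As a rough illustration of what the Qini curve listed above measures, the sketch below computes it in plain Python using the common definition: for each top-k prefix of objects sorted by predicted uplift, the number of positive outcomes in the treated group minus the number in the control group, scaled to the treated group's size. The function name qini_curve and the toy inputs are assumptions. AUF's own metrics and plots may differ in detail.

```python
def qini_curve(scores, treatment, target):
    """Return Qini curve values for prefixes of size 0..n.

    Hypothetical helper for illustration only, not part of AUF.
    scores: predicted uplift, treatment: 0/1 flags, target: 0/1 outcomes.
    """
    # Rank objects from highest to lowest predicted uplift
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_t = n_c = y_t = y_c = 0
    curve = [0.0]
    for i in order:
        if treatment[i] == 1:
            n_t += 1
            y_t += target[i]
        else:
            n_c += 1
            y_c += target[i]
        # Scale control responders to the treated group size seen so far
        scaled_control = y_c * n_t / n_c if n_c else 0.0
        curve.append(y_t - scaled_control)
    return curve


scores = [0.9, 0.8, 0.2, 0.1]
treatment = [1, 0, 1, 0]
target = [1, 0, 0, 1]
curve = qini_curve(scores, treatment, target)
print(curve)
```

A model that ranks well pushes this curve above the straight line that joins its first and last points. The area between the two is the usual scalar summary.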
Standard installation from PyPI:
# Create a new virtual environment (highly recommended)
conda create -n auf_env python=3.8 -y
conda activate auf_env
# Install AUF
pip install auf
Installation from source:
git clone https://github.com/Alfa-Advanced-Analytics/auf.git
cd auf
pip install -e .

from auf.pipeline import UpliftPipeline
# Initialize pipeline
pipeline = UpliftPipeline(
    print_doc=False,
    task_name_mlflow='test_auf',
    run_description='Testing AUF library',
)
# Load data with ID, target, treatment, and feature columns
df = load_your_data()
# Map the unified base column names (keys) to the column names used in your data (values)
base_cols_mapper = {
    'id': 'id',
    'treatment': 'treatment',
    'target': 'target',
    'segm': None
}
# Map the treatment group names used in your data to the unified labels (0 and 1)
treatment_groups_mapper = {
    'control': 0,
    'treatment': 1
}
# Load the data into the pipeline
pipeline.load_sample(
    df,
    base_cols_mapper,
    treatment_groups_mapper
)
# Run full pipeline
pipeline.run()
# All results are:
# 1) saved to MLflow (if configured)
# 2) plotted by the pipeline as it runs

Important: the DataFrame df must already contain the ID, target, and treatment columns; their names are passed to the pipeline via the mapping dictionaries.
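The requirement above can be checked before the data is handed to the pipeline. The sketch below is a hypothetical pre-flight helper, not part of AUF. It assumes that a None value in base_cols_mapper means "column not provided" (as with 'segm' in the Quick Start) and that every treatment label found in the data needs an entry in treatment_groups_mapper.

```python
def check_sample(columns, treatment_values, base_cols_mapper, treatment_groups_mapper):
    """Return a list of problems; an empty list means the inputs look consistent.

    Hypothetical helper for illustration only, not part of AUF.
    """
    problems = []
    columns = set(columns)
    for unified_name, user_name in base_cols_mapper.items():
        if user_name is None:
            # Assumed to mean the optional column is not used
            continue
        if user_name not in columns:
            problems.append(
                f"column {user_name!r} (mapped from {unified_name!r}) is missing"
            )
    # Every treatment label in the data should have a unified code
    unmapped = set(treatment_values) - set(treatment_groups_mapper)
    if unmapped:
        problems.append(f"unmapped treatment labels: {sorted(map(str, unmapped))}")
    return problems


base_cols_mapper = {'id': 'id', 'treatment': 'treatment', 'target': 'target', 'segm': None}
treatment_groups_mapper = {'control': 0, 'treatment': 1}

ok = check_sample(
    ['id', 'treatment', 'target', 'age'],
    ['control', 'treatment'],
    base_cols_mapper,
    treatment_groups_mapper,
)
bad = check_sample(
    ['id', 'treatment', 'age'],
    ['control', 'treatment', 'pilot'],
    base_cols_mapper,
    treatment_groups_mapper,
)
print(ok)
print(bad)
```

With a pandas DataFrame, the first two arguments would typically be df.columns and the unique values of the treatment column.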
auf/
├── __init__.py
├── constants/
│   ├── # Predefined metrics and parameters
│   ├── __init__.py
│   ├── metrics.py
│   └── numbers.py
├── data/
│   ├── # Data validation and preprocessing
│   ├── __init__.py
│   ├── checks.py
│   ├── preprocessing.py
│   └── split.py
├── feature_rankers/
│   ├── # Feature ranking strategies
│   ├── __init__.py
│   ├── filter.py
│   ├── importance.py
│   ├── permutation.py
│   ├── stepwise.py
│   └── straightforward.py
├── log/
│   ├── # Logging and progress tracking
│   ├── __init__.py
│   └── log.py
├── metrics/
│   ├── # Custom uplift metrics
│   ├── __init__.py
│   ├── averaged.py
│   ├── by_top.py
│   └── overfit.py
├── ml_flow/
│   ├── # MLflow integration
│   ├── __init__.py
│   └── ml_flow.py
├── models/
│   ├── # Uplift model implementations
│   ├── __init__.py
│   ├── auf_forest.py
│   ├── auf_model.py
│   ├── auf_tree.py
│   └── auf_x_learner.py
├── pipeline/
│   ├── # Main pipeline and components
│   ├── __init__.py
│   ├── calibration.py
│   ├── evaluation.py
│   ├── inference.py
│   └── pipeline.py
├── plots/
│   ├── # Result visualization
│   ├── __init__.py
│   └── plots.py
└── training/
    ├── # Training and optimization
    ├── __init__.py
    ├── fitting.py
    ├── gridsearch.py
    └── model_generation.py
This project is licensed under the MIT License. See the LICENSE.txt file for details.