autoxgboostMC - Multiple-Criteria tuning and fitting of xgboost models.
Status
The software is still heavily under construction!
Examples of how to use the software can be found in the vignette!
Installing the development version
```r
# Install requirements
install.packages("devtools")
devtools::install_github("compstat-lmu/paper_2019_iml_measures")
devtools::install_github("johnmyleswhite/log4r")
devtools::install_github("mlr-org/mlrMBO")
# Install package
devtools::install_github("pfistfl/autoxgboostMC")
```
autoxgboost aims to find an optimal xgboost model automatically using the machine learning framework mlr and the Bayesian optimization framework mlrMBO.
Work in progress!
AutoxgboostMC embraces R6 for a cleaner design.
See the example code below for the new API.
First we split our data into train and test.
```r
library(mlr)

# Random train/test split of the Pima Indians Diabetes task
set.seed(1)  # for reproducibility
train = sample(c(TRUE, FALSE), getTaskSize(pid.task), replace = TRUE)
task_train = subsetTask(pid.task, subset = train)
task_test = subsetTask(pid.task, subset = !train)
```
Then we start the AutoML process:
```r
# Instantiate the object with a list of measures to optimize.
axgb = AutoxgboostMC$new(task_train, measures = list(auc, timepredict))
# Set hyperparameters (we want to work on two threads).
axgb$nthread(2L)
# Fit for 15 seconds.
axgb$fit(time_budget = 15L)
```
After searching and finding a good model, we can use the best model to predict:

```r
p = axgb$predict(task_test)
```
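Assuming `p` is a standard mlr `Prediction` object (the source does not confirm the return type), the test-set performance can then be computed with mlr's `performance()`:

```r
library(mlr)

# Evaluate the held-out prediction; auc is one of the measures
# optimized above, so this checks generalization directly.
performance(p, measures = list(auc))
```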
Several options are available for plotting:
```r
axgb$plot_opt_path()
axgb$plot_opt_result()
axgb$plot_pareto_front()
```
AutoxgboostMC currently searches over and optimizes the following pipeline:

```
fix_factors %>% (impact_encoding | dummy_encoding) %>% drop_constant_feats %>% learner %>% tune_threshold
```
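The operator names above follow mlrCPO's naming scheme, so as a rough sketch (function names assumed to come from the mlrCPO package; the actual internal pipeline may be wired differently), such a preprocessing chain could be composed like this:

```r
library(mlr)
library(mlrCPO)

# Sketch of the searched pipeline: fix factor levels, encode categorical
# features (impact or dummy encoding, selected by the tuner via a
# multiplexer), drop constant features, then attach the xgboost learner.
pipeline = cpoFixFactors() %>>%
  cpoMultiplex(id = "encoding",
               cpos = list(cpoImpactEncodeClassif(), cpoDummyEncode())) %>>%
  cpoDropConstants() %>>%
  makeLearner("classif.xgboost")
```

The multiplexer exposes the encoding choice as a tunable hyperparameter, which matches the `impact_encoding | dummy_encoding` branch in the pipeline above.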
To be added:
- Categorical Encoding using mixed models
- Imputation
- Fairness Post-Processing
Eventually:
- Ensemble Stacking
- Model Compression
The Automatic Gradient Boosting framework was presented at the ICML/IJCAI-ECAI 2018 AutoML Workshop (poster).
Please cite our ICML AutoML workshop paper on arXiv.
You can get citation info via `citation("autoxgboost")`
or copy the following BibTeX entry:
```
@inproceedings{autoxgboost,
  title={Automatic Gradient Boosting},
  author={Thomas, Janek and Coors, Stefan and Bischl, Bernd},
  booktitle={International Workshop on Automatic Machine Learning at ICML},
  year={2018}
}
```