autoxgboost - Automatic tuning and fitting of xgboost.


Install the development version:

    devtools::install_github("ja-thomas/autoxgboost")

General overview

autoxgboost aims to find an optimal xgboost model fully automatically, using the machine learning framework mlr and the Bayesian optimization framework mlrMBO.
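A minimal usage sketch (the task setup uses mlr's built-in iris data; the termination budget and the exact `autoxgboost()` arguments shown here are illustrative assumptions, not a definitive API reference):

```r
library(autoxgboost)
library(mlr)
library(mlrMBO)

# Define a classification task on the built-in iris data
iris.task = makeClassifTask(data = iris, target = "Species")

# Keep the MBO budget small so the example finishes quickly
ctrl = makeMBOControl()
ctrl = setMBOControlTermination(ctrl, iters = 5L)

# Tune hyperparameters and fit an xgboost model automatically
res = autoxgboost(iris.task, control = ctrl)
print(res)
```

The result object contains the best hyperparameter configuration found by mlrMBO together with the final fitted xgboost model.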

Work in progress!

Benchmark

| Name            | Factors | Numerics | Classes | Train instances | Test instances |
|-----------------|--------:|---------:|--------:|----------------:|---------------:|
| Dexter          |  20 000 |        0 |       2 |             420 |            180 |
| GermanCredit    |      13 |        7 |       2 |             700 |            300 |
| Dorothea        | 100 000 |        0 |       2 |             805 |            345 |
| Yeast           |       0 |        8 |      10 |           1 038 |            446 |
| Amazon          |  10 000 |        0 |      49 |           1 050 |            450 |
| Secom           |       0 |      591 |       2 |           1 096 |            471 |
| Semeion         |     256 |        0 |      10 |           1 115 |            478 |
| Car             |       6 |        0 |       4 |           1 209 |            519 |
| Madelon         |     500 |        0 |       2 |           1 820 |            780 |
| KR-vs-KP        |      37 |        0 |       2 |           2 237 |            959 |
| Abalone         |       1 |        7 |      28 |           2 923 |          1 254 |
| Wine Quality    |       0 |       11 |      11 |           3 425 |          1 469 |
| Waveform        |       0 |       40 |       3 |           3 500 |          1 500 |
| Gisette         |   5 000 |        0 |       2 |           4 900 |          2 100 |
| Convex          |       0 |      784 |       2 |           8 000 |         50 000 |
| Rot. MNIST + BI |       0 |      784 |      10 |          12 000 |         50 000 |

Datasets used for the comparison benchmark of autoxgboost, Auto-WEKA and auto-sklearn.

| Dataset         | baseline | autoxgboost | Auto-WEKA | auto-sklearn |
|-----------------|---------:|------------:|----------:|-------------:|
| Dexter          |    52.78 |       12.22 |      7.22 |     **5.56** |
| GermanCredit    |    32.67 |       27.67 |     28.33 |    **27.00** |
| Dorothea        |     6.09 |    **5.22** |      6.38 |         5.51 |
| Yeast           |    68.99 |   **38.88** |     40.45 |        40.67 |
| Amazon          |    99.33 |       26.22 |     37.56 |    **16.00** |
| Secom           |     7.87 |    **7.87** |  **7.87** |     **7.87** |
| Semeion         |    92.45 |        8.38 |  **5.03** |         5.24 |
| Car             |    29.15 |        1.16 |      0.58 |     **0.39** |
| Madelon         |    50.26 |       16.54 |     21.15 |    **12.44** |
| KR-vs-KP        |    48.96 |        1.67 |  **0.31** |         0.42 |
| Abalone         |    84.04 |       73.75 | **73.02** |        73.50 |
| Wine Quality    |    55.68 |   **33.70** | **33.70** |        33.76 |
| Waveform        |    68.80 |       15.40 | **14.40** |        14.93 |
| Gisette         |    50.71 |        2.48 |      2.24 |     **1.62** |
| Convex          |    50.00 |       22.74 |     22.05 |    **17.53** |
| Rot. MNIST + BI |    88.88 |       47.09 |     55.84 |    **46.92** |

Benchmark results are the median percent error across 100 000 bootstrap samples (drawn from 25 runs), simulating 4 parallel runs. Bold numbers indicate the best-performing algorithm(s) on each dataset.
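The aggregation scheme described above can be sketched as follows. This is an illustrative reimplementation, not the benchmark's actual code, and the per-run errors are placeholder values:

```r
# From 25 independent runs, repeatedly draw 4 with replacement (simulating
# 4 parallel runs), keep the best (lowest) error of each draw, and report
# the median over all bootstrap samples.
simulate_parallel_error = function(run.errors, n.parallel = 4L, n.boot = 100000L) {
  best = replicate(n.boot, min(sample(run.errors, n.parallel, replace = TRUE)))
  median(best)
}

set.seed(1)
errors = runif(25, min = 10, max = 20)  # placeholder per-run percent errors
simulate_parallel_error(errors)
```

Taking the best of 4 random runs rewards methods whose error distribution has a good lower tail, which is how such systems are typically used in practice when compute allows parallel restarts.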

autoxgboost - How to Cite

The Automatic Gradient Boosting framework was presented at the ICML/IJCAI-ECAI 2018 AutoML Workshop (poster).
Please cite our ICML AutoML workshop paper on arXiv. You can get citation info via citation("autoxgboost") or copy the following BibTeX entry:

@article{autoxgboost,
  title = {Automatic Gradient Boosting},
  url = {https://arxiv.org/abs/1807.03873v2},
  shorttitle = {{{autoxgboost}}},
  archivePrefix = {arXiv},
  eprinttype = {arxiv},
  eprint = {1807.03873v2},
  primaryClass = {stat.ML},
  author = {Thomas, Janek and Coors, Stefan and Bischl, Bernd},
  date = {2018-07-13},
}