Benchmarking Gradient Boosting in TensorFlow and XGBoost
Gradient Boosting in TensorFlow vs XGBoost

TensorFlow 1.4 includes a Gradient Boosting implementation, aptly named TensorFlow Boosted Trees (TFBT). This repo contains the benchmarking code that I used to compare it to XGBoost.

For more background, have a look at the article.

Getting started

# Prepare the python environment
mkvirtualenv env
source env/bin/activate
pip install -r requirements.txt

# Download the dataset
bunzip2 {2006,2007}.csv.bz2

# Prepare the dataset

Running the experiments

Train and run xgboost:


Train and run TensorFlow:


Draw nice plots:


Timing results

./ --num_trees=50  42.06s user 1.82s system 1727% cpu 2.540 total

./ --num_trees=50 --examples_per_layer=1000  124.12s user 27.50s system 374% cpu 40.456 total
./ --num_trees=50 --examples_per_layer=5000  659.74s user 188.80s system 356% cpu 3:58.30 total
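
To make the three timing lines easier to compare, here is a minimal sketch that parses them and prints wall-clock time, total CPU time, and the slowdown relative to the first run. It assumes the lines are in zsh `time` format (`user`, `system`, `cpu`, `total`). The dictionary labels and the helper names `wall_seconds` and `cpu_seconds` are made up for illustration and are not part of this repo.

```python
import re

# Timing lines copied verbatim from the results above.
# The keys are illustrative labels, not script names from this repo.
TIMINGS = {
    "num_trees=50": "42.06s user 1.82s system 1727% cpu 2.540 total",
    "num_trees=50, examples_per_layer=1000": "124.12s user 27.50s system 374% cpu 40.456 total",
    "num_trees=50, examples_per_layer=5000": "659.74s user 188.80s system 356% cpu 3:58.30 total",
}


def wall_seconds(line):
    """Return the wall-clock 'total' field in seconds (handles m:ss.xx)."""
    total = re.search(r"cpu\s+([\d:.]+)\s+total", line).group(1)
    seconds = 0.0
    for part in total.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_seconds(line):
    """Return user + system CPU time in seconds."""
    user, system = re.search(r"([\d.]+)s user ([\d.]+)s system", line).groups()
    return float(user) + float(system)


# Use the first run as the baseline for the wall-clock comparison.
baseline = wall_seconds(TIMINGS["num_trees=50"])

for label, line in TIMINGS.items():
    wall = wall_seconds(line)
    print(
        f"{label}: wall {wall:.2f}s, cpu {cpu_seconds(line):.2f}s, "
        f"{wall / baseline:.1f}x baseline wall time"
    )
```

On the numbers above, the two runs with `--examples_per_layer` come out at roughly 16x and 94x the wall-clock time of the first run.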
