Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Flink and DataFlow
hcho3 Fix #3523: Fix CustomGlobalRandomEngine for R (#3781)
**Symptom** Apple Clang's implementation of `std::shuffle` doesn't work
correctly when it is run with the random bit generator for the R package:
```cpp
CustomGlobalRandomEngine::result_type
CustomGlobalRandomEngine::operator()() {
  return static_cast<result_type>(
      std::floor(unif_rand() * CustomGlobalRandomEngine::max()));
}
```

Minimal reproduction of the failure (compile using Apple Clang 10.0):
```cpp
std::vector<int> feature_set(100);
std::iota(feature_set.begin(), feature_set.end(), 0);
    // initialize with 0, 1, 2, 3, ..., 99
std::shuffle(feature_set.begin(), feature_set.end(), common::GlobalRandom());
    // This returns 0, 1, 2, ..., 99, so content didn't get shuffled at all!!!
```

Note that this bug is platform-dependent; it does not appear when GCC or
upstream LLVM Clang is used.

**Diagnosis** Apple Clang's `std::shuffle` expects 32-bit integer
inputs, whereas `CustomGlobalRandomEngine::operator()` produces 64-bit
integers.

**Fix** Have `CustomGlobalRandomEngine::operator()` produce 32-bit integers.
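
As a rough illustration of the fix described above, the sketch below defines a random bit generator whose `result_type` is 32 bits wide and passes it to `std::shuffle`. The names `Engine32`, `unif_rand_stub` and `ShuffledFeatures` are hypothetical and are not part of XGBoost; `unif_rand_stub` is only a stand-in for R's `unif_rand()` so that the snippet compiles outside of R. This is a sketch under those assumptions, not the actual patch.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for R's unif_rand(): returns a double in [0, 1).
// Emulated here with a fixed-seed std::mt19937 so the sketch builds without R.
inline double unif_rand_stub() {
  static std::mt19937 gen(42);
  static std::uniform_real_distribution<double> dist(0.0, 1.0);
  return dist(gen);
}

// Hypothetical engine with a 32-bit result_type. It provides the members
// that std::shuffle needs from a uniform random bit generator:
// an unsigned result_type, static min()/max(), and operator().
struct Engine32 {
  using result_type = std::uint32_t;
  static constexpr result_type min() { return 0; }
  static constexpr result_type max() {
    return std::numeric_limits<result_type>::max();
  }
  result_type operator()() {
    // Scale the uniform double into the 32-bit range, as in the original
    // operator(), but with a 32-bit max().
    return static_cast<result_type>(std::floor(unif_rand_stub() * max()));
  }
};

// Builds 0, 1, ..., n-1 and shuffles it with the 32-bit engine.
inline std::vector<int> ShuffledFeatures(int n) {
  std::vector<int> feature_set(n);
  std::iota(feature_set.begin(), feature_set.end(), 0);
  Engine32 rng;
  std::shuffle(feature_set.begin(), feature_set.end(), rng);
  return feature_set;
}
```

Calling `ShuffledFeatures(100)` should return a permutation of 0 through 99 that is no longer in ascending order; the exact order depends on the standard library in use.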

Closes #3523.
Latest commit b38c636 Oct 15, 2018
.github [DOCS] Update link to readme (#3437) Jul 4, 2018
R-package Fix #3523: Fix CustomGlobalRandomEngine for R (#3781) Oct 15, 2018
amalgamation Implementation of hinge loss for binary classification (#3477) Aug 6, 2018
cmake Add travis sanitizers tests. (#3557) Aug 19, 2018
cub @ b20808b Update cub submodule again (fixes GPU build) (#2599) Aug 13, 2017
demo Typo fixed (#3784) Oct 10, 2018
dmlc-core @ e3377de Update dmlc-core, to fix partitioned file loading (#3673) Sep 6, 2018
doc Added some instructions on using MinGW-built XGBoost with python. (#3774 Oct 9, 2018
include/xgboost Add basic unittests for gpu-hist method. (#3785) Oct 15, 2018
jvm-packages [jvm-packages] For training data with group, empty RDD partition thre… Oct 9, 2018
make Not use -msse2 on power or arm arch. close #2446 (#2475) Jul 7, 2017
plugin Replaced std::vector with HostDeviceVector in MetaInfo and SparsePage. ( Aug 30, 2018
python-package Allow sklearn grid search over parameters specified as kwargs (#3791) Oct 13, 2018
rabit @ 87143de Fix CRAN check for lintr (#3372) Jun 18, 2018
src Fix #3523: Fix CustomGlobalRandomEngine for R (#3781) Oct 15, 2018
tests Address #2754, accuracy issues with gpu_hist (#3793) Oct 15, 2018
.clang-tidy Fix model saving for 'count:possion': max_delta_step as Booster attri… Jul 27, 2018
.editorconfig Added configuration for python into .editorconfig (#3494) Jul 23, 2018
.gitignore Improve .gitignore patterns (#3184) May 9, 2018
.gitmodules Upgrading to NCCL2 (#3404) Jul 10, 2018
.travis.yml Add travis sanitizers tests. (#3557) Aug 19, 2018
CITATION simplify software citation (#2912) Dec 1, 2017
CMakeLists.txt Produce xgboost.so for XGBoost-R on Mac OSX, so that `make install` w… Oct 7, 2018
CONTRIBUTORS.md Update committer list (#3788) Oct 15, 2018
Jenkinsfile Retry Jenkins CI tests up to 3 times to improve reliability (redux) (#… Oct 8, 2018
Jenkinsfile-restricted Fix Jenkins syntax (#3777) Oct 8, 2018
LICENSE Include full text of Apache 2.0 license (#3698) Sep 13, 2018
Makefile Add callback interface to re-direct console output (#3438) Jul 5, 2018
NEWS.md Dmatrix refactor stage 2 (#3395) Sep 30, 2018
README.md Update README.md Jul 4, 2018
appveyor.yml Dynamically allocate GPU histogram memory (#3519) Jul 28, 2018
build.sh Suggest git submodule update instead of delete + reclone (#3214) May 9, 2018

README.md

eXtreme Gradient Boosting


Community | Documentation | Resources | Contributors | Release Notes

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT, GBM) that solves many data science problems in a fast and accurate way. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can solve problems beyond billions of examples.

License

© Contributors, 2016. Licensed under an Apache-2 license.

Contribute to XGBoost

XGBoost has been developed and used by a group of active community members. Your help is very valuable in making the package better for everyone. Check out the Community Page.

Reference

  • Tianqi Chen and Carlos Guestrin. XGBoost: A Scalable Tree Boosting System. In 22nd SIGKDD Conference on Knowledge Discovery and Data Mining, 2016.
  • XGBoost originates from a research project at the University of Washington.