- Version 0.5 is skipped due to major improvements in the core
- Major refactor of core library.
- Goal: more flexible and modular code as a portable library.
- Switch to C++11 standard code.
- Random number generator defaults to
- Share the data loading pipeline and logging module from dmlc-core.
- Enable a registry pattern to allow optional plugins for objectives, metrics, tree constructors, and data loaders.
- Future plugin modules can be put into xgboost/plugin and register back to the library.
- Replace most raw pointers with smart pointers, for RAII safety.
- Add official option to approximate algorithm
- Change the default behavior to prefer the faster algorithm.
- Users get a message when the approximate algorithm is chosen.
- Change library name to libxgboost.so
- Backward compatibility
- The binary buffer file is not backward compatible with previous versions.
- The model file is backward compatible on 64 bit platforms.
- The model file is compatible between 64/32 bit platforms (not yet tested).
- The external memory version and other advanced features will be exposed to the R library as well on Linux.
- Previously some of the features were blocked due to C++11 and threading limits.
- The Windows version is still blocked because Rtools does not support
- rabit and dmlc-core are maintained through git submodule
- Anyone can open a PR to update these dependencies now.
- Rabit and xgboost libs are now thread-safe and use thread-local PRNGs
- This could fix some of the previous problems with running xgboost on multiple threads.
- JVM Package
- Enable xgboost4j for Java and Scala
- XGBoost distributed now runs on Flink and Spark.
- Support model attribute listing for metadata.
- Support callback API
- Support new booster DART (dropout in tree boosting)
- Add CMake build system
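
The registry pattern mentioned in the core refactor notes above can be illustrated with a minimal sketch. This is not xgboost's actual C++ registry: the names `register_objective`, `create_objective`, and the two example objectives are hypothetical, and the sketch uses Python only to show the idea of components registering themselves by name so the core never needs to be edited when a plugin is added.

```python
# Hypothetical registry sketch; names are illustrative, not xgboost's API.
from typing import Callable, Dict

# Maps an objective name to a function computing its gradient.
_OBJECTIVES: Dict[str, Callable[[float, float], float]] = {}


def register_objective(name: str):
    """Return a decorator that stores a gradient function under `name`."""
    def decorator(fn):
        if name in _OBJECTIVES:
            raise KeyError(f"objective already registered: {name}")
        _OBJECTIVES[name] = fn
        return fn
    return decorator


def create_objective(name: str):
    """Look up a registered objective by name."""
    try:
        return _OBJECTIVES[name]
    except KeyError:
        raise ValueError(f"unknown objective: {name}") from None


@register_objective("squared_error")
def squared_error_grad(pred: float, label: float) -> float:
    # Gradient of 0.5 * (pred - label)^2 with respect to pred.
    return pred - label


# A "plugin" registers itself the same way, without touching the core code.
@register_objective("absolute_error")
def absolute_error_grad(pred: float, label: float) -> float:
    # Subgradient of |pred - label| with respect to pred.
    return (pred > label) - (pred < label)
```

Callers select a component by its string name, for example `create_objective("squared_error")`, which is what lets a configuration value choose between built-in and plugin implementations.
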
This is the last release of the 0.4 series, with many changes in the language bindings.
This is also a checkpoint before we switch to xgboost-brick #736
- Changes in R library
- fixed a possible problem with Poisson regression.
- switched from 0 to NA for missing values.
- exposed access to additional model parameters.
- Changes in Python library
- throws an exception instead of crashing the terminal when a parameter error occurs.
- has importance plot and tree plot functions.
- accepts different learning rates for each boosting round.
- allows model training continuation from previously saved model.
- allows early stopping in CV.
- allows feval to return a list of tuples.
- allows eval_metric to handle additional format.
- improved compatibility in sklearn module.
- additional parameters added for sklearn wrapper.
- added pip installation functionality.
- supports more Pandas DataFrame dtypes.
- added best_ntree_limit attribute, in addition to best_score and best_iteration.
- Java API is ready for use
- Added more test cases and continuous integration to make each build more robust.
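
Two of the Python changes above, a custom evaluation function returning a list of tuples and a different learning rate for each boosting round, can be sketched without the library itself. The sketch assumes the evaluation callback receives the predictions and a data object exposing `get_label()`. `LabelHolder` is a hypothetical stand-in for that data object, and `decayed_learning_rates` is a hypothetical helper; how its output is passed to training is not shown here.

```python
import math
from typing import List, Sequence, Tuple


class LabelHolder:
    """Hypothetical stand-in for the training data; only get_label() is modeled."""

    def __init__(self, labels: Sequence[float]):
        self._labels = list(labels)

    def get_label(self) -> List[float]:
        return self._labels


def error_and_logloss(preds: Sequence[float], dtrain) -> List[Tuple[str, float]]:
    """Custom evaluation returning several (name, value) pairs at once."""
    labels = dtrain.get_label()
    n = len(labels)
    # Fraction of predictions that fall on the wrong side of 0.5.
    error = sum((p > 0.5) != (y > 0.5) for p, y in zip(preds, labels)) / n
    # Clamp probabilities so log() never sees zero.
    eps = 1e-15
    logloss = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for p, y in zip(preds, labels)
    ) / n
    return [("error", error), ("logloss", logloss)]


def decayed_learning_rates(base_eta: float, decay: float, num_rounds: int) -> List[float]:
    """Build one learning rate per boosting round with exponential decay."""
    return [base_eta * decay ** i for i in range(num_rounds)]
```

Returning a list lets one callback report several metrics per round, and a per-round list of rates is a simple way to express a decay schedule.
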
Fix a major bug in v0.2 (#8 and #10).
scale_pos_weight was not properly initialized to its default value.
The bug causes binary classification models to behave incorrectly when scale_pos_weight is not set in the parameters.
If scale_pos_weight is set explicitly, the results in v0.2 are correct.
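
To see why an uninitialized scale_pos_weight matters, here is a minimal sketch of a weighted logistic gradient. It assumes that scale_pos_weight multiplies the gradient and Hessian of positive examples and that the intended default is 1.0. The function `weighted_logistic_grad` is hypothetical and is not the library's code, and the value 0.0 used in the usage note is only an illustrative stand-in for an uninitialized field.

```python
import math
from typing import Tuple


def weighted_logistic_grad(
    margin: float, label: int, scale_pos_weight: float = 1.0
) -> Tuple[float, float]:
    """Gradient and Hessian of logistic loss, with positives re-weighted."""
    # Sigmoid turns the raw margin into a probability.
    p = 1.0 / (1.0 + math.exp(-margin))
    # Assumption: only positive examples are scaled; negatives keep weight 1.
    w = scale_pos_weight if label == 1 else 1.0
    return (p - label) * w, p * (1.0 - p) * w
```

With the default of 1.0, `weighted_logistic_grad(0.0, 1)` gives a gradient of -0.5 and a Hessian of 0.25. If the weight were accidentally 0.0, both values would vanish for every positive example, so positives would stop influencing training.
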