@sab released this on Jan 30, 2018

Speedups

  • 25% speedup of the model applier.
  • 43% speedup for training on large datasets.
  • 15% speedup for QueryRMSE and calculation of querywise metrics.
  • Large speedups when using binary categorical features.
  • Significant speedup (200x on a dataset with 50k lines and 5k trees) for plot and stage predict calculations in the command-line version.
  • Compilation time speedup.

Major Features And Improvements

  • The industry's fastest applier implementation.
  • New parameter boosting-type switches between the standard boosting scheme and dynamic boosting, which is described in the paper "Dynamic boosting".
  • New bootstrap parameters bootstrap_type and subsample. Using the Bernoulli bootstrap type with subsample < 1 might increase the training speed.
  • Better logging for cross-validation: the parameters logging_level and metric_period are now supported by cv (they should be set in the training parameters).
  • Added a separate train function that receives the parameters and returns a trained model.
  • Ranking mode QueryRMSE now supports default settings for dynamic boosting.
  • Pre-built R-package binaries are included in the release.
  • We added many synonyms for our parameter names, so it is now more convenient to try CatBoost if you are used to another library.
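
As a rough illustration of the parameters listed above, the following Python sketch passes them to the separate train function and to cv. It is not taken from the release itself: it assumes the Python package exposes train, cv and Pool at the top level, that "Plain" is the value selecting the standard boosting scheme, and that "Silent" is an accepted logging_level. The toy data and every parameter value are made up for illustration only.

```python
"""Sketch of the parameters named in this release (assumed Python API usage)."""

# Training parameters. The names come from the release notes; the values
# are illustrative assumptions, not recommended settings.
TRAIN_PARAMS = {
    "loss_function": "RMSE",
    "iterations": 20,
    "boosting_type": "Plain",       # assumed name of the standard boosting scheme
    "bootstrap_type": "Bernoulli",  # sample objects independently per iteration
    "subsample": 0.8,               # < 1, so each iteration uses a subset of objects
    "logging_level": "Silent",      # assumed value; set here so cv also picks it up
    "metric_period": 5,             # calculate metrics on every 5th iteration
}


def make_toy_data(n_rows=40):
    """Build a tiny synthetic regression dataset (hypothetical, for this sketch)."""
    features = [[float(i), float(i % 3)] for i in range(n_rows)]
    labels = [2.0 * row[0] + row[1] for row in features]
    return features, labels


def run_example():
    """Train one model and run cross-validation; requires the catboost package."""
    # Imported here so the parameter dictionary above can be inspected
    # even when catboost is not installed.
    from catboost import Pool, cv, train

    features, labels = make_toy_data()
    pool = Pool(features, label=labels)

    # The separate train function: takes the parameters, returns a trained model.
    model = train(pool=pool, params=TRAIN_PARAMS)

    # Cross-validation reads logging_level and metric_period from the same
    # training parameters.
    cv_results = cv(pool=pool, params=TRAIN_PARAMS, fold_count=3)
    return model, cv_results
```

Calling run_example() with catboost installed should return the trained model together with the cross-validation results. Whether subsample < 1 actually shortens training depends on the dataset, so it is worth timing both settings.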

Bug Fixes and Other Changes

  • Fix for CPU QueryRMSE with weights.
  • Added several missing parameters to the wrappers.
  • Fix for data split in querywise modes.
  • Better logging.
  • Starting with this release, we provide pre-built R binaries.
  • More parallelisation.
  • Memory usage improvements.
  • And some other bug fixes.

Thanks to our Contributors

This release contains contributions from the CatBoost team.

We are grateful to everyone who filed issues, helped resolve them, or asked and answered questions.