Releases: microsoft/LightGBM
v4.3.0
Changes
💡 New Features
- [CUDA] Add arch=8.9 to CUDA_ARCHS for RTX 40XX @DmitryUlyanov (#6272)
🔨 Breaking
- [cmake] [c++] require CMake 3.18+ @jameslamb (#6260)
- [R-package] remove readRDS.lgb.Booster() and saveRDS.lgb.Booster() @jameslamb (#6246)
🚀 Efficiency Improvement
- [R-package] Remove non-beneficial parallelization @david-cortes (#6209)
🐛 Bug Fixes
- [R-package] [ci] remove unnecessary include in linear_tree_learner (fixes #6264) @jameslamb (#6265)
- [cmake] [CUDA] ignore CUDA-specific source files in non-CUDA builds (fixes #6267) @sabjohnso (#6268)
- [c++] include OpenMP-control files in MSBuild solution file (fixes #6238) @jameslamb (#6251)
- [cmake] [swig] use CMake's built-in file-copying mechanisms instead of 'cp' @jameslamb (#6259)
📖 Documentation
- [docs] Add LightGBMLSS to README @StatMixedML (#6254)
- [ci] [docs] add Oliver to CODEOWNERS @jameslamb (#6247)
- Fix small typo and grammar in docs @arunstar (#6245)
🧰 Maintenance
- [ci] fix conda env creation in 'regular' CI job (fixes #6282) @jameslamb (#6283)
- [R-package] [ci] switch vignettes from 'rmarkdown' to 'markdown' @jameslamb (#6258)
- [python-package] fix mypy error about pandas categorical features @jameslamb (#6253)
- [ci] update issue-locking workflow @jameslamb (#6256)
- [ci] upgrade to GoogleTest v1.14.0 (fixes #5976) @jameslamb (#5981)
- [ci] [R-package] speed up valgrind job @jameslamb (#6237)
- bump development version to 4.2.0.99 @jameslamb (#6241)
v4.2.0
✨ v4.2.0 of the R package is now available on CRAN (link), the first major release of the R package in 2+ years.
✨ The Python package now accepts Apache Arrow Tables and Arrays (thanks @borchero!)
🔧 A critical bug in quantized training support is fixed
Changes
💡 New Features
- [python-package] Allow to pass Arrow table for prediction @borchero (#6168)
- [python-package] Allow to pass Arrow table and array as init scores @borchero (#6167)
- [python-package] Allow to pass Arrow array as groups @borchero (#6166)
- [python-package] Allow to pass Arrow array as weights @borchero (#6164)
- [python-package] Accept numpy generators as `random_state` @david-cortes (#6174)
- [python-package] Allow to pass Arrow array as labels @borchero (#6163)
- [python-package] Allow to pass Arrow table as training data @borchero (#6034)
🔨 Breaking
- [python-package] fix access to Dataset metadata in scikit-learn custom metrics and objectives @jameslamb (#6108)
- [CUDA] drop CUDA 10 support, start supporting CUDA 12 (fixes #5789) @jameslamb (#6099)
🚀 Efficiency Improvement
- [R-package] Fix inefficiency in retrieving pointers @david-cortes (#6208)
- [CUDA] CUDA Quantized Training (fixes #5606) @shiyu1994 (#5933)
🐛 Bug Fixes
- fix errors from MSVC '/permissive-' mode (fixes #6230) @Zhaojun-Liu (#6232)
- [R-package] [c++] add tighter multithreading control, avoid global OpenMP side effects (fixes #4705, fixes #5102) @jameslamb (#6226)
- [python-package] take shallow copy of dataframe in predict (fixes #6195) @jmoralez (#6218)
- Fix null handling for Arrow data @borchero (#6227)
- [R-package] use safer pattern for error formatting (fixes #6212) @jameslamb (#6216)
- [python-package] fix libpath.py @shiyu1994 (#6192)
- set explicit number of threads in every OpenMP `parallel` region @jameslamb (#6135)
- ignore unknown parameters when loading from model file @jmoralez (#6126)
- [python-package] [R-package] include more params in model text representation (fixes #6010) @jameslamb (#6077)
- [fix] fix quantized training (fixes #5982) (fixes #5994) @shiyu1994 (#6092)
- [python-package] Fix misdetected objective after multiple calls to `LGBMClassifier.fit` @david-cortes (#6002)
📖 Documentation
- [docs] remove links to Laurae++ site @jameslamb (#6193)
- [docs] reduce redirects in docs links @jameslamb (#6181)
- [docs] fix broken links @jameslamb (#6161)
🧰 Maintenance
- release v4.2.0 @jameslamb (#6191)
- [ci] [R-package] allow more possibly-lost warnings from valgrind @jameslamb (#6233)
- [ci] Upgrade Azure VMSS to use Mariner Linux @shiyu1994 (#6222)
- Add msvc conformance check @Zhaojun-Liu (#6234)
- [python-package] Add tests for passing Arrow arrays with empty chunks @borchero (#6210)
- [R-package] change CRAN maintainer @jameslamb (#6224)
- [CUDA] fix typo in error message @jameslamb (#6207)
- [python-package] ignore mypy errors related to ctypes string buffers @jameslamb (#6198)
- [python-package] consolidate pandas-to-numpy conversion code @jameslamb (#6156)
- [R-package] standardize naming of internal functions @jameslamb (#6179)
- [R-package] remove unreachable code @jameslamb (#6180)
- allow new files in include/LightGBM @jameslamb (#6177)
- [R-package] Use `cat()` instead of `print()` for metrics and callbacks @david-cortes (#6171)
- [ci] resolve warning in tests @jameslamb (#6154)
- [ci] use `mamba` instead of `conda` in macOS and Linux CI jobs @borchero (#6140)
- factor out uses of omp_get_num_threads() and omp_get_max_threads() outside of OpenMP wrapper @jameslamb (#6133)
- remove unnecessary allocations in HistogramSumReducer @jameslamb (#6132)
- [ci] [R-package] enforce more {lintr} checks @jameslamb (#6130)
- fix compiler warnings for CPP tests @jameslamb (#6124)
- [ci] [R-package] test against R 4.3 on Windows @jameslamb (#6061)
- [python-package] reorganize early stopping callback @jameslamb (#6114)
- [python-package] simplify Dataset._compare_params_for_warning() @jameslamb (#6120)
- [ci] fix sh-compatibility issue in build-cran-package.sh @jameslamb (#6118)
- [python-package] remove unnecessary allocations in ctypes code @jameslamb (#6111)
- [python-package] fix mypy errors in Dataset construction @jameslamb (#6106)
- [ci] ensure correct R version is used on GitHub Actions (fixes #5640) @jameslamb (#6107)
- [python-package] fix mypy error about eval result tuples @jameslamb (#6105)
- [python-package] fix mypy error from Dataset.pandas_categorical @jameslamb (#6098)
- [ci] Fix typo in dependencies @borchero (#6100)
- [python-package] fix mypy errors related to eval result parsing in callbacks @jameslamb (#6096)
- [python-package] mark EarlyStopException as part of public API @jameslamb (#6095)
- [python-package] fix mypy errors related to eval result tuples @jameslamb (#6097)
- update to fmt 10.1.1, fast_double_parser 0.7.0 @jameslamb (#6074)
v4.1.0
Changes
💡 New Features
🐛 Bug Fixes
- Fix updates in random forest model using GOSS data sample strategy @mjmckp (#6017)
- [R-package] Fix misdetected objective when passing `lgb.Dataset` instance to `lightgbm()` @david-cortes (#6005)
- [python-package] make it possible to build wheels without internet connection (fixes #5979) @jameslamb (#6042)
- fix percentile computation for regression objectives @zachary62 (#5848)
- [CUDA] Set GPU device ID in threads @shiyu1994 (#6028)
- [R-package] Fix error when passing categorical features to lightgbm() (fixes #6000) @david-cortes (#6003)
- [R-package] limit number of threads used in tests and examples (fixes #5987) @jameslamb (#5988)
📖 Documentation
- [python-package] [docs] Update key format of eval_hist in docstring example @Alnusjaponica (#5980)
- [docs] add vaex-ml to list of external repositories @jameslamb (#6085)
- [ci] [docs] fix broken ACM links @jameslamb (#6083)
- [docs] Fix typo in README @kyleengel (#6071)
- [docs] fix broken links @jameslamb (#6059)
- Fix Python Dockerfile @GyuminJack (#5984)
🧰 Maintenance
- Release v4.1.0 @jameslamb (#6076)
- Remove superfluous todo from gitignore @borchero (#6081)
- [python-package] simplify processing of pandas data @jameslamb (#6066)
- [ci] [R-package] test against R 4.3 on Linux and macOS @jameslamb (#6075)
- reduce verbosity of some log messages @jameslamb (#6073)
- [python-package] remove CVBooster._append() @jameslamb (#6057)
- [python-package] use dataclass for CallbackEnv @jameslamb (#6048)
- [ci] [python-package] add more linting checks @jameslamb (#6049)
- [ci] prevent lock-threads from locking issues with label 'feature request' @jameslamb (#6047)
- [ci] add bot to lock inactive issues and PRs @jameslamb (#6037)
- [ci] fix GPG key download for R Linux jobs (fixes #6038) @jameslamb (#6039)
- [ci] enforce dask version to be >=2023.5.0 in some builds (fixes #6030) @shiyu1994 (#6032)
- [ci] [R-package] use {lintr} 3.1 @jameslamb (#5997)
- [python-package] replace np.find_common_type with np.result_type @jmoralez (#5999)
- [ci] simplify CODEOWNERS @jameslamb (#5998)
- [R-package] consolidate testing constants in helpers file @jameslamb (#5992)
- [R-package] remove unused internal variables @jameslamb (#5991)
- [ci] use newer h5py in AppVeyor jobs (fixes #5995) @jameslamb (#5996)
- [python-package] make _InnerPredictor construction stricter @jameslamb (#5961)
Note
This release of the R package was not submitted to CRAN, due to the issues documented in #5987.
v4.0.0
Changes
This release contains all previously-unreleased changes since `v3.3.1` over 1.5 years ago (link).
Summary of improvements:
- totally-rewritten CUDA implementation, and more operations in the CUDA implementation performed on the GPU
- quantized training can be used for greatly improved training speeds on CPU (paper link)
- support for C++17
- Python package:
  - now uses `scikit-build-core` (link) as its build backend
  - `manylinux_2_28` Linux wheels now support GPU (OpenCL-based, not CUDA) build automatically... just `pip install lightgbm` then pass `{"device": "gpu"}` in params (thanks @jgiannuzzi!)
  - much more use of inline type hints, exported with `py.typed` so any code using LightGBM can benefit
  - support for Python 3.10, 3.11
  - support for `pandas` nullable types
  - configurable threshold (`lgb.early_stopping(..., min_delta=n)`) for how much eval metrics must improve to be considered "improved" for early stopping
  - custom objective functions in Dask
  - `scikit-learn` is no longer a required dependency
  - all callbacks are now pickleable (for better interoperability with e.g. `ray`, Dask) (thanks @Yard1!)
- R package:
  - efficient support for more data types in prediction, like `dgCMatrix` and `dsparseMatrix` (thanks @david-cortes!)
  - much more idiomatic interface... e.g. support for `saveRDS()` and `readRDS()` for `Booster`, `print()` and `summary()` methods for `Dataset` (thanks @david-cortes!)
  - various bug fixes related to multiple competing ways to provide parameters
  - support for R 4.2, 4.3
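The early-stopping threshold mentioned in the Python package summary above (`lgb.early_stopping(..., min_delta=n)`) can be illustrated with a small pure-Python sketch. It does not call LightGBM and is not the library's implementation. The function name `best_iteration_with_min_delta` and its arguments are hypothetical, and the sketch assumes that a score only counts as "improved" when it beats the best score so far by more than `min_delta`.

```python
from typing import List, Optional


def best_iteration_with_min_delta(
    scores: List[float],
    stopping_rounds: int,
    min_delta: float = 0.0,
    higher_is_better: bool = False,
) -> int:
    """Return the 0-based index of the best iteration under early stopping.

    Hypothetical helper for illustration only, not part of LightGBM.
    """
    best_score: Optional[float] = None
    best_iter = 0
    for i, score in enumerate(scores):
        if best_score is None:
            improved = True  # the first score is always the initial best
        elif higher_is_better:
            improved = score > best_score + min_delta
        else:
            improved = score < best_score - min_delta
        if improved:
            best_score, best_iter = score, i
        elif i - best_iter >= stopping_rounds:
            break  # no sufficient improvement for `stopping_rounds` rounds
    return best_iter


# Validation losses: a large gain, then several tiny gains, then a late gain.
losses = [0.50, 0.40, 0.399, 0.398, 0.397, 0.30]

without_threshold = best_iteration_with_min_delta(losses, stopping_rounds=2)
with_threshold = best_iteration_with_min_delta(losses, stopping_rounds=2, min_delta=0.01)
```

With `min_delta=0.01`, the tiny gains after iteration 1 no longer reset the patience counter, so the sketch stops at iteration 1 instead of running to the end.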
Summary of breaking changes:
- Python package:
  - dropped most testing, promise of support for Python 3.6 (although it should still technically be installable)
  - dropped support for macOS Mojave (10.14)
  - made many functions and class attributes private, including significantly reducing what is pulled in by `from lightgbm import *`
  - removed `setup.py`, `pip install --install-option` support
  - removed support for `pip install --install-option` (to work with newer `pip`, see pypa/pip#11358)
    - see https://github.com/microsoft/LightGBM/blob/master/python-package/README.rst for new patterns
    - see pypa/pip#11358 and #5061 for background
  - dropped support for installation with `MSBuild.exe`... that now requires compiling `lib_lightgbm.dll` separately and then building a wheel that bundles it
- R package:
  - dropped support for Solaris
  - removed most support for passing parameters through `...`
  - removed `lgb.unloader()`
  - switched to `predict(newdata, type = ...)` in `predict()`, for consistency with base R and most other machine learning projects
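The reduction in what `from lightgbm import *` pulls in relies on standard Python behaviour: a module-level `__all__` list decides which names a star import exports. The sketch below shows that mechanism on a throwaway in-memory module. The module name `demo_pkg` and its functions are hypothetical and have nothing to do with LightGBM's actual contents.

```python
import sys
import types

# Build a small in-memory module (hypothetical name) with an explicit __all__.
demo = types.ModuleType("demo_pkg")
exec(
    "__all__ = ['train']\n"
    "def train(): return 'trained'\n"
    "def _internal_helper(): return 'private'\n"
    "def public_but_unlisted(): return 'not exported'\n",
    demo.__dict__,
)
sys.modules["demo_pkg"] = demo

# Run a star import in a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_pkg import *", namespace)

# Drop dunder entries such as __builtins__ that exec() adds on its own.
exported = sorted(name for name in namespace if not name.startswith("__"))
```

Only `train` ends up in `exported`. Names left out of `__all__` remain reachable through an explicit `import demo_pkg`, but are no longer part of the star-import surface.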
💡 New Features
- [python-package] add 'pandas' extra @jameslamb (#5937)
- [CUDA] Add more CUDA Regression Metrics @Xuweijia-buaa (#5924)
- [python-package] adding max_category_values parameter to create_tree_digraph method (fixes #5687) @moziada (#5818)
- [c++] support building with Ninja on Linux @jameslamb (#5910)
- add CMakeLists options to disable building CLI, installing headers @jameslamb (#5880)
- Add quantized training (CPU part) @shiyu1994 (#5800)
- [CUDA] Add quantile regression objective for new CUDA version @shiyu1994 (#5605)
- [CUDA] Add quantile metric for new CUDA version (contribute to #5163) @shiyu1994 (#5665)
- [python-package] add Booster.set_leaf_output method @jmoralez (#5712)
- feature: Add serialization of reference dataset @svotaw (#5427)
- [R-package] Accept factor labels and use their levels @david-cortes (#5341)
- [CUDA] Add binary logloss metric for new CUDA version @shiyu1994 (#5635)
- Decouple Boosting Types (fixes #3128) @lyf-00 (#4827)
- [CUDA] Add L2 metric for new CUDA version @shiyu1994 (#5633)
- [CUDA] Add rmse metric for new CUDA version @shiyu1994 (#5611)
- Build integrated OpenCL Linux wheels @jgiannuzzi (#5252)
- [CUDA] Add Poisson regression objective for cuda_exp and refactor objective functions for cuda_exp @shiyu1994 (#5486)
- [CUDA] Add multiclass_ova objective for cuda_exp @shiyu1994 (#5491)
- [python-package] add install option to enable printing of time costs @Remy-Luciani (#5497)
- [python-package][R-package] load parameters from model file (fixes #2613) @jmoralez (#5424)
- [CUDA] Add multiclass objective for cuda_exp @shiyu1994 (#5473)
- [CUDA] Add feature interaction constraint for cuda_exp (fix #4785) @shiyu1994 (#5474)
- [CUDA] Add rank_xendcg objective for cuda_exp @shiyu1994 (#5472)
- [CUDA] Add fair regression objective for cuda_exp @shiyu1994 (#5469)
- [CUDA] Add lambdarank objective for cuda_exp @shiyu1994 (#5453)
- [CUDA] Add Huber regression objective for cuda_exp @shiyu1994 (#5462)
- [CUDA] Add L1 regression objective for cuda_exp @shiyu1994 (#5457)
- [CUDA] L2 regression objective for cuda_exp @shiyu1994 (#5452)
- [CUDA] Add binary objective for cuda_exp @shiyu1994 (#5425)
- [R-package] Add remainder of prediction functions @david-cortes (#5312)
- [python-package] support saving and loading CVBooster (fixes #3556) @nyanp (#5160)
- feature: Add true streaming APIs to reduce client-side memory usage @svotaw (#5299)
- [python-package] highlight the path a sample takes through a tree in `plot_tree` and `create_tree_digraph` (fixes #4784) @jmoralez (#5119)
- reproducible parameter alias resolution for wrappers (fixes #5304) @jmoralez (#5338)
- [CUDA] Initial work for boosting and evaluation with CUDA @shiyu1994 (#5279)
- [python-package] add validate_features argument to refit @jmoralez (#5331)
- [python-package] check feature names in predict with dataframe (fixes #812) @jmoralez (#4909)
- [R-package] Add sparse feature contribution predictions @david-cortes (#5108)
- [python-package][R-package] allow using feature names when retrieving number of bins @jmoralez (#5116)
- [R-package] Keep row names in output from `predict` @david-cortes (#4977)
- [python] make `reset_parameter` callback pickleable @StrikerRUS (#5109)
- [python] make `record_evaluation` callback pickleable @StrikerRUS (#5107)
- [python] make `log_evaluation` callback pickleable @StrikerRUS (#5101)
- [python] allow to register any custom logger (fixes #4783) @RustingSword (#4880)
- Load initial scores with binary data files in CLI version @shiyu1994 (#4807)
- [R-package] Rename `weight` -> `weights` @david-cortes (#4975)
- [CUDA] New CUDA version Part 1 @shiyu1994 (#4630)
- [python] make `early_stopping` callback pickleable @Yard1 (#5012)
- [c-api][python-package][R-package] expose feature num bin @jmoralez (#5048)
- [python-package] [R-package] propagate the best iteration of cvbooster into the individual boosters @jmoralez (#5066)
- [python-package] add support for pandas nullable types @jmoralez (#4927)
- [R-package] Promote objective and init_score to top-level arguments in `lightgbm()` @david-cortes (#4976)
- [python] Start supporting Python 3.10 @StrikerRUS (#4893)
- [python-package] support customizing Dataset creation in Booster.refit() (fixes #3038) @TremaMiguel (#4894)
- [dask] add support for custom objective functions (fixes #3934) @jameslamb (#4920)
- [R-package] added argument eval_train_metric to lgb.cv() (fixes #4911) @mayer79 (#4918)
- Add support for Visual Studio 2022 @StrikerRUS (#4889)
- Add C API function that returns all parameter names with their aliases @StrikerRUS (#4829)
- [python][sklearn] respect parameters for predictions in `init()` and `set_params()` methods @StrikerRUS (#4822)
- Add customized parser support @chjinche (#4782)
- [R-package] Add `print()` and `summary()` methods for Booster @david-cortes (#4686)
- Add 'nrounds' as an alias for 'num_iterations' (fixes #4743) @mikemahoney218 (#4746)
- [python-package] early stopping min_delta (fixes #2526) @jmoralez (#4580)
- [python][sklearn] respect objective aliases @StrikerRUS (#4758)
- [python][sklearn] add `n_estimators_` and `n_iter_` post-fit attributes @StrikerRUS (#4753)
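Several entries in the list above make callbacks pickleable. As a rough illustration of why that matters, the sketch below contrasts a callback written as a closure, which the standard `pickle` module cannot serialize, with one written as a class-based callable, which it can. The names `RecordScores` and `make_closure_callback` are hypothetical stand-ins and are not LightGBM's callback classes.

```python
import pickle
from typing import Dict, List


class RecordScores:
    """Hypothetical class-based callback that keeps its state in attributes."""

    def __init__(self, history: Dict[str, List[float]]) -> None:
        self.history = history

    def __call__(self, metric_name: str, value: float) -> None:
        self.history.setdefault(metric_name, []).append(value)


def make_closure_callback(history: Dict[str, List[float]]):
    """Hypothetical closure-based callback: a function defined inside a function."""

    def _callback(metric_name: str, value: float) -> None:
        history.setdefault(metric_name, []).append(value)

    return _callback


# The class instance survives a pickle round trip, state included.
callback = RecordScores({})
callback("l2", 0.25)
restored = pickle.loads(pickle.dumps(callback))
restored("l2", 0.20)

# The closure cannot be pickled, because pickle refuses local functions.
try:
    pickle.dumps(make_closure_callback({}))
    closure_pickled = True
except (pickle.PicklingError, AttributeError):
    closure_pickled = False
```

Tools that ship work to other processes, such as the `ray` and Dask integrations named in the summary, generally need to serialize whatever they are handed, which is where a picklable callback helps.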
🔨 Breaking
- [python-package] make Booster and Dataset 'handle' attributes private (fixes #5313) @jameslamb (#5947)
- [python-package] remove hard dependency on 'scikit-learn', fix minimal runtime dependencies @jameslamb (#5942)
- [python-package] [ci] switch to PEP 517 / 518 builds (remove `setup.py`) (fixes #5061) @jameslamb (#5759)
- [ci] [python-package] replace 'python setup.py' with a shell script @jameslamb (#5837)
- [R-package] use C++17 in the CRAN package @jameslamb (#5690)
- [python-package] make some Booster and Dataset attributes private @jameslamb (#5723)
- [CUDA] consolidate CUDA versions @jameslamb (#5677)
- [python-package] make public API members explicit with module-level `__all__` variables @jameslamb (#5655)
- [ci] migrate CI from macOS 10.15 to 11 (fixes #5391) @StrikerRUS (#5396)
- [ci] switch to manylinux_2_28 for Linux artifacts (fixes #5514, fixes #5589) @jameslamb (#5580)
- fix: Adjust LGBM_DatasetCreateFromSampledColumn to handle distributed data @svotaw (#5344)
- [python-package] allow custom weighing in fobj for scikit-learn API (closes #5027) @jmoralez (#5211)
- [R-package] Use `type` argument to control prediction types @david-cortes (#5133)
- [python-package] Use scikit-learn interpretation of negative `n_jobs` and change default to number of cores @david-cortes (#5105)
- [python-package] remove Booster.set_attr() and Booster.attr() @jameslamb (#5272)
- remove support for Solaris (fixes #5216) @jameslamb (#5226)
- [R-package] stop automatically calculating eval metrics on training data in lightgbm() @jameslamb (#5209)
- [R-package] remove lgb.unloader() @jameslamb (#5204)
- [python-package] remove 'fobj' in favor of passing custom objective func...
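One breaking change above adopts the scikit-learn interpretation of negative `n_jobs`. In the joblib convention that scikit-learn follows, a negative value counts back from the number of available CPUs: `-1` means all of them, `-2` all but one, and so on. The sketch below shows that arithmetic only. `resolve_n_jobs` is a hypothetical helper, not LightGBM's code, and it deliberately leaves `n_jobs=0` out of scope.

```python
def resolve_n_jobs(n_jobs: int, n_cpus: int) -> int:
    """Translate an n_jobs value into a thread count (hypothetical helper)."""
    if n_jobs == 0:
        # Assumption of this sketch: zero is not covered by the convention shown here.
        raise ValueError("this sketch only covers non-zero n_jobs")
    if n_jobs < 0:
        # -1 -> n_cpus, -2 -> n_cpus - 1, ... but never fewer than one thread.
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


# Resolved thread counts on an 8-CPU machine for a few representative inputs.
resolved_on_8_cpus = {n: resolve_n_jobs(n, n_cpus=8) for n in (-1, -2, 4, -100)}
```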
v3.3.5
Changes
This is a special release, put up to prevent the R package from being archived on CRAN.
See #5661 and #5662 for context.
This release only contains the changes, relative to `v3.3.4`, necessary to prevent removal of the R package from CRAN.
💡 New Features
None
🔨 Breaking
None
🚀 Efficiency Improvement
None
🐛 Bug Fixes
- [ci] [R-package] fix clang 15 warning about unqualified calls (fixes #5661) @jameslamb (#5662)
📖 Documentation
None
🧰 Maintenance
None
v3.3.4
Changes
This is a special release, put up to prevent the R package from being archived on CRAN.
See #5618 and #5619 for context.
This release only contains the changes, relative to `v3.3.3`, necessary to prevent removal of the R package from CRAN.
💡 New Features
None
🔨 Breaking
None
🚀 Efficiency Improvement
None
🐛 Bug Fixes
- prefer 'vsnprintf' to 'vsprintf' @jameslamb (#5561)
📖 Documentation
None
🧰 Maintenance
- [ci] test against R 4.2.2 @jameslamb (#5621)
v3.3.3
Changes
This is a special release, put up to prevent the R package from being archived on CRAN.
See #5502 and #5525 for context.
This release only contains the changes, relative to `v3.3.2`, necessary to prevent removal of the R package from CRAN.
💡 New Features
- [PARTIALLY: only for R-package] Add support for Visual Studio 2022 @StrikerRUS (#4889)
🔨 Breaking
None
🚀 Efficiency Improvement
None
🐛 Bug Fixes
- Check existence of inet_pton for win32 in CMakeLists.txt (fixes #5019) @shiyu1994 (#5159)
- [ci] [R-package] use R 4.2.1 in Windows CI jobs (fixes #4881) @jameslamb (#5503)
- [R-package] fix test on non-ASCII features in non-UTF8 locales @jameslamb (#5526)
📖 Documentation
None
🧰 Maintenance
- [ci] run Appveyor checks on PRs targeting release/ branches @jameslamb (#5528)
- [ci] run CI on pull requests targeting release/ branches @jameslamb (#5527)
- [ci] [R-package] ensure that MSVC jobs fail when tests fail (fixes #5439) @jameslamb (#5448)
v3.3.2
Changes
🐛 Bug Fixes
- [R-package] Apply patch for R4.2 on Windows @shiyu1994 (#4923)
📖 Documentation
- [docs] [R-package] update cran-comments for v3.3.1 release @jameslamb (#4738)
🧰 Maintenance
- [ci] Temporary pin dask version at CI @StrikerRUS (#4770)
v3.3.1
Changes
💡 New Features
- [R-package] Expand user paths in file names @david-cortes (#4687)
🐛 Bug Fixes
- [python][sklearn] Allow non-serializable objects in callbacks argument @StrikerRUS (#4723)
- Fix ASAN issues with `std::function` usage @david-cortes (#4673)
- fix behavior for default objective and metric @StrikerRUS (#4660)
📖 Documentation
- [docs] Add Tong Wu and Zhiyuan He as code owners @shiyu1994 (#4717)
- [docs] fix R API link to point to the current version of docs @StrikerRUS (#4691)
- [docs] update comment about pre-installed Java version for SWIG build @StrikerRUS (#4710)
- [docs] fix C API docs rendering @StrikerRUS (#4688)
- [docs] Add avatar to Yu Shi in R docs @StrikerRUS (#4690)
🧰 Maintenance
- release v3.3.1 @jameslamb (#4715)
- [ci] introduce CI jobs that mimic CRAN gcc-ASAN and clang-ASAN tests (fixes #4674) @jameslamb (#4678)
- [R-package][test] add reference to the original issue in R-package test @StrikerRUS (#4720)
- [python] Improve error message for plot_metric with Booster @js850 (#4709)
- [R-package] allow for small numerical differences in Booster test @jameslamb (#4714)
- Fix some parameter hints when loading from binary file @hzy46 (#4701)
- [ci] fix CI Windows script to use downloaded SWIG, not the pre-installed one @StrikerRUS (#4711)
- [ci] Rename RTD config file @StrikerRUS (#4689)
- [ci] Bump Google Test version from `1.10.0` to `1.11.0` @StrikerRUS (#4683)
- [python] fix mypy error in engine.py @rakki-18 (#4675)
- [python] fix mypy error in setup.py @rakki-18 (#4671)
- update Guolin's e-mail in `setup.py` @StrikerRUS (#4663)
- Replace deprecated `licenseUrl` field with `license` one in `.nuspec` file @StrikerRUS (#4669)
- [python][sklearn] use `__sklearn_is_fitted__()` in all estimator fitness checks @StrikerRUS (#4654)
- [ci] Bump version for development @StrikerRUS (#4662)
v3.3.0
Changes
💡 New Features
- allow inclusion in C programs @drewmiller (#4608)
- add param aliases from scikit-learn @StrikerRUS (#4637)
- [python] add placeholders to titles in plotting functions @StrikerRUS (#4614)
- [python-package] Support 2d collections as input for `init_score` in multiclass classification task @jmoralez (#4150)
- [python] add parameter object_hook to method dump_model @xadupre (#4533)
- [python] support Dataset.get_data for Sequence input. @cyfdecyf (#4472)
- [python] allow to pass some params as pathlib.Path objects @StrikerRUS (#4440)
- [python-package] Create Dataset from multiple data files @cyfdecyf (#4089)
- [dask] add support for eval sets and custom eval functions @ffineis (#4101)
- Add linear leaf models to json output (fixes #4186) @btrotta (#4329)
- [dask] run Dask tests on aarch64 architecture @StrikerRUS (#3996)
- [python] handle arbitrary length feature names in Python-package @StrikerRUS (#4293)
- Precise text file parsing @cyfdecyf (#4081)
- added aliases to params @StrikerRUS (#4205)
- [swig] add wrapper for LGBM_DatasetGetFeatureNames @shuttie (#4103)
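Two entries in the list above add parameter aliases. As a rough picture of what alias handling involves, the sketch below maps a user-supplied parameter dictionary onto main parameter names. The alias table is a small illustrative subset, and the precedence rule (the main name wins, then aliases in listed order) is an assumption of this sketch rather than LightGBM's documented behaviour. `resolve_aliases` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Illustrative subset only: main parameter name first, then some of its aliases.
ALIASES: Dict[str, List[str]] = {
    "num_iterations": ["num_iterations", "num_boost_round", "n_estimators"],
    "learning_rate": ["learning_rate", "eta"],
}


def resolve_aliases(params: Dict[str, Any]) -> Dict[str, Any]:
    """Collapse aliases onto main parameter names (hypothetical helper)."""
    resolved: Dict[str, Any] = {}
    claimed = set()
    for main_name, names in ALIASES.items():
        present = [name for name in names if name in params]
        claimed.update(present)
        if present:
            # Assumed precedence: the earliest entry in the alias list wins,
            # independent of the order in which the user passed the keys.
            resolved[main_name] = params[present[0]]
    for key, value in params.items():
        if key not in claimed:
            resolved[key] = value  # unrelated parameters pass through unchanged
    return resolved


user_params = {"eta": 0.1, "n_estimators": 50, "num_leaves": 31}
resolved_params = resolve_aliases(user_params)
```

Fixing the precedence in one place keeps the result independent of dictionary ordering, so the same inputs always resolve to the same values.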
🔨 Breaking
- [python] deprecate "auto" value of `ylabel` argument of `plot_metric()` function @StrikerRUS (#4624)
- [python] rename `print_evaluation()` into `log_evaluation()` @StrikerRUS (#4604)
- [RFC][python] deprecate advanced args of `train()` and `cv()` functions and sklearn wrapper @StrikerRUS (#4574)
- [RFC][python] deprecate `silent` and standalone `verbose` args. Prefer global `verbose` param @StrikerRUS (#4577)
- [python] add 'auto' value for `importance_type` param in plotting @StrikerRUS (#4570)
- [dask] Make output of feature contribution predictions for sparse matrices match those from sklearn estimators (fixes #3881) @jameslamb (#4378)
- [R-package] change default nrounds to 100 to match LightGBM core library default @david-cortes (#4197)
🚀 Efficiency Improvement
- simplify and speed up comparisons for splits with identical gains @jameslamb (#4542)
- factor out .size() checks in GetDataType() @jameslamb (#4541)
- consolidate duplicate conditions in TextReader @jameslamb (#4530)
- [python] replace numpy.zeros with numpy.empty for the speedup @StrikerRUS (#4410)
- [R-package] avoid unnecessary computation of std deviations in lgb.cv() @jameslamb (#4360)
- Replace division of exponential in Gamma loss @lorentzenchr (#4289)
🐛 Bug Fixes
- [R-package] fix segfaults caused by missing Booster and Dataset handles (fixes #4208) @jameslamb (#4586)
- move Network method implementations from network.h to network.cpp (fixes #4464) @jameslamb (#4496)
- [R-package] prevent memory leak if pointer fails to allocate @david-cortes (#4613)
- [R-package] Fix R memory leaks (fixes #4282, fixes #3462) @david-cortes (#4597)
- [python][sklearn] respect `eval_at` aliases in keyword arguments @StrikerRUS (#4599)
- [dask] Fixed Dask type annotation @StrikerRUS (#4558)
- [R-package] allow construction of Dataset from CSV without header (fixes #4553) @jameslamb (#4554)
- [R-package] fix OpenMP checking on macOS (fixes #4131) @jameslamb (#4507)
- [R-package] pass R-configured compiler flags to checks in configure @jameslamb (#4506)
- [R-package] use C++ compiler for pre-compile checks on Windows @jameslamb (#4504)
- [dask] find all needed ports in each host at once (fixes #4458) @jmoralez (#4498)
- Fix undefined behavior with NaN input in CategoricalDecision() @hcho3 (#4468)
- [dask] determine output shape of array in predict (fixes #4285) @jmoralez (#4351)
- [fix] fix Reservoir Sampling in Sample of random.h (fix #4371 and #4134) @shiyu1994 (#4450)
- [CUDA] fix CUDA memory error by reducing block number (#4315) @RobinDong (#4327)
- [R-package] fix protection stack imbalance and unprotected objects (fixes #4390) @fabsig (#4391)
- [dask] pass additional predict() parameters through when input is a Dask Array @jameslamb (#4399)
- fix param aliases @StrikerRUS (#4387)
- sync for init score of binary objective function @loveclj (#4332)
- Fix undefined behavior in ArrayArgs::Partition() when interval size is 1 (fixes #4272) @kruda (#4280)
- Log warning instead of fatal when parsing float get under/overflow. @cyfdecyf (#4336)
- [fix] fix Sample when sampling only one element (fix #4134) @shiyu1994 (#4324)
- [R-package] move more finalizer logic into C++ side to address memory leaks @jameslamb (#4353)
- [tests][python] fix f-string in test_dask.py @StrikerRUS (#4373)
- [fix] skip empty bins when calculating cnt_in_bin in BinMapper::FindBin (fix #4301) @shiyu1994 (#4325)
- [fix] fix GatherInfoForThresholdNumerical boundary (fix #4286) @shiyu1994 (#4322)
- fix calculation of weighted gamma loss (fixes #4174) @mayer79 (#4283)
- [R-package] prevent symbol lookup conflicts (fixes #4045) @jameslamb (#4155)
- [R-package] avoid misleading warnings when using interaction constraints (fixes #4108) @jameslamb (#4232)
- [fix] Fix bug in data distributed learning with local empty leaf @shiyu1994 (#4185)
- fix: Dataset::CreateValid init fields which saves to binary. @cyfdecyf (#4177)
📖 Documentation
- [docs] add Mars to docs @StrikerRUS (#4616)
- [docs] update link to MinGW-w64 site @StrikerRUS (#4606)
- [docs] add lightgbm_ray to docs @jameslamb (#4584)
- [docs][python] Refer to functions as `callable` in docstrings @StrikerRUS (#4575)
- [R-package] fix warnings in demos @jameslamb (#4569)
- [R-package] fix warnings in examples @jameslamb (#4568)
- [python][docs] Refer to string type as `str` in docstrings @StrikerRUS (#4565)
- [docs] add José Morales to repo maintainers @StrikerRUS (#4563)
- [docs] update links to SynapseML (former MMLSpark) @StrikerRUS (#4564)
- [python][docs] Refer to string type as `str` and add commas in `list of ...` types @StrikerRUS (#4557)
- [docs][python] Improve description of `eval_result` argument in `record_evaluation()` @StrikerRUS (#4559)
- [doc] Add link to Neptune hyperparam tuning guide @Blaizzy (#4529)
- [docs] Update link to `daal4py` in README @StrikerRUS (#4532)
- [docs] Add notes in installation guide, including ones about OpenMP @StrikerRUS (#4520)
- [docs] [R-package] use CRAN-style builds when building pkgdown site @jameslamb (#4513)
- [docs] Update link to mlr3-compliant interface in README @StrikerRUS (#4509)
- [docs] document CLI behavior when label_column is omitted @jameslamb (#4485)
- [docs] clarify description of prediction early stopping @StrikerRUS (#4411)
- [docs][python] add versionadded to Sequence class in Python wrapper @StrikerRUS (#4441)
- [docs] add lleaves to README @StrikerRUS (#4431)
- [docs] Add shapash to the list of related projects @StrikerRUS (#4408)
- [docs] update link to LightGBM example in MMLSpark repo @StrikerRUS (#4401)
- [docs][R-package] add authors in R-package description @StrikerRUS (#4395)
- fix: typo in python class _InnerPredictor docstring @cyfdecyf (#4389)
- [dask] Dask Vector types for group, init_score, sample_weights (fixes #4375) @ffineis (#4380)
- [docs] document sanitizers @StrikerRUS (#4365)
- [docs][python] enhance `keep_training_booster` param description @StrikerRUS (#4364)
- [docs] add anchor for nightly builds in docs @StrikerRUS (#4366)
- [docs] document how to pass multi-value params from Python and R (fixes #4345) @jameslamb (#4346)
- [docs] make building of C++ tests section collapsable @StrikerRUS (#4340)
- [docs] replace broken mmlspark notebook link in docs @jameslamb (#4303)
- [docs] clarify docs for LGBM_BoosterGetEvalNames and LGBM_BoosterGetEvalCounts (fixes #4264) @jameslamb (#4270)
- [docs][R-package] update docs on C++ interface @jameslamb (#4257)
- [docs][python] update some docs related to custom objective @StrikerRUS (#4245)
- [docs][python][scikit-learn] added note for LGBMRanker @StrikerRUS (#4243)
- [docs] fix broken MS MPI link in Installation Guide @jameslamb (#4224)
- [R-package] clarify parameter documentation (fixes #4193) @jameslamb (#4202)
- [docs][R-package] Update the explanation of num_threads (fixes #4192) @issactoast (#4199)
- [docs] add working dir to R package docker run examples @jameslamb (#4190)
- [docs] fix markdown in docs @StrikerRUS (#4191)
- [docs] Add changes to gcc-tips @akshitadixit (#4187)
- [docs] bring back macOS installation method with Homebrew formula in docs @StrikerRUS (#4182)
🧰 Maintenance
- v3.3.0 release (fixes #4310) @jameslamb (#4633)
- fix possible precision loss in xentropy and fair loss objectives @jameslamb (#4651)
- [tests][python-package] refactor list_to_1d_numpy test to run without pandas installed @jmoralez (#4639)
- [python] add type hints to _safe_call @strobelTha (#4641)
- remove unused DCGCalculator::CalDCGAtK() @jameslamb (#4650)
- [python][sklearn] add `__sklearn_is_fitted__()` method to be better compatible with scikit-learn API @StrikerRUS (#4636)
- [ci] Use the latest gcc version in macOS CI jobs @StrikerRUS (#4640)
- remove duplicated debug printing in `CMakeLists.txt` for MPI @StrikerRUS (#4644)
- remove unused BinMapper::SizeForSpecificBin() @jameslamb (#4643)
- [ci] ignore certificates for kitware apt channel in CUDA jobs (fixes #4646) @jameslamb (#4648)
- [ci] bump CUDA version from 11.4.0 to 11.4.2 at CI @StrikerRUS (#4628)
- [R-package] introduce Dataset methods set_field() and get_field() @jameslamb (#4571)
- [ci] Recover running CUDA tests at CI (fixed #4611) @shiyu1994 (#4621)
- [ci] Run cmakelint at CI and fix some errors @StrikerRUS (#4617)
- [python] initialize installation options with boolean values in `setup.py` @StrikerRUS (#4620)
- [python] fix mypy error in `dask.py` @StrikerRUS (#4615)
- [ci] Stop running CUDA tests at CI @StrikerRUS (#4611)
- [R-package] avoid unnecessary computation and add tests for Dataset set_reference() method @jameslamb (#4587)
- [ci] fix link to LightGBM public e-mail @StrikerRUS (#4603)
- [tests][dask] Use workers hostname in tests (fixes #4594) @jmoralez (#4595)
- p...