13 Apr 10:45

github-actions

0bcf252

1.2.8 Latest

Python package

Support Python 3.13. #2748. Thanks to @jeremy010203.
Support NumPy 2.x. #2671
Drop support for obsolete Python 3.7.
Use the proper name of the implementation class as a string id when storing values calculated for custom metrics on GPU. #1792
Propagate exceptions from custom metrics code on GPU. #1792

CatBoost for Apache Spark

Fix workers hanging after training. #2151. Thanks to @Shamann.
Remove support for Spark 2.x

Improvements

[R-package] Allow targets of character and factor types (useful for classes). #1874
Better default leaf_estimation_iterations for Tweedie regression on GPU. #2812

Build & testing

Switch to external Cython 3.0.10+ instead of 0.29.x-based version from contrib. #2810
Switch to Conan 2.x. #2582
[CUDA]. Do not output detailed ptxas statistics by default.
Update the OpenSSL version used to 3.0.15.

Bugfixes

[JVM applier]. Methods related to evaluator types had been made private by mistake.
[JVM applier]. Categorical features hashing methods had been made private by mistake.
Fix crash when training on a quantized dataset that contains categorical features. #2816
Fix prediction of type Probability on CPUs that do not have the SSE4 instruction set (this includes all ARM CPUs). Values with probability 0 were erroneously computed as nan.
Fix race condition when loading sparse datasets.

Contributors

Shamann and jeremy010203

07 Dec 11:22

andrey-khropov

node-package-v1.26.0

b46eb8c

Node package release 1.26.0

(uses catboostmodel native libraries from the main CatBoost release v1.2.7)

Fix MultiClassification models support. #1903
Fix predict on GPU. #1901, #1923.
Make specifying categorical features parameter optional.
Support text and embedding features. #2523
Add support for 'MultiProbability' prediction type
Support Linux aarch64
Support macOS arm64
Support Windows x86_64

07 Sep 20:11

github-actions

v1.2.7

f903943

1.2.7

Bugfixes

[R-package]: Restore basic functionality.

Build & testing

[GPU] Return configuration for multi-node GPU training with CMake-based build. See documentation.

05 Sep 10:59

github-actions

v1.2.6

e432431

1.2.6

⚠️ R-package is broken in this release. Please use release 1.2.7+

Major changes

CatBoost open source build, test and release infrastructure has been switched to GitHub actions. It is possible to run it if you fork CatBoost repository as well. See the announcement for details.

Python package

Adapt numpy dependency specification to prohibit numpy >= 2.0 for now. #2671

New features

User-defined metric GPU evaluation for task_type=GPU. Thanks to @pnsemyon.
GPU Custom objective support. Thanks to @pnsemyon.
[C/C++ applier]. APT_MULTI_PROBABILITY prediction type is now supported. #2639. Thanks to @aivarasbaranauskas.
GroupQuantile metric
Aggregated graph features

Build & testing

[Windows]: Visual Studio 2022 with MSVC toolset 14.29.30133 is now supported. #2302

Speedups

[GPU]: Increase block size in QueryCrossEntropy (~3x faster on A100 for 6M samples, 350 features, query size near 1).

Improvements

[datasets] Use mkstemp to replace deprecated mktemp. #2660. Thanks to @fatmo666
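
The move from mktemp to mkstemp matters because tempfile.mktemp only returns a name, which leaves a window in which another process can create the file first. A minimal sketch of the safer pattern follows; the helper name write_temp_file is hypothetical and is not taken from the CatBoost sources.

```python
import os
import tempfile


def write_temp_file(data: bytes, suffix: str = ".tmp") -> str:
    """Write data to a securely created temporary file and return its path.

    Hypothetical helper for illustration only; not CatBoost code.
    """
    # mkstemp atomically creates the file and returns an open OS-level
    # descriptor, so no other process can claim the name before first use.
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:  # fdopen takes ownership of fd
            f.write(data)
    except Exception:
        os.remove(path)  # do not leave a partial file behind
        raise
    return path
```

The caller is responsible for deleting the returned file once it is no longer needed.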

Bugfixes

[C/C++ applier]. Add missed PredictSpecificClassFlat to calcer.exports. #2715
[Linux]. Restore readable backtraces
[GPU] Make CUDA_MAX_THREADS_PER_SM cuda arch-specific
[JVM applier][Windows]: Fixed bloating temp directory with copies of native libraries on Windows. #2622. Thanks to @DKARAGODIN.
Calculate F1, Precision, and Recall for all labels in multi-label classification
Synchronize values of NCB::NModelEvaluation::EPredictionType and EApiPredictionType. #2643
Fix sign of 2nd derivative for Tweedie loss
Fix 'Can't find borders for feature ...' error when using text features on GPU. #2657
Fix indexing of tokenized text features in model saver and dataset loader when some features are ignored
Fix descent direction for Cox regression. #2701
Fix GetTreeNodeToLeaf in multidimensional case (fixes plot_tree for multidimensional approx with non-oblivious trees). #2668

Contributors

aivarasbaranauskas, DKARAGODIN, and 2 other contributors

18 Apr 20:19

andrey-khropov

v1.2.5

2605fe6

1.2.5

New features

[Python-package]: Support custom eval metrics on GPU. #1792. Thanks to @pnsemyon.

Bugfixes

[Python-package]: Check eval_period parameter validity for staged prediction. #2593
[Python-package]: Fix _CustomLoggersStack.pop logic. #2620
[R-package]: Fix Caret object: Inconsistent grid creation with documentation. #2609
[JVM applier]: Fix issues with exposing undesired symbols in JNI shared libraries (including allocators) on macOS. #2606
Fix training with embedding features on GPU. #2249, #2308, #2591
Fix training with text features on GPU
Use correct sample count in MultiRMSE on multiple GPUs. #2557
Fix sign of 2nd order derivative in Huber loss
Enable gradient walker for non-additive metrics
Fixes for Cox objective: buffer overflow in derivatives calculation, derivatives summation, metric calculation, disable ordered boosting
Fix text features data serialization in the model files

Contributors

pnsemyon

23 Feb 14:10

andrey-khropov

v1.2.3

fe0941b

1.2.3

Python package

Support Python 3.12. #2510
[Performance]: Fix ineffective loops in Cython. Significant speedups (up to 3x) on dataset construction from data in C-order can be expected.
[Performance]: Make features data initialization from C-order numpy.ndarrays with float32 data type multithreaded. Significant speedups of 5x up to 10x (on CPUs with many cores) can be expected. #385, #2542
Save training metrics into the model metadata, so the best_score_, evals_result_, and best_iteration_ model attributes now work after saving and loading a model. The metrics can be removed by model metadata manipulation if needed. #1166
[Breaking change]. Support a separate boolean target type. Class predictions for models that have been trained with boolean targets are now also boolean instead of the True, False strings used before. Such models are incompatible with previous versions of CatBoost appliers. If you want the old behavior, convert your target to False, True strings before training. #1954
Restrict jupyterlab version for setup to 3.x for now. Fixes #2530
utils.read_cd: Support CD files with non-increasing column indices.
Make log_cout, log_cerr specification consistent, avoid reset in recursive calls.
Late-initialize default values for log_cout, log_cerr. #2195
Add missing generated metrics: Cox, PairLogitPairwise, UserPerObjMetric, SurvivalAft.
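
For the breaking change around boolean targets, the note above suggests converting the target to False, True strings to keep the old behavior. A minimal sketch of that conversion in plain Python, assuming the labels arrive as a sequence of booleans; the helper name bool_labels_to_strings is hypothetical, and the training call itself is omitted.

```python
from typing import Iterable, List


def bool_labels_to_strings(labels: Iterable[bool]) -> List[str]:
    """Map boolean labels to the strings 'False' / 'True'.

    Hypothetical helper for illustration only; not part of the CatBoost API.
    """
    return [str(bool(v)) for v in labels]


y = [True, False, True]
# Pass y_str instead of y as the training label to get string class predictions.
y_str = bool_labels_to_strings(y)
```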

New features

Support boolean target/labels type during training in Python and Spark (in the latter case only when using fit with Pool arguments) and Class prediction in Python. #1954
[Spark]: Support Spark 3.5.x.
[C/C++ applier]. Add functions for getting indices of features of different types to C and C++ API. #2568. Thanks to @nimusp.
[C/C++ applier]. Add staged prediction functions to C API. #2584. Thanks to @Mb-NextTime.
[JVM applier]. Add loading CatBoostModel from a byte array to API. #2539
[Linux] Support CgroupsV2 when computing default number of threads used in parallel computations. #2519. Thanks to @elukey.
[CLI] Support printing Auxiliary columns by name in evaluation result output. #1659
Save training metrics into the model metadata. Can be removed by model metadata manipulation if needed. #1166
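
Under cgroups v2, a container's CPU quota is exposed in the cpu.max file as "<quota> <period>", or "max <period>" when unlimited. The sketch below shows one way a default thread count could be derived from it. It is an illustration under that assumption, not CatBoost's actual implementation; the function names and the /sys/fs/cgroup/cpu.max default path are assumptions.

```python
import math
import os
from typing import Optional


def cpu_limit_from_cpu_max(content: str) -> Optional[int]:
    """Return the CPU count implied by cgroup v2 cpu.max content.

    Returns None when the quota is unlimited or the content is malformed.
    """
    parts = content.split()
    if len(parts) != 2 or parts[0] == "max":
        return None
    try:
        quota, period = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    if quota <= 0 or period <= 0:
        return None
    # Round up so that a fractional quota (e.g. 1.5 CPUs) still gets 2 threads.
    return max(1, math.ceil(quota / period))


def default_thread_count(cpu_max_path: str = "/sys/fs/cgroup/cpu.max") -> int:
    """Pick a thread count no larger than the cgroup quota or the host CPU count."""
    host_cpus = os.cpu_count() or 1
    try:
        with open(cpu_max_path) as f:
            limit = cpu_limit_from_cpu_max(f.read())
    except OSError:
        limit = None  # file missing: not running under cgroups v2
    return host_cpus if limit is None else min(host_cpus, limit)
```

Rounding the quota up rather than down is a design choice in this sketch; rounding down would be the more conservative option.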

Build & testing

[Windows]: Use clang-cl compiler and tools from Visual Studio 2022 for the build without CUDA (build with CUDA still uses standard Microsoft toolchain from Visual Studio 2019).
[macOS]: Pass os.version to conan host settings to ensure version consistency.
[Linux aarch64]: Set -mno-outline-atomics for modern versions of Clang and GCC to avoid unresolved symbol linking errors. #2527
Added missing CMakeLists for unit tests for util. #2525

Bugfixes

[Performance]: Fix performance regression that could slow down training on GPU by 50% on some datasets that had been introduced in release 1.2. Thanks to @JeanPaulShapo.
[Python-package]: Fix segfault on Pool(data=None). #2522
[Python-package]: Fix Python exception in Pool() when pairs_weight is a numpy array. #1913
[Python-package]: Fix segfault and other strange errors when specifying custom logger with __call__ method. #2277
[Python-package]: Fix returning complex params in hyperparameter search. #1741, #1833
[Python-package]: Fix ignored exceptions for missed metrics descriptions on startup. This has not been visible to users but has been making debugging more difficult.
[Python-package]: Fix misleading 'Targets are required for YetiRank loss function.' error in cross-validation. #2083
[Python-package]: Fix Pool.get_label() returning constant True for boolean labels. #2133
[Python-package]: Copying models does not lose best_score_, evals_result_, best_iteration_ attributes values anymore. #1793
[Spark]: Fix hangs at the end of the training. #2151
Precision metric default value in the absence of positive samples is changed to 0, and a warning is added (similar to the behavior of the scikit-learn implementation). #2422
Fix ignoring embedding features
Try to avoid hash collisions when computing group ids for datasets with a lot of groups (may occur in datasets with around 10^9 samples).
Fix Multiclass models export to C++ and Python code. #2549
Fix dataset_statistics mode when no Target data is available.
Fix 'Error: can't proceed some features' error on GPU. #1024
Fix allow_const_label=True for classification. #1933
Add checking of approx and target dimensions for SurvivalAft objective/metric.
Fix Focal loss derivatives sign. #2563
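
The Precision change listed above can be pictured as follows. This is a sketch of the described behavior (return 0 and warn when precision is undefined) in plain Python, not CatBoost's implementation. It assumes that "absence of positive samples" means there are no positive predictions, as in scikit-learn's zero-division handling; the function name is hypothetical.

```python
import warnings
from typing import Sequence


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary precision that falls back to 0.0 with a warning when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t != 1)
    if tp + fp == 0:
        # No positive predictions: tp / (tp + fp) would divide by zero.
        warnings.warn("Precision is undefined without positive predictions; returning 0.")
        return 0.0
    return tp / (tp + fp)
```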

Contributors

elukey, JeanPaulShapo, and 2 other contributors

19 Sep 20:01

andrey-khropov

v1.2.2

e888c31

1.2.2

Bugfixes

Fix Segmentation fault when using custom eval_metric in binary python packages of version 1.2.1 on PyPI. #2486
Fix LossFunctionChange fstr with embedding features.
Fix a segmentation fault in JVM applier when using embedding features on JVM 11+.
Fix CTR data handling in model summation (especially for models with CTRs with multiple target quantizations).

28 Aug 20:57

andrey-khropov

v1.2.1

d03b246

1.2.1

New features

Allow optimizing specific ranking loss functions with YetiRank and YetiRankPairwise by specifying the mode parameter. See the paper "Which Tricks are Important for Learning to Rank?" for details (this family of losses is called YetiLoss there). CPU-only for now.
Add Kernel Gradient Boosting support (use catboost.sample_gaussian_process function). #2408, thanks to @TakeOver. See Gradient Boosting Performs Gaussian Process Inference paper for details.
LambdaMart loss: support new target metrics MRR, ERR and MAP.
StochasticRank loss: support new target metrics ERR and MRR.
Support MultiRMSE on GPU. #2264, #2390
Load JSON model format in Java Client. #1627, thanks to @timotta
Implement exporting of Multiclass models to C++ and Python. #2283, thanks to @antoninkriz

Improvements

Speedup BM25 feature calcers 3x
Use int instead of deprecated numpy.int. #2378
Add ModelCalcerWrapper::CalcFlatTransposed, #2413 thanks to @faucct
Update dependencies to avoid known vulnerabilities

Bugfixes

Fix __shfl_up_sync mask. #2339
TFocalMetric negative values fix. #2386, thanks to @diditforlulz273
Focal loss: Use user-defined alpha and gamma
Fix exception propagation: Rethrow exceptions caused by user's python code as C++ exceptions
CatBoost trained with user defined objective was incompatible with ShapValues calculation
Avoid nan's in Newton step calculation for RMSEWithUncertainty
Fix score method for y with shape (N, 1). #2405
Fix scalePosWeight support for Spark. #2470

Contributors

timotta, TakeOver, and 3 other contributors

01 May 23:11

andrey-khropov

v1.2

9b286dc

1.2

Release 1.2

Major changes

CatBoost's build system has been switched from Ya Make (Yandex's build system) to CMake. This means more transparency in the build process and more familiar tools for Open Source developers.
For now it is possible to build CatBoost for:

Linux on x86-64 with or without CUDA
Linux on aarch64 with or without CUDA
macOS on x86-64 and arm64, including creating universal binaries
Windows on x86-64 with or without CUDA
Android (only model applier) on all supported ABIs.

This allowed us to prepare the Python package in the source distribution form (also known as sdist). #830

msvs subdirectory with the Microsoft Visual Studio solution has been removed. Visual Studio solutions can be generated using CMake instead.
make subdirectory with Makefiles has been removed. Use CMake + ninja (recommended) or CMake + make instead.

Python package

Switch to the standard Python build and installation method that uses setup.py instead of the custom mk_wheel.py script. All common scenarios (sdist, build, install, editable install, bdist_wheel) are supported.
Switch wheel platform tag on Linux from obsolete manylinux1 to manylinux2014.
The source distribution is now available on PyPI. #830
Wheels for Linux aarch64 are now available on PyPI. #2091
Support Python 3.11. #2213
Drop support for obsolete Python 3.6.
Make wheels PEP427-compliant. #2165
Fix wrong checksums in wheels that caused problems with poetry. #2331
Improved performance due to caching TBB local executors. #2203
Add fixed_binary_splits to the regressor, classifier, and ranker.
Compatibility with pandas 2.0. #2320
CatBoost widget is now compatible with ipywidgets 8.x. #2266

Rust package

Support CUDA applier. #1925, thanks to @getumen.
Properly forward debug/release setting to native library build.
Passing features: switch from String and Vec types for features to AsRef of slices to make code more generic
Support text and embedding features.
Support multidimensional output in predictions.

New features

[JVM applier]: Support CUDA.
[Spark]: Support Spark 3.4.x (if you want to use Spark with python 3.11 use this version).
Static model applier library now works on Windows.
Add binary-classification-threshold parameter to the CLI model applier.
Support Multi-target regression with text features (but only Bag-of-Words features are generated for now). #2229
Support RMSEWithUncertainty loss function on GPU.
Support MultiLogloss and MultiCrossEntropy loss functions with numerical features on GPU.
Support MultiLogloss loss function with text features on CPU and GPU. #1885
Enable univariate metrics for models with uncertainty
Add Focal loss (CPU-only for now). #1807, thanks to @diditforlulz273.

Improvements

Removed legacy dependency on Python 2 interpreter in the build process. #2297
Calc metrics: Throw catboost exception if column index exceeds column count.
Speedup MultiLogloss on CPU by 8% per tree (110K samples, 20 targets, 480 float features, 3 cat features, 16 cores CPU).
Update .NET projects from obsolete .NET Core 2.1 to .NET Core 3.1.
Code generation for new CUDA Compute Architectures 8.6, 8.9 and 9.0 is enabled by default (requires CUDA 11.8 to build from source).
Check that evaluator implementation is available in TFullModel::SetEvaluatorType (it was possible to get a Segmentation fault when calling it for a non-available implementation). Add TFullModel::GetSupportedEvaluatorTypes.
Cross Validation on GPU no longer requires allow_write_files=True.

Bugfixes

[Python-package]: Clear model params before load_model. Fixes #2225.
[Python-package]: Fix CatBoostRanker score computation. #2231
[Python-package]: Fix _get_embedding_feature_indices. #2273
[Python-package]: Fix set_feature_names with text or embedding features. #2090
[Python-package]: pandas.Categorical.categories is not necessarily a numpy.ndarray. #1965
[Spark]: Pass classpath in a file to avoid hitting cmdline length limits. #1842
[CUDA Applier]: Apply scale and bias.
[CUDA Applier]: Fix that libs/model_interface applier always produced an error in CUDA mode.
Fix CUDA error 700 in pairwise ranking.
Fix kernel registration for distributed training on GPU.
Fix 'floating point exception' on CPU for small datasets on GPU.
Fix wrong log message 'There are invalid params and some of them will be ignored'. #2253
Fix incorrect results and crashes for GPU applier on Nvidia Ampere-based GPUs.
Fix 'CUDA error 9' in Multi-GPU training.
Fix serialization of embedding features structures in the model.
Fix GPU buffer overrun in distributed multi-classification training.
Fix catboost/cuda/cuda_util/sort.cpp:166: CUDA error 9 on Nvidia Ampere-based GPUs.
Fix inf/nan parsing in dataset input files.
Fix floating point exception for very small datasets on GPU.
Fix: built static applier library lacked the part with 'global' objects. #2187
Fix sum of models with categorical features with CTRs.
Fix: model_interface/cmake_example failed build "‘runtime_error’ is not a member of ‘std’". #2324, thanks to @Mandelag.
Fix Segmentation fault in Cross Validation and hyperparameter search functions that use it on GPU.
Fix Segmentation fault in utils.eval_metrics for groupwise metrics when group data has not been specified. #2343
Fix errors when running Cross Validation repeatedly on GPU. #2221

P.S. There's an issue with somewhat unexpected binary size increases. We're investigating it in #2369.

Contributors

getumen, Mandelag, and diditforlulz273

01 Nov 19:55

andrey-khropov

v1.1.1

e125028

1.1.1

Release 1.1.1

New features

Support building for Linux on aarch64 from sources using CMake (no prebuilt binaries or PyPI packages yet). #1981
[C/C++ applier] Support embedding features. #2172
[C/C++ applier] Add GetModelUsedFeaturesNames. #2204
[Python] Add text features to utils.create_cd. #2193
[Spark] Full support for Apache Spark 3.3
[Spark] Read/write PySpark's DataFrame-like API for Pool. #2030
[Spark] Allow specifying trainingDriver and worker listening ports. #2181

Bugfixes

Fix prediction dimension check for RMSEWithUncertainty and MultiQuantile. #2155
[C/C++ applier] Fix segmentation fault in prediction for multiple objects for multiple dimension models.
[JVM applier] Fix catboost-common dependency version in catboost-prediction (Fixes JVM applier on macOS). #2121
[Python] Update for pandas 1.5.0: iteritems -> items (Fixes annoying deprecation warning). #2179
[Python] Fix segmentation fault when target is np.ndarray with dtype=object. #2201
[Python] Fix specifying feature_names in utils.create_cd. #2211

Releases: catboost/catboost
