Website | Documentation | Installation | Release Notes
Fork by Alexey Stytsenko for a Yandex Data Analysis School homework assignment.
CatBoost is a machine learning method based on gradient boosting over decision trees.
Main advantages of CatBoost:
- Superior quality when compared with other GBDT libraries.
- Best in class inference speed.
- Support for both numerical and categorical features (see the minimal training sketch after this list).
- Fast GPU and multi-GPU support for training (compiled binaries and the Python package cover training on a single host; build the command-line MPI version from source to train across several GPU machines).
- Data visualization tools included.
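
A minimal training sketch, assuming the Python package is installed (e.g. `pip install catboost`); the toy data, column index, and parameter values below are illustrative, not taken from this repository:

```python
from catboost import CatBoostClassifier

# Toy data: the second column is categorical and is passed as-is, without manual encoding.
train_data = [
    [1, "rock",  4.5],
    [3, "jazz",  1.2],
    [2, "rock",  7.8],
    [5, "blues", 0.3],
]
train_labels = [0, 1, 0, 1]

model = CatBoostClassifier(
    iterations=50,
    learning_rate=0.1,
    # task_type="GPU",  # in recent package versions, trains on GPU if one is available
    verbose=False,
)
# cat_features lists the indices of categorical columns.
model.fit(train_data, train_labels, cat_features=[1])
print(model.predict([[4, "jazz", 2.1]]))
```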
All CatBoost documentation is available here.
Install CatBoost by following the guide for the Python package, R-package, or command-line version.
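
A quick sanity check that the Python package installed correctly (a minimal sketch; the printed version string will differ between releases):

```python
import catboost
print(catboost.__version__)  # prints the installed package version
```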
Next you may want to investigate the topics below; a short Python sketch of a few of them follows the list:
- Tutorials
- Training modes on CPU and GPU
- Cross-validation
- Implemented metrics
- Parameters tuning
- Feature importance calculation
- Regular and staged predictions
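
A hedged illustration of cross-validation, feature importance calculation, and regular vs. staged predictions with the Python package; the toy data and parameter values are illustrative only:

```python
from catboost import CatBoostClassifier, Pool, cv

# Toy binary-classification data.
train_data = [[i, (i * 7) % 5] for i in range(20)]
train_labels = [i % 2 for i in range(20)]
pool = Pool(train_data, train_labels)

# Cross-validation: averages the chosen metric over folds at every iteration.
cv_results = cv(pool,
                params={"iterations": 20, "loss_function": "Logloss", "verbose": False},
                fold_count=3)
print(cv_results)  # per-iteration mean/std of Logloss over the folds

model = CatBoostClassifier(iterations=20, verbose=False)
model.fit(pool)

# Feature importance calculation.
print(model.get_feature_importance(pool))

# Regular predictions vs. staged predictions (one prediction per 10-iteration step here).
print(model.predict(train_data))
for staged in model.staged_predict(train_data, eval_period=10):
    print(staged)
```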
If you want to evaluate a CatBoost model in your application, read the model API documentation.
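
A minimal sketch of one common path: train in Python, save the model in CatBoost's binary format, and load it again in the serving process (the C/C++ applier described in the model API documentation reads the same saved model file). The file name and data are illustrative:

```python
from catboost import CatBoostClassifier

model = CatBoostClassifier(iterations=20, verbose=False)
model.fit([[0, 3], [4, 1], [8, 1], [9, 1]], [0, 0, 1, 1])
model.save_model("model.cbm")  # binary CatBoost model file

# In the application (or another process): load the saved model and apply it.
applier = CatBoostClassifier()
applier.load_model("model.cbm")
print(applier.predict([[5, 2]]))
```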
- For reporting bugs please use the catboost/bugreport page.
- Ask your question about CatBoost on Stack Overflow.
- Check out help wanted issues to see what can be improved, or open an issue if you want something.
- Add your stories and experience to Awesome CatBoost.
- To contribute to CatBoost, you first need to read the CLA text and state in your pull request that you agree to the terms of the CLA. More information can be found in CONTRIBUTING.md.
- Instructions for contributors can be found here.
The latest news is published on Twitter.
Anna Veronika Dorogush, Andrey Gulin, Gleb Gusev, Nikita Kazeev, Liudmila Ostroumova Prokhorenkova, Aleksandr Vorobev "Fighting biases with dynamic boosting". arXiv:1706.09516, 2017.
Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin "CatBoost: gradient boosting with categorical features support". Workshop on ML Systems at NIPS 2017.
© YANDEX LLC, 2017-2018. Licensed under the Apache License, Version 2.0. See LICENSE file for more details.