Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WB-7886] Add CatBoost Integration #2975

Merged
merged 35 commits into from
Feb 17, 2022
Merged

[WB-7886] Add CatBoost Integration #2975

merged 35 commits into from
Feb 17, 2022

Conversation

ayulockin
Copy link
Member

@ayulockin ayulockin commented Dec 1, 2021

Fixes WB-7886
Fixes #965

Description

The PR adds a simple WandbCallback for CatBoost.

The PR currently enables:

  • logging iteration
  • logging train and validation metrics

Testing

yea test and manually tested.

ayulockin and others added 6 commits November 17, 2021 23:01
* improved xgboost wandb_callback

* log config, better logging of metrics, typo

* cleaniness is good

* add feature importance plotting

* best score + iteration

* deprecated callback, docstring, fixes

* define_metric, rename, fixes

* test added

* define metrics fixes

* add command to yea
@codecov
Copy link

codecov bot commented Dec 1, 2021

Codecov Report

Merging #2975 (37cc0de) into master (55b885c) will decrease coverage by 0.20%.
The diff coverage is 92.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2975      +/-   ##
==========================================
- Coverage   80.15%   79.95%   -0.21%     
==========================================
  Files         210      213       +3     
  Lines       27818    27872      +54     
==========================================
- Hits        22298    22285      -13     
- Misses       5520     5587      +67     
Flag Coverage Δ
functest 56.62% <89.06%> (-0.56%) ⬇️
unittest 69.58% <9.37%> (-0.13%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
wandb/sdk/data_types.py 84.34% <ø> (ø)
wandb/integration/catboost/catboost.py 90.56% <90.56%> (ø)
wandb/__init__.py 91.34% <100.00%> (-0.25%) ⬇️
wandb/catboost/__init__.py 100.00% <100.00%> (ø)
wandb/integration/catboost/__init__.py 100.00% <100.00%> (ø)
wandb/integration/lightgbm/__init__.py 95.45% <100.00%> (ø)
wandb/sdk/internal/profiler.py 95.00% <100.00%> (ø)
wandb/sdk/wandb_artifacts.py 81.68% <100.00%> (ø)
wandb/util.py 85.58% <100.00%> (-0.13%) ⬇️
wandb/integration/metaflow/metaflow.py 52.29% <0.00%> (-32.76%) ⬇️
... and 8 more

@dmitryduev dmitryduev self-requested a review December 3, 2021 18:45
Copy link
Member

@morganmcg1 morganmcg1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good so far, the new XGB callback code is included here too, that should be kept in a separate PR.

There is a log_function function to be added too right?

train_pool = Pool(train[features], label=train['label'], cat_features=cat_features)
test_pool = Pool(test[features], label=test['label'], cat_features=cat_features)

model = CatBoostRegressor(iterations=100,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, maybe move the iterations argument to a new line

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 In general, please re-format the example code. I'd copy it into a dummy file, the run tox -e format -- dummy.py and copy back into docstrings.

wandb/integration/catboost/catboost.py Outdated Show resolved Hide resolved
Copy link
Member

@dmitryduev dmitryduev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ayulockin! Mostly LGTM, a few minor comments.

train_pool = Pool(train[features], label=train['label'], cat_features=cat_features)
test_pool = Pool(test[features], label=test['label'], cat_features=cat_features)

model = CatBoostRegressor(iterations=100,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 In general, please re-format the example code. I'd copy it into a dummy file, the run tox -e format -- dummy.py and copy back into docstrings.

wandb/integration/catboost/catboost.py Outdated Show resolved Hide resolved
wandb/integration/catboost/catboost.py Outdated Show resolved Hide resolved
- :wandb:runs[0][summary][learn-MultiClass]
- 0.0
- :wandb:runs[0][exitcode]: 0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting

functional_tests/catboost/test_catboost.py Outdated Show resolved Hide resolved
@dmitryduev
Copy link
Member

@raubitsj looking at wandb/proto/wandb_telemetry.proto, seems like the stuff to enable telemetry for catboost is already there, is that right?

@ayulockin ayulockin marked this pull request as ready for review December 8, 2021 21:32
mypy.ini Show resolved Hide resolved
wandb/sdk/data_types.py Outdated Show resolved Hide resolved
dmitryduev and others added 2 commits February 16, 2022 12:54
Co-authored-by: Katia Patkin <87335417+kptkin@users.noreply.github.com>
@raubitsj
Copy link
Member

Thanks @ayulockin! Mostly LGTM, a few minor comments.

This is the type of changes needed for telemetry:
80afe3f

Let me know if you need help, hopefully this is a clean change to model after.

@raubitsj
Copy link
Member

@raubitsj looking at wandb/proto/wandb_telemetry.proto, seems like the stuff to enable telemetry for catboost is already there, is that right?

Responded at: #2975 (comment)

@dmitryduev
Copy link
Member

@raubitsj: I added feature usage tracking to telemetry + testing that with yea as you requested. Passes locally and the CI is green, so merging.
Many thanks for this once again, @ayulockin!!

@dmitryduev dmitryduev merged commit f82669a into wandb:master Feb 17, 2022
@ayulockin
Copy link
Member Author

Thanks @dmitryduev for shipping it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Enhancement] CatBoost support in W&B
5 participants