xgb: infer metric data names from `evals` and deprecate `metric_data` #587

sisp · 2023-05-27T19:34:49Z

I've added support for logging metrics for multiple data sets/splits (e.g. train and test/eval) with XGBoost. This is consistent with other supported frameworks, e.g. Keras. Comparing metrics across multiple data sets/splits is also a common use case, e.g. to check for overfitting.

This addition is backwards compatible.

I'll create a PR to the docs project once this addition has been approved.

❗ I have followed the Contributing to DVCLive guide.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.

codecov-commenter · 2023-05-28T05:34:43Z

Codecov Report

Patch coverage: 96.00% and project coverage change: +0.05 🎉

Comparison is base (ce68fe1) 89.65% compared to head (85cdbeb) 89.71%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #587      +/-   ##
==========================================
+ Coverage   89.65%   89.71%   +0.05%     
==========================================
  Files          43       43              
  Lines        2938     2955      +17     
  Branches      242      245       +3     
==========================================
+ Hits         2634     2651      +17     
  Misses        264      264              
  Partials       40       40

Impacted Files	Coverage Δ
src/dvclive/xgb.py	`95.83% <87.50%> (+0.83%)`	⬆️
tests/test_frameworks/test_xgboost.py	`96.29% <100.00%> (+1.17%)`	⬆️

sisp · 2023-05-28T06:59:55Z

Or should we remove the metric_data argument and create a subdirectory for each evaluation set? It seems the other framework integrations work like that. It would be a breaking change though.

Alternatively, metric_data could also accept None and default to None, in which case metrics for all evaluation sets would be logged. That would be backwards compatible if we kept omitting the subdirectory for the evaluation set when only one set is used. It seems other framework integrations always create this subdirectory, though, so the XGBoost integration is the exception.
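A minimal sketch of the naming rule proposed above, not the code in src/dvclive/xgb.py. It assumes the nested `{data_name: {metric_name: [values]}}` layout that XGBoost passes to callbacks as `evals_log`; the function name `metric_paths` is hypothetical and exists only for illustration.

```python
from typing import Dict, List, Optional

# Assumed layout of XGBoost's evals_log: {data_name: {metric_name: [values]}}
EvalsLog = Dict[str, Dict[str, List[float]]]


def metric_paths(
    evals_log: EvalsLog, metric_data: Optional[str] = None
) -> Dict[str, float]:
    """Map each metric to the name it would be logged under (hypothetical helper).

    With metric_data=None, every evaluation set is logged. The
    "<data_name>/" subdirectory is omitted when only one set is logged,
    which keeps the old single-set naming unchanged.
    """
    if metric_data is None:
        selected = evals_log
    else:
        # Legacy behaviour: log only the explicitly named evaluation set.
        selected = {metric_data: evals_log[metric_data]}

    use_subdir = len(selected) > 1
    paths: Dict[str, float] = {}
    for data_name, metrics in selected.items():
        for metric_name, values in metrics.items():
            key = f"{data_name}/{metric_name}" if use_subdir else metric_name
            paths[key] = values[-1]  # latest value for this iteration
    return paths
```

With two evaluation sets named `train` and `eval`, this yields `train/rmse` and `eval/rmse`; with a single set it yields plain `rmse`, as before.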

daavoo

Hi @sisp thanks for the contribution!

I took a look and I think what makes sense is to just go ahead and drop metric_data, as you suggested in #587 (comment)

sisp · 2023-05-29T14:08:36Z

@daavoo Shouldn't we provide at least a grace period until the breaking change takes effect, still violating SemVer but only after, e.g., 3 minor releases? We could allow None (i.e. Optional[str] and write metrics logs to those inferred subdirectories), default to None, and show a deprecation warning to notify users and give them time to migrate.
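A rough sketch of the grace-period idea, assuming a callback class that still accepts the old argument. The class name `MetricsCallbackSketch` and the warning text are hypothetical, not the wording used in DVCLive; only `warnings.warn` and `DeprecationWarning` come from the Python standard library.

```python
import warnings
from typing import Optional


class MetricsCallbackSketch:
    """Hypothetical stand-in for the XGBoost callback discussed here."""

    def __init__(self, metric_data: Optional[str] = None):
        if metric_data is not None:
            # Warn only when the caller still passes the old argument,
            # so users who already migrated see nothing.
            warnings.warn(
                "`metric_data` is deprecated and will be removed in a future "
                "release; metric data names are inferred from `evals`.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
        self.metric_data = metric_data
```

Defaulting to `None` means existing calls keep working during the grace period, while new code can simply omit the argument.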

daavoo · 2023-05-29T14:14:14Z

> We could allow None (i.e. Optional[str] and write metrics logs to those inferred subdirectories), default to None, and show a deprecation warning to notify users and give them time to migrate.

Yes, sorry for the lack of clarity, I meant the option in your last paragraph

sisp · 2023-05-29T14:15:48Z

Great! I'll update the PR tomorrow.

sisp · 2023-05-30T09:36:54Z

@daavoo I've updated the PR as we discussed and resolved merge conflicts with main. I've also submitted a PR to the dvc.org project to first document the metric_data parameter and also add a notice regarding its deprecation: treeverse/dvc.org#4578

daavoo

Thank you for contributing and addressing all comments!

sisp · 2023-05-30T10:01:51Z

You're welcome! Thanks for the thorough review and your responsiveness! 🙏 🙇

xgb: add support for multiple metrics data sets

ce54031

Merge branch 'main' into feat/xgb-multiple-metrics-data

20f4387

daavoo self-requested a review May 29, 2023 08:01

daavoo added the A: frameworks (Area: ML Framework integration) and feature labels May 29, 2023

daavoo suggested changes May 29, 2023

xgb: infer metric data names from evals and deprecate metric_data

65db854

sisp changed the title ~~xgb: add support for multiple metrics data sets~~ xgb: infer metric data names from evals and deprecate metric_data May 30, 2023

sisp mentioned this pull request May 30, 2023

dvclive: add metric_data parameter with deprecation notice treeverse/dvc.org#4578

Closed

Merge branch 'main' into feat/xgb-multiple-metrics-data

85cdbeb

sisp requested a review from daavoo May 30, 2023 09:37

daavoo reviewed May 30, 2023

src/dvclive/xgb.py (outdated, resolved)

sisp and others added 2 commits May 30, 2023 11:48

xgb: improve code aesthetics

d8ad745

Co-authored-by: David de la Iglesia Castro <daviddelaiglesiacastro@gmail.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

29911f7

for more information, see https://pre-commit.ci

daavoo approved these changes May 30, 2023

daavoo merged commit 9088996 into treeverse:main May 30, 2023

sisp deleted the feat/xgb-multiple-metrics-data branch May 30, 2023 10:01
