
Document eval_metric in XGBoost #6887

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 4 comments
@exalate-issue-sync
In https://github.com//pull/6399 we added a notebook describing how to properly use `eval_metric` to speed up XGBoost scoring in an early stopping scenario with frequent scoring.

A reviewer pointed out that we also need to document it clearly in the H2O documentation (https://github.com//pull/6399#pullrequestreview-1160569522), since the early stopping use case is an important factor.

@exalate-issue-sync
Author

hannah.tillman commented: also add `score_eval_metric_only` to params:

Score only the evaluation metric when enabled. This can make model training faster when scoring is frequent (e.g. every iteration). Defaults to `False`.

@exalate-issue-sync
Author

hannah.tillman commented: Quick notes:

Put in the FAQ:

  • How do I use `eval_metric`?

`eval_metric` is calculated on both the training and validation datasets after each iteration.

By default, H2O calculates all metrics appropriate for the given problem. For a binary classification model, H2O will report at least logloss, AUC, and AUCPR.

When early stopping is used, you need to choose one of the built-in early stopping metrics. For consistency across model types and algorithm implementations, these are always calculated by H2O itself and are independent of XGBoost's `eval_metric` implementation.

You don't always need to specify `eval_metric`, but it is beneficial both for frequent scoring and when H2O doesn't provide a suitable built-in metric.

  • eval_metric: Specify the evaluation metric that will be passed to the native XGBoost backend. Must be one of: rmse, rmsle, mae, mape, mphe, logloss, error, error@t, merror, mlogloss, auc, aucpr, ndcg, map, ndcg@n/map@n, ndcg-/map-/ndcg@n-/map@n-, poisson-nloglik, gamma-nloglik, cox-nloglik, tweedie-nloglik, aft-nloglik, interval-regression-accuracy
    See: https://xgboost.readthedocs.io/en/latest/parameter.html#learning-task-parameters

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Details

Jira Issue: PUBDEV-8889
Assignee: hannah.tillman
Reporter: Michal Kurka
State: Resolved
Fix Version: 3.40.0.2
Attachments: N/A
Development PRs: Available

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Linked PRs from JIRA

#6492

@h2o-ops h2o-ops closed this as completed May 14, 2023