Add f1_binary and f1_macro #190
Conversation
The CI is failing due to the docker pull rate limit, because it pulls as an anonymous user.
flake8 error:
Thanks, fixed.
Sorry, another docker pull rate limit error. Is the PR ready other than flake8? I can start running integration tests if it is.
I can't see the failed tests, and I'm not 100% sure whether there are any other issues besides the pull rate limit error. I could trigger another dummy commit tomorrow morning to confirm.
@iyerr3 Could you have a look at this change? Thanks.
if preds.size > 0:
    labels = dtrain.get_label()
    pred_labels = margin_to_class_label(preds)
    # this function is used only for AutoPilot and the least frequent label is already encoded as 1.
There's nothing that stops external customers from using this directly, so I wouldn't make this assumption.
Also, I'm not sure what the implication of "the least frequent label is already encoded as 1" is. Can you restate the assumption here to ensure the behavior is clear?
Also, do we need validation for the number of labels to be 2?
Why does this assumption matter? For F1, we just care what's positive/negative.
To make f1_binary work, the only requirement is that the labels are encoded as {0, 1}. Taking the least frequent label as 1 is only a behavior that AutoPilot enforces; a customer could also use the more frequent label as 1.
A question here is: are the labels already encoded as {0, 1}? If not, we need another label encoder to make f1_binary work.
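As a hedged illustration of this point (the label vectors below are made up), scikit-learn's f1_score with average="binary" works as soon as the labels are in {0, 1}; which original class was mapped to 1 is a modelling choice, not a requirement of the metric:

```python
# Illustrative only: f1_binary just needs labels already encoded in {0, 1}.
from sklearn.metrics import f1_score

labels = [0, 1, 1, 0, 1]        # ground-truth labels, already in {0, 1}
pred_labels = [0, 1, 0, 0, 1]   # predicted class labels

# average="binary" scores the class encoded as 1 (pos_label=1 by default)
score = f1_score(labels, pred_labels, average="binary")
print(score)  # TP=2, FP=0, FN=1 -> precision=1.0, recall=2/3, F1=0.8
```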
@huibinshen Note that you get the labels from the margin_to_class_label function, which always encodes them in {0, 1}. So your code is fine. More importantly, XGBoost is a product itself; you shouldn't add comments about AutoPilot's behavior here. Just remove the comment, which is the root cause of the confusion.
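For context, a minimal sketch of what a helper like margin_to_class_label can do (this is an assumption for illustration, not the repository's actual implementation): for binary objectives XGBoost emits one raw margin per row, and thresholding it yields labels that are always in {0, 1}.

```python
import numpy as np

def margin_to_class_label(preds: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: a positive raw margin maps to class 1,
    anything else to class 0, so the output is always in {0, 1}."""
    return (np.asarray(preds) > 0.0).astype(int)

pred_labels = margin_to_class_label(np.array([-1.2, 0.3, 2.0, -0.1]))
print(pred_labels.tolist())  # [0, 1, 1, 0]
```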
Thanks Haifeng, will remove the comment in the next revision.
@iyerr3 Do you have other comments? I hope the explanation above is clear.
Thanks for removing the comment.
My question on validation still stands: if a customer uses f1_binary with a multi-class target, they should get a Customer Error. Is that validation happening somewhere that I missed?
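The validation being asked about could be sketched as follows; CustomerError, the function name, and the message are assumptions for illustration, not the repository's actual code:

```python
import numpy as np

class CustomerError(ValueError):
    """Hypothetical stand-in for the container's customer-facing error."""

def validate_binary_labels(labels):
    # f1_binary is only defined for two classes; more than two is a
    # user misconfiguration, not an internal failure.
    if np.unique(labels).size > 2:
        raise CustomerError("Target is multiclass but average='binary'")

# A three-class target should be rejected for f1_binary
try:
    validate_binary_labels(np.array([0, 1, 2, 1]))
    raised = False
except CustomerError:
    raised = True
print(raised)  # True
```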
Hi @iyerr3 , I added a customer error in the new revision. Let me know if this is not the right way.
@@ -42,6 +44,7 @@
LOGISTIC_REGRESSION_LABEL_RANGE_ERROR = "label must be in [0,1] for logistic regression"
MULTI_CLASS_LABEL_RANGE_ERROR = "label must be in [0, num_class)"
MULTI_CLASS_F1_BINARY_ERROR = "Target is multiclass but average='binary'"
Why do we throw this error?
When a user requests the F1 binary score on a multiclass dataset, f1_score(..., average="binary") will throw this error. Because this is a misconfiguration, it is now thrown as a CustomerError.
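As a hedged sketch of this behavior: scikit-learn's f1_score raises a ValueError when average="binary" meets a multiclass target, which the container can catch and re-raise as a customer-facing error (the CustomerError class below is a hypothetical stand-in):

```python
from sklearn.metrics import f1_score

class CustomerError(Exception):
    """Hypothetical stand-in for the container's customer-facing error."""

y_true = [0, 1, 2, 1]   # three classes
y_pred = [0, 2, 1, 1]

try:
    f1_score(y_true, y_pred, average="binary")
    wrapped = None
except ValueError as err:
    # Surface the misconfiguration as a customer error rather than an
    # internal failure.
    wrapped = CustomerError(str(err))
```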
Changes including customer error look good to me. Integration tests all succeeded.
Change f1 to f1_macro to make it more explicit about what it does. f1_macro should be used for multi-class classification datasets. For binary classification tasks, use f1_binary (with the average option 'binary').
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
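The naming described above maps directly onto scikit-learn's average parameter; a small made-up example (label vectors are illustrative only):

```python
from sklearn.metrics import f1_score

y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 1]

# f1_macro: unweighted mean of the per-class F1 scores, suitable for
# multi-class targets. Per-class F1 here: 1.0, 2/3, 0.8.
macro = f1_score(y_true, y_pred, average="macro")
print(macro)

# f1_binary would reject this multi-class target: average="binary" is
# reserved for two-class problems encoded in {0, 1}.
```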