Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add f1_binary and f1_macro #190

Merged
merged 5 commits into from
Apr 26, 2021
Merged

Add f1_binary and f1_macro #190

merged 5 commits into from
Apr 26, 2021

Conversation

huibinshen
Copy link
Contributor

Change f1 to f1macro to make it more explicit on what it does. f1macro should be used for multi-class classification datasets.
For binary classification tasks, use f1 (with the average option 'binary').

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@huibinshen huibinshen changed the title fix f1 and add f1macro Add f1_binary and f1_macro Apr 13, 2021
@edwardjkim
Copy link
Contributor

The CI is failing due to docker pull rate limit because it uses an anonymous user to docker pull. We have to ultimately fix this, but for the short term, I would simply wait a few hours and push a dummy commit, e.g., git commit --allow-empty -m "Trigger Build" to make the CI succeed.

@edwardjkim
Copy link
Contributor

flake8 error:

src/sagemaker_xgboost_container/metrics/custom_metrics.py:13:1: F401 'collections.Counter' imported but unused
979

@lihaife lihaife self-requested a review April 14, 2021 20:37
@huibinshen
Copy link
Contributor Author

flake8 error:

src/sagemaker_xgboost_container/metrics/custom_metrics.py:13:1: F401 'collections.Counter' imported but unused
979

@huibinshen huibinshen closed this Apr 14, 2021
@huibinshen huibinshen reopened this Apr 14, 2021
@huibinshen
Copy link
Contributor Author

flake8 error:

src/sagemaker_xgboost_container/metrics/custom_metrics.py:13:1: F401 'collections.Counter' imported but unused
979

Thanks, fixed.

@edwardjkim
Copy link
Contributor

edwardjkim commented Apr 14, 2021

Sorry, another docker pull rate limit error. Is the PR ready other than flake8? I can start running integration tests if ready.

@huibinshen
Copy link
Contributor Author

Sorry, another docker pull rate limit error. Is the PR ready other than flake8? I can start running integration tests if ready.

I can't see the failed tests, and not 100% sure if there is any other issues besides the pull rate limit error. I could trigger another dummy commit tomorrow morning to confirm.

@huibinshen
Copy link
Contributor Author

@iyerr3 Could you have a look of this change? Thanks.

if preds.size > 0:
labels = dtrain.get_label()
pred_labels = margin_to_class_label(preds)
# this function is used only for AutoPilot and the least frequent label is already encoded as 1.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing that stops external customers from using this directly, so I wouldn't make this assumption.
Also I'm not sure what the implications of the 'least frequent label is already encoded as 1' is. Can you restate the assumption here to ensure behavior is clear.

Also do we need validation for the number of labels to be 2?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this assumption matter? For F1, we just care what's positive/negative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make f1 binary work, the only requirement is the label is encoded in {0, 1}. Taking least frequent label as 1 is only a behavior AutoPilot enforces. The customer could also use more frequent label as 1.

A question here is that are the labels already encoded in {0, 1}? If not, we need to have another label encoder to make f1 binary works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huibinshen Note that you get the labels from margin_to_class_label function, which always encodes in {0,1}. So your code is fine. More importantly, XGBoost is a product itself. You shouldn't add comments about Autopilot's behavior here. Just remove the comment, which is the root cause of confusions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Haifeng, will remove the comment in the next revision.

@iyerr3 Do you have other comments? I hope the explanation above is clear.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for removing the comment.
My question on validation still stands: if customer uses f1_binary with multi-class, then they should get a Customer Error. Is that validation happening somewhere that I missed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @iyerr3 , I added a customer error in the new revision. Let me know if this is not the right way.

@@ -42,6 +44,7 @@

LOGISTIC_REGRESSION_LABEL_RANGE_ERROR = "label must be in [0,1] for logistic regression"
MULTI_CLASS_LABEL_RANGE_ERROR = "label must be in [0, num_class)"
MULTI_CLASS_F1_BINARY_ERROR = "Target is multiclass but average='binary'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we throw this error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When user uses F1 binary score for multiclass dataset, f1_sccore(..., average="binary") will throw this error. This is because a mis-configuration, so it is now thrown as a CustomerError.

Copy link
Contributor

@edwardjkim edwardjkim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes including customer error look good to me. Integration tests all succeeded.

@edwardjkim edwardjkim merged commit 845f177 into aws:master Apr 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants