Generalize text tutorial to multiclass datasets #229
Codecov Report
@@           Coverage Diff           @@
##           master    #229   +/-   ##
=======================================
  Coverage   95.82%   95.82%
=======================================
  Files          12       12
  Lines         911      911
  Branches      180      180
=======================================
  Hits          873      873
  Misses         14       14
  Partials       24       24
less likely to be missed if users skip the previous cell to load their own datasets
Overall LGTM! Just a tiny comment to consider.
Good call generalizing this to multiclass!
I just made 2 very minor edits.
The model currently trained in the tutorial assumes that the dataset has binary labels. This PR tweaks the model setup so that it generalizes to datasets with more than two classes. As part of this change, the ROC-AUC score has been replaced with log loss.
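
For readers unfamiliar with the metric, here is a minimal sketch of how log loss can be computed for any number of classes. It is not the code from this PR: the function name `multiclass_log_loss`, the `eps` clipping value, and the toy `labels` / `pred_probs` data are all assumptions made for illustration. Log loss only needs the probability the model assigned to each example's true class, so the same computation applies whether there are two classes or many.

```python
import math


def multiclass_log_loss(labels, pred_probs, eps=1e-15):
    """Mean negative log-probability assigned to each example's true class.

    labels:     list of integer class indices, one per example (hypothetical input).
    pred_probs: list of per-class probability lists, one row per example.
    eps:        clipping value (an assumption) that keeps log() finite.
    """
    total = 0.0
    for label, probs in zip(labels, pred_probs):
        # Clip the true-class probability away from 0 and 1 before taking the log.
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(labels)


# Toy 3-class example (made-up numbers, not results from the tutorial).
labels = [0, 2, 1]
pred_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
loss = multiclass_log_loss(labels, pred_probs)
print(f"log loss: {loss:.4f}")
```

Lower values are better, and a model that puts nearly all of its probability on the correct class for every example scores close to zero.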