Generalize text tutorial to multiclass datasets #229

calebchiam · 2022-04-18T17:43:20Z

The current training model assumes that the dataset has binary labels. This PR adjusts the model setup so that it generalizes to datasets with more than two classes. As part of this change, the ROC-AUC score has been replaced with log loss as the evaluation metric.
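For readers unfamiliar with the metric swap: ROC-AUC is natively defined for binary problems and needs an averaging scheme (such as one-vs-rest) to extend to more classes, whereas log loss applies unchanged to any number of classes. The snippet below is a minimal pure-Python sketch of multiclass log loss, not the code from the tutorial. The function name `multiclass_log_loss`, the `eps` clipping value, and the toy labels and probabilities are all assumptions made for illustration.

```python
import math


def multiclass_log_loss(labels, pred_probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    labels: integer class indices in 0..K-1 (hypothetical toy input)
    pred_probs: one list of K predicted probabilities per example
    """
    total = 0.0
    for true_class, probs in zip(labels, pred_probs):
        # Clip so a confidently wrong prediction does not produce log(0).
        p = min(max(probs[true_class], eps), 1 - eps)
        total -= math.log(p)
    return total / len(labels)


# Toy 3-class example (made-up numbers, not results from the tutorial).
labels = [0, 2, 1]
pred_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
loss = multiclass_log_loss(labels, pred_probs)
print(f"log loss: {loss:.4f}")
```

Lower values are better, and the same function works for the binary case by passing two probabilities per example. In practice a library implementation such as `sklearn.metrics.log_loss` would normally be used instead of a hand-written loop.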

codecov · 2022-04-18T17:43:29Z

Codecov Report

Merging #229 (74ad3ff) into master (1746e63) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #229   +/-   ##
=======================================
  Coverage   95.82%   95.82%           
=======================================
  Files          12       12           
  Lines         911      911           
  Branches      180      180           
=======================================
  Hits          873      873           
  Misses         14       14           
  Partials       24       24


weijinglok

Overall LGTM! Just a tiny comment to consider.

docs/source/tutorials/text.ipynb

jwmueller

Good call generalizing this to multiclass!
I just made 2 very minor edits.

generalize text tutorial to multiclass datasets

d853f80

calebchiam requested review from jwmueller and weijinglok April 18, 2022 17:44

calebchiam added 3 commits April 18, 2022 14:05

fixed typo

8c79089

fixed misspelling

de8cd8e

separate cell for num_classes

90b97ce

less likely to be missed if users skip the previous cell to load their own datasets
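
The idea behind this commit can be sketched as follows: derive the number of classes from whatever labels are loaded, in a cell of its own, instead of hard-coding a binary setup. This is a hedged illustration only. The variable names (`train_labels`, `test_labels`, `classes`, `label_to_index`) and the toy label values are hypothetical and are not taken from the notebook.

```python
# Hypothetical labels standing in for whatever dataset the user loaded.
train_labels = ["sports", "politics", "tech", "sports", "tech"]
test_labels = ["tech", "politics"]

# Derive the class set from the data rather than assuming two classes.
classes = sorted(set(train_labels) | set(test_labels))
num_classes = len(classes)

# Map each label to an integer index in 0..num_classes-1 for the model.
label_to_index = {label: index for index, label in enumerate(classes)}
encoded_train = [label_to_index[label] for label in train_labels]

# Printing the classes makes a wrong or unexpected label set easy to spot.
print(f"{num_classes} classes: {classes}")
```

Keeping this in a separate cell means it still runs when a user replaces the data-loading cell with their own.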

weijinglok reviewed Apr 18, 2022

docs/source/tutorials/text.ipynb (outdated, resolved)

calebchiam and others added 2 commits April 18, 2022 15:47

removing pycharm metadata

55da3d5

print classes

74ad3ff

jwmueller approved these changes Apr 19, 2022

jwmueller merged commit 31c939f into master Apr 19, 2022

jwmueller deleted the patch/text-tutorial branch April 19, 2022 04:35
