Hi all,
I'm currently looking into the text classifier and have a question about using multiple labels at training time when the goal is to predict only one.
As an example, I borrowed the code of #678:
wohlg@wohlg-XPS:~/itmo/misc/cooking_classification/preprocessed$ head cooking.train
__label__sauce __label__cheese how much does potato starch affect a cheese sauce recipe ?
__label__food-safety __label__acidity dangerous pathogens capable of growing in acidic environments
__label__cast-iron __label__stove how do i cover up the white spots on my cast iron stove ?
__label__restaurant michelin three star restaurant; but if the chef is not there
__label__knife-skills __label__dicing without knife skills , how can i quickly and accurately dice vegetables ?
__label__storage-method __label__equipment __label__bread what ' s the purpose of a bread box ?
Let's say I provide multiple labels per example because sauce might be correlated with cheese, and that correlation is of value to a classifier in addition to the text itself.
However, my final goal is to label texts as sauce. I am not interested in labelling texts using the other four labels. Is there any setting/parameter I can use to tell the classifier to use all provided data (including all labels) but to optimize for a prediction of sauce?
Best,
Tobias
Hello @TDaudert, that's an interesting use case. We don't currently offer such an option. I wonder how something like this might best be implemented. I guess we could modify the loss function so that errors on the desired class weigh more heavily than errors on the other classes. See a related discussion here.
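To make the loss-weighting idea concrete, here is a minimal sketch in plain Python. It is not the library's loss and not an existing option; it assumes a one-vs-all setup where each label gets an independent sigmoid score, and the names weighted_ova_loss and label_weights are hypothetical.

```python
import math

def weighted_ova_loss(scores, true_labels, label_weights, default_weight=1.0):
    """Sum of per-label binary cross-entropy terms, each scaled by a weight.

    scores: dict mapping label -> raw score (logit) for one example
    true_labels: set of labels attached to the example
    label_weights: dict mapping label -> weight (hypothetical knob)
    """
    loss = 0.0
    for label, score in scores.items():
        p = 1.0 / (1.0 + math.exp(-score))  # sigmoid of the raw score
        target = 1.0 if label in true_labels else 0.0
        bce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
        # Errors on labels with a larger weight contribute more to the loss.
        loss += label_weights.get(label, default_weight) * bce
    return loss

# Illustrative scores for one example labelled sauce and cheese.
scores = {"sauce": -1.0, "cheese": 2.0, "bread": -2.0}
true_labels = {"sauce", "cheese"}

plain = weighted_ova_loss(scores, true_labels, {})
weighted = weighted_ova_loss(scores, true_labels, {"sauce": 5.0})
```

With a weight of 5 on sauce, the mistake on sauce dominates the total, while cheese and bread still contribute a training signal at their normal weight.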
We could also do data sampling so that data from the relevant class gets upsampled during training. I think this somewhat fits our ongoing work in looking at problems with class imbalance, so maybe if we find a good solution there it could also apply to this use case. We'll definitely keep your use case in mind - please also let us know if you find a good solution.
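The upsampling idea can be tried today as a preprocessing step on the training file, without any change to the classifier. A rough sketch, assuming the __label__ prefix format shown in the question; the function name upsample_target and the factor value are illustrative choices, not a recommendation:

```python
import random

def upsample_target(lines, target_label, factor, seed=0):
    """Repeat every training line that carries target_label `factor` times."""
    token = "__label__" + target_label
    out = []
    for line in lines:
        # Match the label as a whole token so sauce does not match sauce-pan.
        copies = factor if token in line.split() else 1
        out.extend([line] * copies)
    # Shuffle so the repeated lines are not adjacent in the output.
    random.Random(seed).shuffle(out)
    return out

train = [
    "__label__sauce __label__cheese how much does potato starch affect a cheese sauce recipe ?",
    "__label__restaurant michelin three star restaurant; but if the chef is not there",
    "__label__storage-method __label__equipment __label__bread what ' s the purpose of a bread box ?",
]
upsampled = upsample_target(train, "sauce", factor=3)
```

All other labels stay on each line, so the classifier still sees the co-occurring labels; only the frequency of sauce examples changes.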