This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

pytext multi-label support (#729) #731

Closed

Conversation

haowu666

Summary:
Pull Request resolved: #729

Add multi-label support to the PyText training workflow, including:

  • LabelListTensorizer to read label lists

  • MultiLabelSoftMarginLoss with n-hot encoding to compute the loss for the multi-label task
    (taking care to mask the padded -1 entries in the n-hot encoding)

  • MultiLabelOutputLayer with predictions of all potential labels for each example

  • LabelListPrediction (a NamedTuple) for each example, including:

    • label_scores: List[float]
    • predicted_label: List[int]
    • expected_label: List[int]
  • MultiLabelClassificationMetricReporter

    • compute_multi_label_classification_metrics with both predicted and expected labels in lists
    • compute_multi_label_soft_metrics with both predicted and expected labels in lists
  • Handle both label and label-list inputs in channel.py

  • Through the input arguments, users can choose:

    • LabelTensorizer / LabelListTensorizer
    • BinaryClassificationOutputLayer/ MultiLabelOutputLayer / MulticlassOutputLayer
    • loss including BinaryCrossEntropyLoss, MultiLabelSoftMarginLoss, etc.
    • a metric reporter matching the chosen loss: ClassificationMetricReporter / MultiLabelClassificationMetricReporter
  • Define @register_adapter(from_version=12) v11_to_v12 in config_adapter.py to make ClassificationMetricReporter extensible

  • Keep RECALL_AT_PRECISION_THREHOLDS as it was; users can change the values to suit their data set (for example, by adding 0.7 to the list of thresholds)
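The n-hot encoding with -1 padding described above can be sketched in plain Python. This is an illustrative sketch, not PyText's actual implementation: the helper names are assumptions, and the loss function simply restates the standard MultiLabelSoftMarginLoss definition without using torch.

```python
import math

def n_hot_encode(padded_labels, num_labels):
    """Turn padded label-index lists (pad value -1) into n-hot vectors."""
    targets = []
    for row in padded_labels:
        vec = [0.0] * num_labels
        for idx in row:
            if idx >= 0:          # skip the -1 padding entries
                vec[idx] = 1.0
        targets.append(vec)
    return targets

def multilabel_soft_margin_loss(logits, targets):
    """Mean over examples and classes of the per-label BCE-with-logits term,
    following the standard MultiLabelSoftMarginLoss definition."""
    total = 0.0
    for x_row, y_row in zip(logits, targets):
        per_class = 0.0
        for x, y in zip(x_row, y_row):
            p = 1.0 / (1.0 + math.exp(-x))  # sigmoid
            per_class += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
        total += -per_class / len(x_row)
    return total / len(logits)

# Two examples, label lists padded to length 3 with -1:
padded = [[0, 2, -1], [1, -1, -1]]
targets = n_hot_encode(padded, num_labels=4)
# targets == [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 0.0]]

# All-zero logits give sigmoid(0) = 0.5 everywhere, so the loss is log(2):
loss = multilabel_soft_margin_loss([[0.0, 0.0, 0.0, 0.0]], targets[:1])
```

Masking the -1 entries before scattering into the n-hot vector is the key step: without it, the pad value would index the last class and corrupt the targets.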

It has been tested on single-label and multi-label examples for the DocNN and BERT models.
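A minimal sketch of how the LabelListPrediction tuple above might be filled in by a multi-label output layer, assuming a simple per-label score threshold (the `predict_labels` helper and the 0.5 default threshold are assumptions for illustration, not PyText's actual API):

```python
from typing import List, NamedTuple

class LabelListPrediction(NamedTuple):
    """Per-example multi-label prediction, mirroring the fields in the PR."""
    label_scores: List[float]
    predicted_label: List[int]
    expected_label: List[int]

def predict_labels(scores: List[float],
                   expected: List[int],
                   threshold: float = 0.5) -> LabelListPrediction:
    """Emit every label index whose score clears the threshold."""
    predicted = [i for i, s in enumerate(scores) if s >= threshold]
    return LabelListPrediction(scores, predicted, expected)

pred = predict_labels([0.9, 0.1, 0.7], expected=[0, 2])
# pred.predicted_label == [0, 2]
```

Unlike a multiclass argmax, thresholding lets each example carry zero, one, or many predicted labels, which is what the multi-label metric reporters then compare against the expected label lists.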

Differential Revision: D15777482

@facebook-github-bot added the CLA Signed label on Jun 26, 2019
fbshipit-source-id: 7eff3b27eff076d6c36a0feaef223573608b0d5d
@facebook-github-bot
Contributor

This pull request has been merged in 1a44019.
