Currently, transformers' `FeaturesManager._TASKS_TO_AUTOMODELS` handles the strings passed to load models. Notably, this is used in the `ORTQuantizer.from_pretrained()` method (where, for example, `feature="sequence-classification"`; see optimum/optimum/onnxruntime/quantization.py, line 102 in 5653a16).

Meanwhile, the pipeline abstraction for text classification expects `pipeline(..., task="text-classification")`, while passing `task="text-classification"` to `ORTQuantizer.from_pretrained()` currently raises `KeyError: "Unknown task: text-classification"`. Hence users have to juggle both `"text-classification"` and `"sequence-classification"` and provide the `feature` explicitly to `ORTQuantizer`, which is troublesome.
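To make the mismatch concrete, here is a minimal, self-contained illustration. The mapping and helper below are a simplified stand-in for `FeaturesManager._TASKS_TO_AUTOMODELS`, not the actual transformers code:

```python
# Simplified stand-in for transformers.onnx.FeaturesManager._TASKS_TO_AUTOMODELS:
# the keys follow the Auto* class names, not the pipeline() task names.
_TASKS_TO_AUTOMODELS = {
    "sequence-classification": "AutoModelForSequenceClassification",
    "causal-lm": "AutoModelForCausalLM",
}

def get_model_class_for_feature(feature: str) -> str:
    # Mimics the lookup that ORTQuantizer.from_pretrained() relies on.
    if feature not in _TASKS_TO_AUTOMODELS:
        raise KeyError(f"Unknown task: {feature}")
    return _TASKS_TO_AUTOMODELS[feature]

# The ONNX "feature" name works, but the pipeline() task name does not:
get_model_class_for_feature("sequence-classification")  # OK
# get_model_class_for_feature("text-classification")    # raises KeyError
```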
Possible solutions are:
- Have an auto-mapping from "tasks" (as listed on https://huggingface.co/models) to "features" (e.g. `"text-classification"` --> `"sequence-classification"`, `"audio-classification"` --> `"sequence-classification"`).
- Modify `transformers.onnx.FeaturesManager` to use the real task names rather than `"sequence-classification"`.
- Add abstraction classes like `ForTextClassification` and `ForAudioClassification` that simply inherit from `ForSequenceClassification`, and modify `transformers.onnx.FeaturesManager` accordingly.
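The first option could look roughly like the following sketch. Both the mapping and the `task_to_feature` helper are hypothetical (they do not exist in transformers or optimum); the example names mirror the ones discussed in this issue:

```python
# Hypothetical auto-mapping from pipeline() task names to ONNX "feature" names.
_PIPELINE_TASK_TO_FEATURE = {
    "text-classification": "sequence-classification",
    "audio-classification": "sequence-classification",
    "text-generation": "causal-lm",
}

def task_to_feature(task: str) -> str:
    # Fall back to the given name so existing "feature" strings keep working.
    return _PIPELINE_TASK_TO_FEATURE.get(task, task)
```

With this, `task_to_feature("text-classification")` returns `"sequence-classification"`, while a user who already passes `"sequence-classification"` sees no change in behavior.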
One challenge with unifying the "features" used in the ONNX export and the tasks defined in the `pipeline()` function is that features can carry past key values that need to be differentiated, e.g. these two features are different:
causal-lm
causal-lm-with-past
Having said that, I agree that it would be nice if one could reuse the same task taxonomy from the transformers.pipeline() function, so maybe some light refactoring can capture the majority of tasks.
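One way such a light refactoring could keep the `-with-past` variants intact is to normalize only the base task name and re-append the suffix. The `normalize_task` helper and its suffix handling are an assumption for illustration, not existing code:

```python
# Hypothetical task -> feature mapping, reused from the suggestion above.
_PIPELINE_TASK_TO_FEATURE = {
    "text-classification": "sequence-classification",
    "text-generation": "causal-lm",
}

def normalize_task(task: str) -> str:
    # Split off the ONNX-export-specific "-with-past" suffix, map the base
    # pipeline task to its "feature" name, then re-append the suffix.
    suffix = "-with-past"
    if task.endswith(suffix):
        return normalize_task(task[: -len(suffix)]) + suffix
    return _PIPELINE_TASK_TO_FEATURE.get(task, task)

# normalize_task("text-generation-with-past") -> "causal-lm-with-past"
# normalize_task("causal-lm-with-past")       -> "causal-lm-with-past"
```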
cc @michaelbenayoun who knows more of the history behind the ONNX "features" names
Yes, I think the original feature names were chosen by looking at the class names (BertForSequenceClassification, etc.). I think @fxmarty's first suggestion could work and is easy to implement.