[Draft] Adding KeyToValueTransformers before finishing exporting to ONNX #4841
When NimbusML deals with category Pandas columns, it automatically converts them into
In general these behaviors aren't a problem, but they cause trouble when exporting models to ONNX. Since the models created by NimbusML don't have the appropriate
In this PR I add a
This is meant as a temporary solution to unblock Azure AutoML so that it can work with NimbusML. Ideally these transformers would be added in the NimbusML pipeline, but it is currently not possible to add transforms after predictors in NimbusML, and doing so would also require changing how NimbusML converts between IDataView and Pandas DataFrames. So, for the time being, the solution here was found to be the best option.
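For readers unfamiliar with key types, the following is a minimal conceptual sketch in plain Python of what a value-to-key / key-to-value round trip does. It is not NimbusML or ML.NET code: the function names `value_to_key` and `key_to_value` are hypothetical, and the conventions used (1-based keys, 0 meaning "missing", vocabulary built in order of first appearance) are assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


def value_to_key(values: Sequence[Hashable]) -> Tuple[List[int], List[Hashable]]:
    """Encode values as 1-based keys into a vocabulary (hypothetical helper).

    The vocabulary is built in order of first appearance; key 0 is
    reserved for "missing" (an assumption of this sketch).
    """
    vocabulary: List[Hashable] = []
    index: Dict[Hashable, int] = {}
    keys: List[int] = []
    for value in values:
        if value not in index:
            vocabulary.append(value)
            index[value] = len(vocabulary)  # 1-based position in the vocabulary
        keys.append(index[value])
    return keys, vocabulary


def key_to_value(keys: Sequence[int],
                 vocabulary: Sequence[Hashable]) -> List[Optional[Hashable]]:
    """Decode keys back to their original values (hypothetical helper).

    Keys that are 0 or outside the vocabulary decode to None.
    """
    return [
        vocabulary[key - 1] if 1 <= key <= len(vocabulary) else None
        for key in keys
    ]


labels = ["cat", "dog", "cat", "bird"]
keys, vocabulary = value_to_key(labels)
restored = key_to_value(keys, vocabulary)
```

In this sketch a model that only outputs `keys` would return integers to the caller; the decoding step at the end of the pipeline is what restores the original label values.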
I am running into some problems when using this fix with BinaryClassification models coming from NimbusML. The exported ONNX model cannot be loaded back: an exception is thrown when attempting to load it. The problem seems to be exporting KeyToValueTransformers to ONNX after a binary classification pipeline.
The error message looks as follows:
Notice that I get a similar error message when exporting to ONNX an ML.NET OVA model (without NimbusML) that is followed by a KeyToValueTransformer. I wonder whether there's a connection with the fact that OVA uses binary classifiers.
On the other hand, I haven't been able to reproduce the above error message using only ML.NET (without NimbusML), because apparently there's no way to train binary classification trainers with a Label column of KeyDataViewType (exceptions are thrown if I attempt this). It seems they only work with Boolean Label columns. I don't know the reason for this, since NimbusML actually converts non-Boolean labels to KeyDataViewType behind the scenes, and the BinaryClassificationTrainer works with them as expected. Apparently different code paths are followed in each case, which prevents ML.NET from doing the same as NimbusML. I wonder whether this difference in design has anything to do with the resulting ONNX model not being well-formed, which causes the exception above.