OneHotEncoder
drop="if_binary"
We've hit what appears to be a bug that triggers when one uses `OneHotEncoder` with `drop="if_binary"`. It can be reproduced by adding the following test to https://github.com/onnx/sklearn-onnx/blob/main/tests/test_sklearn_one_hot_encoder_converter.py
```python
@unittest.skipIf(
    pv.Version(ort_version) <= pv.Version("0.4.0"), reason="issues with shapes"
)
@unittest.skipIf(
    not one_hot_encoder_supports_drop(),
    reason="OneHotEncoder does not support drop in scikit versions < 0.21",
)
def test_one_hot_encoder_drop_if_binary(self):
    data = [
        ["c0.4", "c0.2", 0],
        ["c1.4", "c1.2", 0],
        ["c0.2", "c2.2", 1],
        ["c0.2", "c2.2", 1],
        ["c0.2", "c2.2", 1],
        ["c0.2", "c2.2", 1],
    ]
    test = [["c0.2", "c2.2", 1]]
    model = OneHotEncoder(categories="auto", drop="if_binary")
    model.fit(data)
    inputs = [
        ("input1", StringTensorType([None, 2])),
        ("input2", Int64TensorType([None, 1])),
    ]
    model_onnx = convert_sklearn(
        model, "one-hot encoder", inputs, target_opset=TARGET_OPSET
    )
    self.assertTrue(model_onnx is not None)
    dump_data_and_model(
        test,
        model,
        model_onnx,
        verbose=False,
        basename="SklearnOneHotEncoderMixedStringIntDrop",
    )
```
Investigating the stack trace, one winds up in sklearn-onnx/skl2onnx/operator_converters/one_hot_encoder.py (lines 161 to 163 in 895c3a7), with `ohe_op.drop_idx_ = [None, None, 0]`. For the two non-binary columns the entry is `None`, and the converter therefore calls `np.delete(np.arange(3), None)`.
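To see where the `None` entries come from, the following sketch fits the same encoder on the data from the test above and inspects `drop_idx_`. It assumes a scikit-learn version recent enough to accept `drop="if_binary"`; the variable names are illustrative only.

```python
from sklearn.preprocessing import OneHotEncoder

# Same training data as in the test above: two string columns with three
# categories each, and one binary integer column.
data = [
    ["c0.4", "c0.2", 0],
    ["c1.4", "c1.2", 0],
    ["c0.2", "c2.2", 1],
    ["c0.2", "c2.2", 1],
    ["c0.2", "c2.2", 1],
    ["c0.2", "c2.2", 1],
]

encoder = OneHotEncoder(categories="auto", drop="if_binary")
encoder.fit(data)

# With drop="if_binary", only binary features get a category dropped;
# for every other feature the entry in drop_idx_ is None.
drop_idx = list(encoder.drop_idx_)
n_categories = [len(c) for c in encoder.categories_]
print(drop_idx)      # the issue reports [None, None, 0]
print(n_categories)
```

Any code that indexes `drop_idx_` per feature has to be prepared for these `None` entries.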
The test suite passes if one changes those lines to
```python
if ohe_op.drop_idx_[index] is not None:
    indices_to_keep = np.delete(
        np.arange(len(categories)), ohe_op.drop_idx_[index]
    )
else:
    indices_to_keep = np.arange(len(categories))
```
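As a standalone illustration of that guard, here is a minimal sketch using only NumPy. The helper name `indices_to_keep_for` and the hard-coded category counts are hypothetical and not part of skl2onnx; they mirror the `drop_idx_` value reported above.

```python
import numpy as np


def indices_to_keep_for(n_categories, drop_idx):
    """Hypothetical helper: category indices left after an optional drop."""
    all_indices = np.arange(n_categories)
    if drop_idx is None:
        # Nothing is dropped for this feature, so keep every category.
        return all_indices
    return np.delete(all_indices, drop_idx)


# Values taken from the issue: two 3-category columns, one binary column.
drop_idx_ = [None, None, 0]
category_counts = [3, 3, 2]

kept = [
    indices_to_keep_for(n, d).tolist()
    for n, d in zip(category_counts, drop_idx_)
]
print(kept)  # [[0, 1, 2], [0, 1, 2], [1]]
```

The non-binary columns keep all three categories, and only the binary column loses its first one.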
Closing this issue as the fix was merged into the main branch.