Binary classification probability in ONNX for SVM with probability=False in sklearn #990

Open
qi-yuan-cresset opened this issue Apr 27, 2023 · 2 comments

@qi-yuan-cresset

Hello,
I'm confused by the output probability of SVM classifier converted to ONNX. For the following code:

import numpy as np
import onnxruntime as rt
import skl2onnx
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from skl2onnx import convert_sklearn


X, y = load_iris(return_X_y=True)
X, y = X[y != 2], y[y != 2]
n_samples, n_features = X.shape
model = SVC(probability=False, random_state=5)
model.fit(X.astype(np.float32), y)

initial_type = [("float_input", FloatTensorType([None, 4]))]
onnx = convert_sklearn(model, initial_types=initial_type)
with open("model_t.onnx", "wb") as f:
    f.write(onnx.SerializeToString())

sess = rt.InferenceSession("model_t.onnx")
input_name = sess.get_inputs()[0].name
pred_onx = sess.run(None, {input_name: np.array(X[:10, :]).astype(np.float32)})
print(pred_onx[0])
print(pred_onx[1])

The output I got is as follows:

[0 0 0 0 0 0 0 0 0 0]
[[-1.2660402  1.2660402]
 [-1.1427525  1.1427525]
 [-1.2851433  1.2851433]
 [-1.1439595  1.1439595]
 [-1.3043578  1.3043578]
 [-1.0796759  1.0796759]
 [-1.2667828  1.2667828]
 [-1.1935505  1.1935505]
 [-1.1515036  1.1515036]
 [-1.13976    1.13976  ]]

All the "labels" are 0, while all the "probabilities" for class 1 are greater than those for class 0.
Is this a bug or expected behaviour?
The "labels" predicted are consistent between sklearn and ONNX; however, the probability values predicted by ONNX are opposite to the labels. This causes an inconsistency for me when building and converting voting classifiers with sklearn + ONNX, since ONNX does "soft voting" only.
I know that setting SVC with probability=True makes the probability prediction consistent between sklearn and ONNX, but the probability predicted by SVC in sklearn sometimes disagrees with the corresponding labels too, which could also lead to problems when building a voting classifier.
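For illustration, here is a rough sketch of the workaround I have been considering on my side (my own post-processing, not something skl2onnx provides). It assumes the two-column score layout shown above, where column 0 matches sklearn's decision_function, and squashes that decision value into a label-consistent pseudo-probability for soft voting:

import numpy as np

# pred_onx[1] holds the raw SVM scores from the ONNX model above,
# shape [n_samples, 2]; column 0 matches sklearn's decision_function.
scores = np.array(pred_onx[1])
decision = scores[:, 0]

# Squash the decision value into (0, 1) with a sigmoid so it can act
# as a pseudo-probability for class 1. This is not a calibrated
# probability (unlike probability=True, which uses Platt scaling),
# just a monotone, label-consistent score.
p_class1 = 1.0 / (1.0 + np.exp(-decision))
p = np.column_stack([1.0 - p_class1, p_class1])
labels = (decision > 0).astype(int)  # agrees with sklearn's predict here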

Any suggestions on this issue would be highly appreciated.

Thanks very much,

Qi

@xadupre
Collaborator

xadupre commented Jun 22, 2023

I took your model and added two nodes to extract the first column of your results. Would that be ok?

import numpy as np
import onnxruntime as rt
import skl2onnx
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from skl2onnx import convert_sklearn


X, y = load_iris(return_X_y=True)
X = X.astype(np.float32)
X, y = X[y != 2], y[y != 2]
n_samples, n_features = X.shape
model = SVC(probability=False, random_state=5)
model.fit(X, y)
print(model.predict(X[:10]))
print(model.decision_function(X[:10]))
print("----------")

initial_type = [("float_input", FloatTensorType([None, 4]))]
onx = convert_sklearn(model, initial_types=initial_type, target_opset=18)

sess = rt.InferenceSession(onx.SerializeToString(), providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
pred_onx = sess.run(None, {input_name: np.array(X[:10, :]).astype(np.float32)})
print(pred_onx[0])
print(pred_onx[1])

print("---------------------")
# see https://onnx.ai/onnx/intro/python.html
# Add one node to the model to extract the first column.
from onnx import TensorProto
from onnx.helper import make_node, make_tensor_value_info, make_model, make_graph
from onnx.numpy_helper import from_array
from onnx.version_converter import convert_version

# Make sure the ONNX opset is one of the latest.
# sklearn-onnx chooses the lowest possible value.
# In this case, it is 9. The operator Slice (opset 13) is added at the
# end of the graph to extract one column, so the model needs to be upgraded.
onx = convert_version(onx, target_version=18)

inits = list(onx.graph.initializer)
inits.extend(
    [
        # Constant tensors used as the Slice/Reshape parameters.
        from_array(np.array([0], dtype=np.int64), name="zero"),
        from_array(np.array([1], dtype=np.int64), name="one"),
        from_array(np.array([-1], dtype=np.int64), name="mone"),
    ]
)
nodes = list(onx.graph.node)
nodes.extend(
    [
        # Slice(probabilities, starts=[0], ends=[1], axes=[1]): keep column 0.
        make_node("Slice", ["probabilities", "zero", "one", "one"], ["new_scores2"]),
        # Reshape to a 1-D tensor of shape [-1].
        make_node("Reshape", ["new_scores2", "mone"], ["new_scores"]),
    ]
)
outputs = [
    onx.graph.output[0],
    make_tensor_value_info("new_scores", TensorProto.FLOAT, [None]),
]
graph = make_graph(nodes, onx.graph.name, onx.graph.input, outputs, inits)
new_model = make_model(graph, opset_imports=onx.opset_import)


sess = rt.InferenceSession(
    new_model.SerializeToString(), providers=["CPUExecutionProvider"]
)
input_name = sess.get_inputs()[0].name
pred_onx = sess.run(None, {input_name: np.array(X[:10, :]).astype(np.float32)})
print(pred_onx[0])
print(pred_onx[1])

Output is the following:

[0 0 0 0 0 0 0 0 0 0]
[-1.26604015 -1.14275248 -1.28514317 -1.14395955 -1.30435769 -1.07967584
 -1.2667828  -1.19355031 -1.15150364 -1.13976   ]
----------
[0 0 0 0 0 0 0 0 0 0]
[[-1.2660402  1.2660402]
 [-1.1427525  1.1427525]
 [-1.2851433  1.2851433]
 [-1.1439595  1.1439595]
 [-1.3043578  1.3043578]
 [-1.0796759  1.0796759]
 [-1.2667828  1.2667828]
 [-1.1935505  1.1935505]
 [-1.1515036  1.1515036]
 [-1.13976    1.13976  ]]
---------------------
[0 0 0 0 0 0 0 0 0 0]
[-1.2660402 -1.1427525 -1.2851433 -1.1439595 -1.3043578 -1.0796759
 -1.2667828 -1.1935505 -1.1515036 -1.13976  ]

@qi-yuan-cresset
Author

Hi @xadupre,

Thanks very much for your response. I understand that your code extracts the first column of the "probability" prediction from the ONNX model. However, if I understand correctly, that doesn't solve my problem: the "probability" of class 0 in the prediction is smaller than that of class 1, yet the predicted "label" is 0. For example, for the first sample, the "probability" of class 0 is -1.266 and the "probability" of class 1 is 1.266, but the predicted label is 0, which disagrees with the probability prediction.
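To make the mismatch concrete, here is a small check based on the outputs above (it assumes the two-column layout shown in this thread, where column 0 tracks sklearn's decision_function):

import numpy as np

scores = np.array(pred_onx[1])  # second ONNX output, shape [n, 2]

# argmax over the two score columns picks class 1 for every sample...
print(np.argmax(scores, axis=1))       # -> [1 1 1 1 1 1 1 1 1 1]

# ...yet the first ONNX output (the labels) says class 0 throughout.
print(pred_onx[0])                     # -> [0 0 0 0 0 0 0 0 0 0]

# Column 0 tracks sklearn's decision_function, so the label-consistent
# rule here is the sign of column 0, not the argmax over the columns.
print((scores[:, 0] > 0).astype(int))  # -> [0 0 0 0 0 0 0 0 0 0]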

Any suggestions would be appreciated, and let me know if I have misunderstood your code, or if anything from my question is unclear.

Best wishes,

Qi
