Binary classification probability in ONNX for SVM with probability=False in sklearn #990
Comments
I took your model and added two nodes to extract the first column of your results. Would that be ok?

import numpy as np
import onnxruntime as rt
import skl2onnx
from skl2onnx.common.data_types import FloatTensorType
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from skl2onnx import convert_sklearn
X, y = load_iris(return_X_y=True)
X = X.astype(np.float32)
X, y = X[y != 2], y[y != 2]
n_samples, n_features = X.shape
model = SVC(probability=False, random_state=5)
model.fit(X, y)
print(model.predict(X[:10]))
print(model.decision_function(X[:10]))
print("----------")
initial_type = [("float_input", FloatTensorType([None, 4]))]
onx = convert_sklearn(model, initial_types=initial_type, target_opset=18)
sess = rt.InferenceSession(onx.SerializeToString(), providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
pred_onx = sess.run(None, {input_name: np.array(X[:10, :]).astype(np.float32)})
print(pred_onx[0])
print(pred_onx[1])
print("---------------------")
# See https://onnx.ai/onnx/intro/python.html
# Add two nodes (Slice, then Reshape) to the model to extract the first column.
from onnx import TensorProto
from onnx.helper import make_node, make_tensor_value_info, make_model, make_graph
from onnx.numpy_helper import from_array
from onnx.version_converter import convert_version
# Make sure the onnx opset is one of the latest.
# sklearn-onnx chooses the lowest possible value.
# In this case, it is 9. The Slice operator (opset 13) is added at the
# end of the graph to extract one column, so the model needs to be upgraded.
onx = convert_version(onx, target_version=18)
inits = list(onx.graph.initializer)
inits.extend(
    [
        from_array(np.array([0], dtype=np.int64), name="zero"),
        from_array(np.array([1], dtype=np.int64), name="one"),
        from_array(np.array([-1], dtype=np.int64), name="mone"),
    ]
)
nodes = list(onx.graph.node)
nodes.extend(
    [
        make_node("Slice", ["probabilities", "zero", "one", "one"], ["new_scores2"]),
        make_node("Reshape", ["new_scores2", "mone"], ["new_scores"]),
    ]
)
outputs = [
    onx.graph.output[0],
    make_tensor_value_info("new_scores", TensorProto.FLOAT, [None]),
]
graph = make_graph(nodes, onx.graph.name, onx.graph.input, outputs, inits)
new_model = make_model(graph, opset_imports=onx.opset_import)
sess = rt.InferenceSession(
    new_model.SerializeToString(), providers=["CPUExecutionProvider"]
)
input_name = sess.get_inputs()[0].name
pred_onx = sess.run(None, {input_name: np.array(X[:10, :]).astype(np.float32)})
print(pred_onx[0])
print(pred_onx[1])

Output is the following:
Hi @xadupre,

Thanks very much for your response. I understand that your code extracts the first column of the "probability" prediction from the ONNX model. However, if I understand correctly, that doesn't solve my problem: the "probability" of class 0 is smaller than that of class 1, yet the predicted "label" is 0. For example, for the first sample, the "probability" of class 0 is -1.266 and the "probability" of class 1 is 1.266, but the predicted label is 0, which disagrees with the probability prediction.

Any suggestions would be appreciated. Please let me know if I have misunderstood your code, or if anything in my question is unclear.

Best wishes,
Qi
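The mismatch described above can be reproduced without any model, using only the numbers quoted for the first sample. The sketch below is a minimal illustration; `argmax_label` is a hypothetical helper written for this example, not part of skl2onnx or onnxruntime. Since the values can be negative, they appear to be raw scores rather than probabilities, and this snippet only shows that taking their argmax does not recover the predicted label here.

```python
# Hypothetical helper for illustration; not a skl2onnx or onnxruntime function.
def argmax_label(scores):
    """Return the column index holding the largest score in one row."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Values quoted in the comment above for the first sample.
scores = [-1.266, 1.266]  # second ONNX output: columns for class 0 and class 1
label = 0                 # first ONNX output: the predicted label

winner = argmax_label(scores)
print(winner)           # column with the larger score
print(winner == label)  # False for this sample: scores and label disagree
```

Any downstream step that derives a class from these scores by argmax (as soft voting does) would therefore pick class 1 for this sample, while the label output says 0.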
Hello,
I'm confused by the output probability of an SVM classifier converted to ONNX. For the following code:
The output I got is as below:
All the "labels" are 0, while all the "probability" values for class 1 are greater than those for class 0.
Is this a bug or some expected behaviour?
The predicted "labels" are consistent between sklearn and ONNX. However, the probability values predicted by ONNX contradict the labels, which causes inconsistencies for me when building and converting voting classifiers with sklearn + ONNX, since ONNX does "soft voting" only.
I know that setting probability=True on SVC makes the probability prediction consistent between sklearn and ONNX, but the probabilities predicted by SVC in sklearn sometimes disagree with the corresponding labels too, which could also lead to problems when building a voting classifier.
Any suggestions on this issue would be highly appreciated.
Thanks very much,
Qi
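To make the voting concern concrete, here is a minimal sketch in plain Python of how hard voting (majority of labels) and soft voting (argmax of averaged scores) can disagree when one estimator's scores contradict its label. The functions `hard_vote` and `soft_vote` are hypothetical helpers for this example and do not reproduce the sklearn or ONNX implementations; the first row reuses the scores quoted in this thread, and the other two rows are made-up numbers.

```python
from collections import Counter


def hard_vote(labels):
    """Majority vote over predicted labels."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(score_rows):
    """Average per-class scores across estimators, then take the argmax."""
    n_classes = len(score_rows[0])
    means = [
        sum(row[k] for row in score_rows) / len(score_rows)
        for k in range(n_classes)
    ]
    return max(range(n_classes), key=lambda k: means[k])


# Three estimators voting on one sample (illustrative numbers only).
labels = [0, 0, 1]
scores = [
    [-1.266, 1.266],  # SVC-like scores from this thread; its label was 0
    [0.60, 0.40],     # assumed estimator whose scores agree with label 0
    [0.45, 0.55],     # assumed estimator whose scores agree with label 1
]

print(hard_vote(labels))  # 0
print(soft_vote(scores))  # 1
```

With these inputs the majority of labels is 0, but the averaged scores favor class 1, because the first row's large, sign-flipped scores dominate the mean.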