
FunctionTransformer is not supported #609

Open
HamzaSaouli opened this issue Feb 5, 2021 · 18 comments
@HamzaSaouli


```
RuntimeError                              Traceback (most recent call last)
in
      1 # Export the model
      2 initial_type = [('numfeat', FloatTensorType([None, 30]))]
----> 3 model_onnx = convert_sklearn(model, initial_types=initial_type)
      4
      5 # Save it into wanted file

~/.local/lib/python3.6/site-packages/skl2onnx/convert.py in convert_sklearn(model, name, initial_types, doc_string, target_opset, custom_conversion_functions, custom_shape_calculators, custom_parsers, options, dtype, intermediate, white_op, black_op, final_types)
    148
    149 # Infer variable shapes
--> 150 topology.compile()
    151
    152 # Convert our Topology object into ONNX. The outcome is an ONNX model.

~/.local/lib/python3.6/site-packages/skl2onnx/common/_topology.py in compile(self)
    901 self._resolve_duplicates()
    902 self._fix_shapes()
--> 903 self._infer_all_types()
    904 self._check_structure()
    905

~/.local/lib/python3.6/site-packages/skl2onnx/common/_topology.py in _infer_all_types(self)
    754 shape_calc(operator)
    755 else:
--> 756 operator.infer_types()
    757
    758 def _resolve_duplicates(self):

~/.local/lib/python3.6/site-packages/skl2onnx/common/_topology.py in infer_types(self)
    220 "Unable to find a shape calculator for alias '{}' "
    221 "and type '{}'.".format(self.type, type(self.raw_operator)))
--> 222 shape_calc(self)
    223
    224 @property

~/.local/lib/python3.6/site-packages/skl2onnx/shape_calculators/function_transformer.py in calculate_sklearn_function_transformer_output_shapes(operator)
     15 """
     16 if operator.raw_operator.func is not None:
---> 17 raise RuntimeError("FunctionTransformer is not supported unless the "
     18 "transform function is None (= identity). "
     19 "You may raise an issue at "

RuntimeError: FunctionTransformer is not supported unless the transform function is None (= identity). You may raise an issue at https://github.com/onnx/sklearn-onnx/issues.
```


xadupre commented Feb 6, 2021

FunctionTransformer includes custom code, which is difficult to automatically convert into ONNX. It would be easier if the custom function could be written directly with ONNX operators. That's one option: write the custom function with ONNX operators. The second option is to convert that function into a Python operator onnxruntime can use; that's what the package ort-customops does. Both ways would probably need examples to guide users.


xadupre commented Jul 15, 2021

I explored a simpler way to do it, with a syntax very close to numpy; see Numpy API for ONNX and scikit-learn.

@paranjapeved15

@xadupre I checked your post but I am not sure I understand. The problem is that the onnx converters do not accept FunctionTransformer. Can you please elaborate on how your example solution would solve the issue?


xadupre commented Nov 7, 2023

skl2onnx converts a scikit-learn pipeline into ONNX. This means that, for many estimators, skl2onnx has an ONNX implementation of the inference function implemented in scikit-learn, and the final ONNX graph just puts all these blocks together. FunctionTransformer contains a piece of code skl2onnx has no implementation for, so one block is missing unless the user provides that implementation.

Now, how to provide that ONNX implementation of your custom code? That's the main difficulty. The blog post you mention uses a package I no longer maintain, as I broke it into smaller packages; some parts were added to onnx. I tried to make a list of the options available today: Many ways to implement a custom graph in ONNX. In your case, the first step is to select the option which fits your need.

Once you have an ONNX representation of your Python code, I recently added the function add_onnx_graph in PR #1023, which integrates any ONNX graph into your pipeline. It is not released yet, but skl2onnx can be installed from GitHub to use it.

@paranjapeved15

Thanks for the reply @xadupre! So if I rewrite my custom function to use onnx operators instead of numpy and then call my function using FunctionTransformer, would that work? Or would I need to add my onnx operator function directly to the onnx graph using add_onnx_graph?

@paranjapeved15

@xadupre, because FunctionTransformer was giving errors, I was also trying to create a custom transformer and then write an ONNX converter for it, similar to https://onnx.ai/sklearn-onnx/auto_tutorial/plot_icustom_converter.html. Do you think this approach might work too?


xadupre commented Nov 7, 2023

Whatever the option, there will be two versions: one for scikit-learn, one for ONNX. The numpy API is a way to have the same code produce both versions; otherwise there are two to maintain. Switching from FunctionTransformer to a custom transformer is easier given the design of skl2onnx. The converter for the custom transformer may be written with any API: with the skl2onnx API, or with another one plus add_onnx_graph. I assume you cannot share your custom function, but maybe you can share a simpler one with a very simple pipeline, and I can write a short example of how to do it. You could then adapt it to the real function.


paranjapeved15 commented Nov 8, 2023

```python
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from xgboost import XGBClassifier


def calculate(df):
    df['c'] = 100 * (df['a'] - df['b']) / df['b']
    return df


mapper = ColumnTransformer(
    transformers=[
        (
            "c",
            FunctionTransformer(calculate),
            ['a', 'b']
        ),
    ],
    remainder='passthrough',
    verbose_feature_names_out=False
)
mapper.set_output(transform="pandas")

pipe = Pipeline([("mapper", mapper), ("classifier", XGBClassifier())])
```

Thanks so much for the help @xadupre !


paranjapeved15 commented Nov 8, 2023

```python
from sklearn.base import TransformerMixin


class OverpriceCalculator(TransformerMixin):

    def __init__(self):
        pass

    def calculate_overprice(self, x, y):
        return 100 * (x - y) / y

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        X['c'] = X.apply(lambda row: self.calculate_overprice(row.a, row.b), axis=1)
        return X
```

Here is the custom transformer I wrote in case you need it to write the ONNX converter.


xadupre commented Nov 8, 2023

Thanks, I'll write the example today.


xadupre commented Nov 8, 2023

I created a PR with an example similar to yours. Feel free to add comments wherever it needs more explanation from me.

@paranjapeved15

@xadupre I am now trying to write more custom transformers like the above, and I need to refer to the various ONNX operators when writing converter functions. Is there a documentation page or wiki explaining the available ONNX operators?


xadupre commented Dec 8, 2023

You can look into https://onnx.ai/onnx/operators/ or https://github.com/onnx/onnx/blob/main/docs/Operators.md. The first page contains some explanation about onnx, opsets, domains, operators, ...

@paranjapeved15

@xadupre I am a bit confused about how to import these. In the PR that you created above (#1042) you imported operators like OnnxSlice, etc from skl2onnx.algebra.onnx_ops. But when I go to that location I don't see the source code for the function definitions of these operators.
Also, why are the operators on the above page named as something like "Slice" but we import it as OnnxSlice?


xadupre commented Dec 8, 2023

They are dynamically created by the package based on the operator schema and have the same signature as the operators: operator Slice becomes class OnnxSlice. When I created this API, onnx was growing with every release and I did not want to update the code for each one. These classes have the same signatures as the methods described at https://github.com/microsoft/onnxscript/blob/main/onnxscript/onnx_opset/_impl/opset18.py (without self).

@paranjapeved15

So all onnx operator imports would look like skl2onnx.algebra.onnx_ops.*?


xadupre commented Dec 8, 2023

Yes.

@addisonklinke

addisonklinke commented Feb 7, 2024

@xadupre thanks for all your documentation on registering custom converters!

Regarding the available operators to use in the operator converter, I saw this list in the ONNX docs. However, it appears to have different namespaces like ai.onnx and ai.onnx.ml (each with their own opset versions). When I inspect an example ONNX pipeline in Netron, I see that both namespaces are imported:

[Netron screenshot: the model imports both the ai.onnx and ai.onnx.ml opsets]

convert_sklearn(target_opset=...) would allow me to alter ai.onnx, but what if I wanted a particular opset from ai.onnx.ml?

EDIT: based on this example I see that target_opset can be a dict where keys represent the different namespaces

```python
model_onnx = convert_sklearn(
    pipe,
    "pipeline_xgboost",
    [("input", FloatTensorType([None, 2]))],
    target_opset={"": 12, "ai.onnx.ml": 2},
)
```
