graph may be disconnected with StackingClassifier #1069

Closed
DiTo97 opened this issue Feb 4, 2024 · 7 comments

DiTo97 commented Feb 4, 2024

Hello,

I have a complex scikit-learn pipeline which I have been trying to convert to ONNX.

The complexity comes from "masking out" parts of the input for different components of the pipeline and from using a LightGBM classifier as the base model. The pipeline trains and runs successfully with standard scikit-learn.

To put the pipeline together and convert it to ONNX I have drawn inspiration from the following tutorials:

The error I am getting when trying to export the pipeline is the following:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
[<ipython-input-239-eef2369a13a5>](https://localhost:8080/#) in <cell line: 1>()
----> 1 exported = to_onnx(
      2     model,
      3     X=numpy.asarray(sample),
      4     name="classifier",
      5     target_opset={"": 12, "ai.onnx.ml": 2},

3 frames
[/usr/local/lib/python3.10/dist-packages/skl2onnx/common/_container.py](https://localhost:8080/#) in ensure_topological_order(self)
    944                 for n in self.nodes
    945             )
--> 946             raise RuntimeError(
    947                 "After %d iterations for %d nodes, still unable "
    948                 "to sort names %r. The graph may be disconnected. "

RuntimeError: After 2 iterations for 34 nodes, still unable to sort names {'probability_tensor2', 'output_probability1', 'output_label', 'probability_tensor_castio', 'probability_tensor2_castio', 'reshaped_result', 'output_label1', 'probability_tensor', 'argmax_output', 'merged_stacked_proba', 'probability_tensor3', 'probability_tensor1', 'merged_probability_tensor', 'array_feature_extractor_result', 'output_probability'}. The graph may be disconnected. List of operators: 
Cast(probability_tensor) -> [probability_tensor_castio]
Cast1(probability_tensor1) -> [probability_tensor1_castio]
Concat(probability_tensor_castio, probability_tensor1_castio) -> [merged_probability_tensor]
N3(merged_probability_tensor) -> [merged_stacked_proba]
Cast2(probability_tensor2) -> [probability_tensor2_castio]
OpProb(probability_tensor2_castio) -> [probabilities]
ArgMax(probability_tensor2_castio) -> [argmax_output]
ArrayFeatureExtractor(classes#0, argmax_output) -> [array_feature_extractor_result]
Reshape(array_feature_extractor_result, shape_tensor#0) -> [reshaped_result]
Cast3(reshaped_result) -> [label]
IdSklearnPipeline(output_label) -> [label1]
CastSklearnPipeline(output_probability) -> [probability_tensor]
IdSklearnPipeline1(output_label1) -> [label2]
CastSklearnPipeline1(output_probability1) -> [probability_tensor1]
LinearClassifier(merged_stacked_proba) -> [label3, probability_tensor3]
Normalizer(probability_tensor3) -> [probability_tensor2]
--
--all-nodes--
--
Cast|Cast(probability_tensor) -> [probability_tensor_castio]
Cast|Cast1(probability_tensor1) -> [probability_tensor1_castio]
Concat|Concat(probability_tensor_castio, probability_tensor1_castio) -> [merged_probability_tensor]
Identity|N3(merged_probability_tensor) -> [merged_stacked_proba]
Cast|Cast2(probability_tensor2) -> [probability_tensor2_castio]
Identity|OpProb(probability_tensor2_castio) -> [probabilities]
ArgMax|ArgMax(probability_tensor2_castio) -> [argmax_output]
ArrayFeatureExtractor|ArrayFeatureExtractor(classes#0, argmax_output) -> [array_feature_extractor_result]
Reshape|Reshape(array_feature_extractor_result, shape_tensor#0) -> [reshaped_result]
Cast|Cast3(reshaped_result) -> [label]
Identity|IdSklearnPipeline(output_label) -> [label1]
Cast|CastSklearnPipeline(output_probability) -> [probability_tensor]
Identity|IdSklearnPipeline1(output_label1) -> [label2]
Cast|CastSklearnPipeline1(output_probability1) -> [probability_tensor1]
LinearClassifier|LinearClassifier(merged_stacked_proba) -> [label3, probability_tensor3]
Normalizer|Normalizer(probability_tensor3) -> [probability_tensor2]
ArrayFeatureExtractor|ArrayFeatureExtractor1(X#0, column_indices#0) -> [extracted_feature_columns]
Identity|Identity(extracted_feature_columns#2) -> [variable]
Cast|Cast4(variable#4) -> [variable1]
Sub|Su_Sub(variable1#6, Su_Subcst#0) -> [Su_C0]
Div|Di_Div(Su_C0#8, Di_Divcst#0) -> [variable2]
Cast|Cast5(variable2#10) -> [variable3]
TreeEnsembleClassifier|LightGbmLGBMClassifier(variable3#12) -> [label_tensor, probability_tensor4]
Identity|Identity1(label_tensor#14) -> [label4]
Identity|N24(probability_tensor4#14) -> [probabilities1]
ArrayFeatureExtractor|ArrayFeatureExtractor2(X#0, column_indices1#0) -> [extracted_feature_columns1]
Identity|Identity2(extracted_feature_columns1#2) -> [variable4]
Cast|Cast6(variable4#4) -> [variable5]
Sub|Su_Sub1(variable5#6, Su_Subcst1#0) -> [Su_C02]
Div|Di_Div1(Su_C02#8, Di_Divcst1#0) -> [variable6]
Cast|Cast7(variable6#10) -> [variable7]
TreeEnsembleClassifier|LightGbmLGBMClassifier1(variable7#12) -> [label_tensor1, probability_tensor5]
Identity|Identity3(label_tensor1#14) -> [label5]
Identity|N33(probability_tensor5#14) -> [probabilities2]

I will attach the archived Jupyter notebook for reproducibility on Google Colab: notebook.zip

Am I missing something in the pipeline definition or in the export? What is wrong?
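
One way to narrow down an error like this is to check the operator listing for inputs that no node produces. The sketch below is a reading aid for the log, not part of skl2onnx: `parse_nodes` and `dangling_inputs` are hypothetical helpers, and it assumes that a `#0` suffix marks a graph input or constant while other `#k` suffixes are only ordering annotations. It runs on a subset of the lines from the listing above.

```python
import re

# Subset of the "--all-nodes--" listing: the two stacking wrapper nodes for the
# first estimator, followed by the first LightGBM branch.
LISTING = """
Identity|IdSklearnPipeline(output_label) -> [label1]
Cast|CastSklearnPipeline(output_probability) -> [probability_tensor]
ArrayFeatureExtractor|ArrayFeatureExtractor1(X#0, column_indices#0) -> [extracted_feature_columns]
Identity|Identity(extracted_feature_columns#2) -> [variable]
Cast|Cast4(variable#4) -> [variable1]
Sub|Su_Sub(variable1#6, Su_Subcst#0) -> [Su_C0]
Div|Di_Div(Su_C0#8, Di_Divcst#0) -> [variable2]
Cast|Cast5(variable2#10) -> [variable3]
TreeEnsembleClassifier|LightGbmLGBMClassifier(variable3#12) -> [label_tensor, probability_tensor4]
Identity|Identity1(label_tensor#14) -> [label4]
Identity|N24(probability_tensor4#14) -> [probabilities1]
"""

# Matches "OpType|NodeName(in1, in2) -> [out1, out2]"; the "OpType|" part is optional.
NODE_RE = re.compile(r"^(?:\w+\|)?(\w+)\(([^)]*)\)\s*->\s*\[([^\]]*)\]$")


def parse_nodes(listing):
    """Return a list of (node_name, inputs, outputs) parsed from the log lines."""
    nodes = []
    for line in listing.strip().splitlines():
        match = NODE_RE.match(line.strip())
        if match is None:
            continue  # skip separators such as "--"
        name, ins, outs = match.groups()
        inputs = [tok.strip() for tok in ins.split(",") if tok.strip()]
        outputs = [tok.strip() for tok in outs.split(",") if tok.strip()]
        nodes.append((name, inputs, outputs))
    return nodes


def dangling_inputs(nodes):
    """Return the sorted input names that no listed node produces."""
    produced = {out for _, _, outs in nodes for out in outs}
    dangling = set()
    for _, inputs, _ in nodes:
        for token in inputs:
            base, _, order = token.partition("#")
            if order == "0":
                continue  # assumption: "#0" marks a graph input or constant
            if base not in produced:
                dangling.add(base)
    return sorted(dangling)


DANGLING = dangling_inputs(parse_nodes(LISTING))
print(DANGLING)
```

On these lines the check reports `output_label` and `output_probability`: the stacking wrapper nodes consume them, while the LightGBM branch ends in `label4` and `probabilities1`, which nothing in the listing consumes.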

Collaborator

xadupre commented Feb 8, 2024

skl2onnx cannot translate an expression such as Identity = lambda: preprocessing.FunctionTransformer() because it does not know how to convert a custom function. In your case, you should replace it with 'passthrough' to tell sklearn to output the columns without any modification. For other custom functions, this page gives more options: https://onnx.ai/sklearn-onnx/auto_tutorial/plot_jfunction_transformer.html.
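
As a minimal sketch of the 'passthrough' suggestion, a column selection can be written without any custom function. The feature names and the input array below are made up for illustration; only `ColumnTransformer` and the "passthrough" specifier come from scikit-learn itself.

```python
import numpy as np
from sklearn.compose import ColumnTransformer

# Hypothetical feature names, used only to pick column indices by prefix.
features = ["facelandmark_0", "poselandmark_0", "facelandmark_1", "poselandmark_1"]
face_columns = [i for i, name in enumerate(features) if name.startswith("facelandmark")]

# "passthrough" forwards the selected columns unchanged; the remaining columns
# are dropped because ColumnTransformer defaults to remainder="drop".
selector = ColumnTransformer([("identity", "passthrough", face_columns)])

X = np.arange(8, dtype=np.float32).reshape(2, 4)
X_face = selector.fit_transform(X)
print(X_face)
```

Because no Python callable is involved, the converter only has to express a column selection.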

Author

DiTo97 commented Feb 8, 2024

I replaced all Identity() instances with the "passthrough" specifier, but I get the same error, @xadupre:

RuntimeError                              Traceback (most recent call last)
[<ipython-input-55-eef2369a13a5>](https://localhost:8080/#) in <cell line: 1>()
----> 1 exported = to_onnx(
      2     model,
      3     X=numpy.asarray(sample),
      4     name="classifier",
      5     target_opset={"": 12, "ai.onnx.ml": 2},

3 frames
[/usr/local/lib/python3.10/dist-packages/skl2onnx/common/_container.py](https://localhost:8080/#) in ensure_topological_order(self)
    944                 for n in self.nodes
    945             )
--> 946             raise RuntimeError(
    947                 "After %d iterations for %d nodes, still unable "
    948                 "to sort names %r. The graph may be disconnected. "

RuntimeError: After 2 iterations for 32 nodes, still unable to sort names {'probability_tensor2', 'merged_stacked_proba', 'merged_probability_tensor', 'argmax_output', 'probability_tensor_castio', 'probability_tensor', 'array_feature_extractor_result', 'reshaped_result', 'output_probability1', 'probability_tensor3', 'output_label1', 'output_label', 'output_probability', 'probability_tensor1', 'probability_tensor2_castio'}. The graph may be disconnected. List of operators: 
Cast(probability_tensor) -> [probability_tensor_castio]
Cast1(probability_tensor1) -> [probability_tensor1_castio]
Concat(probability_tensor_castio, probability_tensor1_castio) -> [merged_probability_tensor]
N3(merged_probability_tensor) -> [merged_stacked_proba]
Cast2(probability_tensor2) -> [probability_tensor2_castio]
OpProb(probability_tensor2_castio) -> [probabilities]
ArgMax(probability_tensor2_castio) -> [argmax_output]
ArrayFeatureExtractor(classes#0, argmax_output) -> [array_feature_extractor_result]
Reshape(array_feature_extractor_result, shape_tensor#0) -> [reshaped_result]
Cast3(reshaped_result) -> [label]
IdSklearnPipeline(output_label) -> [label1]
CastSklearnPipeline(output_probability) -> [probability_tensor]
IdSklearnPipeline1(output_label1) -> [label2]
CastSklearnPipeline1(output_probability1) -> [probability_tensor1]
LinearClassifier(merged_stacked_proba) -> [label3, probability_tensor3]
Normalizer(probability_tensor3) -> [probability_tensor2]
--
--all-nodes--
--
Cast|Cast(probability_tensor) -> [probability_tensor_castio]
Cast|Cast1(probability_tensor1) -> [probability_tensor1_castio]
Concat|Concat(probability_tensor_castio, probability_tensor1_castio) -> [merged_probability_tensor]
Identity|N3(merged_probability_tensor) -> [merged_stacked_proba]
Cast|Cast2(probability_tensor2) -> [probability_tensor2_castio]
Identity|OpProb(probability_tensor2_castio) -> [probabilities]
ArgMax|ArgMax(probability_tensor2_castio) -> [argmax_output]
ArrayFeatureExtractor|ArrayFeatureExtractor(classes#0, argmax_output) -> [array_feature_extractor_result]
Reshape|Reshape(array_feature_extractor_result, shape_tensor#0) -> [reshaped_result]
Cast|Cast3(reshaped_result) -> [label]
Identity|IdSklearnPipeline(output_label) -> [label1]
Cast|CastSklearnPipeline(output_probability) -> [probability_tensor]
Identity|IdSklearnPipeline1(output_label1) -> [label2]
Cast|CastSklearnPipeline1(output_probability1) -> [probability_tensor1]
LinearClassifier|LinearClassifier(merged_stacked_proba) -> [label3, probability_tensor3]
Normalizer|Normalizer(probability_tensor3) -> [probability_tensor2]
ArrayFeatureExtractor|ArrayFeatureExtractor1(X#0, column_indices#0) -> [extracted_feature_columns]
Cast|Cast4(extracted_feature_columns#2) -> [variable]
Sub|Su_Sub(variable#4, Su_Subcst#0) -> [Su_C0]
Div|Di_Div(Su_C0#6, Di_Divcst#0) -> [variable1]
Cast|Cast5(variable1#8) -> [variable2]
TreeEnsembleClassifier|LightGbmLGBMClassifier(variable2#10) -> [label_tensor, probability_tensor4]
Identity|Identity(label_tensor#12) -> [label4]
Identity|N23(probability_tensor4#12) -> [probabilities1]
ArrayFeatureExtractor|ArrayFeatureExtractor2(X#0, column_indices1#0) -> [extracted_feature_columns1]
Cast|Cast6(extracted_feature_columns1#2) -> [variable3]
Sub|Su_Sub1(variable3#4, Su_Subcst1#0) -> [Su_C02]
Div|Di_Div1(Su_C02#6, Di_Divcst1#0) -> [variable4]
Cast|Cast7(variable4#8) -> [variable5]
TreeEnsembleClassifier|LightGbmLGBMClassifier1(variable5#10) -> [label_tensor1, probability_tensor5]
Identity|Identity1(label_tensor1#12) -> [label5]
Identity|N31(probability_tensor5#12) -> [probabilities2]

Did replacing it with "passthrough" work for you? The model pipeline definition after the fix:

from typing import Any

import lightgbm
import numpy
import skl2onnx.sklapi
from sklearn import base, compose, ensemble, linear_model, pipeline, preprocessing


def abc_Embedder() -> list[tuple[str, Any]]:
    # Steps shared by both branches: cast to float64 for scaling, then cast back
    # (CastTransformer default dtype) before the LightGBM base model.
    return [
        ("cast64", skl2onnx.sklapi.CastTransformer(dtype=numpy.float64)),
        ("scaler", preprocessing.StandardScaler()),
        ("cast32", skl2onnx.sklapi.CastTransformer()),
        ("basemodel", lightgbm.LGBMClassifier()),
    ]


def Classifier(features: list[str]) -> base.BaseEstimator:
    # Select each branch's columns by feature-name prefix and pass them through unchanged.
    facecolumns = [i for i, x in enumerate(features) if x.startswith("facelandmark")]
    posecolumns = [i for i, x in enumerate(features) if x.startswith("poselandmark")]
    facepreprocessor = compose.ColumnTransformer([("identity", "passthrough", facecolumns)])
    posepreprocessor = compose.ColumnTransformer([("identity", "passthrough", posecolumns)])

    # Each embedder is its own Pipeline, nested inside the branch pipeline below.
    faceembedder = pipeline.Pipeline(steps=abc_Embedder())
    poseembedder = pipeline.Pipeline(steps=abc_Embedder())

    facepipeline = pipeline.Pipeline([("preprocessor", facepreprocessor), ("embedder", faceembedder)])
    posepipeline = pipeline.Pipeline([("preprocessor", posepreprocessor), ("embedder", poseembedder)])

    head = linear_model.LogisticRegression(multi_class="multinomial")

    classifier = ensemble.StackingClassifier(
        estimators=[("facepipeline", facepipeline), ("posepipeline", posepipeline)],
        final_estimator=head,
    )

    return classifier

Any idea why the error persists?

Author

DiTo97 commented Feb 15, 2024

Any pointers on that, @xadupre?

Author

DiTo97 commented Feb 21, 2024

After digging deeper, it seems the problem arises only when exporting the stacking classifier with the LightGBM model as backbone; the LightGBM model alone and the preprocessing export fine, @xadupre.

I have also tried replacing the LightGBM classifiers with logistic regressions or with hist gradient boosting classifiers, and replacing the logistic regression head with other available classifiers, but the error is always the same.

The weird thing is that I tried the notebook shared in #817, which also uses a stacking classifier, and the export works.

Author

DiTo97 commented Feb 21, 2024

For some reason, the nested pipeline definition was the problem, @xadupre.

The following definition works as expected:

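from typing import Any

import lightgbm
import numpy
import skl2onnx.sklapi
from sklearn import base, compose, ensemble, linear_model, pipeline, preprocessing


def abc_Embedder() -> list[tuple[str, Any]]:
    # Steps shared by both branches: cast to float64 for scaling, then cast back
    # (CastTransformer default dtype) before the LightGBM base model.
    return [
        ("cast64", skl2onnx.sklapi.CastTransformer(dtype=numpy.float64)),
        ("scaler", preprocessing.StandardScaler()),
        ("cast32", skl2onnx.sklapi.CastTransformer()),
        ("basemodel", lightgbm.LGBMClassifier()),
    ]


def Classifier(features: list[str]) -> base.BaseEstimator:
    # Select each branch's columns by feature-name prefix and pass them through unchanged.
    facecolumns = [i for i, x in enumerate(features) if x.startswith("facelandmark")]
    posecolumns = [i for i, x in enumerate(features) if x.startswith("poselandmark")]
    facepreprocessor = compose.ColumnTransformer([("identity", "passthrough", facecolumns)])
    posepreprocessor = compose.ColumnTransformer([("identity", "passthrough", posecolumns)])

    # Keep the embedder steps as plain lists instead of wrapping them in a Pipeline.
    faceembedder = abc_Embedder()
    poseembedder = abc_Embedder()

    # Flat layout: the embedder steps are unpacked directly into the branch pipeline.
    facepipeline = pipeline.Pipeline([("preprocessor", facepreprocessor), *faceembedder])
    posepipeline = pipeline.Pipeline([("preprocessor", posepreprocessor), *poseembedder])

    head = linear_model.LogisticRegression(multi_class="multinomial")

    classifier = ensemble.StackingClassifier(
        estimators=[("facepipeline", facepipeline), ("posepipeline", posepipeline)],
        final_estimator=head,
    )

    return classifier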
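
The same flattening can be done generically when the nested pipelines already exist. The sketch below is only an illustration of that workaround: `flatten_steps` is a hypothetical helper, not part of scikit-learn or skl2onnx, and the estimators in the example are placeholders rather than the ones from this issue.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler


def flatten_steps(steps, prefix=""):
    """Expand nested Pipeline steps into one flat list of (name, estimator)."""
    flat = []
    for name, step in steps:
        if isinstance(step, Pipeline):
            # Prefix inner step names with the parent name to keep them unique.
            flat.extend(flatten_steps(step.steps, prefix=f"{prefix}{name}_"))
        else:
            flat.append((f"{prefix}{name}", step))
    return flat


# Placeholder nested pipeline: an outer step followed by an inner Pipeline.
nested = Pipeline([
    ("preprocessor", StandardScaler()),
    ("embedder", Pipeline([
        ("scaler", MinMaxScaler()),
        ("basemodel", LogisticRegression()),
    ])),
])

flat = Pipeline(flatten_steps(nested.steps))
flat_names = [name for name, _ in flat.steps]
print(flat_names)
```

The helper reuses the same estimator objects, so it is meant to be applied before fitting.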

Collaborator

xadupre commented Feb 21, 2024

Sorry for the delay. I was able to create a unit test from the notebook you shared, and I can replicate the bug. I will work on a fix.

Collaborator

xadupre commented Apr 4, 2024

PR #1072 fixes it. Sorry again for the delay. Feel free to reopen this issue if you run into another problem with it.

xadupre closed this as completed on Apr 4, 2024