Error converting pipeline: "Failed to create ONNX node. Undefined attribute pair (default_string, None) found for type 'LabelEncoder' and version 2" #830

Closed
torshind opened this issue Feb 21, 2022 · 3 comments · Fixed by #834

Hello,
I attach an example that reproduces the error.

import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.compose import make_column_selector as selector
from sklearn.datasets import fetch_openml
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OrdinalEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

np.random.seed(0)

X, y = fetch_openml("titanic", version=1, as_frame=True, return_X_y=True)

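# Rename the 'home.dest' column to 'home_dest'.
X['home_dest'] = X['home.dest']
X = X.drop('home.dest', axis=1)

# Cast every non-numeric column to str, then map the string "None" to ''
# so that SimpleImputer(missing_values='') below treats it as missing.
X.loc[:, X.select_dtypes(exclude=np.number).columns] = X.select_dtypes(exclude=np.number).apply(lambda x: x.astype(str))
X = X.replace("None", '')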

print(X.dtypes)

numeric_transformer = Pipeline(
    steps=[("num_imputer", SimpleImputer(strategy="median")),
           ("scaler", StandardScaler())]
)

categorical_transformer = Pipeline(
    steps=[("cat_imputer", SimpleImputer(strategy='most_frequent', missing_values='')),
           ("encoder", OrdinalEncoder(dtype=np.int32,
                                      handle_unknown='use_encoded_value',
                                      unknown_value=-999))]
)

preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, selector(dtype_include=np.number)),
        ("cat", categorical_transformer, selector(dtype_exclude=np.number)),
    ]
)
clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf.fit(X_train, y_train)
print("model score: %.3f" % clf.score(X_test, y_test))

from skl2onnx.common.data_types import (FloatTensorType, Int64TensorType, StringTensorType)

def convert_dataframe_schema(df, drop=None):
    inputs = []
    for k, v in zip(df.columns, df.dtypes):
        if drop is not None and k in drop:
            continue
        if v == 'int64':
            t = Int64TensorType([None, 1])
        elif v == 'float64':
            t = FloatTensorType([None, 1])
        else:
            t = StringTensorType([None, 1])
        inputs.append((k, t))
    return inputs

initial_inputs = convert_dataframe_schema(X)

from skl2onnx import convert_sklearn

import logging
log = logging.getLogger('skl2onnx')
log.setLevel(logging.DEBUG)
logging.basicConfig(level=logging.DEBUG)

try:
    model_onnx = convert_sklearn(clf, 'pipeline_onnx', initial_inputs,
                                 target_opset=14, verbose=1)
except Exception as e:
    print(e)
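
Since the DEBUG output is long, a small helper can narrow down which operator was being converted when the exception was raised. This is only a sketch, not part of skl2onnx: `unfinished_converters` is a hypothetical name, and it assumes that every successful conversion logs a `[Conv] call` line followed by a matching `[Conv] end` line, as the output in this report does. The sample lines are abbreviated from that output.

```python
import re

# Matches the "[Conv] call" / "[Conv] end" lines seen in the DEBUG output
# and captures the onnx_name of the operator being converted.
_CONV = re.compile(r"\[Conv\] (call|end)\b.*?onnx_name='([^']+)'")


def unfinished_converters(log_lines):
    """Return onnx_names that have a '[Conv] call' line but no '[Conv] end' line."""
    pending = []
    for line in log_lines:
        match = _CONV.search(line)
        if match is None:
            continue
        kind, name = match.groups()
        if kind == "call":
            pending.append(name)
        elif name in pending:
            pending.remove(name)
    return pending


# Abbreviated lines in the same shape as the DEBUG output of this report.
sample = [
    "DEBUG:skl2onnx:[Conv] call Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1')",
    "DEBUG:skl2onnx:[Conv] end - Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1')",
    "DEBUG:skl2onnx:[Conv] call Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex', outputs='merged_columns1')",
]
print(unfinished_converters(sample))
```

Run on the complete log, the last name in the returned list would be the operator whose converter started but never finished, which is the first place to look for the failing node.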

Output:

DEBUG:skl2onnx:[Var] +Variable('pclass', 'pclass', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('name', 'name', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('sex', 'sex', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('age', 'age', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('sibsp', 'sibsp', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('parch', 'parch', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('ticket', 'ticket', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('fare', 'fare', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('cabin', 'cabin', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('embarked', 'embarked', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('boat', 'boat', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('body', 'body', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] +Variable('home_dest', 'home_dest', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('pclass', 'pclass', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('name', 'name', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('sex', 'sex', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('age', 'age', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('sibsp', 'sibsp', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('parch', 'parch', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('ticket', 'ticket', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('fare', 'fare', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('cabin', 'cabin', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('embarked', 'embarked', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('boat', 'boat', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('body', 'body', type=FloatTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Var] update is_root=True for Variable('home_dest', 'home_dest', type=StringTensorType(shape=[None, 1]))
DEBUG:skl2onnx:[Op] +Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('pclass', 'pclass', type=FloatTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('age', 'age', type=FloatTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('sibsp', 'sibsp', type=FloatTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('parch', 'parch', type=FloatTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('fare', 'fare', type=FloatTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('body', 'body', type=FloatTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Var] +Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, None]))
DEBUG:skl2onnx:[Var] set parent for Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, None])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add Out Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, None])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None)
DEBUG:skl2onnx:[parsing] found alias='SklearnSimpleImputer' for type=<class 'sklearn.impute._base.SimpleImputer'>.
DEBUG:skl2onnx:[Op] +Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='', outputs='', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Op] add In Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, None])) to Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Var] +Variable('variable', 'variable', type=FloatTensorType(shape=[]))
DEBUG:skl2onnx:[Var] set parent for Variable('variable', 'variable', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Op] add Out Variable('variable', 'variable', type=FloatTensorType(shape=[])) to Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[parsing] found alias='SklearnScaler' for type=<class 'sklearn.preprocessing._data.StandardScaler'>.
DEBUG:skl2onnx:[Op] +Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='', outputs='', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Var] +Variable('variable', 'variable1', type=FloatTensorType(shape=[]))
DEBUG:skl2onnx:[Var] set parent for Variable('variable', 'variable1', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Op] add Out Variable('variable', 'variable1', type=FloatTensorType(shape=[])) to Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Op] +Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('name', 'name', type=StringTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('sex', 'sex', type=StringTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('ticket', 'ticket', type=StringTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('cabin', 'cabin', type=StringTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('embarked', 'embarked', type=StringTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('boat', 'boat', type=StringTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('home_dest', 'home_dest', type=StringTensorType(shape=[None, 1])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Var] +Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, None]))
DEBUG:skl2onnx:[Var] set parent for Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, None])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add Out Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, None])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None)
DEBUG:skl2onnx:[parsing] found alias='SklearnSimpleImputer' for type=<class 'sklearn.impute._base.SimpleImputer'>.
DEBUG:skl2onnx:[Op] +Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='', outputs='', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent'))
DEBUG:skl2onnx:[Op] add In Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, None])) to Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent'))
DEBUG:skl2onnx:[Var] +Variable('variable', 'variable2', type=StringTensorType(shape=[]))
DEBUG:skl2onnx:[Var] set parent for Variable('variable', 'variable2', type=StringTensorType(shape=[])), parent=Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent'))
DEBUG:skl2onnx:[Op] add Out Variable('variable', 'variable2', type=StringTensorType(shape=[])) to Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='variable2', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent'))
DEBUG:skl2onnx:[parsing] found alias='SklearnOrdinalEncoder' for type=<class 'sklearn.preprocessing._encoders.OrdinalEncoder'>.
DEBUG:skl2onnx:[Op] +Operator(type='SklearnOrdinalEncoder', onnx_name='SklearnOrdinalEncoder', inputs='', outputs='', raw_operator=OrdinalEncoder(dtype=<class'numpy.int32'>,handle_unknown='use_encoded_value',unknown_value=-999))
DEBUG:skl2onnx:[Var] +Variable('variable', 'variable3', type=FloatTensorType(shape=[]))
DEBUG:skl2onnx:[Var] set parent for Variable('variable', 'variable3', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnOrdinalEncoder', onnx_name='SklearnOrdinalEncoder', inputs='variable2', outputs='', raw_operator=OrdinalEncoder(dtype=<class'numpy.int32'>,handle_unknown='use_encoded_value',unknown_value=-999))
DEBUG:skl2onnx:[Op] add Out Variable('variable', 'variable3', type=FloatTensorType(shape=[])) to Operator(type='SklearnOrdinalEncoder', onnx_name='SklearnOrdinalEncoder', inputs='variable2', outputs='variable3', raw_operator=OrdinalEncoder(dtype=<class'numpy.int32'>,handle_unknown='use_encoded_value',unknown_value=-999))
DEBUG:skl2onnx:[Op] +Operator(type='SklearnConcat', onnx_name='SklearnConcat2', inputs='', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('variable', 'variable1', type=FloatTensorType(shape=[])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat2', inputs='variable1', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add In Variable('variable', 'variable3', type=FloatTensorType(shape=[])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat2', inputs='variable1,variable3', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Var] +Variable('transformed_column', 'transformed_column', type=FloatTensorType(shape=[None, None]))
DEBUG:skl2onnx:[Var] set parent for Variable('transformed_column', 'transformed_column', type=FloatTensorType(shape=[None, None])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat2', inputs='variable1,variable3', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add Out Variable('transformed_column', 'transformed_column', type=FloatTensorType(shape=[None, None])) to Operator(type='SklearnConcat', onnx_name='SklearnConcat2', inputs='variable1,variable3', outputs='transformed_column', raw_operator=None)
DEBUG:skl2onnx:[parsing] found alias='SklearnLinearClassifier' for type=<class 'sklearn.linear_model._logistic.LogisticRegression'>.
DEBUG:skl2onnx:[Op] +Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='', outputs='', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Var] +Variable('label', 'label', type=Int64TensorType(shape=[]))
DEBUG:skl2onnx:[Var] +Variable('probabilities', 'probabilities', type=FloatTensorType(shape=[]))
DEBUG:skl2onnx:[Var] set parent for Variable('label', 'label', type=Int64TensorType(shape=[])), parent=Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='transformed_column', outputs='', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Op] add Out Variable('label', 'label', type=Int64TensorType(shape=[])) to Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='transformed_column', outputs='label', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Var] set parent for Variable('probabilities', 'probabilities', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='transformed_column', outputs='label', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Op] add Out Variable('probabilities', 'probabilities', type=FloatTensorType(shape=[])) to Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='transformed_column', outputs='label,probabilities', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Op] +Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Var] +Variable('output_label', 'output_label', type=StringTensorType(shape=[None]))
DEBUG:skl2onnx:[Var] set parent for Variable('output_label', 'output_label', type=StringTensorType(shape=[None])), parent=Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='', raw_operator=None)
DEBUG:skl2onnx:[Op] add Out Variable('output_label', 'output_label', type=StringTensorType(shape=[None])) to Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='output_label', raw_operator=None)
DEBUG:skl2onnx:[Var] +Variable('output_probability', 'output_probability', type=SequenceType(element_type=DictionaryType(key_type=StringTensorType(shape=[None]), value_type=FloatTensorType(shape=[]))))
DEBUG:skl2onnx:[Var] set parent for Variable('output_probability', 'output_probability', type=SequenceType(element_type=DictionaryType(key_type=StringTensorType(shape=[None]), value_type=FloatTensorType(shape=[])))), parent=Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='output_label', raw_operator=None)
DEBUG:skl2onnx:[Op] add Out Variable('output_probability', 'output_probability', type=SequenceType(element_type=DictionaryType(key_type=StringTensorType(shape=[None]), value_type=FloatTensorType(shape=[])))) to Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='output_label,output_probability', raw_operator=None)
DEBUG:skl2onnx:[Op] update is_evaluated=True for Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='output_label,output_probability', raw_operator=None)
DEBUG:skl2onnx:[Var] update is_leaf=True for Variable('output_label', 'output_label', type=StringTensorType(shape=[None]))
DEBUG:skl2onnx:[Var] update is_leaf=True for Variable('output_probability', 'output_probability', type=SequenceType(element_type=DictionaryType(key_type=StringTensorType(shape=[None]), value_type=FloatTensorType(shape=[]))))
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('pclass', 'pclass', type=FloatTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('name', 'name', type=StringTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('sex', 'sex', type=StringTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('age', 'age', type=FloatTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('sibsp', 'sibsp', type=FloatTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('parch', 'parch', type=FloatTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('ticket', 'ticket', type=StringTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('fare', 'fare', type=FloatTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('cabin', 'cabin', type=StringTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('embarked', 'embarked', type=StringTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('boat', 'boat', type=StringTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('body', 'body', type=FloatTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('home_dest', 'home_dest', type=StringTensorType(shape=[None, 1])), parent=None
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, None])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None)
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('variable', 'variable', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('variable', 'variable1', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, None])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None)
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('variable', 'variable2', type=StringTensorType(shape=[])), parent=Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='variable2', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent'))
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('variable', 'variable3', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnOrdinalEncoder', onnx_name='SklearnOrdinalEncoder', inputs='variable2', outputs='variable3', raw_operator=OrdinalEncoder(dtype=<class'numpy.int32'>,handle_unknown='use_encoded_value',unknown_value=-999))
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('transformed_column', 'transformed_column', type=FloatTensorType(shape=[None, None])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat2', inputs='variable1,variable3', outputs='transformed_column', raw_operator=None)
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('label', 'label', type=Int64TensorType(shape=[])), parent=Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='transformed_column', outputs='label,probabilities', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('probabilities', 'probabilities', type=FloatTensorType(shape=[])), parent=Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='transformed_column', outputs='label,probabilities', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('output_label', 'output_label', type=StringTensorType(shape=[None])), parent=Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='output_label,output_probability', raw_operator=None)
DEBUG:skl2onnx:[Var] update is_fed=False for Variable('output_probability', 'output_probability', type=SequenceType(element_type=DictionaryType(key_type=StringTensorType(shape=[None]), value_type=FloatTensorType(shape=[])))), parent=Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='output_label,output_probability', raw_operator=None)
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None)
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None)
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='variable2', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent'))
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnOrdinalEncoder', onnx_name='SklearnOrdinalEncoder', inputs='variable2', outputs='variable3', raw_operator=OrdinalEncoder(dtype=<class'numpy.int32'>,handle_unknown='use_encoded_value',unknown_value=-999))
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnConcat', onnx_name='SklearnConcat2', inputs='variable1,variable3', outputs='transformed_column', raw_operator=None)
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnLinearClassifier', onnx_name='SklearnLinearClassifier', inputs='transformed_column', outputs='label,probabilities', raw_operator=LogisticRegression())
DEBUG:skl2onnx:[Op] update is_evaluated=False for Operator(type='SklearnZipMap', onnx_name='SklearnZipMap', inputs='label,probabilities', outputs='output_label,output_probability', raw_operator=None)
DEBUG:skl2onnx:[Shape2] call infer_types for Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None)
DEBUG:skl2onnx:[Shape-a] Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None) fed 'TrueTrueTrueTrueTrueTrue' - 'False'
DEBUG:skl2onnx:[Var] update type for Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, None]))
DEBUG:skl2onnx:[Shape-b] Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None) inputs=[Variable('pclass', 'pclass', type=FloatTensorType(shape=[None, 1])), Variable('age', 'age', type=FloatTensorType(shape=[None, 1])), Variable('sibsp', 'sibsp', type=FloatTensorType(shape=[None, 1])), Variable('parch', 'parch', type=FloatTensorType(shape=[None, 1])), Variable('fare', 'fare', type=FloatTensorType(shape=[None, 1])), Variable('body', 'body', type=FloatTensorType(shape=[None, 1]))] - outputs=[Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, 6]))]
DEBUG:skl2onnx:[Conv] call Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None) fed 'TrueTrueTrueTrueTrueTrue' - 'False'
DEBUG:skl2onnx:[Node] 'Cast' - 'pclass' -> 'pclass_cast' (name='Cast')
DEBUG:skl2onnx:[Node] 'Cast' - 'age' -> 'age_cast' (name='Cast1')
DEBUG:skl2onnx:[Node] 'Cast' - 'sibsp' -> 'sibsp_cast' (name='Cast2')
DEBUG:skl2onnx:[Node] 'Cast' - 'parch' -> 'parch_cast' (name='Cast3')
DEBUG:skl2onnx:[Node] 'Cast' - 'fare' -> 'fare_cast' (name='Cast4')
DEBUG:skl2onnx:[Node] 'Cast' - 'body' -> 'body_cast' (name='Cast5')
DEBUG:skl2onnx:[Node] 'Concat' - 'pclass_cast,age_cast,sibsp_cast,parch_cast,fare_cast,body_cast' -> 'merged_columns' (name='Concat')
DEBUG:skl2onnx:[Conv] end - Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None)
DEBUG:skl2onnx:[Op] update is_evaluated=True for Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None)
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, 6])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat', inputs='pclass,age,sibsp,parch,fare,body', outputs='merged_columns', raw_operator=None)
DEBUG:skl2onnx:[Shape2] call infer_types for Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Shape-a] Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median')) fed 'True' - 'False'
DEBUG:skl2onnx:[Shape-b] Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median')) inputs=[Variable('merged_columns', 'merged_columns', type=FloatTensorType(shape=[None, 6]))] - outputs=[Variable('variable', 'variable', type=FloatTensorType(shape=[None, 6]))]
DEBUG:skl2onnx:[Conv] call Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median')) fed 'True' - 'False'
DEBUG:skl2onnx:[Node] 'Imputer' - 'merged_columns' -> 'variable' (name='Imputer')
DEBUG:skl2onnx:[Conv] end - Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Op] update is_evaluated=True for Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('variable', 'variable', type=FloatTensorType(shape=[None, 6])), parent=Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer', inputs='merged_columns', outputs='variable', raw_operator=SimpleImputer(strategy='median'))
DEBUG:skl2onnx:[Shape2] call infer_types for Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Shape-a] Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler()) fed 'True' - 'False'
DEBUG:skl2onnx:[Shape-b] Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler()) inputs=[Variable('variable', 'variable', type=FloatTensorType(shape=[None, 6]))] - outputs=[Variable('variable', 'variable1', type=FloatTensorType(shape=[None, 6]))]
DEBUG:skl2onnx:[Conv] call Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler()) fed 'True' - 'False'
DEBUG:skl2onnx:[Node] 'Scaler' - 'variable' -> 'variable1' (name='Scaler')
DEBUG:skl2onnx:[Conv] end - Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler())
[convert_sklearn] parse_sklearn_model
[convert_sklearn] convert_topology
[convert_operators] begin
[convert_operators] iteration 1 - n_vars=0 n_ops=9
[call_converter] call converter for 'SklearnConcat'.
[call_converter] call converter for 'SklearnSimpleImputer'.
[call_converter] call converter for 'SklearnScaler'.
DEBUG:skl2onnx:[Op] update is_evaluated=True for Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('variable', 'variable1', type=FloatTensorType(shape=[None, 6])), parent=Operator(type='SklearnScaler', onnx_name='SklearnScaler', inputs='variable', outputs='variable1', raw_operator=StandardScaler())
DEBUG:skl2onnx:[Shape2] call infer_types for Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None)
DEBUG:skl2onnx:[Shape-a] Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None) fed 'TrueTrueTrueTrueTrueTrueTrue' - 'False'
DEBUG:skl2onnx:[Var] update type for Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, None]))
DEBUG:skl2onnx:[Shape-b] Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None) inputs=[Variable('name', 'name', type=StringTensorType(shape=[None, 1])), Variable('sex', 'sex', type=StringTensorType(shape=[None, 1])), Variable('ticket', 'ticket', type=StringTensorType(shape=[None, 1])), Variable('cabin', 'cabin', type=StringTensorType(shape=[None, 1])), Variable('embarked', 'embarked', type=StringTensorType(shape=[None, 1])), Variable('boat', 'boat', type=StringTensorType(shape=[None, 1])), Variable('home_dest', 'home_dest', type=StringTensorType(shape=[None, 1]))] - outputs=[Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, 7]))]
DEBUG:skl2onnx:[Conv] call Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None) fed 'TrueTrueTrueTrueTrueTrueTrue' - 'False'
[call_converter] call converter for 'SklearnConcat'.
DEBUG:skl2onnx:[Node] 'Cast' - 'name' -> 'name_cast' (name='Cast6')
DEBUG:skl2onnx:[Node] 'Cast' - 'sex' -> 'sex_cast' (name='Cast7')
DEBUG:skl2onnx:[Node] 'Cast' - 'ticket' -> 'ticket_cast' (name='Cast8')
DEBUG:skl2onnx:[Node] 'Cast' - 'cabin' -> 'cabin_cast' (name='Cast9')
DEBUG:skl2onnx:[Node] 'Cast' - 'embarked' -> 'embarked_cast' (name='Cast10')
DEBUG:skl2onnx:[Node] 'Cast' - 'boat' -> 'boat_cast' (name='Cast11')
DEBUG:skl2onnx:[Node] 'Cast' - 'home_dest' -> 'home_dest_cast' (name='Cast12')
DEBUG:skl2onnx:[Node] 'Concat' - 'name_cast,sex_cast,ticket_cast,cabin_cast,embarked_cast,boat_cast,home_dest_cast' -> 'merged_columns1' (name='Concat1')
DEBUG:skl2onnx:[Conv] end - Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None)
DEBUG:skl2onnx:[Op] update is_evaluated=True for Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None)
DEBUG:skl2onnx:[Var] update is_fed=True for Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, 7])), parent=Operator(type='SklearnConcat', onnx_name='SklearnConcat1', inputs='name,sex,ticket,cabin,embarked,boat,home_dest', outputs='merged_columns1', raw_operator=None)
DEBUG:skl2onnx:[Shape2] call infer_types for Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='variable2', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent'))
DEBUG:skl2onnx:[Shape-a] Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='variable2', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent')) fed 'True' - 'False'
DEBUG:skl2onnx:[Shape-b] Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='variable2', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent')) inputs=[Variable('merged_columns', 'merged_columns1', type=StringTensorType(shape=[None, 7]))] - outputs=[Variable('variable', 'variable2', type=StringTensorType(shape=[None, 7]))]
DEBUG:skl2onnx:[Conv] call Operator(type='SklearnSimpleImputer', onnx_name='SklearnSimpleImputer1', inputs='merged_columns1', outputs='variable2', raw_operator=SimpleImputer(missing_values='', strategy='most_frequent')) fed 'True' - 'False'
DEBUG:skl2onnx:[Init] 'zero', 7, [1]
DEBUG:skl2onnx:[Node] 'LabelEncoder' - 'zero' -> 'fillvalue' (name='N17')
[call_converter] call converter for 'SklearnSimpleImputer'.
Failed to create ONNX node. Undefined attribute pair (default_string, None) found for type 'LabelEncoder' and version 2
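
The message above names the pair `(default_string, None)`, which suggests that the `LabelEncoder` node was requested with a `default_string` attribute that had no value. The following is a minimal sketch of how such unset attributes could be spotted before a node is built. It is not skl2onnx's code and not the fix from #834. The helper `find_unset_attrs` and the attribute values are hypothetical; only the attribute names (`keys_strings`, `values_strings`, `default_string`) come from the ai.onnx.ml `LabelEncoder` operator.

```python
from typing import Any, Dict, List


def find_unset_attrs(attrs: Dict[str, Any]) -> List[str]:
    """Return the names of attributes whose value is None (hypothetical helper)."""
    return [name for name, value in attrs.items() if value is None]


# Hypothetical attribute set for a string-to-string LabelEncoder node.
# The keys are real LabelEncoder attribute names; the values are made up
# for illustration and do not come from the log above.
attrs = {
    "keys_strings": [""],
    "values_strings": ["S"],
    "default_string": None,  # the pair named in the error message
}

unset = find_unset_attrs(attrs)
print(unset)
```

Running a check like this on the attributes of a failing node is one way to tell which one was left as `None`.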

I had to convert the categorical columns to strings because using pd.NA as missing_values raises a type error.

Versions:
python 3.9.7
onnx==1.11.0
skl2onnx @ git+https://github.com/onnx/sklearn-onnx.git@c0a0eb8dd4033fdee36421665a05c9e19de4436e

@xadupre
Collaborator

xadupre commented Feb 22, 2022

I was able to replicate the issue. I'll make a PR to fix it soon.

@torshind
Author

Great, thanks :)

@emhagman

emhagman commented Mar 3, 2023

@xadupre I am also running into this issue, but with the latest skl2onnx.

Undefined attribute pair (default_string, None) found for type 'LabelEncoder' and version 2

Using:

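from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

categorical_str_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy='most_frequent', add_indicator=True, missing_values='UNKNOWN')),
    ("encoder", OneHotEncoder(sparse=False, handle_unknown='ignore')),
])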
