Support for PyCaret transformers #175

Closed
szymoonl opened this issue Jun 21, 2022 · 3 comments

@szymoonl

Following this comment, I tried to convert a PyCaret model as follows:

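import pickle

from sklearn2pmml.pipeline import PMMLPipeline

# get_config, create_model and finalize_model are PyCaret functions;
# PyCaret's setup() must already have been run in this session.
prep_pipe = get_config('prep_pipe')  # preprocessing pipeline built by PyCaret
dt = create_model('dt')              # decision tree
final_dt = finalize_model(dt)        # refit on the full dataset

# Wrap the preprocessing steps and the final model in a single PMMLPipeline
pmml_pipeline = PMMLPipeline([
    ("prep_pipe", prep_pipe),
    ("final_model", final_dt)
])

# Pickle the pipeline so that the JPMML-SkLearn command-line converter can read it
with open("test_pmml.pkl", "wb") as pf:
    pickle.dump(pmml_pipeline, pf)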

sklearn: 0.23.2
sklearn-pandas: 2.2.0
sklearn2pmml: 0.84.1
pycaret: 2.3.6
openjdk version "11.0.15" 2022-04-19

The following exception occurred during conversion using a .jar:

java -jar /jpmml-sklearn/pmml-sklearn-example/target/pmml-sklearn-example-executable-1.7-SNAPSHOT.jar --pkl-input /test_pmml.pkl --pmml-output model_pmml.pmml
Jun 21, 2022 8:06:26 AM org.jpmml.sklearn.example.Main run
INFO: Parsing PKL..
Jun 21, 2022 8:06:26 AM org.jpmml.sklearn.example.Main run
INFO: Parsed PKL in 57 ms.
Jun 21, 2022 8:06:26 AM org.jpmml.sklearn.example.Main run
INFO: Converting PKL to PMML..
Jun 21, 2022 8:06:26 AM sklearn2pmml.pipeline.PMMLPipeline initTargetFields
WARNING: Attribute 'sklearn2pmml.pipeline.PMMLPipeline.target_fields' is not set. Assuming y as the name of the target field
Jun 21, 2022 8:06:26 AM org.jpmml.sklearn.example.Main run
SEVERE: Failed to convert PKL to PMML
java.lang.IllegalArgumentException: The transformer object (Python class pycaret.internal.preprocess.DataTypes_Auto_infer) is not a supported Transformer
at org.jpmml.python.CastFunction.apply(CastFunction.java:47)
at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:108)
at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:95)
at com.google.common.collect.Lists$TransformingRandomAccessList.get(Lists.java:638)
at sklearn2pmml.pipeline.PMMLPipeline.getHead(PMMLPipeline.java:629)
at sklearn2pmml.pipeline.PMMLPipeline.getHead(PMMLPipeline.java:642)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:198)
at org.jpmml.sklearn.example.Main.run(Main.java:226)
at org.jpmml.sklearn.example.Main.main(Main.java:151)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Transformer
at java.base/java.lang.Class.cast(Class.java:3605)
at org.jpmml.python.CastFunction.apply(CastFunction.java:45)
... 8 more

Exception in thread "main" java.lang.IllegalArgumentException: The transformer object (Python class pycaret.internal.preprocess.DataTypes_Auto_infer) is not a supported Transformer
at org.jpmml.python.CastFunction.apply(CastFunction.java:47)
at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:108)
at sklearn.pipeline.Pipeline$1.apply(Pipeline.java:95)
at com.google.common.collect.Lists$TransformingRandomAccessList.get(Lists.java:638)
at sklearn2pmml.pipeline.PMMLPipeline.getHead(PMMLPipeline.java:629)
at sklearn2pmml.pipeline.PMMLPipeline.getHead(PMMLPipeline.java:642)
at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:198)
at org.jpmml.sklearn.example.Main.run(Main.java:226)
at org.jpmml.sklearn.example.Main.main(Main.java:151)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Transformer
at java.base/java.lang.Class.cast(Class.java:3605)
at org.jpmml.python.CastFunction.apply(CastFunction.java:45)
... 8 more

How to solve this? 🤔
Thank you in advance!

@vruusmann
Member

Following this comment, I tried to convert a PyCaret model as follows:

Great to see that this old workaround is still valid!

However, I wonder if PyCaret has "systematized" their workflows, so that they could be programmatically converted to standard Scikit-Learn pipeline objects.

Exception in thread "main" java.lang.IllegalArgumentException: The transformer object (Python class pycaret.internal.preprocess.DataTypes_Auto_infer) is not a supported Transformer

As the exception message points out, there is a custom PyCaret transformer class, pycaret.internal.preprocess.DataTypes_Auto_infer, in your pipeline.

Potential solutions:

  1. Write a SkLearn2PMML/JPMML-SkLearn handler for this custom transformer class.
  2. Remove this step from the pipeline. Maybe if you specify column names/types explicitly, then PyCaret will have all the necessary information available, and won't perform any inference work by itself.

For starters, try to convert the model without pre-processing. When you can get this part working, only then start adding complexity (such as pre-processing).
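To see which steps would need a handler or would have to be removed, the steps can be grouped by the package that defines them. The following is only a rough sketch: the helper names `top_level_package` and `split_steps` are made up for illustration, and the `KNOWN_PACKAGES` allow-list is an assumption, not the converter's actual list of supported classes.

```python
# Assumption: an illustrative allow-list, NOT the converter's real support matrix.
KNOWN_PACKAGES = ("sklearn", "sklearn2pmml")


def top_level_package(obj):
    """Return the top-level package that defines the class of obj."""
    return type(obj).__module__.split(".")[0]


def split_steps(steps, known=KNOWN_PACKAGES):
    """Split (name, step) pairs into (kept, flagged) lists.

    A step is flagged when its class is defined outside the known packages.
    """
    kept, flagged = [], []
    for name, step in steps:
        if top_level_package(step) in known:
            kept.append((name, step))
        else:
            flagged.append((name, step))
    return kept, flagged
```

With a real Scikit-Learn pipeline, the input would be its `steps` attribute, for example `split_steps(prep_pipe.steps)`. Every flagged entry is a candidate for either solution 1 or solution 2 above.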

vruusmann changed the title from "Failed pycaret model conversion" to "Support for PyCaret transformers" on Jun 21, 2022
@szymoonl
Author

When trying to convert the model alone without preprocessing, the following error appears:

Aug 16, 2022 12:48:12 PM org.jpmml.sklearn.example.Main run
INFO: Parsing PKL..
Aug 16, 2022 12:48:12 PM org.jpmml.sklearn.example.Main run
SEVERE: Failed to parse PKL
net.razorvine.pickle.PickleException: failed to setstate()
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:395)
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:220)
at org.jpmml.python.CustomUnpickler.dispatch(CustomUnpickler.java:31)
at org.jpmml.python.PickleUtil$1.dispatch(PickleUtil.java:64)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:109)
at org.jpmml.python.PickleUtil.unpickle(PickleUtil.java:85)
at org.jpmml.sklearn.example.Main.run(Main.java:163)
at org.jpmml.sklearn.example.Main.main(Main.java:151)
Caused by: java.lang.NoSuchMethodException: net.razorvine.pickle.objects.ClassDict.setstate(java.lang.Integer)
at java.base/java.lang.Class.getMethod(Class.java:2108)
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:392)
... 7 more

Exception in thread "main" net.razorvine.pickle.PickleException: failed to setstate()
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:395)
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:220)
at org.jpmml.python.CustomUnpickler.dispatch(CustomUnpickler.java:31)
at org.jpmml.python.PickleUtil$1.dispatch(PickleUtil.java:64)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:109)
at org.jpmml.python.PickleUtil.unpickle(PickleUtil.java:85)
at org.jpmml.sklearn.example.Main.run(Main.java:163)
at org.jpmml.sklearn.example.Main.main(Main.java:151)
Caused by: java.lang.NoSuchMethodException: net.razorvine.pickle.objects.ClassDict.setstate(java.lang.Integer)
at java.base/java.lang.Class.getMethod(Class.java:2108)
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:392)
... 7 more

The model is a random forest, but it was trained on a GPU, so it is a cuML object:

RandomForestClassifier()
<class 'cuml.ensemble.randomforestclassifier.RandomForestClassifier'>

Converting the model alone, trained without a GPU as a scikit-learn object, works without problems. 🤔
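A small guard like the one below could catch this before pickling by checking which package the estimator's class comes from. It is a sketch only; `qualified_class_name` and `require_package` are hypothetical helper names, and the check looks at nothing but the class's module path.

```python
def qualified_class_name(obj):
    """Return the fully qualified class name, e.g. 'pkg.module.ClassName'."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


def require_package(obj, package):
    """Raise TypeError unless the class of obj is defined inside `package`."""
    module = type(obj).__module__
    if module != package and not module.startswith(package + "."):
        raise TypeError(
            f"expected an object from '{package}', got {qualified_class_name(obj)}"
        )
    return obj
```

Calling `require_package(model, "sklearn")` on the GPU-trained model would raise a TypeError that names the cuml class, while the CPU-trained scikit-learn model would pass through unchanged.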



When converting the model together with the preprocessing pipeline, I got the exception below:

Aug 16, 2022 12:48:32 PM org.jpmml.sklearn.example.Main run
INFO: Parsing PKL..
Aug 16, 2022 12:48:32 PM org.jpmml.sklearn.example.Main run
SEVERE: Failed to parse PKL
net.razorvine.pickle.PickleException: failed to setstate()
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:395)
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:220)
at org.jpmml.python.CustomUnpickler.dispatch(CustomUnpickler.java:31)
at org.jpmml.python.PickleUtil$1.dispatch(PickleUtil.java:64)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:109)
at org.jpmml.python.PickleUtil.unpickle(PickleUtil.java:85)
at org.jpmml.sklearn.example.Main.run(Main.java:163)
at org.jpmml.sklearn.example.Main.main(Main.java:151)
Caused by: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:393)
... 7 more
Caused by: net.razorvine.pickle.PickleException: Expected 8 attribute(s), got 9 attribute(s)
at org.jpmml.python.CustomPythonObject.createAttributeMap(CustomPythonObject.java:81)
at numpy.DType.setstate(DType.java:50)
... 12 more

Exception in thread "main" net.razorvine.pickle.PickleException: failed to setstate()
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:395)
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:220)
at org.jpmml.python.CustomUnpickler.dispatch(CustomUnpickler.java:31)
at org.jpmml.python.PickleUtil$1.dispatch(PickleUtil.java:64)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:109)
at org.jpmml.python.PickleUtil.unpickle(PickleUtil.java:85)
at org.jpmml.sklearn.example.Main.run(Main.java:163)
at org.jpmml.sklearn.example.Main.main(Main.java:151)
Caused by: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at net.razorvine.pickle.Unpickler.load_build(Unpickler.java:393)
... 7 more
Caused by: net.razorvine.pickle.PickleException: Expected 8 attribute(s), got 9 attribute(s)
at org.jpmml.python.CustomPythonObject.createAttributeMap(CustomPythonObject.java:81)
at numpy.DType.setstate(DType.java:50)
... 12 more

@vruusmann
Member

@szymoonl Please open a new issue for each unsupported ML framework.

Otherwise I'll classify all your messages as "spam", and send to trash.
