sklearn2pmml: 0.17.4 - InvalidOpcodeException for IsolationForest #31

Closed
dverstee opened this issue Mar 13, 2017 · 2 comments

@dverstee

Hello,

First of all thank you for creating and maintaining this great library.
Secondly, I am running into an issue when trying to create a PMML file from the following pipeline:

    IForest_pipeline = PMMLPipeline([("isolationforest", IsolationForest(n_estimators=100, random_state=0))])

The resulting error is:
SEVERE: Failed to parse PKL net.razorvine.pickle.InvalidOpcodeException: invalid pickle opcode: 248

Versions are

sklearn:  0.18.1
sklearn.externals.joblib: 0.10.3
pandas:  0.19.2
sklearn_pandas:  1.3.0
sklearn2pmml:  0.17.4
java:  1.8.0
joblib:  0.11
python:  3.6.0

The odd thing is that the following code (using KMeans) generates the PMML correctly, so the problem may be model-specific:

    IForest_pipeline = PMMLPipeline([("classifier", KMeans(n_clusters=2, random_state=0))])

This was run on Windows 7 SP1 64-bit.

Could you give me pointers on where I might be looking to solve this error?
Thanks in advance.

@vruusmann
Member

SEVERE: Failed to parse PKL net.razorvine.pickle.InvalidOpcodeException: invalid pickle opcode: 248

JPMML-SkLearn depends on the Pyrolite library for low-level PKL file parsing functionality. Apparently, Pyrolite does not recognize Pickle protocol opcode 248 (hex 0xf8).

I can't find the definition of this opcode in Python 3.6 codebase:
https://github.com/python/cpython/blob/3.6/Lib/pickle.py#L102-L179
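The same lookup can be done from the Python side instead of reading the source. The following is a minimal sketch that uses the standard-library `pickletools.code2op` table of whichever interpreter runs it; `describe_opcode` is a hypothetical helper name chosen for illustration, not part of any library.

```python
import pickletools


def describe_opcode(code: int) -> str:
    """Look up a byte value in the opcode table of the running Python."""
    # pickletools.code2op maps a one-character string to an OpcodeInfo record.
    info = pickletools.code2op.get(chr(code))
    if info is None:
        return f"0x{code:02x} ({code}): not a defined pickle opcode"
    return f"0x{code:02x} ({code}): {info.name}, introduced in protocol {info.proto}"


if __name__ == "__main__":
    print(describe_opcode(248))   # the byte value reported by Pyrolite
    print(describe_opcode(0x80))  # PROTO, shown for comparison
```

If the byte is not in the table, the stream being parsed is probably not a plain pickle at that offset, which would suggest looking at how the file was written and not at the pickle protocol itself.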

Could you give me pointers on where I might be looking to solve this error?

The stack trace of this exception points at the method net.razorvine.pickle.Unpickler#dispatch(short). You could try adding a case statement for opcode 248 there to collect more information about it (e.g. which Scikit-Learn class, which class attribute, etc.). Then, try to distill a minimal reproducible example, and open a parallel issue with the Pyrolite project: https://github.com/irmen/Pyrolite/issues
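A rough Python-side counterpart to that debugging step is to walk the byte stream with the standard-library `pickletools.genops` and record the opcodes seen just before parsing fails. This is only a sketch: `first_unknown_opcode` is a hypothetical helper, and it assumes an uncompressed, plain pickle byte stream. A file written by joblib may be compressed or may not be a plain pickle stream, in which case this does not apply directly, and it does not replace instrumenting Pyrolite itself.

```python
import io
import pickle
import pickletools


def first_unknown_opcode(data: bytes, context: int = 5):
    """Return None if the stream parses cleanly, else a small error report.

    The report holds the ValueError message raised by pickletools.genops
    and the last few (position, opcode name) pairs parsed before it.
    """
    seen = []
    try:
        for opcode, _arg, pos in pickletools.genops(io.BytesIO(data)):
            seen.append((pos, opcode.name))
    except ValueError as exc:
        return {"error": str(exc), "last_opcodes": seen[-context:]}
    return None


if __name__ == "__main__":
    good = pickle.dumps([1, 2, 3], protocol=2)
    # Simulate a stream that contains byte 0xf8 where an opcode is expected.
    bad = good[:-1] + b"\xf8"
    print(first_unknown_opcode(good))
    print(first_unknown_opcode(bad))
```

The opcode names that precede the failure give a hint about which object was being deserialized at that point, which may help when narrowing the problem down to a minimal example.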

The JPMML-SkLearn project includes an integration test for the IsolationForest model type:
https://github.com/jpmml/jpmml-sklearn/blob/master/src/test/resources/pkl/IsolationForestHousingAnomaly.pkl

This PKL file was generated using Python 3.4(.3). So, downgrading from Python 3.6 to 3.4 may provide a temporary workaround.

@dverstee
Author

Thanks for the quick update.

I ran the code with Python 3.4 and can confirm that it works! You are great, man! 👍

While following your debugging pointer, I noticed that all tree models are affected, while other model types are not.
For now I'll use Python 3.4 as a workaround, and I'll close the issue.

Thanks again.
