non-zero exit status 1 #13

Closed
leslietso opened this issue Sep 30, 2016 · 11 comments

@leslietso

leslietso commented Sep 30, 2016

I am trying to convert my sklearn model into a PMML file that will later be read by a Java program. My datasets have already been preprocessed, so further transformations would not add much. However, I keep getting the error below, even when I try different transformations.

The model appears to be fitted successfully, and the output of the DataFrameMapper also looks correct. The error occurs when creating and exporting the PMML file.

Code:

import pandas
import sklearn_pandas

import numpy as np
import pandas as pd

import sklearn
from sklearn import cross_validation
from sklearn.neighbors import KNeighborsRegressor

df = pd.read_csv('features.csv')
df = df.rename(columns={'Unnamed: 0': 'rowNumber'})
df = df.drop('rowNumber', axis=1)
print(len(df))
df = df.drop_duplicates()
print(len(df))
# Keep rows with 0 <= pitch <= 100; one combined mask avoids chained boolean indexing
df = df[(df.pitch >= 0) & (df.pitch <= 100)]
print(len(df))
df = df.drop('matrix', axis=1)

df60 = df[df.participant < 22]
dfTrainInput = df60.drop('pitch', axis=1)

listofColumns = list(dfTrainInput.columns.values)

#DataFrameMapper step
df_mapper = sklearn_pandas.DataFrameMapper([
    (listofColumns, None),
    ("pitch", None)
])

data = df_mapper.fit_transform(df)

# The last column is the target ("pitch"); all preceding columns are inputs
data_Input = data[:, :-1]
data_Target = data[:, -1]

#Training Step
neigh = KNeighborsRegressor(n_neighbors=400, algorithm='kd_tree', leaf_size=30, n_jobs = -1)
neigh.fit(data_Input, data_Target)

#PMML conversion and export step
from sklearn2pmml import sklearn2pmml
sklearn2pmml(neigh, df_mapper, "FeaturesAsFeaturesKNN.pmml", with_repr = True)

Error:

Traceback (most recent call last):
  File "/Users/leslie/Downloads/test_sklearn2pmml.py", line 73, in <module>
    sklearn2pmml(neigh, df_mapper, "FeaturesAsFeaturesKNN.pmml", with_repr = True)
  File "/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/__init__.py", line 65, in sklearn2pmml
    subprocess.check_call(cmd)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['java', '-cp', '/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/guava-19.0.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/istack-commons-runtime-2.21.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jaxb-core-2.2.11.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jaxb-runtime-2.2.11.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jcommander-1.48.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jpmml-converter-1.1.1.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.1.2.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.1.1.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-agent-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-model-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-model-metro-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-schema-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pyrolite-4.13.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/serpent-1.12.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/slf4j-api-1.7.21.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.21.jar', 'org.jpmml.sklearn.Main', '--pkl-estimator-input', '/var/folders/ph/8rj4z16x1w3cy86hj3qh7nxh0000gn/T/estimator-8TGJgQ.pkl.z', '--repr-estimator', "KNeighborsRegressor(algorithm='kd_tree', leaf_size=30, metric='minkowski',\n          metric_params=None, n_jobs=-1, n_neighbors=400, p=2,\n          
weights='uniform')", '--pkl-mapper-input', '/var/folders/ph/8rj4z16x1w3cy86hj3qh7nxh0000gn/T/mapper-1hiTl4.pkl.z', '--repr-mapper', "DataFrameMapper(features=[(['participant', 'condition', 'yaw', 'touchX', 'touchY', 'touchW', 'touchH', 'S0cx', 'S0cy', 'S0eo', 'S0evp', 'S0evm', 'S0ee', 'S1cx', 'S1cy', 'S1eo', 'S1evp', 'S1evm', 'S1ee', 'S2cx', 'S2cy', 'S2eo', 'S2evp', 'S2evm', 'S2ee', 'T0cx', 'T0cy', 'T0eo', 'T0evp', 'T0evm', 'T0ee', 'T1cx', 'T1cy', 'T1eo', 'T1evp', 'T1evm', 'T1ee', 'T2cx', 'T2cy', 'T2eo', 'T2evp', 'T2evm', 'T2ee', 'Ucx', 'Ucy', 'Ueo', 'Uevp', 'Uevm', 'Uee'], None), ('pitch', None)],\n        sparse=False)", '--pmml-output', 'FeaturesAsFeaturesKNN.pmml']' returned non-zero exit status 1

Oddly, running the same code in Jupyter raises a different error, "OSError: [Errno 2] No such file or directory":

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-39-3b3cc81c782e> in <module>()
      1 from sklearn2pmml import sklearn2pmml
      2 
----> 3 sklearn2pmml(neigh, df_mapper, 'FeaturesAsFeaturesKNN.pmml', with_repr = True)

/home/mayersn/.local/lib/python2.7/site-packages/sklearn2pmml/__init__.pyc in sklearn2pmml(estimator, mapper, pmml, with_repr, debug)
     63                 if(debug):
     64                         print(" ".join(cmd))
---> 65                 subprocess.check_call(cmd)
     66         finally:
     67                 if(debug):

/usr/lib/python2.7/subprocess.pyc in check_call(*popenargs, **kwargs)
    533     check_call(["ls", "-l"])
    534     """
--> 535     retcode = call(*popenargs, **kwargs)
    536     if retcode:
    537         cmd = kwargs.get("args")

/usr/lib/python2.7/subprocess.pyc in call(*popenargs, **kwargs)
    520     retcode = call(["ls", "-l"])
    521     """
--> 522     return Popen(*popenargs, **kwargs).wait()
    523 
    524 

/usr/lib/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
    708                                 p2cread, p2cwrite,
    709                                 c2pread, c2pwrite,
--> 710                                 errread, errwrite)
    711         except Exception:
    712             # Preserve original exception in case os.close raises.

/usr/lib/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
   1325                         raise
   1326                 child_exception = pickle.loads(data)
-> 1327                 raise child_exception
   1328 
   1329 

OSError: [Errno 2] No such file or directory

Any help or insight into the problem would be tremendously helpful!

@vruusmann
Member

You should re-run your code with debug on:

sklearn2pmml(.., debug = True)

This will give you estimator and mapper Pickle files. Can you convert them using the JPMML-SkLearn command-line application?
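
Because subprocess.check_call only reports the exit status, the Java stack trace is easy to miss. A minimal sketch of re-running a command (for example the one that debug mode prints) while capturing what the child process writes to stderr. The helper name run_and_report is hypothetical and not part of sklearn2pmml, and the sketch assumes Python 3.7 or newer:

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return (exit_status, stderr_text).

    Hypothetical helper for illustration; not part of sklearn2pmml.
    """
    # capture_output=True collects stdout/stderr instead of letting them scroll past
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Stand-in for the 'java -cp ... org.jpmml.sklearn.Main ...' command
    demo_cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
    status, err = run_and_report(demo_cmd)
    print(status, err)
```

With the real command, the stderr text should contain the Java exception that caused the non-zero exit status.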

@leslietso
Author

leslietso commented Sep 30, 2016

Thank you for the quick reply.

Unfortunately, I am unable to convert the Pickle files when using the command-line.

When running java -jar target/converter-executable-1.1-SNAPSHOT.jar --pkl-input estimator.pkl --pmml-output estimator.pmml

I get the error:

Oct 01, 2016 12:59:30 AM org.jpmml.sklearn.Main run
INFO: Parsing Estimator PKL..
Oct 01, 2016 1:00:12 AM org.jpmml.sklearn.Main run
SEVERE: Failed to parse Estimator PKL
java.lang.IllegalArgumentException
    at numpy.DType.toDescr(DType.java:64)
    at joblib.NumpyArrayWrapper.toArray(NumpyArrayWrapper.java:39)
    at org.jpmml.sklearn.PickleUtil$1.dispatch(PickleUtil.java:249)
    at net.razorvine.pickle.Unpickler.load(Unpickler.java:99)
    at org.jpmml.sklearn.PickleUtil.unpickle(PickleUtil.java:259)
    at org.jpmml.sklearn.Main.run(Main.java:169)
    at org.jpmml.sklearn.Main.main(Main.java:107)

Exception in thread "main" java.lang.IllegalArgumentException
    at numpy.DType.toDescr(DType.java:64)
    at joblib.NumpyArrayWrapper.toArray(NumpyArrayWrapper.java:39)
    at org.jpmml.sklearn.PickleUtil$1.dispatch(PickleUtil.java:249)
    at net.razorvine.pickle.Unpickler.load(Unpickler.java:99)
    at org.jpmml.sklearn.PickleUtil.unpickle(PickleUtil.java:259)
    at org.jpmml.sklearn.Main.run(Main.java:169)
    at org.jpmml.sklearn.Main.main(Main.java:107)

Is there a problem with the types I use? Every element should be either a float or integer.

@vruusmann
Member

This exception is most likely the root cause of the above CalledProcessError:

Exception in thread "main" java.lang.IllegalArgumentException
    at numpy.DType.toDescr(DType.java:64)
    at joblib.NumpyArrayWrapper.toArray(NumpyArrayWrapper.java:39)

It means that one of your Numpy arrays contains an atypical dtype specification. I would need to look inside your Pickle files in order to analyze and solve it. Please attach your estimator and mapper Pickle files here and/or send them to my e-mail.

Also, what are your Numpy and Python versions? Please paste the version information (as printed by sklearn2pmml(.., debug = True)) here.
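
For readers who want to gather this information in one go, here is a minimal sketch that looks up installed package versions. It assumes Python 3.8 or newer (for importlib.metadata), which is newer than the interpreters used in this thread. The helper name collect_versions and the list of distribution names are assumptions made for illustration:

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages explicitly instead of failing
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as assumed to be published on PyPI
    wanted = ["numpy", "scikit-learn", "joblib", "sklearn-pandas", "sklearn2pmml"]
    for name, version in collect_versions(wanted).items():
        print(name, version)
```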

@leslietso
Author

Thank you for the reply. Here are my python and numpy versions.
('python: ', '2.7.12')
('sklearn: ', '0.18')
('sklearn.externals.joblib:', '0.10.2')
('sklearn_pandas: ', '1.1.0')
('sklearn2pmml: ', '0.11.2')
Numpy version: 1.11.1 (was not printed out by debug mode)

Furthermore, I just double-checked and the dtype of the elements in "data" is float64.

As for the estimator and mapper Pickle files, I have emailed them to you.

@vruusmann
Member

Upgraded my testing environment from Scikit-Learn 0.17.1 to 0.18. This exception now occurs with Python 2.7, but it doesn't occur with Python 3.4.

So, as a temporary workaround, you might try downgrading to Scikit-Learn 0.17.1, or upgrading to Python 3.4.

@leslietso
Author

leslietso commented Oct 1, 2016

Unfortunately, I get the same error when using Python 2.7 and Python 3.4 with Scikit-Learn 0.17.1.

Python 2.7 Versions:
('python: ', '2.7.12')
('sklearn: ', '0.17.1')
('sklearn.externals.joblib:', '0.9.4')
('sklearn_pandas: ', '1.1.0')
('sklearn2pmml: ', '0.11.2')

Python 2.7 Error:

Traceback (most recent call last):
  File "/Users/leslie/Downloads/test_sklearn2pmml.py", line 68, in <module>
    sklearn2pmml(neigh, df_mapper, "FeaturesAsFeaturesKNN.pmml", with_repr = True, debug = True)
  File "/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/__init__.py", line 65, in sklearn2pmml
    subprocess.check_call(cmd)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['java', '-cp', '/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/guava-19.0.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/istack-commons-runtime-2.21.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jaxb-core-2.2.11.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jaxb-runtime-2.2.11.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jcommander-1.48.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jpmml-converter-1.1.1.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.1.2.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.1.1.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-agent-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-model-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-model-metro-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pmml-schema-1.3.3.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/pyrolite-4.13.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/serpent-1.12.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/slf4j-api-1.7.21.jar:/Users/leslie/Library/Python/2.7/lib/python/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.21.jar', 'org.jpmml.sklearn.Main', '--pkl-estimator-input', '/var/folders/ph/8rj4z16x1w3cy86hj3qh7nxh0000gn/T/estimator-fbU4yD.pkl.z', '--repr-estimator', "KNeighborsRegressor(algorithm='kd_tree', leaf_size=30, metric='minkowski',\n          metric_params=None, n_jobs=-1, n_neighbors=400, p=2,\n          
weights='uniform')", '--pkl-mapper-input', '/var/folders/ph/8rj4z16x1w3cy86hj3qh7nxh0000gn/T/mapper-OpShIL.pkl.z', '--repr-mapper', "DataFrameMapper(features=[(['participant', 'condition', 'yaw', 'touchX', 'touchY', 'touchW', 'touchH', 'S0cx', 'S0cy', 'S0eo', 'S0evp', 'S0evm', 'S0ee', 'S1cx', 'S1cy', 'S1eo', 'S1evp', 'S1evm', 'S1ee', 'S2cx', 'S2cy', 'S2eo', 'S2evp', 'S2evm', 'S2ee', 'T0cx', 'T0cy', 'T0eo', 'T0evp', 'T0evm', 'T0ee', 'T1cx', 'T1cy', 'T1eo', 'T1evp', 'T1evm', 'T1ee', 'T2cx', 'T2cy', 'T2eo', 'T2evp', 'T2evm', 'T2ee', 'Ucx', 'Ucy', 'Ueo', 'Uevp', 'Uevm', 'Uee'], None), ('pitch', None)],\n        sparse=False)", '--pmml-output', 'FeaturesAsFeaturesKNN.pmml']' returned non-zero exit status 1

Python 3.4 Versions:
python: 3.4.3
sklearn: 0.17.1
sklearn.externals.joblib: 0.9.4
sklearn_pandas: 1.1.0
sklearn2pmml: 0.11.2

Python 3.4 Error:

Traceback (most recent call last):
  File "/Users/leslie/Downloads/test_sklearn2pmml.py", line 68, in <module>
    sklearn2pmml(neigh, df_mapper, "FeaturesAsFeaturesKNN.pmml", with_repr = True, debug = True)
  File "/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/__init__.py", line 65, in sklearn2pmml
    subprocess.check_call(cmd)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 561, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['java', '-cp', '/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/guava-19.0.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/istack-commons-runtime-2.21.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/jaxb-core-2.2.11.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/jaxb-runtime-2.2.11.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/jcommander-1.48.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/jpmml-converter-1.1.1.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/jpmml-sklearn-1.1.2.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/jpmml-xgboost-1.1.1.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/pmml-agent-1.3.3.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/pmml-model-1.3.3.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/pmml-model-metro-1.3.3.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/pmml-schema-1.3.3.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/pyrolite-4.13.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/serpent-1.12.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/slf4j-api-1.7.21.jar:/Users/leslie/Library/Python/3.4/lib/python/site-packages/sklearn2pmml/resources/slf4j-jdk14-1.7.21.jar', 'org.jpmml.sklearn.Main', '--pkl-estimator-input', '/var/folders/ph/8rj4z16x1w3cy86hj3qh7nxh0000gn/T/estimator-2rsvly58.pkl.z', '--repr-estimator', "KNeighborsRegressor(algorithm='kd_tree', leaf_size=30, metric='minkowski',\n          metric_params=None, n_jobs=-1, n_neighbors=400, 
p=2,\n          weights='uniform')", '--pkl-mapper-input', '/var/folders/ph/8rj4z16x1w3cy86hj3qh7nxh0000gn/T/mapper-mi8u4pgb.pkl.z', '--repr-mapper', "DataFrameMapper(features=[(['participant', 'condition', 'yaw', 'touchX', 'touchY', 'touchW', 'touchH', 'S0cx', 'S0cy', 'S0eo', 'S0evp', 'S0evm', 'S0ee', 'S1cx', 'S1cy', 'S1eo', 'S1evp', 'S1evm', 'S1ee', 'S2cx', 'S2cy', 'S2eo', 'S2evp', 'S2evm', 'S2ee', 'T0cx', 'T0cy', 'T0eo', 'T0evp', 'T0evm', 'T0ee', 'T1cx', 'T1cy', 'T1eo', 'T1evp', 'T1evm', 'T1ee', 'T2cx', 'T2cy', 'T2eo', 'T2evp', 'T2evm', 'T2ee', 'Ucx', 'Ucy', 'Ueo', 'Uevp', 'Uevm', 'Uee'], None), ('pitch', None)],\n        sparse=False)", '--pmml-output', 'FeaturesAsFeaturesKNN.pmml']' returned non-zero exit status 1

Recreating the estimator and mapper Pickle files with Scikit-Learn 0.17.1 and manually converting them using the command-line returns the error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.jpmml.converter.DOMUtil.createRow(DOMUtil.java:49)
    at sklearn.neighbors.KNeighborsUtil.encodeNeighbors(KNeighborsUtil.java:111)
    at sklearn.neighbors.KNeighborsRegressor.encodeModel(KNeighborsRegressor.java:56)
    at sklearn.neighbors.KNeighborsRegressor.encodeModel(KNeighborsRegressor.java:31)
    at sklearn.EstimatorUtil.encodePMML(EstimatorUtil.java:54)
    at org.jpmml.sklearn.Main.run(Main.java:189)
    at org.jpmml.sklearn.Main.main(Main.java:107)

@vruusmann
Member

You can resolve java.lang.OutOfMemoryError by increasing JVM heap size using -Xms and -Xmx command-line options:

java -Xms4G -Xmx16G -jar target/converter-executable-1.1-SNAPSHOT.jar ..
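
When the converter is launched from Python rather than typed into a shell, the same options have to be placed right after the java executable, before -jar or the main class name. A minimal sketch of doing that on an argument list; the helper name with_jvm_heap is hypothetical and not part of sklearn2pmml:

```python
def with_jvm_heap(cmd, xms="4G", xmx="16G"):
    """Return a copy of cmd with -Xms/-Xmx inserted right after the executable.

    Hypothetical helper for illustration; cmd is a list such as
    ['java', '-jar', 'converter.jar', ...].
    """
    if not cmd:
        raise ValueError("empty command")
    # JVM options must come before '-jar' or the main class name
    return [cmd[0], "-Xms" + xms, "-Xmx" + xmx] + list(cmd[1:])


if __name__ == "__main__":
    base = [
        "java", "-jar", "target/converter-executable-1.1-SNAPSHOT.jar",
        "--pkl-input", "estimator.pkl", "--pmml-output", "estimator.pmml",
    ]
    print(" ".join(with_jvm_heap(base)))
```

The resulting list can be passed to subprocess.check_call in place of the original command.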

@leslietso
Author

It worked! Thank you very much!

@vruusmann
Member

I'm reopening this issue, as I'm still working on the permanent fix.

My testing shows that this dtype attribute issue only affects MLP and KNN model types. All other model types work correctly on Python 2.7 + Scikit-Learn 0.18.

@vruusmann vruusmann reopened this Oct 1, 2016
vruusmann added a commit to jpmml/jpmml-sklearn that referenced this issue Oct 1, 2016
@vruusmann
Member

I've released sklearn2pmml version 0.12.0, which should be able to handle Python 2.7 in combination with Scikit-Learn 0.18. At least I was able to convert your Pickle files.

Anyway, I must warn you that if you intend to evaluate your KNN models within the JPMML/Openscoring software stack, you will very likely be disappointed with its performance. Your model contains 250k training instances, and during model scoring it will be necessary to select 400 of them (and average their predictions). Currently, JPMML/Openscoring selects the closest training instances using brute-force scanning, which is clearly suboptimal here.
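
To make the cost concrete, here is a minimal pure-Python sketch of brute-force k-nearest-neighbour regression with uniform weights and Euclidean distance. It illustrates the general technique only and is not the JPMML/Openscoring implementation; the function name knn_predict_brute_force is hypothetical:

```python
import heapq
import math


def knn_predict_brute_force(train_X, train_y, query, k):
    """Average the targets of the k training rows closest to query."""
    # One distance per training row: the work per prediction grows
    # linearly with the number of training instances
    distances = (
        (math.sqrt(sum((a - b) ** 2 for a, b in zip(row, query))), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = heapq.nsmallest(k, distances, key=lambda pair: pair[0])
    return sum(target for _, target in nearest) / len(nearest)


if __name__ == "__main__":
    X = [[0.0], [1.0], [10.0]]
    y = [0.0, 2.0, 100.0]
    print(knn_predict_brute_force(X, y, [0.4], k=2))
```

Every prediction scans the whole training set, so with 250k instances each scored record requires 250k distance computations before the 400 nearest can be averaged. A kd-tree, as used on the Scikit-Learn side via algorithm='kd_tree', is designed to avoid that full scan.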

@gerardsimons

gerardsimons commented Dec 12, 2016

I am also facing this issue now on Mac OS X, but in my case it is a FileNotFoundError on the pkl path argument. I think it is trying to write the pkl to a temp folder (/var/folders/51/9s8qnl5x55b_1mx6fsf2gn480000gp/T/estimator-blIOk7.pkl.z), but there is nothing there when I check. Would it help to be able to set the location of the intermediate Pickle file yourself?

EDIT: Actually, the Pickle file is there. The conversion fails in an IPython notebook, but when I afterwards run the command printed by debug mode in the terminal, it works ...
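
One possible explanation, which this thread does not confirm, is that the notebook kernel runs with a different PATH than the terminal, so the java executable cannot be found and subprocess raises "No such file or directory". A minimal sketch for checking this from inside the notebook, assuming Python 3.3 or newer for shutil.which; the helper name locate_executable is hypothetical:

```python
import os
import shutil


def locate_executable(name="java"):
    """Return the full path of name as resolved via PATH, or None if not found."""
    return shutil.which(name)


if __name__ == "__main__":
    print("java resolved to:", locate_executable("java"))
    print("PATH seen by this process:", os.environ.get("PATH", ""))
```

If this prints None in the notebook but a path in the terminal, the two environments differ, and the missing file is more likely the java executable than the Pickle file.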


3 participants