Error while running h2o-pysparkling-2.4 on Zeppelin in Amazon EMR #2385

Closed
josegpg opened this issue Nov 10, 2020 · 4 comments

@josegpg

josegpg commented Nov 10, 2020

Hi!

I'm trying to run the example shown here.
Currently I just installed pysparkling via pip

pip install h2o-pysparkling-2.4

What I'm using:

  • The latest pysparkling version for spark 2.4 (h2o-pysparkling-2.4 3.32.0.1-2)
  • Execution mode YARN-client

The code I'm trying to execute is basically the docs (below just for reference):

from pysparkling import *
from pysparkling.ml import H2OAutoML
import h2o

hc = H2OContext.getOrCreate()

# Create dataframe
frame = h2o.import_file("https://raw.githubusercontent.com/h2oai/sparkling-water/master/examples/smalldata/prostate/prostate.csv")
sparkDF = hc.asSparkFrame(frame)
sparkDF = sparkDF.withColumn("CAPSULE", sparkDF.CAPSULE.cast("string"))
[trainingDF, testingDF] = sparkDF.randomSplit([0.8, 0.2])

# Train AutoML
automl = H2OAutoML(labelCol="CAPSULE", ignoredCols=["ID"])

automl.setExcludeAlgos(["GLM"])
automl.setMaxModels(10)

automl.setSortMetric("AUC")
model = automl.fit(trainingDF)

The error I'm getting is the following:

Py4JJavaError: An error occurred while calling o469.fit.
: java.lang.NullPointerException
	at hex.genmodel.attributes.ModelAttributes.<init>(ModelAttributes.java:55)
	at hex.genmodel.ModelMojoReader.readModelSpecificAttributes(ModelMojoReader.java:220)
	at hex.genmodel.ModelMojoReader.readAll(ModelMojoReader.java:204)
	at hex.genmodel.ModelMojoReader.readFrom(ModelMojoReader.java:64)
	at ai.h2o.sparkling.ml.utils.Utils$.getMojoModel(Utils.scala:29)
	at ai.h2o.sparkling.ml.models.H2OMOJOModel$.createFromMojo(H2OMOJOModel.scala:257)
	at ai.h2o.sparkling.ml.internals.H2OModel.toMOJOModel(H2OModel.scala:41)
	at ai.h2o.sparkling.ml.algos.H2OAutoML.fit(H2OAutoML.scala:97)
	at ai.h2o.sparkling.ml.algos.H2OAutoML.fit(H2OAutoML.scala:42)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)

(<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError('An error occurred while calling o469.fit.\n', JavaObject id=o840), <traceback object at 0x7f4714ae3b08>)

I've checked in the H2O Web UI: the models are trained and ranked, but the run seems to fail when deserializing the best model at the end of the execution. Apparently it's not finding some parameters.

If you need more information let me know

Regards

@mn-mikke
Collaborator

Hi @josegpg,
What is the type of the best model? A stacked ensemble? I think this problem will be fixed in the next fix release by this PR. Can you check whether you see the same error on the latest nightly build? https://h2o-release.s3.amazonaws.com/sparkling-water/spark-2.4/nightly/3.32.0.2-1.8-2.4/index.html

@josegpg
Author

josegpg commented Nov 11, 2020

Hi @mn-mikke,
This is the leaderboard I get; the best model is in the first row:

+---+---------------------------------------------------+------------------+
|   |model_id                                           |auc               |
+---+---------------------------------------------------+------------------+
|0  |DRF_1_AutoML_20201111_113247                       |0.8115669797330696|
|1  |GBM_1_AutoML_20201111_113247                       |0.8067226890756303|
|2  |StackedEnsemble_BestOfFamily_AutoML_20201111_113247|0.8062283737024222|
|3  |StackedEnsemble_AllModels_AutoML_20201111_113247   |0.8060306475531388|
|4  |XGBoost_1_AutoML_20201111_113247                   |0.7953040039545229|
|5  |XGBoost_2_AutoML_20201111_113247                   |0.7934256055363321|
|6  |GBM_4_AutoML_20201111_113247                       |0.7901136925358379|
|7  |GBM_2_AutoML_20201111_113247                       |0.7893227879387049|
|8  |XGBoost_3_AutoML_20201111_113247                   |0.7891744933267424|
|9  |GBM_3_AutoML_20201111_113247                       |0.7851705388037568|
|10 |GBM_5_AutoML_20201111_113247                       |0.7462679189322788|
|11 |DeepLearning_1_AutoML_20201111_113247              |0.6983687592684132|
+---+---------------------------------------------------+------------------+

So I'm assuming the best model is the Distributed Random Forest (DRF) in the first position. I'm not sure whether that PR touches DRFs.
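To double-check which entry leads, the rows can be ranked by AUC directly. This is a minimal sketch using a few rows copied from the table above; the helper name `algo_family` is hypothetical, and it assumes the algorithm name is the part of `model_id` before the first underscore.

```python
# A few leaderboard rows copied from the table above: (model_id, auc).
leaderboard = [
    ("DRF_1_AutoML_20201111_113247", 0.8115669797330696),
    ("GBM_1_AutoML_20201111_113247", 0.8067226890756303),
    ("StackedEnsemble_BestOfFamily_AutoML_20201111_113247", 0.8062283737024222),
    ("StackedEnsemble_AllModels_AutoML_20201111_113247", 0.8060306475531388),
    ("XGBoost_1_AutoML_20201111_113247", 0.7953040039545229),
    ("DeepLearning_1_AutoML_20201111_113247", 0.6983687592684132),
]


def algo_family(model_id):
    """Hypothetical helper: take the text before the first underscore."""
    return model_id.split("_", 1)[0]


# The sort metric is AUC, so the leader is the row with the highest value.
leader_id, leader_auc = max(leaderboard, key=lambda row: row[1])
print(leader_id, algo_family(leader_id), leader_auc)
```

With these rows the leader is the DRF model, not one of the stacked ensembles.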

I also tried the nightly build (however I was only able to do it through pysparkling) and everything worked fine. I also replicated the test using the stable build and it also worked (from pyspark + pysparkling installed via pip).

I'm still getting the error when I run the code from Zeppelin, so now I think it must be some misconfiguration of the Zeppelin interpreters. I'll review it, but at this point I don't think it's an issue on your side.
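One way to narrow down an interpreter misconfiguration is to run the same check in a Zeppelin paragraph and in the working console session, then compare which Python executable and which package versions each one reports. A minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and that the packages are installed under the distribution names `h2o-pysparkling-2.4` and `pyspark`; the helper name `installed_version` is hypothetical.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Shows which Python binary this session is actually running.
print("python executable:", sys.executable)

# Distribution names are assumptions; adjust them to the environment.
for name in ("h2o-pysparkling-2.4", "pyspark"):
    print(name, "->", installed_version(name))
```

A `None` for `pyspark` is not necessarily a problem, since Spark may be provided by the cluster rather than installed through pip. A different executable path or a different pysparkling version between the two sessions would be the thing to look into.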

@mn-mikke
Collaborator

The PR suppresses all exceptions regarding the deserialization of metadata (parameters) (https://github.com//pull/2376/files#diff-c82016c040d5306013fe195042d94627e32a06b03613e2e0a4f0663aa0ccffc8R33), and thus the exception you shared. In such a scenario, you will see an error in the logs instead.
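The behavior described above can be illustrated with a generic pattern: a failure in the optional metadata step is logged, and loading continues. This is only a sketch of the idea, not the actual Sparkling Water code (which is written in Scala); `load_model`, `read_parameters`, and the dictionary layout are hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger("mojo_loader_sketch")


def read_parameters(raw):
    # Hypothetical metadata step: fails when the parameters are missing.
    if raw.get("parameters") is None:
        raise ValueError("model parameters are missing")
    return raw["parameters"]


def load_model(raw):
    model = {"id": raw["id"], "parameters": None}
    try:
        model["parameters"] = read_parameters(raw)
    except Exception:
        # Log the failure instead of propagating it; the model is still returned.
        logger.exception("Could not deserialize metadata for %s", raw["id"])
    return model


# Missing parameters no longer abort the load; they only produce a log entry.
model = load_model({"id": "DRF_1", "parameters": None})
print(model)
```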

I'm still getting the error when I try the code from Zeppelin so now I think it must be some misconfiguration of the Zeppelin interpreters. I'll review it, however now I think that it's not an issue of yours.

Still the same error? Even on the nightly build?

@josegpg
Author

josegpg commented Nov 11, 2020

@mn-mikke Yes, but in Zeppelin I'm using the stable version. Honestly, I haven't found out how to install the nightly build using pip so that the Zeppelin interpreter can use it. (That's why I ran it from the console using the pysparkling command.)

@josegpg josegpg closed this as completed Nov 17, 2020