ArrayIndexOutOfBoundsException when converting Spark DataFrame to H2OFrame #29
Hi aswin-roy,
Without this information I'm not able to reproduce the issue. I suppose you are using the Sparkling shell? Thanks, Kuba
I am trying to run a Spark Streaming job which does some operations and then moves on to predictions (from a trained model on disk). I am running the job on a Spark 1.6.0 standalone setup (single node). The number of executors I have specified in the conf is 49. But when I log my H2OContext, it comes to something like Sparkling Water Context:
even though in the Spark UI I can see that 49 executors are up. Why is this happening? After this, I often get these exceptions too:
I run the job using spark-submit. Thanks!
Hi aswin-roy, I already saw that you commented on issue #4. The issue explained there is exactly why in some cases we are not able to initiate H2OContext: we weren't able to find all Spark executors during the creation of H2OContext. In the version of Sparkling Water you are using, we created a listener which checks for changes in the cluster topology and kills the cloud if a new executor without an H2O instance appears. It's not great, but at least we get notified about what is happening. We are having an architectural discussion with the rest of the Sparkling Water team and the community about the best approach for dealing with this. I will close this one and redirect you to comment at #4. Thanks, Kuba!
After initializing H2OContext like

```scala
val h2o = H2OContext.getOrCreate(sc)
```

when I try to convert my Spark DataFrame to an H2OFrame:

```scala
val h2oDF: H2OFrame = dataFrame
```

this gives me java.lang.ArrayIndexOutOfBoundsException. I am running on Spark 1.6.0 and Sparkling Water 1.6.1. What might the reason be?
Thanks!
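For context, here is a minimal sketch of how the conversion in question is typically written with the Sparkling Water 1.6.x API, as I understand it. The assignment `val h2oDF: H2OFrame = dataFrame` relies on an implicit conversion brought into scope by importing from the H2OContext; the explicit `asH2OFrame` call does the same thing without implicits. The input path and the `sqlContext.read` line below are hypothetical placeholders, and the exact import needed for the implicits should be checked against the version on your classpath.

```scala
// Sketch, assuming Sparkling Water 1.6.x APIs; not a definitive implementation.
import org.apache.spark.sql.DataFrame
import org.apache.spark.h2o._

// sc is an existing SparkContext; getOrCreate starts (or reuses) the H2O cloud
// on the Spark executors it can find at that moment.
val h2o = H2OContext.getOrCreate(sc)

// Hypothetical input DataFrame for illustration only.
val dataFrame: DataFrame = sqlContext.read.parquet("/path/to/data.parquet")

// Implicit conversion, as in the issue above (requires the implicits in scope):
val h2oDF: H2OFrame = dataFrame

// Equivalent explicit form, which avoids depending on implicit resolution:
val h2oDF2: H2OFrame = h2o.asH2OFrame(dataFrame)
```

If the H2O cloud did not form on all executors (the situation described in #4), either form can fail at conversion time, since the frame is distributed across the H2O nodes.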