java.lang.ArrayIndexOutOfBoundsException: Index 1684 out of bounds for length 1684 when using deeplearning in ensemble #7686
Comments
Hassan Hawilo commented: I tried an older version of H2O, and the error changed to java.lang.IllegalArgumentException: Unsupported MOJO model hex.genmodel.algos.deeplearning.DeeplearningMojoModel. |
Hassan Hawilo commented: The older version is {{3.26.0.11}}, running on Python 3.6. |
Hassan Hawilo commented: will try a different method of saving the model now |
Tomas Fryda commented: [~accountid:5c108033b5881d1b2e510659] Thank you for reporting this issue. Unfortunately, I couldn’t reproduce it yet. Could you provide more information? I prepared a couple of questions that should help me pinpoint the problem, unless you have a reproducible example that you could share with us, which would make things much easier.

Did you train the StackedEnsemble using AutoML? If so, could you provide the parameters you used for H2OAutoML?

Could you try saving just the DeepLearning base model, then loading it and predicting with it? (This is to find out whether the issue is only with DeepLearning or with both StackedEnsemble and DeepLearning.) If your DeepLearning models have their default names (starting with DeepLearning), you could use the following (just assign the stacked_ensemble and dataset variables):

{code:python}
import shutil

stacked_ensemble = se  # CHANGE THIS: StackedEnsemble that you are not able to persist correctly

for model_id in [mid for mid in stacked_ensemble.base_models if mid.startswith("DeepLearning")]:
{code}

For each deep learning model, it will create a temporary directory, save the deep learning model there, load it, predict, and finally remove the temporary directory. Did it work correctly?

Could you also provide more information about the data you used? I prepared a snippet that prints a simple summary.

{code:python}
dataset = test  # CHANGE THIS: H2O Frame used for predicting

print("dataset.shape =", dataset.shape)
{code}

Could you paste the output of that summary here? Any other relevant information that you could share would also be greatly appreciated. Thank you! |
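The save, load, and predict check described in the comment above can be sketched roughly as follows. This is a reconstruction under assumptions, not the snippet originally posted: the function name `roundtrip_deeplearning_base_models`, the `h2o_module` parameter, and the returned dictionary are all hypothetical, and it assumes the binary-model route (`h2o.save_model` / `h2o.load_model`) rather than a MOJO export.

```python
import shutil
import tempfile


def roundtrip_deeplearning_base_models(stacked_ensemble, dataset, h2o_module=None):
    """Save, reload, and predict with each DeepLearning base model.

    Returns a dict mapping each model id to None (round trip succeeded)
    or to the exception that was raised along the way.
    """
    if h2o_module is None:
        # Imported lazily; a running H2O cluster is assumed at this point.
        import h2o as h2o_module

    # Assumption: base_models yields model ids as strings, as in the comment above.
    dl_ids = [mid for mid in stacked_ensemble.base_models
              if mid.startswith("DeepLearning")]

    results = {}
    for model_id in dl_ids:
        tempdir = tempfile.mkdtemp()
        try:
            model = h2o_module.get_model(model_id)
            saved_path = h2o_module.save_model(model, path=tempdir, force=True)
            reloaded = h2o_module.load_model(saved_path)
            reloaded.predict(dataset)
            results[model_id] = None
        except Exception as exc:  # keep going so every base model gets checked
            results[model_id] = exc
        finally:
            # Always remove the temporary directory, even after a failure.
            shutil.rmtree(tempdir, ignore_errors=True)
    return results
```

With a running cluster, a call such as `roundtrip_deeplearning_base_models(se, test)` would show which DeepLearning base models, if any, fail on their own. If every entry is None, the problem more likely sits in how the StackedEnsemble is persisted.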
Hassan Hawilo commented: I can share the model and a prediction row that reproduces the error. |
Hassan Hawilo commented: If you could send me a link or an email address so that I can share the model and a prediction-row CSV file privately, that would be appreciated. |
Tomas Fryda commented: [~accountid:5c108033b5881d1b2e510659] That would be great! [tomas.fryda@h2o.ai|mailto:tomas.fryda@h2o.ai] |
Hassan Hawilo commented: Done. Many thanks! |
Tomas Fryda commented: [~accountid:5c108033b5881d1b2e510659] Thank you for your cooperation! I found the issue, and hopefully the fix will be in the next release. The problem was with fold column handling. Since the fold column is the last column of your dataset, I think you can work around it by modifying the MOJO (if you haven’t found any other way):
This worked on the iris dataset, and hopefully it will work on yours too. If you use this workaround, please make sure the predictions are the same, for example:

{code:python}
import tempfile

tempdir = tempfile.mkdtemp()
# PATCH the mojo as described earlier
mojo_model = h2o.import_mojo(mojo_name)
(predictions == mojo_predictions).all()
{code} |
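The comparison above can be wrapped into one helper, sketched here under assumptions. The names `mojo_predictions_match`, `patch_mojo`, and `h2o_module` are hypothetical. The patch step is left as a caller-supplied callback because its details are not shown in this thread. The sketch also assumes a local cluster, so the path returned by `download_mojo` is readable by `h2o.import_mojo`.

```python
import shutil
import tempfile


def mojo_predictions_match(model, dataset, patch_mojo=None, h2o_module=None):
    """Export `model` as a MOJO, optionally patch it, re-import it, and
    compare its predictions with those of the in-memory model."""
    if h2o_module is None:
        # Imported lazily; a running H2O cluster is assumed at this point.
        import h2o as h2o_module

    tempdir = tempfile.mkdtemp()
    try:
        mojo_path = model.download_mojo(path=tempdir)
        if patch_mojo is not None:
            # Hypothetical hook: the callback may return a new path,
            # or return None after patching the file in place.
            patched_path = patch_mojo(mojo_path)
            if patched_path is not None:
                mojo_path = patched_path
        mojo_model = h2o_module.import_mojo(mojo_path)

        predictions = model.predict(dataset)
        mojo_predictions = mojo_model.predict(dataset)
        # Element-wise exact equality over every prediction column.
        return bool((predictions == mojo_predictions).all())
    finally:
        shutil.rmtree(tempdir, ignore_errors=True)
```

The check is an exact element-wise comparison, as in the comment above. If tiny floating-point differences are expected, comparing within a tolerance may be more appropriate.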
JIRA Issue Migration Info Jira Issue: PUBDEV-7962 Linked PRs from JIRA |
raise EnvironmentError("Job with key {} failed with an exception: {}\nstacktrace: "
OSError: Job with key $03017f00000132d4ffffffff$_90f979146e9d13e0fa230dc8b964786 failed with an exception: DistributedException from /127.0.0.1:54321: 'Index 1684 out of bounds for length 1684', caused by java.lang.ArrayIndexOutOfBoundsException: Index 1684 out of bounds for length 1684
stacktrace:
DistributedException from /127.0.0.1:54321: 'Index 1684 out of bounds for length 1684', caused by java.lang.ArrayIndexOutOfBoundsException: Index 1684 out of bounds for length 1684
at water.MRTask.getResult(MRTask.java:494)
at water.MRTask.getResult(MRTask.java:502)
at water.MRTask.doAll(MRTask.java:397)
at water.MRTask.doAll(MRTask.java:403)
at hex.Model.predictScoreImpl(Model.java:1784)
at hex.Model.score(Model.java:1618)
at water.api.ModelMetricsHandler$1.compute2(ModelMetricsHandler.java:403)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1575)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 1684 out of bounds for length 1684
at hex.genmodel.GenModel.setCats(GenModel.java:707)
at hex.genmodel.GenModel.setInput(GenModel.java:686)
at hex.genmodel.algos.deeplearning.DeeplearningMojoModel.score0(DeeplearningMojoModel.java:70)
at hex.genmodel.algos.deeplearning.DeeplearningMojoModel.score0(DeeplearningMojoModel.java:158)
at hex.genmodel.algos.ensemble.StackedEnsembleMojoModel.score0(StackedEnsembleMojoModel.java:39)
at hex.generic.GenericModel.score0(GenericModel.java:93)
at hex.Model.score0(Model.java:1992)
at hex.Model.score0(Model.java:1959)
at hex.Model$BigScore.score0(Model.java:1903)
at hex.Model$BigScore.map(Model.java:1881)
at water.MRTask.compute2(MRTask.java:675)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1578)
at hex.Model$BigScore$Icer.compute1(Model$BigScore$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1574)
... 5 more