I am using H2O Flow to train a k-means clustering model. When I use the attached data file and select all the columns for training, the job frequently fails with the following exception:
JOB FAILURE.
java.lang.ArrayIndexOutOfBoundsException: 4
TOGGLE STACK TRACE
java.lang.ArrayIndexOutOfBoundsException: 4
at water.util.ArrayUtils.add(ArrayUtils.java:239)
at hex.ModelMetricsClustering$MetricBuilderClustering.reduce(ModelMetricsClustering.java:131)
at hex.ModelMetricsClustering$MetricBuilderClustering.reduce(ModelMetricsClustering.java:80)
at hex.ModelBuilder.cv_mainModelScores(ModelBuilder.java:804)
at hex.ModelBuilder.computeCrossValidation(ModelBuilder.java:518)
at hex.ModelBuilder$1.compute2(ModelBuilder.java:364)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1563)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
I changed only three of the default hyperparameters when training this model:
nfolds - 5
k - 5
estimate_k - checked
All other parameters were left at their defaults. The same data frame was used for training and validation.
I have checked the input data file and don’t see anything wrong with it. The only thing I found is that if I ignore the PaymentAmtQtr3 and PaymentAmtQtr4 columns, the model builds successfully, but I don’t see anything wrong with these columns either, so I am not sure what the issue is.
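One hypothesis that fits the stack trace, though it has not been confirmed against the H2O source: with estimate_k enabled, k = 5 acts only as an upper bound, so the individual cross-validation fold models may settle on different numbers of clusters. If the per-cluster metric arrays of those models are then merged element by element, a 5-element array meeting a 4-element one would fail at index 4. The Python sketch below only illustrates that failure mode. The function `add_in_place` and the cluster sizes are made up for illustration and are not H2O code or data from the report.

```python
def add_in_place(a, b):
    """Element-wise a[i] += b[i], iterating over the length of `a`."""
    for i in range(len(a)):
        a[i] += b[i]  # raises IndexError when `b` is shorter than `a`
    return a


# Hypothetical per-cluster sizes from two cross-validation fold models.
fold_with_5_clusters = [120, 80, 60, 30, 10]
fold_with_4_clusters = [150, 90, 40, 20]

try:
    add_in_place(fold_with_5_clusters, fold_with_4_clusters)
    failure = None
except IndexError as exc:
    # Python's counterpart of Java's ArrayIndexOutOfBoundsException.
    failure = exc

print("merge failed:", failure)
```

If this is what happens, the failure would be intermittent, because it depends on how many clusters each fold ends up with, which matches the "frequently but not always" behaviour described above.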
From Google Groups:
Here are the complete build model parameters:
buildModel 'kmeans', {"model_id":"kmeans-9b5a609c-acd7-4303-9de2-5f084493e75d","training_frame":"MLM_10k.hex","validation_frame":"MLM_10k.hex","nfolds":5,"ignored_columns":[],"ignore_const_cols":true,"k":5,"estimate_k":true,"max_iterations":10,"standardize":true,"init":"Furthest","fold_assignment":"AUTO","score_each_iteration":false,"seed":-1,"max_runtime_secs":0,"categorical_encoding":"AUTO","keep_cross_validation_models":true,"keep_cross_validation_predictions":false,"keep_cross_validation_fold_assignment":false,"cluster_size_constraints":[]}
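To narrow down which setting triggers the failure, it may help to rerun the build while changing one thing at a time. The Python sketch below generates three such variants as Flow `buildModel` calls: one without cross-validation, one with a fixed k, and one that ignores the two payment columns. The parameter set is trimmed to the relevant fields for brevity, on the assumption that the remaining fields from the call above are carried over unchanged. The helper `variant` and the experiment names are hypothetical.

```python
import json

# Parameters copied from the report, trimmed to the fields relevant here.
base = {
    "training_frame": "MLM_10k.hex",
    "validation_frame": "MLM_10k.hex",
    "nfolds": 5,
    "ignored_columns": [],
    "k": 5,
    "estimate_k": True,
}


def variant(**overrides):
    """Return a copy of `base` with the given fields replaced."""
    params = dict(base)
    params.update(overrides)
    return params


# Each experiment changes exactly one setting relative to the failing run.
experiments = {
    "no_cross_validation": variant(nfolds=0),
    "fixed_k": variant(estimate_k=False),
    "ignore_payment_cols": variant(
        ignored_columns=["PaymentAmtQtr3", "PaymentAmtQtr4"]
    ),
}

for name, params in experiments.items():
    print(f"# {name}")
    print("buildModel 'kmeans', " + json.dumps(params, separators=(",", ":")))
```

If the job succeeds with either `nfolds` set to 0 or `estimate_k` set to false, that would point at the combination of cross-validation and cluster estimation rather than at the data in the two columns.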
I am using the latest H2O version, 3.30.1.1.
Data can be downloaded here:
https://groups.google.com/g/h2ostream/c/UYDyr6hs3zE?pli=1