Infogram with CV and weights_column specified failed #15423

Closed
wendycwong opened this issue May 15, 2023 · 2 comments

@wendycwong
Contributor

H2O version, Operating System and Environment
H2O 3.40.0.4 on macOS

Actual behavior
Traceback (most recent call last):
File "/Users/wendycwong/h2o-3/h2o-py/tests/testdir_algos/infogram/pyunit_PUBDEV_9085_cv_test_no_weights.py", line 36, in <module>
pyunit_utils.standalone_test(test_cv_warning_messages)
File "../../../tests/pyunit_utils/utilsPY.py", line 700, in standalone_test
test()
File "/Users/wendycwong/h2o-3/h2o-py/tests/testdir_algos/infogram/pyunit_PUBDEV_9085_cv_test_no_weights.py", line 29, in test_cv_warning_messages
infogram_model_cv_v.train(x=x, y=target, training_frame=train, validation_frame=test)
File "../../../h2o/estimators/infogram.py", line 1087, in train
sup._train(parms, verbose=verbose)
File "../../../h2o/estimators/estimator_base.py", line 200, in _train
job.poll(poll_updates=self._print_model_scoring_history if verbose else None)
File "../../../h2o/job.py", line 91, in poll
"\n{}".format(self.job_key, self.exception, self.job["stacktrace"]))
OSError: Job with key $0301c0a8561b32d4ffffffff$_ababfd466da593529d26facbad204632 failed with an exception: DistributedException from /192.168.86.27:54321: '29', caused by java.lang.ArrayIndexOutOfBoundsException: 29
stacktrace:
DistributedException from /192.168.86.27:54321: '29', caused by java.lang.ArrayIndexOutOfBoundsException: 29
at water.MRTask.getResult(MRTask.java:660)
at water.MRTask.getResult(MRTask.java:670)
at water.MRTask.doAll(MRTask.java:530)
at water.MRTask.doAll(MRTask.java:482)
at hex.Infogram.Infogram$InfogramDriver.generateInfoGrams(Infogram.java:588)
at hex.Infogram.Infogram$InfogramDriver.buildModelCMINRelevance(Infogram.java:528)
at hex.Infogram.Infogram$InfogramDriver.buildInfoGramsNRelevance(Infogram.java:495)
at hex.Infogram.Infogram$InfogramDriver.buildModel(Infogram.java:316)
at hex.Infogram.Infogram$InfogramDriver.computeImpl(Infogram.java:303)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:253)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1677)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:976)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 29
at hex.Infogram.EstimateCMI.map(EstimateCMI.java:32)
at water.MRTask.compute2(MRTask.java:819)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1680)
at hex.Infogram.EstimateCMI$Icer.compute1(EstimateCMI$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1676)
... 5 more

Expected behavior
Should run to completion with no problem.

Steps to reproduce
Run this code:
# Assumes this runs inside the h2o-py test tree, so that `tests.pyunit_utils` is importable.
import h2o
from h2o.estimators.infogram import H2OInfogram
from tests import pyunit_utils

# Load the data and make the response categorical.
fr = h2o.import_file(path=pyunit_utils.locate("smalldata/admissibleml_test/Bank_Personal_Loan_Modelling.csv"))
target = "Personal Loan"
fr[target] = fr[target].asfactor()
x = ["Experience", "Income", "Family", "CCAvg", "Education", "Mortgage",
     "Securities Account", "CD Account", "Online", "CreditCard"]

# Add a column of random non-negative weights to the training split only.
splits = fr.split_frame(ratios=[0.80])
train = splits[0]
weight = pyunit_utils.random_dataset_real_only(train.nrow, 1, misFrac=0)
weight = weight.abs()
weight.set_name(0, "weight_column")
train = train.cbind(weight)
test = splits[1]

# Training with both nfolds and weights_column set is the call that fails.
infogram_model_cv_v = H2OInfogram(seed=12345, protected_columns=["Age", "ZIP Code"], nfolds=3,
                                  weights_column="weight_column")
infogram_model_cv_v.train(x=x, y=target, training_frame=train, validation_frame=test)

pyunit_utils.checkLogWeightWarning("weight_column", wantWarnMessage=True)
pyunit_utils.checkLogWeightWarning("infogram_internal_cv_weights_", wantWarnMessage=False)
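
To narrow down whether the failure needs both settings at once, a small triage loop over the four combinations of nfolds and weights_column may help. This is only a sketch: `triage` and `fake_train` are hypothetical names that are not part of H2O, and `fake_train` is a stand-in that encodes the hypothesis from the issue title rather than a measured result.

```python
from itertools import product


def triage(train_fn, nfolds_values=(0, 3), weights_values=(None, "weight_column")):
    """Run train_fn for each (nfolds, weights_column) pair and record the outcome."""
    outcomes = {}
    for nfolds, weights_column in product(nfolds_values, weights_values):
        try:
            train_fn(nfolds=nfolds, weights_column=weights_column)
            outcomes[(nfolds, weights_column)] = "ok"
        except Exception as exc:
            # In the traceback above the failed job surfaces in Python as an OSError.
            outcomes[(nfolds, weights_column)] = f"{type(exc).__name__}: {exc}"
    return outcomes


def fake_train(nfolds, weights_column):
    # Stand-in for the real training call (assumption): fails only when
    # cross-validation and a weights column are requested together.
    if nfolds > 0 and weights_column is not None:
        raise OSError("java.lang.ArrayIndexOutOfBoundsException: 29")


for combo, outcome in triage(fake_train).items():
    print(combo, outcome)
```

Against a running cluster, `fake_train` would be replaced by a function that builds `H2OInfogram(seed=12345, protected_columns=["Age", "ZIP Code"], nfolds=nfolds, weights_column=weights_column)` and calls `train` with the same frames as in the reproduction steps.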

Upload logs
05-14 17:48:57.583 192.168.86.27:54321 25604 FJ-1-3 INFO hex.CVModelBuilder: Exception from CV model #0 will be reported as main exception.
05-14 17:48:57.583 192.168.86.27:54321 25604 FJ-1-3 INFO hex.CVModelBuilder: Completed cross-validation model 0 / 3.
05-14 17:48:57.586 192.168.86.27:54321 25604 FJ-1-3 WARN hex.CVModelBuilder: CV model #1 failed, the exception will not be reported
DistributedException from /192.168.86.27:54321: '32', caused by java.lang.ArrayIndexOutOfBoundsException: 32
at water.MRTask.getResult(MRTask.java:660)
at water.MRTask.getResult(MRTask.java:670)
at water.MRTask.doAll(MRTask.java:530)
at water.MRTask.doAll(MRTask.java:482)
at hex.Infogram.Infogram$InfogramDriver.generateInfoGrams(Infogram.java:588)
at hex.Infogram.Infogram$InfogramDriver.buildModelCMINRelevance(Infogram.java:528)
at hex.Infogram.Infogram$InfogramDriver.buildInfoGramsNRelevance(Infogram.java:495)
at hex.Infogram.Infogram$InfogramDriver.buildModel(Infogram.java:316)
at hex.Infogram.Infogram$InfogramDriver.computeImpl(Infogram.java:303)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:253)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1677)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:976)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 32
at hex.Infogram.EstimateCMI.map(EstimateCMI.java:32)
at water.MRTask.compute2(MRTask.java:819)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1680)
at hex.Infogram.EstimateCMI$Icer.compute1(EstimateCMI$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1676)
... 5 more
05-14 17:48:57.592 192.168.86.27:54321 25604 FJ-1-3 INFO hex.CVModelBuilder: Completed cross-validation model 1 / 3.
05-14 17:48:57.593 192.168.86.27:54321 25604 FJ-1-3 WARN hex.CVModelBuilder: CV model #2 failed, the exception will not be reported
DistributedException from /192.168.86.27:54321: '78', caused by java.lang.ArrayIndexOutOfBoundsException: 78
at water.MRTask.getResult(MRTask.java:660)
at water.MRTask.getResult(MRTask.java:670)
at water.MRTask.doAll(MRTask.java:530)
at water.MRTask.doAll(MRTask.java:482)
at hex.Infogram.Infogram$InfogramDriver.generateInfoGrams(Infogram.java:588)
at hex.Infogram.Infogram$InfogramDriver.buildModelCMINRelevance(Infogram.java:528)
at hex.Infogram.Infogram$InfogramDriver.buildInfoGramsNRelevance(Infogram.java:495)
at hex.Infogram.Infogram$InfogramDriver.buildModel(Infogram.java:316)
at hex.Infogram.Infogram$InfogramDriver.computeImpl(Infogram.java:303)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:253)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1677)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:976)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 78
at hex.Infogram.EstimateCMI.map(EstimateCMI.java:32)
at water.MRTask.compute2(MRTask.java:819)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1680)
at hex.Infogram.EstimateCMI$Icer.compute1(EstimateCMI$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1676)
... 5 more
05-14 17:48:57.594 192.168.86.27:54321 25604 FJ-1-3 INFO hex.CVModelBuilder: Completed cross-validation model 2 / 3.
05-14 17:48:57.597 192.168.86.27:54321 25604 FJ-1-3 WARN water.default: Model training job Infogram completed with exception: DistributedException from /192.168.86.27:54321: '29', caused by java.lang.ArrayIndexOutOfBoundsException: 29
05-14 17:48:57.598 192.168.86.27:54321 25604 FJ-1-3 ERROR water.default:
DistributedException from /192.168.86.27:54321: '29', caused by java.lang.ArrayIndexOutOfBoundsException: 29
at water.MRTask.getResult(MRTask.java:660)
at water.MRTask.getResult(MRTask.java:670)
at water.MRTask.doAll(MRTask.java:530)
at water.MRTask.doAll(MRTask.java:482)
at hex.Infogram.Infogram$InfogramDriver.generateInfoGrams(Infogram.java:588)
at hex.Infogram.Infogram$InfogramDriver.buildModelCMINRelevance(Infogram.java:528)
at hex.Infogram.Infogram$InfogramDriver.buildInfoGramsNRelevance(Infogram.java:495)
at hex.Infogram.Infogram$InfogramDriver.buildModel(Infogram.java:316)
at hex.Infogram.Infogram$InfogramDriver.computeImpl(Infogram.java:303)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:253)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1677)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:976)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 29
at hex.Infogram.EstimateCMI.map(EstimateCMI.java:32)
at water.MRTask.compute2(MRTask.java:819)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.MRTask.compute2(MRTask.java:775)
at water.H2O$H2OCountedCompleter.compute1(H2O.java:1680)
at hex.Infogram.EstimateCMI$Icer.compute1(EstimateCMI$Icer.java)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1676)
... 5 more
05-14 17:48:58.907 192.168.86.27:54321 25604 3748097-14 INFO water.default: DELETE /4/sessions/_sid_84bc, parms: {}
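
All failures in the log above occur at EstimateCMI.java:32, but the out-of-bounds index differs between CV models (29, 32, 78). A short script along these lines could pull those indices out of a larger log file. It is a sketch that assumes the log keeps the line layout shown above; `summarize` and the regular expressions are hypothetical helpers, not part of H2O.

```python
import re

# Patterns assume the message wording seen in the log excerpt above.
FAILED = re.compile(r"CV model #(\d+) failed")
MAIN = re.compile(r"completed with exception: .*ArrayIndexOutOfBoundsException: (\d+)")
INDEX = re.compile(r"^DistributedException .*ArrayIndexOutOfBoundsException: (\d+)")


def summarize(log_text):
    """Map each failed CV model (and the main exception) to its out-of-bounds index."""
    indices = {}
    pending = None  # CV model whose exception line is expected next
    for line in log_text.splitlines():
        failed = FAILED.search(line)
        if failed:
            pending = f"cv_{failed.group(1)}"
            continue
        main = MAIN.search(line)
        if main:
            indices["main"] = int(main.group(1))
            continue
        hit = INDEX.match(line)
        if hit and pending is not None:
            indices[pending] = int(hit.group(1))
            pending = None
    return indices


sample = """\
05-14 17:48:57.586 192.168.86.27:54321 25604 FJ-1-3 WARN hex.CVModelBuilder: CV model #1 failed, the exception will not be reported
DistributedException from /192.168.86.27:54321: '32', caused by java.lang.ArrayIndexOutOfBoundsException: 32
05-14 17:48:57.593 192.168.86.27:54321 25604 FJ-1-3 WARN hex.CVModelBuilder: CV model #2 failed, the exception will not be reported
DistributedException from /192.168.86.27:54321: '78', caused by java.lang.ArrayIndexOutOfBoundsException: 78
05-14 17:48:57.597 192.168.86.27:54321 25604 FJ-1-3 WARN water.default: Model training job Infogram completed with exception: DistributedException from /192.168.86.27:54321: '29', caused by java.lang.ArrayIndexOutOfBoundsException: 29
"""

print(summarize(sample))
```

The exception from CV model #0 is the one reported as the main exception, so its index appears under the "main" key rather than under a CV model key.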

@tomasfryda
Contributor

@wendycwong does this have a JIRA ticket? I didn't find one. If not, should I create one?

@wendycwong
Contributor Author

Please do so, Tomas.
