Add ability to Sort grid search results with failed model #8382

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 2 comments
@exalate-issue-sync

Presently, grid search results cannot be sorted if the grid contains a cancelled or failed job:

{code:java}> rf_gridperf <- h2o.getGrid(grid_id = "rf_grid_id",
                             sort_by = "accuracy",
                             decreasing = TRUE)

failure_stack_traces: water.Job$JobCancelledException
at hex.tree.SharedTree$Driver.scoreAndBuildTrees(SharedTree.java:450)
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:360)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:222)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:348)...{code}

Steps to reproduce:

Parse the data from R, build a grid in Flow with a few failed models, and then call h2o.getGrid from R.

{noformat}> xx = h2o.importFile("https://s3.amazonaws.com/h2o-public-test-data/bigdata/laptop/kdd2009/small-churn/kdd_train.csv")
|==================================================================================================================================| 100%

gg =h2o.getGrid("grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9")
Hyper-parameter: min_rows, 1
Hyper-parameter: sample_rate, 2
[2020-02-12 16:32:03] failure_details: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_3. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

[2020-02-12 16:32:03] failure_stack_traces: water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_3. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

at water.exceptions.H2OModelBuilderIllegalArgumentException.makeFromBuilder(H2OModelBuilderIllegalArgumentException.java:19)
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:192)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:242)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:384)
at hex.ModelBuilder$TrainModelNestedRunnable.run(ModelBuilder.java:419)
at water.H2O.runOnH2ONode(H2O.java:1356)
at water.H2O.runOnH2ONode(H2O.java:1345)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:400)
at hex.grid.GridSearch.buildModel(GridSearch.java:529)
at hex.grid.GridSearch.gridSearch(GridSearch.java:357)
at hex.grid.GridSearch.access$000(GridSearch.java:69)
at hex.grid.GridSearch$1.compute2(GridSearch.java:141)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1468)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

Hyper-parameter: min_rows, 40
Hyper-parameter: sample_rate, 2
[2020-02-12 16:32:03] failure_details: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_4. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

[2020-02-12 16:32:03] failure_stack_traces: water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_4. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

at water.exceptions.H2OModelBuilderIllegalArgumentException.makeFromBuilder(H2OModelBuilderIllegalArgumentException.java:19)
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:192)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:242)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:384)
at hex.ModelBuilder$TrainModelNestedRunnable.run(ModelBuilder.java:419)
at water.H2O.runOnH2ONode(H2O.java:1356)
at water.H2O.runOnH2ONode(H2O.java:1345)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:400)
at hex.grid.GridSearch.buildModel(GridSearch.java:529)
at hex.grid.GridSearch.gridSearch(GridSearch.java:357)
at hex.grid.GridSearch.access$000(GridSearch.java:69)
at hex.grid.GridSearch$1.compute2(GridSearch.java:141)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1468)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

{noformat}

Flow command:

{noformat}buildModel 'drf', {"model_id":"drf-7abf4113-43e7-4a36-b83d-587a3baff74c","training_frame":"kdd_train.hex_sid_99b6_1","nfolds":0,"response_column":"churn","ignored_columns":[],"ignore_const_cols":true,"ntrees":50,"max_depth":20,"nbins":20,"seed":-1,"mtries":-1,"score_each_iteration":false,"score_tree_interval":0,"nbins_top_level":1024,"nbins_cats":1024,"r2_stopping":1.7976931348623157e+308,"stopping_rounds":0,"stopping_metric":"AUTO","stopping_tolerance":0.001,"max_runtime_secs":0,"col_sample_rate_per_tree":1,"min_split_improvement":0.00001,"histogram_type":"AUTO","categorical_encoding":"AUTO","distribution":"AUTO","build_tree_one_node":false,"sample_rate_per_class":[],"binomial_double_trees":false,"col_sample_rate_change_per_level":1,"calibrate_model":false,"check_constant_response":true,"grid_id":"grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9","hyper_parameters":{"min_rows":[1,40],"sample_rate":[0.632,2]},"search_criteria":{"strategy":"Cartesian"}}{noformat}
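The Flow command above asks for a Cartesian search over `min_rows` in [1, 40] and `sample_rate` in [0.632, 2]. As a rough illustration of why exactly two of the four models fail, here is a minimal standalone Python sketch. It does not use H2O; the helper names `cartesian` and `is_valid` are hypothetical, and the only check mirrored is the one quoted in the log (`sample_rate` must lie in ]0,1]).

```python
from itertools import product

# Hyper-parameter values copied from the Flow command in this issue.
hyper_parameters = {"min_rows": [1, 40], "sample_rate": [0.632, 2]}


def cartesian(hyper):
    """Enumerate every combination of the given hyper-parameter values."""
    names = list(hyper)
    return [dict(zip(names, values))
            for values in product(*(hyper[name] for name in names))]


def is_valid(params):
    """Hypothetical stand-in for the single check quoted in the log:
    sample_rate must lie in the interval ]0, 1]."""
    return 0 < params["sample_rate"] <= 1


combos = cartesian(hyper_parameters)
valid = [c for c in combos if is_valid(c)]
failed = [c for c in combos if not is_valid(c)]

print("valid: ", valid)
print("failed:", failed)
```

Under these assumptions the two combinations with `sample_rate = 2` are the ones rejected, which matches the two failed models (`model_3` and `model_4`) reported in the log.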

@exalate-issue-sync

Nidhi Mehta commented: It looks like this call

{noformat}> gg =h2o.getGrid("grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9",sort_by = "residual_deviance")
Hyper-parameter: min_rows, 1
Hyper-parameter: sample_rate, 2
[2020-02-12 16:39:47] failure_details: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_3. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

[2020-02-12 16:39:47] failure_stack_traces: water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_3. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

at water.exceptions.H2OModelBuilderIllegalArgumentException.makeFromBuilder(H2OModelBuilderIllegalArgumentException.java:19)
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:192)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:242)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:384)
at hex.ModelBuilder$TrainModelNestedRunnable.run(ModelBuilder.java:419)
at water.H2O.runOnH2ONode(H2O.java:1356)
at water.H2O.runOnH2ONode(H2O.java:1345)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:400)
at hex.grid.GridSearch.buildModel(GridSearch.java:529)
at hex.grid.GridSearch.gridSearch(GridSearch.java:357)
at hex.grid.GridSearch.access$000(GridSearch.java:69)
at hex.grid.GridSearch$1.compute2(GridSearch.java:141)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1468)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

Hyper-parameter: min_rows, 40
Hyper-parameter: sample_rate, 2
[2020-02-12 16:39:47] failure_details: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_4. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

[2020-02-12 16:39:47] failure_stack_traces: water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_4. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.

at water.exceptions.H2OModelBuilderIllegalArgumentException.makeFromBuilder(H2OModelBuilderIllegalArgumentException.java:19)
at hex.tree.SharedTree$Driver.computeImpl(SharedTree.java:192)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:242)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:384)
at hex.ModelBuilder$TrainModelNestedRunnable.run(ModelBuilder.java:419)
at water.H2O.runOnH2ONode(H2O.java:1356)
at water.H2O.runOnH2ONode(H2O.java:1345)
at hex.ModelBuilder.trainModelNested(ModelBuilder.java:400)
at hex.grid.GridSearch.buildModel(GridSearch.java:529)
at hex.grid.GridSearch.gridSearch(GridSearch.java:357)
at hex.grid.GridSearch.access$000(GridSearch.java:69)
at hex.grid.GridSearch$1.compute2(GridSearch.java:141)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1468)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)

gg
H2O Grid Details
================

Grid ID: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9
Used hyper parameters:
  - min_rows
  - sample_rate
Number of models: 2
Number of failed models: 2

Hyper-Parameter Search Summary: ordered by increasing residual_deviance
  min_rows sample_rate                                         model_ids   residual_deviance
1     40.0       0.632 grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_2 0.26430696412809124
2      1.0       0.632 grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_1  0.2737035581353081
Failed models

min_rows sample_rate status_failed
1.0 2.0 FAIL
40.0 2.0 FAIL
msgs_failed
"Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_3. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.\n"
"Illegal argument(s) for DRF model: grid-88d599ea-9fa7-4f25-9fd5-ce195e63a3f9_model_4. Details: ERRR on field: _sample_rate: sample_rate should be in interval ]0,1] but it is 2.0.\n"

{noformat}

just prints the errors, but the import and sort actually worked; the user only needs to print the object.

The displayed error messages are confusing to users, who assume the function failed. Converting this Jira to an improvement: the error messages should be replaced with a simple warning.
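To make the proposed behaviour concrete, here is a minimal standalone Python sketch of sorting the successful models while reporting failed ones as a single warning. This is not H2O's implementation: `sort_grid` is a hypothetical helper, the model IDs are shortened, and the metric values are copied from the grid summary above.

```python
import warnings


def sort_grid(models, failures, sort_by, decreasing=False):
    """Return the successful models sorted by the given metric.

    Failed models are left out of the result and reported through one
    UserWarning rather than full failure details and stack traces.
    """
    if failures:
        warnings.warn(
            f"{len(failures)} model(s) failed and are excluded from the sorted results",
            UserWarning,
        )
    return sorted(models, key=lambda m: m[sort_by], reverse=decreasing)


# Successful models, with values taken from the grid summary in this issue
# (model IDs shortened for readability).
models = [
    {"model_id": "model_1", "min_rows": 1.0, "sample_rate": 0.632,
     "residual_deviance": 0.2737035581353081},
    {"model_id": "model_2", "min_rows": 40.0, "sample_rate": 0.632,
     "residual_deviance": 0.26430696412809124},
]

# Failed hyper-parameter combinations from the same grid.
failures = [
    {"min_rows": 1.0, "sample_rate": 2.0},
    {"min_rows": 40.0, "sample_rate": 2.0},
]

ranked = sort_grid(models, failures, sort_by="residual_deviance")
print([m["model_id"] for m in ranked])
```

With this shape the caller always gets a sorted result back, and the failures show up as a short notice instead of output that looks like the call itself failed.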

@h2o-ops

h2o-ops commented May 14, 2023

JIRA Issue Migration Info

Jira Issue: PUBDEV-7253
Assignee: Michal Kurka
Reporter: Nidhi Mehta
State: Resolved
Fix Version: 3.28.0.4
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

#4322
