Pipe error messages from XGBoost through to the clients #7631

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 3 comments

Comments

@exalate-issue-sync

Let's pipe the error messages from XGBoost through to the R/Python clients. Right now, no information is passed on to the client, and you have to look through the H2O logs to figure out what went wrong.

I accidentally passed a number outside the allowed range for the colsample_bytree param (must be in [0,1]). Here’s the error I see in the client (I can’t figure out what went wrong from it):

{code:java}
java.lang.RuntimeException: Error while training XGBoost model
at hex.tree.xgboost.XGBoost$XGBoostDriver.buildModelImpl(XGBoost.java:392)
at hex.tree.xgboost.XGBoost$XGBoostDriver.buildModel(XGBoost.java:344)
at hex.tree.xgboost.XGBoost$XGBoostDriver.computeImpl(XGBoost.java:334)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:243)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1575)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Caused by: java.lang.IllegalStateException: Cannot perform booster operation: updater is inactive on node localhost/127.0.0.1:54321
at hex.tree.xgboost.task.XGBoostUpdater.invoke(XGBoostUpdater.java:106)
at hex.tree.xgboost.task.XGBoostUpdater.doUpdate(XGBoostUpdater.java:157)
at hex.tree.xgboost.task.XGBoostUpdateTask.execute(XGBoostUpdateTask.java:20)
at hex.tree.xgboost.task.AbstractXGBoostTask.setupLocal(AbstractXGBoostTask.java:34)
at water.MRTask.setupLocal0(MRTask.java:566)
at water.MRTask.dfork(MRTask.java:416)
at water.MRTask.doAll(MRTask.java:408)
at water.MRTask.doAllNodes(MRTask.java:421)
at hex.tree.xgboost.task.AbstractXGBoostTask.run(AbstractXGBoostTask.java:45)
at hex.tree.xgboost.exec.LocalXGBoostExecutor.setup(LocalXGBoostExecutor.java:99)
at hex.tree.xgboost.XGBoost$XGBoostDriver.buildModelImpl(XGBoost.java:389)
... 9 more

Error: java.lang.RuntimeException: Error while training XGBoost model
{code}

In the logs, I can easily find the cause, so it would be nice to pass this to the user:

{code:java}
02-19 19:07:01.627 127.0.0.1:54321 12119 final_cv_1 ERROR hex.tree.xgboost.task.XGBoostUpdater: XGBoost training iteration failed
ai.h2o.xgboost4j.java.XGBoostError: value 13 for Parameter colsample_bytree exceed bound [0,1]
colsample_bytree: Subsample ratio of columns, resample on each tree construction.
at ai.h2o.xgboost4j.java.XGBoostJNI.checkCall(XGBoostJNI.java:48)
at ai.h2o.xgboost4j.java.Booster.saveRabitCheckpoint(Booster.java:736)
at ai.h2o.xgboost4j.java.BoosterWrapper.<init>(BoosterWrapper.java:26)
at hex.tree.xgboost.task.XGBoostUpdater$UpdateBooster.call(XGBoostUpdater.java:117)
at hex.tree.xgboost.task.XGBoostUpdater$UpdateBooster.call(XGBoostUpdater.java:109)
at hex.tree.xgboost.task.XGBoostUpdater.run(XGBoostUpdater.java:55)
{code}
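
Until the message reaches the client, the root cause can be pulled out of the H2O log text mechanically. The following is only a minimal sketch, assuming the log has already been downloaded as a string; `find_xgboost_error` is a hypothetical helper, not part of the H2O API, and it only looks for the `XGBoostError:` marker seen in the log excerpt above.

```python
import re

# Hypothetical helper (not part of H2O): matches the "XGBoostError: <message>"
# pattern that appears in the log excerpt from this issue.
XGBOOST_ERROR = re.compile(r"XGBoostError:\s*(.+)")


def find_xgboost_error(log_text):
    """Return the message of the first XGBoostError line, or None if absent."""
    for line in log_text.splitlines():
        match = XGBOOST_ERROR.search(line)
        if match:
            return match.group(1).strip()
    return None


# Sample input copied from the log excerpt in this issue.
sample_log = """\
02-19 19:07:01.627 127.0.0.1:54321 12119 final_cv_1 ERROR hex.tree.xgboost.task.XGBoostUpdater: XGBoost training iteration failed
ai.h2o.xgboost4j.java.XGBoostError: value 13 for Parameter colsample_bytree exceed bound [0,1]
colsample_bytree: Subsample ratio of columns, resample on each tree construction.
"""

print(find_xgboost_error(sample_log))
```

This is a workaround on the user's side rather than a fix; it does not change what the server sends to the client.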

If I do something similar with h2o.gbm(), I get a reasonable error in the client:

{code:java}
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :

ERROR MESSAGE:

Illegal argument(s) for GBM model: GBM_model_R_1613782704173_431. Details: ERRR on field: _learn_rate: learn_rate must be between 0 and 1
{code}

@exalate-issue-sync
Author

Michal Kurka commented: It is not possible to implement this as formulated; instead, validation checks for sampling rates will be added.
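
For illustration only, a range check of this kind might look like the sketch below. It is not the code from the linked PR: `validate_sampling_rates` and its arguments are hypothetical, and the [0, 1] bound is taken from the XGBoost error message quoted in this issue. The check runs before training starts, so the user gets a field-level message similar to the GBM one instead of a generic training failure.

```python
# Hypothetical sketch, not H2O's actual validation code.
def validate_sampling_rates(params, names=("colsample_bytree",), low=0.0, high=1.0):
    """Return a list of error messages for sampling rates outside [low, high]."""
    errors = []
    for name in names:
        if name not in params:
            continue  # parameter not set by the user, nothing to check
        value = params[name]
        # "not (low <= value <= high)" also rejects NaN
        if not (low <= value <= high):
            errors.append(
                f"{name} must be between {low:g} and {high:g}, got {value!r}"
            )
    return errors


print(validate_sampling_rates({"colsample_bytree": 13}))
print(validate_sampling_rates({"colsample_bytree": 0.8}))
```

Returning a list of messages rather than raising on the first failure lets a caller report every invalid parameter at once.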

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Details

Jira Issue: PUBDEV-8017
Assignee: Michal Kurka
Reporter: Erin LeDell
State: Resolved
Fix Version: 3.32.1.1
Attachments: N/A
Development PRs: Available

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Linked PRs from JIRA

#5340

@h2o-ops h2o-ops closed this as completed May 14, 2023