AutoML with XGBoost Only Fails on Learning Rate Search #7265

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 8 comments

@exalate-issue-sync

Stack Trace:

water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for XGBoost model: XGBoost_lr_search_selection_AutoML_3_20211025_135531_select_grid_model_8.  Details: ERRR on field: _ntrees: ntrees and its alias n_estimators are both set to different value than default value. Set n_estimators to default value (0.0), to use ntrees actual value.
ERRR on field: _ntrees: ntrees and its alias n_estimators are both set to different value than default value. Set n_estimators to default value (0.0), to use ntrees actual value.
ERRR on field: _ntrees: ntrees and its alias n_estimators are both set to different value than default value. Set n_estimators to default value (0.0), to use ntrees actual value.
ERRR on field: _ntrees: ntrees and its alias n_estimators are both set to different value than default value. Set n_estimators to default value (0.0), to use ntrees actual value.
ERRR on field: _ntrees: ntrees and its alias n_estimators are both set to different value than default value. Set n_estimators to default value (0.0), to use ntrees actual value.
ERRR on field: _ntrees: ntrees and its alias n_estimators are both set to different value than default value. Set n_estimators to default value (0.0), to use ntrees actual value.

    at water.exceptions.H2OModelBuilderIllegalArgumentException.makeFromBuilder(H2OModelBuilderIllegalArgumentException.java:19)
    at hex.ModelBuilder.cv_makeFramesAndBuilders(ModelBuilder.java:802)
    at hex.ModelBuilder.computeCrossValidation(ModelBuilder.java:613)
    at hex.ModelBuilder.trainModelNested(ModelBuilder.java:426)
    at hex.ModelBuilder$TrainModelNestedRunnable.run(ModelBuilder.java:461)
    at water.H2O.runOnH2ONode(H2O.java:1540)
    at water.H2O.runOnH2ONode(H2O.java:1529)
    at hex.ModelBuilder.trainModelNested(ModelBuilder.java:441)
    at hex.grid.GridSearch.buildModel(GridSearch.java:513)
    at hex.grid.GridSearch.gridSearch(GridSearch.java:356)
    at hex.grid.GridSearch.access$100(GridSearch.java:69)
    at hex.grid.GridSearch$1.compute2(GridSearch.java:114)
    at water.H2O$H2OCountedCompleter.compute(H2O.java:1652)
    at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
    at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
    at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
    at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
    at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
10-25 14:21:25.526 10.139.64.4:54321     474        FJ-3-5  INFO water.default: Grid time is limited to: 682.0 for grid: XGBoost_lr_search_selection_AutoML_3_20211025_135531_select_grid. Remaining time is: 680.306
10-25 14:21:25.526 10.139.64.4:54321     474        FJ-3-5  INFO water.default: Due to the grid time limit, changing model max runtime from: 30.0 secs to: 30.0 secs.
10-25 14:21:25.530 10.139.64.4:54321     474        FJ-3-5  WARN water.default: _col_sample_rate_per_tree: Using user-provided parameter colsample_bytree instead of col_sample_rate_per_tree."
10-25 14:21:25.531 10.139.64.4:54321     474        FJ-3-5 ERROR water.default: _ntrees: ntrees and its alias n_estimators are both set to different value than default value. Set n_estimators to default value (0.0), to use ntrees actual value.
10-25 14:21:25.531 10.139.64.4:54321     474        FJ-3-5  WARN water.default: _ntrees: Using user-provided parameter n_estimators instead of ntrees."
10-25 14:21:25.531 10.139.64.4:54321     474        FJ-3-5  INFO water.default: Creating 5 cross-validation splits with random number seed: 8944534007554294972
10-25 14:21:25.594 10.139.64.4:54321     474    8240198-57 FATAL water.default: Override in subclasses which can be the result of a Job
10-25 14:21:25.602 10.139.64.4:54321     474    8240198-57 FATAL water.default: Stacktrace: 
10-25 14:21:25.602 10.139.64.4:54321     474    8240198-57 FATAL water.default: 
java.lang.Exception: Override in subclasses which can be the result of a Job
    at water.H2O.fail(H2O.java:1353)
    at water.H2O.fail(H2O.java:1379)
    at water.Keyed.makeSchema(Keyed.java:119)
    at water.api.schemas3.JobV3.fillFromImpl(JobV3.java:111)
    at water.api.schemas3.JobV3.fillFromImpl(JobV3.java:14)
    at water.api.JobsHandler.list(JobsHandler.java:24)
    at sun.reflect.GeneratedMethodAccessor57.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at water.api.Handler.handle(Handler.java:60)
    at water.api.RequestServer.serve(RequestServer.java:470)
    at water.api.RequestServer.doGeneric(RequestServer.java:301)
    at water.api.RequestServer.doGet(RequestServer.java:225)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    at ai.h2o.org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
    at ai.h2o.org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:535)
    at ai.h2o.org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
    at ai.h2o.org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)
    at ai.h2o.org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
    at ai.h2o.org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
    at ai.h2o.org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
    at ai.h2o.org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)
    at ai.h2o.org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
    at ai.h2o.org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
    at ai.h2o.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at water.webserver.jetty9.Jetty9ServerAdapter$LoginHandler.handle(Jetty9ServerAdapter.java:130)
    at ai.h2o.org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
    at ai.h2o.org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
    at ai.h2o.org.eclipse.jetty.server.Server.handle(Server.java:531)
    at ai.h2o.org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)
    at ai.h2o.org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
    at ai.h2o.org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
    at ai.h2o.org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
    at ai.h2o.org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
    at ai.h2o.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
    at ai.h2o.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
    at ai.h2o.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
    at ai.h2o.org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
    at ai.h2o.org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
    at ai.h2o.org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)
    at ai.h2o.org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)
    at java.lang.Thread.run(Thread.java:748)
@exalate-issue-sync
Author

Veronika Maurerová commented: This is a new validation of XGBoost aliases. In the past, both parameters could be set and one would silently overwrite the other. Here is the Jira where I fixed that and implemented the check that the two parameters cannot both be set: https://h2oai.atlassian.net/browse/PUBDEV-8266 So [~accountid:5b153fb1b0d76456f36daced], first check that AutoML is not setting both parameters somewhere.

@exalate-issue-sync
Author

Marek Novotny commented: [~accountid:5bd237b8dd3cc64b77e71676] Has anything changed in the case where only the original parameter (ntrees), without its alias (n_estimators), is sent to the h2o backend?

@exalate-issue-sync
Author

Veronika Maurerová commented: The h2o backend should not set both parameters. I am not sure whether that happens somewhere in AutoML. If a user sets only one parameter, the h2o backend should use only that one. In this case, it says:

{noformat} _ntrees: ntrees and its alias n_estimators are both set to different value than default value. {noformat}

So it means both parameters were set, to different values, and the h2o backend cannot tell which one to use.

@exalate-issue-sync
Author

Veronika Maurerová commented: This can happen, for example, if an old parameters object is reused and only one of the two parameters is changed to a different value.

For example, a parameters object with ntrees = 10 & n_estimators = 10 is OK. If you then change only ntrees to 6, you send a parameters object with ntrees = 6 & n_estimators = 10, which causes the error.
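
The scenario above can be reproduced with a small stand-alone sketch. This is not H2O's implementation: the `XGBParams` class, the `validate` function, and the `ntrees` default of 50 are assumptions made for illustration. Only the alias default of 0 and the accepted and rejected value pairs come from this thread.

```python
from dataclasses import dataclass, replace

NTREES_DEFAULT = 50        # assumed default, for illustration only
N_ESTIMATORS_DEFAULT = 0   # the error message reports 0.0 as the alias default


@dataclass(frozen=True)
class XGBParams:
    """Hypothetical stand-in for a parameters object; not H2O's class."""
    ntrees: int = NTREES_DEFAULT
    n_estimators: int = N_ESTIMATORS_DEFAULT


def validate(params: XGBParams) -> None:
    """Reject a parameter and its alias that are both set but disagree."""
    alias_set = params.n_estimators != N_ESTIMATORS_DEFAULT
    param_set = params.ntrees != NTREES_DEFAULT
    if alias_set and param_set and params.ntrees != params.n_estimators:
        raise ValueError(
            "ntrees and its alias n_estimators are both set to different values"
        )


# Both values agree, as they do after training synced them: accepted.
trained = XGBParams(ntrees=10, n_estimators=10)
validate(trained)

# Only ntrees is changed on the reused object, so the alias is now stale.
reused = replace(trained, ntrees=6)
try:
    validate(reused)
    outcome = "accepted"
except ValueError:
    outcome = "rejected"
```

Here `outcome` ends up as `"rejected"`, which mirrors the ntrees = 6 & n_estimators = 10 case described above.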

@exalate-issue-sync
Author

Sebastien Poirier commented: [~accountid:5bd237b8dd3cc64b77e71676] I think it’s a very bad idea to raise an error when both are set, especially when {{n_estimators}} has been deprecated for years. In some (admittedly rare) cases, we’re now forcing users to set this deprecated param, which prevents us from deleting it completely in the future without breaking code (it’s equivalent to resurrecting a dead param).
Aliases should always be synced at the highest possible level, i.e. the API, and should be removed from the implementation as quickly as possible; otherwise you end up having to constantly synchronize the param and its alias, which is very bug-prone.

Here is the AutoML use-case and many users could do the same on their side:

  • we train a bunch of XGBoost models using CV.
  • we pick the best one, and try to fine-tune it.
  • for this purpose, we take the final params of that model, clone the {{Model.Parameters}} object and modify some values. When training the model with those new params, we reset {{ntrees}} to let CV do its job, and here comes the problem:
    • the original best model had an {{ntrees}} value found through CV, so when it was set to train the final model, XGBoost synced {{n_estimators}} to the same value (bad pattern!).
    • now, as we reuse the parameters of this trained model, we have {{ntrees = n_estimators = NNN}}, and we set {{ntrees = 1000}} for example, of course without setting {{n_estimators}} (why would we? this stuff is old and deprecated! and worse, it is supposed to be an alias: since when do consumers have to set both a param and its alias?).
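
One way a caller could avoid the stale alias when cloning, following the hint in the error message ("Set n_estimators to default value (0.0)"), is sketched below. The `XGBParams` class, the `clone_for_fine_tuning` helper, the `learn_rate` field, and all default values other than the alias default are hypothetical. This is not AutoML's code or the fix that was merged.

```python
import copy
from dataclasses import dataclass

N_ESTIMATORS_DEFAULT = 0  # alias default reported by the error message (0.0)


@dataclass
class XGBParams:
    """Hypothetical stand-in for a cloned Model.Parameters object."""
    ntrees: int = 50            # assumed default, for illustration only
    n_estimators: int = N_ESTIMATORS_DEFAULT
    learn_rate: float = 0.3     # assumed default, for illustration only


def clone_for_fine_tuning(best: XGBParams, ntrees: int, learn_rate: float) -> XGBParams:
    """Copy the best model's params, clearing the alias that training synced."""
    params = copy.copy(best)
    params.n_estimators = N_ESTIMATORS_DEFAULT  # drop the stale alias value
    params.ntrees = ntrees
    params.learn_rate = learn_rate
    return params


# The trained model carries ntrees and n_estimators synced to the same value.
best = XGBParams(ntrees=137, n_estimators=137, learn_rate=0.3)
candidate = clone_for_fine_tuning(best, ntrees=1000, learn_rate=0.05)
```

The clone leaves `best` untouched and hands the grid a parameters object in which only `ntrees` is set, so a both-set check has nothing to reject.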

If you want to keep the {{n_estimators}} alias (and other aliases), I think it should appear only in the schema and be deleted everywhere else, with the synchronization logic done there and only there (the {{n_estimators}} schema property would simply set the value of the {{ntrees}} param).
Usually, these alias issues are solved with setters: two different setters can easily modify the same private property. But in H2O-3, every param is public by design, which prevents this approach…
At the minimum, as we have an official param ({{ntrees}}) and a deprecated one ({{n_estimators}}), I would always give priority to the official one and raise a warning instead of an error when both are set to different values.
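
The setter approach mentioned above can be illustrated in a language that allows private state behind public accessors. This is a generic sketch with a hypothetical `XGBParams` class and an assumed default of 50. As the comment notes, H2O-3 params are public fields, so this does not describe H2O's code.

```python
class XGBParams:
    """Hypothetical sketch: the alias is a property, not a second stored field."""

    def __init__(self, ntrees: int = 50):  # 50 is an assumed default
        self._ntrees = ntrees

    @property
    def ntrees(self) -> int:
        return self._ntrees

    @ntrees.setter
    def ntrees(self, value: int) -> None:
        self._ntrees = value

    @property
    def n_estimators(self) -> int:
        # Reading the alias always reflects the official parameter.
        return self._ntrees

    @n_estimators.setter
    def n_estimators(self, value: int) -> None:
        # Writing the alias writes through to the single backing field.
        self._ntrees = value


params = XGBParams()
params.n_estimators = 137   # set through the alias
params.ntrees = 1000        # later override through the official name
```

Because there is only one backing field, `params.ntrees` and `params.n_estimators` both read 1000 at the end, and there is no second value that could go stale.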

@exalate-issue-sync
Author

Veronika Maurerová commented: Resolution from the private conversation:

  • aliases are not deprecated; users can use both of them
  • there was a discussion about whether both h2o parameters and their aliases should be in the API; this is not solved by this JIRA
  • we keep the error message (in my opinion it is dangerous to pick one parameter value as more important and overwrite the other, because a user who accidentally set both parameters would then get unexpected results without noticing)

@h2o-ops-ro
Collaborator

JIRA Issue Details

Jira Issue: PUBDEV-8394
Assignee: Sebastien Poirier
Reporter: Marek Novotny
State: Resolved
Fix Version: 3.34.0.4
Attachments: N/A
Development PRs: Available

@h2o-ops-ro
Collaborator

Linked PRs from JIRA

#5847
h2oai/h2o-flow#178
