Unable to provide hyperparameters for Naive-Bayes and XGBoost metalearners in Stacked Ensembles #7738

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 2 comments

Comments

@exalate-issue-sync

Code snippets to reproduce the bug

STEP-1
Train the base learners

{code:python}
import h2o
from h2o.estimators.random_forest import H2ORandomForestEstimator
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.estimators.stackedensemble import H2OStackedEnsembleEstimator
h2o.init()

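# Import a sample binary-outcome train/test set into H2O
train = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
test = h2o.import_file("https://s3.amazonaws.com/erin-data/higgs/higgs_test_5k.csv")

# Identify predictors and response
x = train.columns
y = "response"
x.remove(y)

# For binary classification, the response should be a factor
train[y] = train[y].asfactor()
test[y] = test[y].asfactor()

# Number of CV folds (used to generate level-one data for stacking)
nfolds = 5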

# Train and cross-validate a GBM
my_gbm = H2OGradientBoostingEstimator(distribution="bernoulli",
                                      ntrees=10,
                                      max_depth=3,
                                      min_rows=2,
                                      learn_rate=0.2,
                                      nfolds=nfolds,
                                      fold_assignment="Modulo",
                                      keep_cross_validation_predictions=True,
                                      seed=1)
my_gbm.train(x=x, y=y, training_frame=train)

# Train and cross-validate a RF
my_rf = H2ORandomForestEstimator(ntrees=50,
                                 nfolds=nfolds,
                                 fold_assignment="Modulo",
                                 keep_cross_validation_predictions=True,
                                 seed=1)
my_rf.train(x=x, y=y, training_frame=train)

{code}

STEP-2
Train the stacked ensemble with the naivebayes metalearner and pass params via metalearner_params.

{code:python}

ensemble = H2OStackedEnsembleEstimator(model_id="my_ensemble_binomial",
                                       base_models=[my_gbm, my_rf],
                                       metalearner_algorithm='naivebayes',
                                       metalearner_params={'min_prob': 0.5})

ensemble.train(x=x, y=y, training_frame=train)
{code}

The error message is as follows:

{code}

H2OServerError Traceback (most recent call last)
in
4 metalearner_params= {'min_prob': 0.5})
5
----> 6 ensemble.train(x=x, y=y, training_frame=train)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/estimators/stackedensemble.py in train(self, x, y, training_frame, blending_frame, verbose, **kwargs)
827 parms = sup._make_parms(x, y, training_frame, extend_parms_fn=extend_parms, **kwargs)
828
--> 829 sup._train(parms, verbose=verbose)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/estimators/estimator_base.py in _train(self, parms, verbose)
192
193 rest_ver = self._get_rest_version(parms)
--> 194 model_builder_json = h2o.api("POST /%d/ModelBuilders/%s" % (rest_ver, self.algo), data=parms)
195 job = H2OJob(model_builder_json, job_type=(self.algo + " Model Build"))
196

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/h2o.py in api(endpoint, data, json, filename, save_to)
107 # type checks are performed in H2OConnection class
108 _check_connection()
--> 109 return h2oconn.request(endpoint, data=data, json=json, filename=filename, save_to=save_to)
110
111

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/backend/connection.py in request(self, endpoint, data, json, filename, save_to)
476 save_to = save_to(resp)
477 self._log_end_transaction(start_time, resp)
--> 478 return self._process_response(resp, save_to)
479
480 except (requests.exceptions.ConnectionError, requests.exceptions.HTTPError) as e:

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/backend/connection.py in _process_response(response, save_to)
827 # Note that it is possible to receive valid H2OErrorV3 object in this case, however it merely means the server
828 # did not provide the correct status code.
--> 829 raise H2OServerError("HTTP %d %s:\n%r" % (status_code, response.reason, data))
830
831

H2OServerError: HTTP 500 Server Error:
Server error java.lang.UnsupportedOperationException:
Error: Unknown meta-learner algo: naivebayes
Request: None
{code}

STEP-3
Train the stacked ensemble with the xgboost metalearner and pass params via metalearner_params.

{code:python}

ensemble = H2OStackedEnsembleEstimator(model_id="my_ensemble_binomial",
                                       base_models=[my_gbm, my_rf],
                                       metalearner_algorithm='xgboost',
                                       metalearner_params={'booster': 'dart'})

ensemble.train(x=x, y=y, training_frame=train)
{code}

The corresponding error message is:

{code}

H2OServerError Traceback (most recent call last)
in
4 metalearner_params= {'booster': 'dart'})
5
----> 6 ensemble.train(x=x, y=y, training_frame=train)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/estimators/stackedensemble.py in train(self, x, y, training_frame, blending_frame, verbose, **kwargs)
827 parms = sup._make_parms(x, y, training_frame, extend_parms_fn=extend_parms, **kwargs)
828
--> 829 sup._train(parms, verbose=verbose)

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/estimators/estimator_base.py in _train(self, parms, verbose)
192
193 rest_ver = self._get_rest_version(parms)
--> 194 model_builder_json = h2o.api("POST /%d/ModelBuilders/%s" % (rest_ver, self.algo), data=parms)
195 job = H2OJob(model_builder_json, job_type=(self.algo + " Model Build"))
196

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/h2o.py in api(endpoint, data, json, filename, save_to)
107 # type checks are performed in H2OConnection class
108 _check_connection()
--> 109 return h2oconn.request(endpoint, data=data, json=json, filename=filename, save_to=save_to)
110
111

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/backend/connection.py in request(self, endpoint, data, json, filename, save_to)
476 save_to = save_to(resp)
477 self._log_end_transaction(start_time, resp)
--> 478 return self._process_response(resp, save_to)
479
480 except (requests.exceptions.ConnectionError, requests.exceptions.HTTPError) as e:

/anaconda/envs/azureml_py36/lib/python3.6/site-packages/h2o/backend/connection.py in _process_response(response, save_to)
827 # Note that it is possible to receive valid H2OErrorV3 object in this case, however it merely means the server
828 # did not provide the correct status code.
--> 829 raise H2OServerError("HTTP %d %s:\n%r" % (status_code, response.reason, data))
830
831

H2OServerError: HTTP 500 Server Error:
Server error java.lang.UnsupportedOperationException:
Error: Unknown meta-learner algo: xgboost
Request: None

{code}
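
Most of each traceback above is client-side plumbing; the informative part is the server's `Error:` line. As a minimal sketch, the message can be pulled out of the exception text as below. The helper name `extract_server_error` is hypothetical (it is not part of the h2o package), and the code assumes the text keeps the `Error: ...` layout shown in this report.

```python
import re

def extract_server_error(message):
    """Return the text after 'Error:' in an H2O server error message, or None.

    Hypothetical helper: assumes the 'Error: ...' line layout seen in this report.
    """
    match = re.search(r"^[ \t]*Error:[ \t]*(.+)$", message, flags=re.MULTILINE)
    return match.group(1).strip() if match else None

# Message text copied from the XGBoost case in this report.
sample = (
    "HTTP 500 Server Error:\n"
    "Server error java.lang.UnsupportedOperationException:\n"
    "  Error: Unknown meta-learner algo: xgboost\n"
    "  Request: None\n"
)
print(extract_server_error(sample))
```

In a live session the same helper could be applied to `str(e)` inside an `except` block wrapped around `ensemble.train(...)`, which makes it easier to see that both failures share the same "Unknown meta-learner algo" cause.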

@exalate-issue-sync
Author

Erin LeDell commented: Thanks for the bug report. We will take a look.

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Migration Info

Jira Issue: PUBDEV-7907
Assignee: Tomas Fryda
Reporter: Abhinav Sharma
State: Resolved
Fix Version: 3.34.0.4
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

#5153
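
The migration info lists 3.34.0.4 as the fix version, so comparing an installed version against it is a quick way to tell whether the fix should be present. A minimal sketch, assuming version strings are dot-separated integers as in `3.34.0.4`; the helper names `parse_version` and `has_metalearner_fix` are hypothetical and not part of the h2o package.

```python
FIX_VERSION = (3, 34, 0, 4)  # "Fix Version" from the Jira migration info

def parse_version(version):
    """Convert a dotted version string such as '3.34.0.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def has_metalearner_fix(version):
    """Return True if `version` is at or after the listed fix version."""
    return parse_version(version) >= FIX_VERSION

print(has_metalearner_fix("3.32.1.7"))  # an older release
print(has_metalearner_fix("3.34.0.4"))  # the listed fix version
```

With the Python client installed, `has_metalearner_fix(h2o.__version__)` would apply the check to the running package. Since the error in this report is raised by the server, the version of the H2O cluster being connected to is the one that matters most; version strings containing non-numeric parts would need extra handling.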
