AutoML fails when the project_name is parsable as a number #8642
Comments
Sebastien Poirier commented: [~accountid:557058:afd6e9a4-1891-4845-98ea-b5d34a2bc42c] This is a constraint on {{Frame}} ids. I don’t think it’s worth the effort (and complexity) to fix this for {{AutoML}} project names. If it’s just for display, you can prefix the name with {{_}}, which works: {{project_name="_3.26.0.8"}} or {{project_name="v3.26.0.8"}}. Should I close it?
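The prefix workaround above can be applied automatically on the client side. The following is a minimal sketch, assuming the only rule to work around is the leading-number check shown in the {{check_frame_id}} traceback in this issue (the {{-?[0-9]}} pattern). {{safe_project_name}} is a hypothetical helper and is not part of h2o.

```python
import re

# Same pattern that check_frame_id applies in the traceback quoted in this issue.
_STARTS_WITH_NUMBER = re.compile(r"-?[0-9]")


def safe_project_name(name, prefix="_"):
    """Hypothetical helper: prefix names that would be rejected as frame ids."""
    if _STARTS_WITH_NUMBER.match(name):
        return prefix + name
    return name


print(safe_project_name("3.26.0.8"))   # _3.26.0.8
print(safe_project_name("v3.26.0.8"))  # v3.26.0.8
```

The returned string can then be passed as {{project_name}}. Other frame id rules (such as illegal characters) are not handled here.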
Sebastien Poirier commented: What I can do, though, is validate {{project_name}} upfront so that there’s no surprise after the run.
Sebastien Poirier commented: For some reason, it doesn’t work in R either, and the error there is much more cryptic (I believe there’s no validation there).
Sebastien Poirier commented: This constraint is connected with the Rapids language.
Erin LeDell commented: [~accountid:5b153fb1b0d76456f36daced] Good plan, thanks!
JIRA Issue Migration Info
Jira Issue: PUBDEV-6998
Linked PRs from JIRA
If you use a project_name string like "3.26.0.8", AutoML fails on the client side because it can't retrieve the leaderboard. I guess H2OFrame ids cannot begin with numbers?
{code:java}
library(h2o)
h2o.init()

# Import a sample binary outcome train/test set into H2O
train <- h2o.importFile("https://s3.amazonaws.com/erin-data/higgs/higgs_train_10k.csv")
test <- h2o.importFile("https://s3.amazonaws.com/erin-data/higgs/higgs_test_5k.csv")

# Identify predictors and response
y <- "response"
x <- setdiff(names(train), y)

# For binary classification, response should be a factor
train[,y] <- as.factor(train[,y])
test[,y] <- as.factor(test[,y])

aml <- h2o.automl(x = x, y = y,
                  training_frame = train,
                  max_models = 2,
                  project_name = "3.26.0.8",
                  seed = 1)
{code}
This is the error:
{{Error in if (leaderboard$model_id[1, 1] == "") { : }}
{{argument is of length zero}}
If you try to grab the model from Python, you will see another error:
{code}In [4]: aml = get_automl("3.26.0.8")
H2OValueError Traceback (most recent call last)
in ()
----> 1 aml = get_automl("3.26.0.8")
/home/ledell/venv/h2o-3/local/lib/python2.7/site-packages/h2o/automl/autoh2o.pyc in get_automl(project_name)
610 :returns: A dictionary containing the project_name, leader model, leaderboard, event_log.
611 """
--> 612 return H2OAutoML._fetch_state(project_name)
/home/ledell/venv/h2o-3/local/lib/python2.7/site-packages/h2o/automl/autoh2o.pyc in _fetch_state(project_name, properties)
585 leaderboard = None
586 if should_fetch('leaderboard'):
--> 587 leaderboard = H2OAutoML._fetch_table(state_json['leaderboard_table'], key=project_name+"_leaderboard", progress_bar=False)
588 leaderboard = h2o.assign(leaderboard[1:], project_name+"_leaderboard") # removing index and reassign id to ensure persistence on backend
589
/home/ledell/venv/h2o-3/local/lib/python2.7/site-packages/h2o/automl/autoh2o.pyc in _fetch_table(table, key, progress_bar)
565 H2OJob.PROGRESS_BAR = progress_bar
566 # Parse leaderboard H2OTwoDimTable & return as an H2OFrame
--> 567 return h2o.H2OFrame(table.cell_values, destination_frame=key, column_names=table.col_header, column_types=table.col_types)
568 finally:
569 H2OJob.PROGRESS_BAR = ori_progress_state
/home/ledell/venv/h2o-3/local/lib/python2.7/site-packages/h2o/frame.pyc in __init__(self, python_obj, destination_frame, header, separator, column_names, column_types, na_strings, skipped_columns)
100 assert_is_type(column_types, None, [coltype], {str: coltype})
101 assert_is_type(na_strings, None, [str], [[str]], {str: [str]})
--> 102 check_frame_id(destination_frame)
103
104 self._ex = ExprNode()
/home/ledell/venv/h2o-3/local/lib/python2.7/site-packages/h2o/utils/shared_utils.pyc in check_frame_id(frame_id)
56 raise H2OValueError("Character '%s' is illegal in frame id: %s" % (ch, frame_id))
57 if re.match(r"-?[0-9]", frame_id):
---> 58 raise H2OValueError("Frame id cannot start with a number: %s" % frame_id)
59
60{code}
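The traceback shows that the leaderboard frame id is built as {{project_name + "_leaderboard"}} and then rejected by the leading-number rule in {{check_frame_id}}. A check run before AutoML starts would surface the problem early. Below is a minimal sketch of such a check, assuming only that one rule needs to be covered; {{check_project_name}} is a hypothetical function, not the validation that h2o ships.

```python
import re


def check_project_name(project_name):
    """Hypothetical upfront check, not part of h2o.

    Rejects names whose derived leaderboard frame id would fail the
    leading-number rule seen in check_frame_id in the traceback above.
    """
    derived_id = project_name + "_leaderboard"
    # Same pattern as in the traceback: optional minus sign, then a digit.
    if re.match(r"-?[0-9]", derived_id):
        raise ValueError(
            "project_name cannot start with a number: %s" % project_name
        )
    return project_name


check_project_name("v3.26.0.8")   # passes and returns the name unchanged
# check_project_name("3.26.0.8")  # would raise ValueError before any model is trained
```

Raising before the run means no models are trained under a name that cannot be retrieved afterwards.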