Error when trying to use a fold column when number of folds < official number of levels in that column #7624

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 3 comments

Comments

@exalate-issue-sync

I am trying to do a pretty standard thing in ML and I am getting an error.

Task:

  • There’s a “cv” categorical column, which has 5 values (5 folds).
  • I subset the frame by the cv column to make train (folds 1-4) and test (fold 5).
  • Now I try to train an h2o.glm using train, and I want to do 4-fold CV using the 4 folds I have left, via the fold_column argument.
  • However, h2o.glm raises an error because train$cv still declares 5 levels while only 4 are represented in the dataset. I’ve confirmed this is the cause because it works if I use the original dataset with all 5 folds.
  • I can’t find a way to re-level the frame to tell it that the cv column only has 4 levels. h2o.setLevels() is just a renaming tool; you can’t use it to change the cardinality of the domain.

[Screenshot: Screen Shot 2021-02-21 at 5.03.46 PM.png]

Can we relax this restriction on fold_column in H2O algos?
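To make the mismatch described above concrete, here is a minimal, library-free sketch of a categorical column that stores a declared domain separately from its row values. It is only a toy model: the `CategoricalColumn` class and its methods are hypothetical names invented for illustration, not H2O's API or internals. It shows why a row subset can still report 5 levels when only 4 are present, and what a "drop unused levels" step would need to do.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CategoricalColumn:
    """Toy categorical column (hypothetical, not H2O's implementation)."""
    domain: List[str]  # declared levels
    codes: List[int]   # per-row index into `domain`

    def values(self) -> List[str]:
        # Decode each row back to its level label.
        return [self.domain[c] for c in self.codes]

    def subset(self, keep: List[bool]) -> "CategoricalColumn":
        # Filtering rows leaves the declared domain untouched.
        kept = [c for c, k in zip(self.codes, keep) if k]
        return CategoricalColumn(list(self.domain), kept)

    def observed_levels(self) -> List[str]:
        # Levels that actually occur in the rows, in domain order.
        used = set(self.codes)
        return [lvl for i, lvl in enumerate(self.domain) if i in used]

    def drop_unused_levels(self) -> "CategoricalColumn":
        # Rebuild the domain from observed levels and remap the codes.
        observed = self.observed_levels()
        remap = {lvl: i for i, lvl in enumerate(observed)}
        new_codes = [remap[self.domain[c]] for c in self.codes]
        return CategoricalColumn(observed, new_codes)


# A "cv" column with 5 folds, two rows per fold.
cv = CategoricalColumn(domain=["1", "2", "3", "4", "5"],
                       codes=[0, 1, 2, 3, 4, 0, 1, 2, 3, 4])

# Hold out fold 5 as the test set; the remaining rows are "train".
train_cv = cv.subset([v != "5" for v in cv.values()])
print(len(train_cv.domain), len(train_cv.observed_levels()))  # 5 4

# Re-leveling shrinks the declared domain without changing any row value.
releveled = train_cv.drop_unused_levels()
print(len(releveled.domain), releveled.values() == train_cv.values())  # 4 True
```

In this model, a fold check that counts the declared domain sees 5 folds, while one that counts observed levels sees 4, which is the discrepancy the request asks H2O to tolerate.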

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Details

Jira Issue: PUBDEV-8024
Assignee: Michal Kurka
Reporter: Erin LeDell
State: Resolved
Fix Version: 3.32.1.1
Attachments: Available (Count: 1)
Development PRs: Available

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Attachments From Jira

Attachment Name: Screen Shot 2021-02-21 at 5.03.46 PM.png
Attached By: Erin LeDell
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8024/Screen Shot 2021-02-21 at 5.03.46 PM.png

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Linked PRs from JIRA

#5345

@h2o-ops h2o-ops closed this as completed May 14, 2023