R: h2o.unique returns the entire domain rather than the unique values present in the column #11601

Closed
exalate-issue-sync bot opened this issue May 12, 2023 · 1 comment

Comments

@exalate-issue-sync

On an original frame, the unique values in a categorical column may coincide with the column's levels. However, if I subset the iris dataset so that it contains only setosa and versicolor, h2o.unique still returns the original domain. In effect, h2o.unique and h2o.levels behave identically.

{code:r}
library(h2o)
h2o.init(nthreads = -1)
iris.hex = as.h2o(iris)
h2o.unique(iris.hex$Species)
#           C1
# 1     setosa
# 2 versicolor
# 3  virginica

# Subset iris dataset
iris_subset = iris.hex[1:100,]

# Run table to track
h2o.table(iris_subset$Species)
#      Species Count
# 1     setosa    50
# 2 versicolor    50

# However h2o.unique still returns virginica
h2o.unique(iris_subset$Species)
#           C1
# 1     setosa
# 2 versicolor
# 3  virginica
{code}
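
The difference between a column's domain and the values actually present can be illustrated without an H2O cluster, using a base R factor as an analogy. This is only a sketch: it assumes that a base R factor keeps its levels after row subsetting in much the same way the H2O column above keeps its domain, and the variable names are made up for illustration.

```r
# Base R analogy (no H2O cluster needed): a factor keeps its full set of
# levels after row subsetting, similar to the domain of a categorical column.
species_subset <- iris$Species[1:100]

# levels() reports the declared domain, including the now-absent "virginica"
domain_values <- levels(species_subset)

# unique() reports only the values actually present in the data
present_values <- as.character(unique(species_subset))

# droplevels() trims the domain down to the values that are present
trimmed_levels <- levels(droplevels(species_subset))

print(domain_values)
print(present_values)
print(trimmed_levels)
```

In this analogy, `present_values` is what the report expects h2o.unique to return on the subset, while `domain_values` corresponds to what it returns instead.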

@hasithjp
Member

JIRA Issue Migration Info

Jira Issue: PUBDEV-4722
Assignee: Pavel Pscheidl
Reporter: Amy Wang
State: Resolved
Fix Version: 3.30.1.2
Attachments: N/A
Development PRs: N/A
