Add to docs that transform only works on numerical columns #8257

Open
exalate-issue-sync bot opened this issue May 11, 2023 · 2 comments

Comments

@exalate-issue-sync

Update the transform docs to indicate that the transformation is applied only to numerical columns.
See: https://h2oai.slack.com/archives/C04KNHH2H/p1584389368047200

Original question (March 16, 2020):

When transform = "standardize" for GLRM, it appears that only numeric columns are standardized and categorical/binary columns are skipped (which seems like the right approach): https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/glrm/GLRM.java#L356-L358 and https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/glrm/GLRM.java#L1030. Is that interpretation correct? Either way, is it possible to clarify the treatment of categorical/binary variables in the documentation for transform? https://github.com//h2oai/h2o-3/blob/master/h2o-docs/src/product/data-science/algo-params/transform.rst

@exalate-issue-sync
Author

Angela Bartz commented: Pull request merged into rel-yule.

@exalate-issue-sync
Author

Wendy commented: Chris:

I messed up my answer to your question in the PR. Here is my answer again.

Standardization is only applied to numerical column types. Enum/binary columns are not affected by standardization.

Hope this is clear.

Wendy
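
For readers who want to see the behavior described above in miniature, here is a rough sketch in plain Python. It is not H2O's code and does not call the H2O API. The function name `standardize_numeric_columns`, the dict-of-lists "frame", the use of the sample standard deviation, and the rule that string or boolean values stand in for enum/binary columns are all assumptions made for illustration only.

```python
from statistics import mean, stdev


def standardize_numeric_columns(columns):
    """Return a copy of `columns` where numeric columns are standardized.

    `columns` is a hypothetical dict mapping column name -> list of values.
    Columns holding strings or booleans stand in for enum/binary columns
    and are copied through unchanged.
    """
    result = {}
    for name, values in columns.items():
        # Assumption: a column is numeric if every value is an int or float
        # (bool is excluded, since it plays the role of a binary column here).
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if is_numeric and len(values) > 1:
            mu = mean(values)
            sigma = stdev(values)  # sample standard deviation (assumed)
            if sigma > 0:
                result[name] = [(v - mu) / sigma for v in values]
                continue
        # Non-numeric, single-row, or constant columns are left as they are.
        result[name] = list(values)
    return result


frame = {
    "age": [20, 30, 40],
    "color": ["red", "blue", "red"],
    "flag": [True, False, True],
}
out = standardize_numeric_columns(frame)
print(out["age"])
print(out["color"])
print(out["flag"])
```

In this sketch only `age` is rescaled; `color` and `flag` come back exactly as they went in, which mirrors the statement that enum/binary columns are not affected by standardization.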
