About One hot encoder #44

Closed
d-ghale opened this issue Sep 5, 2018 · 4 comments
@d-ghale

d-ghale commented Sep 5, 2018

Is there a reason why "one_hot_encoder" creates the columns in integer format instead of logical?

I need help finding an efficient way to convert the columns I created using "one_hot_encoder" into lgl format without explicitly writing out the list of columns.

Thanks!

@ELToulemonde
Owner

Hi,

Hmm... I don't really know. Why not? Do you have a use case where 0 and 1 are painful?

@d-ghale
Author

d-ghale commented Sep 8, 2018

Yes. Say I create a numeric column called total_count that contains 0 or 1. I use one_hot_encoder to create a logical column, but it is expressed as 0 or 1, not TRUE or FALSE. When I use whichAreInDouble, I get a false positive saying these two columns are the same.
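To make the situation concrete, here is a minimal base R sketch. It does not call dataPreparation and makes no claim about how whichAreInDouble compares columns internally; the vectors are illustrative stand-ins. It only shows that a numeric 0/1 column, an integer 0/1 column, and a logical column are equal by value, while a strict comparison also looks at the type.

```r
# Illustrative stand-ins, not actual output from dataPreparation.
total_count <- c(0, 1, 1, 0)                 # numeric (double) column
encoded_int <- c(0L, 1L, 1L, 0L)             # integer column, like the default encoder output
encoded_lgl <- c(FALSE, TRUE, TRUE, FALSE)   # the same information stored as logical

# Element-wise comparison coerces types, so all three agree by value.
same_by_value_int <- all(total_count == encoded_int)
same_by_value_lgl <- all(total_count == encoded_lgl)

# identical() also compares the storage type, so these differ.
same_strict_int <- identical(total_count, encoded_int)
same_strict_lgl <- identical(total_count, encoded_lgl)

print(c(same_by_value_int, same_by_value_lgl, same_strict_int, same_strict_lgl))
```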

@ELToulemonde
Owner

Ok, I see.

  • First, I don't really see why it is a problem, since both columns contain exactly the same information (for example, the function whichAreBijection will drop it whether it is TRUE/FALSE or 0/1).

  • Second, here is a dirty fix: pass the column you wish to keep to the keep_cols argument of whichAreInDouble.

  • Last, I added the argument type to one_hot_encoder; this argument lets you choose between integer, numeric and logical for the type of the result. It will be included in a future CRAN release.

Until the CRAN release, you can get this feature by installing the package directly from GitHub:

library(devtools)
install_github("ELToulemonde/dataPreparation")
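For anyone who cannot install the development version, the original question (converting the encoded columns to logical without listing them by hand) can also be handled after the fact. The following is a minimal base R sketch, not part of dataPreparation. The data frame and the "color." column prefix are hypothetical; adapt the pattern to whatever names your encoded columns actually have.

```r
# Hypothetical stand-in for an already encoded data set.
# The "color." prefix is an assumption made for this example.
df <- data.frame(
  id = 1:3,
  color.red = c(1L, 0L, 0L),
  color.blue = c(0L, 1L, 0L),
  total_count = c(0, 1, 1)
)

# Select the encoded columns by name pattern instead of writing them out.
encoded_cols <- grep("^color\\.", names(df), value = TRUE)

# Convert only those columns to logical; other columns are left untouched.
df[encoded_cols] <- lapply(df[encoded_cols], as.logical)

str(df)
```

Selecting by pattern keeps a genuinely numeric 0/1 column such as total_count out of the conversion. With a data.table, the same selection could be applied by reference using `:=` or `set()` instead of the `lapply` assignment.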

Hope I answered your question.

If you have any other questions or remarks, please don't hesitate.

I'm closing this issue.

Emmanuel-Lin

@d-ghale
Author

d-ghale commented Sep 10, 2018

Thanks! Yes, whichAreBijection will drop it even if it is TRUE/FALSE or 0/1. I am using fastFilterVariables with type = 2 to drop constant and double columns. Your update solved my problem.

Thank You!
