Integer encoding for categorical variables in random forests in R #22

Closed
zachmayer opened this Issue Aug 12, 2015 · 2 comments

@zachmayer

This quote stuck out to me:

It cannot cope by default with a large number of categories, therefore the data had to be one-hot encoded.

Did you try integer-encoding the categories? It looks like you did for Python, so maybe that's worth trying with R too.
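
For readers unfamiliar with the two encodings being compared, here is a minimal sketch in plain Python. The column values and the helper names (`integer_encode`, `one_hot_encode`) are hypothetical and chosen only for illustration; this is not the benchmark's actual preprocessing code.

```python
# Hypothetical toy column; the values are illustrative, not taken from the benchmark data.
carriers = ["AA", "DL", "UA", "AA", "WN", "DL"]


def integer_encode(values):
    """Map each distinct category to an integer code (sorted order, starting at 0)."""
    levels = sorted(set(values))
    code_of = {level: i for i, level in enumerate(levels)}
    return [code_of[v] for v in values], levels


def one_hot_encode(values):
    """Expand the column into one 0/1 indicator column per distinct category."""
    levels = sorted(set(values))
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return rows, levels


codes, levels = integer_encode(carriers)
onehot, _ = one_hot_encode(carriers)

print(levels)          # the distinct categories
print(codes)           # a single numeric column
print(len(onehot[0]))  # one indicator column per category
```

Integer encoding keeps a single column regardless of how many categories there are, but the tree then splits on an arbitrary numeric ordering of the codes. One-hot encoding avoids that ordering at the cost of one column per category. In R, `as.integer(factor(x))` yields a comparable (1-based) integer encoding.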

@zachmayer zachmayer changed the title from Integer encoding for categorical variables in random forests to Integer encoding for categorical variables in random forests in R Aug 12, 2015
@szilard
Owner
szilard commented Aug 12, 2015

Yes, I played around with it; see the discussion in #1.

@szilard
Owner
szilard commented Aug 14, 2015

Closing this, but let me know if you have further questions.

@szilard szilard closed this Aug 14, 2015