Overriding string numerical conversion #54

Closed
jithurjacob opened this issue Nov 29, 2017 · 2 comments

@jithurjacob

Hi,

In a dataset I'm using, the target and the categorical variables are already label-encoded as integers. MLBox incorrectly identifies the categorical values and the target as continuous, and so it turns a classification task into a regression task.

I tried setting the columns to string type, but MLBox still converts them to integers. Can I override this behavior?

dataset

@jithurjacob changed the title from "Identifying classification/regression type" to "Overriding string numerical conversion" on Nov 29, 2017
@AxeldeRomblay
Owner

Hello!

You're right: MLBox tries to cast "fake" categorical features with levels like "1", ... to numbers. In the next release the target will no longer be cast, but the features still will be. To avoid this, you will unfortunately need to prepend a string like "level" to each level:
.apply(lambda x: "level" + str(x))
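
For readers applying this workaround, here is a minimal sketch in pandas. The DataFrame, the column names `color` and `price`, and the `categorical_cols` list are hypothetical and only illustrate the idea; the sketch assumes you already know which columns are label-encoded categoricals.

```python
import pandas as pd

# Hypothetical data: "color" is a label-encoded categorical feature,
# "price" is a genuinely numeric feature.
df = pd.DataFrame({"color": [0, 2, 1, 2], "price": [9.5, 3.0, 7.25, 3.0]})

# Assumption: the list of categorical columns is known in advance.
categorical_cols = ["color"]

for col in categorical_cols:
    # Prefix every level so the value can no longer be parsed as a number.
    df[col] = df[col].apply(lambda x: "level" + str(x))

print(df["color"].tolist())  # ['level0', 'level2', 'level1', 'level2']
```

Only the listed columns are changed, so real numeric features such as `price` stay numeric. If the data is handed to MLBox as a file, the modified frame would presumably need to be written back out first, for example with `to_csv`.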

Hope this helps!

@jithurjacob
Author

Thanks, that helps.
