Handling of mixed type datasets with categorical features #58

Closed
candalfigomoro opened this issue Jan 15, 2020 · 2 comments
Labels
question Further information is requested

Comments

@candalfigomoro
Contributor

From the README file:

> both categorical and continuous features are handled well

How do we handle categorical features? Is one-hot encoding enough?

In UMAP you can use one distance for one-hot-encoded categorical features (e.g. Dice or Jaccard) and another for continuous features, then combine the two with an "intersection" (see lmcinnes/umap#58).

How can we handle mixed type datasets in ivis? Can we just use it on a dataset with continuous features and one-hot-encoded categorical features mixed together?

Thank you very much

@idroz
Collaborator

idroz commented Jan 15, 2020

Hi, you can mix one-hot-encoded (OHE) features with continuous ones. Since OHE columns only take the values 0 and 1, you may also need to scale your continuous features to the 0-1 range so that they do not dominate the distances.
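As a rough illustration of that preprocessing, here is a minimal sketch using only NumPy. The toy data and the helper names `one_hot` and `min_max_scale` are made up for this example and are not part of ivis; a scikit-learn pipeline would work just as well.

```python
import numpy as np

def one_hot(column):
    """One-hot encode a 1-D array of category labels (hypothetical helper)."""
    categories, codes = np.unique(column, return_inverse=True)
    # Row i of the identity matrix is the indicator vector for category i.
    return np.eye(len(categories))[codes.ravel()], categories

def min_max_scale(X):
    """Scale each column to [0, 1]; constant columns map to 0 (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span

# Toy mixed-type data: one categorical column and two continuous columns.
colour = np.array(["red", "green", "blue", "green"])
continuous = np.array([[1.0, 200.0],
                       [2.0, 400.0],
                       [3.0, 300.0],
                       [5.0, 100.0]])

colour_ohe, colour_categories = one_hot(colour)

# Continuous features scaled to 0-1, followed by the 0/1 indicator columns.
X = np.hstack([min_max_scale(continuous), colour_ohe])
```

With a realistically sized dataset, a matrix built like `X` could then be passed to ivis in the usual way (for example via `Ivis(...).fit_transform(X)`). The toy array here is far too small for that and is only meant to show the layout.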

@idroz idroz added the question Further information is requested label Jan 15, 2020
@candalfigomoro
Contributor Author

Thank you!
