Add more feature preprocessing operators #102

Closed
3 of 4 tasks
rhiever opened this issue Mar 3, 2016 · 2 comments


rhiever commented Mar 3, 2016

There are some feature preprocessors that could potentially work well in the TPOT environment:

Allowing TPOT to work with these preprocessors shouldn't be much more expensive, and it will allow TPOT to explore different ways of processing the features.


rhiever commented Mar 6, 2016

We won't be able to add support for OneHotEncoder until #29 is solved. I will add support for the other three preprocessors.

Further complication with OneHotEncoder: It should only be applied to categorical columns. Thus, we need an effective heuristic to detect categorical vs. continuous columns. I've raised a question on StackOverflow to discuss ideas: http://stackoverflow.com/questions/35826912/what-is-a-good-heuristic-to-detect-if-a-column-in-a-pandas-dataframe-is-categori
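For illustration only, one possible heuristic could look like the sketch below. It is not what TPOT implements and not an answer taken from the StackOverflow thread. The function name `looks_categorical` and the thresholds `max_unique` and `max_unique_ratio` are assumptions made up for this example. It works on a plain sequence of values (for a DataFrame, something like `df[col].tolist()` could be passed in) and treats a column as categorical when it holds non-numeric values, or when it holds only whole numbers with few distinct values relative to its length.

```python
from numbers import Real


def looks_categorical(values, max_unique=10, max_unique_ratio=0.05):
    """Guess whether a column is categorical (hypothetical heuristic).

    The thresholds are arbitrary assumptions and would need tuning
    for real data sets.
    """
    # Drop missing entries: None, and NaN (the only value with v != v).
    present = [v for v in values if v is not None and v == v]
    if not present:
        return False

    # Strings, booleans, and other non-numeric values count as categorical.
    if any(isinstance(v, bool) or not isinstance(v, Real) for v in present):
        return True

    # Any fractional value suggests a continuous measurement.
    if any(not float(v).is_integer() for v in present):
        return False

    # Whole numbers: categorical only if there are few distinct values,
    # both in absolute terms and relative to the column length.
    n_unique = len(set(present))
    return n_unique <= max_unique and n_unique / len(present) <= max_unique_ratio


colors = ["red", "green", "blue"] * 10
heights = [1.5 + 0.01 * i for i in range(100)]
ratings = [1, 2, 3, 4, 5] * 40
row_ids = list(range(200))

for name, column in [("colors", colors), ("heights", heights),
                     ("ratings", ratings), ("row_ids", row_ids)]:
    print(name, looks_categorical(column))
```

A heuristic like this will misclassify some columns, for example short integer columns or integer-coded counts, so the result is probably best treated as a default that a user can override.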


rhiever commented Mar 6, 2016

The three preprocessing operators are now implemented in this branch: https://github.com/rhiever/tpot/tree/more-feature-preprocessors

TODO:

  • Add export() support for them
  • Add them to the docs
