Add DecisionTreeRegressor & RandomForestRegressor #30

Closed
troglotit opened this issue Nov 19, 2015 · 3 comments
@troglotit

I was wondering whether it is easy to implement a regressor for TPOT. It'd use DecisionTreeRegressor and RandomForestRegressor instead of classifiers.

It'd broaden TPOT's user base and boost its development.

I'm kinda at a pre-intermediate level at committing to open source, and to sklearn in particular, but if someone can tell me this task is achievable for a newcomer, I'll start working on it.

@Chris7

Chris7 commented Nov 20, 2015

I looked at doing this briefly. From what I saw:

  1. You would need to add new feature selection methods that work with continuous data.
  2. You would add a new function for each regressor, along with its arguments, and add each one to the primitives.

Here's a gist I made implementing an AdaBoostRegressor:
https://gist.github.com/Chris7/d46a57f03ed7507c59e6

Ultimately, TPOT should have a keyword indicating whether we are using continuous or labeled data. Quite a bit of the guts needs to be reworked so that train/test splitting and feature selection handle continuous data.
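
To make the train/test splitting point concrete, here is a rough, dependency-free sketch of how a single flag could switch between a stratified split (labeled data) and a plain shuffled split (continuous data). The function name `split_train_test` and the `continuous` keyword are made up for illustration; neither is part of TPOT, and the real change would live inside TPOT's own splitting code.

```python
import random
from collections import defaultdict


def split_train_test(targets, continuous, test_fraction=0.25, seed=0):
    """Return (train_indices, test_indices) for a list of targets.

    `continuous` is a hypothetical keyword, not an existing TPOT parameter.
    """
    rng = random.Random(seed)
    if continuous:
        # Regression targets have no classes to balance, so treat all
        # rows as one group: shuffle and cut.
        groups = [list(range(len(targets)))]
    else:
        # Class labels: split each class separately so both sides keep
        # roughly the original class proportions (a stratified split).
        by_label = defaultdict(list)
        for index, label in enumerate(targets):
            by_label[label].append(index)
        groups = list(by_label.values())

    train, test = [], []
    for group in groups:
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return sorted(train), sorted(test)
```

Calling `split_train_test(labels, continuous=False)` keeps the class mix on both sides, while `continuous=True` skips the grouping step, since grouping by exact target value makes little sense when nearly every value is unique.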

@rhiever
Contributor

rhiever commented Nov 23, 2015

Hi! Just wanted to acknowledge that I saw your issue and hope to get to it soon. @Chris7 is on the right track with his response.

@rhiever
Contributor

rhiever commented Dec 2, 2015

Okay, finally back from vacation! I'm happy to hear that you're excited about contributing to TPOT.

Looking at @Chris7's gist, he's right on the money. I'd imagine instead of having a special keyword passed to TPOT, we'd be better off implementing separate "classification" and "regression" versions of TPOT (e.g., TPOTRegressor and TPOTClassifier) that build off of a base TPOT object with the code that is shared between the two.
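
A minimal sketch of what that split might look like, purely to show the shape of the idea. Only the names `TPOTRegressor` and `TPOTClassifier` come from the comment above; `TPOTBase`, the `model_operators` attribute, the `score` and `describe` methods, and the choice of metrics are all assumptions made for illustration, not actual TPOT code.

```python
class TPOTBase:
    """Hypothetical shared base: subclasses declare only what differs per task."""

    # Names of the model operators a pipeline may use (set by subclasses).
    model_operators = ()

    def score(self, y_true, y_pred):
        # Each task supplies its own fitness metric.
        raise NotImplementedError

    def describe(self):
        # Stand-in for the code both versions would share.
        operators = ", ".join(self.model_operators)
        return f"{type(self).__name__} with operators: {operators}"


class TPOTClassifier(TPOTBase):
    # Illustrative operator list for labeled data.
    model_operators = ("DecisionTreeClassifier", "RandomForestClassifier")

    def score(self, y_true, y_pred):
        # Accuracy: fraction of predictions that match the labels.
        matches = sum(t == p for t, p in zip(y_true, y_pred))
        return matches / len(y_true)


class TPOTRegressor(TPOTBase):
    # Illustrative operator list for continuous targets.
    model_operators = ("DecisionTreeRegressor", "RandomForestRegressor")

    def score(self, y_true, y_pred):
        # Negative mean squared error, so that higher is still better.
        squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
        return -squared / len(y_true)
```

Compared with a keyword argument, this keeps the task-specific pieces (operators, scoring) out of the shared code path, so the base class never needs to branch on the kind of data.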


3 participants