
[Major refactor] Incorporate OO redesign #91

Closed
rhiever opened this issue Feb 25, 2016 · 5 comments

rhiever commented Feb 25, 2016

See #63 for an example of the new OO design on an old version of TPOT. This will require a large overhaul of TPOT.

rhiever changed the title from "Incorporate OO redesign" to "[Major refactor] Incorporate OO redesign" on Jun 1, 2016

rhiever commented Jun 1, 2016

@tonyfast: I just remembered that we have this issue open to discuss the major refactor coming up after the 0.4 release.

@teaearlgraycold, please link your WIP refactor here so we can take a look at it.


danthedaniel commented Jun 2, 2016

Don't try to run this code - atm it's just a structural layout

Edit: Some of it will kinda work now

Edit 2: You can actually do a fit_predict run now

https://github.com/teaearlgraycold/tpot/tree/refactor

Code of interest is in tpot/operators and tpot/tpot.py#134


danthedaniel commented Jun 2, 2016

So with this setup, if you want to do something like refactor TPOT so it just uses NumPy matrices instead of pandas DataFrames, you can edit the Operator class, Classifier class, and PreProcessor class, and leave all of the actual classifiers and preprocessors untouched.
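A minimal sketch of that layering, assuming hypothetical class and method names rather than the actual code in the branch - the point is that the data-representation logic lives only in the base classes:

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier


class Operator:
    """Shared glue between the evolved pipeline and the wrapped sklearn object."""

    sklearn_class = None   # set by concrete subclasses
    default_params = {}

    def _to_matrix(self, features):
        # Single choke point for the DataFrame -> ndarray conversion; change
        # the data representation here and the concrete operators stay untouched.
        return features.values if isinstance(features, pd.DataFrame) else np.asarray(features)


class Classifier(Operator):
    def fit_predict(self, training_features, training_classes, testing_features, **params):
        model = self.sklearn_class(**{**self.default_params, **params})
        model.fit(self._to_matrix(training_features), np.asarray(training_classes))
        return model.predict(self._to_matrix(testing_features))


class PreProcessor(Operator):
    def fit_transform(self, training_features, **params):
        transformer = self.sklearn_class(**{**self.default_params, **params})
        return transformer.fit_transform(self._to_matrix(training_features))


class DecisionTree(Classifier):
    # Concrete operator: only declares which sklearn class it wraps.
    sklearn_class = DecisionTreeClassifier
```

With this layout, switching from DataFrames to plain arrays would only mean editing `_to_matrix` (or its equivalent) in the base classes.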

Edit:

Something I'm interested in doing is largely forgoing the preprocess_args() method (from the refactor) as it is now, and instead implementing some general rules for arguments that will be applied based on which argument names are used.

So for example:

If you have a Classifier that takes the arguments 'max_features' and 'max_depth', there will be a general rule that says max_features should be between 1 and len(training_features.columns), and another rule that states max_depth should be at least 1.

So when you add a new classifier or pre-processor you don't need to add extra code that thresholds values we've already determined reasonable limits for. You'd just need to say which set of parameters you want, and if any of them have pre-defined limits it'll use those. Failing that, it would run any code you specify to limit the arguments.

This, however, assumes that an argument's name can reliably be used to determine what kind of thresholding is useful for that argument.

Doing this would also mean that instead of testing each operator with extreme values covered by the argument preprocessing code, you can test the general rules once.
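As a rough sketch of what such name-based rules might look like (`PARAM_RULES` and `clamp_params` are made-up names, not code from the branch):

```python
# Hypothetical registry of per-argument rules, keyed by argument name. Each
# rule clamps a raw value produced by the GP operators into a sensible range.
PARAM_RULES = {
    # max_features: between 1 and the number of feature columns
    'max_features': lambda value, training_features: max(1, min(int(value), len(training_features.columns))),
    # max_depth: at least 1
    'max_depth': lambda value, training_features: max(1, int(value)),
}


def clamp_params(raw_params, training_features):
    """Apply the generic rules to every parameter that has one defined.

    Parameters without a rule pass through untouched, so an operator can
    still run its own custom argument-limiting code for those.
    """
    clamped = {}
    for name, value in raw_params.items():
        rule = PARAM_RULES.get(name)
        clamped[name] = rule(value, training_features) if rule else value
    return clamped
```

Adding a new classifier would then only require listing which parameter names it accepts; the thresholding logic gets tested once, against the rules, rather than per operator.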


tonyfast commented Jun 2, 2016

I started looking into a refactor myself to understand the project a little bit more. I haven't put this into scripts yet, but the idea is drawn out in the notebook.

There were a few main design opinions to study here:

  • Training and testing class information is contained in a pandas DataFrame MultiIndex.
  • Make a custom BaseEstimator that has a fit_predict classmethod. This method fits the model on the training data and then applies a transform, predict, or selection/support operation.
  • New models are created by subclassing an existing sklearn model with some defaults (a sketch of these last two points is below).

With a limited corpus of models so far, this gets all the way through. There is a problem with the scoring function at the moment.

Are there reasons not to subclass the sklearn models directly? BaseEstimator provides get_params and set_params methods, which may help here.
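A minimal sketch of those two ideas together, with hypothetical names (`fit_predict_op`, `TPOTDecisionTree`) since this isn't code from the notebook:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


class FitPredictMixin:
    """Hypothetical mixin adding the fit/predict-style entry point discussed above."""

    @classmethod
    def fit_predict_op(cls, training_features, training_classes, testing_features):
        # Instantiate with the subclass's defaults, fit on the training split,
        # then predict (a transform or selection/support operation would go
        # here for the other operator types).
        model = cls()
        model.fit(training_features, training_classes)
        return model.predict(testing_features)


class TPOTDecisionTree(FitPredictMixin, DecisionTreeClassifier):
    """New model created by subclassing an existing sklearn model with some defaults."""

    def __init__(self, max_depth=3, min_samples_split=2):
        super().__init__(max_depth=max_depth, min_samples_split=min_samples_split)


# The BaseEstimator machinery still works on the subclass:
# TPOTDecisionTree().get_params() -> {'max_depth': 3, 'min_samples_split': 2}
preds = TPOTDecisionTree.fit_predict_op(
    np.random.rand(20, 4), np.random.randint(0, 2, 20), np.random.rand(5, 4))
```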


danthedaniel commented Jun 2, 2016

@tonyfast I pushed out a commit so your line number is off. You were referring to the _apply_default_params() method though, correct?

That method is largely there so that certain parameters can be blindly applied to all estimators (regardless of whether they're applicable), and behaves differently from set_params.
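For anyone skimming the thread, one plausible reading of that behaviour, sketched with a made-up body rather than the actual implementation:

```python
def _apply_default_params(estimator, default_params):
    """Illustrative only: apply a shared bag of defaults to any estimator,
    silently skipping keys the estimator doesn't accept.

    sklearn's set_params, by contrast, raises ValueError on unknown keys,
    so it can't be called blindly with the same dict for every operator.
    """
    applicable = {k: v for k, v in default_params.items() if k in estimator.get_params()}
    estimator.set_params(**applicable)
    return estimator
```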

Also, my refactor branch is currently in a state where it can be run - albeit with only two classifiers and one preprocessor at the moment.
