
Trim out data transformation operators that are downstream of the last classification step #70

Closed
dmarx opened this issue Jan 4, 2016 · 5 comments

dmarx commented Jan 4, 2016

Sometimes the optimized pipeline will look something like this:

transformation -> transformation -> classification -> transformation

The last transformation step adds nothing. We should clean up the pipeline by adding a post-processing step to tpot.fit that trims out unnecessary operators from the optimized pipeline. This will be trivial after incorporating the refactor in #63, since we could just add an attribute to the base classes identifying whether or not an operator can be the pipeline terminus. Something like:

class BasicOperator(object):
    ...
    # Transformation operators cannot terminate a pipeline
    _terminal_operator = False
    ...

class LearnerOperator(object):
    ...
    # Learners (classifiers) can terminate a pipeline
    _terminal_operator = True
    ...
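
With that flag in place, the trim itself could just be a backwards scan over the evolved pipeline. A rough sketch, assuming the pipeline is materialized as an ordered list of operator instances (the representation here is hypothetical):

def trim_trailing_operators(operators):
    # Walk backwards to the last operator that is allowed to end a
    # pipeline (i.e., a learner); everything after it has no effect
    # on the final predictions and can be dropped.
    for i in range(len(operators) - 1, -1, -1):
        if operators[i]._terminal_operator:
            return operators[:i + 1]
    return operators  # no terminal operator found; leave unchanged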

I felt it'd probably be better to create a new issue for this topic rather than unilaterally adding a commit downstream of the #63 HEAD.

kadarakos commented Jan 5, 2016
Could we solve this with a multi-objective fitness function? It could combine:

  • reward accuracy/F-score
  • penalize the number of operators
  • penalize runtime

As far as I understand, DEAP supports multi-objective optimization out of the box.
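
For reference, a minimal sketch of what that could look like with DEAP's fitness machinery (the weight values here are just one possible encoding of one reward and two penalties):

from deap import base, creator, tools

# Maximize accuracy; minimize operator count and runtime
creator.create("FitnessMulti", base.Fitness, weights=(1.0, -1.0, -1.0))
creator.create("Individual", list, fitness=creator.FitnessMulti)

# The evaluation function would then return a matching tuple,
# e.g. (accuracy, n_operators, runtime), and selection over the
# resulting Pareto front could use tools.selNSGA2.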

rhiever commented Jan 6, 2016

I'm currently developing and testing a multi-objective version of TPOT. Including various measures of model complexity as an axis to minimize does indeed eliminate cases like this.


kadarakos commented Jan 6, 2016
What kind of measures are you using for complexity?

rhiever commented Jan 6, 2016

The two you mentioned -- number of pipeline operators and runtime -- but also the number of features in the pipeline. Interested to hear more ideas if you have some.
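
Putting those axes together, the evaluation could return one value per objective, e.g. with weights=(1.0, -1.0, -1.0, -1.0) in the DEAP sketch above. Illustrative only -- compile_pipeline and score_with_timing are hypothetical helpers, not TPOT API:

def evaluate(individual):
    pipeline = compile_pipeline(individual)          # hypothetical helper
    accuracy, runtime = score_with_timing(pipeline)  # hypothetical helper
    return (accuracy,
            len(individual),        # number of pipeline operators
            runtime,
            pipeline.n_features)    # features in the pipeline (assumed attribute)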

On Wed, Jan 6, 2016 at 12:49 AM, kadarakos notifications@github.com wrote:

What kind of measures are you using for complexity?


Reply to this email directly or view it on GitHub
#70 (comment).

Randal S. Olson, Ph.D.
Postdoctoral Researcher, Institute for Biomedical Informatics
University of Pennsylvania

E-mail: rso@randalolson.com | Twitter: @randal_olson
https://twitter.com/randal_olson
http://www.randalolson.com

rhiever commented Aug 13, 2016

This is now encapsulated in #206, so I'm going to close this issue.
