Abstract function for the classifier pipeline operators #57

bartleyn · 2015-12-14T00:50:35Z

Per #43, this PR abstracts the code shared between the static models into a static method that each model can call. In addition, I built a test that indirectly exercises all the static models and this new abstract method, and renamed the num_trees parameter in random_forest to n_estimators to facilitate testing.
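As a rough illustration of the pattern described above, here is a minimal sketch of a shared helper that receives a model class plus keyword arguments. StubForest, Operators, and the features/labels signature are hypothetical stand-ins chosen to keep the example dependency-free; per the diff, the real helper takes input_df and an sklearn class.

```python
class StubForest:
    """Hypothetical stand-in for an sklearn-style classifier."""

    def __init__(self, n_estimators=10, random_state=None):
        self.n_estimators = n_estimators
        self.random_state = random_state
        self.majority = None

    def fit(self, features, labels):
        # Toy "training": remember the most common label.
        self.majority = max(set(labels), key=labels.count)
        return self

    def predict(self, features):
        return [self.majority for _ in features]


class Operators:
    """Hypothetical holder for pipeline operators (not the real TPOT class)."""

    @staticmethod
    def _train_model_and_predict(features, labels, model_class, **kwargs):
        # Shared logic: build the model from keyword arguments, fit, predict.
        model = model_class(**kwargs)
        model.fit(features, labels)
        return model.predict(features)

    @staticmethod
    def random_forest(features, labels, n_estimators):
        # The operator's parameter name matches the constructor's, so it can
        # be forwarded as-is; a name like num_trees would need remapping.
        return Operators._train_model_and_predict(
            features, labels, StubForest,
            n_estimators=n_estimators, random_state=42,
        )


predictions = Operators.random_forest([[0], [1], [2]], ["a", "a", "b"], n_estimators=5)
```

Because the helper forwards **kwargs untouched, each operator shrinks to a single call, which is presumably why matching the constructor's parameter names makes testing easier.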

Potential improvements:

Maybe it's just me, but it feels a little odd to call a static function from another static function. Does it make sense to move this method (and perhaps some others) to a utility function module? Either way, is it faster to call this abstract method as a static method inside the class, a class method inside the class, or something else?
Is there a more appropriate name for the method?
Is there a better place to do the input validation (i.e. if there are only three columns in the input_df) than within this nested function call?

rhiever · 2015-12-14T21:23:17Z

tpot/tpot.py

-        input_df.loc[:, sf_identifier] = input_df['guess'].values
-
-        return input_df
+        return TPOT._train_model_and_predict(input_df, DecisionTreeClassifier, max_features=max_features, max_depth=max_depth, random_state=42)


I think the TPOT. prefix is extraneous: it should be possible to call _train_model_and_predict() without it.

rhiever · 2015-12-15T14:49:23Z

Great work as always! Thank you for adding the test.

Addressing your questions:

Maybe it's just me, but it feels a little odd to call a static function from another static function. Does it make sense to move this method (and perhaps some others) to a utility function module? Either way, is it faster to call this abstract method as a static method inside the class, a class method inside the class, or something else?

AFAIK, there is no performance difference between the two. The difference between a static method and a regular (instance) method is whether the function relies on the current state of the object. A static method receives no implicit self, so it cannot read or modify instance state, whereas an instance method receives the object it is called on as self and shares that object's state.
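To make the distinction concrete, here is a small sketch; the Counter class is made up for illustration and has nothing to do with TPOT.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def increment(self):
        # Instance method: self is the object the method was called on,
        # so the method can read and update that object's state.
        self.count += 1
        return self.count

    @staticmethod
    def double(value):
        # Static method: no implicit self, so it only sees its arguments.
        return value * 2


c = Counter()
c.increment()
doubled_via_class = Counter.double(4)
doubled_via_instance = c.double(4)
```

Both ways of calling double() behave the same; only increment() depends on which object it is called on.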

I think we have to ask ourselves: Do we want users using these functions outside of TPOT? e.g.,

from tpot import random_forest

result = random_forest(...)
...

I originally envisioned the export() function doing something like that, but I ended up exporting the pipelines directly to sklearn code (even if it looks ugly). Perhaps we should change all of the functions to regular class methods since we don't really want users doing that. I don't think any of the functions in TPOT were really made to be used outside of TPOT.

Is there a more appropriate name for the method?

Name looks fine to me!

Is there a better place to do the input validation (i.e. if there are only three columns in the input_df) than within this nested function call?

That's a tough one, since we're never really sure where the pipeline operator functions will be getting their input from and where they're sending it to. I think putting in that quick check at the beginning of the function is the best we can do.
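A quick check of that kind might look like the sketch below. It uses a plain dict of column lists as a stand-in for input_df so that it runs without pandas. The 'guess' column name comes from the diff; the other two bookkeeping column names and the predict_fn parameter are assumptions made for illustration.

```python
# "guess" appears in the diff; "class" and "group" are assumed names.
BOOKKEEPING = ("class", "group", "guess")


def train_model_and_predict(table, predict_fn):
    """table maps column name -> list of values (stand-in for a DataFrame)."""
    # Early exit: with only the three bookkeeping columns there are no
    # feature columns left to train on, so hand the input back unchanged.
    if len(table) == len(BOOKKEEPING):
        return table

    feature_names = [name for name in table if name not in BOOKKEEPING]
    rows = list(zip(*(table[name] for name in feature_names)))
    result = dict(table)
    result["guess"] = [predict_fn(row) for row in rows]
    return result


no_features = {"class": [0, 1], "group": ["training", "testing"], "guess": [0, 0]}
with_feature = dict(no_features, x=[0.2, 0.9])

unchanged = train_model_and_predict(no_features, lambda row: 1)
updated = train_model_and_predict(with_feature, lambda row: int(row[0] > 0.5))
```

Putting the guard at the top of the shared helper means every operator gets it for free, whatever produced its input.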

rhiever · 2015-12-15T14:51:36Z

IMO we should keep _train_model_and_predict() within the TPOT class. It's simply an abstraction of the TPOT classifier operators.

bartleyn · 2015-12-15T15:31:38Z

I settled on moving _train_model_and_predict() out of the class to see if we could remove the TPOT. that prefaced the function call in the static models; I agree that ideally the function should remain in the TPOT class. However, without passing an instantiated TPOT object into the static model call, I'm not sure how to avoid the static TPOT._train_model_and_predict().
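The scoping issue can be reproduced in a few lines. The Pipeline class below is a made-up stand-in, not TPOT code: inside a method body, names defined in the class are not in scope, so a sibling static method has to be reached through the class (or through an instance).

```python
class Pipeline:
    @staticmethod
    def _helper(value):
        return value + 1

    @staticmethod
    def unqualified(value):
        # Class attributes are not visible as bare names inside a method
        # body, so this lookup falls through to globals and fails.
        return _helper(value)

    @staticmethod
    def qualified(value):
        # Works, at the cost of repeating the class name.
        return Pipeline._helper(value)


try:
    Pipeline.unqualified(1)
    outcome = "ok"
except NameError:
    outcome = "NameError"

qualified_result = Pipeline.qualified(1)
```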

rhiever · 2015-12-15T15:36:08Z

I see. We should definitely remove all of the @staticmethod decorators then, and add self as the first parameter to these functions. That will allow us to call the internal functions as self.function_name().
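A minimal sketch of what that change could look like, with StubTree and Operators as hypothetical stand-ins for the sklearn model and the TPOT class:

```python
class StubTree:
    """Hypothetical stand-in for an sklearn-style classifier."""

    def __init__(self, max_depth=None):
        self.max_depth = max_depth

    def fit_predict(self, labels):
        # Toy behaviour: predict the first label everywhere.
        return [labels[0]] * len(labels)


class Operators:
    # No static decorators: every operator takes self as its first parameter.
    def _train_model_and_predict(self, labels, model_class, **kwargs):
        model = model_class(**kwargs)
        return model.fit_predict(labels)

    def decision_tree(self, labels, max_depth):
        # The helper is reached through self, with no class-name prefix.
        return self._train_model_and_predict(labels, StubTree, max_depth=max_depth)


ops = Operators()
guesses = ops.decision_tree([1, 0, 1], max_depth=3)
```

The trade-off is that callers now need an instance, which fits the earlier point that these operators are not meant to be used outside of TPOT.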

bartleyn · 2015-12-15T16:40:58Z

I'll try to squash some of my commits for brevity's sake.

rhiever · 2015-12-15T23:37:07Z

This looks ready to merge now. Anything else you're planning on adding to this PR?

bartleyn · 2015-12-16T04:54:27Z

I think I'm good for now. If you think it's ready, then great!

added abstract train_model_and_predict function for each classifier, added test

a2fae6c

Changed random_forest num_trees to n_estimators to make parameter passing to the sklearn model easier, changed static model test name removed static decorators for static models and _train_model_and_predict() Cleaning up.

7e5c4c0

bartleyn force-pushed the master branch from 34a23cb to 7e5c4c0 Compare December 15, 2015 16:42

rhiever pushed a commit that referenced this pull request Dec 16, 2015

Merge pull request #57 from bartleyn/master

c946b15

Abstract function for the classifier pipeline operators

rhiever merged commit c946b15 into EpistasisLab:master Dec 16, 2015

AIAdventures mentioned this pull request Jun 6, 2017

Titanic example -problem with 2nd last cell. #492

Closed

saddy001 mentioned this pull request Mar 20, 2018

Segfault on optimization process #676

Closed
