Abstract function for the classifier pipeline operators #57
input_df.loc[:, sf_identifier] = input_df['guess'].values

return input_df
return TPOT._train_model_and_predict(input_df, DecisionTreeClassifier, max_features=max_features, max_depth=max_depth, random_state=42)
I think the TPOT. is extraneous: It should be possible to call _train_model_and_predict() without it.
Great work as always! Thank you for adding the test. Addressing your questions:
AFAIK, there is no performance difference between the two. The difference between a static and a regular class function is whether the function relies on the current state of the object, i.e., it affects the scope. Static functions always have their own scope, whereas class functions share the scope of the object they're called on. I think we have to ask ourselves: Do we want users to use these functions outside of TPOT? e.g.,
from tpot import random_forest
result = random_forest(...)
... I originally envisioned the
Name looks fine to me!
That's a tough one, since we're never really sure where the pipeline operator functions will be getting their input from and where they're sending it to. I think putting in that quick check at the beginning of the function is the best we can do.
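To make the static versus regular method distinction discussed above concrete, here is a minimal sketch. The class Pipeline and its methods are hypothetical stand-ins invented for illustration; they are not taken from the TPOT code base.

```python
class Pipeline:
    """Hypothetical stand-in for a class such as TPOT (illustration only)."""

    def __init__(self, seed):
        self.seed = seed

    @staticmethod
    def scale(values, factor):
        # Static method: receives no instance, so the result depends
        # only on the arguments passed in.
        return [v * factor for v in values]

    def scale_by_seed(self, values):
        # Regular method: reads self.seed, so it needs an instance.
        # A static method is still looked up through the class (or self);
        # a bare scale(...) here would raise NameError.
        return Pipeline.scale(values, self.seed)


# The static method can be called on the class without creating an object.
print(Pipeline.scale([1, 2], 3))               # [3, 6]
# The regular method needs an instance that carries state.
print(Pipeline(seed=2).scale_by_seed([1, 2]))  # [2, 4]
```

Whether operators should be static therefore mostly comes down to whether they need instance state and whether they should be callable without constructing an object.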
IMO we should keep
I settled on moving
I see. We should definitely remove all of the
…sing to the sklearn model easier, changed static model test name
removed static decorators for static models and _train_model_and_predict()
Cleaning up.
I'll try to squash some of my commits for brevity's sake.
This looks ready to merge now. Anything else you're planning on adding to this PR?
I think I'm good for now. If you think it's ready, then great!
Abstract function for the classifier pipeline operators
Per #43, this abstracts the code shared between the static models into a static method that each model can call. In addition, I built a test that indirectly exercises all the static models and this new abstract method, and renamed the num_trees parameter in random_forest to n_estimators to facilitate testing.
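The pattern described above might look roughly like the following dependency-free sketch. Everything here is an assumption made for illustration: the Operators class, the toy MajorityClassifier, the row dictionaries, and the 'group' key are hypothetical, and the real code works on a pandas DataFrame with scikit-learn models. Only the idea of passing a model class plus keyword arguments to a shared _train_model_and_predict helper, and the 'guess' output, come from the PR.

```python
from collections import Counter


class MajorityClassifier:
    """Toy model with an sklearn-like fit/predict interface (illustration only)."""

    def __init__(self, default=0):
        self.default = default
        self.label_ = default

    def fit(self, features, labels):
        # Remember the most frequent training label.
        if labels:
            self.label_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, features):
        # Predict the remembered label for every row.
        return [self.label_ for _ in features]


class Operators:
    """Hypothetical container for pipeline operators."""

    @staticmethod
    def _train_model_and_predict(rows, model_class, **kwargs):
        # Shared steps every operator would otherwise repeat:
        # select training rows, build the model, fit it, predict for all rows.
        training = [r for r in rows if r["group"] == "training"]
        model = model_class(**kwargs)
        model.fit([r["features"] for r in training],
                  [r["class"] for r in training])
        guesses = model.predict([r["features"] for r in rows])
        # Return copies of the rows with a 'guess' entry added.
        return [dict(r, guess=g) for r, g in zip(rows, guesses)]

    @staticmethod
    def majority(rows, default=0):
        # Each operator reduces to one call naming its model and parameters.
        return Operators._train_model_and_predict(
            rows, MajorityClassifier, default=default)


rows = [
    {"features": [0], "class": 1, "group": "training"},
    {"features": [1], "class": 1, "group": "training"},
    {"features": [2], "class": 0, "group": "testing"},
]
result = Operators.majority(rows)
print([r["guess"] for r in result])  # [1, 1, 1]
```

With this shape, adding another operator means writing one short method that names a different model class and its parameters, while the fit and predict logic lives in a single place.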
Potential improvements: