New classifiers #147
Conversation
    Also adds the classifier's predictions as a 'SyntheticFeature' column.
    """
    return self._train_model_and_predict(input_df, AdaBoostClassifier,
`learning_rate` should be capped at > 0. Looking at the docs further, I also think that for the AdaBoostClassifier, we should allow `n_estimators` to be evolved as well, with a max of 500 estimators. The AdaBoostClassifier is one unique case where there is a tradeoff between `n_estimators` and `learning_rate`.
Should I just do `max(learning_rate, 0.001)`? Not sure what the exact minimum should be here.
Check `xgradient_boosting` for an example: https://github.com/rhiever/tpot/blob/master/tpot/tpot.py#L582

`learning_rate = max(0.0001, learning_rate)`

And for the `C` param, check `_logistic_regression` for an example: https://github.com/rhiever/tpot/blob/master/tpot/tpot.py#L516

`C = max(0.0001, C)`

0.0001 seems like a fine minimum value for now until we finish the sklearn benchmark and figure out an ideal range.
When writing test cases for classifiers, test with normal parameters as well as extreme parameters: negative values, out-of-bounds values, etc. That will help catch issues where we're allowing invalid parameters to be passed to the various models.

Now addresses #151
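The extreme-parameter testing suggested above could be sketched like this; the clamp helper is illustrative (mirroring the `max(0.0001, learning_rate)` pattern from the thread), not TPOT's actual test suite:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def clamp_learning_rate(learning_rate):
    # Illustrative clamp, following the max(0.0001, learning_rate)
    # pattern discussed in this thread.
    return max(0.0001, learning_rate)

def test_extreme_learning_rate():
    # A tiny linearly separable dataset is enough to exercise fit/predict.
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([0, 0, 1, 1])
    # Negative or zero learning rates would raise inside sklearn;
    # clamping first keeps the pipeline from crashing on evolved values.
    for raw in (-1.0, 0.0, 0.5):
        clf = AdaBoostClassifier(learning_rate=clamp_learning_rate(raw),
                                 n_estimators=10)
        clf.fit(X, y)
        assert set(clf.predict(X)) <= {0, 1}
```

Parametrizing over both normal and out-of-bounds values in one loop catches the case where a guard exists for one parameter but not another.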
What does this PR do?
Adds 7 new classifiers to TPOT
Where should the reviewer start?
At the `_ada_boost()` method in the TPOT class, going all the way down to the `_p_aggr()` method.
There is also relevant export code and docs that should be checked for an LGTM.
How should this PR be tested?
Travis should test the classifiers themselves, but a few pipelines could be made and exported to confirm that the code indeed works.
What are the relevant issues?
#128
Questions:
Yes. I have already updated them.
No.