
New classifiers #147

Merged: 9 commits, May 25, 2016
Conversation

@danthedaniel (Contributor) commented May 16, 2016

What does this PR do?

Adds 7 new classifiers to TPOT

Where should the reviewer start?

Start at the _ada_boost() method in the TPOT class and read down to the _p_aggr() method.

There are also relevant export code and documentation changes that should be checked for an LGTM.

How should this PR be tested?

Travis should test the classifiers themselves, but a few pipelines should also be generated and exported by hand to confirm that the exported code actually runs, e.g. along the lines of the sketch below.
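For reference, a manual smoke test could look roughly like this. It assumes the TPOT API at the time of this PR (a TPOT class exposing fit, score, and export); the dataset, parameter values, and output file name are illustrative only, not taken from the PR.

    from tpot import TPOT
    from sklearn.datasets import load_digits

    # Load a small classification dataset and make a simple 75/25 split.
    digits = load_digits()
    X, y = digits.data, digits.target
    split = int(0.75 * len(X))
    X_train, y_train = X[:split], y[:split]
    X_test, y_test = X[split:], y[split:]

    # Evolve pipelines for a few generations so the new operators get exercised.
    tpot = TPOT(generations=5, verbosity=2)
    tpot.fit(X_train, y_train)
    print(tpot.score(X_test, y_test))

    # Export the best pipeline, then run the generated script to confirm
    # that the export code for the new classifiers produces working Python.
    tpot.export('exported_pipeline.py')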

What are the relevant issues?

#128

Questions:

  • Do the docs need to be updated?

Yes. I have already updated them.

  • Does this PR add new (Python) dependencies?

No.

@coveralls (Coverage Status): Coverage increased (+0.6%) to 57.965% when pulling 879bd54 on teaearlgraycold:new_classifiers into 5aea926 on rhiever:master.

Review thread on the following hunk (the end of the new AdaBoost operator):

    Also adds the classifier's predictions as a 'SyntheticFeature' column.
    """
    return self._train_model_and_predict(input_df, AdaBoostClassifier,
Contributor: learning_rate should be constrained to be strictly greater than 0.

Looking at the docs further, I also think that for the AdaBoostClassifier, we should allow n_estimators to be evolved as well, with a max of 500 estimators. The AdaBoostClassifier is one unique case where there is a tradeoff between n_estimators and learning_rate.

Contributor Author (danthedaniel):

Should I just do max(learning_rate, 0.001)? I'm not sure what the exact minimum should be here.

@rhiever (Contributor), May 17, 2016:

Check xgradient_boosting for an example: https://github.com/rhiever/tpot/blob/master/tpot/tpot.py#L582

learning_rate = max(0.0001, learning_rate)

And for the C param, check _logistic_regression for an example: https://github.com/rhiever/tpot/blob/master/tpot/tpot.py#L516

C = max(0.0001, C)

0.0001 seems like a fine minimum value for now until we finish the sklearn benchmark and figure out an ideal range.
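Putting those two suggestions together, the AdaBoost operator could clamp both evolved parameters before handing them to sklearn. The sketch below mirrors the hunk quoted above, but the signature and the keyword arguments forwarded through _train_model_and_predict are assumptions, not the merged implementation:

    from sklearn.ensemble import AdaBoostClassifier

    def _ada_boost(self, input_df, learning_rate, n_estimators):
        """Fit an AdaBoostClassifier with clamped, evolvable parameters."""
        # Keep learning_rate strictly positive, per the review discussion.
        learning_rate = max(0.0001, learning_rate)
        # Allow n_estimators to evolve, but cap it at 500.
        n_estimators = min(500, max(1, int(n_estimators)))
        return self._train_model_and_predict(input_df, AdaBoostClassifier,
                                              learning_rate=learning_rate,
                                              n_estimators=n_estimators)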

@rhiever (Contributor) commented May 17, 2016

When writing test cases for classifiers, test with normal parameters as well as extreme parameters: negative values, out-of-bounds values, etc. That will help catch issues where we're allowing invalid parameters to be passed to the various models.
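For example, a parameter-clamping check along these lines would catch a regression in the max(0.0001, ...) guard; the helper and test names here are purely illustrative and are not the test code added in this PR:

    def clamp_learning_rate(learning_rate):
        # Mirrors the max(0.0001, learning_rate) pattern suggested above.
        return max(0.0001, learning_rate)

    def test_learning_rate_clamping():
        # Negative, zero, and absurdly small values are all raised to the floor.
        for bad in (-5.0, 0.0, 1e-12):
            assert clamp_learning_rate(bad) == 0.0001
        # Reasonable values pass through unchanged.
        assert clamp_learning_rate(0.5) == 0.5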

@coveralls (Coverage Status): Coverage increased (+1.7%) to 59.055% when pulling c834e7a on teaearlgraycold:new_classifiers into a4e00b2 on rhiever:master.

@coveralls (Coverage Status): Coverage increased (+1.7%) to 59.055% when pulling 36e0acd on teaearlgraycold:new_classifiers into a4e00b2 on rhiever:master.

@danthedaniel (Contributor Author): Now addresses #151

@coveralls (Coverage Status): Coverage increased (+2.2%) to 59.588% when pulling ff3cb08 on teaearlgraycold:new_classifiers into a4e00b2 on rhiever:master.

@rhiever rhiever merged commit 5ebd6d1 into EpistasisLab:master May 25, 2016
@danthedaniel danthedaniel deleted the new_classifiers branch August 19, 2016 19:18