Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Users can now pass in all valid kwargs to Estimators #1157

Merged
merged 2 commits into from
Sep 15, 2020

Conversation

freddyaboulton
Copy link
Contributor

Pull Request Description

Fixes #1155


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@codecov
Copy link

codecov bot commented Sep 10, 2020

Codecov Report

Merging #1157 into main will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1157   +/-   ##
=======================================
  Coverage   99.91%   99.91%           
=======================================
  Files         197      197           
  Lines       11663    11705   +42     
=======================================
+ Hits        11653    11695   +42     
  Misses         10       10           
Impacted Files Coverage Δ
...ents/estimators/classifiers/logistic_regression.py 100.00% <ø> (ø)
...ents/estimators/classifiers/catboost_classifier.py 100.00% <100.00%> (ø)
...ts/estimators/classifiers/elasticnet_classifier.py 100.00% <100.00%> (ø)
...onents/estimators/regressors/catboost_regressor.py 100.00% <100.00%> (ø)
.../tests/component_tests/test_catboost_classifier.py 100.00% <100.00%> (ø)
...l/tests/component_tests/test_catboost_regressor.py 100.00% <100.00%> (ø)
evalml/tests/component_tests/test_components.py 100.00% <100.00%> (ø)
evalml/tests/component_tests/test_en_classifier.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b38b51b...62d3b30. Read the comment docs.

Copy link
Collaborator

@jeremyliweishih jeremyliweishih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me! I really like the coverage but just had one clarifying question.

estimator = estimator_class()
if estimator._component_obj is None:
pytest.skip(f"Skipping {estimator_class} because does not have component object.")
params = estimator._component_obj.get_params()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this test if all estimators accept the default keyword arguments and values of the component_obj?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

Copy link
Contributor

@angela97lin angela97lin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looooks good! Thanks for the cleanup 😁

@freddyaboulton freddyaboulton merged commit 54f6767 into main Sep 15, 2020
This was referenced Sep 17, 2020
'silent': silent}
if kwargs.get('allow_writing_files', False):
warnings.warn("Parameter allow_writing_files is being set to False in CatBoostRegressor")
kwargs["allow_writing_files"] = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@freddyaboulton it looks like this code updates allow_writing_files so that it will always be False. Why is this necessary? If a user wants to set this parameter, why not allow them? We can still encode a default like so, without adding this parameter to the parameters dict:

cb_parameters['allow_writing_files'] = kwargs.get('allow_writing_files', False)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question also applies to the changes for catboost classifier and elasticnet classifier as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @freddyaboulton .

Options:

  • Allow users to set allow_writing_files to whatever they want. Make it a named field.
  • Add the ability to have an allow-list and deny-list for parameters. I.e. if someone passes in allow_writing_files, we raise an error.
  • Keep code as-is

We liked the first option. Let's expose allow_writing_files as named value in init, default False, and delete the warning code here

@@ -26,8 +26,6 @@ def __init__(self, penalty="l2", C=1.0, n_jobs=-1, random_state=0, **kwargs):
parameters.update(kwargs)

lr_classifier = LogisticRegression(random_state=random_state,
multi_class="auto",
solver="lbfgs",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@freddyaboulton is there a reason why we shouldn't add these as named parameters to init, so that the defaults are still clear?

def __init__(self, ..., multi_class='auto', solver='lbfgs', ..., **kwargs):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @freddyaboulton .Plan: let's do this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't pass in certain kwargs to LogisticRegression or EN classifiers
4 participants