Freeze Hyperparameters for AutoMLSearch #1284

Merged 25 commits on Oct 30, 2020

@bchen1116 bchen1116 commented Oct 9, 2020

fix #767

Right now, the implementation assumes that the hyperparameter ranges the user provides include each component's default value. If a default value falls outside the provided ranges, we catch the error that AutoML raises and warn the user. Hyperparameter validation will be tracked through this issue.
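The assumption above can be checked before a search starts. The following is a minimal sketch, assuming each range is either a list of allowed values or a `(low, high)` tuple of inclusive bounds. The helper name `defaults_outside_ranges` and the example defaults are hypothetical and are not part of EvalML.

```python
def defaults_outside_ranges(defaults, custom_ranges):
    """Return {(component, parameter): default} for defaults missing from the custom ranges.

    Hypothetical helper, not an EvalML API. A range is either a list of
    allowed values or a (low, high) tuple of inclusive numeric bounds.
    """
    problems = {}
    for component, ranges in custom_ranges.items():
        for name, allowed in ranges.items():
            # Parameters without a known default cannot be checked; skip them.
            if name not in defaults.get(component, {}):
                continue
            default = defaults[component][name]
            if isinstance(allowed, tuple):
                low, high = allowed
                inside = low <= default <= high
            else:
                inside = default in allowed
            if not inside:
                problems[(component, name)] = default
    return problems


# Illustrative values only; the real defaults live on each component.
defaults = {"Imputer": {"numeric_impute_strategy": "mean"}}
ok_ranges = {"Imputer": {"numeric_impute_strategy": ["mean", "median"]}}
bad_ranges = {"Imputer": {"numeric_impute_strategy": ["median"]}}

print(defaults_outside_ranges(defaults, ok_ranges))
print(defaults_outside_ranges(defaults, bad_ranges))
```

A check like this would let the search fail early with a message that names the offending parameter, rather than surfacing the tuner's error partway through.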

Since we don't have hyperparameter validation yet, if a user creates a new pipeline with EvalML components, redefines a hyperparameter range so that it excludes the default value, and calls AutoMLSearch on this pipeline, AutoMLSearch will raise an error saying that the default values aren't in the search space. This will ideally be addressed in the issue above. Additionally, without hyperparameter validation we don't catch or raise an error when a user misspells a component name, so AutoMLSearch runs with no errors:

custom_hyperparameters = {
    # misspelled `Imputer`
    "Inpute": {
        "numeric_impute_strategy": ["mean"]
    }
}
# will run with no errors
pipeline = make_pipelines(X, y, estimator, 'binary', custom_hyperparameters)
# will run and compute with original `Imputer` hyperparameters, and quietly return the results
AutoMLSearch(problem_type='binary', allowed_pipelines=[pipeline]).search(X, y)

This can be fixed once hyperparameter validation is implemented in the aforementioned issue.
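For illustration, the kind of validation described above could look roughly like the sketch below. It is not EvalML code: `find_unknown_components` is a hypothetical helper, and the list of component names is made up for the example. It flags keys that match no known component and suggests the closest name using the standard library's `difflib`.

```python
import difflib


def find_unknown_components(custom_hyperparameters, component_names):
    """Map each unrecognized key to its closest known component name (or None).

    Hypothetical helper, not an EvalML API.
    """
    unknown = {}
    for key in custom_hyperparameters:
        if key not in component_names:
            matches = difflib.get_close_matches(key, component_names, n=1)
            unknown[key] = matches[0] if matches else None
    return unknown


custom_hyperparameters = {"Inpute": {"numeric_impute_strategy": ["mean"]}}
# Assumed component names, for the example only.
component_names = ["Imputer", "One Hot Encoder", "Random Forest Classifier"]

unknown = find_unknown_components(custom_hyperparameters, component_names)
for key, suggestion in unknown.items():
    print(f"Unknown component {key!r}; did you mean {suggestion!r}?")
```

Raising on a non-empty result would turn the silent fallback to the original `Imputer` hyperparameters into an explicit error.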

Docs here

@bchen1116 bchen1116 self-assigned this Oct 9, 2020

codecov bot commented Oct 9, 2020

Codecov Report

Merging #1284 into main will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1284      +/-   ##
==========================================
+ Coverage   99.95%   99.95%   +0.01%     
==========================================
  Files         213      213              
  Lines       13857    13928      +71     
==========================================
+ Hits        13850    13921      +71     
  Misses          7        7              
| Impacted Files | Coverage Δ |
|---|---|
| ...lml/automl/automl_algorithm/iterative_algorithm.py | 100.00% <100.00%> (ø) |
| evalml/pipelines/utils.py | 100.00% <100.00%> (ø) |
| evalml/tests/automl_tests/test_automl.py | 100.00% <100.00%> (ø) |
| evalml/tests/pipeline_tests/test_pipelines.py | 100.00% <100.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1ec2ee4...bbcf93f.

@bchen1116 bchen1116 marked this pull request as ready for review October 9, 2020 18:38
try:
    self._tuners[pipeline.name].add(pipeline.parameters, score_to_minimize)
except ValueError as e:
    if 'is not within the bounds of the space' in str(e):
Contributor Author

Only raise the hyperparameter range error if the tuner's ValueError is the out-of-bounds error; otherwise, re-raise the original error.
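The pattern in this comment can be sketched in isolation. This is a simplified illustration rather than the code in the PR: `ToyTuner`, `add_result`, and the wording of the replacement message are hypothetical stand-ins, and only the matched substring comes from the diff above.

```python
class ToyTuner:
    """Stand-in for a tuner that rejects parameters outside its search space."""

    def __init__(self, low, high):
        self.low, self.high = low, high

    def add(self, value, score):
        if not self.low <= value <= self.high:
            raise ValueError(f"Point ({value}) is not within the bounds of the space")
        if score != score:  # NaN is the only value not equal to itself
            raise ValueError("score must be a number")


def add_result(tuner, value, score):
    """Record a result, translating only the out-of-bounds error."""
    try:
        tuner.add(value, score)
    except ValueError as e:
        if "is not within the bounds of the space" in str(e):
            raise ValueError(
                f"Default value {value} is not within the provided hyperparameter range"
            ) from e
        raise  # any other ValueError propagates unchanged
```

Matching on the message text is brittle if the tuner's wording changes, but it keeps unrelated `ValueError`s from being mislabeled as range problems.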

@dsherry dsherry left a comment

@bchen1116 looks great! I left a couple asks, and some suggestions on the docs content, but LGTM :D

evalml/automl/automl_algorithm/automl_algorithm.py (2 outdated threads, resolved)
docs/source/user_guide/automl.ipynb (3 outdated threads, resolved)
evalml/pipelines/utils.py (1 outdated thread, resolved)

dsherry commented Oct 29, 2020

@bchen1116 Re: the bug you ran into with the docs here, I filed it as #1367.

We already know one workaround: create a woodwork DataTable before calling make_pipelines, so that the datetime feature has the right type (datetime64 instead of object) in the pandas DataFrame passed to make_pipelines. It occurs to me that another workaround for you in this PR would be to use a dataset which doesn't have a datetime feature 😆

@bchen1116 bchen1116 merged commit e2e5c5f into main Oct 30, 2020
@dsherry dsherry mentioned this pull request Nov 24, 2020
@freddyaboulton freddyaboulton deleted the bc_767_hyperparameters branch May 13, 2022 15:25
Development

Successfully merging this pull request may close these issues.

Ability to freeze specific hyperparameters