
Add add_to_leaderboard() to AutoSearchBase #874

Merged: 11 commits merged into master from js_628_leader on Jun 26, 2020
Conversation

jeremyliweishih (Collaborator)

Fixes #628.

codecov bot commented Jun 23, 2020

Codecov Report

Merging #874 into master will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #874   +/-   ##
=======================================
  Coverage   99.75%   99.75%           
=======================================
  Files         195      195           
  Lines        8532     8585   +53     
=======================================
+ Hits         8511     8564   +53     
  Misses         21       21           
Impacted Files Coverage Δ
evalml/automl/auto_search_base.py 98.00% <100.00%> (+0.08%) ⬆️
evalml/tests/automl_tests/test_autobase.py 100.00% <100.00%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jeremyliweishih requested a review from dsherry June 23, 2020 20:24
@jeremyliweishih marked this pull request as ready for review June 23, 2020 20:24
@jeremyliweishih requested a review from eccabay June 23, 2020 20:24
@jeremyliweishih (Collaborator, Author)

After speaking to @dsherry, we will wait on #719 and see how it affects the evaluate API before merging this in and updating this API.

@dsherry (Contributor) commented Jun 24, 2020

@jeremyliweishih I think it's fine to merge this before #719, but we should plan to get another PR up once that's merged.

@dsherry (Contributor) left a comment

@jeremyliweishih looks good!

So now users could do:

automl = AutoClassificationSearch(...)
automl.search(X_train, y_train)
custom_pipeline = CustomPipeline({'custom': 'parameters', ...})
automl.add_to_rankings(custom_pipeline, X_train, y_train)

@jeremyliweishih I have a few more thoughts we should address before merging:

  • I left a comment about renaming the main method to add_to_rankings or similar, since we don't use the word "leaderboard" anywhere. That name was my idea originally haha oops
  • We should add a check that the data provided here matches the training data passed to search, right? It feels inappropriate to allow other data to be used. Otherwise the CV scores aren't fully comparable. We could check the input data shape--in particular the number and name of the columns should be identical. We could also compute/store/compare a hash of the X and y data, but that may be too much to add for now.
  • Add a test to ensure passing a trained vs untrained pipeline into this method doesn't produce different results
  • What happens when the user tries to add a pipeline which is identical to one which was already added? Will this create a new entry in the results structure and in full_rankings? Ideally it would not. Suggested change: check if the pipeline class/params have already been evaluated and return early if so.
  • As in search, what happens if X or y are np/list type? They should get converted to pandas, as in the sketch after this list.
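
To make the suggestions above concrete, here is a minimal sketch (not the code in this PR) of how the pandas conversion, the training-data check, and the duplicate check could fit together; the attribute self._X_columns is hypothetical and only stands in for however the training columns end up being stored.

import pandas as pd

def add_to_rankings(self, pipeline, X, y):
    # Convert np/list inputs to pandas, mirroring what search() does.
    if not isinstance(X, pd.DataFrame):
        X = pd.DataFrame(X)
    if not isinstance(y, pd.Series):
        y = pd.Series(y)
    # Hypothetical guard: the columns must match the data used in search().
    if list(X.columns) != list(self._X_columns):
        raise ValueError("Data passed to add_to_rankings must match the data passed to search()")
    # Return early if a pipeline with the same name and parameters was already
    # evaluated, so duplicates don't add new rows to full_rankings.
    pipeline_rows = self.full_rankings[self.full_rankings['pipeline_name'] == pipeline.name]
    for parameters in pipeline_rows['parameters']:
        if pipeline.parameters == parameters:
            return
    self._evaluate(pipeline, X, y, raise_errors=True)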

@jeremyliweishih requested a review from dsherry June 25, 2020 19:35
@dsherry (Contributor) left a comment

Looks good! I left a comment on the impl return value and a proposal on how to simplify it. Let's close that conversation, but otherwise this is ready to go.

@jeremyliweishih merged commit 52e0959 into master Jun 26, 2020
# Return early if a pipeline with identical parameters was already evaluated.
for parameter in pipeline_rows['parameters']:
    if pipeline.parameters == parameter:
        return
self._evaluate(pipeline, X, y, raise_errors=True)
A contributor replied:
👍 thanks
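
For illustration, a small usage sketch of the behavior this check is meant to guarantee (automl and custom_pipeline are the objects from the earlier example; the assertion is illustrative, not a test from this PR):

automl.add_to_rankings(custom_pipeline, X_train, y_train)
n_rows = len(automl.full_rankings)
# Adding the same pipeline class and parameters again should return early,
# leaving full_rankings unchanged.
automl.add_to_rankings(custom_pipeline, X_train, y_train)
assert len(automl.full_rankings) == n_rows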

@angela97lin mentioned this pull request Jun 30, 2020
@dsherry deleted the js_628_leader branch October 29, 2020 23:49
Successfully merging this pull request may close these issues: Add a specific pipeline to leaderboard