
Remove MSLE from default additional objectives #203

Merged: 8 commits into master, Nov 13, 2019
Conversation

@jeremyliweishih (Collaborator) commented Nov 11, 2019

In order to remove MSLE from the default objectives, I propose an internal blacklist of objectives that may cause problems or are not applicable to all use cases. This keeps the exclusions easy for us to maintain and check, but may cause ambiguity for users who call get_objectives() manually.
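A minimal sketch of the blacklist idea, assuming a module-level OPTIONS registry like the one in evalml/objectives/utils.py (the helper below is illustrative, not the exact diff):

```python
from evalml.objectives import standard_metrics

# Registry of all objectives by name (subset shown for illustration).
OPTIONS = {
    "r2": standard_metrics.R2,
    "mae": standard_metrics.MAE,
    "mse": standard_metrics.MSE,
    "msle": standard_metrics.MSLE,
}

# Objectives that may cause problems or don't apply to every use case.
BLACKLIST = ["msle"]

def get_objectives():
    """Return fresh instances of every non-blacklisted objective."""
    return [cls() for name, cls in OPTIONS.items() if name not in BLACKLIST]
```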

I also changed how we handle objectives: I don't think it's a good idea to return the same instance of an objective every time objectives are requested.
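A sketch of that instance-vs-class concern (hypothetical names, and assuming these metrics construct with no arguments): if the registry stores instances, every caller shares one mutable object, which breaks down once objectives carry fitted state.

```python
from evalml.objectives import standard_metrics

# Storing instances: every caller gets the same mutable object, so any
# fitted state would leak between pipelines.
SHARED = {"mse": standard_metrics.MSE()}

# Storing classes: each lookup constructs an independent objective.
BY_CLASS = {"mse": standard_metrics.MSE}

def get_objective(name):
    return BY_CLASS[name]()  # fresh instance per call
```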

@codecov (bot) commented Nov 11, 2019

Codecov Report

Merging #203 into master will decrease coverage by 0.04%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #203      +/-   ##
==========================================
- Coverage   96.81%   96.77%   -0.05%     
==========================================
  Files          91       91              
  Lines        2353     2354       +1     
==========================================
  Hits         2278     2278              
- Misses         75       76       +1
Impacted Files                                     Coverage Δ
evalml/objectives/utils.py                         100% <ø> (ø) ⬆️
evalml/objectives/objective_base.py                97.5% <100%> (ø) ⬆️
evalml/tests/objective_tests/test_objectives.py    100% <100%> (ø) ⬆️
evalml/tests/automl_tests/test_autoclassifier.py   100% <100%> (ø) ⬆️
evalml/objectives/standard_metrics.py              99.53% <0%> (-0.47%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18ffb63...f84ed9a.

@angela97lin (Contributor) left a comment:

Looks good; comments on the things you mentioned:

  • I agree that it seems cleaner to return the objective class rather than the instance, but an argument for keeping instances in OPTIONS could be that it lets us reuse the metric objects rather than creating a new one each time?

  • Rather than having a BLACKLIST, which might be confusing (or at least make me wonder why specific objectives are blacklisted), would it be clearer to maintain a DEFAULT_OPTIONS list separate from OPTIONS (sketched below)?
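For concreteness, a sketch of the DEFAULT_OPTIONS alternative from the second bullet (names illustrative): the defaults become an explicit allowlist instead of a blacklist, while everything stays discoverable.

```python
from evalml.objectives import standard_metrics

# Everything stays registered and discoverable...
OPTIONS = {
    "r2": standard_metrics.R2,
    "mae": standard_metrics.MAE,
    "mse": standard_metrics.MSE,
    "msle": standard_metrics.MSLE,
}

# ...while the defaults are an explicit allowlist (MSLE left out).
DEFAULT_OPTIONS = {name: OPTIONS[name] for name in ("r2", "mae", "mse")}
```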

@jeremyliweishih (Collaborator, Author) commented:

@angela97lin

Regarding returning the class vs. an object: my main concern is that if we add metrics that require fitting, passing around one object will be problematic. I think the difference in memory usage would be negligible unless someone instantiates multiple AutoClassifiers or AutoRegressors.

I agree that a default options list would be clearer, but currently there would be almost complete overlap between the options and the default options, since MSLE is the only objective we want to exclude. However, if we narrow what we provide as defaults (maybe only F1, precision, and recall for classification, and R2, MSE, and MAE for regression?), we should definitely go with default options.

@jeremyliweishih (Collaborator, Author) commented:

@angela97lin Or what about a system like the one we wrote for the component utils, where we fetch all standard metrics and use that for get_objectives(), but keep a small list for get_default_objectives()? Maybe something like:

[f1, f1_micro, precision, precision_micro, recall, recall_micro, r2, mae, mse]
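A hypothetical sketch of that split, assuming the class names below exist in evalml.objectives.standard_metrics: get_objectives() exposes everything registered, while get_default_objectives() returns only the curated list.

```python
from evalml.objectives import standard_metrics

# Illustrative registry; the real mapping lives in evalml/objectives/utils.py.
OPTIONS = {
    "f1": standard_metrics.F1,
    "precision": standard_metrics.Precision,
    "recall": standard_metrics.Recall,
    "r2": standard_metrics.R2,
    "mae": standard_metrics.MAE,
    "mse": standard_metrics.MSE,
}

DEFAULT_NAMES = ["f1", "precision", "recall", "r2", "mae", "mse"]

def get_objectives():
    """The full discoverable list: one fresh instance per registered metric."""
    return [cls() for cls in OPTIONS.values()]

def get_default_objectives():
    """Only the curated defaults used by AutoClassifier / AutoRegressor."""
    return [OPTIONS[name]() for name in DEFAULT_NAMES]
```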

(Two outdated review threads on evalml/objectives/utils.py, resolved.)

Review thread on evalml/objectives/utils.py:

    "r2": standard_metrics.R2,
    "mae": standard_metrics.MAE,
    "mse": standard_metrics.MSE,
    "msle": standard_metrics.MSLE,
Contributor commented:

What if we just remove MSLE as an option here?

@angela97lin (Contributor) commented:

Per the discussion above, @kmax12 what do you think about keeping a separate default_objectives object as well as the current objectives object? My concern is that if we remove MSLE, how will users be aware of it as an available metric?

@jeremyliweishih (Collaborator, Author) commented:

@angela97lin I agree this creates some ambiguity, but I think a user could look at the API reference to get the full list of metrics. I think this solution is probably the cleanest.

@kmax12 self-requested a review, November 12, 2019 23:43
@kmax12 (Contributor) left a comment:

LGTM once tests pass

@jeremyliweishih merged commit 80353d0 into master, Nov 13, 2019
@angela97lin mentioned this pull request, Nov 15, 2019
@dsherry deleted the msle branch, May 26, 2020