
Fix Lead scoring demos #2315

Merged 45 commits into main from bc_2145_demos on Jun 22, 2021

Conversation

@bchen1116 (Contributor) commented May 28, 2021

fixes #2221 and is part 1 of #2145

New doc results: (two screenshots of the updated demo output attached)

Doc for how we chose these weights is here.
Perf test for the binary pipeline is here.

@bchen1116 self-assigned this May 28, 2021
codecov bot commented May 28, 2021

Codecov Report

Merging #2315 (6743075) into main (ce3f046) will increase coverage by 0.1%.
The diff coverage is 100.0%.


@@           Coverage Diff           @@
##            main   #2315     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        283     283             
  Lines      25233   25240      +7     
=======================================
+ Hits       25133   25140      +7     
  Misses       100     100             
Impacted Files Coverage Δ
...alml/objectives/binary_classification_objective.py 100.0% <100.0%> (ø)
evalml/objectives/lead_scoring.py 100.0% <100.0%> (ø)
evalml/tests/objective_tests/test_lead_scoring.py 100.0% <100.0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@angela97lin (Contributor) left a comment

I now see:

OrderedDict([('AUC', 0.6637428226050166), ('Lead Scoring', 0.034393809114359415)])

when we optimize for Lead Scoring and

OrderedDict([('AUC', 0.8526896343306134), ('Lead Scoring', 0.0)])

when we optimize for AUC, which aligns with what we're trying to say. Great work! :)

I wonder if we can catch this in the future if AutoMLSearch changes and causes behavior in the docs that we don't expect. Maybe we could put an assert in the docs to confirm that our AUC-optimized lead scoring score is less than our lead-scoring-optimized lead scoring score? (sorry, mouthful) 😁

@bchen1116 (Contributor, PR author)
@angela97lin would failing assertions in the docs trigger any warnings for us, like in doc builds? We could make this a test, but that would add a fair bit of runtime to our tests. Otherwise, I'm unsure how else we could catch this in the future beyond just revisiting the docs.

@angela97lin (Contributor)
@bchen1116 Yup, or it'd just fail the docs test. Just adding a new cell with an assert between the two scores we've already calculated above shouldn't take too much runtime!
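
For illustration, here's a minimal sketch of what such an assert cell could look like (the pipeline and data variable names are hypothetical, not taken from the notebook, and the exact evalml scoring call may differ):

```python
# Hypothetical sketch of the suggested docs assert cell; variable names like
# best_pipeline_lead_scoring / best_pipeline_auc are placeholders.
lead_scoring_optimized = best_pipeline_lead_scoring.score(
    X_holdout, y_holdout, objectives=["auc", "lead scoring"]
)
auc_optimized = best_pipeline_auc.score(
    X_holdout, y_holdout, objectives=["auc", "lead scoring"]
)

# Optimizing for Lead Scoring should give a higher lead scoring score than
# optimizing for AUC; if AutoMLSearch behavior changes, this cell fails the
# docs build and flags the regression.
assert lead_scoring_optimized["Lead Scoring"] > auc_optimized["Lead Scoring"]
```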

@dsherry (Contributor) left a comment

@bchen1116 this looks great. I left a question and a request for the release notes.

Since we'll be changing the optimization method we use for binary classification threshold tuning, please run the perf tests (at least on the binary classification datasets) and share the results before merging.

@@ -43,5 +43,7 @@ def objective_function(self, y_true, y_predicted, X=None):
         profit += self.false_positives * false_positives
 
         profit_per_lead = profit / len(y_true)
+        # penalty if our estimator only predicts 1 output
+        same_class_penalty = (2 - len(set(y_predicted))) * (abs(profit_per_lead) / 2)
Review comment:
Got it.

Why the division by 2?
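
For context, a small standalone illustration (toy numbers, not from the PR) of how the added penalty term behaves: it is zero when both classes appear in the predictions, and half of the absolute per-lead profit when only one class is predicted.

```python
def same_class_penalty(y_predicted, profit_per_lead):
    # Mirrors the penalty added in the diff above: (2 - number of distinct
    # predicted classes) scales half of the absolute per-lead profit.
    return (2 - len(set(y_predicted))) * (abs(profit_per_lead) / 2)

profit_per_lead = -0.8  # toy value

# A degenerate estimator that only ever predicts one class gets penalized.
print(same_class_penalty([0, 0, 0, 0], profit_per_lead))  # 0.4

# An estimator that predicts both classes is not penalized.
print(same_class_penalty([0, 1, 0, 1], profit_per_lead))  # 0.0
```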

@@ -43,7 +43,7 @@ def cost(threshold):
             cost = self.objective_function(y_true, y_predicted, X=X)
             return -cost if self.greater_is_better else cost
 
-        optimal = minimize_scalar(cost, method="Golden", options={"maxiter": 100})
+        optimal = minimize_scalar(cost, bounds=(0, 1), method="Bounded")
Review comment:
@bchen1116 please add a second entry to the release notes for this, under "Enhancements": "Updated threshold optimization method for binary classification"
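
As a reference for the change above, here is a minimal, self-contained sketch of a bounded threshold search with `scipy.optimize.minimize_scalar` (synthetic data and a toy objective, not the evalml implementation):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Synthetic probabilities: positives cluster near 0.8, negatives near 0.2.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_proba = np.clip(y_true * 0.6 + rng.normal(0.2, 0.25, size=200), 0, 1)

def cost(threshold):
    # Threshold the probabilities and return the negative accuracy,
    # since minimize_scalar minimizes the objective.
    y_predicted = (y_proba > threshold).astype(int)
    return -(y_predicted == y_true).mean()

# Bounded optimization keeps candidate thresholds inside [0, 1],
# unlike an unbounded Golden-section search.
optimal = minimize_scalar(cost, bounds=(0, 1), method="Bounded")
print(optimal.x)  # best threshold found within [0, 1]
```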

docs/source/demos/lead_scoring.ipynb (conversation resolved)
@bchen1116 (Contributor, PR author)
@dsherry updated with the perf tests!

@dsherry (Contributor) commented Jun 21, 2021

@bchen1116 thank you, looks good to me, let's 🚢 💨 !

@bchen1116 (Contributor, PR author)
@angela97lin I added the assert to the docs! Hopefully it will allow us to catch this issue in the future.

@ParthivNaresh (Contributor) left a comment

Great work with this! The docs are detailed, the perf tests look good, and nice job adding the asserts.

@angela97lin (Contributor)
@bchen1116 Perfect, looks great 😁

@bchen1116 merged commit 5256584 into main Jun 22, 2021
@chukarsten mentioned this pull request Jun 22, 2021
@freddyaboulton deleted the bc_2145_demos branch May 13, 2022 15:34
Successfully merging this pull request may close these issues: Fix Lead Scoring Demo
6 participants