-
-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEA Implementation of "threshold-dependent metric per threshold value" curve #25639
base: main
Are you sure you want to change the base?
Conversation
…/scikit-learn into metric_threshold_curve
❌ Linting issuesThis PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling You can see the details of the linting issues under the
|
OK now that we merged the I think this is time to review and prioritize this feature. @vitaliset would you have time to dedicate to work on this feature? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial thoughts. I did not look at the documentation or test but it will come later.
@@ -674,6 +674,7 @@ Kernels | |||
|
|||
inspection.partial_dependence | |||
inspection.permutation_importance | |||
inspection.metric_threshold_curve |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that we should include this feature in this module. Actually, as the precision-recall and roc curves, this is a curve metric. Also, by including it in sklearn.metrics
module, we can drop the metric_
naming.
I would therefore call it sklearn.metrics.decision_threshold_curve
"y_true": ["array-like"], | ||
"y_score": ["array-like"], | ||
"score_func": [callable], | ||
"threshold_grid": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be consistent with the TunedThresholdClassifierCV
, we need to call this thresholds
. I would not allow for None
, because with many sample, we are going to reinterpolate anyway. So let's be consistent as well with the classifier.
def metric_threshold_curve( | ||
y_true, | ||
y_score, | ||
score_func, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to call this scoring
as well for consistency. But here, we should only accept a callable.
pos_label=None, | ||
sample_weight=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are actually not necessary per-se: they are parameters of scoring
and we should accept any keyword argument and propogate them. I assume that we should have a scoring_kwargs
that is a dictionary.
# Make y_true a boolean vector. | ||
y_true = y_true == pos_label | ||
|
||
# Sort scores and corresponding truth values. | ||
desc_score_indices = np.argsort(y_score, kind="mergesort")[::-1] | ||
y_score = y_score[desc_score_indices] | ||
y_true = y_true[desc_score_indices] | ||
if sample_weight is not None: | ||
sample_weight = sample_weight[desc_score_indices] | ||
|
||
# Logic to see if we need to use all possible thresholds (distinct values). | ||
all_thresholds = False | ||
if threshold_grid is None: | ||
all_thresholds = True | ||
elif isinstance(threshold_grid, int): | ||
if len(set(y_score)) < threshold_grid: | ||
all_thresholds = True | ||
|
||
if all_thresholds: | ||
# y_score typically has many tied values. Here we extract | ||
# the indices associated with the distinct values. We also | ||
# concatenate a value for the end of the curve. | ||
distinct_value_indices = np.where(np.diff(y_score))[0] | ||
threshold_idxs = np.r_[distinct_value_indices, y_true.size - 1] | ||
thresholds = y_score[threshold_idxs[::-1]] | ||
elif isinstance(threshold_grid, int): | ||
# It takes representative score points to calculate the metric | ||
# with these thresholds. | ||
thresholds = np.percentile( | ||
list(set(y_score)), np.linspace(0, 100, threshold_grid) | ||
) | ||
else: | ||
# If threshold_grid is an array then run some checks and sort | ||
# it for consistency. | ||
threshold_grid = column_or_1d(threshold_grid) | ||
assert_all_finite(threshold_grid) | ||
thresholds = np.sort(threshold_grid) | ||
|
||
# For each threshold calculates the metric. | ||
metric_values = [] | ||
for threshold in thresholds: | ||
preds_threshold = (y_score > threshold).astype(int) | ||
metric_values.append( | ||
score_func(y_true, preds_threshold, sample_weight=sample_weight) | ||
) | ||
# TODO: should we multithread the metric calculations? | ||
|
||
return np.array(metric_values), thresholds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this code is already implemented by the _score
function of the sklearn.model_selection._classification_threshold._CurveScorer
class.
I think that we should leverage this code by creating this scorer. We probably need to dissociate getting y_score
from the scoring itself such that here we only call the scoring part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now, it makes sense to move the _CurveScorer
in metrics
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now, it makes sense to move the
_CurveScorer
inmetrics
.
Do you want me to do this in this PR or create a separate one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to be in a separate PR. Depending on the schedule, I might start to do the PR.
Awesome news! I might need a couple of weeks, but I would love to make this feature available! Will work on your comments as soon as I can, @glemaitre. |
Towards #21391.
Intending to later build the
MetricThresholdCurveDisplay
following the same structure that other Displays have, this PR implements the associate curve. I decided to break the original issue into two parts (curve and Display) for easier review (but I don't mind adding the Display to this PR as well).A quick example of usage of the implementation here:
Most of the code for
metric_threshold_curve
function is an adaptation of_binary_clf_curve
.Points of doubt: