ENH/FIX Replace jaccard_similarity_score by sane jaccard_score #13151
What does this implement/fix? Explain your changes.
The current Jaccard implementation is ridiculous for binary and multiclass problems, returning accuracy. This makes a new Jaccard function with API comparable to Precision, Recall and F-score, which are also fundamentally set-wise metrics.
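A minimal sketch of the binary case described above, in plain Python rather than the scikit-learn API. The helper names `accuracy` and `binary_jaccard` and the `pos_label` argument are illustrative assumptions, not the functions added by this PR. It only shows why a set-wise Jaccard on the positive class generally differs from accuracy.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_jaccard(y_true, y_pred, pos_label=1):
    """Set-wise Jaccard index of the positive class (hypothetical helper).

    Treats the sample indices labelled `pos_label` as a set and computes
    |true ∩ pred| / |true ∪ pred|.
    """
    true_pos = {i for i, t in enumerate(y_true) if t == pos_label}
    pred_pos = {i for i, p in enumerate(y_pred) if p == pos_label}
    union = true_pos | pred_pos
    if not union:
        # Assumption for this sketch: return 0.0 when neither side has positives.
        return 0.0
    return len(true_pos & pred_pos) / len(union)


y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

acc = accuracy(y_true, y_pred)        # 4 of 6 labels match
jac = binary_jaccard(y_true, y_pred)  # 2 shared positives out of 4 in the union
print(acc, jac)
```

Here four of six labels match, but only two of the four samples marked positive by either side are shared, so the two numbers disagree.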
This also drops the
ogrisel left a comment
LGTM overall. I like the consistency with f1 / precision / recall.
However what I find a bit suboptimal is that we have the combination of the following:
1- Jaccard index is most useful with multilabel classification problems;
Which means that for 99% of the regular use cases for
One alternative would be to use
That would make
In any case I like this PR overall.
This is actually the advantage of average="binary", right? (i.e., to make users aware of different average options).
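To make the averaging options concrete, here is a rough sketch of how macro and micro averaging could differ for a multiclass Jaccard. It assumes the options follow the same conventions as precision and recall. The function `jaccard_by_average` and its `average` values are hypothetical stand-ins written in plain Python, not the signature introduced by this PR.

```python
def jaccard_by_average(y_true, y_pred, average="macro"):
    """Multiclass Jaccard under two averaging schemes (hypothetical helper).

    "macro": unweighted mean of the per-class Jaccard indices.
    "micro": pool intersections and unions over all classes, then divide.
    """
    labels = sorted(set(y_true) | set(y_pred))
    inters, unions = [], []
    for label in labels:
        true_set = {i for i, t in enumerate(y_true) if t == label}
        pred_set = {i for i, p in enumerate(y_pred) if p == label}
        inters.append(len(true_set & pred_set))
        # Every label comes from y_true or y_pred, so each union is non-empty.
        unions.append(len(true_set | pred_set))
    if average == "macro":
        return sum(i / u for i, u in zip(inters, unions)) / len(labels)
    if average == "micro":
        return sum(inters) / sum(unions)
    raise ValueError(f"unsupported average: {average!r}")


y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

macro = jaccard_by_average(y_true, y_pred, average="macro")
micro = jaccard_by_average(y_true, y_pred, average="micro")
print(macro, micro)
```

With this data the per-class indices are 1, 1/2 and 2/3, so the macro average is their mean, while the micro average pools the counts to 5/7. A default that forces the user to pick one of these for non-binary targets is what makes the choice visible.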