pairwise_distances(X) should always have 0 diagonal #12628
Comments
And should be delete the I want to work on this |
I don't think so because you can use Btw, |
So, the change must be added on the |
Yes. However we rely on scipy's one for many of them, which are already consistent, so there shouldn't be too many to update |
@jnothman Actually there are tests in sklearn to test that metrics with I wonder why these kinds of "metrics" are allowed. But if we want to keep that, we have to either add a parameter |
I'm proposing that we stop supporting them, but it would be worth tracing the history of that test to work out if it was well motivated |
In the case of euclidean distances, we explicitly set the diagonal of unary pairwise distances to 0:
scikit-learn/sklearn/metrics/pairwise.py
Line 257 in e170d47
scikit-learn/sklearn/metrics/pairwise.py
Line 1279 in e170d47
scikit-learn/sklearn/metrics/pairwise.py
Line 550 in e170d47
We should zero the diagonal of the output for all metrics through pairwise_distances and pairwise_distances_chunked (where Y is None or Y is X) to reduce the effect of imprecision during distance calculation. That is, for all metric we eventually want:
and a similar assertion for pairwise_distances_chunked.
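The property asked for above could be checked roughly as follows. This is only a sketch: the exact-equality form of the assertion is an assumption about what is wanted, the helper names unary_diagonal and unary_diagonal_chunked are hypothetical, and X is random data. It uses metric="euclidean" because the issue states that this metric already has its diagonal set to 0.

```python
import numpy as np
from sklearn.metrics import pairwise_distances, pairwise_distances_chunked


def unary_diagonal(X, metric):
    """Diagonal of pairwise_distances(X): each sample's distance to itself."""
    D = pairwise_distances(X, metric=metric)
    return D[np.diag_indices(X.shape[0])]


def unary_diagonal_chunked(X, metric):
    """Same diagonal, collected from the row chunks of pairwise_distances_chunked."""
    parts, start = [], 0
    for chunk in pairwise_distances_chunked(X, metric=metric):
        rows = np.arange(chunk.shape[0])
        # Row i of this chunk is sample (start + i), so its self-distance
        # sits in column (start + i).
        parts.append(chunk[rows, start + rows])
        start += chunk.shape[0]
    return np.concatenate(parts)


rng = np.random.RandomState(0)
X = rng.rand(20, 5)  # illustrative data only

# Holds for euclidean, whose diagonal is explicitly zeroed; the proposal
# is that it should eventually hold for every metric.
assert np.all(unary_diagonal(X, "euclidean") == 0)
assert np.all(unary_diagonal_chunked(X, "euclidean") == 0)
```

For other metrics the same helpers can be used to inspect how far the diagonal currently is from 0.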
I say eventually because I propose that:
if np.allclose(pairwise_distances(X, metric=metric)[np.diag_indices(X.shape[0])], 0), we just set it to 0
pairwise_distances_chunked
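A minimal sketch of the "set it to 0 when it is already close to 0" step, written against a plain distance matrix so that it does not depend on any particular metric. The helper name zero_diagonal_if_close is hypothetical and not part of scikit-learn, and leaving the matrix unchanged when the diagonal is not close to 0 is an assumption, since that case is not spelled out here.

```python
import numpy as np


def zero_diagonal_if_close(D):
    """Return a copy of D whose diagonal is exactly 0 if it was already ~0.

    Hypothetical helper: uses np.allclose with its default tolerances.
    """
    D = np.array(D, dtype=float)  # np.array copies, so the input is not modified
    if np.allclose(np.diagonal(D), 0):
        np.fill_diagonal(D, 0.0)
    return D


# Diagonal polluted by floating-point noise: gets cleaned up.
noisy = np.array([[1e-12, 1.0],
                  [1.0, -1e-12]])
cleaned = zero_diagonal_if_close(noisy)

# Diagonal that is clearly non-zero: left untouched (assumed behaviour).
not_a_metric = np.array([[0.5, 1.0],
                         [1.0, 0.5]])
untouched = zero_diagonal_if_close(not_a_metric)
```

Working on a copy keeps the sketch side-effect free; an in-library version could instead modify the freshly computed output in place.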