Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

speedup pairwise_distance #256

Merged
merged 11 commits into from Aug 11, 2023

Conversation

gabelstein
Copy link
Contributor

closes #224

@gabelstein
Copy link
Contributor Author

Slight problem as logdet and wasserstein distances don't return 0 for same inputs.

@gabelstein
Copy link
Contributor Author

Probably due to numerical instability of numpy.linalg.slogdet.
Also just caught this because of actually checking correctness of pairwise distance matrix. So not a new issue anyways.
I think there's no good fix for this, slogdet is much faster than anything else.
Maybe just return 0 after a check if the arrays are equal?
Otherwise I just reduced the required precision in the test for now.

@gabelstein gabelstein marked this pull request as ready for review July 26, 2023 14:59
@gabelstein
Copy link
Contributor Author

@agramfort @qbarthelemy

Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

basically I would avoid the new public functions (make them private) and I would not do any magic with "np.array_equal(X, Y)" if Y=X then user should just pass Y = None. You should now what you are doing.
Finally do not set the diag to zero. If it's 1e-15 than it's a numerical zero. Basically don't hack the numerical noise.

doc/api.rst Outdated
pairwise_distance_euclid
pairwise_distance_harmonic
pairwise_distance_logeuclid
pairwise_distance_riemann
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not make these functions public to avoid over crowding the public namespace.

pyriemann/utils/distance.py Outdated Show resolved Hide resolved
pyriemann/utils/distance.py Outdated Show resolved Hide resolved
@agramfort
Copy link
Member

agramfort commented Aug 1, 2023 via email

@gabelstein
Copy link
Contributor Author

it's better to fix LLE then

@agramfort doesn't need fixing, I remembered incorrectly.
Can this be merged then?

@agramfort
Copy link
Member

agramfort commented Aug 3, 2023 via email

@gabelstein
Copy link
Contributor Author

Can you see why CIs fail ?

it says "FileNotFoundError: [Errno 2] No such file or directory: '/tmp/sphinx-log'" which I don't really understand. I thought it was an issue on the server side.

Now from an API perspective I think we should have only one public function called pairwise_distances that takes a metric parameter. It will make the API lighter.

I removed the individual functions from the api page, I thought that was enough.
Is there way to make the functions other functions private?

@agramfort
Copy link
Member

agramfort commented Aug 4, 2023 via email

@gabelstein
Copy link
Contributor Author

prepend some _ the function name for each

Done.

pairwise_distance
distance_harmonic
"""
if isinstance(Y, type(None)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if isinstance(Y, type(None)):
if Y is None:

between elements of X and Y.
Notes
-----
.. versionadded:: 0.6
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need these now that it's private

doc/whatsnew.rst Outdated
@@ -10,6 +10,7 @@ A catalog of new features, improvements, and bug-fixes in each release.
v0.6.dev
--------

- Speedup pairwise distance function :func:`pyriemann.utils.distance.pairwise_distance` by adding individual functions. :pr:`256` by :user:`gabelstein`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say for what metric it's faster

Copy link
Member

@qbarthelemy qbarthelemy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx @gabelstein for this nice contribution!
We can merge after #254

@agramfort
Copy link
Member

agramfort commented Aug 11, 2023 via email

@gabelstein
Copy link
Contributor Author

@qbarthelemy 's changes just gave me an idea. Setting the invY and logY to none when applicable speeds up the internal sklearn pairwise distance.
image

@qbarthelemy qbarthelemy merged commit 0f35b46 into pyRiemann:master Aug 11, 2023
10 checks passed
@qbarthelemy
Copy link
Member

Thx @gabelstein

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Speed up pairwise_distance
3 participants