Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability to use custom DistanceMetric #692

Merged
merged 3 commits into from
Jul 19, 2022

Conversation

freemansw1
Copy link
Contributor

Per #691, I added the ability to use a custom scikit-learn DistanceMetric function as a dist_func parameter in trackpy.link. I also added a new test for this method. I think that this would allow users to extend the DistanceMetric class, too, with compiled distance functions (and therefore faster than passing in a python function).

Copy link
Contributor

@nkeim nkeim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hooray! Just a few changes are needed to gracefully handle sklearn versions before late 2021.

dist_func : function, optional
a custom distance function that takes two 1D arrays of coordinates and
returns a float. Must be used with the 'BTree' neighbor_strategy.
dist_func : function or ```sklearn.metrics.DistanceMetric``` class, optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is supposed to be an instance of DistanceMetric, not a class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My new commits should hopefully resolve my poor choice of documentation here.

@@ -10,6 +10,7 @@

try:
from sklearn.neighbors import BallTree
from sklearn.metrics import DistanceMetric
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line appears to raise an ImportError in many of our automated tests, which in turn breaks BallTree even though sklearn is installed and detected. Here is an example: https://github.com/soft-matter/trackpy/runs/5127629137?check_suite_focus=true#step:6:2336 . Is DistanceMetric absent from earlier sklearn versions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it appears to be the difference between e.g. scikit-learn 1.0.2 and scikit-learn 0.24.2 . Since DistanceMetric was only released in late 2021, this code should check for its presence separately from BallTree.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for interpreting the CI output, I was seeing a failure but was having a hard time deducing the cause without further investigation. Looking through the API versions, it looks like they recently moved DistanceMetric over to sklearn.metrics. Before, it was in sklearn.neighbors. I'll update the PR to resolve this- first try importing DistanceMetric from sklearn.metrics and if an ImportError is raised, import it from sklearn.neighbors. It looks like sklearn.neighbors.DistanceMetric should work as far back as at least scikit-learn 0.16 (March 2015); should I still have it separately check for presence, or is it safe to assume that users will have a version newer than this? Happy to do either, but of course assuming that it is there is easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!! We don't have a formal policy but 6 years is definitely past our horizon for backwards compatibility. My own personal preference is that it be roughly 1/2 of the time to complete a PhD :). Thanks for clearing that up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now resolved this, and it looks to be running on my end. Hopefully the CI likes it too :).

@freemansw1
Copy link
Contributor Author

It seems like I have increased my percentage of CI tests passed, but am still failing a few. It seems like the CI is failing on some code I didn't touch, but perhaps I caused it to fail from my edits. I'm happy to keep revising my PR as needed, but I would appreciate any insights on why the CI is failing. Thanks for your help on this!

@nkeim nkeim added this to the 0.5.1 milestone Feb 21, 2022
@nkeim
Copy link
Contributor

nkeim commented Feb 21, 2022

Those are (almost) certainly errors due to a new Pandas or matplotlib release, not in your code. To be safe, let's plan to merge once that's resolved.

Here's an instructive failed test that leads me to believe that: https://github.com/soft-matter/trackpy/runs/5276655344?check_suite_focus=true#step:5:41

@freemansw1
Copy link
Contributor Author

Thanks much for the insight and for your help with this PR!

@nkeim
Copy link
Contributor

nkeim commented Jun 17, 2022

@freemansw1 the CI and compatibility problems that were holding this PR back have (hopefully) been fixed in the main branch. Please rebase when you get a chance. Thanks!

@freemansw1
Copy link
Contributor Author

I'm happy to rebase, but I don't see any changes to the main branch (actually I don't see main at all, just master).

@freemansw1
Copy link
Contributor Author

Sorry, I just want to check that you aren't waiting for me on anything @nkeim

@nkeim nkeim merged commit d69452d into soft-matter:master Jul 19, 2022
@nkeim
Copy link
Contributor

nkeim commented Jul 19, 2022

Looks like whatever dependency (or GHA) issues were at play in February are now gone; all checks pass. Thanks for your contribution @freemansw1 and for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants