Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor NN graph building (included in #43) #21

Draft
wants to merge 33 commits into
base: master
Choose a base branch
from

Conversation

naspert
Copy link

@naspert naspert commented Mar 19, 2018

No description provided.

@naspert naspert requested a review from mdeff March 19, 2018 10:15
Copy link
Collaborator

@mdeff mdeff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @naspert :)

This refactoring was much needed and I like it. Note that we are still in early development and so we don't care too much about breaking the API (e.g. for the distances' names).

Some general concerns:

  • Shouldn't 'scipy-pdist' be the default? For sure the default should be from scipy (so users won't have to install any additional library), but I wonder how good the scipy KDTree is. I tried the LSHForest once and it wasn't great...
  • If you have some intuitions about which backend to use when, it would be nice to share it in the documentation. ;)
  • Travis is failing, please fix it.
  • For now the tests are simply running the code, without any check. (As it was before I know...) It would be great to check the approximation error in the tests, with 'scipy-pdist' as the ground truth.
  • While refactoring, I would use the opportunity to rename NNtype to type, dist_type to metric, symmetrize_type to symmetrization for consistency.
  • The knn and radius branches should certainly share more code. At least the computation of the Gaussian kernel should be factored out.

Thanks again, and feel free to disagree with the above. ;)


self._nn_functions = {
'knn': {
'scipy-kdtree':_knn_sp_kdtree,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you miss a space after the colon

pygsp/graphs/nngraphs/nngraph.py Outdated Show resolved Hide resolved
- do not use underscores in functions args
@coveralls
Copy link

coveralls commented Mar 19, 2018

Coverage Status

Coverage increased (+0.6%) to 82.579% when pulling a562896 on naspert:nn_refactor into e696d9f on epfl-lts2:master.

@mdeff
Copy link
Collaborator

mdeff commented Mar 29, 2018

Thanks for merging the knn and radius matrix build 👍

@bricaud
Copy link
Contributor

bricaud commented Apr 2, 2018

This might be an interesting alternative to FLANN:
https://github.com/spotify/annoy

@mdeff
Copy link
Collaborator

mdeff commented Apr 2, 2018

Thanks @bricaud :) There's actually a whole lot of libraries, see e.g. this benchmark. We currently support the following backends (thanks to @naspert work):

  • scipy pdist (brute force), KDTree and cKDTree, because scipy is a hard requirement and will be installed for all users
  • nmslib, because it's the fastest according to the benchmark
  • flann, because it was there before and people know it

@bricaud: which kNN libraries do you have experience with? Would you recommend any?
@naspert: do you see any reason to use KDTree over cKDTree?

@naspert
Copy link
Author

naspert commented Apr 9, 2018

cKDTree is the same as kdtree, with a C backend. It generates the same results, much faster so no good reason to prefer kdtree over ckdtree. If you are looking into having a dataset with many dimensions, nmslib will be faster.

mdeff added a commit that referenced this pull request Feb 14, 2019
mdeff added a commit that referenced this pull request Feb 14, 2019
mdeff added a commit that referenced this pull request Feb 15, 2019
mdeff added a commit that referenced this pull request Feb 15, 2019
mdeff added a commit that referenced this pull request Feb 15, 2019
mdeff added a commit that referenced this pull request Feb 16, 2019
@mdeff mdeff mentioned this pull request Feb 25, 2019
@mdeff mdeff changed the title refactor NN graph building refactor NN graph building (included in #43) Mar 5, 2019
@mdeff mdeff marked this pull request as draft May 25, 2020 23:28
nperraud pushed a commit that referenced this pull request Dec 17, 2020
nperraud pushed a commit that referenced this pull request Dec 17, 2020
nperraud pushed a commit that referenced this pull request Dec 17, 2020
nperraud pushed a commit that referenced this pull request Dec 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants