Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add more index adapters #28

Merged
merged 21 commits into from
Dec 16, 2020
Merged

Conversation

benbovy
Copy link
Member

@benbovy benbovy commented Dec 15, 2020

TODO:

  • rename 'balltree' and 'geo_balltree' to 'sklearn_balltree' and 'sklearn_geo_balltree'
  • add sklearn.neighbors.KDTree adapter
  • add 's2point' index adapter (based on pys2index)
  • add scipy.cKDTree adapter

Maybe in this PR or later (if relevant):

  • add "geo" kd-tree adapters (conversion from lat/lon to x/y/z)
  • add pykdtree adapters
  • add pyinterp adapters
  • add rtree adapters

All dependencies are optional, adapters are not registered if their corresponding package is not installed.

@benbovy benbovy marked this pull request as draft December 15, 2020 16:27
@benbovy
Copy link
Member Author

benbovy commented Dec 16, 2020

Each index adapter is tested against brute-force nearest neighbor lookup for the whole matrix:

  • numpy/dask
  • 1/2/3 dimension dataset coordinates
  • 1/2/3 dimension indexer coordinates

Not sure we need to apply the whole matrix each time, but this can be addressed later.

@willirath
Copy link
Contributor

willirath commented Dec 16, 2020

Looks very good!

Two things I noticed:

  1. We should list scipy as an explicit dependency (in the tests and in the env file) even if it is probably pulled by sklearn anyway.
  2. The RTD build for this PR doesn't seem to contain the S2 adapter: https://xoak--28.org.readthedocs.build/en/28/search.html?q=S2PointIndexAdapter&check_keywords=yes&area=default But maybe this is a caching issue.

@benbovy
Copy link
Member Author

benbovy commented Dec 16, 2020

Good catch, your two points are addressed in the last commits.

I also implemented attribute access for the index registry. It's convenient for easily accessing the index adapter docstrings, e.g.,

In [1]: import xoak

In [2]: ireg = xoak.IndexRegistry()

In [3]: ireg.s2point?
Init signature: ireg.s2point(**kwargs)
Docstring:     
Xoak index adapter for :class:`pys2index.S2PointIndex`.

It can be used for efficient indexing of a set of latitude / longitude points.

When building the index, the coordinates must be given in the latitude,
longitude order.

Latitude and longitude values must be in degrees for both index and query
points.

See https://github.com/benbovy/pys2index.



This index adapter is registered in xoak under the name ``s2point``.
You can use it in :meth:`xarray.Dataset.xoak.set_index` by simply providing
its name for the ``index_type`` argument.
Alternatively, you can access it via the index registry, i.e.,

>>> import xoak
>>> ireg = xoak.IndexRegistry()
>>> ireg.s2point
File:           ~/Git/github/benbovy/xoak/src/xoak/index/s2_adapters.py
Type:           ABCMeta
Subclasses: 

@benbovy
Copy link
Member Author

benbovy commented Dec 16, 2020

I have no idea on what is causing random crashes on ubuntu...

@benbovy
Copy link
Member Author

benbovy commented Dec 16, 2020

Mmm looks like pys2index conda packages might be broken on GH-actions ubuntu-latest (that's surprising since the feedstock built fine on conda-forge CI)...

...until pys2index conda packages are fixed on ubuntu-latest.
@benbovy benbovy marked this pull request as ready for review December 16, 2020 15:41
@benbovy benbovy mentioned this pull request Dec 16, 2020
4 tasks
@benbovy
Copy link
Member Author

benbovy commented Dec 16, 2020

I'm going to merge this and move on with the remaining items in #29. Thanks @willirath for having a look!

@benbovy benbovy merged commit 2fa2c40 into xarray-contrib:master Dec 16, 2020
@benbovy benbovy deleted the more-indexes branch August 4, 2021 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants