Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow custom coordinate dimensions #512

Closed
mpu-creare opened this issue Sep 7, 2023 · 17 comments
Closed

Allow custom coordinate dimensions #512

mpu-creare opened this issue Sep 7, 2023 · 17 comments
Assignees

Comments

@mpu-creare
Copy link
Contributor

Description
Allow downstream applications to specify arbitrary coordinate dimensions.

Describe the solution you'd like
I might have an application where I want to pull data from a dataset based on coordinates with dimensions other than ["lat", "lon", "alt", "time"]

Sample Code

import podpac
podpac.coordinates.add_dimension('my_custom_dimension')
coords = podpac.Coordinates([[1, 2, 3]], dims=["my_custom_dimension"])
node = MyCustomNode()  # Defined elsewhere
o = node.eval(coords)
@mpu-creare
Copy link
Contributor Author

@mpu-creare
Copy link
Contributor Author

See https://github.com/creare-com/podpac/tree/feature/arbitrary-coordinate-dims

Looks like this might be pretty straight-forward actually. Needs some careful testing through, and still need that utility function.

With the above changes, this works:

In [1]: import podpac
In [2]: c1 = podpac.Coordinates([[1, 2, 3]], ['mydim'])
---------------------------------------------------------------------------
TraitError                                Traceback (most recent call last)
<snip>
TraitError: The 'name' trait of an ArrayCoordinates1d instance expected any of ['lat', 'lon', 'alt', 'time'] or None, not the str 'mydim'.

In [3]: podpac.core.coordinates.utils.VALID_DIMENSION_NAMES.append('mydim')

In [4]: c1 = podpac.Coordinates([[1, 2, 3]], ['mydim'])

In [5]: n = podpac.data.Array(source=[1, 2, 3], coordinates=c1)

In [6]: n.eval(c1[1:])
Out[6]: 
<xarray.UnitsDataArray (mydim: 2)>
array([2., 3.])
Coordinates:
  * mydim    (mydim) float64 2.0 3.0
Attributes:
    layer_style:  <podpac.core.style.Style object at 0x7fec11c57be0>
    crs:          +proj=longlat +datum=WGS84 +no_defs +vunits=m

@CFoye-Creare
Copy link
Contributor

I think testing will require us to make sure that any hardcoding of the valid dimensions is replaced with looking through VALID_DIMENSION_NAMES

@CFoye-Creare
Copy link
Contributor

I want to create a utility function add_valid_dimension, so we can add a dimension to the list of valid dimensions. I wonder if we want a rm_valid_dimension function with protections on lat, lon, time, alt

@CFoye-Creare
Copy link
Contributor

See https://github.com/creare-com/podpac/blob/develop/podpac/core/common_test_utils.py. This is an odd example of hardcoding. This only seems to be used for testing the coordinate creation? I don't really understand the point of this test or why we would make a hardcoded list of permutations and not use a library (itertools.permutations for example)

@mpu-creare
Copy link
Contributor Author

This is just fine because it lives in the test, and the base podpac library will never have any dimensions other than these. The reason we didn't use itertools is because of combinations of stacked dimensions. Perhaps there was a way to do it, but it was faster to just do a manually.

@CFoye-Creare
Copy link
Contributor

CFoye-Creare commented Sep 14, 2023

Would like to meet about this.

  • It seems the main updating we have to do is include dataset keys for non-default dimension. It seems like Dataset has a "selection" key that supports other dims, but it isn't clear to me how widespread this is. Would like to talk about it.
  • The "polar coordinates" object currently hardcode checks for 'lat/lon' as the dimension. Wondering if we should keep this/what it is intending and a way we can do it without hardcode check.

@CFoye-Creare
Copy link
Contributor

Both seem okay, and a change is not needed right now.

@CFoye-Creare
Copy link
Contributor

CFoye-Creare commented Sep 14, 2023

  • Interpolators hardcode the dimensions supported.
  • Make sure the importing of VALID_DIMENSION_NAMES is not making copies
  • Make sure in unit tests arrays are valid

@CFoye-Creare
Copy link
Contributor

Import of VALID_DIMENSION_NAMES seems to be updating.

In [1]: from podpac.core.coordinates.utils import VALID_DIMENSION_NAMES
INFO:numexpr.utils:NumExpr defaulting to 8 threads.

In [2]: VALID_DIMENSION_NAMES
Out[2]: ['lat', 'lon', 'alt', 'time']

In [3]: from podpac.core.coordinates.utils import add_valid_dimension

In [4]: add_valid_dimension('bug')

In [5]: VALID_DIMENSION_NAMES
Out[5]: ['lat', 'lon', 'alt', 'time', 'bug']

In [6]: from podpac.core.coordinates.utils import VALID_DIMENSION_NAMES

In [7]: VALID_DIMENSION_NAMES
Out[7]: ['lat', 'lon', 'alt', 'time', 'bug']

@CFoye-Creare
Copy link
Contributor

CFoye-Creare commented Sep 20, 2023

See 7416d41

Couple of questions about the valid dims.

  • What do we do about _get_tol and _get_scale? They will raise a NotImplementedError for any custom dim right now. See here
  • When the dims supported are on;y lat and lon for an interpolator, should we replace the array with VALID_DIMENSION_NAMES?
  • Should we allow probing using a custom dim? Currently probe_node in podpac/core.utils:495 doesn't support other dims.
  • Currently the DayOfYearWindow algorithm doens't support custom dims. See here
  • Affine coords seem to hardcode 'lat' and 'lon' as coordinates. Should we leave as-is?
  • Some of the coordinates.transform methods seem to rely on the existence of lat, lon, and alt.

@CFoye-Creare
Copy link
Contributor

We decided that if podpac isn't explicitly doing an operation that involved all of the dimensions of a coordinates object, then we can leave the hardcoded checks in. The philosophy is that custom dimensions should not replace the already existing lat, lon, alt, time dimensions.

@CFoye-Creare
Copy link
Contributor

We decided not to update the probe_node for now.

@CFoye-Creare
Copy link
Contributor

We decided for nearest neighbor interpolators the tolerance and scale for other dimensions are traits. All custom dimensions must share the same tolerance and the same scale. The default value for tolerance is inf and for scale is 1

@CFoye-Creare
Copy link
Contributor

CFoye-Creare commented Sep 20, 2023

  • Rebase onto develop-3.X
  • Test with interpolated coordinates on the array node
  • Make sure to use both nearest neighbor and bilinear interpolation (in 3.X this is an Array/Mixin)

@CFoye-Creare
Copy link
Contributor

Rebase went shaky. Just created a new branch, feature/3.X_custom_dims and cherry-picked commits from original feature/custom_coord_dims branch, as that original branch had actually been created from feature/cache4.0, not from develop.

@CFoye-Creare
Copy link
Contributor

We merged the new branch feature/3.x_custom_dims into develop-3.X. I have also deleted feature/custom_coord_dims.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

2 participants