Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RegularGridInterpolator returning 4D array using xarray object as input #14824

Closed
rebeccaringuette opened this issue Oct 8, 2021 · 5 comments

Comments

@rebeccaringuette
Copy link

I am using a chunked xarray object from a netCDF4 file as the input to scipy's RegularGridInterpolator as below.

cdf_data = xarray.open_dataset(filename, chunks={'time':100, 'x':10, 'y':10, 'z':10})  #import using dask for chunking
test_da = getattr(cdf_data, 'rr')  #sample DataArray for testing containing one variable
test_da = test_da.assign_coords(time=cdf_data._time, x=cdf_data.x, y=cdf_data._y, z=cdf_data._z)  #add arrays for coordinate values
rgi1 = RegularGridInterpolator((test_da.time, test_da.x, test_da.y, test_da.z), \
                              test_da, bounds_error = False, fill_value=np.NaN)   #define interpolator function

#given time, x, y, z are 1D arrays of the same length, the coordinate values are given by:
track = np.array([[t, c1_val, c2_val, c3_val] for t, c1_val, c2_val, c3_val in zip(
            time,x,y,z)])  #which has shape (59,4)
result1 = rgi1(track)  #calling rgi1 with track  gives the error:
#ValueError: cannot reshape array of size 12117361 into shape (59,)
#Note: 59**4 = 12117361   
#It is attempting to return an NxNxNxN array instead of the normal 1D array of length N

#creating the interpolator using numpy arrays does not produce this error
rgi2 = RegularGridInterpolator((test_da.time.values, test_da.x.values, 
                                test_da.y.values, test_da.z.values), 
                              test_da.values, bounds_error = False, fill_value=np.NaN)
result2 = rgi2(track)
#This returns a 1D array of length N (59 in my case) with beautiful speed.

I am dealing with 'medium data', which fits on my disk but not in memory, so I need to avoid converting the object into a numpy array, which takes about 5GB on my computer. I have to do this 15 times, so the second option I described is not practical. My current (irritating) solution is:

result = np.squeeze(np.array([rgi1(track0) for track0 in track]))

which is clumsy and takes several seconds longer than the simplistic result2 call. Is there a better way to do this? Why the differing behavior in the two cases? I get the same behavior using interpn via xarray.

@rebeccaringuette
Copy link
Author

test_da looks like this after the coordinate arrays are assigned:
<xarray.DataArray 'rr' (time: 60, x: 656, y: 190, z: 190)>
dask.array<open_dataset-52c52bdbb592b7919b52dece2a11a4farr, shape=(60, 656, 190, 190), dtype=float32, chunksize=(60, 10, 10, 10), chunktype=numpy.ndarray>
Coordinates:

  • time (time) float32 12.0 12.02 12.03 12.05 ... 12.93 12.95 12.97 12.98
  • x (x) float32 -349.6 -348.1 -346.7 -345.2 ... 32.77 32.77 32.91 33.05
  • y (y) float32 -95.97 -93.03 -90.04 -87.27 ... 87.27 90.04 93.03 95.97
  • z (z) float32 -95.97 -93.03 -90.04 -87.27 ... 87.27 90.04 93.03 95.97
    Attributes:
    units: 1/cm**3

@tupui
Copy link
Member

tupui commented Oct 9, 2021

Hi @rebeccaringuette, thanks for dropping by. This looks to me more like a usage question. If so, please try StackOverflow or our user mailing list instead. Issues are just for bugs or enhancement proposals since we don't have the bandwidth to do more.

If there is a bug, please elaborate on the bug (answering the questions from the bug template at least), otherwize please close the issue.

@rebeccaringuette
Copy link
Author

rebeccaringuette commented Oct 12, 2021 via email

@tupui
Copy link
Member

tupui commented Oct 12, 2021

Well sounds more like an issue on xarray's side than SciPy then, no? We don't support xarray explicitly in the code base (at least now).

@rebeccaringuette
Copy link
Author

rebeccaringuette commented Oct 12, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants