work for one geometry? #8

Closed
raybellwaves opened this issue Jun 30, 2021 · 5 comments

@raybellwaves
Contributor

I ran into IndexError: single positional indexer is out-of-bounds (Traceback below)

I have a dataset with one variable over CONUS and I'm trying to weight to one geom e.g. a county.

I'll try to put together a reproducible example.

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-83-5cd8fd54cbfc> in <module>
      1 weightmap = xa.pixel_overlaps(ds, gdf, subset_bbox=True)
----> 2 aggregated = xa.aggregate(ds, weightmap)

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/xagg/core.py in aggregate(ds, wm)
    434                 #   the grid have just nan values for this variable
    435                 # in both cases; the "aggregated variable" is just a vector of nans.
--> 436                 if not np.isnan(wm.agg.iloc[poly_idx,:].pix_idxs).all():
    437                     # Get the dimensions of the variable that aren't "loc" (location)
    438                     other_dims = [k for k in np.atleast_1d(ds[var].dims) if k != 'loc']

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/pandas/core/indexing.py in __getitem__(self, key)
    887                     # AttributeError for IntervalTree get_value
    888                     return self.obj._get_value(*key, takeable=self._takeable)
--> 889             return self._getitem_tuple(key)
    890         else:
    891             # we by definition only have the 0th axis

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/pandas/core/indexing.py in _getitem_tuple(self, tup)
   1448     def _getitem_tuple(self, tup: Tuple):
   1449 
-> 1450         self._has_valid_tuple(tup)
   1451         with suppress(IndexingError):
   1452             return self._getitem_lowerdim(tup)

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/pandas/core/indexing.py in _has_valid_tuple(self, key)
    721         for i, k in enumerate(key):
    722             try:
--> 723                 self._validate_key(k, i)
    724             except ValueError as err:
    725                 raise ValueError(

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/pandas/core/indexing.py in _validate_key(self, key, axis)
   1356             return
   1357         elif is_integer(key):
-> 1358             self._validate_integer(key, axis)
   1359         elif isinstance(key, tuple):
   1360             # a tuple should already have been caught by this point

/opt/userenvs/ray.bell/main/lib/python3.9/site-packages/pandas/core/indexing.py in _validate_integer(self, key, axis)
   1442         len_axis = len(self.obj._get_axis(axis))
   1443         if key >= len_axis or key < -len_axis:
-> 1444             raise IndexError("single positional indexer is out-of-bounds")
   1445 
   1446     # -------------------------------------------------------------------

IndexError: single positional indexer is out-of-bounds
@raybellwaves
Contributor Author

import geopandas as gpd
import xagg as xa
import xarray as xr


ds = xr.tutorial.open_dataset("air_temperature").isel(time=0)
ds = ds.assign_coords(lon=(((ds.lon + 180) % 360) - 180))

url = "https://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_040_00_20m.json"
gdf = gpd.read_file(url)
gdf = gdf[gdf.NAME == 'Florida']

weightmap = xa.pixel_overlaps(ds, gdf)
aggregated = xa.aggregate(ds, weightmap)

This gives the traceback above.

@ks905383
Owner

Hm, so it looks like the issue is in xa.get_pixel_overlaps(), which adds the poly_idx column to the data frame (line 287 of xa.core). To do so, it uses the index values of the existing gdf, for which in this case Florida has index 27.

So then, when xa.aggregate() is called, it loops over for poly_idx in wm.agg.poly_idx: in line 427, and indexes the gdf using the poly_idx column, which is 27 instead of 0.
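
To make the mismatch concrete, here is a minimal sketch using plain pandas rather than xagg itself. The `states` frame and its `state_N` names are made up for illustration; the only thing carried over from the report is that filtering keeps the original index label (27), which `.iloc` then treats as a row position.

```python
import pandas as pd

# Hypothetical stand-in for the states GeoDataFrame: 30 rows, default index 0..29
states = pd.DataFrame({"NAME": [f"state_{i}" for i in range(30)]})

# Filtering to one row keeps the original index label (27), not 0
subset = states[states.NAME == "state_27"]
label = subset.index[0]

# Label-based lookup works, because 27 is a valid label
by_label = subset.loc[label, "NAME"]

# Position-based lookup fails, because the frame only has position 0
try:
    subset.iloc[label, :]
    error_message = None
except IndexError as err:
    error_message = str(err)

print(label, by_label, error_message)
```

The message caught here is the same one that appears at the bottom of the traceback.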

This is obviously not ideal... especially since a geodataframe may have weird indices for any number of reasons. I think there are two ways forward:

  1. (most robust, probably) - instead of setting poly_idx using the provided gdf's index, force it to always just be recounted from 0. I don't quite remember why I had used the built-in index, so I have to check to make sure that doesn't break something.
  2. change every indexing that currently uses poly_idx as a row index to instead be a match (i.e. wm.agg.loc[wm.agg.poly_idx==poly_idx,:]), but this may affect performance, and there are a lot of places where that index happens.

I'll try both and see what works.

Thanks for catching this!
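
The two options can be sketched roughly as follows, again with plain pandas. The `agg` frame and its `coverage` column are hypothetical stand-ins for `wm.agg`, not xagg's actual structure. Option 2 uses `.loc` for the boolean match, since as far as I know `.iloc` does not accept a boolean Series as a mask.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for wm.agg: one polygon whose poly_idx was copied
# from the filtered gdf's index (27)
agg = pd.DataFrame({"poly_idx": [27], "coverage": [1.0]}, index=[27])

# Option 1: recount poly_idx from 0 so it is always a valid row position
agg_renumbered = agg.copy()
agg_renumbered["poly_idx"] = np.arange(len(agg_renumbered))
rows_opt1 = [agg_renumbered.iloc[i, :] for i in agg_renumbered.poly_idx]

# Option 2: keep poly_idx as-is and select rows by matching its value
rows_opt2 = [agg.loc[agg.poly_idx == i, :] for i in agg.poly_idx]

print(rows_opt1[0]["coverage"], len(rows_opt2[0]))
```

Option 1 keeps the cheap positional lookup; option 2 scans the column for every polygon, which is where the performance concern comes from.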

ks905383 added a commit that referenced this issue Jun 30, 2021
`xa.get_pixel_overlaps()` creates a `poly_idx` column in the `gdf` that takes as its value the index of the input `gdf`. However, if there is a pre-existing index, this can lead to bad behavior, since `poly_idx` is used as an `.iloc` indexer in the `gdf`. This update instead makes `poly_idx` `np.arange(0,len(gdf))`, which will avoid this indexing issue (and hopefully not cause any more? I figured there would've been a reason I used the existing index if not a new one... fingers crossed).
ks905383 added a commit that referenced this issue Jun 30, 2021
fix index error if input gdf has own index [issue #8]
@ks905383
Owner

OK, this should be fixed with #10; I'll incorporate it into the next release.
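
For anyone on a version without the fix, resetting the index after filtering might serve as a workaround. This is an assumption based on the diagnosis above and is not something confirmed in this thread. The sketch uses a plain pandas frame with made-up index labels; `reset_index(drop=True)` is also available on a GeoDataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the states gdf, with non-sequential index labels
gdf = pd.DataFrame({"NAME": ["Georgia", "Florida"]}, index=[12, 27])

# Filter to one geometry, then renumber the index from 0 before it would be
# passed on to xa.pixel_overlaps()
gdf = gdf[gdf.NAME == "Florida"].reset_index(drop=True)

print(list(gdf.index))
```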

@raybellwaves
Contributor Author

Thanks a lot. I'll be happy to help if you are aiming to get some other things in before the next release.

@ks905383
Owner

This is finally in a stable release (0.2.5 should be on PyPI etc. now), apologies for the delay.
