groupby triggering StopIteration error when run in a loop #2240

Closed
lgpreston opened this issue Jun 20, 2018 · 4 comments

Comments

@lgpreston

First github issue I've raised so apologies if it doesn't follow protocol.

I'm receiving a StopIteration: error when attempting to use the groupby function in xarray. The error only occurs when attempting to loop through a list of files - if a single file path is input, no error is generated. I've also tried using xr.open_mfdataset to open the full directory of files, but this produced the same error.

for path in in_files:
    ds = xr.open_dataset(path)
    ds['index'] = county_mask
    ds = ds.set_coords('index')
    ds = ds.where(ds['index'].isin(cotton_county_keys))
    ds.groupby('index').mean('stacked_lat_lon').to_dataframe().reset_index()

Produces:

StopIteration                             Traceback (most recent call last)
<ipython-input-91-f26bf31efda5> in <module>()
      6     ds = ds.set_coords('index')
      7     ds = ds.where(ds['index'].isin(cotton_county_keys))
----> 8     ds.groupby('index').mean('stacked_lat_lon').to_dataframe().reset_index()

~\AppData\Local\Continuum\anaconda3\lib\site-packages\xarray\core\common.py in wrapped_func(self, dim, keep_attrs, skipna, **kwargs)
     52                 return self.reduce(func, dim, keep_attrs, skipna=skipna,
     53                                    numeric_only=numeric_only, allow_lazy=True,
---> 54                                    **kwargs)
     55         else:
     56             def wrapped_func(self, dim=None, keep_attrs=False, **kwargs):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\xarray\core\groupby.py in reduce(self, func, dim, keep_attrs, **kwargs)
    652         def reduce_dataset(ds):
    653             return ds.reduce(func, dim, keep_attrs, **kwargs)
--> 654         return self.apply(reduce_dataset)
    655 
    656     def assign(self, **kwargs):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\xarray\core\groupby.py in apply(self, func, **kwargs)
    607         kwargs.pop('shortcut', None)  # ignore shortcut if set (for now)
    608         applied = (func(ds, **kwargs) for ds in self._iter_grouped())
--> 609         return self._combine(applied)
    610 
    611     def _combine(self, applied):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\xarray\core\groupby.py in _combine(self, applied)
    611     def _combine(self, applied):
    612         """Recombine the applied objects like the original."""
--> 613         applied_example, applied = peek_at(applied)
    614         coord, dim, positions = self._infer_concat_args(applied_example)
    615         combined = concat(applied, dim)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\xarray\core\utils.py in peek_at(iterable)
    113     """
    114     gen = iter(iterable)
--> 115     peek = next(gen)
    116     return peek, itertools.chain([peek], gen)
    117 

StopIteration: 

As does:

ds = xr.open_dataset(in_files[0])
ds['index'] = county_mask
ds = ds.set_coords('index')
ds = ds.where(ds['index'].isin(cotton_county_keys))
ds.groupby('index').mean('stacked_lat_lon').to_dataframe().reset_index()

However, a hard-coded file path works perfectly:

path = r'V:\ARL\Weather\Product_Development\US_PRISM_DATA\daily_temp\PRISM_daily_temp_1993-01-08'

ds = xr.open_dataset(path)
ds['index'] = county_mask
ds = ds.set_coords('index')
ds = ds.where(ds['index'].isin(cotton_county_keys))
ds.groupby('index').mean('stacked_lat_lon').to_dataframe().reset_index()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

xarray: 0.10.3
pandas: 0.22.0
numpy: 1.13.3
scipy: 1.0.1
netCDF4: 1.3.1
h5netcdf: None
h5py: 2.7.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.15.3
distributed: 1.19.1
matplotlib: 2.1.0
cartopy: 0.15.1
seaborn: 0.8.0
setuptools: 36.5.0.post20170921
pip: 9.0.1
conda: 4.4.6
pytest: 3.2.1
IPython: 6.1.0
sphinx: 1.6.3
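
The last frame of the traceback is `peek = next(gen)` inside `peek_at`, which suggests that the generator of per-group results handed to `_combine` yielded nothing. The sketch below reproduces only that step in plain Python. The body of `peek_at` follows the lines visible in the traceback; treating an empty generator as "groupby found no groups" is an assumption, not something the report confirms.

```python
import itertools


def peek_at(iterable):
    """Return the first item and an iterator that still yields every item.

    Follows the lines of xarray/core/utils.py shown in the traceback.
    """
    gen = iter(iterable)
    peek = next(gen)  # raises StopIteration when the iterable is empty
    return peek, itertools.chain([peek], gen)


# Non-empty input: the first element is returned and nothing is lost.
first, rest = peek_at(x * 2 for x in [1, 2, 3])
non_empty_result = (first, list(rest))

# Empty input, assumed here to stand in for a groupby with no groups.
try:
    peek_at(x * 2 for x in [])
    empty_raised = False
except StopIteration:
    empty_raised = True
```

If that reading is right, the exception comes from having zero groups to combine rather than from the loop over files itself.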

@shoyer
Member

shoyer commented Jun 20, 2018

Thanks for the report.

I believe this is the same issue as #1764

@shoyer shoyer added the bug label Jun 20, 2018
@lgpreston
Author

@shoyer is there any update on this? I don't quite understand the error, so I have so far been unable to develop a workaround.

@shoyer
Member

shoyer commented Sep 25, 2018

Sorry, I haven't had time to look into this yet.

@dcherian
Contributor

I think this has been fixed, since groupby now discards NaNs in the grouped variable.

Please reopen with a reproducible example if it has not been fixed.
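
For anyone still on an affected version, one possible guard is to check that the grouping variable has at least one non-NaN label before calling `groupby`. The sketch below is plain Python and does not use xarray. `valid_group_labels` is a hypothetical helper, not part of xarray, and the idea that an all-NaN `index` was the trigger in this report is an assumption.

```python
def valid_group_labels(labels):
    """Return the sorted unique labels, ignoring NaN entries.

    Hypothetical helper for illustration; relies on NaN != NaN.
    """
    return sorted({v for v in labels if v == v})


nan = float("nan")

# Some cells carry a county key, the rest are masked out as NaN.
mixed = valid_group_labels([1.0, nan, 2.0, 1.0])

# Every cell is NaN, so there would be no groups to reduce.
all_nan = valid_group_labels([nan, nan, nan])

# An empty result is the case in which the groupby call would be skipped.
should_skip = not all_nan
```

With xarray, the labels could be taken from something like `ds['index'].values.ravel()`, and a file whose labels are all NaN could be skipped inside the loop.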
