Description
MCVE Code Sample
import numpy as np
import pandas as pd
import xarray as xr
from functools import reduce
# Create the time index for all climatological winters
indexes = [pd.date_range(f'{year}-12-01', f'{year+1}-03-01', freq='12H')
           for year in range(1980, 2020)]
index_union = reduce(pd.Index.union, indexes)
# Create the Dataset
ds = xr.Dataset({'var': ('time', np.arange(len(index_union))), 'time': index_union})
Problem Description
We can check which months appear in the time dimension of ds:
>>> pd.DatetimeIndex(ds.time.values).month.unique()
Int64Index([12, 1, 2, 3], dtype='int64')
Now, if we downsample our Dataset to a one-week period:
>>> pd.DatetimeIndex(ds.resample(time='1W').mean().time.values).month.unique()
Int64Index([12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], dtype='int64')
Instead of getting only the weeks that fall within the months of the original Dataset, we obtain additional months whose values in the var array are missing. One way to work around this is the dropna() method, but it is a slow approach.
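For reference, here is a minimal sketch of two possible workarounds. The extra months appear because resample builds one contiguous run of weekly bins from the first to the last timestamp, so the gaps between winters become empty bins. The sketch assumes a shortened four-winter version of the dataset above. The names step, week_end, weekly_dropna and weekly_grouped are only illustrative, and the groupby variant is my own suggestion, not something xarray documents as the fix for this issue. Whether it is actually faster than dropna() is not measured here.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Smaller version of the dataset above: four winters at 12-hour resolution.
# A Timedelta is used for the step to avoid depending on frequency-alias spelling.
step = pd.Timedelta(hours=12)
indexes = [pd.date_range(f"{year}-12-01", f"{year + 1}-03-01", freq=step)
           for year in range(1980, 1984)]
index_union = indexes[0]
for ix in indexes[1:]:
    index_union = index_union.union(ix)
ds = xr.Dataset({"var": ("time", np.arange(len(index_union)))},
                coords={"time": index_union})

# Workaround 1: resample, then drop the all-NaN weeks between winters.
weekly_dropna = ds.resample(time="1W").mean().dropna(dim="time", how="all")

# Workaround 2 (assumption: grouping on a precomputed label is acceptable):
# label each timestamp with the Sunday that ends its Monday-to-Sunday week
# and group on that label, so only weeks that contain data are created.
week_end = index_union.to_period("W").end_time.normalize()
weekly_grouped = (ds.assign_coords(week=("time", week_end))
                    .groupby("week").mean())

months_dropna = sorted(set(weekly_dropna.time.dt.month.values.tolist()))
months_grouped = sorted(set(weekly_grouped.week.dt.month.values.tolist()))
print(months_dropna)
print(months_grouped)
```

Both month lists should contain only the winter months (1, 2, 3 and 12). Note that the grouped result is indexed by week rather than time, so it may need renaming before being combined with other data.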
Expected Output
We would expect something like this:
>>> pd.DatetimeIndex(ds.resample(time='1W').mean().time.values).month.unique()
Int64Index([12, 1, 2, 3], dtype='int64')
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 (default, Jan 8 2020, 19:59:22)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-957.12.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.3

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.7.7
iris: 2.3.0
bottleneck: None
dask: 2.9.2
distributed: 2.9.3
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 44.0.0.post20200106
pip: 19.3.1
conda: None
pytest: None
IPython: 7.11.1
sphinx: None