For a DataFrame, I want to keep the rows that belong to groups fulfilling a specific condition and replace all other rows with NaN. I used a combination of 'groupby' and 'filter' (with dropna=False). In the special case where no group fulfils the condition, an exception occurred.
The problem I have identified is in the _apply_filter method of the _GroupBy class (core/groupby.py): the line "mask[indices.astype(int)] = True" throws because in my case indices is equal to [], and a plain Python list has no astype method. Shouldn't it be "indices = np.array([])" instead of "indices = []" when len(indices) == 0?
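The difference between the two empty containers can be checked in isolation. The sketch below uses a stand-in five-element mask rather than pandas internals, so it only illustrates the failing line, not the full filter path:

```python
import numpy as np

# Stand-in for the boolean mask built inside _apply_filter (length is arbitrary).
mask = np.zeros(5, dtype=bool)

# Case 1: a plain list, as in the current code. Lists have no .astype method,
# so the indexing expression fails before any assignment happens.
indices = []
try:
    mask[indices.astype(int)] = True
except AttributeError as exc:
    error_message = str(exc)

# Case 2: an empty ndarray, as proposed. .astype(int) yields an empty integer
# array, and assigning through it selects no elements.
indices = np.array([])
mask[indices.astype(int)] = True

print(error_message)
print(mask)
```

With the empty ndarray the mask stays entirely False, which is what the dropna=False branch needs in order to turn every row into NaN.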
    def _apply_filter(self, indices, dropna):
        if len(indices) == 0:
            indices = []
        else:
            indices = np.sort(np.concatenate(indices))
        if dropna:
            filtered = self._selected_obj.take(indices, axis=self.axis)
        else:
            mask = np.empty(len(self._selected_obj.index), dtype=bool)
            mask.fill(False)
            mask[indices.astype(int)] = True
            # mask fails to broadcast when passed to where; broadcast manually.
            mask = np.tile(mask, list(self._selected_obj.shape[1:]) + [1]).T
            filtered = self._selected_obj.where(mask)  # Fill with NaNs.
        return filtered
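A standalone sketch of the mask-building step with the proposed change applied might look like this. The names `build_filter_mask` and `n_rows` are hypothetical and do not exist in pandas, and giving the empty array an explicit `int64` dtype is an assumption made here so that it is integer-typed from the start; plain `np.array([])` followed by `.astype(int)` would behave the same way.

```python
import numpy as np


def build_filter_mask(indices, n_rows):
    """Return a boolean row mask from per-group index arrays.

    Hypothetical helper mirroring the dropna=False branch of _apply_filter.
    `indices` is a list of integer arrays, one per group that passed the filter.
    """
    if len(indices) == 0:
        # Proposed change: an empty ndarray instead of an empty list,
        # so the .astype call below is valid when no group passes.
        indices = np.array([], dtype="int64")
    else:
        indices = np.sort(np.concatenate(indices))

    mask = np.zeros(n_rows, dtype=bool)
    mask[indices.astype(int)] = True
    return mask


# No group passed the filter: every row is masked out.
print(build_filter_mask([], 4))

# Two groups passed, covering rows 0, 2 and 3.
print(build_filter_mask([np.array([2, 0]), np.array([3])], 4))
```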
jreback changed the title from "filter (with dropna=False) when there are no groups fulfilling the condition" to "BUG: filter (with dropna=False) when there are no groups fulfilling the condition" on Apr 1, 2016
Code Sample, a copy-pastable example if possible
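A minimal sketch of the scenario described in this report, assuming hypothetical column names `key` and `value` and an arbitrary threshold that no group can reach, might look like the following. It is an illustration of the described case, not the reporter's original sample.

```python
import pandas as pd

# Hypothetical data: two groups whose sums (3 and 7) are both far below 100.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "value": [1, 2, 3, 4],
})

# No group fulfils the condition. With dropna=False the intent is to keep the
# original shape and replace the rows of rejected groups with NaN.
# On a pandas version affected by this report, this call is where the
# exception would surface.
result = df.groupby("key").filter(
    lambda group: group["value"].sum() > 100,
    dropna=False,
)

print(result)
```

On a version without the problem, the intended outcome is a frame with the same shape as `df` in which every value is NaN.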
Expected Output
output of pd.show_versions()