As of pandas 1.1 (still in development), you can pass a dropna argument to groupby. This should fix it going forward, but I'm not sure of the best strategy with previous versions.
I'd lean toward...
- updating the behavior to use `dropna=False` for pandas 1.1 (this will make it consistent with dplyr)
- at the very least for < 1.1, raising an explicit error when there are NAs in the group keys
- at best for < 1.1, having it operate while dropping NA rows (pandas' behavior) and raising a warning
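A rough sketch of how that version split could look, using only pandas itself. The helper names `pandas_at_least` and `groupby_keep_na` are hypothetical (they are not part of siuba or pandas), and the sketch assumes the keys are column names:

```python
import re
import warnings

import pandas as pd


def pandas_at_least(major, minor, version=None):
    """Return True if the (major, minor) part of the pandas version is >= the given pair."""
    version = pd.__version__ if version is None else version
    m = re.match(r"(\d+)\.(\d+)", version)
    return (int(m.group(1)), int(m.group(2))) >= (major, minor)


def groupby_keep_na(df, keys):
    """Hypothetical helper: group by column names, keeping NA keys where pandas allows it."""
    if pandas_at_least(1, 1):
        # pandas >= 1.1 accepts dropna=False, so NA keys form their own group
        return df.groupby(keys, dropna=False)

    # Older pandas: fall back to the default (NA rows dropped), but say so.
    key_list = [keys] if isinstance(keys, str) else list(keys)
    if df[key_list].isna().any().any():
        warnings.warn(
            "Group keys contain NA; pandas < 1.1 drops those rows when grouping.",
            UserWarning,
        )
    return df.groupby(keys)
```

Swapping the `warnings.warn` call for a `raise` would give the stricter "explicit error" option instead.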
~/.virtualenvs/siublocks-org/lib/python3.6/site-packages/siuba/dply/verbs.py in __call__(self, x)
92 res = x
93 for f in self.calls:
---> 94 res = f(res)
95 return res
96
~/.virtualenvs/siublocks-org/lib/python3.6/site-packages/siuba/siu.py in __call__(self, x)
193 return getattr(inst, *rest)
194 elif self.func == "__call__":
--> 195 return getattr(inst, self.func)(*rest, **kwargs)
196
197 # in normal case, get method to call, and then call it
~/.virtualenvs/siublocks-org/lib/python3.6/site-packages/siuba/dply/verbs.py in _mutate(__data, **kwargs)
283 # will drop all but original index
284 group_by_lvls = list(range(df.index.nlevels - 1))
--> 285 g_df = df.reset_index(group_by_lvls, drop = True).loc[orig_index].groupby(groupings)
286
287 return g_df
~/.virtualenvs/siublocks-org/lib/python3.6/site-packages/pandas/core/indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
1653 if not (ax.is_categorical() or ax.is_interval()):
1654 raise KeyError(
-> 1655 "Passing list-likes to .loc or [] with any missing labels "
1656 "is no longer supported, see "
1657 "https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike" # noqa:E501
Note that for some reason the above code succeeds on pandas v1.5.1 (I believe there's an issue open, and that this is a regression?), but we should still set dropna=False in all grouped operations.
Running code like below raises an error, because pandas by default drops group keys that are NA.
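The underlying pandas behavior can be seen without siuba. This is a minimal sketch with a made-up frame (`df`, `g`, and `x` are illustrative names, not taken from the original report), and the second call assumes pandas >= 1.1:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"g": ["a", "a", np.nan], "x": [1, 2, 3]})

# Default behavior: the row whose key is NA is left out of every group.
default_sizes = df.groupby("g").size()

# pandas >= 1.1 only: keep NA as its own group.
keep_na_sizes = df.groupby("g", dropna=False).size()

print(default_sizes)
print(keep_na_sizes)
```

With the default, the grouped result covers fewer rows than the original frame, so any later step that looks up the original index labels can hit labels that no longer exist.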
Traceback: