minimal DataFrame.apply call causes RecursionError #25196

Closed
moonshoes87 opened this issue Feb 6, 2019 · 2 comments · Fixed by #25230
Labels: DataFrame (DataFrame data structure), Regression (Functionality that used to work in a prior pandas version)

moonshoes87 commented Feb 6, 2019

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd
df = pd.DataFrame(np.zeros((100, 15)))
df.T.apply(dict)
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-153-4987478fba35> in <module>()
      3 #df = pd.DataFrame(np.zeros((2454, 15)))
      4 df = pd.DataFrame(np.zeros((100, 15)))
----> 5 df.T.apply(dict)

~/anaconda3/envs/pmagpy/lib/python3.6/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
   6485                          args=args,
   6486                          kwds=kwds)
-> 6487         return op.get_result()
   6488 
   6489     def applymap(self, func):

~/anaconda3/envs/pmagpy/lib/python3.6/site-packages/pandas/core/apply.py in get_result(self)
    113         if is_list_like(self.f) or is_dict_like(self.f):
    114             return self.obj.aggregate(self.f, axis=self.axis,
--> 115                                       *self.args, **self.kwds)
    116 
    117         # all empty

~/anaconda3/envs/pmagpy/lib/python3.6/site-packages/pandas/core/frame.py in aggregate(self, func, axis, *args, **kwargs)
   6286             pass
   6287         if result is None:
-> 6288             return self.apply(func, axis=axis, args=args, **kwargs)
   6289         return result
   6290 

... last 3 frames repeated, from the frame below ...

~/anaconda3/envs/pmagpy/lib/python3.6/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
   6485                          args=args,
   6486                          kwds=kwds)
-> 6487         return op.get_result()
   6488 
   6489     def applymap(self, func):

RecursionError: maximum recursion depth exceeded

Problem description

This code snippet did not cause a RecursionError in pandas 0.23.4 and earlier. It occurs with pandas 0.24.0 and 0.24.1.

This is possibly similar to #23568.

Expected Output

Pandas Series:

0     {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0....
1     {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0....
2     {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0....
...

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.1
pytest: 3.8.0
pip: 10.0.1
setuptools: 40.2.0
Cython: 0.28.5
numpy: 1.15.2
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.5.0
sphinx: 1.7.9
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 2.2.3
openpyxl: 2.5.6
xlrd: 1.1.0
xlwt: 1.2.0
xlsxwriter: 1.1.0
lxml.etree: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.11
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

jschendel commented

Thanks, I can confirm that this is broken on master:

In [1]: import pandas as pd; pd.__version__
Out[1]: '0.25.0.dev0+77.g51fca4cc9'

In [2]: df = pd.DataFrame([[0, 0], [0, 0]])

In [3]: df.apply(dict)
---------------------------------------------------------------------------
RecursionError: maximum recursion depth exceeded while calling a Python object

And it worked on 0.23.4:

In [1]: import pandas as pd; pd.__version__
Out[1]: '0.23.4'

In [2]: df = pd.DataFrame([[0, 0], [0, 0]])

In [3]: df.apply(dict)
Out[3]:
0    {0: 0, 1: 0}
1    {0: 0, 1: 0}
dtype: object

jschendel added the Regression and DataFrame labels Feb 7, 2019
jschendel added this to the Contributions Welcome milestone Feb 7, 2019

jschendel commented Feb 7, 2019

Issue appears to be caused by this line:

if is_list_like(self.f) or is_dict_like(self.f):

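Specifically, is_dict_like considers dict to be dict-like, so apply hands the call to aggregate, which falls back to apply again, and the two keep calling each other until the recursion limit is hit (the three repeating frames in the traceback). I think the intention is that is_dict_like should only look for initialized dict-like structures, and not the constructors themselves. Note that this is inconsistent with is_list_like, which does not consider list to be list-like: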

In [1]: from pandas.core.dtypes.common import is_dict_like, is_list_like

In [2]: is_dict_like(dict)
Out[2]: True

In [3]: is_list_like(list)
Out[3]: False
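
To illustrate why a wrong answer from that check turns into unbounded recursion, here is a minimal pure-Python model of the call chain seen in the traceback. It is only a sketch: `ToyFrame` is a hypothetical stand-in, not pandas code, and the `is_dict_like` below is a simplified duck-typing check whose attribute list is an assumption.

```python
def is_dict_like(obj):
    """Simplified stand-in for the duck-typing check (attribute list assumed)."""
    return all(hasattr(obj, attr) for attr in ("__getitem__", "keys", "__contains__"))


class ToyFrame:
    """Hypothetical miniature of the apply/aggregate round trip; not pandas code."""

    def __init__(self, columns):
        self.columns = columns  # mapping of column label -> list of values

    def apply(self, func):
        # Dict-like "functions" are routed to aggregate(), as in the traceback.
        if is_dict_like(func):
            return self.aggregate(func)
        # Otherwise call func once per column on (position, value) pairs.
        return {label: func(enumerate(values)) for label, values in self.columns.items()}

    def aggregate(self, func):
        result = None
        if isinstance(func, dict):
            # An actual dict instance maps column labels to functions.
            result = {label: f(self.columns[label]) for label, f in func.items()}
        if result is None:
            # Nothing could be aggregated: fall back to apply(), which for the
            # dict class itself routes straight back here.
            return self.apply(func)
        return result


frame = ToyFrame({0: [0, 0], 1: [0, 0]})

try:
    frame.apply(dict)
    outcome = "returned"
except RecursionError:
    outcome = "RecursionError"

print(outcome)
print(frame.apply({0: sum}))                  # a dict instance is aggregated normally
print(frame.apply(lambda pairs: dict(pairs)))  # a plain callable never enters aggregate()
```

In this model, only the `dict` class itself loops: it passes the dict-like check but is not a dict instance, so neither method ever handles it.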

So the fix is to modify is_dict_like to not consider dict (and other constructors) as dict-like. This requires a little more than just excluding dict itself, as other similar things, like the defaultdict constructor, should also not be considered dict-like:

In [4]: from collections import defaultdict

In [5]: is_dict_like(defaultdict)
Out[5]: True
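
One way the proposed change could look is sketched below in plain Python. This is an assumption about a possible approach, not necessarily what the linked fix does: the name `is_dict_like_sketch` is hypothetical, and the attribute list is assumed. The idea is to keep the duck-typing check but reject classes, which covers dict, defaultdict and any other constructor in one condition.

```python
from collections import OrderedDict, defaultdict

# Attributes assumed to mark an object as dict-like in this sketch.
DICT_LIKE_ATTRS = ("__getitem__", "keys", "__contains__")


def is_dict_like_sketch(obj):
    """Return True for dict-like instances, False for classes such as dict."""
    has_attrs = all(hasattr(obj, attr) for attr in DICT_LIKE_ATTRS)
    # Classes (constructors) are instances of `type`; exclude them.
    return has_attrs and not isinstance(obj, type)


candidates = [
    ("dict", dict),
    ("defaultdict", defaultdict),
    ("{}", {}),
    ("defaultdict(list)", defaultdict(list)),
    ("OrderedDict()", OrderedDict()),
    ("[]", []),
]

for name, obj in candidates:
    print(name, is_dict_like_sketch(obj))
```

Excluding anything that is an instance of `type` avoids maintaining a list of known constructors, and it makes the check consistent with the is_list_like behavior shown above.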
