Stack overflow on applying numpy functions to DataFrame with duplicated column entries. #11611

Closed
skycaptain opened this Issue Nov 16, 2015 · 2 comments


skycaptain commented Nov 16, 2015

Applying a numpy function such as np.round to a DataFrame with duplicated column indices can cause an unrecoverable stack overflow (Fatal Python error: Cannot recover from stack overflow.), which crashes e.g. an IPython kernel. Take the following example, where Python crashes at the final line (the np.round call):

import numpy as np
import pandas as pd

x = pd.DataFrame(np.random.randn(3, 3))
y = pd.DataFrame(np.random.randn(3, 3))
z = pd.concat((x, y), axis=1)  # column labels are now 0, 1, 2, 0, 1, 2
print(np.round(z))

However, after removing the duplicate column entries, it works as expected:

...
z = pd.concat((x, y), axis=1, ignore_index=True)
print(np.round(z))

python 3.5.0, numpy 1.10.1, pandas 0.17.0
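
The two snippets above differ only in whether the concatenated column labels are unique. As a minimal sketch (not part of the original report), this can be checked with `Index.is_unique` and `Index.duplicated()`; the variable names `dup` and `uniq` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame(np.random.randn(3, 3))
y = pd.DataFrame(np.random.randn(3, 3))

# Plain concatenation keeps each frame's own labels, so 0, 1, 2 appear twice.
dup = pd.concat((x, y), axis=1)
# ignore_index=True relabels the concatenation axis as 0..n-1.
uniq = pd.concat((x, y), axis=1, ignore_index=True)

dup_labels = list(dup.columns)
dup_is_unique = dup.columns.is_unique
# Marks every repeat of a label after its first occurrence.
dup_mask = dup.columns.duplicated().tolist()

uniq_labels = list(uniq.columns)
uniq_is_unique = uniq.columns.is_unique

print(dup_labels, dup_is_unique, dup_mask)
print(uniq_labels, uniq_is_unique)
```

A check like `dup.columns.is_unique` before calling np.round would flag the affected frames on the versions listed above.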

jreback commented Nov 16, 2015

this is specifically with np.round, which ends up calling DataFrame.round, which does not handle duplicates properly.

iteration needs to use .iteritems(), which correctly handles duplicate columns, rather than selecting each column by label

pull-requests to fix are welcome
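
To illustrate the distinction drawn in the comment above, here is a small sketch of how label selection and column iteration behave when labels are duplicated. It is not the code from the actual fix. It assumes a recent pandas, where the iterator is spelled `DataFrame.items()`; `.iteritems()` was the name at the time of this issue and was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame(np.zeros((3, 3)))
y = pd.DataFrame(np.ones((3, 3)))
z = pd.concat((x, y), axis=1)  # column labels: 0, 1, 2, 0, 1, 2

# Selecting by a duplicated label returns a DataFrame holding every
# matching column, not a single Series.
selected = z[0]
selected_type = type(selected).__name__
selected_shape = selected.shape

# Iterating yields one (label, Series) pair per column position,
# so duplicated labels are simply visited twice.
pairs = list(z.items())
labels = [label for label, _ in pairs]
column_types = {type(col).__name__ for _, col in pairs}

print(selected_type, selected_shape)
print(labels, column_types)
```

Code that expects `z[label]` to be a Series therefore sees a DataFrame for duplicated labels, whereas iteration by position always hands back one column at a time.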

jreback added this to the Next Major Release milestone Nov 16, 2015

jreback modified the milestone: 0.17.1, Next Major Release Nov 20, 2015

jreback added a commit that referenced this issue Nov 20, 2015

skycaptain + jreback BUG: fix col iteration in DataFrame.round, #11611
BUG: decimals must be unique indexed, #11618

BUG: Added test, added whatsnew entry, #11618

TST: move round testing to test_format.py
80a2d53

jreback commented Nov 20, 2015

closed by #11618

jreback closed this Nov 20, 2015

yarikoptic added a commit to neurodebian/pandas that referenced this issue Dec 3, 2015

yarikoptic Merge tag 'v0.17.1' into debian
Version 0.17.1

* tag 'v0.17.1': (168 commits)
  add nbviewer link
  Revert "DOC: fix sponsor notice"
  DOC: a few touchups
  DOC: fix sponsor notice
  DOC: warnings and remove HTML
  COMPAT: compat of scalars on all platforms, xref #11638
  DOC: fix build errors/warnings
  DOC: whatsnew edits
  DOC: fix link syntax
  DOC: update release.rst / whatsnew edits
  BUG: fix col iteration in DataFrame.round, #11611
  DOC: Clarify foramtting
  BUG: #11638 return correct dtype for int and float
  BUG: #11637 fix to_csv incorrect output.
  DOC: sponsor notice
  BUG: indexing with a range , #11652
  Fix link to numexpr
  ENH: fixup tilde expansion, xref #11438
  ENH: tilde expansion for write output formatting functions, #11438
  DOC: fix up doc-string creations in generic.py
  ...
9b2e35f