BUG: _tidy_repr should not be called when max_rows is None #6863

Merged
merged 1 commit into from Apr 10, 2014

Projects

None yet

3 participants

Contributor
unutbu commented Apr 10, 2014

This issue was raised in http://stackoverflow.com/q/22824104/190597:

import pandas as pd
pd.options.display.max_rows = None
result = pd.Series(range(1001))
print(result)

raises TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'.

The problem occurs in series.py (line 832) when max_rows is None:

    if len(self.index) > (max_rows or 1000):
        result = self._tidy_repr(min(30, max_rows - 4))

Since the doc string for get_options says

display.max_rows: [default: 60] [currently: 60]
        ...
        'None' value means unlimited.

I think _tidy_repr should not be called when max_rows is None.

This PR seems simple enough but my main concern is that this PR stomps on GH1467 which explicitly changed > max_rows to > (max_rows or 1000) and I don't know what the purpose of this was.

Contributor
jreback commented Apr 10, 2014

can you add a test, something like:

with pd.option_context('max_rows',None):
     str(Series(range(1001)))

which would raise w/o this patch but pass with it

Contributor
unutbu commented Apr 10, 2014

test added.

Contributor
jreback commented Apr 10, 2014

can u post a before / after (maybe with defaults and maybe max_rows a bigger number)

to see what this is actually doing

Contributor
unutbu commented Apr 10, 2014

I'm sorry, I'm not following you. If max_rows is set to any int, e.g. 0, 60, 2000, then there is no Exception. The exception is raised only when max_rows is None.

Contributor
jreback commented Apr 10, 2014

so in reality, max_rows of 0 or None is the same effectively as max_rows of 1000 (when this patch is applied)

Contributor
unutbu commented Apr 10, 2014

Using master, max_rows == 0 would return a summarized repr only if len(series.index) > 1000.

After this patch is applied, max_rows == 0 would result in a summarized repr if len(series.index) > 0.

I'm not sure what max_rows = 0 is supposed to do. I could preserve the old behavior with

    if max_rows is not None and len(self.index) > (max_rows or 1000):

however.

Contributor
jreback commented Apr 10, 2014

i think max_rows=0 and max_rows=None are equivalent, so that is fine

Contributor
jreback commented Apr 10, 2014

can you add release note (ref this PR), in Bug Fixes I think

@jreback jreback added this to the 0.14.0 milestone Apr 10, 2014
Contributor
jreback commented Apr 10, 2014
Contributor
unutbu commented Apr 10, 2014

If max_rows = 0 and max_rows = None should behave the same, then perhaps I should change it to if max_rows instead of if max_rows is not None:

if max_rows and len(self.index) > max_rows:
    result = self._tidy_repr(min(30, max_rows - 4))
Contributor
jreback commented Apr 10, 2014

this options interplay always confuses me. that seems reasonable as what is the point of max_rows=0 anyhow, yes, I think that should be equivalent to None

Contributor
unutbu commented Apr 10, 2014

Added note to Bug Fixes in release.rst, and now using if max_rows to handle max_rows = 0 the same as max_rows = None.

Contributor

+1 to treating max_rows=0 the same as max_rows=None. The alternative of not printing any rows would be more surprising than printing all the rows.

@jreback jreback merged commit ea883e0 into pandas-dev:master Apr 10, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
Details
Contributor
jreback commented Apr 10, 2014

@unutbu thanks as always!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment