BUG: Updated _pprint_seq to use enumerate instead of next #57295

Merged
merged 3 commits into pandas-dev:main on Feb 9, 2024

@cliffckerr commented on Feb 7, 2024

This PR closes an edge-case bug that has been around for almost 7 years.

To print a DataFrame, the function _pprint_seq() constructs a string representation of an iterable object (called seq) by creating an iterator over it, and truncating it if len(seq) > max_seq_items.

However, a pandas DataFrame is an example of an object where len(seq) does not match the number of items its iterator yields. Specifically, len(df) returns the number of rows, while iter(df) iterates over the columns. When trying to print a DataFrame with more rows than columns, the iterator is exhausted before len(seq) items have been read, which raises a StopIteration exception.
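The mismatch can be reproduced without pandas. The sketch below is only an illustration: FrameLike is a hypothetical stand-in that mimics the two DataFrame behaviours described above, and take_by_len is a simplified imitation of the "trust len(seq), then call next()" pattern, not the actual _pprint_seq() source.

```python
class FrameLike:
    """Hypothetical stand-in for a DataFrame (not a pandas class).

    len() reports the number of rows, while iter() yields column labels.
    """

    def __init__(self, columns, n_rows):
        self.columns = list(columns)
        self.n_rows = n_rows

    def __len__(self):
        return self.n_rows

    def __iter__(self):
        return iter(self.columns)


def take_by_len(seq, max_seq_items):
    """Simplified imitation of the old pattern: call next() len(seq) times."""
    iterator = iter(seq)
    items = []
    for _ in range(min(max_seq_items, len(seq))):
        # Raises StopIteration once the iterator has fewer items than len(seq).
        items.append(next(iterator))
    return items


# One column but two rows: the second next() call has nothing left to return.
frame = FrameLike(columns=["level1"], n_rows=2)
try:
    take_by_len(frame, max_seq_items=10)
except StopIteration:
    print("StopIteration: iterator ran out before len(seq) items were read")
```

For an ordinary list, where len() and iteration agree, the same helper works as expected, which is why the bug only surfaces for objects like a DataFrame.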

This PR fixes this bug by explicitly iterating over the object, rather than assuming that len(seq) is equal to the number of items in the object. The new test test_nested_dataframe() raises an exception on main, but passes on this branch.
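As a rough sketch of the approach described here (not the merged diff itself), iterating with enumerate lets the loop stop either at max_seq_items or when the iterator is exhausted, without ever consulting len(seq). The names take_by_enumerate and FrameLike are hypothetical and exist only for this illustration.

```python
class FrameLike:
    """Hypothetical stand-in for a DataFrame: len() counts rows, iter() yields columns."""

    def __init__(self, columns, n_rows):
        self.columns = list(columns)
        self.n_rows = n_rows

    def __len__(self):
        return self.n_rows

    def __iter__(self):
        return iter(self.columns)


def take_by_enumerate(seq, max_seq_items):
    """Collect at most max_seq_items items by iterating explicitly.

    Returns the collected items and a flag saying whether the sequence
    was cut short, so a caller could append an ellipsis if it wanted to.
    """
    items = []
    truncated = False
    for index, item in enumerate(seq):
        if index >= max_seq_items:
            truncated = True
            break
        items.append(item)
    return items, truncated


# len() says 2, but only one item is yielded; no exception is raised.
print(take_by_enumerate(FrameLike(["level1"], n_rows=2), max_seq_items=10))
# A plain list longer than the limit is truncated and flagged.
print(take_by_enumerate([1, 2, 3], max_seq_items=2))
```

The design choice is that the loop's own termination decides how many items exist, so the result stays correct even when len() and iteration disagree.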

@simonjayhawkins added the Bug and Output-Formatting (__repr__ of pandas objects, to_string) labels on Feb 7, 2024
def test_nested_dataframe(self):
    df1 = DataFrame({"level1": [["row1"], ["row2"]]})
    df2 = DataFrame({"level3": [{"level2": df1}]})
    df2.to_string()
Member

Could you assert the result of this?

Contributor Author

Done!

@phofl added this to the 3.0 milestone on Feb 9, 2024
@phofl merged commit 767a9a7 into pandas-dev:main on Feb 9, 2024
47 checks passed

phofl commented Feb 9, 2024

thx @cliffckerr

Successfully merging this pull request may close these issues.

ERR: DataFrame can get into unprintable state
4 participants