BUG: Fix MI repr with long names #21655

Merged

TomAugspurger
Contributor

Closes #21180

In [4]: import os
   ...: import pandas as pd
   ...: import pytest
   ...:
   ...: try:
   ...:     from unittest import mock
   ...: except ImportError:
   ...:     mock = pytest.importorskip("mock")
   ...:
   ...: # Pretend the terminal is 118 columns wide and 96 lines tall.
   ...: terminal_size = os.terminal_size((118, 96))
   ...: p1 = mock.patch('pandas.io.formats.console.get_terminal_size',
   ...:                 return_value=terminal_size)
   ...: p2 = mock.patch('pandas.io.formats.format.get_terminal_size',
   ...:                 return_value=terminal_size)
   ...:
   ...: index = range(5)
   ...: columns = pd.MultiIndex.from_tuples([
   ...:     ('This is a long title with > 37 chars.', 'cat'),
   ...:     ('This is a loooooonger title with > 43 chars.', 'dog'),
   ...: ])
   ...: df = pd.DataFrame(1, index=index, columns=columns)
   ...:

In [5]: with p1, p2:
   ...:     print('-' * 80)
   ...:     print(repr(df))
   ...:     print('-' * 80)
   ...:

output:

--------------------------------------------------------------------------------
  ...
  ...
0 ...
1 ...
2 ...
3 ...
4 ...

[5 rows x 2 columns]
--------------------------------------------------------------------------------

This matches the repr for non-hierarchical columns:

In [6]: s = pd.DataFrame({"A" * 41: [1, 2], 'B' * 41: [1, 2]})

In [7]: with p1, p2:
   ...:     print('-' * 80)
   ...:     print(repr(s))
   ...:     print('-' * 80)
   ...:

output:

--------------------------------------------------------------------------------
  ...
0 ...
1 ...

[2 rows x 2 columns]
--------------------------------------------------------------------------------

These can certainly be improved, though I'm not sure we'll (I'll) get to it for 0.23.2.

@TomAugspurger TomAugspurger added this to the 0.23.2 milestone Jun 27, 2018
@TomAugspurger TomAugspurger added the Output-Formatting __repr__ of pandas objects, to_string label Jun 27, 2018
@TomAugspurger
Contributor Author

from @jorisvandenbossche

Overflowing the line is what is happening in my console if I make it smaller

Yeah, agreed that would be best, but I can't tell who's doing the wrapping, if that makes sense.

@TomAugspurger
Contributor Author

Well, I guess it has to be the terminal... So we should just always print two columns? And let the terminal wrap if needed?
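
The "always print two columns" idea can be sketched as a small clamp on the number of columns that fit. This is only an illustration, not the code in pandas: `visible_column_count` and its parameters are hypothetical names, and the widths are assumed to already include column separators.

```python
def visible_column_count(col_widths, index_width, terminal_width, min_cols=2):
    """Return how many data columns to print (hypothetical helper).

    Counts the leading columns that fit next to the index, then clamps
    the result so that at least `min_cols` columns are shown whenever
    that many exist. Any overflow is left for the terminal to wrap.
    """
    used = index_width
    fitting = 0
    for width in col_widths:
        if used + width > terminal_width:
            break
        used += width
        fitting += 1
    # Never drop below min_cols, but never exceed the columns available.
    return max(fitting, min(min_cols, len(col_widths)))


# Two wide columns in an 80-character terminal: only one fits,
# but both are still printed.
print(visible_column_count([43, 43], index_width=1, terminal_width=80))  # -> 2
```

Without the clamp, the same inputs would leave a single column, which is the situation where the truncated repr collapses to `...`.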

@codecov

codecov bot commented Jun 27, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@1cc5471).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master   #21655   +/-   ##
=========================================
  Coverage          ?    91.9%           
=========================================
  Files             ?      154           
  Lines             ?    49657           
  Branches          ?        0           
=========================================
  Hits              ?    45638           
  Misses            ?     4019           
  Partials          ?        0
Flag         Coverage Δ
#multiple    90.28% <100%> (?)
#single      42.05% <75%> (?)

Impacted Files                 Coverage Δ
pandas/io/formats/format.py    98.25% <100%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1cc5471...4391c24. Read the comment docs.

@TomAugspurger
Contributor Author

Running

import pandas as pd
import numpy as np

s = pd.DataFrame({"A" * 41: 1, "B" * 41: 1}, index=[0, 1, 2])
print("Regular Columns:")
print(repr(s))
print('\n\n')

index = range(5)
columns = pd.MultiIndex.from_tuples([
    ('This is a long title with > 37 chars.', 'cat'),
    ('This is a loooooonger title with > 43 chars.', 'dog'),
])
df = pd.DataFrame(1, index=index, columns=columns)

print("MultiIndex")
print(repr(df))

output

Regular Columns:
   AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA  BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
0                                          1                                          1
1                                          1                                          1
2                                          1                                          1



MultiIndex
  This is a long title with > 37 chars. This is a loooooonger title with > 43 chars.
                                    cat                                          dog
0                                     1                                            1
1                                     1                                            1
2                                     1                                            1
3                                     1                                            1
4                                     1                                            1

@jorisvandenbossche
Member

The output in the top post is no longer correct with the latest changes?

@TomAugspurger
Contributor Author

Correct, #21655 (comment) has the current output.

# TODO: use mock fixture.
# This is being backported, so doing it directly here.
try:
    from unittest import mock
Contributor

should PR in 0.24.0 (to move this to test_decorators)

Contributor Author

Forgot to lay out my plan.

I'd like to merge this with the try / except, backport to 0.23.2, and then make a PR removing this and using the mock fixture from #20729

#20729 isn't being backported, so that seems easiest.
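
For readers unfamiliar with the patching pattern in the test, here is a minimal sketch that runs without pandas. It patches `shutil.get_terminal_size` as a stand-in; the PR's test patches `pandas.io.formats.console.get_terminal_size` and `pandas.io.formats.format.get_terminal_size` instead, presumably because each module looks the function up under its own name. `patched_terminal_size` is a hypothetical helper, not part of pandas or of the mock fixture from #20729.

```python
import os
import shutil
from unittest import mock


def patched_terminal_size(columns, lines):
    """Return a patcher that fakes the terminal size (hypothetical helper)."""
    size = os.terminal_size((columns, lines))
    # Stand-in target: the pandas test patches its own module-level names.
    return mock.patch("shutil.get_terminal_size", return_value=size)


# The fake size is only visible inside the `with` block;
# the real function is restored on exit.
with patched_terminal_size(118, 96):
    width = shutil.get_terminal_size().columns

print(width)  # -> 118
```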

@@ -640,6 +640,8 @@ def to_string(self):
col_lens = col_lens.drop(mid_ix)
n_cols = len(col_lens)
max_cols_adj = n_cols - self.index # subtract index column
Contributor

maybe make the comments consistent here

# subtract index column
max_cols_adj = ...

# ensure we print at least two
....

@jorisvandenbossche
Member

I can confirm that this fixes #21327 (the one that I could reproduce in my local console without any hacks)

@jorisvandenbossche jorisvandenbossche merged commit ad76ffc into pandas-dev:master Jul 2, 2018
jorisvandenbossche pushed a commit to jorisvandenbossche/pandas that referenced this pull request Jul 2, 2018
jorisvandenbossche pushed a commit that referenced this pull request Jul 5, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018