to_html vertically expands multiindex cells if there are empty strings #3547

Closed
al-yisun opened this issue May 8, 2013 · 5 comments
Labels
Bug Output-Formatting __repr__ of pandas objects, to_string

@al-yisun
Contributor

al-yisun commented May 8, 2013

Notice in the html below the second index column's 'a' field is given as <th rowspan="2" valign="top">a</th>, expanding over what should be an empty cell.

I actually found this using a pivot table, margins=True, as this creates a row keyed like ('All', '', '').

import pandas as pd
df = pd.DataFrame({'c1': ['a', 'b'], 'c2': ['a', ''], 'data': [1, 2]}).set_index(['c1', 'c2'])
print(df)
       data
c1 c2      
a  a      1
b         2

print(df.to_html())
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th></th>
      <th>data</th>
    </tr>
    <tr>
      <th>c1</th>
      <th>c2</th>
      <th></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>a</th>
      <th rowspan="2" valign="top">a</th>
      <td> 1</td>
    </tr>
    <tr>
      <th>b</th>
      <td> 2</td>
    </tr>
  </tbody>
</table>
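
To make the effect of that rowspan concrete, here is a minimal sketch that expands the reported `<tbody>` into a plain grid the way a browser would lay it out. It uses only the standard library's `html.parser`; the names `REPORTED_HTML`, `BodyGrid`, and `body_grid` are made up for this illustration, and the parser handles `rowspan` only (no `colspan`).

```python
from html.parser import HTMLParser

# The <tbody> from the report above, copied verbatim.
REPORTED_HTML = """
<table border="1" class="dataframe">
  <tbody>
    <tr>
      <th>a</th>
      <th rowspan="2" valign="top">a</th>
      <td> 1</td>
    </tr>
    <tr>
      <th>b</th>
      <td> 2</td>
    </tr>
  </tbody>
</table>
"""


class BodyGrid(HTMLParser):
    """Expand <tbody> rows into a grid, honouring rowspan (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows, each a list of cell texts
        self._carry = {}      # column -> (rows still to fill, text) for open rowspans
        self._in_body = False
        self._row = None      # column -> text for the row being built
        self._cell = None     # (rowspan, text parts) for the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self._in_body = True
        elif self._in_body and tag == "tr":
            self._row = {}
            # Cells spanning down from earlier rows occupy their columns first.
            for col, (left, text) in list(self._carry.items()):
                self._row[col] = text
                if left == 1:
                    del self._carry[col]
                else:
                    self._carry[col] = (left - 1, text)
        elif self._in_body and tag in ("th", "td"):
            span = int(dict(attrs).get("rowspan", "1"))
            self._cell = (span, [])

    def handle_data(self, data):
        if self._cell is not None:
            self._cell[1].append(data)

    def handle_endtag(self, tag):
        if tag == "tbody":
            self._in_body = False
        elif self._in_body and tag in ("th", "td") and self._cell is not None:
            span, parts = self._cell
            text = "".join(parts).strip()
            col = 0
            while col in self._row:   # first column not already occupied
                col += 1
            self._row[col] = text
            if span > 1:
                self._carry[col] = (span - 1, text)
            self._cell = None
        elif self._in_body and tag == "tr" and self._row is not None:
            self.rows.append([self._row[c] for c in sorted(self._row)])
            self._row = None


def body_grid(html):
    parser = BodyGrid()
    parser.feed(html)
    return parser.rows


for row in body_grid(REPORTED_HTML):
    print(row)
# ['a', 'a', '1']
# ['b', 'a', '2']
```

The second row renders as ('b', 'a') rather than ('b', ''), which is the mismatch with the printed frame.
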
@cpcloud
Member

cpcloud commented May 9, 2013

@y-p @jreback are you sure this is a bug? _get_level_lengths on line 742 of pandas/core/format.py explicitly tests for the empty string. Getting rid of that check causes tests to fail, and judging by the tests, this kind of output looks expected. If the tests are just copied-and-pasted output, though, then it probably is a bug.

@al-yisun
Contributor Author

al-yisun commented May 9, 2013

> pd.DataFrame({'c1': ['a', 'b'], 'c2': ['a', ''], 'data': [1, 2]}).set_index(['c1', 'c2'])

[Screenshot of the rendered table: screenshot_5_9_13_15_43]

This table is definitely incorrect.

@cpcloud
Member

cpcloud commented May 9, 2013

@Adeodatus I asked because the code seems to indicate that an empty string in a MultiIndex means "repeat the previous value", so I'm not sure I would say it's definitely incorrect. Seems more like a design choice issue. Of course, I could be completely wrong.
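
The "repeat the previous value" reading can be illustrated with a small sketch. This is not the pandas source, only a simplified, hypothetical stand-in for the kind of run-length counting described here: `level_lengths` and its `sentinel` parameter are assumptions made for the example, and it looks at a single index level in isolation.

```python
def level_lengths(labels, sentinel=""):
    """Map the start position of each run to its length for one index level.

    Any label equal to `sentinel` is treated as "same as the label above",
    so it extends the current run instead of starting a new one.
    """
    lengths = {}
    start = None
    for pos, label in enumerate(labels):
        if label != sentinel or start is None:
            start = pos
            lengths[start] = 1
        else:
            lengths[start] += 1
    return lengths


# The c2 level from the report: a real label 'a' followed by a real label ''.
print(level_lengths(["a", ""]))   # {0: 2}
# The c1 level has no empty strings, so each label keeps its own row.
print(level_lengths(["a", "b"]))  # {0: 1, 1: 1}
```

Under this reading the genuine `''` label is indistinguishable from a suppressed repeat, which is how a run length of 2 (and hence `rowspan="2"`) would arise for `'a'`.
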

@ghost

ghost commented May 10, 2013

Empty string labels are a mild abuse IMO. But this does work:

import pandas as pd
df = pd.DataFrame({'c1': ['', 'b'], 'c2': ['a', 'c'], 'data': [1, 2]}).set_index(['c1', 'c2'])
df.ix['']

So it's a bug. Using '' as a sentinel value was a way to reuse the existing formatting code.

fixed by bb4b640
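
One common way to avoid this kind of collision is to mark suppressed repeats with a dedicated object that no real label can equal. The sketch below shows that general technique only; it is not a description of what commit bb4b640 does, and `REPEAT`, `sparsify`, and `level_lengths` are hypothetical names that treat a single index level in isolation.

```python
# Unique marker: compared by identity, so it can never clash with a real label.
REPEAT = object()


def sparsify(labels):
    """Replace each label that repeats the one above it with REPEAT."""
    out = []
    prev = REPEAT
    for label in labels:
        out.append(REPEAT if label == prev else label)
        prev = label
    return out


def level_lengths(sparse):
    """Map the start position of each run to its length."""
    lengths = {}
    start = None
    for pos, label in enumerate(sparse):
        if label is REPEAT and start is not None:
            lengths[start] += 1
        else:
            start = pos
            lengths[start] = 1
    return lengths


# The empty-string label now gets its own one-row cell.
print(level_lengths(sparsify(["a", ""])))            # {0: 1, 1: 1}
# Genuine repeats are still merged.
print(level_lengths(sparsify(["x", "x", "", "y"])))  # {0: 2, 2: 1, 3: 1}
```

The trade-off is that the formatting step can no longer pass around plain strings everywhere; the marker has to be converted to an empty cell only at the point of output.
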

@point6013

> pd.DataFrame({'c1': ['a', 'b'], 'c2': ['a', ''], 'data': [1, 2]}).set_index(['c1', 'c2'])

[Screenshot of the rendered table: screenshot_5_9_13_15_43]

This table is definitely incorrect.

In this case, how can I move the data header down so that it aligns with c1 and c2?

This issue was closed.