ENH: Better string representation for MultiIndex #4935

Merged
merged 1 commit into from
Sep 27, 2013


jtratner
Contributor

Fixes #3347.

All interested parties are invited to submit test cases. cc @y-p @hayd


Examples:

In [1]: import pandas as pd

In [2]: pd.MultiIndex.from_arrays([[1, 1, 1, 1], [1, 3, 5, 7], [9, 9, 1, 1]])
Out[2]:
MultiIndex(levels=[[1], [1, 3, 5, 7], [1, 9]]
           labels=[[0, 0, 0, 0], [0, 1, 2, 3], [1, 1, 0, 0]])

In [3]: mi = _

In [4]: mi.names = list('abc')

In [5]: print mi
a  b  c
1  1  9
   3  9
   5  1
   7  1
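The sparsified layout above can be approximated in plain Python. This is only a rough sketch of the idea, not the code in this PR: `sparsify` and `render` are hypothetical helper names, and the rule used (blank a value only when it and every level to its left repeat the previous row) is an assumption about how the display behaves.

```python
def sparsify(rows):
    """Blank a cell when it and all cells to its left repeat the previous row.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = []
    prev = None
    for row in rows:
        shown = []
        same_prefix = prev is not None
        for i, value in enumerate(row):
            # Once one level differs, every level to its right is printed.
            same_prefix = same_prefix and value == prev[i]
            shown.append("" if same_prefix else str(value))
        out.append(shown)
        prev = row
    return out


def render(rows, names=None):
    """Left-align the sparsified cells in columns, with an optional names header."""
    cells = sparsify(rows)
    if names is not None:
        cells = [[str(n) for n in names]] + cells
    widths = [max(len(r[i]) for r in cells) for i in range(len(cells[0]))]
    return "\n".join(
        "  ".join(c.ljust(w) for c, w in zip(r, widths)).rstrip() for r in cells
    )


rows = [(1, 1, 9), (1, 3, 9), (1, 5, 1), (1, 7, 1)]
print(render(rows, names="abc"))
```

Under this prefix rule, a level whose value changes on every row prevents all levels to its right from being blanked.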

And with too many rows:

In [10]: pd.set_option('display.max_rows', 15)

In [11]: lst1 = [1] * 3 + [2] * 5 + [3] * 2

In [12]: lst1
Out[12]: [1, 1, 1, 2, 2, 2, 2, 2, 3, 3]

In [13]: lst2 = ['a'] * 6 + ['b'] * 3 + ['c'] * 1

In [14]: lst2
Out[14]: ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c']

In [15]: mi = pd.MultiIndex.from_arrays([lst1 * 10, lst2 * 10, range(100)]); mi
Out[15]:
MultiIndex(levels=[[1, 2, 3], [u'a', u'b', u'c'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]]
           labels=[[0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2], [0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [16]: mi = _

In [17]: print mi

1  a  0
      1
      2
2  a  3
      4
      5
  ...
2  a  93
      94
      95
   b  96
      97
3  b  98
   c  99
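The row truncation shown above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `truncate_rows` is a hypothetical helper, and the even head/tail split it uses is a guess that does not exactly match the split in the output above.

```python
def truncate_rows(lines, max_rows, marker="..."):
    """Keep the first and last lines and put a marker where rows were dropped.

    Hypothetical helper for illustration; the head/tail split is an assumption.
    """
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    lines = list(lines)
    if len(lines) <= max_rows:
        return lines
    keep = max_rows - 1          # one output line is reserved for the marker
    head = keep // 2
    tail = keep - head
    return lines[:head] + [marker] + lines[len(lines) - tail:]


formatted = [str(i) for i in range(100)]
print("\n".join(truncate_rows(formatted, max_rows=15)))
```

Truncating the already formatted lines keeps the sketch short; a real implementation would probably need to re-sparsify the tail so that its first row shows every level, as the output above does.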

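I'm not sure how to handle an index that is too wide to display (wrap it?). Note that format() does not sparsify any level that comes after a level with a value in every row, so the repeated levels in the example below are printed in full.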

In [19]: mi = pd.MultiIndex.from_arrays([lst1 * 10, lst2 * 10, range(100)] * 10)
In [20]: print mi

1  a  0   1  a  0   1  a  0   1  a  0   1  a  0   1  a  0   1  a  0   1  a  0   1  a  0   1  a  0
      1   1  a  1   1  a  1   1  a  1   1  a  1   1  a  1   1  a  1   1  a  1   1  a  1   1  a  1
      2   1  a  2   1  a  2   1  a  2   1  a  2   1  a  2   1  a  2   1  a  2   1  a  2   1  a  2
2  a  3   2  a  3   2  a  3   2  a  3   2  a  3   2  a  3   2  a  3   2  a  3   2  a  3   2  a  3
      4   2  a  4   2  a  4   2  a  4   2  a  4   2  a  4   2  a  4   2  a  4   2  a  4   2  a  4
      5   2  a  5   2  a  5   2  a  5   2  a  5   2  a  5   2  a  5   2  a  5   2  a  5   2  a  5
                                               ...
2  a  93  2  a  93  2  a  93  2  a  93  2  a  93  2  a  93  2  a  93  2  a  93  2  a  93  2  a  93
      94  2  a  94  2  a  94  2  a  94  2  a  94  2  a  94  2  a  94  2  a  94  2  a  94  2  a  94
      95  2  a  95  2  a  95  2  a  95  2  a  95  2  a  95  2  a  95  2  a  95  2  a  95  2  a  95
   b  96  2  b  96  2  b  96  2  b  96  2  b  96  2  b  96  2  b  96  2  b  96  2  b  96  2  b  96
      97  2  b  97  2  b  97  2  b  97  2  b  97  2  b  97  2  b  97  2  b  97  2  b  97  2  b  97
3  b  98  3  b  98  3  b  98  3  b  98  3  b  98  3  b  98  3  b  98  3  b  98  3  b  98  3  b  98
   c  99  3  c  99  3  c  99  3  c  99  3  c  99  3  c  99  3  c  99  3  c  99  3  c  99  3  c  99

And with longer string labels:

In [9]: labels = [('foo', '2012-07-26T00:00:00', 'b5c2700'),
   ...: ('foo', '2012-08-06T00:00:00', '900b2ca'),
   ...: ('foo', '2012-08-15T00:00:00', '07f1ce0'),
   ...: ('foo', '2012-09-25T00:00:00', '5c93e83'),
   ...: ('foo', '2012-09-25T00:00:00', '9345bba')]

In [10]: print pd.MultiIndex.from_tuples(labels)

foo  2012-07-26T00:00:00  b5c2700
     2012-08-06T00:00:00  900b2ca
     2012-08-15T00:00:00  07f1ce0
     2012-09-25T00:00:00  5c93e83
                          9345bba

@cpcloud
Member

cpcloud commented Sep 22, 2013

+1 ... the current MultiIndex repr always makes me feel a bit strange

@jreback
Contributor

jreback commented Sep 22, 2013

looks ok

jtratner added a commit that referenced this pull request Sep 27, 2013
ENH: Better string representation for MultiIndex
@jtratner jtratner merged commit 6260e29 into pandas-dev:master Sep 27, 2013
@jtratner jtratner deleted the multiindex-smart-repr branch September 27, 2013 00:30
Development

Successfully merging this pull request may close these issues.

MultiIndex smart repr fmt doesn't work well for complex examples
3 participants