BUG: to_latex() output broken when the index has a name #10660

Closed
jakbaum opened this Issue Jul 23, 2015 · 11 comments

jakbaum commented Jul 23, 2015

Hey folks,

I posted this on SO and was asked to file a report here as well.

I'm trying to export pandas.DataFrame.describe() to LaTeX using the to_latex() method. This works fine as long as I don't apply the groupby() method beforehand. With a grouped DataFrame, the first row has no values, even though its label is count. Note that in the IPython notebook, the first row of a grouped DataFrame is used to show the name of the grouping variable.

I'm using pandas 0.16.2, Python 3.
Is this a bug or am I doing something wrong?

Cheers,
Jakob

Here are some examples:

Without groupby:

\begin{tabular}{lr}
\toprule
{} &    IS\_FEMALE \\
\midrule
count &  2267.000000 \\
mean  &     0.384649 \\
...
...
75\%   &     1.000000 \\
max   &     1.000000 \\
\bottomrule
\end{tabular}

With groupby:

\begin{tabular}{llr}
\toprule
  &       &    IS\_FEMALE \\
\midrule
0 & count &              \\     % <-- note missing value here
  & mean &  1134.000000 \\
  & std &     0.554674 \\
...
...
  & 75\% &     0.000000 \\
  & max &     0.000000 \\
\bottomrule
\end{tabular}

Output in the notebook: [screenshot not included]

Member

jorisvandenbossche commented Jul 23, 2015

Thanks for the report! Can you:

  • try to provide a small reproducible example? (so some code we can run that makes up a dummy dataframe and that reproduces the error)
  • check if it is an issue with groupby, or just with to_latex. For example, if you create a similar dataframe comparable to the output of the groupby by hand, and then export it to latex, do you experience the same error?

jakbaum commented Jul 23, 2015

Sure. This snippet re-creates the issue. Sorry for the messy DataFrame construction. It's the first time I've created one with numpy.

import pandas as pd
import numpy as np

cols = ['Group','Value']
group = np.random.randint(2, size=10)
values = np.random.random_sample(10)
df = pd.DataFrame([group, values]).T
df.columns = cols

print(df.groupby('Group').describe().to_latex())

I don't really know how to test your second point, to be honest. The first 'blank' row of a groupby is just visualization, I reckon?

Member

jorisvandenbossche commented Jul 23, 2015

Thanks for the reproducible example! That indeed triggers the error for me as well.

Here is a small DataFrame that shows the same error. It has nothing to do with groupby as such; groupby just creates a MultiIndex that to_latex handles incorrectly:


In [22]: df = pd.DataFrame({'a':[0,0,1,1], 'b':list('abab'), 'c':[1,2,3,4]})

In [23]: df = df.set_index(['a', 'b'])

In [24]: df
Out[24]:
     c
a b
0 a  1
  b  2
1 a  3
  b  4

In [25]: print(df.to_latex())
\begin{tabular}{llr}
\toprule
  &   &  c \\
\midrule
0 & a &    \\
  & b &  1 \\
1 & a &  2 \\
  & b &  3 \\
\bottomrule
\end{tabular}

It seems that all values are shifted down by one row, so the last value (4) is dropped.
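To make the shift easy to spot without reading the table by eye, here is a minimal sketch that parses the value column out of the LaTeX string and compares it with the expected values. The helper names (`parse_tabular_rows`, `value_column`) are made up for this illustration and are not part of pandas; the parser assumes a simple booktabs table like the one above, with no escaped `\&` inside cells.

```python
def parse_tabular_rows(latex):
    """Return the body rows (between \\midrule and \\bottomrule) as lists of cells.

    Hypothetical helper for illustration only; it does not handle escaped
    ampersands or multi-line cells.
    """
    rows = []
    in_body = False
    for line in latex.splitlines():
        stripped = line.strip()
        if stripped == r"\midrule":
            in_body = True
            continue
        if stripped == r"\bottomrule":
            break
        if in_body and stripped.endswith(r"\\"):
            # Drop the trailing row terminator, then split into cells.
            cells = stripped[:-2].split("&")
            rows.append([cell.strip() for cell in cells])
    return rows


def value_column(latex):
    """Return the last cell of every body row."""
    return [row[-1] for row in parse_tabular_rows(latex)]


# The output of print(df.to_latex()) copied from the session above.
BUGGY_OUTPUT = r"""\begin{tabular}{llr}
\toprule
  &   &  c \\
\midrule
0 & a &    \\
  & b &  1 \\
1 & a &  2 \\
  & b &  3 \\
\bottomrule
\end{tabular}"""

expected = ["1", "2", "3", "4"]  # column c of the DataFrame
actual = value_column(BUGGY_OUTPUT)

print(actual)                        # ['', '1', '2', '3']
print(actual == expected)            # False
print(actual[1:] == expected[:-1])   # True: same values, one row too low
```

The last comparison confirms that the values themselves are right and only their row position is off by one.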

Member

jorisvandenbossche commented Jul 23, 2015

It seems this has something to do with the index level names:

In [35]: df.index.names = [None, None]

In [36]: df
Out[36]:
     c
0 a  1
  b  2
1 a  3
  b  4

In [37]: print(df.to_latex())
\begin{tabular}{llr}
\toprule
  &   &  c \\
\midrule
0 & a &  1 \\
  & b &  2 \\
1 & a &  3 \\
  & b &  4 \\
\bottomrule
\end{tabular}

And possibly related: #9908
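Based on the session above, a possible stopgap on the affected versions is to clear the index names on a copy before exporting. This is only a sketch of a workaround, not a fix for the underlying bug; `without_index_names` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd


def without_index_names(df):
    """Return a copy of df whose index level names are all None.

    Hypothetical helper: the original DataFrame keeps its index names.
    """
    out = df.copy()
    # set_names returns a new index, so the original index is not mutated.
    out.index = out.index.set_names([None] * out.index.nlevels)
    return out


df = pd.DataFrame({"a": [0, 0, 1, 1], "b": list("abab"), "c": [1, 2, 3, 4]})
df = df.set_index(["a", "b"])

unnamed = without_index_names(df)
print(list(unnamed.index.names))  # [None, None]
print(list(df.index.names))       # ['a', 'b']
```

Calling `unnamed.to_latex()` should then behave like the unnamed-index case shown in `In [37]`, at the cost of losing the level names in the exported table.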

Contributor

jreback commented Jul 23, 2015

dupe of #2942?

@jorisvandenbossche jorisvandenbossche changed the title from .groupby().to_latex() output broken to BUG: to_latex() output broken when the index has a name Jul 23, 2015

Member

jorisvandenbossche commented Jul 23, 2015

No, I don't think so, as this one does not only apply to a multi-index:

In [45]: df = pd.DataFrame({'a':list('abc'), 'b':[1,2,3]})

In [46]: df = df.set_index(['a'])

In [47]: df
Out[47]:
   b
a
a  1
b  2
c  3

In [49]: print(df.to_latex())
\begin{tabular}{lr}
\toprule
{} &  b \\
\midrule
a &    \\
a &  1 \\
b &  2 \\
c &  3 \\
\bottomrule
\end{tabular}

So it is something with the index name.

jakbaum commented Jul 23, 2015

Is the proposed fix of #9908 implemented in 0.16.2?

Member

jorisvandenbossche commented Jul 23, 2015

@jakbaum yes, it is already in 0.16.1. But it does not fix this one. It possibly fixed a related issue, but I should look into that in more detail.

And you're very welcome to look into the problem if you want! It shouldn't be that hard, I think.

Contributor

jreback commented Jul 23, 2015

also #8336

jakbaum commented Jul 23, 2015

@jorisvandenbossche Your belief in my coding qualities honors me, but quite honestly: I don't think I'm capable of fixing this. I wouldn't even know how to start and I don't want to mess things up. Actually, I'm more of a copy-paste coder than anything else. :)

Member

jorisvandenbossche commented Jul 23, 2015

@jakbaum no problem, thanks for reporting it anyway!
