Description
A small, complete example of the issue
I am not sure I understand how label generation works in pandas, and I have had several problems with it in the past. I looked for documentation without success, so here is my problem. I am opening a ticket here because I was told on Gitter that it might be a bug, but I may just be misunderstanding something.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'A': 1.,
                   'B': pd.Timestamp('20130102'),
                   'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                   'D': np.array([3] * 4, dtype='int32'),
                   'E': pd.Categorical(["test", "train", "test", "train"]),
                   'F': 'foo'})
fig = plt.figure()
axes = fig.gca()
g = df.groupby(['B','F'])
print("Number of groups: %d" % len(g))
g.plot.line(ax=axes,
            legend=True,
            x="D",
            y="C")
plt.show()
Expected Output
I would expect to see only one label per group. There is one group here, but I see 2 labels instead (and only one line is plotted).
In the real example (a more complex program), I have 2 groups with a MultiIndex of 3 keys, and it generates 6 labels while plotting 2 lines.
Hence there seem to be as many labels as (number of groups) * (number of keys in the MultiIndex).
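For comparison, here is a minimal sketch of the behaviour I expected: one legend entry per group. It does not use g.plot.line at all; it loops over the groups and calls matplotlib's Axes.plot directly, using the group key as the label. The Agg backend and the str(key) label format are my own choices for illustration, not anything pandas does internally.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same data as in the example above, reduced to the columns that matter here.
df = pd.DataFrame({
    "B": pd.Timestamp("20130102"),
    "C": pd.Series(1, index=list(range(4)), dtype="float32"),
    "D": np.array([3] * 4, dtype="int32"),
    "F": "foo",
})

fig, axes = plt.subplots()

# Draw one line per group and label it with the group key (a tuple of B and F).
for key, group in df.groupby(["B", "F"]):
    axes.plot(group["D"], group["C"], label=str(key))

axes.legend()

# Collect what the legend will show: one handle and one label per plotted line.
handles, labels = axes.get_legend_handles_labels()
print("Number of legend labels: %d" % len(labels))
```

With this loop the number of legend entries matches the number of groups, which is what I expected from g.plot.line as well.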
Output of pd.show_versions()
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.0-26-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8
pandas: 0.19.0
nose: 1.3.7
pip: 8.1.2
setuptools: 26.1.1
Cython: 0.24.1
numpy: 1.11.1rc1
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.4.5
patsy: None
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.5
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: 4.5.1
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None