Description
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import matplotlib.pyplot as plt
# I am using Jupyter with the notebook backend, enabled via %matplotlib notebook
data = {'y': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        'x': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        'layer': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'c']}
df = pd.DataFrame(data)
plt.figure()
df.groupby("layer").plot(x='x', y='y', ax= plt.gca(), kind='line') # this changes the color automatically
plt.figure()
df.groupby("layer").plot(x='x', y='y', ax= plt.gca(), kind='scatter') # this does not
Issue Description
I was trying to write an answer on Stack Overflow and noticed the following behaviour:
Using df.groupby("layer").plot(x='x', y='y', ax= plt.gca(), kind='line') changes the color for every new group automatically.
This does not happen with kind="scatter": the color stays the same for every group.
I think my version is a bit older than the latest one, but I could not find any similar issue on GitHub.
Expected Behavior
I expect the color to change for kind='scatter' as well. For reference, the following code produces the output I expect:
plt.figure()
for layer in df['layer'].unique():  # group by layer
    subDf = df[df['layer'] == layer]  # get subset of dataframe
    plt.scatter(subDf['x'], subDf['y'], label=layer)  # plot
# add labels and such
plt.xlabel('x')
plt.ylabel('y')
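Until the scatter path follows the color cycle, a similar result can probably be obtained while staying within the pandas plotting API. The following is only a minimal sketch, not a fix in pandas itself: the helper name scatter_by_group is hypothetical, and it assumes the default matplotlib property cycle (which has a 'color' key). It iterates over the groups and passes one color from the cycle, plus the group name as the legend label, to each scatter call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "y": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "x": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "layer": ["a", "a", "a", "b", "b", "b", "c", "c", "c", "c"],
})


def scatter_by_group(frame, by, x, y, ax=None):
    """Hypothetical helper: one scatter call per group, colored from the cycle."""
    if ax is None:
        ax = plt.gca()
    # Colors of the active property cycle (assumes the cycle defines 'color').
    colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    for i, (name, group) in enumerate(frame.groupby(by)):
        group.plot(
            x=x, y=y, kind="scatter", ax=ax,
            color=colors[i % len(colors)],  # wrap around if groups outnumber colors
            label=name,                     # group name as the legend entry
        )
    return ax


fig, ax = plt.subplots()
scatter_by_group(df, "layer", "x", "y", ax=ax)
ax.legend(title="layer")
```

Passing the color explicitly sidesteps whatever default the scatter path picks, at the cost of managing the cycle by hand.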
Installed Versions
commit : 0f43794
python : 3.11.5.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 186 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 2.0.3
numpy : 1.24.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.0.0
pip : 23.2.1
Cython : 3.0.6
pytest : 7.4.0
hypothesis : None
sphinx : 5.0.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.15.0
pandas_datareader: None
bs4 : 4.12.2
bottleneck : 1.3.5
brotli :
fastparquet : None
fsspec : 2023.4.0
gcsfs : None
matplotlib : 3.7.2
numba : 0.57.1
numexpr : 2.8.4
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 11.0.0
pyreadstat : None
pyxlsb : None
s3fs : 2023.4.0
scipy : 1.11.1
snappy :
sqlalchemy : 1.4.39
tables : 3.8.0
tabulate : 0.8.10
xarray : 2023.6.0
xlrd : None
zstandard : 0.19.0
tzdata : 2023.3
qtpy : 2.2.0
pyqt5 : None
Activity
abhisin-07 commentedon Sep 21, 2024
df.groupby("layer").plot(x='x', y='y', ax=plt.gca(), kind='line', color='blue')
DEFINE THE COLOR EXPLICITLY BY PASSING THE COLOR PARAMETER TO THE PLOT FUNCTION, BECAUSE WHEN WE USE groupby("layer")
it divides the plot data into groups and applies the default matplotlib color cycle to the different groups.
TinoDerb commentedon Sep 21, 2024
@abhisin-07 Hey, it seems you didn't really read the issue. The desired result is to have the automatic matplotlib color cycle for kind='scatter', just like when plotting with kind='line'.
P.S.: no need for all caps :)
yuanx749 commentedon Sep 23, 2024
Hmm, with the line plot, even though the color is correct, the legend is problematic.
RafaelGreg18 commentedon Oct 20, 2024
take
myenugula commentedon Apr 5, 2025
take
BUG: Fix scatter plot colors in groupby context to match line plot be…