xticks missing for scatter plots with colors #10611

Closed
adamgreenhall opened this issue Jul 17, 2015 · 34 comments

@adamgreenhall
Contributor

commented Jul 17, 2015

example: https://www.wakari.io/sharing/bundle/adamgreenhall/test-scatter

I think this happens specifically for pandas scatter plots with colorbars in ipython. The xticks are still working for:

  • non-colorbar pandas scatter plots
  • the same scatter plot using matplotlib
  • standard python scripts using plt.savefig

related problem with %matplotlib inline?: ipython/ipython#1443

@sinhrks

Member

commented Jul 17, 2015

Thanks for the report.

Strange. It may be related to the order or method of ax, plot and colorbar creation. We should pin down the conditions under which it occurs.

@rubennj

commented Jul 23, 2015

I was checking previous versions of Pandas and I can confirm that the bug starts in 0.16.1 and it is still in 0.16.2

@denfromufa

commented Aug 11, 2015

I hit this bug as well (0.16.2), and I can confirm that this problem appears only with %matplotlib inline

@denfromufa

commented Aug 11, 2015

As a quick workaround, use ax.xaxis.tick_top(). This puts the ticks on the top, but the xlabel is still missing.

@denfromufa

commented Oct 26, 2015

sharex=False is a better workaround; see the answer on SO: http://stackoverflow.com/a/31633381/2230844
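
A minimal sketch of the sharex=False workaround, assuming a pandas version whose plotting API accepts the sharex keyword. The DataFrame contents and the column names A, B and C are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs outside a notebook
import numpy as np
import pandas as pd

# Illustrative data; the column names are arbitrary.
df = pd.DataFrame(np.random.random((100, 3)), columns=["A", "B", "C"])

# sharex=False is the workaround reported in this thread: it is meant to keep
# the x tick labels and x label of the scatter axes from being hidden.
ax = df.plot.scatter(x="A", y="B", c="C", cmap="viridis", sharex=False)

print(ax.get_xlabel())
```

In a notebook, drop the matplotlib.use line and keep %matplotlib inline instead, since the problem was only reported with the inline backend.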

@dr-1

commented Aug 29, 2017

Still an issue in 0.20.3

@jreback

Contributor

commented Aug 29, 2017

@dr-1 PRs to fix this are welcome, or even a simple repro on 0.20.3, as this issue is still open.

@labarba

commented Oct 20, 2017

I encountered this bug also. See screenshots of the two versions of the plot, one with colors, one without. The full notebook (without output, as it's still WiP) is at: http://go.gwu.edu/engcomp2lesson4

[Two screenshots: the plot with colors and the plot without]

@javadnoorb

Contributor

commented Mar 14, 2018

I'm having this issue in pandas 0.22.0 running in Jupyter notebook. Would it solve the issue to hardcode sharex=False as default into arguments of class ScatterPlot.__init__ in /pandas/plotting/_core.py?

@TomAugspurger

Contributor

commented Mar 16, 2018

Hardcoding sharex=False could break subplots.

@tacaswell maybe you have a quick answer. With

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(20)
y = np.sin(x)
c = np.hstack([np.ones_like(x[:10]) * 0.25,
               np.ones_like(x[10:]) * 0.75])

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=c, cmap='viridis')

plt.colorbar(mappable=sc, ax=ax, ticks=[0, 0.25, 0.5, 0.75, 1.0]);

Is there any way to disable the update done to the x axis ticks (not sure if it's the tick labels or something else) in the plt.colorbar call?

With matplotlib that gives

[screenshot: matplotlib output]

with pandas

ax2 = pd.DataFrame({'x': x, 'y': y, 'c': c}).plot.scatter(x='x', y='y', c='c', cmap='viridis');

gives

[screenshot: pandas output]
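
Because the screenshots are hard to compare, the visibility of the x axis can also be inspected programmatically. The sketch below does this for the pure matplotlib example; x_axis_visibility is a hypothetical helper written for this illustration, not a matplotlib or pandas function.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np


def x_axis_visibility(ax):
    """Summarize whether the x label and the x tick labels of `ax` are visible."""
    return {
        "label": ax.xaxis.get_label().get_visible(),
        "ticklabels": all(t.get_visible() for t in ax.get_xticklabels()),
    }


x = np.arange(20)
y = np.sin(x)
c = np.hstack([np.full(10, 0.25), np.full(10, 0.75)])

fig, ax = plt.subplots()
sc = ax.scatter(x, y, c=c, cmap="viridis")
fig.colorbar(sc, ax=ax, ticks=[0, 0.25, 0.5, 0.75, 1.0])

report = x_axis_visibility(ax)
print(report)
```

The same helper can be applied to the axes returned by the pandas call to see which of the two flags differs.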

@TomAugspurger

Contributor

commented Mar 16, 2018

For reference, here's the MPL code that we're calling:

if cb:
    img = ax.collections[0]
    kws = dict(ax=ax)
    if self.mpl_ge_1_3_1():
        kws['label'] = c if c_is_column else ''
    self.fig.colorbar(img, **kws)
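
One relevant side effect of this call is that the colorbar gets its own Axes in the figure. A small matplotlib-only sketch, with made-up data, showing the extra axes appear:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
sc = ax.scatter(np.arange(5), np.arange(5), c=np.linspace(0, 1, 5))

n_before = len(fig.axes)
cbar = fig.colorbar(sc, ax=ax)  # returns a Colorbar object
n_after = len(fig.axes)

# The colorbar is drawn in its own Axes, reachable as cbar.ax.
print(n_before, n_after, cbar.ax in fig.axes)
```

Any code that later iterates over every axes in the figure will therefore also see the colorbar axes.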

@javadnoorb

Contributor

commented Mar 19, 2018

Not sure if this is a duplicate, but I dug a bit deeper into the code. It is happening in _handle_shared_axes. If you comment out the following line, the issue disappears:

self._adorn_subplots()

it seems to stem from this line:

_handle_shared_axes(axarr=all_axes, nplots=len(all_axes),

I tried running the code from within _core.py to explore this more closely. Compare this:

%matplotlib inline
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((100,3)),columns=['A','B','C'])

plot_obj = ScatterPlot(df, x='A', y='B',c='C')
plot_obj._args_adjust()
plot_obj._compute_plot_data()
plot_obj._setup_subplots()
plot_obj._make_plot()
plot_obj._add_table()
plot_obj._make_legend()


all_axes = plot_obj._get_subplots()
nrows, ncols = plot_obj._get_axes_layout()
_handle_shared_axes(axarr=all_axes, nplots=len(all_axes),
                    naxes=nrows * ncols, nrows=nrows,
                    ncols=ncols, sharex=plot_obj.sharex,
                    sharey=plot_obj.sharey)

for ax in plot_obj.axes:
    plot_obj._post_plot_logic_common(ax, plot_obj.data)
    plot_obj._post_plot_logic(ax, plot_obj.data)

[image: resulting plot]

with this:

%matplotlib inline
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((100,3)),columns=['A','B','C'])

plot_obj = ScatterPlot(df, x='A', y='B',c='C')
plot_obj._args_adjust()
plot_obj._compute_plot_data()
plot_obj._setup_subplots()
plot_obj._make_plot()
plot_obj._add_table()
plot_obj._make_legend()


all_axes = plot_obj._get_subplots()
nrows, ncols = plot_obj._get_axes_layout()
# _handle_shared_axes(axarr=all_axes, nplots=len(all_axes),
#                     naxes=nrows * ncols, nrows=nrows,
#                     ncols=ncols, sharex=plot_obj.sharex,
#                     sharey=plot_obj.sharey)

for ax in plot_obj.axes:
    plot_obj._post_plot_logic_common(ax, plot_obj.data)
    plot_obj._post_plot_logic(ax, plot_obj.data)

[image: resulting plot]

@TomAugspurger

Contributor

commented Mar 19, 2018

@javadnoorb thanks for looking. So your suspicion is that if we excluded the newly created colorbar axes from _handle_shared_axes, everything would be OK? Do you have time to explore that further?

@TomAugspurger

Contributor

commented Mar 19, 2018

FWIW, extracting the colorbar axes may be a tad difficult, but we should be able to hack something together.

@javadnoorb

Contributor

commented Mar 19, 2018

That's my guess @TomAugspurger. I vaguely remember that I extracted colorbar axes before, and as you said it was not very straightforward. I'll look more into this within the next couple of days and will get back to you.

@TomAugspurger

Contributor

commented Mar 19, 2018

FWIW, we're creating the colorbar axes, and we control every method here, so as a last resort you could add a private attribute like __is_pandas_colorbar to the colorbar axes and skip it during _handle_shared_axes

@javadnoorb

Contributor

commented Mar 19, 2018

That's a good idea. _handle_shared_axes seems to be just a few for loops through the axes, so that would probably resolve this. Could that conflict with other subplottings with colorbars?

@TomAugspurger

Contributor

commented Mar 19, 2018

@javadnoorb

Contributor

commented Mar 20, 2018

@TomAugspurger I'm going through this code. You're right: extracting colorbars is not very easy. I think what you suggested with the private attribute is the way to go. The issue, as far as I can tell, is that axes are matplotlib objects, so I don't know of any clean way to define a private attribute for them without inheritance. Any ideas?

We also need to be careful with _handle_shared_axes:

def _handle_shared_axes(axarr, nplots, naxes, nrows, ncols, sharex, sharey):

It loops through all axes and uses _remove_labels_from_axis to remove the axis labels unless the axes is in the last row/column or sharex/sharey=False. Simply skipping the colorbar axes wouldn't solve the problem, because by default, when sharex=True, _remove_labels_from_axis will be applied to the axis anyway. However, it only loops through the axes if there is more than one row/column. So one solution I can think of is:

if any([ax.__is_pandas_colorbar for ax in axarr]): nrows=nrows-1
or

nrows = nrows-sum([ax.__is_pandas_colorbar for ax in axarr])

This won't interfere with multi-plot figures (although we might want to do the same to ncols as well?). But it's a bit hacky. Can you think of any better solution?

A second way to deal with this might be to alter layout:

layout[ax.rowNum, ax.colNum] = ax.get_visible()

whenever there is only one non-colorbar axis.

Besides scatter, colorbars seem to be used only by hexbin, and the two don't seem to share any code except through PlanePlot. Their colorbar implementations are very similar, so I think resolving the issue for scatter should similarly affect hexbin.
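
The label-hiding behaviour described in this comment can be pictured with a toy model. This is not pandas' _handle_shared_axes; remove_x_labels and hide_shared_x_labels are hypothetical helpers that only mimic the rule "with sharex and more than one row, hide x labels everywhere except the last row".

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt


def remove_x_labels(ax):
    """Hide the x tick labels and the x axis label of `ax`."""
    for label in ax.get_xticklabels():
        label.set_visible(False)
    ax.xaxis.get_label().set_visible(False)


def hide_shared_x_labels(axes_grid, sharex=True):
    """Toy model: hide x labels on every row except the last one."""
    nrows = len(axes_grid)
    if not sharex or nrows <= 1:
        return  # a single row or unshared axes keeps all its labels
    for row in axes_grid[:-1]:
        for ax in row:
            remove_x_labels(ax)


# squeeze=False makes axs a 2-D array even for a single column.
fig, axs = plt.subplots(2, 1, squeeze=False)
axs[0, 0].set_xlabel("top")
axs[1, 0].set_xlabel("bottom")
hide_shared_x_labels(axs, sharex=True)

top_visible = axs[0, 0].xaxis.get_label().get_visible()
bottom_visible = axs[1, 0].xaxis.get_label().get_visible()
print(top_visible, bottom_visible)
```

Under this model, a colorbar axes that is counted as an extra row or plot would cause the main axes to be treated as "not last" and lose its x labels, which matches the symptom reported here.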

@javadnoorb

Contributor

commented Mar 20, 2018

The above hacks I was thinking of assume that we're dealing with a single scatterplot with a single colorbar. A more general solution to deal with multiple subplots will require more work.

@TomAugspurger

Contributor

commented Mar 20, 2018

The hack I had in mind was getting the new cbar ax from

self.fig.colorbar(img, **kws)

and then directly adding a _pandas_colorbar_axes attribute to it, like ax._pandas_colorbar_axes = True

Then in the _handle_shared_axes we'll skip axes where getattr(ax, '_pandas_colorbar_axes', False) is True.
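
A rough sketch of that idea using matplotlib only. The filtering helper without_colorbar_axes is hypothetical and stands in for whatever check would go inside _handle_shared_axes; the data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
sc = ax.scatter(np.arange(5), np.arange(5), c=np.linspace(0, 1, 5))

# fig.colorbar returns a Colorbar whose .ax is the newly created axes.
cbar = fig.colorbar(sc, ax=ax)
cbar.ax._pandas_colorbar_axes = True  # ad hoc marker attribute, as proposed


def without_colorbar_axes(axes):
    """Drop any axes that carries the marker attribute."""
    return [a for a in axes if not getattr(a, "_pandas_colorbar_axes", False)]


plot_axes = without_colorbar_axes(fig.axes)
print(len(fig.axes), len(plot_axes))
```

Using getattr with a default of False means axes created elsewhere, which never received the marker, are treated as ordinary plot axes.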

@javadnoorb

Contributor

commented Mar 20, 2018

Interesting. I didn't know it was possible to assign attributes to objects in python. That's a pretty nice feature!

@TomAugspurger

Contributor

commented Mar 20, 2018

@javadnoorb

Contributor

commented Mar 20, 2018

I like it!

I think skipping getattr(ax, '_pandas_colorbar_axes', False) will avoid removing tick labels from all the axes except for colorbars. So it will interfere with subplots if there are multiple plots. Don't you think?

I think we need something like axarr=[ax for ax in axarr if not getattr(ax, '_pandas_colorbar_axes', False)], with corresponding changes to nplots, naxes, nrows and ncols, in the first line of _handle_shared_axes.

@TomAugspurger

Contributor

commented Mar 21, 2018

@javadnoorb

Contributor

commented Mar 22, 2018

@TomAugspurger, I made your suggested changes in my own fork:
https://github.com/javadnoorb/pandas

seems to be working fine:

%matplotlib inline
import numpy as np
from mypandas import pandas as pd
df = pd.DataFrame(np.random.random((1000,3)),columns=['A','B','C'])
df.plot.scatter('A','B',c='C');
df.plot.hexbin('A','B', gridsize=25);

[Two images: the resulting scatter and hexbin plots]

Will probably require more testing. But I've never done this. What's the proper testing procedure before PR?

@TomAugspurger

Contributor

commented Mar 22, 2018

@TomAugspurger

Contributor

commented Mar 30, 2018

So @javadnoorb has submitted #20446, but I notice that with matplotlib 2.2.2 I'm getting the correct output on pandas master (without @javadnoorb's fix).

Would the people in this thread mind updating to matplotlib 2.2.2 and trying out this code:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

x = np.arange(20)
y = np.sin(x)
c = np.hstack([np.ones_like(x[:10]) * 0.25,
               np.ones_like(x[10:]) * 0.75])

ax2 = pd.DataFrame({'x': x, 'y': y, 'c': c}).plot.scatter(x='x', y='y', c='c', cmap='viridis');

My figure has the xaxis visible

[screenshot: resulting plot]

@javadnoorb

Contributor

commented Mar 30, 2018

I upgraded to matplotlib 2.2.2 and still get the old result (without x-axis label). @TomAugspurger are you running this in Jupyter notebook?

@TomAugspurger

Contributor

commented Mar 30, 2018

@javadnoorb

Contributor

commented Mar 30, 2018

I had forgotten too. But I think this is an issue: I suspect the tests we've written for the PR are always going to pass. We need tests that run with the inline backend.
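
For illustration, a visibility check of this kind might look like the sketch below. It is not the test from the PR, and it runs on the Agg backend, so by itself it would not exercise the inline-specific behaviour discussed here. x_visibility is a hypothetical helper and the data is random.

```python
import matplotlib
matplotlib.use("Agg")  # NOT the inline backend; see the caveat above
import numpy as np
import pandas as pd


def x_visibility(ax):
    """Collect the visibility flags relevant to this issue."""
    return (
        ax.xaxis.get_label().get_visible(),
        all(t.get_visible() for t in ax.xaxis.get_majorticklabels()),
    )


df = pd.DataFrame(np.random.random((100, 3)), columns=["A", "B", "C"])

plain = df.plot.scatter(x="A", y="B")
colored = df.plot.scatter(x="A", y="B", c="C")

# A regression test could require that adding colors does not change
# what is visible on the x axis.
same = x_visibility(plain) == x_visibility(colored)
print(same)
```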

@javadnoorb

Contributor

commented Feb 4, 2019

This was resolved in master for a while but seems to have returned since 0.24.0.

Example code:

%matplotlib inline
import numpy as np
import pandas as pd
random_array = np.random.random((1000, 3))
df = pd.DataFrame(random_array,columns=['A label','B label','C label'])
df.plot.scatter('A label', 'B label', c='C label')

output:
[image: resulting plot]

I think the culprit is:

# The workaround below is no longer necessary.

Note that this only happens in the inline backend.
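
Since the backend matters here, it may help to include it in reports alongside the version numbers. A small sketch; as far as I know the reported name contains "inline" when %matplotlib inline is active, but the exact string depends on the installed versions.

```python
import matplotlib
import pandas as pd

# Name of the currently selected matplotlib backend.
backend = matplotlib.get_backend()

print("backend:", backend)
print("matplotlib:", matplotlib.__version__)
print("pandas:", pd.__version__)
```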

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.8.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-43-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.1
pytest: 4.1.1
pip: 18.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: 1.8.2
patsy: None
dateutil: 2.7.5
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.2
openpyxl: None
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: None
lxml.etree: 4.3.0
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: 0.8.0
pandas_datareader: None
gcsfs: None

@Raquelie

commented Mar 25, 2019

I am having the same issue

@giliam

commented May 21, 2019

Same here, executing from a Jupyter Notebook (I haven't fully understood whether it is important or not) with Matplotlib version greater than 3.0 (3.0.3). Pandas version is 0.24.2.

I have tried forcing the workaround highlighted by @javadnoorb but it didn't change anything for me.

[Two screenshots, labeled "with" and "without"]

Edit: My bad. Printing the two figures one above the other made me realise that it was maybe just the width that was missing. I set a larger x figsize and it worked, bringing back the x axis and its labels.
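
A sketch of what setting a larger figure width could look like, assuming the figsize keyword of the pandas plotting API. The 10 x 4 inch size, the data and the column names are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs outside a notebook
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((100, 3)), columns=["A", "B", "C"])

# figsize is (width, height) in inches; a wider figure leaves more room
# for the scatter axes next to the colorbar.
ax = df.plot.scatter(x="A", y="B", c="C", figsize=(10, 4))

width, height = ax.get_figure().get_size_inches()
print(width, height)
```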
