Sharey keyword for boxplot #20968

Merged
merged 18 commits into from Jun 8, 2018
Changes from 4 commits
42 changes: 42 additions & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -898,6 +898,48 @@ yourself. To revert to the old setting, you can run this line:

pd.options.display.max_columns = 20

.. _whatsnew_0230.boxplot.sharexy:

Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()
Contributor

This isn't an API change, so we don't need this long of a release note. Probably a single line saying that boxplot now accepts the sharey keyword, and a link to a more detailed example in the docstring or plotting.rst

Author

I updated the docstring and the part in v0.23.0.txt.

Contributor

Can you move this to 0.23.1.txt now?

Contributor

And I think the example can be trimmed down. I'd rather update the docstring for DataFrame.plot.box since that's a longer-term thing. The release notes are for people checking changes. They'll see that it now supports sharey, and click through if interested.
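For readers who do click through, a minimal usage sketch might look like the one below. It assumes a pandas version that includes this change and a working matplotlib install; the frame mirrors the test data in this PR and is otherwise arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import pandas as pd

# Small frame mirroring the test data in this PR; 'c' is the grouping column.
df = pd.DataFrame({
    "a": [-1.43, -0.15, -3.70, -1.43, -0.14],
    "b": [0.56, 0.84, 0.29, 0.56, 0.85],
    "c": [0, 1, 2, 3, 1],
})

# One subplot per group, each with its own y-axis scale.
axes_independent = df.groupby("c").boxplot(sharey=False)

# One subplot per group, with both axes shared across subplots.
axes_shared = df.groupby("c").boxplot(sharex=True, sharey=True)

n_groups = df["c"].nunique()
```

Each call draws into a new figure and returns one axes object per group, so the two results can be compared side by side.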

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(:issue:`15184`)
Contributor

this file needs to be reverted


Previous Behavior:

.. code-block:: jupyter-notebook
Contributor

See the other examples in /pandas/plotting/_core.py for how these should be formatted.


%pylab inline
import pandas as pd

N = 100
rand = random.random(N)
clas = random.binomial(5,.5, N)
df = pd.DataFrame({'Rand': rand-clas,
'Rand2': rand,
'Class': clas},
index= np.arange(N))

df.groupby('Class').boxplot(sharey=True, sharex=False)
>>> TypeError: boxplot() got an unexpected keyword argument 'sharey'

Using boxplot with keywords sharex or sharey resulted in an error.

New Behavior:

.. ipython:: jpyter-notebook:

...

df.groupby('Class').boxplot(sharey=True, sharex=True)
Contributor

Won't need all of these. Probably just one will convey the point.

Author

Ok, 0.23.1.txt does not exist yet. When I fetch from upstream I don't receive it. Does that mean I should create that file? I thought I already removed the example. Not sure why it is still there.

Member

@soerendip did you merge after fetching? The file is there already on master

df.groupby('Class').boxplot(sharey=True, sharex=False)
df.groupby('Class').boxplot(sharey=False, sharex=True)
df.groupby('Class').boxplot(sharey=False, sharex=False)

All lead to different behaviour. The sharing of the x and y axes
can be turned on and off separately.

To restore previous behavior, use boxplot() without keywords.

.. _whatsnew_0230.api.datetimelike:

Datetimelike API Changes
4 changes: 2 additions & 2 deletions pandas/plotting/_core.py
@@ -2548,7 +2548,7 @@ def plot_group(group, ax):

def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
rot=0, grid=True, ax=None, figsize=None,
- layout=None, **kwds):
+ layout=None, sharex=False, sharey=True, **kwds):
"""
Make box plots from DataFrameGroupBy data.

@@ -2598,7 +2598,7 @@ def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
if subplots is True:
naxes = len(grouped)
fig, axes = _subplots(naxes=naxes, squeeze=False,
- ax=ax, sharex=False, sharey=True,
+ ax=ax, sharex=sharex, sharey=sharey,
figsize=figsize, layout=layout)
axes = _flatten(axes)
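The signature change above replaces two hard-coded values with keyword arguments whose defaults match the old values, so existing callers see no difference. A rough sketch of the same pattern, written against plain matplotlib rather than the pandas internals, might look like this. `grouped_boxplots` and its `groups` argument are hypothetical names for illustration, not part of pandas.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def grouped_boxplots(groups, sharex=False, sharey=True):
    """Hypothetical helper: draw one box plot per group.

    `groups` maps a label to a list of numbers. The defaults mirror the
    values that were hard-coded before, so behavior is unchanged unless
    a caller overrides them.
    """
    fig, axes = plt.subplots(1, len(groups), squeeze=False,
                             sharex=sharex, sharey=sharey)
    axes = axes.ravel()
    for ax, (label, values) in zip(axes, groups.items()):
        ax.boxplot(values)
        ax.set_title(str(label))
    return axes


data = {"g0": [1.0, 2.0, 3.0], "g1": [10.0, 20.0, 30.0]}
shared = grouped_boxplots(data)                     # default: shared y-axis
independent = grouped_boxplots(data, sharey=False)  # each subplot scales itself
```

With the default, both subplots use one y-range, which can squash a group whose values are much smaller than the others. Passing `sharey=False` lets each subplot autoscale on its own data.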

58 changes: 58 additions & 0 deletions pandas/tests/plotting/test_frame.py
@@ -367,6 +367,64 @@ def test_subplots(self):
for ax in axes:
assert ax.get_legend() is None

def test_groupby_boxplot_sharey(self):
# https://github.com/pandas-dev/pandas/issues/9737 using gridspec,
# the axis in fig.get_axis() are sorted differently than pandas
# expected them, so make sure that only the right ones are removed

df = DataFrame({'a': [-1.43, -0.15, -3.70, -1.43, -0.14],
'b': [0.56, 0.84, 0.29, 0.56, 0.85],
'c': [0, 1, 2, 3, 1]},
index=[0, 1, 2, 3, 4])

# standart behavior
Contributor

standard

Contributor

What does 'standard' mean here? Can you update the comment?

Author

Done.

axes = df.groupby('c').boxplot()
self._check_visible(axes[0].get_yticklabels(), visible=True)
Contributor

These tests are a bit wordy. I wonder if the following is clear?

expected = pd.Series([True, False, True, False])
result = axes.apply(lambda ax: ax.yaxis.get_visible())
tm.assert_series_equal(result, expected)

Is that the same? Do you find it clearer?

Author

I see where you want to go. Would this be acceptable? It uses a helper function that you might want to put somewhere else so that other tests can use it. Where would be a good place for it?

def test_groupby_boxplot_sharex(self):

    def _assert_xtickslabels_visibility(axes, expected):
        for ax, exp in zip(axes, expected):
            self._check_visible(ax.get_xticklabels(), visible=exp)

    df = DataFrame({'a': [-1.43, -0.15, -3.70, -1.43, -0.14],
                    'b': [0.56, 0.84, 0.29, 0.56, 0.85],
                    'c': [0, 1, 2, 3, 1]},
                   index=[0, 1, 2, 3, 4])

    # standart behavior
    axes = df.groupby('c').boxplot()
    expected = [True, True, True, True]
    _assert_xtickslabels_visibility(axes, expected)
    # set sharex=False should be identical
    axes = df.groupby('c').boxplot(sharex=False)
    expected = [True, True, True, True]
    _assert_xtickslabels_visibility(axes, expected)        
    # sharex=True, xticklabels should be visible only for bottom plots
    axes = df.groupby('c').boxplot(sharex=True)
    expected = [False, False, True, True]
    _assert_xtickslabels_visibility(axes, expected)

self._check_visible(axes[1].get_yticklabels(), visible=False)
self._check_visible(axes[2].get_yticklabels(), visible=True)
self._check_visible(axes[3].get_yticklabels(), visible=False)
# set sharey=True should be identical
axes = df.groupby('c').boxplot(sharey=True)
self._check_visible(axes[0].get_yticklabels(), visible=True)
self._check_visible(axes[1].get_yticklabels(), visible=False)
self._check_visible(axes[2].get_yticklabels(), visible=True)
self._check_visible(axes[3].get_yticklabels(), visible=False)
# sharey=False, all yticklabels should be visible
axes = df.groupby('c').boxplot(sharey=False)
self._check_visible(axes[0].get_yticklabels(), visible=True)
self._check_visible(axes[1].get_yticklabels(), visible=True)
self._check_visible(axes[2].get_yticklabels(), visible=True)
self._check_visible(axes[3].get_yticklabels(), visible=True)

def test_groupby_boxplot_sharex(self):
# https://github.com/pandas-dev/pandas/issues/9737 using gridspec,
# the axis in fig.get_axis() are sorted differently than pandas
# expected them, so make sure that only the right ones are removed

df = DataFrame({'a': [-1.43, -0.15, -3.70, -1.43, -0.14],
'b': [0.56, 0.84, 0.29, 0.56, 0.85],
'c': [0, 1, 2, 3, 1]},
index=[0, 1, 2, 3, 4])

# standart behavior
axes = df.groupby('c').boxplot()
self._check_visible(axes[0].get_xticklabels(), visible=True)
self._check_visible(axes[1].get_xticklabels(), visible=True)
self._check_visible(axes[2].get_xticklabels(), visible=True)
self._check_visible(axes[3].get_xticklabels(), visible=True)
# set sharex=False should be identical
axes = df.groupby('c').boxplot(sharex=False)
self._check_visible(axes[0].get_xticklabels(), visible=True)
self._check_visible(axes[1].get_xticklabels(), visible=True)
self._check_visible(axes[2].get_xticklabels(), visible=True)
self._check_visible(axes[3].get_xticklabels(), visible=True)
# sharex=True, xticklabels should be visible only for bottom plots
axes = df.groupby('c').boxplot(sharex=True)
self._check_visible(axes[0].get_xticklabels(), visible=False)
self._check_visible(axes[1].get_xticklabels(), visible=False)
self._check_visible(axes[2].get_xticklabels(), visible=True)
self._check_visible(axes[3].get_xticklabels(), visible=True)

@pytest.mark.slow
def test_subplots_timeseries(self):
idx = date_range(start='2014-07-01', freq='M', periods=10)