New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sharey keyword for boxplot #20968

Merged
merged 18 commits into from Jun 8, 2018

Conversation

Projects
None yet
7 participants
@soerendip
Contributor

soerendip commented May 6, 2018

…to _subplots().

.. code-block:: jupyter-notebook

%pylab inline
import pandas as pd

N = 100
rand = random.random(N)
clas = random.binomial(5,.5, N)
df = pd.DataFrame({'Rand': rand-clas,
                'Rand2': rand,
                'Class': clas},
                index= np.arange(N))

df.groupby('Class').boxplot(sharey=True, sharex=False)
>>> TypeError: boxplot() got an unexpected keyword argument 'sharey'

New Behavior:

.. ipython:: jpyter-notebook:

...

df.groupby('Class').boxplot(sharey=True, sharex=True)
df.groupby('Class').boxplot(sharey=True, sharex=False)
df.groupby('Class').boxplot(sharey=False, sharex=True)
df.groupby('Class').boxplot(sharey=False, sharex=False)

All leads to different behaviour. The shareing of axes both x and y
can be turned on and off separately. 

To restore previous behavior, use boxplot() without keywords.

Default is the previous behavior of sharey

  • [x ] closes #20918
  • [x ] tests added / passed
  • [ x] passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • [x ] whatsnew entry
@codecov

This comment has been minimized.

codecov bot commented May 6, 2018

Codecov Report

Merging #20968 into master will increase coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20968      +/-   ##
==========================================
+ Coverage   91.85%   91.88%   +0.03%     
==========================================
  Files         153      153              
  Lines       49570    49564       -6     
==========================================
+ Hits        45533    45543      +10     
+ Misses       4037     4021      -16
Flag Coverage Δ
#multiple 90.28% <ø> (+0.03%) ⬆️
#single 41.86% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/plotting/_core.py 83.53% <ø> (+1.14%) ⬆️
pandas/core/reshape/pivot.py 96.97% <0%> (-0.06%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update abfac97...fd76955. Read the comment docs.

@TomAugspurger

This comment has been minimized.

Contributor

TomAugspurger commented May 7, 2018

We'll want tests for this. See the e.g. test_sharex_and_ax in pandas/tests/plotting/test_frame.py for an example.

@soerendip

This comment has been minimized.

Contributor

soerendip commented May 9, 2018

Thanks. I will look into that and give it a try.

@pep8speaks

This comment has been minimized.

pep8speaks commented May 9, 2018

Hello @soerendip! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on June 08, 2018 at 11:26 Hours UTC
@@ -898,6 +898,48 @@ yourself. To revert to the old setting, you can run this line:
pd.options.display.max_columns = 20
.. _whatsnew_0230.boxplot.sharexy:
Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()

This comment has been minimized.

@TomAugspurger

TomAugspurger May 10, 2018

Contributor

This isn't an API change, so we don't need this long of a release note. Probably a single line saying that boxplot now accepts the sharey keyword, and a link to a more detailed example in the docstring or plotting.rst

This comment has been minimized.

@soerendip

soerendip May 14, 2018

Contributor

I updated the docstring and the the part in v0.23.0.txt.

This comment has been minimized.

@TomAugspurger

TomAugspurger May 17, 2018

Contributor

Can you move this to 0.23.1.txt now?

This comment has been minimized.

@TomAugspurger

TomAugspurger May 17, 2018

Contributor

And I think the example can be trimmed down. I'd rather update the docstring for DataFrame.plot.box since that's a longer-term thing. The release note are for people checking changes. They'll see that it now supports sharey, and click through if interested.

@@ -898,6 +898,48 @@ yourself. To revert to the old setting, you can run this line:
pd.options.display.max_columns = 20
.. _whatsnew_0230.boxplot.sharexy:
Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()

This comment has been minimized.

@TomAugspurger

TomAugspurger May 17, 2018

Contributor

Can you move this to 0.23.1.txt now?

@@ -898,6 +898,48 @@ yourself. To revert to the old setting, you can run this line:
pd.options.display.max_columns = 20
.. _whatsnew_0230.boxplot.sharexy:
Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()

This comment has been minimized.

@TomAugspurger

TomAugspurger May 17, 2018

Contributor

And I think the example can be trimmed down. I'd rather update the docstring for DataFrame.plot.box since that's a longer-term thing. The release note are for people checking changes. They'll see that it now supports sharey, and click through if interested.

Previous Behavior:
.. code-block:: jupyter-notebook

This comment has been minimized.

@TomAugspurger

TomAugspurger May 17, 2018

Contributor

See the other examples in /pandas/plotting/_core.py for how these should be formatted.

...
df.groupby('Class').boxplot(sharey=True, sharex=True)

This comment has been minimized.

@TomAugspurger

TomAugspurger May 17, 2018

Contributor

Won't need all of these. Probably just one will convey the point.

This comment has been minimized.

@soerendip

soerendip May 17, 2018

Contributor

Ok, 0.23.1.txt does not exist yet. When I fetch from upstream I don't receive it. Does that mean I should create that file? I thought I already removed the example. Not sure why it still there.

This comment has been minimized.

@WillAyd

WillAyd May 18, 2018

Member

@soerendip did you merge after fetching? The file is there already on master

@soerendip

This comment has been minimized.

Contributor

soerendip commented May 18, 2018

Done. Hope it is correct this time.

@@ -2567,6 +2567,12 @@ def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
figsize : A tuple (width, height) in inches
layout : tuple (optional)
(rows, columns) for the layout of the plot
sharex :

This comment has been minimized.

@TomAugspurger

TomAugspurger May 18, 2018

Contributor

Could you take a look at http://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html for how to format these? Something like

sharex : bool, default False
    Whether the horizontal axes is shared among subplots

This comment has been minimized.

@soerendip

soerendip May 19, 2018

Contributor

Ok, did that. Should this part of the doc-string then be corrected as well?

subplots :
    * ``False`` - no subplots will be used
    * ``True`` - create a subplot for each group
# standart behavior
axes = df.groupby('c').boxplot()
self._check_visible(axes[0].get_yticklabels(), visible=True)

This comment has been minimized.

@TomAugspurger

TomAugspurger May 18, 2018

Contributor

These tests are a bit wordy. I wonder if the following is clear?

expected = pd.Series([True, False, True, False])
result = axes.apply(lambda ax: ax.yaxis.get_visible())
tm.assert_frame_equal(result, expected)

Is that the same? Do you find it clearer?

This comment has been minimized.

@soerendip

soerendip May 18, 2018

Contributor

I see where you want to go. Would this be acceptable? It uses a function that you might want somewhere else though in case you want to use it in other tests? Where would be a good position to put it?

def test_groupby_boxplot_sharex(self):

    def _assert_xtickslabels_visibility(axes, expected):
        for ax, exp in zip(axes, expected):
            self._check_visible(ax.get_xticklabels(), visible=exp)

    df = DataFrame({'a': [-1.43, -0.15, -3.70, -1.43, -0.14],
                    'b': [0.56, 0.84, 0.29, 0.56, 0.85],
                    'c': [0, 1, 2, 3, 1]},
                   index=[0, 1, 2, 3, 4])

    # standart behavior
    axes = df.groupby('c').boxplot()
    expected = [True, True, True, True]
    _assert_xtickslabels_visibility(axes, expected)
    # set sharex=False should be identical
    axes = df.groupby('c').boxplot(sharex=False)
    expected = [True, True, True, True]
    _assert_xtickslabels_visibility(axes, expected)        
    # sharex=True, xticklabels should be visible only for bottom plots
    axes = df.groupby('c').boxplot(sharex=True)
    expected = [False, False, True, True]
    _assert_xtickslabels_visibility(axes, expected)´
@soerendip

This comment has been minimized.

Contributor

soerendip commented May 28, 2018

Any updates on this? I shortened the tests in the style of your examples using the assert function that was implemented to check for visibility. However, I had to create two helper functions, and was not sure where the optimal place for those is.

@@ -2567,6 +2567,10 @@ def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
figsize : A tuple (width, height) in inches
layout : tuple (optional)
(rows, columns) for the layout of the plot
sharex : bool, default False
Whether x-axes will be shared among subplots

This comment has been minimized.

@jreback

jreback May 29, 2018

Contributor

add a versionadded tag (0.23.1)

This comment has been minimized.

@soerendip

soerendip May 29, 2018

Contributor

Like this? I added this to v.0.23.1.txt.

Plotting
^^^^^^^^

- Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot() (:issue: `20968`)

If not, could you be more specific about what I should add where, please?

This comment has been minimized.

@TomAugspurger

TomAugspurger May 29, 2018

Contributor
    sharey : bool, default False
        Whether y-axes will be shared among subplots 

        .. versionadded:: 0.23.1

This comment has been minimized.

@soerendip

soerendip May 29, 2018

Contributor

Thank you.

# sharey can now be switched check whether the right
# pair of axes is turned on or off
def _assert_ytickslabels_visibility(axes, expected):

This comment has been minimized.

@jreback

jreback May 29, 2018

Contributor

do we have similar functions at a top level? maybe we should

This comment has been minimized.

@soerendip

soerendip May 29, 2018

Contributor

Sorry, I am new to pandas. I am not sure whether I understand what you mean. Are you saying, I should move the function somewhere else? If yes, where exactly should the function go?

This comment has been minimized.

@jreback

jreback Jun 4, 2018

Contributor

you can make this a function on the class itself rather than defininig it inside the function

# standart behavior
axes = df.groupby('c').boxplot()
expected = [True, False, True, False]
_assert_ytickslabels_visibility(axes, expected)

This comment has been minimized.

@jreback

jreback May 29, 2018

Contributor

use a blank line between cases

@jreback jreback added this to the 0.23.1 milestone May 29, 2018

soerendip added some commits May 29, 2018

@soerendip

This comment has been minimized.

Contributor

soerendip commented Jun 1, 2018

All changes should be done.

Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(:issue:`15184`)

This comment has been minimized.

@jreback

jreback Jun 4, 2018

Contributor

this file needs to be reverted

# sharey can now be switched check whether the right
# pair of axes is turned on or off
def _assert_ytickslabels_visibility(axes, expected):

This comment has been minimized.

@jreback

jreback Jun 4, 2018

Contributor

you can make this a function on the class itself rather than defininig it inside the function

'c': [0, 1, 2, 3, 1]},
index=[0, 1, 2, 3, 4])
# standart behavior

This comment has been minimized.

@jreback

jreback Jun 4, 2018

Contributor

standard

This comment has been minimized.

@jreback

jreback Jun 4, 2018

Contributor

what is 'standard' mean here, can you update the comment

This comment has been minimized.

@soerendip

soerendip Jun 6, 2018

Contributor

Done.

@TomAugspurger

This comment has been minimized.

Contributor

TomAugspurger commented Jun 6, 2018

@soerendip looks like tests are failing in the latest commit.

@soerendip

This comment has been minimized.

Contributor

soerendip commented Jun 6, 2018

@@ -76,7 +76,7 @@ I/O
Plotting
^^^^^^^^
-
- Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot() (:issue: `20968`)

This comment has been minimized.

@jreback

jreback Jun 7, 2018

Contributor

its not clear to me what is being fixed / enhancement here, can you re-word

@@ -2921,6 +2972,14 @@ def test_secondary_axis_font_size(self, method):
self._check_ticks_props(axes=ax.right_ax,
ylabelsize=fontsize)
def _assert_ytickslabels_visibility(self, axes, expected):

This comment has been minimized.

@jreback

jreback Jun 7, 2018

Contributor

you you move these to the very top of the class (I think we hvae other checker functions there)

@soerendip

This comment has been minimized.

Contributor

soerendip commented Jun 7, 2018

Done.

@jreback

jreback approved these changes Jun 8, 2018

@jreback jreback merged commit 4f1704e into pandas-dev:master Jun 8, 2018

0 of 3 checks passed

ci/circleci Your tests are queued behind your running builds
Details
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
@jreback

This comment has been minimized.

Contributor

jreback commented Jun 8, 2018

thanks @soerendip

daminisatya added a commit to daminisatya/pandas that referenced this pull request Jun 8, 2018

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.23.1, 0.24.0 Jun 8, 2018

david-liu-brattle-1 added a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment