Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sharey keyword for boxplot #20968

Merged
merged 18 commits into from
Jun 8, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions doc/source/whatsnew/v0.23.1.txt
Original file line number Diff line number Diff line change
Expand Up @@ -48,40 +48,53 @@ Bug Fixes
~~~~~~~~~

Groupby/Resample/Rolling
~~~~~~~~~~~~~~~~~~~~~~~~

- Bug in :func:`DataFrame.agg` where applying multiple aggregation functions to a :class:`DataFrame` with duplicated column names would cause a stack overflow (:issue:`21063`)
- Bug in :func:`pandas.core.groupby.GroupBy.ffill` and :func:`pandas.core.groupby.GroupBy.bfill` where the fill within a grouping would not always be applied as intended due to the implementations' use of a non-stable sort (:issue:`21207`)
- Bug in :func:`pandas.core.groupby.GroupBy.rank` where results did not scale to 100% when specifying ``method='dense'`` and ``pct=True``

Data-type specific
~~~~~~~~~~~~~~~~~~

- Bug in :meth:`Series.str.replace()` where the method throws `TypeError` on Python 3.5.2 (:issue: `21078`)
- Bug in :class:`Timedelta`: where passing a float with a unit would prematurely round the float precision (:issue: `14156`)
- Bug in :func:`pandas.testing.assert_index_equal` which raised ``AssertionError`` incorrectly, when comparing two :class:`CategoricalIndex` objects with param ``check_categorical=False`` (:issue:`19776`)

Sparse
~~~~~~

- Bug in :attr:`SparseArray.shape` which previously only returned the shape :attr:`SparseArray.sp_values` (:issue:`21126`)

Indexing
~~~~~~~~

- Bug in :meth:`Series.reset_index` where appropriate error was not raised with an invalid level name (:issue:`20925`)
- Bug in :func:`interval_range` when ``start``/``periods`` or ``end``/``periods`` are specified with float ``start`` or ``end`` (:issue:`21161`)
- Bug in :meth:`MultiIndex.set_names` where error raised for a ``MultiIndex`` with ``nlevels == 1`` (:issue:`21149`)
- Bug in :class:`IntervalIndex` constructors where creating an ``IntervalIndex`` from categorical data was not fully supported (:issue:`21243`, issue:`21253`)
- Bug in :meth:`MultiIndex.sort_index` which was not guaranteed to sort correctly with ``level=1``; this was also causing data misalignment in particular :meth:`DataFrame.stack` operations (:issue:`20994`, :issue:`20945`, :issue:`21052`)

Plotting
~~~~~~~~

- New keywords (sharex, sharey) to turn on/off sharing of x/y-axis by subplots generated with pandas.DataFrame().groupby().boxplot() (:issue: `20968`)

I/O
~~~

- Bug in IO methods specifying ``compression='zip'`` which produced uncompressed zip archives (:issue:`17778`, :issue:`21144`)
- Bug in :meth:`DataFrame.to_stata` which prevented exporting DataFrames to buffers and most file-like objects (:issue:`21041`)
- Bug in :meth:`read_stata` and :class:`StataReader` which did not correctly decode utf-8 strings on Python 3 from Stata 14 files (dta version 118) (:issue:`21244`)


Reshaping
~~~~~~~~~

- Bug in :func:`concat` where error was raised in concatenating :class:`Series` with numpy scalar and tuple names (:issue:`21015`)
- Bug in :func:`concat` warning message providing the wrong guidance for future behavior (:issue:`21101`)

Other
~~~~~

- Tab completion on :class:`Index` in IPython no longer outputs deprecation warnings (:issue:`21125`)
12 changes: 10 additions & 2 deletions pandas/plotting/_core.py
Original file line number Diff line number Diff line change
Expand Up @@ -2548,7 +2548,7 @@ def plot_group(group, ax):

def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
rot=0, grid=True, ax=None, figsize=None,
layout=None, **kwds):
layout=None, sharex=False, sharey=True, **kwds):
"""
Make box plots from DataFrameGroupBy data.

Expand All @@ -2567,6 +2567,14 @@ def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
figsize : A tuple (width, height) in inches
layout : tuple (optional)
(rows, columns) for the layout of the plot
sharex : bool, default False
Whether x-axes will be shared among subplots
Copy link
Contributor

@jreback jreback May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a versionadded tag (0.23.1)

Copy link
Author

@sorenwacker sorenwacker May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this? I added this to v.0.23.1.txt.

Plotting
^^^^^^^^

- Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot() (:issue: `20968`)

If not, could you be more specific about what I should add where, please?

Copy link
Contributor

@TomAugspurger TomAugspurger May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    sharey : bool, default False
        Whether y-axes will be shared among subplots 

        .. versionadded:: 0.23.1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.


.. versionadded:: 0.23.1
sharey : bool, default True
Whether y-axes will be shared among subplots

.. versionadded:: 0.23.1
`**kwds` : Keyword Arguments
All other plotting keyword arguments to be passed to
matplotlib's boxplot function
Expand Down Expand Up @@ -2598,7 +2606,7 @@ def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
if subplots is True:
naxes = len(grouped)
fig, axes = _subplots(naxes=naxes, squeeze=False,
ax=ax, sharex=False, sharey=True,
ax=ax, sharex=sharex, sharey=sharey,
figsize=figsize, layout=layout)
axes = _flatten(axes)

Expand Down
59 changes: 59 additions & 0 deletions pandas/tests/plotting/test_frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,14 @@ def setup_method(self, method):
"C": np.arange(20) + np.random.uniform(
size=20)})

def _assert_ytickslabels_visibility(self, axes, expected):
for ax, exp in zip(axes, expected):
self._check_visible(ax.get_yticklabels(), visible=exp)

def _assert_xtickslabels_visibility(self, axes, expected):
for ax, exp in zip(axes, expected):
self._check_visible(ax.get_xticklabels(), visible=exp)

@pytest.mark.slow
def test_plot(self):
df = self.tdf
Expand Down Expand Up @@ -367,6 +375,57 @@ def test_subplots(self):
for ax in axes:
assert ax.get_legend() is None

def test_groupby_boxplot_sharey(self):
# https://github.com/pandas-dev/pandas/issues/20968
# sharey can now be switched check whether the right
# pair of axes is turned on or off

df = DataFrame({'a': [-1.43, -0.15, -3.70, -1.43, -0.14],
'b': [0.56, 0.84, 0.29, 0.56, 0.85],
'c': [0, 1, 2, 3, 1]},
index=[0, 1, 2, 3, 4])

# behavior without keyword
axes = df.groupby('c').boxplot()
expected = [True, False, True, False]
self._assert_ytickslabels_visibility(axes, expected)

# set sharey=True should be identical
axes = df.groupby('c').boxplot(sharey=True)
expected = [True, False, True, False]
self._assert_ytickslabels_visibility(axes, expected)

# sharey=False, all yticklabels should be visible
axes = df.groupby('c').boxplot(sharey=False)
expected = [True, True, True, True]
self._assert_ytickslabels_visibility(axes, expected)

def test_groupby_boxplot_sharex(self):
# https://github.com/pandas-dev/pandas/issues/20968
# sharex can now be switched check whether the right
# pair of axes is turned on or off

df = DataFrame({'a': [-1.43, -0.15, -3.70, -1.43, -0.14],
'b': [0.56, 0.84, 0.29, 0.56, 0.85],
'c': [0, 1, 2, 3, 1]},
index=[0, 1, 2, 3, 4])

# behavior without keyword
axes = df.groupby('c').boxplot()
expected = [True, True, True, True]
self._assert_xtickslabels_visibility(axes, expected)

# set sharex=False should be identical
axes = df.groupby('c').boxplot(sharex=False)
expected = [True, True, True, True]
self._assert_xtickslabels_visibility(axes, expected)

# sharex=True, yticklabels should be visible
# only for bottom plots
axes = df.groupby('c').boxplot(sharex=True)
expected = [False, False, True, True]
self._assert_xtickslabels_visibility(axes, expected)

@pytest.mark.slow
def test_subplots_timeseries(self):
idx = date_range(start='2014-07-01', freq='M', periods=10)
Expand Down