Merge pull request #8177 from TomAugspurger/barplot-NaN
BUG: barplot with NaNs
Tom Augspurger committed Sep 7, 2014
2 parents 7800290 + 9bf6b59 commit d48bb2c
Showing 4 changed files with 60 additions and 2 deletions.
1 change: 1 addition & 0 deletions doc/source/v0.15.0.txt
@@ -670,3 +670,4 @@ Bug Fixes

- Bug with kde plot and NaNs (:issue:`8182`)
- Bug in ``GroupBy.count`` with float32 data type where NaN values were not excluded (:issue:`8169`).
- Bug with stacked barplots and NaNs (:issue:`8175`).
38 changes: 38 additions & 0 deletions doc/source/visualization.rst
@@ -677,6 +677,44 @@ See the `matplotlib pie documenation <http://matplotlib.org/api/pyplot_api.html#
plt.close('all')

.. _visualization.missing_data:

Plotting with Missing Data
--------------------------

Pandas tries to be pragmatic about plotting DataFrames or Series
that contain missing data. Missing values are dropped, left out, or filled
depending on the plot type.

+----------------+--------------------------------------+
| Plot Type      | NaN Handling                         |
+================+======================================+
| Line           | Leave gaps at NaNs                   |
+----------------+--------------------------------------+
| Line (stacked) | Fill 0's                             |
+----------------+--------------------------------------+
| Bar            | Fill 0's                             |
+----------------+--------------------------------------+
| Scatter        | Drop NaNs                            |
+----------------+--------------------------------------+
| Histogram      | Drop NaNs (column-wise)              |
+----------------+--------------------------------------+
| Box            | Drop NaNs (column-wise)              |
+----------------+--------------------------------------+
| Area           | Fill 0's                             |
+----------------+--------------------------------------+
| KDE            | Drop NaNs (column-wise)              |
+----------------+--------------------------------------+
| Hexbin         | Drop NaNs                            |
+----------------+--------------------------------------+
| Pie            | Fill 0's                             |
+----------------+--------------------------------------+

If any of these defaults are not what you want, or if you want to be
explicit about how missing values are handled, consider using
:meth:`~pandas.DataFrame.fillna` or :meth:`~pandas.DataFrame.dropna`
before plotting.
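
As a minimal sketch of being explicit about missing values before plotting,
the snippet below prepares three variants of a small frame. The frame and the
variable names are illustrative assumptions, and the final plotting call is
left as a comment because it requires matplotlib.

```python
import numpy as np
import pandas as pd

# Illustrative data: column 'A' has one missing value.
df = pd.DataFrame({'A': [10, np.nan, 20],
                   'B': [5, 10, 20],
                   'C': [1, 2, 3]})

# Option 1: spell out the zero fill instead of relying on the plot default.
filled = df.fillna(0)

# Option 2: drop every row that contains a NaN.
dropped = df.dropna()

# Option 3: fill with each column's mean (NaNs are skipped when averaging).
mean_filled = df.fillna(df.mean())

# Any of the prepared frames can then be plotted, for example:
# filled.plot(kind='bar', stacked=True)
```

Which option is appropriate depends on the data: filling with zero changes
totals in stacked plots, while dropping rows removes the other columns'
values for that row as well.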

.. _visualization.tools:

Plotting Tools
17 changes: 17 additions & 0 deletions pandas/tests/test_graphics.py
@@ -1479,6 +1479,23 @@ def test_bar_bottom_left(self):
        result = [p.get_x() for p in ax.patches]
        self.assertEqual(result, [1] * 5)

    @slow
    def test_bar_nan(self):
        df = DataFrame({'A': [10, np.nan, 20], 'B': [5, 10, 20],
                        'C': [1, 2, 3]})
        ax = df.plot(kind='bar')
        expected = [10, 0, 20, 5, 10, 20, 1, 2, 3]
        result = [p.get_height() for p in ax.patches]
        self.assertEqual(result, expected)

        ax = df.plot(kind='bar', stacked=True)
        result = [p.get_height() for p in ax.patches]
        self.assertEqual(result, expected)

        result = [p.get_y() for p in ax.patches]
        expected = [0.0, 0.0, 0.0, 10.0, 0.0, 20.0, 15.0, 10.0, 40.0]
        self.assertEqual(result, expected)

    @slow
    def test_plot_scatter(self):
        df = DataFrame(randn(6, 4),
6 changes: 4 additions & 2 deletions pandas/tools/plotting.py
@@ -870,9 +870,11 @@ def _validate_color_args(self):
" use one or the other or pass 'style' "
"without a color symbol")

-    def _iter_data(self, data=None, keep_index=False):
+    def _iter_data(self, data=None, keep_index=False, fillna=None):
         if data is None:
             data = self.data
+        if fillna is not None:
+            data = data.fillna(fillna)

         from pandas.core.frame import DataFrame
         if isinstance(data, (Series, np.ndarray, Index)):
@@ -1780,7 +1782,7 @@ def _make_plot(self):
         pos_prior = neg_prior = np.zeros(len(self.data))
         K = self.nseries

-        for i, (label, y) in enumerate(self._iter_data()):
+        for i, (label, y) in enumerate(self._iter_data(fillna=0)):
             ax = self._get_ax(i)
             kwds = self.kwds.copy()
             kwds['color'] = colors[i % ncolors]
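
The expected values in ``test_bar_nan`` can be reproduced by hand. The sketch
below is not the code in ``pandas/tools/plotting.py``; it is a simplified
illustration that assumes all values are non-negative (the real
``_make_plot`` tracks positive and negative stacks separately). It fills NaNs
with 0, as ``_iter_data(fillna=0)`` does, then takes each bar's bottom to be
the running total of the columns drawn before it.

```python
import numpy as np
import pandas as pd

# Same frame as in test_bar_nan.
df = pd.DataFrame({'A': [10, np.nan, 20],
                   'B': [5, 10, 20],
                   'C': [1, 2, 3]})

# Fill missing values with 0, mirroring fillna=0 in the patch.
values = df.fillna(0).to_numpy(dtype=float)  # shape: (rows, columns)

# Top of each stacked bar is the cumulative sum across columns;
# the bottom is that total minus the bar's own height.
tops = values.cumsum(axis=1)
bottoms = tops - values

# Patches are ordered column by column, so flatten in that order.
heights_flat = values.T.ravel().tolist()
bottoms_flat = bottoms.T.ravel().tolist()

print(heights_flat)  # [10.0, 0.0, 20.0, 5.0, 10.0, 20.0, 1.0, 2.0, 3.0]
print(bottoms_flat)  # [0.0, 0.0, 0.0, 10.0, 0.0, 20.0, 15.0, 10.0, 40.0]
```

The NaN in column ``A`` becomes a zero-height bar, so the ``B`` bar in that
row starts at 0 and the ``C`` bar starts at 10.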
