df.plot() does not work for time series after deregistering converters Prophet #27036

Open
sudhNau opened this issue Jun 25, 2019 · 13 comments


sudhNau commented Jun 25, 2019

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

from pandas.plotting import deregister_matplotlib_converters

deregister_matplotlib_converters()


ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts.plot()
~/sandbox/pandas/pandas/plotting/_core.py in __call__(self, kind, ax, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, label, secondary_y, **kwds)
    808                            colormap=colormap, table=table, yerr=yerr,
    809                            xerr=xerr, label=label, secondary_y=secondary_y,
--> 810                            **kwds)
    811     __call__.__doc__ = plot_series.__doc__
    812

~/sandbox/pandas/pandas/plotting/_core.py in plot_series(data, kind, ax, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, label, secondary_y, **kwds)
    766                  yerr=yerr, xerr=xerr,
    767                  label=label, secondary_y=secondary_y,
--> 768                  **kwds)
    769
    770

~/sandbox/pandas/pandas/plotting/_core.py in _plot(data, x, y, subplots, ax, kind, **kwds)
    714         plot_obj = klass(data, subplots=subplots, ax=ax, kind=kind, **kwds)
    715
--> 716     plot_obj.generate()
    717     plot_obj.draw()
    718     return plot_obj.result

~/sandbox/pandas/pandas/plotting/_matplotlib/core.py in generate(self)
    214         self._compute_plot_data()
    215         self._setup_subplots()
--> 216         self._make_plot()
    217         self._add_table()
    218         self._make_legend()

~/sandbox/pandas/pandas/plotting/_matplotlib/core.py in _make_plot(self)
    987                              stacking_id=stacking_id,
    988                              is_errorbar=is_errorbar,
--> 989                              **kwds)
    990             self._add_legend_handle(newlines[0], label, index=i)
    991

~/sandbox/pandas/pandas/plotting/_matplotlib/core.py in _ts_plot(cls, ax, x, data, style, **kwds)
   1025         ax._plot_data.append((data, cls._kind, kwds))
   1026
-> 1027         lines = cls._plot(ax, data.index, data.values, style=style, **kwds)
   1028         # set date formatter, locators and rescale limits
   1029         format_dateaxis(ax, ax.freq, data.index)

~/sandbox/pandas/pandas/plotting/_matplotlib/core.py in _plot(cls, ax, x, y, style, column_num, stacking_id, **kwds)
   1002             cls._initialize_stacker(ax, stacking_id, len(y))
   1003         y_values = cls._get_stacked_values(ax, stacking_id, y, kwds['label'])
-> 1004         lines = MPLPlot._plot(ax, x, y_values, style=style, **kwds)
   1005         cls._update_stacker(ax, stacking_id, y)
   1006         return lines

~/sandbox/pandas/pandas/plotting/_matplotlib/core.py in _plot(cls, ax, x, y, style, is_errorbar, **kwds)
    581             else:
    582                 args = (x, y)
--> 583             return ax.plot(*args, **kwds)
    584
    585     def _get_index_name(self):

~/Envs/pandas-dev/lib/python3.7/site-packages/matplotlib/axes/_axes.py in plot(self, scalex, scaley, data, *args, **kwargs)
   1666         lines = [*self._get_lines(*args, data=data, **kwargs)]
   1667         for line in lines:
-> 1668             self.add_line(line)
   1669         self.autoscale_view(scalex=scalex, scaley=scaley)
   1670         return lines

~/Envs/pandas-dev/lib/python3.7/site-packages/matplotlib/axes/_base.py in add_line(self, line)
   1898             line.set_clip_path(self.patch)
   1899
-> 1900         self._update_line_limits(line)
   1901         if not line.get_label():
   1902             line.set_label('_line%d' % len(self.lines))

~/Envs/pandas-dev/lib/python3.7/site-packages/matplotlib/axes/_base.py in _update_line_limits(self, line)
   1920         Figures out the data limit of the given line, updating self.dataLim.
   1921         """
-> 1922         path = line.get_path()
   1923         if path.vertices.size == 0:
   1924             return

~/Envs/pandas-dev/lib/python3.7/site-packages/matplotlib/lines.py in get_path(self)
   1025         """
   1026         if self._invalidy or self._invalidx:
-> 1027             self.recache()
   1028         return self._path
   1029

~/Envs/pandas-dev/lib/python3.7/site-packages/matplotlib/lines.py in recache(self, always)
    668         if always or self._invalidx:
    669             xconv = self.convert_xunits(self._xorig)
--> 670             x = _to_unmasked_float_array(xconv).ravel()
    671         else:
    672             x = self._x

~/Envs/pandas-dev/lib/python3.7/site-packages/matplotlib/cbook/__init__.py in _to_unmasked_float_array(x)
   1388         return np.ma.asarray(x, float).filled(np.nan)
   1389     else:
-> 1390         return np.asarray(x, float)
   1391
   1392

~/Envs/pandas-dev/lib/python3.7/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
    536
    537     """
--> 538     return array(a, dtype, copy=False, order=order)
    539
    540

TypeError: float() argument must be a string or a number, not 'Period'

Problem description

The code above is the first time series plotting example in the docs. It raises the TypeError shown in the traceback above.

INSTALLED VERSIONS

commit: None

pandas: 0.24.2
pytest: None
pip: 19.1.1
setuptools: 41.0.1
Cython: 0.29.10
numpy: 1.16.4
scipy: 1.2.1
pyarrow: None
xarray: None
IPython: 7.5.0
sphinx: None
patsy: 0.5.1
dateutil: 2.8.0
pytz: 2019.1
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

sudhNau changed the title from "df.plot() does not work for time series" to "df.plot() does not work for time series after importing Prophet" on Jun 25, 2019
@TomAugspurger
Contributor

I can't reproduce locally. Is the prophet import necessary?

TomAugspurger added the "Needs Info" label (clarification about behavior needed to assess issue) on Jun 25, 2019
@TomAugspurger
Contributor

@Dharni0607 pandas will import Matplotlib internally. And plt.plot() may / will be different from Series.plot.

@Dharni0607
Contributor

Sorry for creating confusion.

@TomAugspurger
Contributor

TomAugspurger commented Jun 26, 2019 via email

@Dharni0607
Contributor

I have taken the code from this documentation:
https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html#plotting

I was getting a similar error without importing Prophet as well.

[Image: ts]


@TomAugspurger
Contributor

TomAugspurger commented Jun 26, 2019 via email

@Dharni0607
Contributor

I think this issue may be related to this:
matplotlib/matplotlib#14359


@TomAugspurger
Contributor

Ah, so the reproducer is

import pandas as pd
import numpy as np

from pandas.plotting import deregister_matplotlib_converters

deregister_matplotlib_converters()


ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
ts.plot()

TypeError: float() argument must be a string or a number, not 'Period'

We should figure something out.

TomAugspurger changed the title from "df.plot() does not work for time series after importing Prophet" to "df.plot() does not work for time series after deregistering converters Prophet" on Jun 27, 2019
@Dharni0607
Contributor

Dharni0607 commented Jun 27, 2019

I think plot() fails to work with the formatters and converters after deregistering.

This is the source:
https://github.com/pandas-dev/pandas/blob/v0.24.2/pandas/plotting/_converter.py#L88-L114

def deregister():
    """
    Remove pandas' formatters and converters

    Removes the custom converters added by :func:`register`. This
    attempts to set the state of the registry back to the state before
    pandas registered its own units. Converters for pandas' own types like
    Timestamp and Period are removed completely. Converters for types
    pandas overwrites, like ``datetime.datetime``, are restored to their
    original value.
    """

Without date_range(), it gives a decent plot:

import pandas as pd
import numpy as np
from pandas.plotting import deregister_matplotlib_converters

deregister_matplotlib_converters()

# Plain integer index instead of a DatetimeIndex
ts = pd.Series(np.random.randn(1000), index=range(1000))
ts.plot()

[Image: pandasPlot]
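
The docstring quoted above says that converters for Period are removed completely on deregistration, which matches the 'Period' in the TypeError. One way to check whether such a converter is currently registered is to look in `matplotlib.units.registry`. This is a minimal sketch: `period_converter_registered` is a hypothetical helper name, and it returns None when pandas or matplotlib is not installed.

```python
import importlib.util


def period_converter_registered():
    """Return True/False, or None if pandas or matplotlib is not installed."""
    for name in ("pandas", "matplotlib"):
        if importlib.util.find_spec(name) is None:
            return None

    import matplotlib.units as munits
    import pandas as pd

    # matplotlib keeps its converters in this dict, keyed by type.
    return pd.Period in munits.registry


print(period_converter_registered())
```

If this prints False right before a time series plot, matplotlib has no converter for Period values and will likely fall back to a plain float conversion, which is where the traceback above ends.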

@nthanhtin

nthanhtin commented Jul 1, 2019

A temporary fix if you still want to have date_range():

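from pandas.plotting import register_matplotlib_converters

# Re-register the pandas converters before plotting
register_matplotlib_converters()

# ts is the date_range-indexed Series from the reproducer above
ts.plot()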

@amerberg

I believe this bug is caused by the following two factors:

  1. register_matplotlib_converters is not idempotent. Calling the function once stores the existing converters in a cache. Calling it a second time overwrites that cache with the newly-registered converters.

  2. Importing anything from pandas.plotting._core forces a call to register_matplotlib_converters.

The first call to register_matplotlib_converters caches the default matplotlib converters. Then the second call overwrites them with the pandas converters. And then when we go to deregister, there's a check that prevents pandas converters from being replaced. So we end up with no converter at all.

I propose to avoid caching the pandas converters in the register function. This will eliminate the need for the check in the deregister function. I'll start working on a pull request.
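
The sequence described in this comment can be reproduced with a toy model that uses plain dictionaries in place of matplotlib's unit registry. Everything here (`MplConverter`, `PandasConverter`, `simulate`, the `skip_own` flag) is a hypothetical stand-in for illustration, not pandas' actual code; it only mirrors the caching logic as the comment describes it.

```python
class MplConverter:
    """Stand-in for a converter that matplotlib provides by default."""


class PandasConverter:
    """Stand-in for a converter that pandas registers."""


def simulate(register_calls, skip_own=False):
    """Run `register_calls` registrations followed by one deregistration.

    Returns the final registry. With skip_own=True, register() never
    caches a pandas converter (the change proposed in the comment).
    """
    registry = {"datetime": MplConverter()}  # assumed default state
    cache = {}                               # converters pandas replaced

    def register():
        for key in ("datetime", "Period"):
            previous = registry.get(key)
            if previous is not None:
                if not (skip_own and isinstance(previous, PandasConverter)):
                    cache[key] = previous
            registry[key] = PandasConverter()

    def deregister():
        # Drop whatever pandas registered.
        for key in ("datetime", "Period"):
            if isinstance(registry.get(key), PandasConverter):
                registry.pop(key)
        # Restore cached converters, but never re-install a pandas one.
        for key, converter in cache.items():
            if not isinstance(converter, PandasConverter):
                registry[key] = converter

    for _ in range(register_calls):
        register()
    deregister()
    return registry


once = simulate(register_calls=1)
twice = simulate(register_calls=2)
fixed = simulate(register_calls=2, skip_own=True)

print("datetime" in once)    # True: the default converter is restored
print("datetime" in twice)   # False: the cache was overwritten
print("datetime" in fixed)   # True: the cache kept the default converter
```

In this model the check in `deregister()` becomes redundant once `skip_own=True`, because the cache can no longer hold a pandas converter.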

gfyoung added the "Visualization" (plotting) and "Bug" labels and removed the "Needs Info" label (clarification about behavior needed to assess issue) on Aug 18, 2019
@amerberg

I believe this is a duplicate of #27479.

@TomAugspurger
Contributor

I think it's only tangentially related to #27479

I think the best course forward is for Prophet to disable the converters before plotting, and re-enable them (if they were previously enabled) afterwards.
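
A library could follow this "leave things as you found them" idea with a small context manager. The sketch below takes a slightly different route from the suggestion: instead of calling pandas' register and deregister functions, it snapshots a dict-like registry on entry and restores it on exit. `preserved_registry` and `demo_registry` are hypothetical names, and a plain dict stands in for the real registry.

```python
from contextlib import contextmanager


@contextmanager
def preserved_registry(registry):
    """Restore `registry` to its entry state when the block exits."""
    snapshot = dict(registry)  # shallow copy of the current entries
    try:
        yield registry
    finally:
        # Undo any additions, removals, or replacements made in the block.
        registry.clear()
        registry.update(snapshot)


# Demonstration with a plain dict standing in for the real registry.
demo_registry = {"Period": "pandas converter", "datetime": "pandas converter"}

with preserved_registry(demo_registry) as reg:
    reg.pop("Period")                      # a converter is disabled
    reg["datetime"] = "default converter"  # another one is swapped

print(demo_registry)
```

Since `matplotlib.units.registry` is a dict subclass, the same context manager should also work when passed that object, so that code inside the block can deregister converters without affecting later `Series.plot()` calls.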
