don't register pandas mpl unit converters upon import #2579

Closed
changhiskhan opened this issue Dec 21, 2012 · 26 comments

@changhiskhan
Contributor

register it upon plotting?

http://stackoverflow.com/questions/13988111/importing-pandas-in-python-changes-how-matplotlib-handles-datetime-objects

@wesm
Member

wesm commented Jan 20, 2013

Is this critical for 0.10.1?

@changhiskhan
Contributor Author

no, it's also a slog. lots of separate plotting functions that need to register the unit converters

@ruidc
Contributor

ruidc commented Mar 27, 2013

What about putting them all in a single registration method that's only called when plotting?

@wesm
Member

wesm commented Apr 8, 2013

Pushing past 0.11

@jreback
Contributor

jreback commented Sep 22, 2013

@cpcloud is this still an issue?

@cpcloud
Member

cpcloud commented Sep 22, 2013

those are still registered...so, yes

@cpcloud
Member

cpcloud commented Sep 22, 2013

i can take a look

ghost assigned cpcloud Sep 22, 2013
@cpcloud
Member

cpcloud commented Sep 27, 2013

this is super low priority ... pushing to 0.14

jreback modified the milestones: 0.15.0, 0.14.0 Feb 18, 2014
cpcloud removed their assignment Feb 21, 2014
@rhattersley

> this is super low priority

But also super annoying if you're on the receiving end of it. You can get away with it if you're only using pandas for your plots, but this kind of side-effect is really not pleasant when pandas is used as one component amongst several.

> register it upon plotting?

This is only a small improvement - the basic problem of global side-effects still exists.


So I'd much rather make it an explicit user action (with another action available to undo it), and/or a temporary state change that only persists for the lifetime of the pandas plotting routines.

For example, an explicit approach might be:

# <user code>
with pandas.use_nice_date_formats():
    plt.plot(...)

Whereas the automatic temporary state change might look (logically) like:

# <pandas implementation>
class Series(...):
    def plot(...):
        with pandas.use_nice_date_formats():
            plt.plot(...)

# <user code>
ts.plot()
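
The "temporary state change" idea above can be sketched without any pandas or matplotlib specifics. This is only an illustration: `temporary_entries` is a hypothetical helper (not a pandas API), and the string values stand in for real converter objects. It assumes the global state is a plain mutable mapping, which is how `matplotlib.units.registry` is used elsewhere in this thread.

```python
from contextlib import contextmanager


@contextmanager
def temporary_entries(registry, entries):
    """Install `entries` in `registry` only for the duration of a with-block.

    `registry` is any mutable mapping. Its previous contents are restored on
    exit, even if the block raises.
    """
    saved = dict(registry)      # snapshot the current state
    registry.update(entries)    # apply the temporary overrides
    try:
        yield registry
    finally:
        registry.clear()        # undo everything done inside the block
        registry.update(saved)


# Stand-in for a global converter registry (hypothetical contents).
registry = {"datetime": "default-converter"}

overrides = {"datetime": "nice-converter", "period": "period-converter"}
with temporary_entries(registry, overrides):
    inside = dict(registry)     # state seen by plotting code in the block

after = dict(registry)          # state seen by everyone else afterwards
```

With matplotlib, the mapping passed in would presumably be `matplotlib.units.registry`. Restoring a full snapshot, rather than popping the added keys, also undoes entries that were overwritten.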

@jorisvandenbossche
Member

For me personally the explicit approach (with ...:) is out of the question: being able to plot a time series with ts.plot() and actually see something sensible on the x-axis labels is really a strength of pandas, certainly for interactive exploratory work (matplotlib really does a bad job at this).
That does not mean such a context manager could not be a useful feature in other circumstances (or as a way to implement the automatic state change).

The automatic temporary state change sounds more attractive to me as a user, but I am not familiar enough with the plotting code to know if this would be easily possible to implement.

@rhattersley

My preference would be to have both. The pandas date formatters do a better job than the defaults (which is why they exist!) so it would be nice to be able to use them (in a controlled fashion!) in other circumstances.

@ruidc
Contributor

ruidc commented Jun 27, 2014

In case it helps someone else: I've been working around it with the following in our startup:


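def replace_pandas_mpl_conversions():
    """Import pandas, then restore matplotlib's own unit converters."""
    try:
        import matplotlib.units
        import matplotlib.dates  # force registration of the built-in date converters
        saved = matplotlib.units.registry.copy()  # snapshot the registry state
        import pandas  # replaces some matplotlib registry entries on import
    except ImportError:
        return
    # reinstate the registrations that existed before pandas was imported
    matplotlib.units.registry.clear()
    matplotlib.units.registry.update(saved)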

The first problem is not knowing that pandas is taking over - which is a no-no in my book. I agree that it should be explicit and able to be turned off in a setting - we personally do not use matplotlib via pandas but only directly.
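
To make the "not knowing that pandas is taking over" part visible, one option is to compare two snapshots of the registry. The sketch below is only illustrative: `registry_changes` is a hypothetical helper, and the dictionaries use placeholder keys and objects rather than real matplotlib or pandas converters.

```python
def registry_changes(before, after):
    """Report keys added, removed, or replaced between two registry snapshots."""
    added = {key for key in after if key not in before}
    removed = {key for key in before if key not in after}
    # A key counts as replaced when it maps to a different object afterwards.
    replaced = {
        key for key in before
        if key in after and after[key] is not before[key]
    }
    return {"added": added, "removed": removed, "replaced": replaced}


# Placeholder converter objects and snapshots (hypothetical contents).
default_conv = object()
pandas_conv = object()
before = {"datetime.datetime": default_conv, "datetime.date": default_conv}
after = {
    "datetime.datetime": pandas_conv,   # silently replaced
    "datetime.date": default_conv,      # untouched
    "Period": pandas_conv,              # newly added
}

changes = registry_changes(before, after)
```

In practice the two snapshots would presumably be `matplotlib.units.registry.copy()` taken immediately before and after `import pandas`.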

@TomAugspurger
Contributor

We should be able to wrap something like @ruidc's code up in an option. I'll see if I have time today.

@ocehugo

ocehugo commented Jul 21, 2016

@TomAugspurger,

I just want to add how super annoying this is and urge a fix so we can all be happy fellas.

All packages that import pandas cause this.

So this breaks all calls with datetime objects in matplotlib with old dates.

One might think: why not go with the flow and use pandas to deal with dates? It's much better... OK, but wait, I can't, since pandas does not support date ranges before 1678 (out of bounds).

import pandas as pd
xtime = pd.date_range('1/1/1976','11/1/1976',freq='M')
OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1676-01-01 00:00:00

Using the kludge above works, but it is still a kludge hidden in here.
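
The error message mentions 1676 while the call shown uses 1976. A rough way to check which dates can be out of bounds, assuming timestamps are stored as signed 64-bit nanosecond counts from 1970-01-01 (the message does say "nanosecond timestamp"), is the arithmetic below. It uses only the standard library, and the names are illustrative.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Largest signed 64-bit nanosecond count, truncated to microseconds because
# datetime.timedelta only has microsecond resolution.
SPAN = timedelta(microseconds=(2**63 - 1) // 1000)

lower = EPOCH - SPAN   # earliest representable instant (approximately)
upper = EPOCH + SPAN   # latest representable instant (approximately)


def representable(dt):
    """True if `dt` fits in a 64-bit nanosecond offset from the epoch."""
    return lower <= dt <= upper


in_1976 = representable(datetime(1976, 1, 1))
in_1676 = representable(datetime(1676, 1, 1))
```

Under that assumption the usable window runs from late 1677 to early 2262, so a range starting in 1976 fits and one starting in 1676 does not.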

@jreback
Contributor

jreback commented Jul 21, 2016

what version of pandas are you showing above?

@ocehugo

ocehugo commented Jul 21, 2016

@jreback see #13713

@jreback
Contributor

jreback commented Jul 21, 2016

your example of using date_range works fine
you have a copy paste error

@jreback
Contributor

jreback commented Jul 21, 2016

@TomAugspurger
Contributor

@ocehugo I ran out of time today, so feel free to work on it. I think a pd.option to toggle the registration is the way to handle it.

@ocehugo

ocehugo commented Jul 21, 2016

@jreback yeah, pasting error (should be 1676 in xtime). I'm not a heavy pandas user, so I didn't remember the period_range method (I could use that, but it would require changing all the related code, so it's easier to use the kludge). @TomAugspurger, as I said, I'm not a pandas guy, but if you tell me where this should go I could put something together. I believe the kludge above should be the default, so as not to cause conflicts with other packages, but I don't know the consequences of that.

Just to stress this: I have some source code that uses xarray and other code that uses statsmodels. Merely importing xarray makes all my matplotlib plotting functions stop working.

@TomAugspurger
Contributor

@ocehugo thanks for taking this.

The goal is to get something like pd.options.plotting.register_converters (that name isn't set in stone). That can be set to either True or False.
The option you'll add will be in https://github.com/pydata/pandas/blob/master/pandas/core/config_init.py

For backwards compatibility the default will have to be True (register the converters).
The documentation for the config options is at the top of this file. You'll want to use a callback that calls converters.register (see below) each time the option is set / reset.

The actual converters are defined in https://github.com/pydata/pandas/blob/fc16f1fd21aee163e93e5713a0676f7a79838897/pandas/tseries/converter.py#L53

and used in https://github.com/pydata/pandas/blob/fc16f1fd21aee163e93e5713a0676f7a79838897/pandas/tools/plotting.py#L35

I would add an argument to converters.register

def register(present=True):
    pairs = [
        (lib.Timestamp, DatetimeConverter()),
        (Period, PeriodConverter()),
        ...  # the rest of the converters
    ]
    for key, value in pairs:
        if present:
            units.registry[key] = value
        else:
            units.registry.pop(key, None)

And then tests + documentation 😄 Hopefully not too much work, but let me know if you have questions.
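
How an option with a callback could drive `register(present=...)` can be sketched in isolation. Everything here is a stand-in: the `Option` class is hypothetical and is not the pandas config API, `registry` is a plain dict in place of `matplotlib.units.registry`, and the converter values are placeholder strings.

```python
registry = {}  # stand-in for matplotlib.units.registry

# Placeholder key/converter pairs (the real ones are classes and instances).
CONVERTERS = {"Timestamp": "DatetimeConverter", "Period": "PeriodConverter"}


def register(present=True):
    """Add the converters to the registry, or remove them if present is False."""
    for key, value in CONVERTERS.items():
        if present:
            registry[key] = value
        else:
            registry.pop(key, None)


class Option:
    """Hypothetical boolean option that runs a callback whenever it is set."""

    def __init__(self, default, callback):
        self._callback = callback
        self.set(default)           # apply the default once at creation

    def set(self, value):
        self._value = bool(value)
        self._callback(self._value)

    def get(self):
        return self._value


# Default is True for backwards compatibility, as described above.
register_converters = Option(default=True, callback=register)
state_default = dict(registry)

register_converters.set(False)
state_off = dict(registry)

register_converters.set(True)
state_on = dict(registry)
```

Note that popping the keys on `False` removes the entries outright. It does not bring back whatever converter was registered for the same key beforehand, which is the difference from the snapshot-and-restore workaround earlier in the thread.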

@tacaswell
Contributor

@ocehugo I am impressed by the bread-crumb trail on this issue.

@jorisvandenbossche
Member

jorisvandenbossche commented Jul 21, 2016

An option to configure whether pandas converters are registered for matplotlib or not is certainly a good idea. But can't we also just fix our DatetimeConverter to actually work with all datetime.datetime values?

The following one-line change seems to fix the example from @ocehugo: 095a2ef (just converting the values themselves when to_datetime fails), but maybe I am missing the complexity of the issue.

You still get the adapted axis formatting from pandas (which you could then turn off with the option), but at least the plots would work.

@ocehugo

ocehugo commented Jul 22, 2016

@tacaswell yeah, I ended up raising issues in almost all the related packages, since I was not expecting that importing a package would create such a big problem across almost all my source code. Because some basic packages need to support pandas, they have to import it, so the problem was all around, and so I went to blame matplotlib first (then statsmodels, then pandas). I should have investigated further beforehand, but the problem was present in almost all the tests I did until I went back to raw Jupyter without my default profile.

@TomAugspurger I will take a look at that, but not until next week. The solution above is tempting, though.

@jorisvandenbossche
Member

jorisvandenbossche commented Jul 22, 2016

@ocehugo If pandas still registered its own converter, but doing so no longer broke code that runs fine without pandas imported, would that be OK for you?
There would still be differences from importing pandas (e.g. in the axis formatting), but I suppose it would solve the biggest problem?

sinhrks pushed a commit that referenced this issue Aug 16, 2016
xref #2579. This at least solves the direct negative consequence (erroring code by importing pandas) of registering our converters by default.

Author: Joris Van den Bossche <jorisvandenbossche@gmail.com>

Closes #13801 from jorisvandenbossche/plot-datetime-converter and squashes the following commits:

6b6b08e [Joris Van den Bossche] BUG: handle outofbounds datetimes in DatetimeConverter
jorisvandenbossche modified the milestones: 0.19.0, Next Major Release Aug 16, 2016
gfyoung modified the milestones: Someday, Next Major Release Jul 20, 2017
jorisvandenbossche modified the milestones: Someday, 0.21.0 Nov 15, 2017
@jorisvandenbossche
Member

This issue has actually been solved in 0.21.0 (not registering the unit converters by default on pandas import, see #17710).
However, given the amount of feedback, we will temporarily revert this change in 0.21.1 (see #18301) to allow a more graceful deprecation period.

But as you currently get a deprecation warning (so you are notified that you are using the converters), and the automatic registering will eventually be removed once the deprecation period ends (we already have another issue tracking the removal of deprecations), I am closing this issue.
