Drop use of pytz dependency in next major release #10443

Closed
pganssle opened this issue Feb 13, 2018 · 3 comments
@pganssle
Member

Currently matplotlib defaults to using pytz for time zone support. I think this should be dropped in favor of dateutil.tz, which supports everything pytz does and more.

Case for dateutil

For starters, dateutil is already a dependency, so this removes rather than replaces a dependency. Additionally, the time zones dateutil provides properly implement the tzinfo protocol, while pytz uses a non-standard API that is famously confusing.

Additionally, as of dateutil==2.6.0, dateutil has support for the fold attribute introduced in PEP 495. It is currently recommended in the Python 3.6 documentation (scroll up a bit, that's the closest anchor) for this reason. It is unlikely that pytz will implement fold support, as that's not really how pytz works.
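To make the fold point concrete, here is a minimal sketch, assuming Python 3.6+ and dateutil 2.6.0 or later. The date is my own illustrative choice (the autumn DST transition in New York in 2017); `tz.datetime_ambiguous` and `tz.enfold` are dateutil helpers, and the variable names are hypothetical.

```python
from datetime import datetime
from dateutil import tz

NYC = tz.gettz('America/New_York')

# 01:30 on 2017-11-05 occurs twice in New York, because clocks fall
# back from 02:00 EDT to 01:00 EST on that date.
ambiguous = datetime(2017, 11, 5, 1, 30, tzinfo=NYC)
print(tz.datetime_ambiguous(ambiguous))
# True

# fold=0 selects the first occurrence (still on daylight time),
# fold=1 selects the second occurrence (back on standard time).
first = tz.enfold(ambiguous, fold=0)
second = tz.enfold(ambiguous, fold=1)

print(first.tzname())
# EDT

print(second.tzname())
# EST
```

The same wall time maps to two different UTC offsets depending only on `fold`, with no zone-specific method call involved.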

I'll also say that most people seem to think that pytz already works like dateutil does, which causes all kinds of problems. An example:

import pytz
from dateutil import tz
from datetime import datetime, timedelta

NYC = tz.gettz('America/New_York')
NYC_pytz = pytz.timezone('America/New_York')

dt_du = datetime(2016, 4, 2, tzinfo=NYC)
dt_pt_bad = datetime(2016, 4, 2, tzinfo=NYC_pytz)    # Wrong!
dt_pt = NYC_pytz.localize(datetime(2016, 4, 2))

print(dt_du)
# 2016-04-02 00:00:00-04:00

print(dt_pt_bad)
# 2016-04-02 00:00:00-04:56

print(dt_pt)
# 2016-04-02 00:00:00-04:00

print(dt_du - timedelta(days=60))
# 2016-02-02 00:00:00-05:00

print(dt_pt - timedelta(days=60))
# 2016-02-02 00:00:00-04:00

print(NYC_pytz.normalize(dt_pt - timedelta(days=60)))
# 2016-02-01 23:00:00-05:00

Generally matplotlib handles all of this correctly internally, so the change only breaks backwards compatibility insofar as users are actually taking the time zone they get back and using it for something. Those users are getting an object with a famously confusing API (every time I talk about this, people tell me they've been doing it wrong, and even people who know how this works in general often don't understand the details well).

Case against dateutil

In favor of pytz, I'll say that it's a very "light" dependency in the sense that it's extremely widely used and is a hard dependency of other libraries like pandas that matplotlib users are likely to be installing anyway.

Additionally, pytz is currently faster than dateutil, and I think it does a (somewhat) better job at memoizing function calls. This is something that I'm actively working on in dateutil, and I've already closed the gap in many areas (particularly dateutil.tz.tzutc(), which is very useful).

Summary:

In favor of dateutil:

  1. Standard tzinfo interface
  2. Support for fold
  3. Already a dependency

In favor of pytz:

  1. Faster
  2. Wouldn't break backwards compatibility to keep it.

I'll also note that I'm only suggesting that matplotlib stop using pytz as its default timezone provider, not that all support for pytz be dropped. Users should still be able to supply datetime objects with any valid tzinfo and have matplotlib work properly.
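As an illustration of what "any valid tzinfo" means here, the following is a minimal sketch using only the standard library. `FixedOffset` is a hypothetical class written for this example, not something matplotlib, pytz or dateutil provides; it only shows that an object implementing the standard tzinfo methods can take part in aware-to-aware conversions.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class FixedOffset(tzinfo):
    """Hypothetical minimal tzinfo: a constant UTC offset with no DST."""

    def __init__(self, hours, name):
        self._offset = timedelta(hours=hours)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # No daylight saving time in this toy zone.
        return timedelta(0)

    def tzname(self, dt):
        return self._name


dt = datetime(2018, 2, 13, 12, 0, tzinfo=FixedOffset(-5, 'EST'))

# Convert to UTC and back using only the standard tzinfo protocol.
as_utc = dt.astimezone(timezone.utc)
back = as_utc.astimezone(dt.tzinfo)

print(as_utc)
# 2018-02-13 17:00:00+00:00

print(back == dt)
# True
```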

@tacaswell tacaswell added this to the v3.0 milestone Feb 13, 2018
@tacaswell
Member

I am 👍 on this and dropping pytz as a run-time dependency, but we should keep a few tests using it.

@anntzer
Contributor

anntzer commented Feb 14, 2018

Do we even need to? Aren't tz implementations simply supposed to inherit https://docs.python.org/3/library/datetime.html#datetime.tzinfo and that should be the ground truth?

@pganssle
Member Author

pganssle commented Feb 14, 2018

@anntzer pytz zones are not interchangeable with standard tzinfo implementations, as I demonstrated in my original post, and some infrastructure may be needed to support them properly. I recommend dropping pytz as a dependency and scrubbing it from the library code, but continuing to test against it.

When I prepare the PR, I'll look into how much special-case code is necessary to support pytz-style zones. If it's a lot, it might be worth discussing dropping explicit support for pytz, but there's at least a subset of operations that should probably be supported given how widely pytz is used.

One thing I'll note is that basically anything that operates entirely on the "absolute timeline" should be no problem, because for both pytz and dateutil you can do the exact same operations:

dt_utc = dt.astimezone(tz.tzutc())
dt_utc_txformed = txform(dt_utc)
dt_txformed = dt_utc_txformed.astimezone(dt.tzinfo)

These aware-to-aware transformations are supported perfectly well by both pytz and dateutil. The problem comes when you have "wall clock" operations. These come up, for example, with rrule, where you are generating "wall times" and then localizing them to a given zone. See, for example, this function, which assumes anything implementing a localize function implements a pytz-like interface.
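A minimal sketch of that kind of duck-typed dispatch, for illustration only: `attach_zone` is a hypothetical helper, not the matplotlib function referred to above, and it makes the same assumption that any zone exposing `localize` behaves like a pytz zone.

```python
from datetime import datetime
from dateutil import tz


def attach_zone(naive_dt, zone):
    """Hypothetical helper: attach `zone` to a naive wall-clock datetime."""
    localize = getattr(zone, 'localize', None)
    if localize is not None:
        # Assumed pytz-style zone: localize() picks the correct offset
        # for this wall time.
        return localize(naive_dt)
    # Standard tzinfo (e.g. dateutil): attaching the zone directly works.
    return naive_dt.replace(tzinfo=zone)


NYC = tz.gettz('America/New_York')

# Wall times as they might come out of a recurrence rule.
wall_times = [datetime(2016, 2, 2), datetime(2016, 4, 2)]
aware = [attach_zone(dt, NYC) for dt in wall_times]

for dt in aware:
    print(dt)
# 2016-02-02 00:00:00-05:00
# 2016-04-02 00:00:00-04:00
```

With a dateutil zone only the `replace` branch runs. Passing a pytz zone would take the `localize` branch instead, which is the special-case code whose extent is in question.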
