Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support stdlib zoneinfo.ZoneInfo (new in Python 3.10) #19450

Open
1 task done
uranusjr opened this issue Nov 7, 2021 · 11 comments
Open
1 task done

Support stdlib zoneinfo.ZoneInfo (new in Python 3.10) #19450

uranusjr opened this issue Nov 7, 2021 · 11 comments
Labels
kind:meta High-level information important to the community

Comments

@uranusjr
Copy link
Member

uranusjr commented Nov 7, 2021

Body

Python 3.10 introduced zoneinfo which covers usages of pendulum.tz (for Airflow uses, from what I can tell). Airflow should support zoneinfo.ZoneInfo for all places where it currently uses pendulum.tz.Timezone. This should be a good first step to phase out Pendulum from Airflow.

The module is also backported as backports.zoneinfo to Python 3.6+, but it's probably not a priority to replace Pendulum usages with it for now.

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@uranusjr uranusjr added the kind:meta High-level information important to the community label Nov 7, 2021
@ashb
Copy link
Member

ashb commented Mar 20, 2023

👍, but it's arguably a DAG breaking change to replace pendulum, since all the dates exposed via the (template) context are Pendulum instances, and I know I've used pendulum specific methods on those data objects in python operators

@potiuk
Copy link
Member

potiuk commented Mar 20, 2023

Yet another reason to start thinking about Airlfow 3 ? (just saying).

@uranusjr
Copy link
Member Author

Yeah I don’t think replacing Pendulum is viable now (it would be a big undertaking), the most reasonable approach for the moment would be to simp[ly accept ZoneInfo as input and coerce it into a Pendulum Timezone object.

@eshkinkot
Copy link

first step to phase out Pendulum from Airflow

Is there any reason for this?

@notatallshaw-gts
Copy link
Contributor

phase out Pendulum from Airflow

As a user of Airflow I would like to point out two use cases that will make transitioning off Pendulum difficult for users:

  1. Converting the time zone of a datetime macro
  2. Using the nice formatting Pendulum gives

As an example, this is a macro I commonly write (usually as part of a filepath construction or something):

{{ data_interval_end.in_tz(dag.timezone).format('YYYY_MM_DD') }}

The formatting is easy to migrate (although not as nice) but I don't think there's anything in regular Python that makes it easy to switch the time zone of datetime correctly.

@uranusjr
Copy link
Member Author

uranusjr commented May 7, 2023

astimezone and stftime cover those.

@notatallshaw-gts
Copy link
Contributor

notatallshaw-gts commented May 15, 2023

astimezone

I foresee three main hurdles with converting from in_tz to astimezone:

  1. astimezone only takes a tzinfo subclass, whereas in_tz will coerces a string, which leads to my next most common use case:
{{ data_interval_end.in_tz('America/New_York').format('YYYY_MM_DD') }}
  1. Useful timezones like "America/New_York" are not supported in the standard library until Python 3.9 with zoneinfo

  2. I have a feeling that for such a large application and datetime sensitive application like Airflow it's going to discover lots or weird edge cases (e.g. https://news.ycombinator.com/item?id=35916074) in the standard library as it's so relatively recent to the standard library and the larger Python ecosystem has historically relied on third party libraries, but hopefully the test suite is ready for it

stftime

There are definitely fractional second formatting using format that cannot be expressed using purely stftime alone: https://pendulum.eustace.io/docs/#tokens

I'm not saying migrating moving off Pendulum will be impossible, just raising that for at least some users it will not be pain free.

@uranusjr
Copy link
Member Author

uranusjr commented Oct 5, 2023

From the recent Pendulum PRs (#34744 and #34683) I’m starting to really want to remove our reliance on Pendulum-specific timezones in Airflow core. Pendulum will still be used as the implementation, but we switch to only use standard tzinfo interface instead. Want to get a feel if this is viable and/or worthwhile. Any thoughts? @Taragolis @bolkedebruin

@bolkedebruin
Copy link
Contributor

bolkedebruin commented Oct 6, 2023

I think Pendulum 3 is switching to using the default provided timezones, system and then pytz in that order, anyway. Note: there is no such thing as the standard tzinfo interface. Pendulum 2 by default also used pytz timezones through a package , which just wasn't updated frequently and can use the system provided ones.

I'd prefer keep using Pendulum's interface as that at least is consistent within (-: major releases.

@uranusjr
Copy link
Member Author

uranusjr commented Oct 6, 2023

there is no such thing as the standard tzinfo interface

There is a tzinfo (and timezone) interface, and it is in the stdlib. The problem is pendulum and pytz does not follow that.

@bolkedebruin
Copy link
Contributor

yes it is in stdlib, but it does not have a standard interface in the way of 'what you see is what you get'. tzname() is a prime example. The documentation reads that it will return a timezone name, but PST is not a timezone name. So the problem is that the stdlib does not stipulate sufficiently what needs to be done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:meta High-level information important to the community
Projects
None yet
Development

No branches or pull requests

6 participants