Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

in_months() returns wrong values in some cases in January to February #790

Open
2 tasks done
baubie opened this issue Jan 4, 2024 · 1 comment
Open
2 tasks done

Comments

@baubie
Copy link

baubie commented Jan 4, 2024

  • I am on the latest Pendulum version.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • Ubuntu 22.04 and Python 3.11:
  • 3.0.0:

Issue

We encountered a bug in 2.1 that persists in 3.0 when using the in_months() function of an interval where we get weird results for very specific dates. Examples are shown below.

In a leap year (2024). Why would January 3rd return 1 but the 2rd and 4th return 0?

>>> import pendulum

>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,1,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,2,0,0,0, tz='America/Toronto')).in_months()
0 # Why is this zero but January 1st and 3rd are 1?
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,3,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,4,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2024,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2024,1,5,0,0,0, tz='America/Toronto')).in_months()
0

Not a leap year. Why would January 4th return 1 but the 3rd and 5th return 0?

>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,1,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,2,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,3,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,4,0,0,0, tz='America/Toronto')).in_months()
1
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,5,0,0,0, tz='America/Toronto')).in_months()
0
>>> (pendulum.datetime(2023,2,1,0,0,0, tz='America/Toronto') - pendulum.datetime(2023,1,6,0,0,0, tz='America/Toronto')).in_months()
0
@baubie baubie changed the title in_months() returns wrong values in some cases in leap years. in_months() returns wrong values in some cases in January to February Jan 4, 2024
dekean added a commit to dekean/pendulum that referenced this issue Jan 27, 2024
@Benji19967
Copy link
Contributor

Benji19967 commented Mar 7, 2024

@sdispater @Secrus before we look for potential changes in the code, it should be clear what in_months() is supposed to return. If you want to skip straight to what I think is the best solution, you can jump to the TL;DR at the bottom.

The output is what precise_diff in _helpers returns, which is what in_months() uses. Read Interval in the tables below as end - start.

From the issue above:

Interval Current Desired Is Correct
Feb 1 - Jan 1 1 months 0 days 1 months 0 days Yes
Feb 1 - Jan 2 0 months 30 days 0 months 30 days Yes
Feb 1 - Jan 3 0 months 29 days 0 months 29 days Yes
Feb 1 - Jan 4 1 months 0 days 0 months 28 days No
Feb 1 - Jan 5 0 months 27 days 0 months 30 days Yes

This example should clarify the issue (let's assume we're dealing with a non-leap year):

Interval Current Desired
Feb 28 - Jan 28 1 months 0 days 1 months 0 days
Mar 1 - Feb 1 1 months 0 days 1 months 0 days

So far so good. But what pendulum currently returns for the following examples seems inconsistent.

Interval Current Desired
Mar 1 - Jan 28 1 months 1 days 1 months 4 days
Mar 1 - Jan 29 1 months 1 days 1 months 3 days
Mar 1 - Jan 30 1 months 1 days 1 months 2 days
Mar 1 - Jan 31 1 months 1 days 1 months 1 days

The Desired column above represents an alternative. But then we would run into the following:

Interval Desired
Feb 28 - Jan 28 1 months 0 days
Mar 1 - Jan 29 1 months 3 days

By increasing the start and end of the interval by 1 day each, the number of days gets bumped by 3.

Clearly, it's not straightforward to establish what the output should be—all solutions appear to yield non-intuitive results.

TL;DR

As Dijkstra advocated, the best solution appears to be to use a semi-open interval [A, B), where A is inclusive and B exclusive.

Interval Output
Feb 1 - Jan 1 0 months 30 days 23 hours 59 minutes 59 seconds 999999 microseconds
Feb 1 - Jan 2 0 months 29 days 23 hours 59 minutes 59 seconds 999999 microseconds
Feb 1 - Jan 3 0 months 28 days 23 hours 59 minutes 59 seconds 999999 microseconds
Feb 1 - Jan 4 0 months 27 days 23 hours 59 minutes 59 seconds 999999 microseconds
Feb 1 - Jan 5 0 months 26 days 23 hours 59 minutes 59 seconds 999999 microseconds
Feb 28 - Jan 28 0 months 30 days 23 hours 59 minutes 59 seconds 999999 microseconds
Mar 1 - Jan 28 1 months 3 days 23 hours 59 minutes 59 seconds 999999 microseconds
Mar 1 - Jan 29 1 months 2 days 23 hours 59 minutes 59 seconds 999999 microseconds
Mar 1 - Jan 30 1 months 1 days 23 hours 59 minutes 59 seconds 999999 microseconds
Mar 1 - Jan 31 1 months 0 days 23 hours 59 minutes 59 seconds 999999 microseconds
Mar 1 - Feb 1 0 months 27 days 23 hours 59 minutes 59 seconds 999999 microseconds

There is only ever a month if there is a full month between end and start.

The output is intuitive, consistent and hence also easy to document. I think using a semi-open interval should be the de facto choice for diffs/durations/ranges/intervals.

Related issue: #606

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants