Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

humanize overflows before datetime does #245

Closed
carterbox opened this issue Dec 1, 2021 · 11 comments
Closed

humanize overflows before datetime does #245

carterbox opened this issue Dec 1, 2021 · 11 comments
Assignees

Comments

@carterbox
Copy link
Contributor

What did you do?

I tried to humanize a very long time.

What did you expect to happen?

I expect the time to be converted into years/millenia/eon if it is within the bounds of datetime i.e. <= 999 999 999 days, and if the number is higher than that or uncountable with the compute platform (64/32 bits) then something like "longer than countable".

At the very least, this type of error should be handled like others in _date_and_delta: the timedelta is returned unaltered.

What actually happened?

I received an overflow error from an operation within humanize.

What versions are you using?

  • OS: Ubuntu x64
  • Python: 3.9.6
  • Humanize: 3.13.1

Please include code that reproduces the issue.

The best reproductions are self-contained scripts with minimal dependencies.

import datetime as dt
import humanize
d = dt.timedelta(days=999999999)
print(d)
h = humanize.naturaldelta(d)
print(h)
# ... e/time.py", line 131, in _date_and_delta
#    date = now - value
# OverflowError: date value out of range

Once we discuss how to best handle this error, I'd be happy to open a PR.

@hugovk
Copy link
Collaborator

hugovk commented Dec 3, 2021

This is https://stackoverflow.com/a/49267139/724176 and we're going before date.min == date(MINYEAR, 1, 1) == date(1, 1, 1), 1st January 0001.

https://docs.python.org/3/library/datetime.html#datetime.date.min

date = now - value
# date = 2021-12-02 13:13:36.763168 - 999999999 days, 0:00:00

999,999,999 days is ~2,739,726 years, so 2021 - ~2,739,726 is definitely before year 1!

This takes place in _date_and_delta when calculating the date but after we return date and delta, naturaldelta validates date, but other than that it's discarded and only delta is used to figure out the natural delta:

    date, delta = _date_and_delta(value, now=when)
    if date is None:
        return value
    ...

I wonder, is it worth refactoring so we avoid that date = now - value code path here? What are the circumstances where date is None?


At the very least, this type of error should be handled like others in _date_and_delta: the timedelta is returned unaltered.

I originally found it a bit strange to pretend things are fine when there's an error, but it's been done like this for a long time, so this is certainly one option.

On the other hand, I'd guess dealing with 2 million years is pretty rare, and perhaps the user should handle the exception? Is handling an exception preferable to getting back the unaltered delta?

@carterbox
Copy link
Contributor Author

carterbox commented Dec 3, 2021

What are the circumstances where date is None?

When the value passed to naturaldelta() is not datetime or timedelta or cannot be converted to an integer.

# other stuff
else:
    try:
        value = int(value)
        delta = dt.timedelta(seconds=value)
        date = now - delta
    except (ValueError, TypeError):
        return None, value

@carterbox
Copy link
Contributor Author

import datetime as dt
import humanize
d = dt.timedelta(days=999999999)
print(d)
h = humanize.naturaldelta(d, when=dt.date.resolution)
print(h)

I can misuse the when parameter to avoid the overflow for positive deltas, but I'm not sure if this would break in a future release.

@hugovk
Copy link
Collaborator

hugovk commented Dec 12, 2021

I'm thinking we should either let the OverflowError bubble up for the user to handle it (the status quo) or return the timedelta unaltered (which I think makes it harder for the user to handle, but is at least consistent)?

@carterbox
Copy link
Contributor Author

carterbox commented Dec 13, 2021

After more though, I think the better option (of those two you mention) would be to raise OverflowError because the current behavior isn't documented and is (I think) unexpected. It is simpler that naturaldelta() always returns a string and raises an error to explain invalid inputs instead of just silently returning something that is not a string.

@carterbox
Copy link
Contributor Author

Should I also open a PR to deprecate/warn users about this undocumented behavior (the returning of a timedelta instead of a string)?

@hugovk
Copy link
Collaborator

hugovk commented Dec 15, 2021

I'm a bit wary about changing the behaviour as this is an old library (we just missed the 10-year anniversary! 🍰 ) with lots of downloads, and I think the Django one some of these are based on does a similar thing.

Maybe the best thing to do is document the behaviour more clearly?

@carterbox
Copy link
Contributor Author

To summarize: the original issue is that an OverflowError is raised by naturaldelta() at deltas much smaller than what datetime can handle. Other errors in naturaldelta() cause a timedelta to be returned instead of a string, but this is undocumented. We want to retain the current behavior (to raise errors sometimes, return string sometimes, and return timedelta sometimes), but we want to document when each of these outcomes occurs.

Can I still refactor naturaldelta() so that the OverflowError is raised by datetime instead of by humanize? I believe this is compatible with the approach above given the unrelated deprecation of naturaldelta(..., when).

@hugovk
Copy link
Collaborator

hugovk commented Dec 15, 2021

Yes, documentation and refactoring welcome!

@carterbox
Copy link
Contributor Author

Assign me, please! Otherwise, I might loose track of this issue.

@hugovk
Copy link
Collaborator

hugovk commented Dec 21, 2021

@hugovk hugovk closed this as completed in 3b40d5f Feb 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants