Returned datetime skips a day with time+timezone input and PREFER_DATES_FROM = 'future' #403
Comments
I am experiencing exactly the same issue. It checks whether the given time is in the future for UTC instead of for the given timezone. Versions:
I have found that the following MWE also seems to produce the same bug, even if PREFER_DATES_FROM is not set:

    from datetime import datetime
    from dateparser import parse

    DATEPARSER_SETTINGS = {
        'TIMEZONE': 'UTC',
        'TO_TIMEZONE': 'UTC',
        'RETURN_AS_TIMEZONE_AWARE': False,
        'RELATIVE_BASE': datetime(2019, 8, 18, 3, 55, 1, 0)
    }

    dt = parse("9:57 PM MDT", settings=DATEPARSER_SETTINGS)  # datetime(2019, 8, 19, 3, 57, 0)
    expected = datetime(2019, 8, 18, 3, 57, 0)
    assert dt == expected  # fail: dt is a day later than expected

Note that RELATIVE_BASE is set to 9:55:01 PM MDT on 17 August, which is before the specified time of
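The timezone arithmetic behind the expected value in this example can be checked with the standard library alone. The sketch below does not call dateparser; it assumes MDT can be modelled as a fixed UTC-6 offset, and the names `MDT`, `base_mdt`, `candidate` and `expected_utc` are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: MDT is modelled as a fixed UTC-6 offset (no tz database lookup).
MDT = timezone(timedelta(hours=-6), "MDT")

# RELATIVE_BASE from the example above, interpreted as UTC.
base_utc = datetime(2019, 8, 18, 3, 55, 1, tzinfo=timezone.utc)
base_mdt = base_utc.astimezone(MDT)  # the same instant on the MDT wall clock

# Nearest "9:57 PM" on the MDT wall clock that is not before the base.
candidate = base_mdt.replace(hour=21, minute=57, second=0, microsecond=0)
if candidate < base_mdt:
    candidate += timedelta(days=1)

# Convert back to naive UTC, matching RETURN_AS_TIMEZONE_AWARE = False.
expected_utc = candidate.astimezone(timezone.utc).replace(tzinfo=None)

print(base_mdt.isoformat())  # 2019-08-17T21:55:01-06:00
print(expected_utc)          # 2019-08-18 03:57:00
```

Under that fixed-offset assumption, the base falls on 17 August in MDT, so the nearest matching time is two minutes later on the same local day, which is 03:57 UTC on 18 August.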
It looks like lines 309 to 324 of

    def _set_relative_base(self):
        self.now = self.settings.RELATIVE_BASE
        if not self.now:
            self.now = datetime.utcnow()

    def _get_datetime_obj_params(self):
        if not self.now:
            self._set_relative_base()
        params = {
            'day': self.day or self.now.day,
            'month': self.month or self.now.month,
            'year': self.year or self.now.year,
            'hour': 0, 'minute': 0, 'second': 0, 'microsecond': 0,
        }
        return params

Without a RELATIVE_BASE parameter, the day selected for the parsed input time is the current UTC date, before any past or future PREFER_DATES_FROM adjustment is done. This explains why timezones behind UTC skip a day once UTC has passed midnight but the local timezone has not.

With a RELATIVE_BASE parameter, the parsed time uses the date of RELATIVE_BASE, ignoring any declared timezone settings. This explains @Laogeodritt's latest issue: the 18th from RELATIVE_BASE is joined with 9:57 PM MDT, which ends up on the 19th once converted to UTC.

The current workaround is to specify a RELATIVE_BASE in the same timezone as the input string to parse. When not specifying a timezone in the input string, adding this should solve most issues:

    'RELATIVE_BASE': datetime.now()

With the pytz library:

    'RELATIVE_BASE': datetime.now(pytz.timezone(timezone_of_input_string)).replace(tzinfo=None)

The library fix is probably to make
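To make the mechanism described above concrete, here is a minimal stdlib-only sketch. It is a simplified model of the behaviour this comment describes, not dateparser's actual code path: `combine_like_parser` and `base_in_input_timezone` are hypothetical helpers, and MDT is assumed to be a fixed UTC-6 offset.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: MDT is modelled as a fixed UTC-6 offset.
MDT = timezone(timedelta(hours=-6), "MDT")


def combine_like_parser(base_naive, local_time, tz):
    """Hypothetical model: take the date from the base as-is, attach the
    parsed wall-clock time in tz, then convert the result to naive UTC."""
    local = datetime.combine(base_naive.date(), local_time, tzinfo=tz)
    return local.astimezone(timezone.utc).replace(tzinfo=None)


def base_in_input_timezone(base_utc_naive, tz):
    """Workaround: express a naive UTC base on the input timezone's
    wall clock, then drop tzinfo again."""
    aware = base_utc_naive.replace(tzinfo=timezone.utc)
    return aware.astimezone(tz).replace(tzinfo=None)


base_utc = datetime(2019, 8, 18, 3, 55, 1)  # RELATIVE_BASE given in UTC
parsed_time = time(21, 57)                  # "9:57 PM MDT"

# Date taken from the UTC base: 18 Aug 21:57 MDT, i.e. 19 Aug 03:57 UTC.
skewed = combine_like_parser(base_utc, parsed_time, MDT)

# Date taken from the base shifted to MDT: 17 Aug 21:57 MDT, i.e. 18 Aug 03:57 UTC.
fixed = combine_like_parser(
    base_in_input_timezone(base_utc, MDT), parsed_time, MDT
)

print(skewed)  # 2019-08-19 03:57:00
print(fixed)   # 2019-08-18 03:57:00
```

In this model the only difference between the two results is which timezone's calendar date the base contributes, which is the same adjustment the RELATIVE_BASE workaround makes by hand.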
Versions:
What I was trying to do: Use dateparser to:
I think my settings here should achieve this according to my understanding of the documentation—if not, and this isn't a bug but a config error on my part, please do let me know!
Minimum working example:
With the following settings, behaviour is as expected:
With the following settings, some unexpected behaviour occurs:
Expected behaviour: With an undated time input and PREFER_DATES_FROM=future, dateparser should return the nearest future time that matches, i.e., datetime(2018, 3, 27, 18, 0).
What actually happens: Dateparser skips a day to the second nearest date in the future, datetime(2018, 3, 28, 18, 0). This only happens in the last case above, which to me suggests the following: