New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-1100942: Add datetime.time.strptime and datetime.date.strptime #5578

Open
wants to merge 9 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@matrixise
Contributor

matrixise commented Feb 7, 2018

Add datetime.date.strptime and datetime.time.strptime.

Fix the documentation of _strptime._strptime, the documentation was
wrong, return a 3-tuple and not a 2-tuple

Co-authored-by: Alexander Belopolsky alexander.belopolsky@gmail.com
Co-authored-by: Amaury Forgeot d'Arc amauryfa@gmail.com
Co-authored-by: Berker Peksag berker.peksag@gmail.com
Co-authored-by: Josh-sf josh-sf@users.sourceforge.net
Co-authored-by: Juarez Bochi jbochi@gmail.com
Co-authored-by: Maciej Szulik soltysh@gmail.com
Co-authored-by: Stéphane Wirtel stephane@wirtel.be
Co-authored-by: Matheus Vieira Portela matheus.v.portela@gmail.com

https://bugs.python.org/issue1100942

@matrixise

This comment has been minimized.

Contributor

matrixise commented Feb 7, 2018

This PR is for a very long issue, since 2005. We have a PR in 2018 👍

@Mariatta

This comment has been minimized.

Member

Mariatta commented Feb 9, 2018

I restarted the travis job. It still did not do the full CPython test suite. So please rebase :)

@matrixise

This comment has been minimized.

Contributor

matrixise commented Feb 15, 2018

Thanks, I didn't see your message, works on this issue today.

@matrixise matrixise force-pushed the matrixise:bpo-1100942 branch from ad32c42 to a7b624a Feb 15, 2018

@matrixise

This comment has been minimized.

Contributor

matrixise commented Feb 15, 2018

@Mariatta rebased and the tests pass on the CIs

the number of microseconds based on the input string and the
format string."""
format string, and the GMT offset."""

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

We should probably consistently use "UTC offset", though I suppose it doesn't matter much since it's not a public-facing docstring.

This comment has been minimized.

@matrixise

matrixise Oct 25, 2018

Contributor

@pganssle I am not really confident with the offsets and the datetime. Do you think we could keep it like that and propose an other bpo ?

This comment has been minimized.

@pganssle

pganssle Oct 25, 2018

Contributor

I just meant use UTC instead of GMT everywhere.

@@ -565,6 +566,10 @@ def _strptime(data_string, format="%a %b %d %H:%M:%S %Y"):
hour, minute, second,
weekday, julian, tz, tzname, gmtoff), fraction, gmtoff_fraction
date_specs = ('%a', '%A', '%b', '%B', '%c', '%d', '%j', '%m', '%U',

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

I think this is missing %G, %u and %V, the ISO 8601 week calendar directives.

@@ -565,6 +566,10 @@ def _strptime(data_string, format="%a %b %d %H:%M:%S %Y"):
hour, minute, second,
weekday, julian, tz, tzname, gmtoff), fraction, gmtoff_fraction
date_specs = ('%a', '%A', '%b', '%B', '%c', '%d', '%j', '%m', '%U',
'%w', '%W', '%x', '%y', '%Y',)
time_specs = ('%T', '%R', '%H', '%I', '%M', '%S', '%f', '%i', '%s',)

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

I don't see any references to %T, %R, %i or %s in the docs or elsewhere in the code. What do these represent?

_time = _strptime_datetime(datetime_datetime, data_string, format)
return _time.time()
def _check_invalid_datetime_specs(fmt, specs, msg):

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

If I'm understanding this correctly, this function seems to have the inverted sense of what I would expect. It seems that it checks if fmt is a valid spec.

From the names of the variables and functions, I was thinking that this would be a whitelist not a blacklist. Does it make sense to switch to a whitelist approach? If not, can we maybe change specs to be blacklist_specs or something?

@@ -565,6 +566,10 @@ def _strptime(data_string, format="%a %b %d %H:%M:%S %Y"):
hour, minute, second,
weekday, julian, tz, tzname, gmtoff), fraction, gmtoff_fraction
date_specs = ('%a', '%A', '%b', '%B', '%c', '%d', '%j', '%m', '%U',
'%w', '%W', '%x', '%y', '%Y',)
time_specs = ('%T', '%R', '%H', '%I', '%M', '%S', '%f', '%i', '%s',)

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

I don't have strong opinions about this, but my intuition is that these should be set or frozenset rather than tuple. Are these tuples for performance reasons (I am not sure I know when exactly it's faster to use a set for "lookup membership" rather than a tuple or list).

def _strptime_datetime_date(data_string, format):
"""Return a date based on the input string and the format string."""
if not format:
raise ValueError("Date format is not valid.")

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

This line and its equivalent in _time are not being hit. If I understand correctly this branch is only hit if format is empty?

This comment has been minimized.

@matrixise

matrixise Oct 25, 2018

Contributor

useless, thanks

tests = [('2004-12-01 13:02:47.197',
'%Y-%m-%d %H:%M:%S.%f'),
('2004-12-01', '%Y-%m-%d'),]
for date_string, date_format in tests:

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

I think it would be good to use with self.subTest for these parametrized tests.

Also, per the other comment I guess you need to add something like ('12:30:15', '')` to get full coverage.

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

I think you need two more test cases:

    ('1900-01-01 12:30', '%Y-%m-%d %H:%M'),
    ('12:30:15', ''),

date.strptime has similarly missing tests.

This comment has been minimized.

@matrixise

matrixise Nov 11, 2018

Contributor

the test with ('1900-01-01 12:30', '%Y-%m-%d %H:%M') does not raise an exception but returns datetime.time(12, 30)

For the other test, yep, there is an issue.

This comment has been minimized.

@pganssle

pganssle Nov 11, 2018

Contributor

The fact that it doesn't raise an exception is an issue. I'm a bit surprised that it doesn't raise an exception on pure Python, that's a bug, because I'm pretty sure that:

datetime.time.strptime("1901-01-01 12:30", "%Y-%m-%d %H:%M") does raise an exception.

*/
if (emptyDatetime == NULL) {
PyObject *emptyStringPair = Py_BuildValue("ss", "", "");
if (emptyStringPair == NULL)

This comment has been minimized.

@pganssle

pganssle May 15, 2018

Contributor

If I'm interpreting PEP 7 correctly, these if statements need curly braces. The relevant section is Code layout.

@matrixise

This comment has been minimized.

Contributor

matrixise commented May 15, 2018

Hi @pganssle

thank you for your review, I am going to fix it asap but I am not the author of the code, just the author of the PR. so, maybe I would need your help. Thanks

@matrixise matrixise force-pushed the matrixise:bpo-1100942 branch from fb6d098 to 1d7a5b0 Oct 5, 2018

@matrixise

This comment has been minimized.

Contributor

matrixise commented Oct 5, 2018

@pganssle I just rebased my branch with master. I am going to work on this PR. Do you want to help me because you are mister dateutil ;-)

@pganssle

This comment has been minimized.

Contributor

pganssle commented Oct 5, 2018

@matrixise Sorry this is on my list but probably can't get to it until the end of the month. 😟

@matrixise

This comment has been minimized.

Contributor

matrixise commented Oct 5, 2018

@pganssle ok, in this case, I will try to fix all the issues alone ;-) but I am not worried ;-)

@matrixise matrixise force-pushed the matrixise:bpo-1100942 branch from 1d7a5b0 to c286708 Oct 17, 2018

@matrixise

This comment has been minimized.

Contributor

matrixise commented Oct 17, 2018

Hi, I just updated this PR with master.

@matrixise matrixise force-pushed the matrixise:bpo-1100942 branch from a6722e7 to 7526de9 Oct 25, 2018

@matrixise

This comment has been minimized.

Contributor

matrixise commented Oct 25, 2018

@pganssle when you have time, could you review this PR, we started together, just comment when you find a mistake, thanks

@matrixise

This comment has been minimized.

Contributor

matrixise commented Nov 4, 2018

ping @pganssle ;-)

@pganssle

I haven't had a chance to look at the PR thoroughly, but I have a first-pass review of things that should be changed.

I also haven't checked exactly how you are doing it, but I believe we may want to / be able to refactor the tests a bit to take advantage of the existing test suite for datetime.strptime by separating out the date-only and time-only formats and reusing the tests with date, time and datetime.

Additionally, I should note that it's unfortunate that if this is merged, we'll have time.strptime and datetime.time.strptime, the first of which returns a timetuple (which is actually more like a datetime), and the second returning a datetime.time object. I don't see any way around this, but it will add confusion. :(

Show resolved Hide resolved Doc/library/datetime.rst
Return a :class:`date` corresponding to *date_string*, parsed according to
*format*. :exc:`ValueError` is raised if the date string and format can't be
parsed by `time.strptime`, or if it returns a value where the time part is
nonzero.

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

Here and below, you have nonzero instead of non-zero. If we keep this wording, the hyphen needs to be added.

However, this PR does not parse to a timetuple or something and check if certain components were zero, it checks to see if the format string contains time components, and fails in that case (Edit: I was looking at just the pure python implementation - I now realize that this is precisely what the C implementation is doing, but I think it's the wrong thing to do anyway). The way the docs are currently worded, you would expect this to work:

from datetime import date
date.strptime("2018-01-01 00:00", "%Y-%m-%d %H:%M")

But it will fail (rightly so, I think).

I believe you can change the last part of the last sentence as such:

-     parsed by `time.strptime`, or if it returns a value where the time part is
-     nonzero.
+     parsed by `time.strptime`, or if time components are present in the format string.

Also, I think it needs to be

:meth:`time.strptime`

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

One other note on the documentation, the datetime.strptime documentation links to

:ref:`strftime-strptime-behavior`.

I think these should too.

{
return new_date(GET_YEAR(self),
GET_MONTH(self),
GET_DAY(self));
}
static PyObject *
datetime_gettime(PyDateTime_DateTime *self, PyObject *Py_UNUSED(ignored))

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

I don't know why this PyObject *Py_UNUSED(ignored) is here, so per Chesterton's Fence, I'm not comfortable removing it. Any insight as to why it is here and what the consequences will be in removing it?

Possibly it will be a breaking change in the C ABI?

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

According to git blame, this is to silence a warning in gcc8. Is this still relevant @siddhesh @serhiy-storchaka?

Edit: Looking closer, the merged PR is from April, so I'm guessing it is, but now I'm wondering if this will break the C ABI the other way.

This comment has been minimized.

@serhiy-storchaka

serhiy-storchaka Nov 5, 2018

Member

Yes, datetime_gettime should have two arguments, the second is ignored.

return NULL;
}
if (DATE_GET_HOUR(datetime) ||

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

Hm, if I understand correctly, this is inconsistent with the pure Python version. The pure Python version checks if there are any time components in the format string, whereas the C version parses to a datetime and then checks to see if there are any time components.

I think checking the format string is the better way to do this, as I mention in another comment, this approach seems to indicate that date.strptime("2018-01-01 00:00", "%Y-%m-%d %H:%M") would succeed, which is not the right thing to do. I will comment on the test suite to add a test for this.

This comment has been minimized.

@matrixise

matrixise Nov 11, 2018

Contributor

ok, I will check the Python implementation, but in this case, we could migrate the Python implementation to the C layer?

This comment has been minimized.

@pganssle

pganssle Nov 11, 2018

Contributor

Yeah, port the Python implementation to the C layer.

This comment has been minimized.

@matrixise

matrixise Nov 11, 2018

Contributor

lol ;-) I am not an expert with the C-API, but I could try, it's a good exercise for my comprehension of the C-API of Python

tests = [('2004-12-01 13:02:47.197',
'%Y-%m-%d %H:%M:%S.%f'),
('2004-12-01', '%Y-%m-%d'),]
for date_string, date_format in tests:

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

I think you need two more test cases:

    ('1900-01-01 12:30', '%Y-%m-%d %H:%M'),
    ('12:30:15', ''),

date.strptime has similarly missing tests.

tests = [
('2004-12-01 13:02:47.197', '%Y-%m-%d %H:%M:%S.%f'),
('01', '%M'),
('02', '%H'),

This comment has been minimized.

@pganssle

pganssle Nov 4, 2018

Contributor

This needs at least two more test cases:

    ('2018-01-01 00:00', '%Y-%m-%d %H:%M'),
    ('2018-01-01', ''),

Both should fail.

@matrixise

This comment has been minimized.

Contributor

matrixise commented Nov 4, 2018

@pganssle thanks for your review, I am going to update this PR asap.

matrixise and others added some commits Feb 7, 2018

bpo-1100942: Add datetime.time.strptime and datetime.date.strptime
Add datetime.date.strptime and datetime.time.strptime.

Fix the documentation of _strptime._strptime, the documentation was
wrong, return a 3-tuple and not a 2-tuple

Co-authored-by: Alexander Belopolsky <alexander.belopolsky@gmail.com>
Co-authored-by: Amaury Forgeot d'Arc <amauryfa@gmail.com>
Co-authored-by: Berker Peksag <berker.peksag@gmail.com>
Co-authored-by: Josh-sf <josh-sf@users.sourceforge.net>
Co-authored-by: Juarez Bochi <jbochi@gmail.com>
Co-authored-by: Maciej Szulik <soltysh@gmail.com>
Co-authored-by: Stéphane Wirtel <stephane@wirtel.be>
Co-authored-by: Matheus Vieira Portela <matheus.v.portela@gmail.com>

@matrixise matrixise force-pushed the matrixise:bpo-1100942 branch from 1ca284e to 43a77af Nov 11, 2018

@matrixise

This comment has been minimized.

Contributor

matrixise commented Nov 11, 2018

ok, rebased with the last master.

@matrixise

This comment has been minimized.

Contributor

matrixise commented Nov 11, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment