strpdate('20141110', '%Y%m%d%H%S') returns wrong date #67029
Comments
strptime() returns the wrong date if I parse today's date (2014-11-10) as a string with no separators and ask strptime() to look for nonexistent hour and minute fields.
>>> datetime.datetime.strptime('20141110', '%Y%m%d').isoformat()
'2014-11-10T00:00:00'
>>> datetime.datetime.strptime('20141110', '%Y%m%d%H%M').isoformat()
'2014-01-01T01:00:00'
What result did you expect?
I expected the second call to strptime() to throw an exception, because %Y consumed '2014', %m consumed '11', and %d consumed '10', leaving nothing for %H and %M to match. That would be consistent with the first call.
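For readers who need the strict behavior described in this comment, one possible workaround is to check the input width before calling strptime(). This is only a sketch, not something proposed in the thread: the helper name `parse_compact_timestamp` and the `FIXED_WIDTH` pattern are hypothetical, and it assumes every field is zero-padded to full width (4 + 2 + 2 + 2 + 2 digits).

```python
import re
from datetime import datetime

# Hypothetical guard: exactly 12 ASCII digits (YYYYMMDDHHMM).
FIXED_WIDTH = re.compile(r"[0-9]{12}")


def parse_compact_timestamp(text):
    """Parse 'YYYYMMDDHHMM', rejecting inputs of any other length."""
    if FIXED_WIDTH.fullmatch(text) is None:
        raise ValueError(f"expected exactly 12 digits, got {text!r}")
    # With exactly 12 digits, a successful parse has to use two digits
    # per field, because %Y takes four and every other field takes at
    # most two.
    return datetime.strptime(text, "%Y%m%d%H%M")


print(parse_compact_timestamp("201411100000"))  # 2014-11-10 00:00:00

try:
    parse_compact_timestamp("20141110")
except ValueError as exc:
    print("rejected:", exc)
```

The length check runs first, so the short string from the report is rejected before strptime() gets a chance to redistribute its digits.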
The documentation certainly appears to say that %m, for example, will consume two digits, but it could just as easily be only for output (i.e. strftime). I suspect this is simply a documentation issue as opposed to a bug, but let's see what the others think.
I have recently closed a similar issue (bpo-5979) as "won't fix". The winning argument there was that Python behavior was consistent with C. How does C strptime behave in this case?
With the following C code:
#define _XOPEN_SOURCE 700  /* needed for the strptime() declaration on glibc */
#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
    char buf[255];
    struct tm tm;
    memset(&tm, 0, sizeof(tm));  /* start from an all-zero struct tm */
    /* The return value of strptime() is not checked here. */
    strptime("20141110", "%Y%m%d%H%M", &tm);
    strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M", &tm);
    printf("%s\n", buf);
    return 0;
}
I get
$ ./a.out
2014-11-10 00:00
So I think Python behavior is wrong.
Here is the case that I think illustrates the current logic better:
>>> datetime.strptime("20141234", "%Y%m%d%H%M")
datetime.datetime(2014, 1, 2, 3, 4)
Looking at the POSIX standard (http://pubs.opengroup.org/onlinepubs/009695399/functions/strptime.html), it appears that Python may be compliant:
%H The hour (24-hour clock) [00,23]; leading zeros are permitted but not required.
Here is another interesting bit from the standard: "The application shall ensure that there is white-space or other non-alphanumeric characters between any two conversion specifications." This is how they get away with not specifying whether the parser for variable-width fields should be greedy or not.
strptime very much follows the POSIX standard, as I implemented strptime by reading that doc. If you want to see how the behaviour is implemented you can look at https://hg.python.org/cpython/file/ac0334665459/Lib/_strptime.py#l178 .
But the key thing here is that the OP has unused formatters. Since strptime uses regexes under the hood, the re module does its best to match the entire format. Since POSIX says that e.g. the leading 0 for %m is optional, the regex goes with the single-digit version to let the %H format match _something_ (same goes for %d and %M).
So without rewriting strptime to not use regexes to support unused formatters and to stop being so POSIX-compliant, I don't see how to change the behaviour. Plus it would be backwards-incompatible, as this is how strptime has worked since 2002. It's Alexander's call, but I vote to close this as "not a bug".
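The backtracking described in this comment can be reproduced with the re module alone. The pattern below is a simplified approximation written for illustration, not the exact regex that _strptime.py builds; the names `COMPACT` and `split_fields` are hypothetical. Each alternation lists the two-digit forms first and the single-digit form last, so the engine only falls back to one digit when the later fields would otherwise have nothing to match.

```python
import re

# Simplified stand-in for the regex generated from "%Y%m%d%H%M".
# Two-digit alternatives come first; a lone digit is the fallback.
COMPACT = re.compile(
    r"(?P<Y>\d\d\d\d)"
    r"(?P<m>1[0-2]|0[1-9]|[1-9])"
    r"(?P<d>3[01]|[12]\d|0[1-9]|[1-9])"
    r"(?P<H>2[0-3]|[01]\d|\d)"
    r"(?P<M>[0-5]\d|\d)"
)


def split_fields(text):
    """Return the field strings the regex assigns, or None if no match."""
    match = COMPACT.match(text)
    return match.groupdict() if match else None


# %m and %d each give up a digit so that %H and %M can match something.
print(split_fields("20141110"))
# {'Y': '2014', 'm': '1', 'd': '1', 'H': '1', 'M': '0'}

print(split_fields("20141234"))
# {'Y': '2014', 'm': '1', 'd': '2', 'H': '3', 'M': '4'}
```

These field splits correspond to the two results reported earlier in the thread: 2014-01-01T01:00 and datetime(2014, 1, 2, 3, 4).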
After reading the standard a few more times, I agree with Brett and Ethan that this is at most a call for better documentation. I'll leave this open in case someone comes up with a succinct description of what exactly datetime.strptime does. (Maybe we should just document the format-to-regexp translation implemented in _strptime.py.) We may also include POSIX's directive "The application shall ensure that there is white-space or other non-alphanumeric characters between any two conversion specifications" as a recommendation.
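A short sketch of what that recommendation buys in practice: once the conversion specifications are separated by literal characters, the digits cannot be redistributed between fields, and a missing time part is reported as an error. The helper name `parses` is hypothetical and exists only to keep the example compact.

```python
from datetime import datetime


def parses(text, fmt):
    """Return True if strptime accepts text under fmt, False on ValueError."""
    try:
        datetime.strptime(text, fmt)
    except ValueError:
        return False
    return True


# With separators, the missing hour and minute are detected.
print(parses("2014-11-10", "%Y-%m-%d %H:%M"))        # False
print(parses("2014-11-10 00:00", "%Y-%m-%d %H:%M"))  # True

# Without separators, the same missing fields go unnoticed.
print(parses("20141110", "%Y%m%d%H%M"))              # True
```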