-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Fix #75035: Fix 5+ digit years in ISO 8601 date strings #2672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch adds support for parsing (unserializing, constructing) ISO 8601 date strings with 5+ digit years, consistent with our already existing support for serializing and formatting these types of dates. The standard allows implementations to use 5+ digit years so long as the sender and receiver agree on the format. The patch is at odds with tests for bugfix #62852, which asserts that a 5 digit year should fail to parse. Reviewing the original bug report, it seems to be more related to invalid timezone data. I didn't dig into it very far. As a bonus, the patch also ensures that ISO 8601 years -1 thru -999 are formatted with 4 digits (instead of 3 digits) prefixed by a minus sign. I believe this is required by the standard. Additionally it fixes an overflow error that causes bad year formatting. I focused only on ISO 8601 for this patch, but the same `long long -> int` overflow exists in other date formats, e.g., RFC 2822 dates. I regenerated `ext/date/lib/parse_date.c` manually with `$ re2c -d -b -o ...`. I'm not sure if that's correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes to timelib need to go here: https://github.com/derickr/timelib
break; | ||
case 'c': { | ||
const char *fmt = t->y < 0 && t->y > -1000 | ||
/* Ensure 4 digits for years -1 thru -999 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a BC break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific rubric for distinguishing between bugfix and bc break? I view this more as a bugfix than a bc break as the docs advertise ISO 8601 support. Ditto for the overflow fix. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also against the ISO standard, which says a minimum of four digits optionally prefixed by a - (or +): https://en.wikipedia.org/wiki/ISO_8601#Years
@derickr should this be closed ? Is it the kind of BC break you will never be happy with, or can we have it in 7.3 ? |
The BC break is in the formatting, and the datex rules need to go to timelib upstream, so yes, this PR should be closed (for now). I'll have a look at whether the datex rules breaks any tests on any other integration that uses timelib. |
date.timezone=GMT | ||
--FILE-- | ||
<?php | ||
$s2 = 'O:3:"Foo":3:{s:4:"date";s:20:"10007-06-07 03:51:49";s:13:"timezone_type";i:3;s:8:"timezone";s:3:"UTC";}'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test should not be removed, but instead, a different wrong date string should be used that is unparsable.
@derickr Let me know if you'd like me to assist / file a pr on timelib. Thanks. |
@adsr Sure - just make sure to add tests and that all existing tests pass. |
We can fix this bug in 7.3 if derickr/timelib#16 makes it into timelib 2018.01. Can we re-open? @krakjoe @derickr |
This patch adds support for parsing (unserializing, constructing)
ISO 8601 date strings with 5+ digit years, consistent with our
already existing support for serializing and formatting these
types of dates. The standard allows implementations to use 5+ digit
years so long as the sender and receiver agree on the format.
The patch is at odds with tests for bugfix #62852, which asserts that
a 5 digit year should fail to parse. Reviewing the original bug
report, it seems to be more related to invalid timezone data. I
didn't dig into it very far.
As a bonus, the patch also ensures that ISO 8601 years -1 thru -999
are formatted with 4 digits (instead of 3 digits) prefixed by a minus
sign. I believe this is required by the standard.
Additionally it fixes an overflow error that causes bad year
formatting. I focused only on ISO 8601 for this patch, but the same
long long -> int
overflow exists in other date formats, e.g., RFC2822 dates.
I regenerated
ext/date/lib/parse_date.c
manually with$ re2c -d -b -o ...
. I'm not sure if that's correct.