Skip to content

Conversation

adsr
Copy link
Contributor

@adsr adsr commented Aug 5, 2017

This patch adds support for parsing (unserializing, constructing)
ISO 8601 date strings with 5+ digit years, consistent with our
already existing support for serializing and formatting these
types of dates. The standard allows implementations to use 5+ digit
years so long as the sender and receiver agree on the format.

The patch is at odds with tests for bugfix #62852, which asserts that
a 5 digit year should fail to parse. Reviewing the original bug
report, it seems to be more related to invalid timezone data. I
didn't dig into it very far.

As a bonus, the patch also ensures that ISO 8601 years -1 thru -999
are formatted with 4 digits (instead of 3 digits) prefixed by a minus
sign. I believe this is required by the standard.

Additionally it fixes an overflow error that causes bad year
formatting. I focused only on ISO 8601 for this patch, but the same
long long -> int overflow exists in other date formats, e.g., RFC
2822 dates.

I regenerated ext/date/lib/parse_date.c manually with
$ re2c -d -b -o .... I'm not sure if that's correct.

This patch adds support for parsing (unserializing, constructing)
ISO 8601 date strings with 5+ digit years, consistent with our
already existing support for serializing and formatting these
types of dates. The standard allows implementations to use 5+ digit
years so long as the sender and receiver agree on the format.

The patch is at odds with tests for bugfix #62852, which asserts that
a 5 digit year should fail to parse. Reviewing the original bug
report, it seems to be more related to invalid timezone data. I
didn't dig into it very far.

As a bonus, the patch also ensures that ISO 8601 years -1 thru -999
are formatted with 4 digits (instead of 3 digits) prefixed by a minus
sign. I believe this is required by the standard.

Additionally it fixes an overflow error that causes bad year
formatting. I focused only on ISO 8601 for this patch, but the same
`long long -> int` overflow exists in other date formats, e.g., RFC
2822 dates.

I regenerated `ext/date/lib/parse_date.c` manually with
`$ re2c -d -b -o ...`. I'm not sure if that's correct.
Copy link
Member

@derickr derickr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to timelib need to go here: https://github.com/derickr/timelib

break;
case 'c': {
const char *fmt = t->y < 0 && t->y > -1000
/* Ensure 4 digits for years -1 thru -999 */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a BC break.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific rubric for distinguishing between bugfix and bc break? I view this more as a bugfix than a bc break as the docs advertise ISO 8601 support. Ditto for the overflow fix. Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also against the ISO standard, which says a minimum of four digits optionally prefixed by a - (or +): https://en.wikipedia.org/wiki/ISO_8601#Years

@krakjoe
Copy link
Member

krakjoe commented Aug 7, 2017

@derickr should this be closed ? Is it the kind of BC break you will never be happy with, or can we have it in 7.3 ?

@krakjoe krakjoe added the Bug label Aug 7, 2017
@derickr
Copy link
Member

derickr commented Aug 7, 2017

The BC break is in the formatting, and the datex rules need to go to timelib upstream, so yes, this PR should be closed (for now). I'll have a look at whether the datex rules breaks any tests on any other integration that uses timelib.

date.timezone=GMT
--FILE--
<?php
$s2 = 'O:3:"Foo":3:{s:4:"date";s:20:"10007-06-07 03:51:49";s:13:"timezone_type";i:3;s:8:"timezone";s:3:"UTC";}';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test should not be removed, but instead, a different wrong date string should be used that is unparsable.

@krakjoe krakjoe closed this Aug 7, 2017
@adsr
Copy link
Contributor Author

adsr commented Aug 7, 2017

@derickr Let me know if you'd like me to assist / file a pr on timelib. Thanks.

@derickr
Copy link
Member

derickr commented Aug 7, 2017

@adsr Sure - just make sure to add tests and that all existing tests pass.

@adsr
Copy link
Contributor Author

adsr commented Jul 11, 2018

We can fix this bug in 7.3 if derickr/timelib#16 makes it into timelib 2018.01. Can we re-open? @krakjoe @derickr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants