Join GitHub today
[bugfix] Fix #3500: ISO 8601 parsing should match the full string, not the beginning of the string. #3502
Before the fix, date strings which started with a valid ISO pattern but ended with non-conforming data would pass as valid ISO dates and not trigger the deprecation warning, causing garbage-in/garbage-out bugs and silent failure.
Example of a garbage date which was considered Valid ISO before:
Example of silent failure cases:
Non of the above cases trigger the "non-ISO deprecation warning", either.
Because of the nature of how the related tests are put together, the newly added test passes even before the fix is applied.
This happens because the "non-ISO" tests are tested against the same ISO regex which is being fixed.
Because the broken regex is also used in the test, it allowed the non ISO date to be validated before the fix.
This fix exposed another problem with non-ISO dates in the test suite, so it causes 3 other test failures, which are now fixed, too. The other test failures are in:
All concern dates which look like this:
From what I can tell, it is not valid IS0 8601 to include a timezone offset without a time. The references I checked included:
These tests were updated to avoid this invalid date format after @icambron agreed the dates were not ISO format but should be.
referenced this pull request
Oct 13, 2016
@icambron: Thanks for the response.
I read through the related thread and saw some other people also ran into problems with junk at the end of ISO-valid strings-- Like my case where December 4th became December 1st, someone else had an unnerving case where 11 PM became 11 AM.
I see no need to add a "strict" flag as part of fixing this. I see this fix as simply making the ISO parser work as advertised-- more correctly! I don't see the current behavior has being "not strict", it's not like "not behaving advertised by silently discarding at the end of valid ISO patterns".
I pushed another commit to fix the bogus date strings in the test suite that the original ISO fix turned up.
This pull request will put a stop the unknown amount of silent failures where invalid date strings are translated into other ISO dates or junk is simply allowed through as a "ISO valid" when it is not.
Non-strict really just means "wrong but in ways that are hopefully helpful". I agree that the current behavior is an especially unhelpful flavor of non-strictness. But AFAIK, it's the only bit of slack in the parser, so fixing has the effect of making it strict. My point in the other thread was that I don't think need a global non-strict parsing flag to make ISO in particular strict. I think you agree.
About this change not being backwards compatible -- we've done those in the past with varying success. So we can try to ship in in 2.16 and if there is massive backlash cut out 2.16.1 ... not really something I can predict now, but in general, given how many mistakes have happened even in our code (tests) its hard to be against this PR.
I'll merge it in next release (2.16)