-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make accepted UTCTime formats more lenient #197
Comments
It's of course possible to work-around this in client-code, like say with something like this:
|
+1. This would be really convenient! |
It's quite common to encounter timestamp strings of format: 2014-04-28 13:13:13.112233 2014-04-28 13:13:13 (I.e., without 'T' in the middle and without the 'Z' postfix) An example source of these strings is SQLite: sqlite> select datetime('now'); 2014-04-28 12:35:12 Rather than fail to parse these strings, allow them in JSON -> UTCTime conversion. See haskell#197 for more examples & test case that reproduces the failure.
At the very least, all 8601 formats should be accepted, for instance having a time zone indicator without including seconds or fractions of a second. |
Personally, I'm strongly -1 on dropping the |
@hvr has a point. For example, my Python example above uses local time, not UTC. Perhaps it'd be better not support by this by default but add a passage to docs on how to deal with non-standard UTC strings. |
Launchpads API outputs timestamps like so: 2014-04-03T15:58:03.572018+00:00 - so UTC but explicit rather than Z :/. But these are actually UTC times - just an explicit 0 offset. That should be safely handleable. |
FWIW the format I needed is handled by the ZonedTime class (Data.Time.ZonedTime rather than Clock.UTCTime in my data structure). Perhaps this is now just a docs issue? |
@nurpax I'd be quite interested in that pull request, especially if it's faster than the current horrible |
@bos Sure. I put together a UTCTime parser with attoparsec. It beats the performance of the existing UTCTime parser by 6x (30 usec vs 5 usec per a datetime string). The UTCTime parser still accepts many inputs that the earlier parser didn't. I'll clean that up so that UTCTime parser only accepts what it used to ( If the tests for ZonedTime are comprehensive, I guess it's possible to turn that into an attoparsec based parser too. Looks like a lot more work though. I'll put together a PR for UTCTime first and maybe follow up with ZonedTime later. |
@nurpax I think it's ok if it accepts more inputs (which ought to be clearly documented) than the current parser does as long as it doesn't accept inputs lacking any timezone indicator |
Replace the old 'parseTime' based UTCTime parser with an attoparsec based parser. This is roughly 5x faster than the old one. It also allows decoding some timestamp formats that the old version rejected. The new version still requires all timestamps to specify a timezone when decoding to a UTCTime type. Also add a few unit tests for both what should be accepted and also for things that should be rejected. A lot of this code was bummed from the postgresql-simple library - thus credits to @lpsmith too!
@hvr, @bos, I think the referenced PR captures the requirements expressed in this bug thread. I didn't touch ZonedTime because the new UTCTime parser should accepts the commonly occurring ISO-8601 formats, while still always requiring a timezone qualifier to be present. It departs from the original in that earlier only UTC (e.g., Zulu timezone) timestamps were accepted when decoding to UTCTime type. Now timestamps that specify a timezone can be implicitly converted to UTCTime by appropriately adjusting for the timezone offset. |
@nurpax Thanks for putting in the work to improve this. I'll review the pull request as soon as I can. |
Replace the old 'parseTime' based UTCTime parser with an attoparsec based parser. This is roughly 5x faster than the old one. It also allows decoding some timestamp formats that the old version rejected. The new version still requires all timestamps to specify a timezone when decoding to a UTCTime type. Also add a few unit tests for both what should be accepted and also for things that should be rejected. A lot of this code was bummed from the postgresql-simple library - thus credits to @lpsmith too!
I believe this should now be all fixed up. |
I noticed that Aeson only accepts UTCTime's with the following syntax:
%FT%T%QZ
Unfortunately, it's quite common to see some variations of this syntax where the middle 'T' and the trailing 'Z' have been dropped. I think many Aeson users would appreciate if the UTCTime parser would be a bit more lenient on these; these parse failures can be hard to track down in a Haskell program if all you get is a
Nothing
fromdecode
.Here's a couple of examples of where timestamps with the troublesome format could originate from:
(.hs code & cabal that reproduce the parse error here: https://gist.github.com/nurpax/11369860)
Python's default
__str__()
conversion for datetime:Similarly, SQLite yields:
Data.Time's
read
for UTCTime also accepts both of these formats.I have an attoparsec-based parser adapted from postgresql-simple that's pretty well tested. I'd be happy to send a PR for relaxing the UTCTime parser rules if you agree with the general idea of allowing some leeway in timestamp syntax.
I am aware of #109. Also, I do not propose any change to UTCTime encode format.
The text was updated successfully, but these errors were encountered: