Make accepted UTCTime formats more lenient #197

Closed
nurpax opened this issue Apr 28, 2014 · 13 comments

Comments

@nurpax
Contributor

nurpax commented Apr 28, 2014

I noticed that Aeson only accepts UTCTime values with the following syntax: %FT%T%QZ

Unfortunately, it's quite common to see variations of this syntax where the middle 'T' and the trailing 'Z' have been dropped. I think many Aeson users would appreciate it if the UTCTime parser were a bit more lenient here; these parse failures can be hard to track down in a Haskell program when all you get is a Nothing from decode.

Here are a couple of examples of where timestamps in the troublesome format could originate:

(.hs code & cabal that reproduce the parse error here: https://gist.github.com/nurpax/11369860)

Python's default __str__() conversion for datetime:

>>> from datetime import datetime
>>> print datetime.now()
2014-04-28 13:13:13.112233

Similarly, SQLite yields:

sqlite> select datetime('now');
2014-04-28 12:35:12

Data.Time's read for UTCTime also accepts both of these formats.
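
For reference, here's a minimal sketch of the failure mode along the lines of the linked gist (not the gist itself; the array wrapper is only there to give decode a JSON document to work on):

{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (decode)
import Data.Time (UTCTime)

main :: IO ()
main = do
  -- SQLite/Python style: no 'T' separator, no 'Z' -> Nothing
  print (decode "[\"2014-04-28 12:35:12\"]"  :: Maybe [UTCTime])
  -- Strict %FT%T%QZ form -> Just [2014-04-28 12:35:12 UTC]
  print (decode "[\"2014-04-28T12:35:12Z\"]" :: Maybe [UTCTime])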

I have an attoparsec-based parser adapted from postgresql-simple that's pretty well tested. I'd be happy to send a PR for relaxing the UTCTime parser rules if you agree with the general idea of allowing some leeway in timestamp syntax.

I am aware of #109. Also, I do not propose any change to the UTCTime encoding format.

@nurpax
Contributor Author

nurpax commented Apr 28, 2014

It's of course possible to work around this in client code with something like this:

import Control.Applicative ((<$>))
import Data.Aeson
import Data.Time.Clock (UTCTime)
import Data.Time.Format (parseTime)
import System.Locale (defaultTimeLocale)  -- from old-locale (time < 1.5)

data Foo = Foo UTCTime

-- Try the lenient "%F %T" format first, then fall back to the
-- strict "%FT%T%QZ" format that aeson itself uses.
kludgeParseTime :: String -> UTCTime
kludgeParseTime t =
  case parseTime defaultTimeLocale "%F %T" t of
    Just d  -> d
    Nothing ->
      case parseTime defaultTimeLocale "%FT%T%QZ" t of
        Just d  -> d
        Nothing -> error "could not parse date"

instance FromJSON Foo where
  parseJSON (Object v) = Foo <$> (kludgeParseTime <$> v .: "timestamp")
  parseJSON _          = fail "expected an object"

@cmsaperstein

+1. This would be really convenient!

nurpax added a commit to nurpax/aeson that referenced this issue May 8, 2014
It's quite common to encounter timestamp strings of the format:

2014-04-28 13:13:13.112233
2014-04-28 13:13:13

(I.e., without the 'T' in the middle and without the trailing 'Z')

An example source of these strings is SQLite:

sqlite> select datetime('now');
2014-04-28 12:35:12

Rather than fail to parse these strings, allow them in JSON -> UTCTime
conversion.

See haskell#197 for more examples and a test case that reproduces the failure.
@ghost

ghost commented May 25, 2014

At the very least, all ISO 8601 formats should be accepted, for instance a time zone indicator without seconds or fractions of a second.

@hvr
Member

hvr commented May 25, 2014

Unfortunately, it's quite common to see some variations of this syntax where the middle 'T' and the trailing 'Z' have been dropped.

Personally, I'm strongly -1 on dropping the Z part, as that's been a major source of annoying timezone bugs. If the data source left off the Z, it was very often a sign that someone didn't pay attention to timezone handling and passed a local time (lacking any explicit TZ info) where a UTC time was expected.

@nurpax
Contributor Author

nurpax commented May 25, 2014

@hvr has a point. For example, my Python example above uses local time, not UTC. Perhaps it'd be better not to support this by default but to add a passage to the docs on how to deal with non-standard UTC strings.

@rbtcollins

Launchpad's API outputs timestamps like this: 2014-04-03T15:58:03.572018+00:00, i.e., UTC but with an explicit offset rather than Z. These are still UTC times, just with an explicit zero offset, so they should be safe to handle.

@rbtcollins

FWIW the format I needed is handled by the ZonedTime type (Data.Time.ZonedTime rather than Clock.UTCTime in my data structure). Perhaps this is now just a docs issue?
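
For illustration, a small hypothetical snippet (not from this thread): the Stamp wrapper and the "timestamp" key are made up, but a ZonedTime field picks up the explicit offset directly via aeson's FromJSON ZonedTime instance:

{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<$>))
import Data.Aeson
import Data.Time (ZonedTime)

newtype Stamp = Stamp ZonedTime deriving Show

instance FromJSON Stamp where
  parseJSON (Object v) = Stamp <$> v .: "timestamp"
  parseJSON _          = fail "expected an object"

main :: IO ()
main = print (decode "{\"timestamp\":\"2014-04-03T15:58:03.572018+00:00\"}" :: Maybe Stamp)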

@bos
Collaborator

bos commented Jul 11, 2015

@nurpax I'd be quite interested in that pull request, especially if it's faster than the current horrible ZonedTime mess.

@nurpax
Contributor Author

nurpax commented Jul 12, 2015

@bos Sure. I put together a UTCTime parser with attoparsec. It beats the performance of the existing UTCTime parser by 6x (30 usec vs 5 usec per datetime string).

The new UTCTime parser still accepts many inputs that the earlier parser didn't. I'll clean that up so that the UTCTime parser only accepts what it used to (parseTime defaultTimeLocale "%FT%T%QZ").

If the tests for ZonedTime are comprehensive, I guess it's possible to turn that into an attoparsec-based parser too. It looks like a lot more work though. I'll put together a PR for UTCTime first and maybe follow up with ZonedTime later.
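
For a sense of what that looks like, here's a rough sketch of an attoparsec-based parser for the strict shape (this is not the PR's code; it assumes Data.Attoparsec.Text and only handles a trailing 'Z', whereas the real parser also needs to deal with numeric offsets):

import Control.Applicative ((<|>), (<*))
import Data.Attoparsec.Text
import Data.Time (UTCTime(..), fromGregorian)

utcTimeP :: Parser UTCTime
utcTimeP = do
  y  <- decimal <* char '-'
  mo <- decimal <* char '-'
  d  <- decimal
  _  <- char 'T' <|> char ' '   -- accept either date/time separator
  h  <- decimal <* char ':'
  mi <- decimal <* char ':'
  s  <- rational                -- seconds, possibly fractional
  _  <- char 'Z'                -- insist on an explicit UTC marker
  let day  = fromGregorian y mo d
      secs = fromIntegral (h * 3600 + mi * 60 :: Int) + realToFrac (s :: Double)
  return (UTCTime day secs)

-- e.g. parseOnly utcTimeP "2014-04-28T13:13:13.112233Z"

Presumably most of the speedup comes from parsing Text directly instead of going through format strings and String.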

@hvr
Member

hvr commented Jul 12, 2015

@nurpax I think it's OK if it accepts more inputs than the current parser does (which ought to be clearly documented), as long as it doesn't accept inputs lacking any timezone indicator.

nurpax added a commit to nurpax/aeson that referenced this issue Jul 12, 2015
Replace the old 'parseTime'-based UTCTime parser with an attoparsec-based
parser.  This is roughly 5x faster than the old one.

It also allows decoding some timestamp formats that the old version
rejected.  The new version still requires all timestamps to specify a
timezone when decoding to a UTCTime type.

Also add a few unit tests both for what should be accepted and for
what should be rejected.

A lot of this code was bummed from the postgresql-simple library -
thus credits to @lpsmith too!
@nurpax
Contributor Author

nurpax commented Jul 12, 2015

@hvr, @bos, I think the referenced PR captures the requirements expressed in this thread. I didn't touch ZonedTime because the new UTCTime parser should accept the commonly occurring ISO 8601 formats while still always requiring a timezone qualifier to be present. It departs from the original in that previously only UTC (i.e., Zulu timezone) timestamps were accepted when decoding to the UTCTime type. Now timestamps that specify a timezone offset are implicitly converted to UTCTime by adjusting for that offset.
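
To illustrate that last point with a hypothetical example: with the relaxed parser, a timestamp carrying a +02:00 offset should decode to the corresponding UTC instant rather than being rejected.

{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson (decode)
import Data.Time (UTCTime)

main :: IO ()
main =
  -- Expected: Just [2015-07-12 10:00:00 UTC], i.e. the +02:00 offset applied
  print (decode "[\"2015-07-12T12:00:00+02:00\"]" :: Maybe [UTCTime])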

@bos
Collaborator

bos commented Jul 13, 2015

@nurpax Thanks for putting in the work to improve this. I'll review the pull request as soon as I can.

bos pushed a commit that referenced this issue Jul 18, 2015
@bos
Collaborator

bos commented Jul 21, 2015

I believe this should now be all fixed up.
