UTCTime serialisation slow #51
Comments
The actual format is part of the standard: https://tools.ietf.org/html/rfc7049#section-2.4.1 so there's probably no wiggle room in changing that to the nicer |
You could |
It'd be interesting to see where you got your

    instance Binary UTCTime where
        put (UTCTime a b) = put a >> put b
        get = UTCTime <$> get <*> get

    instance Binary Day where
        put (ModifiedJulianDay d) = put d
        get = ModifiedJulianDay <$> get

    instance Binary DiffTime where
        put = put . fromEnum
        get = toEnum <$> get

... or whatever? I can definitely believe that the time formatting functions are simply slow no matter what, since they have to roundtrip through Second, if you can file a synthetic benchmark or something with our If this is an insurmountable problem, it would probably be OK to provide another function like That said, providing the canonical instance is, IMO correct (we don't want orphans for this). It sucks that it's so slow, though. |
@thoughtpolice - yea, that's pretty close: https://hackage.haskell.org/package/binary-orphans-0.1.4.0/docs/src/Data.Binary.Orphans.html I'm trying to add a
Removed that benchmark for now while testing. See #52. I'm totally unclear on whether that benchmark actually does what I want (with |
Oh yes, and Generic might be especially bad due to some GHC bugs. I'll test for a bit. |
Sigh. It's unfortunate that it's so slow. So we have two choices:
I suspect this is a problem for our application as well, which serializes large datasets full of Alternatively, I notice that the CBOR standard permits POSIX-style timestamps as well (tag 1). Though those don't seem to be supported by the |
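
For reference, CBOR tag 1 (RFC 7049, section 2.4.1) carries an epoch-based time as an integer or floating-point number of seconds since 1970-01-01T00:00Z. A minimal sketch of the conversion on the Haskell side, using only the `time` package, might look like the following. The function names `utcTimeToEpochSeconds` and `epochSecondsToUTCTime` are made up for illustration and are not part of this package's API; the sketch only shows the numeric conversion, not the CBOR encoding itself.

```haskell
import Data.Time.Clock (UTCTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime, utcTimeToPOSIXSeconds)

-- | Hypothetical helper: seconds since the POSIX epoch as a Double.
-- Note that a Double cannot hold full picosecond precision for
-- present-day timestamps, so this conversion is lossy.
utcTimeToEpochSeconds :: UTCTime -> Double
utcTimeToEpochSeconds = realToFrac . utcTimeToPOSIXSeconds

-- | Hypothetical inverse: rebuild a UTCTime from epoch seconds.
epochSecondsToUTCTime :: Double -> UTCTime
epochSecondsToUTCTime = posixSecondsToUTCTime . realToFrac
```

This avoids the text formatting and parsing entirely, at the cost of precision when a floating-point payload is used.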
Looks like |
Right, so Aeson uses ISO-8601 for encoding |
We are going to move to a new, more reasonable encoding for the One possible encoding is two integers: days since the 0 epoch and picoseconds. |
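
As a rough sketch of the two-integer idea, the following splits a `UTCTime` into a day number and a picosecond count and reassembles it. It assumes the day number is the Modified Julian Day that the `time` package already stores in `Day`, which may not be the epoch meant above, and the names `utcTimeToPair` and `pairToUTCTime` are hypothetical. It is not the encoding that was eventually committed.

```haskell
import Data.Time.Calendar (Day (..))
import Data.Time.Clock (UTCTime (..), diffTimeToPicoseconds, picosecondsToDiffTime)

-- | Hypothetical helper: split a UTCTime into
-- (day number, picoseconds since midnight).
-- Assumption: the day number is the Modified Julian Day.
utcTimeToPair :: UTCTime -> (Integer, Integer)
utcTimeToPair (UTCTime day dt) =
  (toModifiedJulianDay day, diffTimeToPicoseconds dt)

-- | Hypothetical inverse of 'utcTimeToPair'.
pairToUTCTime :: (Integer, Integer) -> UTCTime
pairToUTCTime (d, ps) =
  UTCTime (ModifiedJulianDay d) (picosecondsToDiffTime ps)
```

Both components are exact integers, so the round trip loses no precision and involves no string formatting.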
It looks like this will go into 0.2 regardless at this rate, so I'm queuing this for the initial release.
I have submitted an IANA request for the new tags, but it will take some time for the process to take its course. I don't think we should hold the release for this.
Fixed by d093bb6.
Errr, actually fixed by cad4c72! (Ben accidentally force-pushed over the last commit.)
For testing I replaced my `Binary` instances with `Serialise` from this package and serialization performance was 10-20x slower. I narrowed it down to UTCTime. Replacing that with a dummy empty serialiser speeds up my code from 3 seconds to 150 milliseconds.

Instance in question:
https://github.com/well-typed/binary-serialise-cbor/blob/9528877a4d85642be787c05efee39d2b3e0e078e/Data/Binary/Serialise/CBOR/Class.hs#L402