Replace Binary with something more reliable for file storage #889
Comments
https://github.com/well-typed/binary-serialise-cbor https://www.youtube.com/watch?v=60gUaOuZZsE It's not released, but it might be an option. CBOR is essentially binary JSON: tags in the binary stream indicate what data comes next.
Awesome, thanks for the link. I remembered hearing @dcoutts discuss this at some point, but didn't know what came of it. Duncan: are there plans to release this in the near term? And there doesn't seem to be any
@snoyberg I made a small package, https://github.com/phadej/binary-tagged, during "the morning hack". It's missing a few essential instances, but could it be used as a possible solution to this? It's not yet on Hackage, but it could be as soon as I verify the API in our own dependent project.
It definitely looks like an improvement. My concerns are:
First question: it allows explicit versioning, but data is implicitly versioned by its structure: generic-derivable
Second question: if only top-level data is tagged, then the performance hit shouldn't be noticeable. I'll write some trivial benchmarks to verify that.
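To make the "only top-level data is tagged" idea concrete, here is a minimal sketch. It does not use the binary-tagged API. The helpers `typeFingerprint`, `encodeTagged`, and `decodeTagged` are hypothetical names invented for illustration, and the fingerprint is a simple FNV-1a-style hash of the `Typeable` type name, which is an assumption of this sketch. Unlike a structure derived via generics, a type name does not change when fields are added to a record, so this only catches mismatches between differently named types.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Binary (Binary, get, put)
import Data.Binary.Get (getWord64be, runGetOrFail)
import Data.Binary.Put (putWord64be, runPut)
import Data.Bits (xor)
import qualified Data.ByteString.Lazy as L
import Data.Char (ord)
import Data.List (foldl')
import Data.Proxy (Proxy (..))
import Data.Typeable (Typeable, typeRep)
import Data.Word (Word64)

-- Hypothetical helper: a fixed-size, non-cryptographic fingerprint of the
-- type's name (for example "[Int]"), computed with an FNV-1a-style fold.
typeFingerprint :: forall a. Typeable a => Proxy a -> Word64
typeFingerprint p = foldl' step 14695981039346656037 (show (typeRep p))
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- Write the fingerprint once, at the top level, then the ordinary payload.
encodeTagged :: forall a. (Binary a, Typeable a) => a -> L.ByteString
encodeTagged x = runPut $ do
  putWord64be (typeFingerprint (Proxy :: Proxy a))
  put x

-- Compare the stored fingerprint with the expected one before decoding
-- any of the payload.
decodeTagged :: forall a. (Binary a, Typeable a) => L.ByteString -> Either String a
decodeTagged bs =
  case runGetOrFail getter bs of
    Left (_, _, err) -> Left err
    Right (_, _, x) -> Right x
  where
    expected = typeFingerprint (Proxy :: Proxy a)
    getter = do
      tag <- getWord64be
      if tag /= expected
        then fail "structure tag mismatch"
        else get

main :: IO ()
main = do
  let bytes = encodeTagged [1, 2, 3 :: Int]
  print (decodeTagged bytes :: Either String [Int])
  -- Reading the same bytes at a different type fails on the tag check.
  print (decodeTagged bytes :: Either String Int)
```

The overhead here is a constant eight bytes and one comparison per file, regardless of payload size, which is why tagging only the top level should be cheap.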
Sounds good, I'm liking where this is headed :)
With https://github.com/phadej/binary-tagged/blob/wip/bench/Bench.hs (which is quite a nice usage example as well) the result is:
I'll try to finalise and release the package during this week. I'll ping this issue after that.
Very nice, thanks! On Mon, Sep 7, 2015 at 6:47 PM, Oleg Grenrus notifications@github.com
First release: http://hackage.haskell.org/package/binary-tagged / https://github.com/phadej/binary-tagged, with some (almost) real-world usage in https://github.com/futurice/haskell-flowdock-rest/blob/grep/flowdock-grep/src/Main.hs As the latter is under active development,
Cool. Is this ready to go? And by any chance, would you be interested in making the change to the stack codebase? It should be isolated to just a few specific places (anything calling Binary.encode* or Binary.decode*).
I can check them out, sure.
Thanks!
Binary instances are quite problematic for file storage, since they store no metadata about the schema itself. This has led multiple times to out-of-memory issues because an Int was parsed as a list size, for example. We have some hacks in the code base to essentially embed magic numbers to work around this, but there are certainly better ways we can solve this with improved libraries.

A non-solution: use JSON. We tried this already, and the performance hit was significant. We need a fast (de)serialization.
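The magic-number workaround mentioned above can be sketched roughly as follows. This is not the code used in stack. The constants `magic` and `schemaVersion` and the helpers `encodeWithHeader` and `decodeWithHeader` are hypothetical names chosen for illustration. The idea is to write a fixed header before the payload and to check it before any `Binary` instance gets a chance to interpret stray bytes as a list length.

```haskell
import Data.Binary (Binary, encode, get, put)
import Data.Binary.Get (getWord32be, runGetOrFail)
import Data.Binary.Put (putWord32be, runPut)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word32)

-- Hypothetical magic number and schema version, for illustration only.
magic :: Word32
magic = 0x53544B31

schemaVersion :: Word32
schemaVersion = 1

-- Prefix the payload with the magic number and the schema version.
encodeWithHeader :: Binary a => a -> L.ByteString
encodeWithHeader x = runPut $ do
  putWord32be magic
  putWord32be schemaVersion
  put x

-- Validate the header first, so a file with the wrong layout is rejected
-- before the payload decoder runs.
decodeWithHeader :: Binary a => L.ByteString -> Either String a
decodeWithHeader bs =
  case runGetOrFail getter bs of
    Left (_, _, err) -> Left err
    Right (_, _, x) -> Right x
  where
    getter = do
      m <- getWord32be
      v <- getWord32be
      if m /= magic
        then fail "bad magic number"
        else if v /= schemaVersion
          then fail "unsupported schema version"
          else get

main :: IO ()
main = do
  print (decodeWithHeader (encodeWithHeader [1, 2, 3 :: Int]) :: Either String [Int])
  -- Bytes written by plain `encode` carry no header and are rejected.
  print (decodeWithHeader (encode (42 :: Int)) :: Either String [Int])
```

A header like this only helps if the version is bumped by hand whenever the stored type changes, which is the main weakness of the approach.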