Unable to derive serializers #6
Hi! Thanks for your interest.
On the other side, the
I'd prefer not to use Marshal, as it breaks between OCaml compiler versions, which is a real problem for us. I completely understand not wanting to expose internals (especially as it breaks type encapsulation on the resultant type), but we'd then need to save a representation of the parametrized type so that the serializer would know how to reconstruct that type during deserialization. A to/from bytes or string implementation could do that, but would require significant care to ensure that both sides of the operation worked properly. Saving the implementation type works more reliably (technically), but requires care from the developer to know what the type of the original instance was and only compare with that type. I'm fine with the latter, but I'm only comparing one type.
Indeed, I actually forgot about #1.
The bloom filters do not contain a representation of the parametrized type. (De)serialization is by essence unsafe, so it's always the responsibility of the user/storage to ensure the data is used correctly when written somewhere else, regardless of the serialization strategy.
Seeds, unless I misread the PR.
Of course not - that is exactly the problem. Refinements could be made to that question in regards to the parametrized type's definition evolving or something, if you'd like, but this works for me for now. How do we approach that problem? As mentioned, I'm new to the OCaml world, but quite well versed in the OO world. My first thought is reflection or introspection, but that requires some level of runtime type-checking and type preservation (which - to my understanding - OCaml does away with during compilation). How would you approach this problem? I'm happy to explore and attempt an implementation, but don't want to waste anyone's time with efforts that won't be accepted upstream.
Of course. But many of the oft-cited problems with serialization and deserialization fall outside the responsibility (or even the ability) of a library such as this to encapsulate. Dealing with changed type definitions or mutated hash functions? Someone else has access to your storage drives and can muck with your data? There's only so much that a language can do here.
Indeed: OCaml values are not tagged with their type at runtime; the assumption being that type errors are for the type checker 🙂 As a result, it's not natural in OCaml for serialised values to contain type information either: it's the user's responsibility to ensure that type errors do not happen, and neither the OCaml runtime nor the
Having said that it's not natural in OCaml, it's still possible. We can imagine an interface where the codecs require the user to pass an explicit "type representation" (like a JVM klass pointer, but well-typed):
val to_bytes : 'a type_repr -> 'a bloomf -> bytes
val of_bytes : 'a type_repr -> bytes -> 'a bloomf
(* e.g. *)
let buffer : bytes = Bloomf.to_bytes Type.(list int) filter
which could then be included in the
In short: since dynamic typing can be built on top of static typing (but not vice-versa without a performance penalty), we tend to go with the flow of the OCaml language and not have runtime type representations. If this was a language that had already paid the performance penalty of dynamic types, it might be a different situation 🙂
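To make the "type representation" idea concrete, here is a minimal sketch of how such codecs could look. Everything in it is an assumption for illustration: the `type_repr` GADT, the `tag` function, the stand-in `bloomf` record with a phantom parameter, and the byte layout are hypothetical and are not the library's API. Unlike the signature above, `of_bytes` returns an option so that a tag mismatch can be reported.

```ocaml
(* Hypothetical type representations: a GADT indexed by the type it describes. *)
type _ type_repr =
  | Int : int type_repr
  | String : string type_repr
  | List : 'a type_repr -> 'a list type_repr

(* Render a representation as a textual tag stored next to the payload. *)
let rec tag : type a. a type_repr -> string = function
  | Int -> "int"
  | String -> "string"
  | List r -> "(" ^ tag r ^ ") list"

(* Stand-in for the real filter: raw bits with a phantom element type. *)
type 'a bloomf = { bits : bytes }

(* Assumed layout: 1 byte of tag length, the tag, then the raw bits.
   Tags longer than 255 characters are not handled in this sketch. *)
let to_bytes (repr : 'a type_repr) (f : 'a bloomf) : bytes =
  let t = tag repr in
  let buf = Buffer.create (1 + String.length t + Bytes.length f.bits) in
  Buffer.add_char buf (Char.chr (String.length t));
  Buffer.add_string buf t;
  Buffer.add_bytes buf f.bits;
  Buffer.to_bytes buf

(* Decoding succeeds only if the stored tag matches the expected one. *)
let of_bytes (repr : 'a type_repr) (b : bytes) : 'a bloomf option =
  let expected = tag repr in
  if Bytes.length b < 1 then None
  else
    let n = Char.code (Bytes.get b 0) in
    if Bytes.length b < 1 + n then None
    else if Bytes.sub_string b 1 n <> expected then None
    else Some { bits = Bytes.sub b (1 + n) (Bytes.length b - 1 - n) }
```

The check is only as trustworthy as the tag the user passes in, and it says nothing about whether the hash function or the type's definition has changed since the data was written.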
So I fail to see how this leaves us in any different a scenario than simply serializing and deserializing
That would be the way to go, my point was that this should be handled by internal functions (e.g.
No problem :)
I did not intend to imply that they wouldn't. Far from it, in fact, as I believe this would be a good answer! My initial solution is, admittedly, aimed at only my immediate use case (using I remain unconvinced that I yet understand enough to do this from scratch, but I have attempted to make that clear :P What other state need be serialized? I don't see anything other than
That should indeed be enough.
An int is probably not enough, but
That sounds ok to me! Maybe check that the Bitv serialisation does not use that specific character though, or put it at the end. Let me know if you have any issues implementing this so I can help, or do it if I get the time to :)
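On the separator concern: one way to sidestep it, sketched here under assumptions, is to drop the separator character and store the integer in a fixed-width header instead. The `state` record, its fields `k` and `payload`, and the `encode`/`decode` functions are hypothetical names for illustration; `payload` stands in for whatever the Bitv serialisation produces, and this is not necessarily the format that ended up in the library.

```ocaml
(* Hypothetical state to persist: an integer parameter plus an opaque
   payload standing in for the serialised bit vector. *)
type state = { k : int; payload : string }

(* The integer occupies a fixed 8-byte little-endian header. *)
let header_size = 8

let encode (s : state) : bytes =
  let b = Bytes.create (header_size + String.length s.payload) in
  Bytes.set_int64_le b 0 (Int64.of_int s.k);
  Bytes.blit_string s.payload 0 b header_size (String.length s.payload);
  b

(* Everything after the header is payload, so the payload may contain
   any byte, including one that would otherwise serve as a separator. *)
let decode (b : bytes) : state option =
  if Bytes.length b < header_size then None
  else
    let k = Int64.to_int (Bytes.get_int64_le b 0) in
    let payload =
      Bytes.sub_string b header_size (Bytes.length b - header_size)
    in
    Some { k; payload }
```

Because the header has a known width, the decoder never has to search the payload for a delimiter.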
Seems I misread
I've got what appears to be a working (though maybe not idiomatic) implementation. I'll open a PR for discussion today. I'll have to familiarize myself with alcotest to get some tests working before I feel comfortable with it getting merged though.
Added in d581f77, thanks!
I'm currently working on a project where I would like to serialize and de-serialize instances of Bloomf.t to and from disk.
Because the main type is abstract over an input type and the actual implementation is not exposed, I don't have access from external code to implement {de,}serializers.
One very simple solution to this would be to simply expose priv.
Shortly, I'll be opening a PR that does just this (though it also renames priv to impl since it is no longer private).
This is not 100% ideal, as it leaves it up to the consumer to reconstruct the abstracted type from context, but I'm not yet skilled enough to come up with a better solution in a vacuum.
Do you all have thoughts on how one might implement fully type-safe serialization support here?