Allow overriding the type unpacked from MSGPACK_OBJECT_RAW #6

bpot opened this Issue Sep 10, 2010 · 12 comments


None yet

8 participants


Having String as the unpacking type for MSGPACK_OBJECT_RAW seems to make it impossible to correctly pack and unpack binary data. Probably because of the coercion that v8 does to turn binary string into utf8/ascii. Switching it to return a Buffer object ( fixed the issue.

If you run this script on a binary file you can see how it fails:


Yeah, this behavior kind of sucks. Unfortunately, fixing this is a thornier problem than it might initially seem.

The root of the problem is that the MsgPack serialization format does not differentiate between strings and blobs of raw bytes (all are serialized as MSGPACK_OBJECT_RAW), so we have to pick one format to de-serialize to. Strings are far more common than Buffer objects, IMO. Particularly when inter-operating with other MsgPack-RPC clients.

Anyway, I don't think I'm willing to change the default unpacking type as you suggest, but I can provide a way to override this if needed. Does that work for you?


Yeah, I do see how it's kind of a difficult issue. I think being able to override the default unpacking type makes a lot of sense.


Why not just put the UTF-8 encoded byte order marker when you are passing a string in... then when decoding, you look for that marker and decode those raw entries with the BOM to String, otherwise Buffer.


While Strings may be more common than Buffers, if you unpack a String as a Buffer you can easily make String from it, but you cannot recreate the correct binary data if it's been unpacked to a String.

Therefore, there really really should be at least the option to decode MSGPACK_OBJ_RAW as a Buffer. Otherwise it becomes 100% impossible to use this library to decode raw binary data correctly.


Either like 64c567a or 1c7fe3f.
The latter is my preferred choice.


Personally I think strings should be the default since msgpack is designed as a faster replacement of JSON. However, going with the latter and allowing the developer to choose to work with buffers instead I think is fine.


This has been fixed in the new v5 spec, please see here:


I have also filed a separate issue: #49


I'm working on another project for a few weeks, but might be able to hack in support for this after that. pull requests from the impatient and motivated are welcome.


Sorry if this seems silly, but would it be thinkable to just have a different type for buffers, for instance MSGPACK_OBJECT_BUFFER? This way you could serialize / deserialize it properly...


This way you could serialize / deserialize it properly...

Not if you're trying to communicate with a different implementation.


The msgpack specification now supports the differentiation between strings and buffers. It's been implemented in (probably to avoid breaking libraries using this module).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment