Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Allow overriding the type unpacked from MSGPACK_OBJECT_RAW #6

Open
bpot opened this Issue · 12 comments

8 participants

Bob Potter Peter Griess Michael J. Ryan James Howe Jacob Wright Chakrit Wichian Christopher Mooney Jean-Michel Hiver
Bob Potter

Having String as the unpacking type for MSGPACK_OBJECT_RAW seems to make it impossible to correctly pack and unpack binary data. Probably because of the coercion that v8 does to turn binary string into utf8/ascii. Switching it to return a Buffer object (http://gist.github.com/573317) fixed the issue.

If you run this script on a binary file you can see how it fails: http://gist.github.com/573312

Peter Griess
Owner

Yeah, this behavior kind of sucks. Unfortunately, fixing this is a thornier problem than it might initially seem.

The root of the problem is that the MsgPack serialization format does not differentiate between strings and blobs of raw bytes (all are serialized as MSGPACK_OBJECT_RAW), so we have to pick one format to de-serialize to. Strings are far more common than Buffer objects, IMO. Particularly when inter-operating with other MsgPack-RPC clients.

Anyway, I don't think I'm willing to change the default unpacking type as you suggest, but I can provide a way to override this if needed. Does that work for you?

Bob Potter

Yeah, I do see how it's kind of a difficult issue. I think being able to override the default unpacking type makes a lot of sense.

Michael J. Ryan

Why not just put the UTF-8 encoded byte order marker when you are passing a string in... then when decoding, you look for that marker and decode those raw entries with the BOM to String, otherwise Buffer.

James Howe

While Strings may be more common than Buffers, if you unpack a String as a Buffer you can easily make String from it, but you cannot recreate the correct binary data if it's been unpacked to a String.

Therefore, there really really should be at least the option to decode MSGPACK_OBJ_RAW as a Buffer. Otherwise it becomes 100% impossible to use this library to decode raw binary data correctly.

James Howe

Either like 64c567a or 1c7fe3f.
The latter is my preferred choice.

Jacob Wright

Personally I think strings should be the default since msgpack is designed as a faster replacement of JSON. However, going with the latter and allowing the developer to choose to work with buffers instead I think is fine.

Chakrit Wichian

This has been fixed in the new v5 spec, please see here: https://gist.github.com/frsyuki/5432559

Chakrit Wichian

I have also filed a separate issue: #49

Christopher Mooney
Collaborator

I'm working on another project for a few weeks, but might be able to hack in support for this after that. pull requests from the impatient and motivated are welcome.

Jean-Michel Hiver

Sorry if this seems silly, but would it be thinkable to just have a different type for buffers, for instance MSGPACK_OBJECT_BUFFER? This way you could serialize / deserialize it properly...

James Howe

This way you could serialize / deserialize it properly...

Not if you're trying to communicate with a different implementation.

Jacob Wright

The msgpack specification now supports the differentiation between strings and buffers. It's been implemented in https://npmjs.org/package/msgpack-js-v5 (probably to avoid breaking libraries using this module).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.