utf8 problems with msgpack? #15
Comments
I have a similar problem: to_msgpack() -> redis -> MessagePack.unpack(data) leads to UTF-8 errors. It seems to happen with hashes, and this is what the unpacked data looks like: (ruby 1.9.3 preview 1)
Did either of you have any luck figuring this out? We're having a similar issue.
Nope, never got to the bottom of it.
I ran into the same issue with to_msgpack -> redis -> MessagePack.unpack. I tracked it down to a single UTF character. Forcing ASCII-8BIT encoding before deserialization seems to fix the problem.
Experienced the same problem. The "force_encoding" solution described by @sgtFloyd fixed it!
The redis-rb gem forces the Redis response encoding to Encoding::default_external in Redis::Connection::CommandHelper. This is logical: the string comes from an external I/O stream, so it gets the default external encoding, which in most setups is UTF-8. The normal case of setting and getting UTF-8 encoded strings in Redis works as expected. But MessagePack is a binary serialization format and expects to unpack from a raw binary string, so you need to force the string you get from redis-rb into binary (or ASCII-8BIT, as @sgtFloyd suggested above) with force_encoding before unpacking it.
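A minimal sketch of that retagging, using only the Ruby standard library. The byte string is a hand-written MessagePack encoding of {"k" => "é"}, and the UTF-8 tagging step only simulates what redis-rb is described as doing; neither comes from the original reports, and the variable names are made up for illustration. The MessagePack.unpack call is left in a comment so the snippet runs without the msgpack gem.

```ruby
# Hand-written MessagePack bytes for {"k" => "é"} (illustrative stand-in
# for whatever to_msgpack produced in the real application).
packed = "\x81\xA1k\xA2\xC3\xA9".b

# Simulate a reply that has been tagged with a UTF-8 default external
# encoding. The bytes are unchanged; only the encoding label differs.
from_redis = packed.dup.force_encoding(Encoding::UTF_8)
puts from_redis.valid_encoding?   # => false, 0x81 cannot start a UTF-8 sequence

# Retag the same bytes as binary (ASCII-8BIT) before deserializing.
raw = from_redis.dup.force_encoding(Encoding::BINARY)
puts raw.valid_encoding?          # => true
puts raw.bytes == packed.bytes    # => true, no byte was altered

# With the msgpack gem installed, the unpack step would then be:
#   MessagePack.unpack(raw)
```

Note that force_encoding only changes the label on the string. It does not transcode, which is why it is safe to apply to serialized binary data.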
I think the MessagePack.unpack method itself should perform this force_encoding in a future version, but for now we have to do it ourselves.
As each language implementation was separated, please open another issue at each repository if this is still problematic. Thank you.
A bit of a shot in the dark, but has anyone come across problems with utf8 + msgpack? I'm using the Ruby bindings. I logged ~500 GB of data in zmpac format (stream + zlib), in ~200 MB chunks (~1 GB uncompressed). I'm now trying to read the data back and running into parse errors on random files.
Haven't had much luck tracking down the culprit so far. If I sysread the file 1024 bytes at a time and parse out the messages, then dump the buffer once the error is thrown, I see Chinese characters, etc.
Same behavior under 1.8 and under 1.9. Any suggestions for how to recover this data, and/or any other tips?
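For anyone trying to reproduce the chunked read described above, here is a rough sketch that keeps every buffer binary from disk to parser. It assumes the files are plain zlib streams, which the post does not confirm; the helper name each_inflated_chunk and the demo payload are hypothetical. It is not a confirmed fix for the parse errors, only a way to rule out text-mode reads and encoding retagging as the cause.

```ruby
require "zlib"
require "tempfile"

# Hypothetical helper: inflate a zlib-compressed file chunk by chunk,
# yielding each piece of decompressed data as a binary string.
def each_inflated_chunk(path, chunk_size = 1024)
  inflater = Zlib::Inflate.new
  File.open(path, "rb") do |io|             # "rb": binary mode, no transcoding
    while (chunk = io.read(chunk_size))
      data = inflater.inflate(chunk)
      yield data unless data.empty?
    end
  end
  rest = inflater.finish                    # flush anything still buffered
  yield rest unless rest.empty?
ensure
  inflater.close if inflater
end

# Round-trip demo with stand-in bytes instead of real logged data.
payload = "\x81\xA1k\xA2\xC3\xA9".b * 1000
file = Tempfile.new("zmpac-demo")
file.binmode
file.write(Zlib::Deflate.deflate(payload))
file.close

restored = "".b
each_inflated_chunk(file.path) { |data| restored << data }
puts restored == payload                    # => true
```

In a real recovery script, each yielded chunk would be handed to a streaming unpacker instead of being appended to a string.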