[bug] C++ serialized message and Java, JS, Swift serialized messages are different #3570

Closed
Warchant opened this issue Aug 26, 2017 · 3 comments

Comments

@Warchant

Warchant commented Aug 26, 2017

Description

I created a simple message (Msg, below) in several languages (C++, Java, JavaScript, Swift), serialized each one, and expected to get the same blob.

Actual result is:
C++:
8FFFFFFB9A10FFFFFFBAA1A106C6F6C206B656B20636865627572656B
Java, JavaScript, Swift:
08B90A10BA0A1A106C6F6C206B656B20636865627572656B

Why is it so? Is it a bug?

$ uname -a 
Linux xps 4.10.0-32-generic #36~16.04.1-Ubuntu SMP Wed Aug 9 09:19:02 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ protoc --version
libprotoc 3.3.0

I posted the same question on Stack Overflow:
https://stackoverflow.com/questions/45896187/is-there-any-stable-serialization-method-for-different-languages

test.proto

syntax = "proto3";
package api;

message Msg {
    uint32 a = 1;
    int32  b = 2;
    string c = 3;
    bytes  d = 4;
}

test.cpp

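#include <algorithm>  // std::for_each
#include <iostream>   // std::cout
#include <string>
#include <vector>

#include <gtest/gtest.h>
#include <proto_deterministic_test.pb.h>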

TEST(PROTO, IsProtoDeterministic) {
  api::Msg msg;

  msg.set_a(1337);
  msg.set_b(1338);
  msg.set_c("lol kek cheburek");
  // d is empty intentionally

  std::string str = msg.SerializeAsString();

  std::cout << str.size() << std::endl;
  std::for_each(str.begin(), str.end(),
                [](int i) { std::cout << std::hex << std::uppercase << i; });
  std::cout << std::endl;
  // prints 8FFFFFFB9A10FFFFFFBAA1A106C6F6C206B656B20636865627572656B
  // expected:       08B90A10BA0A1A106C6F6C206B656B20636865627572656B
  /// deserialize
  api::Msg des;

  // C++ version of parser is able to parse data generated by Javascript, Java, Swift
  std::vector<int> v{0x08, 0xb9, 0x0a, 0x10, 0xba, 0x0a, 0x1a, 0x10,
                     0x6c, 0x6f, 0x6c, 0x20, 0x6b, 0x65, 0x6b, 0x20,
                     0x63, 0x68, 0x65, 0x62, 0x75, 0x72, 0x65, 0x6b};
  des.ParseFromString(std::string{v.begin(), v.end()});

  std::cout << std::dec
            << (int)des.a() << std::endl
            << (int)des.b() << std::endl
            <<      des.c() << std::endl;
// prints (as expected) 
// 1337
// 1338
// lol kek cheburek
}

test.js

var api = require('./proto_deterministic_test_pb');

function bytesToHex(bytes) {
  for (var hex = [], i = 0; i < bytes.length; i++) {
    hex.push((bytes[i] >>> 4).toString(16));
    hex.push((bytes[i] & 0xF).toString(16));
  }
  return hex.join("");
}

function hexToBytes(hex) {
  for (var bytes = [], c = 0; c < hex.length; c += 2)
    bytes.push(parseInt(hex.substr(c, 2), 16));
  return bytes;
}

// create the same object with the same fields
var a = new api.Msg();
a.setA(1337);
a.setB(1338);
a.setC("lol kek cheburek");

var bytes = a.serializeBinary();
var hbytes = bytesToHex(bytes);

console.log(hbytes)
// prints 08b90a10ba0a1a106c6f6c206b656b20636865627572656b

// data from C++ can not be parsed in JS, prints AssertionError (below)
var hex = "8FFFFFFB9A10FFFFFFBAA1A106C6F6C206B656B20636865627572656B";
var hex = hexToBytes(hex);
var msg = api.Msg.deserializeBinary(hex);

console.log(msg);
AssertionError in JS
goog.string.splitLimit=function(a,b,c){a=a.split(b);for(var d=[];0<c&&a.length;)d.push(a.shift()),c--;a.length&&d.push(a.join(b));return d};goog.string.editDistance=function(a,b){var c=[],d=[];if(a==b)return 0;if(!a.length||!b.length)return Math.max(a.length,b.length);for(var e=0;e<b.length+1;e++)c[e]=e;for(e=0;e<a.length;e++){d[0]=e+1;for(var f=0;f<b.length;f++)d[f+1]=Math.min(d[f]+1,c[f+1]+1,c[f]+Number(a[e]!=b[f]));for(f=0;f<c.length;f++)c[f]=d[f]}return d[b.length]};goog.asserts={};goog.asserts.ENABLE_ASSERTS=goog.DEBUG;goog.asserts.AssertionError=function(a,b){b.unshift(a);goog.debug.Error.call(this,goog.string.subs.apply(null,b));b.shift();this.messagePattern=a};goog.inherits(goog.asserts.AssertionError,goog.debug.Error);goog.asserts.AssertionError.prototype.name="AssertionError";goog.asserts.DEFAULT_ERROR_HANDLER=function(a){throw a;};goog.asserts.errorHandler_=goog.asserts.DEFAULT_ERROR_HANDLER;
                   
AssertionError
    at new goog.asserts.AssertionError (/home/bogdan/tools/iroha/test/libs/node_modules/google-protobuf/google-protobuf.js:98:603)
    at Object.goog.asserts.doAssertFailure_ (/home/bogdan/tools/iroha/test/libs/node_modules/google-protobuf/google-protobuf.js:99:126)
    at Object.goog.asserts.assert (/home/bogdan/tools/iroha/test/libs/node_modules/google-protobuf/google-protobuf.js:99:385)
    at jspb.BinaryDecoder.readUnsignedVarint32 (/home/bogdan/tools/iroha/test/libs/node_modules/google-protobuf/google-protobuf.js:319:176)
    at jspb.BinaryReader.nextField (/home/bogdan/tools/iroha/test/libs/node_modules/google-protobuf/google-protobuf.js:336:219)
    at Function.proto.api.Msg.deserializeBinaryFromReader (/home/bogdan/tools/iroha/test/libs/proto_deterministic_test_pb.js:93:17)
    at Function.proto.api.Msg.deserializeBinary (/home/bogdan/tools/iroha/test/libs/proto_deterministic_test_pb.js:81:24)
    at Object.<anonymous> (/home/bogdan/tools/iroha/test/libs/test.js:30:19)
    at Module._compile (module.js:569:30)
    at Object.Module._extensions..js (module.js:580:10)

@liujisi
Contributor

liujisi commented Aug 26, 2017

The problem is in how you print the serialized string as hex. The second byte, 0xB9, is negative when held in a signed char, so converting it to int sign-extends it to 0xFFFFFFB9. The same happens to the fifth byte, 0xBA, which prints as 0xFFFFFFBA.

Changing your lambda to:

[](unsigned char i) { std::cout << std::hex << std::uppercase << (int) i; }

should fix the issue.
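
To make the difference concrete, here is a minimal sketch that builds both hex dumps as strings instead of printing them. The function names hex_sign_extended and hex_unsigned are hypothetical and not part of the original test; the first one forces the signed conversion explicitly so it behaves the same on platforms where plain char is unsigned.

```cpp
#include <sstream>
#include <string>

// Mimics the original dump: each byte goes through a signed char before
// becoming an int, so any byte >= 0x80 is sign-extended (0xB9 -> FFFFFFB9).
std::string hex_sign_extended(const std::string& bytes) {
  std::ostringstream out;
  out << std::hex << std::uppercase;
  for (char c : bytes) {
    out << static_cast<int>(static_cast<signed char>(c));
  }
  return out.str();
}

// The suggested fix: go through unsigned char first, so the int is always
// in the range 0..255 and no sign extension can happen.
std::string hex_unsigned(const std::string& bytes) {
  std::ostringstream out;
  out << std::hex << std::uppercase;
  for (unsigned char c : bytes) {
    out << static_cast<int>(c);
  }
  return out.str();
}
```

Fed the first six bytes of the message (08 B9 0A 10 BA 0A), hex_sign_extended reproduces the start of the C++ output from the report, while hex_unsigned does not contain the FFFFFF runs. Neither version pads single-digit bytes, so 08 still prints as 8.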

@liujisi liujisi closed this as completed Aug 26, 2017
@liujisi
Contributor

liujisi commented Aug 26, 2017

Also, you probably want to keep the leading zero when printing each byte as hex (e.g. the first byte should print as 08, not 8) to avoid false positives in equality checks. Otherwise the output is ambiguous: 101 could be {0x10, 0x1} or {0x1, 0x0, 0x1}.
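
A minimal sketch of a dump that always emits two hex digits per byte, assuming the standard <iomanip> manipulators are acceptable here. The helper name to_hex is hypothetical and not from the original test.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Returns an uppercase hex dump with exactly two digits per byte.
std::string to_hex(const std::string& bytes) {
  std::ostringstream out;
  out << std::hex << std::uppercase << std::setfill('0');
  for (unsigned char c : bytes) {
    // std::setw applies only to the next insertion, so it is set per byte.
    out << std::setw(2) << static_cast<int>(c);
  }
  return out.str();
}
```

With this padding, {0x10, 0x1} dumps as 1001 and {0x1, 0x0, 0x1} as 010001, so the two can no longer be confused, and the dump of the serialized message can be compared directly with the hex strings from the other languages.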

@Warchant
Author

Thank you so much.
