Explain why C++ and Java serializations differ (in test suite) #28

rw · 2014-07-07T17:11:02Z

The C++-generated and Java-generated serializations in the tests/ directory differ. Is this explained anywhere? If not, I don't see how to tell whether it's a bug or expected behavior.

More details:

$ diff tests/monsterdata_test_wire.bin tests/monsterdata_java_wire.bin
Binary files tests/monsterdata_test_wire.bin and tests/monsterdata_java_wire.bin differ

$ xxd tests/monsterdata_test_wire.bin
0000000: 2800 0000 0000 0000 0000 0000 1c00 3c00  (.............<.
0000010: 0800 0000 0600 2c00 0000 3000 0000 0500  ......,...0.....
0000020: 3400 3800 0000 0000 1c00 0000 0001 5000  4.8...........P.
0000030: 0000 803f 0000 0040 0000 4040 0000 0000  ...?...@..@@....
0000040: 0000 0000 0000 0840 0400 0500 0600 0000  .......@........
0000050: 0000 0000 4c00 0000 3c00 0000 3000 0000  ....L...<...0...
0000060: 0400 0000 0200 0000 0a00 1400 1e00 2800  ..............(.
0000070: 1c00 0800 0000 0000 0600 0000 0000 0000  ................
0000080: 0000 0000 0000 0000 0000 0000 1c00 0000  ................
0000090: 0000 1400 0500 0000 0001 0203 0400 0000  ................
00000a0: 0900 0000 4d79 4d6f 6e73 7465 7200 0000  ....MyMonster...

$ xxd tests/monsterdata_java_wire.bin
0000000: 2400 0000 0000 0000 1c00 4400 1c00 0000  $.........D.....
0000010: 1a00 1400 0000 1000 0000 0f00 0800 0400  ................
0000020: 0000 0000 1c00 0000 4000 0000 6400 0000  ........@...d...
0000030: 0000 0001 6300 0000 6800 0000 0000 5000  ....c...h.....P.
0000040: 0000 803f 0000 0040 0000 4040 0000 0000  ...?...@..@@....
0000050: 0000 0000 0000 0840 0400 0500 0600 0000  .......@........
0000060: 0000 0000 0000 0000 0200 0000 1e00 2800  ..............(.
0000070: 0a00 1400 1c00 0700 0000 0000 0400 0000  ................
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000090: 1c00 0000 1400 0005 0000 0000 0102 0304  ................
00000a0: 0900 0000 4d79 4d6f 6e73 7465 7200 0000  ....MyMonster...


ghost · 2014-07-07T17:22:08Z

From http://google.github.io/flatbuffers/md__internals.html:

"On purpose, the format leaves a lot of details about where exactly things live in memory undefined, e.g. fields in a table can have any order, and objects to some extent can be stored in many orders. This is because the format doesn't need this information to be efficient, and it leaves room for optimization and extension (for example, fields can be packed in a way that is most compact). Instead, the format is defined in terms of offsets and adjacency only."

The C++ .bin file is generated by the JSON parser, which packs the table fields in a different order than the Java writer does. A different field order can also change the alignment padding, which can make the file sizes differ by a small amount.

I agree this can be surprising, and it is important that implementers of readers and writers understand it. I should probably state explicitly, right below that paragraph, that different writers can produce different binaries that are all compatible with one another. Any other suggestions are welcome.
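To make the point concrete, here is a minimal sketch that decodes both xxd dumps from this issue by hand, using only Python's `struct` module rather than the FlatBuffers library. It follows the root offset to the table, walks back to the vtable, and reads two fields through it. The slot indices (`HP_SLOT = 2`, `NAME_SLOT = 3`) and the helper names are assumptions made for illustration; they are consistent with the values the bytes decode to, but they are not taken from the schema text in this thread.

```python
import struct

# Bytes transcribed from the two xxd dumps quoted in this issue.
CPP_HEX = """
2800 0000 0000 0000 0000 0000 1c00 3c00
0800 0000 0600 2c00 0000 3000 0000 0500
3400 3800 0000 0000 1c00 0000 0001 5000
0000 803f 0000 0040 0000 4040 0000 0000
0000 0000 0000 0840 0400 0500 0600 0000
0000 0000 4c00 0000 3c00 0000 3000 0000
0400 0000 0200 0000 0a00 1400 1e00 2800
1c00 0800 0000 0000 0600 0000 0000 0000
0000 0000 0000 0000 0000 0000 1c00 0000
0000 1400 0500 0000 0001 0203 0400 0000
0900 0000 4d79 4d6f 6e73 7465 7200 0000
"""
JAVA_HEX = """
2400 0000 0000 0000 1c00 4400 1c00 0000
1a00 1400 0000 1000 0000 0f00 0800 0400
0000 0000 1c00 0000 4000 0000 6400 0000
0000 0001 6300 0000 6800 0000 0000 5000
0000 803f 0000 0040 0000 4040 0000 0000
0000 0000 0000 0840 0400 0500 0600 0000
0000 0000 0000 0000 0200 0000 1e00 2800
0a00 1400 1c00 0700 0000 0000 0400 0000
0000 0000 0000 0000 0000 0000 0000 0000
1c00 0000 1400 0005 0000 0000 0102 0304
0900 0000 4d79 4d6f 6e73 7465 7200 0000
"""

# Assumed vtable slots for the two fields read below (hypothetical names).
HP_SLOT, NAME_SLOT = 2, 3


def load(hex_text):
    """Turn an xxd-style hex listing into bytes."""
    return bytes.fromhex("".join(hex_text.split()))


def root_and_slots(buf):
    """Return the root table position and its vtable field offsets."""
    root = struct.unpack_from("<I", buf, 0)[0]
    # The table starts with a signed offset pointing back to its vtable.
    vtable = root - struct.unpack_from("<i", buf, root)[0]
    vtable_size = struct.unpack_from("<H", buf, vtable)[0]
    count = (vtable_size - 4) // 2  # skip the two uint16 size fields
    slots = struct.unpack_from("<%dH" % count, buf, vtable + 4)
    return root, list(slots)


def read_hp(buf):
    """Read the int16 in HP_SLOT (no default handling: a slot of 0 means absent)."""
    root, slots = root_and_slots(buf)
    return struct.unpack_from("<h", buf, root + slots[HP_SLOT])[0]


def read_name(buf):
    """Follow the string offset in NAME_SLOT and decode the length-prefixed text."""
    root, slots = root_and_slots(buf)
    ref = root + slots[NAME_SLOT]
    start = ref + struct.unpack_from("<I", buf, ref)[0]
    length = struct.unpack_from("<I", buf, start)[0]
    return buf[start + 4:start + 4 + length].decode("utf-8")


cpp, java = load(CPP_HEX), load(JAVA_HEX)
for label, buf in (("C++", cpp), ("Java", java)):
    root, slots = root_and_slots(buf)
    print(label, "root:", root, "slots:", slots,
          "hp:", read_hp(buf), "name:", read_name(buf))
```

The two vtables list different offsets for the same slots, so the fields sit at different positions inside each table, yet a reader that goes through the vtable recovers the same values from both files. That is the sense in which the binaries differ but remain compatible.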


ghost closed this as completed Aug 13, 2014

kakikubo pushed a commit to kakikubo/flatbuffers that referenced this issue Apr 19, 2016

Merge pull request google#28 from kiyoto-suzuki/feature/refactor_build

c2e36fe

Miscellaneous build.py refactoring

This issue was closed.
