Explain why C++ and Java serializations differ (in test suite) #28

Closed
rw opened this issue Jul 7, 2014 · 1 comment

Comments

@rw
Collaborator

rw commented Jul 7, 2014

The C++-generated and Java-generated serializations in the tests/ directory differ. Is this explained anywhere? If not, I don't see how to tell whether it's a bug or expected behavior.

More details:

$ diff tests/monsterdata_test_wire.bin tests/monsterdata_java_wire.bin
Binary files tests/monsterdata_test_wire.bin and tests/monsterdata_java_wire.bin differ
$ xxd tests/monsterdata_test_wire.bin
0000000: 2800 0000 0000 0000 0000 0000 1c00 3c00  (.............<.
0000010: 0800 0000 0600 2c00 0000 3000 0000 0500  ......,...0.....
0000020: 3400 3800 0000 0000 1c00 0000 0001 5000  4.8...........P.
0000030: 0000 803f 0000 0040 0000 4040 0000 0000  ...?...@..@@....
0000040: 0000 0000 0000 0840 0400 0500 0600 0000  .......@........
0000050: 0000 0000 4c00 0000 3c00 0000 3000 0000  ....L...<...0...
0000060: 0400 0000 0200 0000 0a00 1400 1e00 2800  ..............(.
0000070: 1c00 0800 0000 0000 0600 0000 0000 0000  ................
0000080: 0000 0000 0000 0000 0000 0000 1c00 0000  ................
0000090: 0000 1400 0500 0000 0001 0203 0400 0000  ................
00000a0: 0900 0000 4d79 4d6f 6e73 7465 7200 0000  ....MyMonster...
$ xxd tests/monsterdata_java_wire.bin
0000000: 2400 0000 0000 0000 1c00 4400 1c00 0000  $.........D.....
0000010: 1a00 1400 0000 1000 0000 0f00 0800 0400  ................
0000020: 0000 0000 1c00 0000 4000 0000 6400 0000  ........@...d...
0000030: 0000 0001 6300 0000 6800 0000 0000 5000  ....c...h.....P.
0000040: 0000 803f 0000 0040 0000 4040 0000 0000  ...?...@..@@....
0000050: 0000 0000 0000 0840 0400 0500 0600 0000  .......@........
0000060: 0000 0000 0000 0000 0200 0000 1e00 2800  ..............(.
0000070: 0a00 1400 1c00 0700 0000 0000 0400 0000  ................
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000090: 1c00 0000 1400 0005 0000 0000 0102 0304  ................
00000a0: 0900 0000 4d79 4d6f 6e73 7465 7200 0000  ....MyMonster...
@ghost

ghost commented Jul 7, 2014

From http://google.github.io/flatbuffers/md__internals.html:

"On purpose, the format leaves a lot of details about where exactly things live in memory undefined, e.g. fields in a table can have any order, and objects to some extend can be stored in many orders. This is because the format doesn't need this information to be efficient, and it leaves room for optimization and extension (for example, fields can be packed in a way that is most compact). Instead, the format is defined in terms of offsets and adjacency only."

The C++ .bin file is generated by the JSON parser, which packs the table fields in a different order than the Java test does. A different field order can also change the alignment padding, which can make the file size differ by a small amount.
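
The effect of field order on bytes and size can be sketched with a toy layout. This is not the real FlatBuffers wire format: the 4-byte offset table, the field names `flag` and `hp`, and their values are all assumptions made up for illustration. The only point is that a reader which follows offsets gets the same values from buffers that differ in both content and length.

```python
import struct

# Toy layout (NOT the real FlatBuffers format): the first 4 bytes hold two
# uint16 offsets, one per field, measured from the start of the buffer.
# Field names, formats, and values are hypothetical.
FIELDS = {
    "flag": ("<B", 1, 7),   # name -> (struct format, alignment, value)
    "hp": ("<i", 4, 80),
}

def build(order):
    """Serialize both fields, physically storing them in the given order."""
    buf = bytearray(4)                  # room for the offset table
    offsets = {}
    for name in order:
        fmt, align, value = FIELDS[name]
        while len(buf) % align:         # pad so the field is aligned
            buf.append(0)
        offsets[name] = len(buf)
        buf += struct.pack(fmt, value)
    struct.pack_into("<HH", buf, 0, offsets["flag"], offsets["hp"])
    return bytes(buf)

def read(buf):
    """Find each field through its offset, ignoring physical order."""
    flag_off, hp_off = struct.unpack_from("<HH", buf, 0)
    return {
        "flag": struct.unpack_from("<B", buf, flag_off)[0],
        "hp": struct.unpack_from("<i", buf, hp_off)[0],
    }

a = build(["flag", "hp"])   # flag first: 3 padding bytes before hp
b = build(["hp", "flag"])   # hp first: no padding needed
print(len(a), len(b), a == b, read(a) == read(b))
```

Storing the 1-byte field first forces padding before the 4-byte field, so the two buffers differ in length as well as in content, yet both decode to the same values.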

I agree this can be surprising, and it is important that implementers of readers and writers understand it. I should probably state explicitly, below that paragraph, that this can produce different binaries that are all compatible with one another. Any other suggestions are welcome.
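
The two dumps in the question give a small illustration of this. The same 32-byte run appears at offset 0x30 in the C++ file and at 0x40 in the Java file, and both files end with the same length-prefixed string. The sketch below decodes those bytes, copied from the xxd output. Reading the first run as three little-endian floats, four padding bytes, and a double is an assumption based on the byte patterns, not something stated in the issue.

```python
import struct

# Bytes copied from the xxd output in the issue.
# Found at 0x30 in monsterdata_test_wire.bin and 0x40 in monsterdata_java_wire.bin.
SHARED_HEX = (
    "0000 803f 0000 0040 0000 4040 0000 0000 "
    "0000 0000 0000 0840 0400 0500 0600 0000"
)
# Found at 0xa0 in both files.
STRING_HEX = "0900 0000 4d79 4d6f 6e73 7465 7200 0000"

shared = bytes.fromhex(SHARED_HEX)
tail = bytes.fromhex(STRING_HEX)

# Assumed interpretation: three float32 values, 4 bytes of padding, one float64.
x, y, z = struct.unpack_from("<3f", shared, 0)
d, = struct.unpack_from("<d", shared, 16)

# A uint32 length prefix followed by the characters and a NUL terminator.
length, = struct.unpack_from("<I", tail, 0)
name = tail[4:4 + length].decode("ascii")

print((x, y, z), d, length, name)
```

A test that checks cross-language compatibility would therefore compare decoded values like these, not the raw bytes of the two files.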

@ghost ghost closed this as completed Aug 13, 2014
kakikubo pushed a commit to kakikubo/flatbuffers that referenced this issue Apr 19, 2016
This issue was closed.