Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix big endian handling of enums #435

Merged
merged 3 commits into from
Nov 3, 2021

Conversation

stanhu
Copy link
Contributor

@stanhu stanhu commented Nov 2, 2021

Previously using an enum for a field on big-endian process such as the
s390x would fail to decode properly. We need to munge it in the correct
byte order before decoding it.

Previously using an enum for a field on big-endian process such as the
s390x would fail to decode properly. We need to munge it in the correct
byte order before decoding it.
@stanhu
Copy link
Contributor Author

stanhu commented Nov 2, 2021

Hmm, the bazel test is failing:

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //:msg_test
-----------------------------------------------------------------------------
Running main() from gmock_main.cc
[==========] Running 6 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 6 tests from MessageTest
[ RUN      ] MessageTest.Extensions
[       OK ] MessageTest.Extensions (1 ms)
[ RUN      ] MessageTest.MessageSet
[       OK ] MessageTest.MessageSet (0 ms)
[ RUN      ] MessageTest.Proto2Enum
upb/msg_test.cc:226: Failure
Value of: std::vector<int32_t>(vals_const, vals_const + size)
Expected: has 6 elements where
element #0 is equal to 0,
element #1 is equal to 15,
element #2 is equal to 12345,
element #3 is equal to -1,
element #4 is equal to 7,
element #5 is equal to 888
  Actual: { 0, 0, 0, 0, 0, -1 }, whose element #1 doesn't match
[  FAILED  ] MessageTest.Proto2Enum (1 ms)
[ RUN      ] MessageTest.TestBadUTF8
[       OK ] MessageTest.TestBadUTF8 (0 ms)
[ RUN      ] MessageTest.RequiredFieldsTopLevelMessage
[       OK ] MessageTest.RequiredFieldsTopLevelMessage (0 ms)
[ RUN      ] MessageTest.RequiredFieldsSubMessage
[       OK ] MessageTest.RequiredFieldsSubMessage (0 ms)
[----------] 6 tests from MessageTest (2 ms total)

[----------] Global test environment tear-down
[==========] 6 tests from 1 test suite ran. (2 ms total)
[  PASSED  ] 5 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] MessageTest.Proto2Enum

 1 FAILED TEST

Still, it previously failed with this?

exec ${PAGER:-/usr/bin/less} "$0" || exit 1
Executing tests from //:msg_test
-----------------------------------------------------------------------------
Running main() from gmock_main.cc
[==========] Running 6 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 6 tests from MessageTest
[ RUN      ] MessageTest.Extensions
Error loading compiled-in descriptor for file 'google/protobuf/struct.proto' (this should never happen): invalid type for field google.protobuf.Struct.fields (0)
Error loading compiled-in descriptor for file 'src/google/protobuf/test_messages_proto3.proto' (this should never happen):
Error loading compiled-in descriptor for file 'upb/msg_test.proto' (this should never happen):
upb/msg_test.cc:62: Failure
Value of: m.ptr() != nullptr
  Actual: false
Expected: true

@stanhu stanhu changed the title Fix big endian handling of enums Draft: Fix big endian handling of enums Nov 2, 2021
@stanhu stanhu marked this pull request as draft November 2, 2021 08:29
@stanhu stanhu changed the title Draft: Fix big endian handling of enums Fix big endian handling of enums Nov 2, 2021
@stanhu
Copy link
Contributor Author

stanhu commented Nov 2, 2021

I think the decode path is fine, but I think there is an encoding problem with the packed_enum type:

Good

$ ~/github/protobuf/src/protoc -I$(pwd)/upb -I/Users/stanhu/github/protobuf -I/Users/stanhu/github/protobuf/src --decode=upb_test.Proto2FakeEnumMessage msg_test.proto < /tmp/test-proto2-good.bin
optional_enum: 999
repeated_enum: 0
repeated_enum: 15
repeated_enum: 12345
repeated_enum: -1
repeated_enum: 7
repeated_enum: 888
packed_enum: 0
packed_enum: 15
packed_enum: 12345
packed_enum: -1
packed_enum: 7
packed_enum: 888

Bad

$ ~/github/protobuf/src/protoc -I$(pwd)/upb -I/Users/stanhu/github/protobuf -I/Users/stanhu/github/protobuf/src --decode=upb_test.Proto2FakeEnumMessage msg_test.proto < /tmp/test-proto2.bin
optional_enum: 999
repeated_enum: 0
repeated_enum: 15
repeated_enum: 12345
repeated_enum: -1
repeated_enum: 7
repeated_enum: 888
packed_enum: 0
packed_enum: 0
packed_enum: 0
packed_enum: 0
packed_enum: 0
packed_enum: -1

Comparison:

Good

$ hexdump -C test-proto2-good.bin
00000000  10 00 10 0f 10 b9 60 10  ff ff ff ff ff ff ff ff  |......`.........|
00000010  ff 01 1a 0e 00 0f b9 60  ff ff ff ff ff ff ff ff  |.......`........|
00000020  ff 01 08 e7 07 10 07 10  f8 06 18 07 18 f8 06     |...............|
0000002f

Bad

$ hexdump -C test-proto2.bin
00000000  10 00 10 0f 10 b9 60 10  ff ff ff ff ff ff ff ff  |......`.........|
00000010  ff 01 1a 0f 00 00 00 00  00 ff ff ff ff ff ff ff  |................|
00000020  ff ff 01 08 e7 07 10 07  10 f8 06                 |...........|
0000002b

proto.zip

@haberman
Copy link
Member

haberman commented Nov 2, 2021

Good find in the decoder!

@stanhu
Copy link
Contributor Author

stanhu commented Nov 2, 2021

I'll mark this pull request as ready since I've confirmed there is a separate problem with packed enums.

@stanhu stanhu marked this pull request as ready for review November 2, 2021 18:38
@stanhu
Copy link
Contributor Author

stanhu commented Nov 2, 2021

Eh, this actually could still be a decode problem. I'm not seeing the right bits when the encode starts.

@haberman
Copy link
Member

haberman commented Nov 2, 2021

Perhaps decode_enum_packed() needs a call to decode_munge().

@stanhu
Copy link
Contributor Author

stanhu commented Nov 2, 2021

Thanks! That did it indeed.

upb/decode.c Outdated Show resolved Hide resolved
@stanhu stanhu force-pushed the sh-fix-enum-handling-be branch 2 times, most recently from 7cd7dc6 to 8faf897 Compare November 2, 2021 22:17
This avoids an additional type check when decoding packed enums.
@haberman haberman merged commit dee1238 into protocolbuffers:main Nov 3, 2021
@haberman
Copy link
Member

haberman commented Nov 3, 2021

Thanks! btw, how did you get Bazel working on s390x?

@stanhu
Copy link
Contributor Author

stanhu commented Nov 3, 2021

@haberman This workaround helped me sidestep the compiler warnings: bazelbuild/bazel#13597 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants