Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to decode proto3 message from Retrofit's response error body #1056

Closed
pavlospt opened this issue Jul 9, 2019 · 19 comments
Closed

Unable to decode proto3 message from Retrofit's response error body #1056

pavlospt opened this issue Jul 9, 2019 · 19 comments

Comments

@pavlospt
Copy link
Contributor

pavlospt commented Jul 9, 2019

Problem

Hey, I am trying to decode and create a new protobuf model, from the bytes located in Retrofit's Response.errorBody().

Reproducing

The way I am doing this, is get an HttpException with status code 402 from our backend, containing a specific proto message in the error body.
After accessing the errorBody() from the response, I have the following:

val errorBodyBytes = response?.errorBody()?.bytes()

The above in hex has the following value: 2a092801300138e8074801. Specifically I get a failure to decode the value in the bool e = 5; of the following proto message. It resolves the value of 40 when reading the next byte and fails to match that to a boolean in ProtoAdapter.COMMON_BOOL.

The proto3 message

message Subscription {
  bool a = 1;
  bool b = 2;
  bool c = 3;
  bool d = 4;
  bool e = 5;
  bool f = 6;
  int32 g = 7;
  int32 h = 8;
  bool i = 9;
}

Dependencies versions

Wire-runtime: 3.0.0-alpha03
Wire Gradle plugin: 3.0.0-alpha03
Retrofit: 2.6.0
Okio: 2.3.0-SNAPSHOT
OkHttp: 4.0.0

@Egorand
Copy link
Collaborator

Egorand commented Jul 9, 2019

I've looked into it and I can indeed repro the issue, what I'm getting is java.io.IOException: Invalid boolean value 0x28 for tag 5. However, it doesn't seem like a regression from the 2.x version of wire-runtime. Next thing I'll try is to decode data by hand to verify it should indeed decode properly.

Note that I had to add optional to every field, it seems like the message is using proto3 format. It should be the same in binary though, so the only thing to verify here is that the data is not corrupt.

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

Yeap it is indeed proto3 as I mentioned on the title, and the exception you get is the correct one if you are building from master. I can attest that the data is not corrupt, since I tested it on another branch of our codebase that still uses the Google protobuf Java library and it is correctly parsed as the message I posted above.

Let me know if I can help in any other way!

@Egorand
Copy link
Collaborator

Egorand commented Jul 9, 2019

Okay, so I tried decoding the hex using protoc:

echo 2a092801300138e8074801 | xxd -r -p | protoc --decode_raw

which outputs the following:

5 {
  5: 1
  6: 1
  7: 1000
  9: 1
}

If I'm reading this correctly, tag 5 contains an embedded message that in turn has what seems like boolean values for tags 5, 6, 9 and an int value for tag 7 - this looks consistent with the proto message above. That explains why decoding fails: when we're decoding bytes against the proto above we're expecting tag 5 to resolve to a boolean value, and the data suggests that it's something different.

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

Hmm this should not be the case! The value in tag 5 is a boolean just like the rest of the bool tags. Our backend only encodes the response to protobuf when requesting with the appropriate header as well, so when checking that in JSON is still like the message and has nothing nested :/ I am not sure why it seems so in the decoding.

@Egorand
Copy link
Collaborator

Egorand commented Jul 9, 2019

Could you please recheck two things:

  • That the bytes have been hexed correctly
  • That reverting to Wire 2 fixes the issue
    Wanna understand if it's a regression from Wire 2, or an old issue we've had before (which seems unlikely).

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

@Egorand

  • I am positive that the bytes are correct, have tested that multiple times for 3 days in a row (I really want to make the switch to Wire 😂)
  • We did not use Wire 2 previously 😕

@Egorand
Copy link
Collaborator

Egorand commented Jul 9, 2019

Re-checked with this online proto decoder: https://protogen.marcgravell.com/decode
According to its output, field 5 is of type String.

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

How is it possible that is decoded from the Google protobuf library though? 😕 If this is the case it should fail to decode in both libraries, right? Also this decoder is showing 2 different entries for field 5. Is that the embedded you mentioned before?

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

@Egorand it seems that the following:

$ echo "2801300138e8074801" | xxd -r -p | protoc --decode_raw
5: 1
6: 1
7: 1000
9: 1

is the proper message payload that I would expect to be correctly decoded. So I am wondering where the following hex data 2a09 are added in the stack.

@JakeWharton
Copy link
Member

JakeWharton commented Jul 9, 2019 via email

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

@JakeWharton indeed this is the case suggested by the online proto decoder mentioned by @Egorand : https://protogen.marcgravell.com/decode. I am not sure why the message is prefixed though. Is that some kind of protobuf message semantic?

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

The following code seems to parse the message correctly, when removing the length prefix.

fun String.hexStringToByteArray() = ByteArray(this.length / 2) { this.substring(it * 2, it * 2 + 2).toInt(16).toByte() }
val newBytes = "2801300138e8074801".hexStringToByteArray()
Subscription.ADAPTER.decode(newBytes)

I suspect that the reason it does work with Google's protobuf library, is the usage of: https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/Parser#parseDelimitedFrom-java.io.InputStream-

Like parseFrom(InputStream), but does not read util EOF. Instead, the size of message (encoded as a varint) is read first, then the message data.

@Egorand
Copy link
Collaborator

Egorand commented Jul 9, 2019

@swankjesse and I looked at it in more detail, and it turns out there's difference in how protoc and Wire parse the example you provided:

  • protoc finds tag 5 with the LENGTH_DELIMITED type, and since it's different from the type of the tag in the schema it'll just put the value into the unknownFields. Decoding succeeds, but you have false values for all your boolean fields and almost the entire message inside unknownFields.
  • Wire finds tag 5 with the LENGTH_DELIMITED type, and since it's different from the type of the tag in the schema, it crashes.

Although protoc does indeed succeed to parse the payload, since your schema is not consistent with the structure of the payload, you end up with an instance where all fields are initialized with false values. This looks like a bug that's hard to identify.

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

Without trying to sound offensive, I am pretty sure that this is not the case for the Google protobuf. Let me explain why is that.

I am using the values returned in this message in order to take show some different feedback based on some business logic. For the sake of the example let's state the following cases:

  1. Show not available feedback to user
  2. Show not paid feedback to user

If all of the fields had false as values, I would be presented with case 1, whereas the returned payload in the error should have properly lead me to case 2 (like it does with Google protobuf). It has been working like that since the time we added it. I have also compared it between different branches, cause it did not make any sense at the beginning 😛

If I though now drop the first two elements from the incoming ByteArray before actually trying to decode the Subscription message, everything works (as we've discussed above as well), but I am not sure if this is something that I should do.

Totally agree that this is a very hard to identify bug...!

@swankjesse
Copy link
Member

The schema you’re attempting to decode with is this:

message Subscription {
  bool a = 1;
  bool b = 2;
  bool c = 3;
  bool d = 4;
  bool e = 5;
  bool f = 6;
  int32 g = 7;
  int32 h = 8;
  bool i = 9;
}

But the correct schema for this message has an enclosing object:

message Box {
  Subscription subscription = 5;
}

message Subscription {
  bool a = 1;
  bool b = 2;
  bool c = 3;
  bool d = 4;
  bool e = 5;
  bool f = 6;
  int32 g = 7;
  int32 h = 8;
  bool i = 9;
}

You’re attempting to decode as a Subscription, but you should be attempting to decode as Box.

@pavlospt
Copy link
Contributor Author

pavlospt commented Jul 9, 2019

Ok! Thank you all for your time and valuable help! Will take a look at this again with the backend engineers. Probably there is a change that we were not aware of and this value is now boxed like @swankjesse suggests!

@Egorand
Copy link
Collaborator

Egorand commented Jul 9, 2019

Yeah, the fact that protoc is so lenient is probably not a good thing, but in this case the only source of truth is code, and Wire's is different :) Please keep us posted on your findings!

@pavlospt
Copy link
Contributor Author

Update

I did a more detailed investigation today 😄 It seems that you were totally right all along @Egorand & @swankjesse! The proto message is indeed coming in boxed in another proto message, since the backend packs the same response for both success and error cases (miscommunication issue right here).

Let's assume that we have a message like:

message SearchResponse {
    int64 a = 1;
    int64 b =  2;
    int64 c = 3;
    int64 d = 4;
    Subscription e = 5;
}

this message includes the original Subscription proto message. Wire3 correctly failed to parse that when asked to parse it as a Subscription instead of a SearchResponse (which it is correctly parsed), but Google's protobuf did assign default values instead of failing to parse it.

Further down in the codepath, the value of one of Subscriptions properties was not handled as a default/non-parsed value, causing the happy path to be invoked and this is what caused all the confusion for me.

Apparently this seems to be a testing issue after all, since the handling is not correct for the case of bad parsing, but we would have never been able to spot it if we did not move to Wire3 😄

I would like to thank you all once again! Precious investigation and taught me new things for Protobufs!

@Egorand
Copy link
Collaborator

Egorand commented Jul 10, 2019

Sweet! Glad that Wire helped you discover an issue. I'm gonna close this ticket as it seems like there's no action for us to take.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants