GRPC Ruby client eventually fails to decode JSON string, v3.15.5 #8420
Comments
If I'm understanding correctly, you are saying that the protobuf library is corrupting the string. You said that the loop will return the same string every time, but parsing sporadically fails. Can you reproduce this if the loop does not use GRPC at all?
payload = "..."
loop do
  proto = YourProto.decode(payload)
  decoded_metadata = Base64.decode64(proto.json_metadata)
  JSON.parse(decoded_metadata) # Does this eventually fail?
end
If you can get that to sporadically fail, please send the payload and .proto definition and I can definitely debug that.
@haberman The GRPC server always sends the same metadata string value to the client. That metadata value is:
On the client side, in the loop, I print out the metadata value and it is the same. Except
It appears the leading characters are being trimmed.
The client is always requesting the same resource in the loop. If I start the loop on the client for resource 1, it runs maybe 10 times and then throws the error. I run the loop again for resource 1, it runs about 56 times and throws the error. The server is always returning the same data for resource 1. The proto definition is:
Thanks.
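To make the "leading characters are being trimmed" observation easier to confirm in logs, a small check along these lines may help. It is only a sketch: `describe_corruption` is a hypothetical helper name, and it assumes you know the exact value the server sends.

```ruby
# Hypothetical helper (not part of protobuf or gRPC): classify how a received
# string differs from the value the server is known to send.
def describe_corruption(expected, received)
  if received == expected
    { kind: :identical, dropped: 0 }
  elsif expected.end_with?(received)
    # The received value is a strict suffix of the expected one,
    # i.e. some number of leading characters went missing.
    { kind: :leading_trimmed, dropped: expected.length - received.length }
  else
    { kind: :other, dropped: nil }
  end
end
```

Logging the result on every iteration would show whether each failure is the same kind of truncation or something less regular.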
I still wonder if you can reproduce this without gRPC. Can you create a repro that uses the protobuf library only? The info you've given me so far isn't enough to reproduce this problem on my own machine.
It's unclear how to reproduce this without gRPC, but I'll work on it.
I would hope that you could grab the payload right before it is passed to protobuf. You might have to edit the gRPC code that calls protobuf, but hopefully that's not too tricky. If protobuf is truly being inconsistent in how it parses the payload, a loop like I proposed in #8420 (comment) should reproduce the error.
@haberman I created a loop, like you proposed in #8420 and it never produced the error. I'll see about editing the gRPC code that calls protobuf. Thanks. Here is the loop I created:
Where payload is
Hi @heyZeus, any luck reproducing this? I do have concern that your bug is fixed by reverting to an older version. That suggests that there is a real bug here of some kind. If we can reproduce it reliably, I'm confident I can find a fix.
@haberman I've been on vacation and just got back. I hope to try and reproduce tomorrow or Friday. Thanks for reaching out.
@heyZeus Are you still running into this?
@haberman Yes, I am still seeing it. I tried upgrading today to v3.17.3. I haven't had time to debug it though. I'm guessing no one else is complaining about this issue?
Actually, I have a similar issue. I was using v3.14 to decode something. Was working well until I upgraded to v3.15 and then started getting parsing errors |
@Muyinza256 do you have a repro? With a repro or crash report, there is a high chance of being able to fix the bug. Without a repro, it's very difficult to debug.
@haberman I'm picking this up where @heyZeus left off. Using a whole bunch of logging I've narrowed down where the data is getting corrupted, but I'm still not able to create a repro. I'm hoping with this information it will be enough to get us further. The problem seems to happen when lists of protos are concatenated. Let me explain with a high-level overview of our code. The client code does 3 or 4 lookups to the document server, concatenates all the results, then creates wrapper objects for each document -- which is the point at which the Base64 decode and JSON parse happens. High level it looks kind of like this:
Each of the
To lend further credence, if I change the logic slightly to the below, the problem disappears:
It so happens that for the particular repro case I'm working with, the first and third fetches return no documents, and
Shortly after writing the above, I remembered that in Ruby repeated fields aren't actually arrays, they're
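For anyone stuck on an affected release, one defensive pattern is to copy each result into a plain Ruby Array before concatenating, so that the repeated-field `+` is never called. This is a sketch under assumptions, not necessarily the change described above: `concat_as_arrays` is a hypothetical helper, and it relies on repeated fields being Enumerable and therefore responding to `to_a`.

```ruby
# Hypothetical helper: merge several result lists into one plain Array.
# Each argument only needs to respond to #to_a (Arrays do; protobuf repeated
# fields are assumed to, since they are Enumerable).
def concat_as_arrays(*lists)
  # flat_map copies every element into a single new Array, so the result
  # is an ordinary Ruby Array rather than a repeated-field object.
  lists.flat_map(&:to_a)
end
```

With results from several lookups, `concat_as_arrays(first, second, third)` would return all documents in order, including when some lookups return nothing.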
Thanks so much for narrowing this down! I think I have found the bug, and I have a fix. I'm working on getting the fix into the release. For now do you want to see if this patch fixes the bug?
diff --git a/ruby/ext/google/protobuf_c/repeated_field.c b/ruby/ext/google/protobuf_c/repeated_field.c
index 88e8434f0..5ff3c769a 100644
--- a/ruby/ext/google/protobuf_c/repeated_field.c
+++ b/ruby/ext/google/protobuf_c/repeated_field.c
@@ -551,6 +551,7 @@ VALUE RepeatedField_plus(VALUE _self, VALUE list) {
   RepeatedField* dupped = ruby_to_RepeatedField(dupped_);
   upb_array *dupped_array = RepeatedField_GetMutable(dupped_);
   upb_arena* arena = Arena_get(dupped->arena);
+  Arena_fuse(list_rptfield->arena, arena);
   int size = upb_array_size(list_rptfield->array);
   int i;
I'm trying to verify the patch, and having a devil of a time getting it working. I've tried compiling from source and using a
Sorry about that. The fix is released in 3.18.0, just released: https://rubygems.org/gems/google-protobuf/versions/3.18.0-x86_64-linux Want to give it a try and see if it fixes your problem?
I'm a few thousand iterations into a loop of 20k, with no issues so far. It's always failed in the first few hundred before, so I'd say that it's fixed. Thank you!
Excellent, thanks for your patience and for creating a helpful repro. |
What version of protobuf and what language are you using?
Version: v3.15.6
Language: Ruby
What operating system (Linux, Windows, ...) and version?
MacOS 10.15.7
Ubuntu 16.04
What runtime / compiler are you using (e.g., python version or gcc version)
Ruby 2.6.5
What did you do?
The RPC server will Base64.encode some JSON, like:
The client will decode like:
The proto type is defined as
string json_metadata = 7; // base64 encoded
If I go into a REPL, as a GRPC client, and make a call in a loop to fetch the metadata then decode, eventually parsing the metadata fails.
What did you expect to see
No JSON::ParserError, no matter how many times the code is executed.
What did you see instead?
It will work for a while in a loop, then eventually fail (at random tries) with an error:
The loop will return the same string every time, but will eventually fail to parse the string and throw the JSON::ParserError.
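The fetch-and-decode loop described here can be wrapped in a small harness that stops at the first failure and keeps the offending value for inspection. This is a rough sketch: `decode_metadata` and `first_decode_failure` are hypothetical names, and the block stands in for whatever client call returns the Base64-encoded `json_metadata` string.

```ruby
require "base64"
require "json"

# Decode a Base64-encoded JSON string the way the client in this report does.
def decode_metadata(encoded)
  JSON.parse(Base64.decode64(encoded))
end

# Call the block up to `iterations` times, decoding each returned value.
# Returns details of the first value that fails to parse, or nil if every
# iteration succeeds.
def first_decode_failure(iterations)
  iterations.times do |i|
    encoded = yield
    begin
      decode_metadata(encoded)
    rescue JSON::ParserError => e
      return { iteration: i, encoded: encoded, error: e.message }
    end
  end
  nil
end

# Hypothetical usage; `client.fetch_document(1)` is an assumed call, not a real API:
#   failure = first_decode_failure(20_000) { client.fetch_document(1).json_metadata }
#   p failure if failure
```

Capturing the raw encoded value at the moment of failure makes it possible to compare it against the known-good string the server sends.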
The GRPC server is using a different version of Ruby (v2.5.8) and protobuf (v3.9.0) than the Ruby client.
If I downgrade the Ruby client's protobuf version to 3.14.0, this error goes away. All I need to do to reproduce this error is to upgrade the Ruby client to protobuf version 3.15.5.