Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to parse server response when I try to read a stream #1022

Open
paambaati opened this issue Sep 8, 2019 · 13 comments
Open

Failed to parse server response when I try to read a stream #1022

paambaati opened this issue Sep 8, 2019 · 13 comments

Comments

@paambaati
Copy link
Contributor

paambaati commented Sep 8, 2019

Problem description

I'm using generated static bindings from https://github.com/liftbridge-io/liftbridge-grpc/blob/0877a0cacb7a721e2e19280cff126dc2ed51912d/api.proto to create a Liftbridge Node.js client that will create a stream and publish some messages and subscribe to them (Liftbridge is a kind of lightweight Kafka), and I run into an Error: 13 INTERNAL: Failed to parse server response error consistently whenever I try to consume the Subscribe stream.

Reproduction steps

  1. docker run -p 4222:4222 -ti nats:latest --debug --trace in a window.
  2. go get github.com/liftbridge-io/go-liftbridge and then $GOPATH/bin/liftbridge --raft-bootstrap-seed --nats-servers nats://localhost:4222 --level debug in another window.
  3. Clone my repo from https://github.com/paambaati/node-liftbridge.git in yet another window.
    1. yarn install or npm install
    2. yarn run debug or npm run debug to run the debug script that reproduces this issue.

The debug script attempts to create a new stream, then publish a few messages, then subscribes to the same stream (subject) and then publishes a few more messages.

Expected output — Each published message should be printed to console (see relevant lines).

Actual output —

Error: 13 INTERNAL: Failed to parse server response
    at Object.exports.createStatusError (~/node-liftbridge/node_modules/grpc/src/common.js:91:15)
    at ClientReadableStream._emitStatusIfDone (~/node-liftbridge/node_modules/grpc/src/client.js:233:26)
    at ClientReadableStream._receiveStatus (~/node-liftbridge/node_modules/grpc/src/client.js:211:8)
    at Object.onReceiveStatus (~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:1272:15)
    at InterceptingListener._callNext (~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:568:42)
    at InterceptingListener.onReceiveStatus (~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:618:8)
    at ~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:1029:24 {
  code: 13,
  metadata: undefined,
  details: 'Failed to parse server response'
}

Environment

  • macOS 10.14.16
  • Node 12.9.1
  • Node installation method - fnm
  • Clang 10.0.1 (clang-1001.0.46.4)
  • grpc@1.23.3

Additional context

  1. The static bindings were generated by a script - see relevant command here. The bindings are checked-in at grpc/generated.

  2. The Go version of the client I'm developing (see go-liftbridge) can in fact Subscribe to the messages Published by my Node.js debug script and print all the Messages correctly! I used the example subscribe program at https://github.com/liftbridge-io/go-liftbridge/blob/master/example/lift-sub/subscribe.go to verify this 😮

  3. I ran a full verbose trace of my debug script, and here's the output — https://gist.github.com/paambaati/7884b119eee47fafa436f74db8b59edc. I've padded the debug script's stdout with a lot of new lines so it is a little easier to separate them from the GRPC traces.

@paambaati
Copy link
Contributor Author

FYI, adding [jstype=JS_STRING] to the int64 fields has no effect.

@murgatroid99
Copy link
Member

That error message essentially indicates that the incoming binary data could not be parsed as a valid message of the expected response message type. Are you sure that the .proto file you are using is consistent with the one that the service was defined with?

@paambaati
Copy link
Contributor Author

Are you sure that the .proto file you are using is consistent with the one that the service was defined with?

@murgatroid99 I am, yes. And FWIW, the other implementations of the same .proto file (https://github.com/liftbridge-io/go-liftbridge in Go and https://github.com/dgzlopes/python-liftbridge in Python) work fine.

What other angles could we look at this from?

@murgatroid99
Copy link
Member

The next useful step is to add more logging. Those errors are handled here, here, and here, and logging the original error messages will give more information (only one of those will be triggering this specific issue, but I'm not sure which one). I can add some logging in an upcoming release, and in the meantime you could add logging in your local copy of the library to get the information faster.

@paambaati
Copy link
Contributor Author

paambaati commented Sep 10, 2019

@murgatroid99 So the error is from here.

When I stringify response.read, I get —

KEY-ef33408c3ce1ecbfb2e4!VALUE-ok-KEY-ef33408c3ce1ecbfb2e4 �����;�*test9:
subjecttest9:
reply

Dump of the buffer direct to a file — output.txt

Screenshot of the file opened in SublimeText —
Screenshot 2019-09-10 at 1 33 59 PM

KEY-ef33408c3ce1ecbfb2e4 is Message.key, and VALUE-ok-KEY-ef33408c3ce1ecbfb2e4 is Message.value.

The Go client can Subscribe and read the Message without any problems.

@paambaati
Copy link
Contributor Author

paambaati commented Sep 10, 2019

@murgatroid99 There's more weird stuff happening.

When I Publish this Message

{
  offset: 0,
  key: 'S0VZLWFlZmNkYWFhNThhMDlhNmViYWYy', // b64 version of 'KEY-aefcdaaa58a09a6ebaf2'
  value: 'VkFMVUUtb2s=', // b64 version of 'VALUE-ok'
  timestamp: 1568100468657000000,
  subject: 'test11',
  reply: '',
  headersMap: [],
  ackinbox: 'ack.test11',
  correlationid: 'c34c6cfa-a5c6-410c-8fb4-459b6f202970',
  ackpolicy: 0
}

this is the Message I see on the Go Subscribe side —

key:"KEY-aefcdaaa58a09a6ebaf2" value:"VALUE-ok" timestamp:1568100468661514000 subject:"test11" headers:<key:"reply" value:"" > headers:<key:"subject" value:"test11" >

You'll notice that the ackinbox and correlationid properties are also not set on the payload.

@paambaati
Copy link
Contributor Author

paambaati commented Sep 10, 2019

@murgatroid99 By printing the deserialize error trace, I was able to trace the deserialization down to the headers map.

Full stack trace —

Error [AssertionError]: Assertion failed
    at new goog.asserts.AssertionError ~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:79:876)
    at Object.goog.asserts.doAssertFailure_ ~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:80:257)
    at Object.goog.asserts.assert [as assert] ~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:81:83)
    at Function.jspb.Map.deserializeBinary ~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:275:277)
    at ~/node-liftbridge/grpc/generated/api_pb.js:2259:18
    at jspb.BinaryReader.readMessage ~/node-liftbridge/node_modules/google-protobuf/google-protobuf.js:249:329)
    at Function.proto.proto.Message.deserializeBinaryFromReader ~/node-liftbridge/grpc/generated/api_pb.js:2258:14)
    at Function.proto.proto.Message.deserializeBinary ~/node-liftbridge/grpc/generated/api_pb.js:2214:30)
    at deserialize_proto_Message ~/node-liftbridge/grpc/generated/api_grpc_pb.js:59:25)
    at ~/node-liftbridge/node_modules/grpc/src/common.js:38:12
    at ~/node-liftbridge/node_modules/grpc/src/client_interceptors.js:689:22 {
  message: 'Assertion failed',
  reportErrorToServer: true,
  messagePattern: 'Assertion failed'
}

Relevant line from generated _pb file — https://github.com/paambaati/node-liftbridge/blob/dd64cf136c265bca798c6bcff4121a5398fafc99/grpc/generated/api_pb.js#L2259

Relevant proto definition of headers map — https://github.com/liftbridge-io/liftbridge-grpc/blob/0877a0cacb7a721e2e19280cff126dc2ed51912d/api.proto#L100

I'm not sure where to go from here. Any help would be appreciated.

Some observations

  1. After the headers field, no other field is set.
  2. The generated _pb file only exposes Message.getHeadersMap() and Message.clearHeadersMap(); is there no addHeadersMap() or setHeadersMap()? Is this by design? How does one set headers?

@paambaati
Copy link
Contributor Author

I was finally able to trace it to the root cause! wipes sweat from forehead

The problem is when the headers map has an undefined value. The Go client reads it as an empty string (headers:<key:"reply" value:"" >) while grpc-node (or protobuf?) reads it as undefined, and hence the assertion error.

@murgatroid99 Is this a bug, or is this something that needs to be fixed/reconciled on the server-side? I can gladly provide more information/repros for this.

@murgatroid99
Copy link
Member

OK, it looks like there are multiple issues here.

The fact that you are seeing different messages in your subscribe responses than you are publishing looks like a server bug, or at least unexpected server behavior. Since the Go client is seeing it too, it's very unlikely to be a client bug.

The failure to process a map entry with an absent value looks like a protobuf bug. You should file an issue with the official protobuf repository. We just republish protoc. You should also go there with questions about the generated _pb.js files.

@paambaati
Copy link
Contributor Author

paambaati commented Sep 19, 2019

@murgatroid99 Thanks for the pointers, I've further narrowed down the issue to a missing argument in the generated _pb.js file for deserializing map fields - see protocolbuffers/protobuf-javascript#43.

If you're familiar with the protobuf codebase, would you mind helping me find the codemod that generates the _pb.js file? I'd love to send a PR to fix this.

@murgatroid99
Copy link
Member

Sorry, I'm not that familiar with the protobuf codebase.

@paambaati
Copy link
Contributor Author

protobufjs/protobuf.js#1348 should fix this.

@Josh-Rosen
Copy link

That error message essentially indicates that the incoming binary data could not be parsed as a valid message of the expected response message type. Are you sure that the .proto file you are using is consistent with the one that the service was defined with?

This response helped me.

I was using the BloomRPC tool as a UI to send gRPC requests to my server. I updated my protos but didn't update the client Proto in the BloomRPC app. Updated and my requests are working again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants