Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cardinality violations should use error code “unimplemented” #11247

Open
jhump opened this issue May 30, 2024 · 5 comments
Open

Cardinality violations should use error code “unimplemented” #11247

jhump opened this issue May 30, 2024 · 5 comments

Comments

@jhump
Copy link
Member

jhump commented May 30, 2024

The gRPC docs for error codes state that both client and server should use the unimplemented code for cardinality violations. See table at the bottom of this doc (you can search for “cardinality violation” in the doc): https://grpc.github.io/grpc/core/md_doc_statuscodes.html.

A cardinality violation is when a stream contains an incorrect number of messages. Specifically, when a response stream for a unary or client-stream RPC contains zero messages with an OK status or more than one message; or when a request stream for a unary or server-stream RPC contains zero or more than one messages.

The server in this repo reports an exception with an internal error in this situation instead of unimplemented.

The client’s behavior is a little more… interesting:

  1. For unary RPCs, if no response messages are present in the response, only an “OK” status, an exception with an internal error is thrown.
  2. For client-stream RPCs, if no response messages are present in the response, only an “OK” status, there is no error at all. The onNext and onError methods of the application’s StreamObserver are never called, but the onCompleted method is.
  3. For both unary and client-stream RPCs, if a second response message is erroneously sent by the server before the trailers/status, an error is reported that has a cancelled code and a message of “Failed to read message.” Presumably the code comes from the fact that the library is likely trying to cancel the operation to prevent receipt of any further unwanted messages in the response stream.
@ejona86
Copy link
Member

ejona86 commented May 30, 2024

This was apparently added without much (any) discussion. UNIMPLEMENTED makes me sad as I am still waiting for someone to make the case that grpc responding 200 all the time is bad. At that point we'd convert UNIMPLEMENTED to 404. (Yes, compression method is UNIMPLEMENTED, but there was a gentleman's agreement that that case wouldn't be used to say 404 is inappropriate.) Although in other ways UNIMPLEMENTED for this case is probably a better fit than for compression.

I'm not checking all the cases, but at least some of these cases on both client and server would be easy to change the status code:

throw Status.INTERNAL
.withDescription("More than one responses received for unary or client-streaming call")
.asRuntimeException();

Status.INTERNAL.withDescription(TOO_MANY_REQUESTS),

@ejona86
Copy link
Member

ejona86 commented May 31, 2024

It seems likely only Python, and maybe some later implementations like .Net, is doing this behavior.

Edit: C++ is also doing UNIMPLEMENTED

@ejona86
Copy link
Member

ejona86 commented Jun 21, 2024

Some TLs discussed this, and we strongly agreed UNIMPLEMENTED is wrong. A particularly bad case is when the client detects the cardinality error, as the server has already processed the request and sent responses and only afterward does the client tell the application the RPC is UNIMPLEMENTED. That will certainly mislead the client.

The only safe choices are INTERNAL and UNKNOWN, and UNKNOWN was agreed to be inappropriate because we know what error happened. At some point I'll work on getting the text updated.

@jhump
Copy link
Member Author

jhump commented Jul 30, 2024

@ejona86, thanks for the transparency about the decision making. IIUC, you're saying that you expect the spec to be amended to instead use INTERNAL for this case instead of UNIMPLEMENTED. Is that right?

@ejona86
Copy link
Member

ejona86 commented Jul 30, 2024

@jhump, yes. I would have updated the spec already with the one-line change, but it seems this deserves to go through a gRFC, and I haven't spent the time to write it up yet. Just this morning I was looking at this issue and thinking I'd have time this week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants