Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extensibility point is needed to return 401 or 500 based on request execution #159

Closed
ntaranov opened this issue Aug 17, 2018 · 17 comments
Closed

Comments

@ntaranov
Copy link

GraphQLHttpMiddleware is getting better, so I would be happy to make use of it in my projects.

If I understand correctly, current implementation only returns anything other than 200 only based on information available before (IGraphQLExecuter).ExecuteAsync call.

The problem is following: sometimes it is only possible to figure out that middleware needs to return a code different from 200 or 400 only during the request execution. Notable examples are 500 and 401.

Concrete examples to illustrate the concept:
https://github.com/graphql-dotnet/examples/blob/master/src/AspNetCoreCustom/Example/GraphQLMiddleware.cs#L74
graphql-dotnet/authorization#1 (comment)

In my opinion, we need an extensibility point roughly here
https://github.com/graphql-dotnet/server/blob/develop/src/Transports.AspNetCore/GraphQLHttpMiddleware.cs#L88

We can go "MVC way" and introduce something like ActionResult, which would later be evaluate by some "executor" resolved as a service.

Of course there is unlimited number of options of how to do this; I will be happy with any solution be it Chain of Responsibility, some analog of IActionFilter/IResultFilter, a delegate parameter or anything else.

I would go with "ActionResult" solution, can provide a PR.

@johnrutherford
Copy link
Member

With GraphQL, HTTP should be thought of as just the transport. Using the HTTP status code to communicate application status is a REST concept and not really compatible with GraphQL, especially because you can have a result that is mostly successful but with some errors.

So the idea would be that any 4xx or 5xx error is a transport issue. Your application should look for execution errors in the GraphQL response and decide how to handle them.

@ntaranov
Copy link
Author

ntaranov commented Aug 18, 2018

Thanks for the response.

First of all, status code is not a REST concept; it is HTTP concept. Because of this, REST vs GraphQL holywar is irrelevant here. Speaking about the middleware we are discussing transport-specific code, there is no contradiction.

500 Any HTTP service should be able to communicate that it has failed miserably and cannot provide a valid GraphQL error, example being GraphQL layer throwing an exception.

401 When auth is done with expiring tokens, existing clients rely on API returning 401.
Please, see how the most popular OAuth2/OIDC Connect client is tested.
https://github.com/IdentityModel/oidc-client-js/blob/4a23f5e80e54802f7fc1d0ca054f3043f5b39b25/samples/Angular/App/src/app/test-api.service.ts#L40

I currently have my own GraphQL middleware implementation that currently has just the code I referenced above incompatible with current graphql-dotnet/server implementation. My project makes use of graphql-dotnet/authorization, and things work like a charm. This is why I want the ability to return business-relevant HTTP status codes upstream.

UPD: Edited to fix MD numbered list auto numeration error.

@ntaranov
Copy link
Author

Found this.
johnrutherford@3516504#diff-6384828a25b56c885c7984f4c7fe2429R115

I think that here should be an extensibility point because old code basically returned 400 for whatever error and new code returns 200 for whatever error. User should be able to decide instead.

@johnrutherford
Copy link
Member

Here's where that change was discussed: #149.

500 errors are already covered. If there is a server configuration issue or an unhandled exception somewhere in the request pipeline, a 500 error will be returned.

I have a PR (#158) to add support for endpoint authorization which will return a 401 or 403 error as appropriate.

@ntaranov
Copy link
Author

Thanks for the useful resources. I followed the links and had a read. The PR #158 looks fine. What I'm speaking about is a different story.

500 errors are already covered.
"I think as long as there are no issues with the transport protocol, the middleware should return a 200 OK and the client must check for Errors in the GraphQL response."

Could you provide an official source of the idea that you need to unconditionally always return 200 even when you provide errors the JSON?

Here are examples from officially recommended graphql node.js implementations.

express-graphql here return 500 for responses with errors.
https://github.com/graphql/express-graphql/blob/41e26f803f4bf6888a4dedf9af99153892d13eb4/src/index.js#L330

Here apollo-server actually return 400 or 500 through some hardcoded logic relevant for their own code comparable to the piece of code I have given as an example what the requested extensibility point might be used for.
https://github.com/apollographql/apollo-server/blob/c4d3a93e7bc8a9ffb1692c50a4f766422f30e665/packages/apollo-server-core/src/runHttpQuery.ts#L328

GraphQL layer wraps business and data access logic. Choosing result code for HTTP services response is can be a business decision. There is no GraphQL/transport parallel universes. HTTP layer can contain it's own business logic deciding what code to return based on the results of GraphQL layer work and this is OK. Another point is that there are pieces of automation that send HTTP requests and cannot parse JSON responses.

@johnrutherford
Copy link
Member

All I'm saying is that I don't think the middleware should interpret errors returned from a partially successful GraphQL execution and set an HTTP status code based on that.

The example you gave from express-graphql is returning a 500 response if the data from the execution result is null, because it indicates a complete execution failure. I think that is appropriate and should probably be added to this project.

The example in the apollo-server is catching an error thrown when creating the user context. That is before the query execution. In this project, if an exception is thrown when creating the user context, the exception will bubble up and result in a 500 error:
https://github.com/graphql-dotnet/server/blob/develop/src/Transports.AspNetCore/GraphQLHttpMiddleware.cs#L77

Neither of these projects set a status code based on the presence of errors in the GraphQL execution result.

@ntaranov
Copy link
Author

I don't think the middleware should interpret errors returned from a partially successful GraphQL execution and set an HTTP status code based on that.

Well, I guess, I can sum up that you do not know any real official sources prescribing returning 200 on errors inside GraphQL no matter what. I just want to point this circumstance clearly because doing so is super-duper opinionated. There is a reason for this: GraphQL is abstracted form the transport layer which means that it does not prescribe anything about it, including returning 200 on errors. Some protocols have result codes, some do not.

As for now I didn't see any arguments for the opinionated design decision other than "I think so".

My examples (would you provide you own examples?) was about demonstrating that others approach this "always 200" problem differently.

First example: is completely valid, returning 500 on magic rule "null is error" is a way to manifest some business requirement and is in fact based on graphql execution result.
Second example: Setting user context has little to do with GraphQL but it has something to do with the fact that we are speaking about an HTTP service which, as you know, might be expected to return 500 under specific conditions.
If you want more graphql-processing (which is a different thing form graphql, right?) related example, have a look at this code, found immediately with Ctrl+F "server error" in the same file a dozen lines below the previous
https://github.com/apollographql/apollo-server/blob/c4d3a93e7bc8a9ffb1692c50a4f766422f30e665/packages/apollo-server-core/src/runHttpQuery.ts#L433

You arguments so far were:

  1. Using the HTTP status code to communicate application status is a REST concept (it's not necessarily)
  2. So the idea would be that any 4xx or 5xx error is a transport issue (it's not necessarily)
  3. 500 errors are already covered. (they are "covered" with phrase "I think as long as there are no issues with the transport protocol, the middleware should return a 200 OK and the client must check for Errors in the GraphQL response.", being a pure opinion)
  4. I have a PR (Add support for endpoint authorization #158) to add support for endpoint authorization which will return a 401 or 403 error as appropriate. (it's largely about a different thing)
    No other real arguments/proofs.

My arguments thus far are:

  1. Mandatory 200 for graphql-related errors idea needs solid proofs
    • 2 examples returning codes based on execution results
    • GraphQL being decoupled from transport, official sources dictate very little about HTTP
    • one who make a claim is justifying it, of course
  2. We need to return 500 for internal server errors.
    • 2 examples returning 500s for this
  3. We should add support for 401 because it's a common practice.
    • oidc-client tests which test checking for 401 to demonstrate usual relevant code
  4. It is okay to return different codes based on graphql execution result.
    • a real case in a real project
    • people other than me being interested in it
    • 2 examples where 2 different sets of developers approach the problem different both from each other and from the "always 200" approach
    • I'll extend this a bit: GraphQL layer basically wraps the whole application, any business layer decisions and DB events have to surface through it. For me having DB down very well might be 500.

@ntaranov
Copy link
Author

  1. Some pieces of automation do not parse JSON

@ntaranov
Copy link
Author

About apollo-server, I'll provide TL;DR of what they actually do.

  1. They support calling error an HttpQueryError which guarantees that status set for the error will be returned to client.
  2. They make whatever error which pops out of query execution not being HttpQueryError a 500.
  3. They pass errors with promises across layers.
  4. In framework-specific layer they return error with it's HTTP status code.

So, they do support a way to set an arbitrary HTTP status code from inner layers and this is a way of GraphQL execution determining the status code.

@johnrutherford
Copy link
Member

Sorry I don’t have time for a more detailed and thorough response. I never meant to imply that the middleware should always return 200 OK no matter what. Just that the middleware should not be responsible for interpreting errors on the execution result. These other projects are not doing that.

I also didn’t mean to give the impression that I think this project is perfect as-is. I was just trying to work out the best solution for what you are trying to accomplish.

Thanks for pointing out that the apollo-server project catches HttpQueryError on execution and sets the HTTP error code based on that. I hadn’t noticed that when I reviewed the code.

I think a similar approach could be easily implemented in this project with the new ThrowOnUnhandledException property on the ExecutionOptions. Feel free to open a PR.

@ntaranov
Copy link
Author

I personally like the idea of throwing something like HttpQueryError, but I also noticed that graphql-dotnet DocumentExecutor basically swallows all exceptions it can and adds them as GraphQL errors. IMO opinion it should not do catch-all on the first place. This is why other implementations don't always analyze what would be their ExecutionResult analog; they just let exceptions pop.

Another interesting observation, I hoped you could make from my examples that different implementations approach error handling/status codes problem differently from each other and sometimes with rather awkward solutions like checking if result is empty. graphql-dotnet/server used to have the same code changing over time is not a coincidence as well. As I see it, the problem lies in developers not really understanding that they are trying to provide a general solution for something entirely depending on business requirements. For different systems it might be OK respectively:

  • to return 200 if the system does not die in agony
  • to return 400 if errors are not empty
  • to return 500 in specific cases
  • to return 500 in "all the other" errors cases
  • some people actually implement sign-in using graphql, as you can imagine this requires returning 401 on query/mutation basis and not on endpoint basis.
    This is why I'm speaking about being able to provide custom behavior for this.

IMO both somehow extending ExecutionResult and letting some exceptions go form DocumentExecutor. Maybe, the least painful way to achieve what I need is to provide IGraphQLExecutor result-specific counterpart that would be injectable, would have a good local default (current behavior). This would make it possible to avoid making breaking changes to DocumentExecutor.

About doing something that others don't do: I skimmed through 2 major JS and a major Java implementation, truth is that they are not that much more mature, you can quickly understand this from looking on their test suites and issue trackers. So it is OK to do something differently.

I'm personally interested in a way of extending control over result codes. Maybe, others have something to say about all this as well.
@pekkah, @joemcbride

@johnrutherford
Copy link
Member

As I mentioned, there is a new ThrowOnUnhandledException property on the ExecutionOptions. This will make the executor re-throw an exception from query execution. (I haven't looked to see what release it's in.)
https://github.com/graphql-dotnet/graphql-dotnet/blob/master/src/GraphQL/Execution/DocumentExecuter.cs#L237

I like the HttpQueryException approach and it could be very easily be implemented. I think that would at least be a good start.

The apollo-server repository has almost 5k stars and over 500 forks. It definitely seems to be the most widely adopted graphql server, so I don't think taking some inspiration from them is a bad idea. But, I agree, it doesn't mean we can't do something different.

@ntaranov
Copy link
Author

ntaranov commented Aug 20, 2018

I noted that DocumentExecutor ad-hoc code managing threads (btw, I didn't find "UnobservedTaskException" in the graphql-dotnet organization, not sure if it's alright).
As you know, await is actually syntactic sugar around Tasks and Tasks wrap AggregateException around exceptions, the behavior is similar to promises in JS. There are tricky cases with multiple exceptions. You can see that error handling code in apollo-server is not really trivial because of this. Implementing robust "swallow or rethrow" HttpQueryException approach might not be that easy. The approach itself is good.

Anyway, I need both 500 and other status codes, and not only technical, but also business requirements-related. This can be easily implemented now with 10-20 lines of non-test code and zero risks for graphql-dotnet library.

@johnrutherford
Copy link
Member

johnrutherford commented Aug 20, 2018

Not sure what you're referring to about the DocumentExecutor managing threads. Could you link to the code you're referring to?

I don't see why it would be appropriate to look at unobserved task exceptions in one of the GraphQL class libraries. That should be handled by consuming code. There is one open issue related to unobserved tasks (graphql-dotnet/graphql-dotnet#781), but once that is fixed, it shouldn't really be a concern.

Yes, when a Task is awaited, only the first exception in the AggregateException will be thrown. However, I don't think this should be a problem with implementing the HttpQueryException approach. We would need to pick a single status code anyway. Even if multiple nodes threw an HttpQueryException, it would probably be for the same issue and with the same status code. If another type of exception is thrown first in addition to the HttpQueryException, then, yes, we would not see the HttpQueryException and set the specific status code, but that also indicates that there is another, unexpected error and I think the 500 status code would be most appropriate anyway.

So are you going to do a PR for the HttpQueryException approach? Or are you proposing something else?

@ntaranov
Copy link
Author

I have nothing against HttpQueryException, but I'm obviously interested in something else.

@pekkah
Copy link
Collaborator

pekkah commented Oct 31, 2018

It's not that difficult to execute the query by yourself in case you need custom behavior for error handling. Closing this for now.

@pekkah pekkah closed this as completed Oct 31, 2018
@ntaranov
Copy link
Author

I agree, in fact this is what I do. Also nobody else expressed interest in this. @johnrutherford , thanks for the discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants