Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebFlux controller successfully completes response if Flux emits error [SPR-16051] #20600

Closed
spring-issuemaster opened this issue Oct 6, 2017 · 17 comments

Comments

Projects
None yet
2 participants
@spring-issuemaster
Copy link
Collaborator

commented Oct 6, 2017

Ivan Pavlukhin opened SPR-16051 and commented

I observed such behaviour with reactor-netty HttpServer and ReactorHttpHandlerAdapter.
Actually if error is first emitted event then will controller return HTTP error code. But if some event was emitted before error, then response will complete successfully (ending with zero chunk).
See reproducing gist https://gist.github.com/TanyaGaleyev/83ad550cf7221ef84a3bfe6df26eec3c
I suppose that proper behaviour here is to close connection when error occurs without writing end chunk.


Affects: 5.0.2

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 30, 2017

Ivan Pavlukhin commented

No comments so far? This problem could lead to unexpected consequences in my opinion.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 30, 2017

Rossen Stoyanchev commented

Sorry for the delay, somehow I missed this one.

We intentionally delay committing the response until the first item, because there is still an opportunity to change to an error status. For example an @ExceptionHandler would be invoked if the publisher fails before producing any items. Once writing begins, the response is committed. When the body writing starts, the status/headers cannot be changed. So it's not that the response completes successfully. It's that all we can do is stop.

I am not sure what you mean about the end chunk. What is the actual issue here?

What should happen next is you should see an error message about the error in the server logs. However I did find an issue in Reactor Netty which simply sends an onComplete, and we complete quietly, rather than propagating the error signal which would log the error message. If you switch to another server, Tomcat/Jetty/Undertow you will see the error message.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 1, 2017

Ivan Pavlukhin commented

Well. I will describe an actual story. We have a controller serving files from server's file system. If you wonder why we do not use simply FileSystemResource I will say that we started integrating WebFlux at point when controllers returning FileSystemResource had not really worked. But it is only a real story, we can imagine other similar use cases. So, we have a controller serving files from server's file system. Our file reading code is something like:

public static Flux<ByteBuffer> read(ReadableByteChannel channel) {
  return Flux.generate(
      () -> channel,
      (ch, sink) -> {
        try {
          ByteBuffer byteBuffer = ByteBuffer.allocate(1024);
          if (ch.read(byteBuffer) >= 0) {
            byteBuffer.flip();
            sink.next(byteBuffer);
          } else {
            sink.complete();
          }
        } catch (Throwable ex) {
          sink.error(ex);
        }
        return channel;
      }
  );
}

Then after mapping to Flux<DataBuffer> it is used as return value in controller method. During testing case when ReadableByteChannel.read raises an exception it was found that client can receive partial file content gracefully. And client may reasonably assume that it is full content.
Actually, I can imagine similar problems with reading content from DB or even generating in memory (OOM can occur for example). But client will not suspect that the response was not actually full. In my mind proper thing in such case is to stop writing response (currently end chunk is written) and close underlying connection (as far as I remember such behaviour is described in RFC). What do you think?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 1, 2017

Rossen Stoyanchev commented

Okay I understand now.

To the best of my knowledge this is not something we can influence, i.e. what the server does with the underlying connection. You can try asking in reactor-netty for more details especially if you find that RFC text. Technically the server is fully aware of the error (well, after #231 is fixed) since it is left to propagate all the way. However I suspect the answer has to do with persistent connections. For what it's worth I verified the same behavior with other servers (Tomcat, Jetty, and Undertow). The only difference is that Tomcat produced a response with a Content-Length but still only partial data.

BTW you should return FileSystemResource because we support a much more efficient, zero-copy file transfer on Netty. This is in ResourceHttpMessageWriter.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 1, 2017

Rossen Stoyanchev commented

Resolving for now since I don't think there is anything we can do from our end.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 4, 2017

Ivan Pavlukhin commented

Hi Rossen.

Thanks for suggesting zero-copy approach.

I walked around source code (I used version 5.0.2.RELEASE) with Jetty running under the hood this time and found some odd things. In HttpWebHandlerAdapter we have:

@Override
public Mono<Void> handle(ServerHttpRequest request, ServerHttpResponse response) {
  ServerWebExchange exchange = createExchange(request, response);
  return getDelegate().handle(exchange)
      .onErrorResume(ex -> {
        response.setStatusCode(HttpStatus.INTERNAL_SERVER_ERROR);
        logHandleFailure(ex);
        return Mono.empty();
      })
      .then(Mono.defer(response::setComplete));
}

Stange thing here is onErrorResume. Look at function passed into. If response is already committed (which means first body chunk was sent) response.setStatusCode will have no effect. And after that empty Mono is returned. So error is invisible for underlying server.
I tried to fix the situation changing aforementioned onErrorResume callback to return Mono.error. But it did not help. Because we subscribe to returned from mono HttpWebHandlerAdapter.handle in ServletHttpHandlerAdapter.service. And here is what we have in subscriber:

@Override
public void onError(Throwable ex) {
  runIfAsyncNotComplete(this.asyncContext, () -> {
    logger.error("Could not complete request", ex);
    HttpServletResponse response = (HttpServletResponse) this.asyncContext.getResponse();
    response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
    this.asyncContext.complete();
  });
}

And similar situation again response.setStatus runs silently and then response is successfully completed with asyncContext.complete(). So, server is unaware of underlying error.
I found an approach how to propagate an error with async servlets, but I am not sure that it is good enough. If you need my assistance, please ask. Also, I will try to do similar checks with reactor-netty server.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 4, 2017

Rossen Stoyanchev commented

Yes you're right the error signal does stop in HttpWebHandlerAdapter. I misspoke on that.

That said keep in mind the way ServerHttpResponse works. The writeWith method takes a "write" Publisher<DataBuffer> and returns a "result" Publisher<Void>. In the case of Reactor Netty, the write publisher is given straight to Reactor Netty, which receives the error signal and propagates it to the result publisher (once #231 is fixed), which then reaches HttpWebHandlerAdapter. So Reactor Netty at least is definitely aware of the error. What I don't know, and would have to confirm, is whether it treats such a write error differently from a "handler result" error that's allowed to propagate out of HttpHandler.

For the other servers, we use our own Reactive Streams bridge (so onError is processed in AbstractListenerProcessor). If the error is an IOError the server will know about it but if it comes from the write publisher, I'm not sure what we can do, besides call AsyncContext#complete(). Possibly dispatch back into a container thread and then raise an exception, which the container would turn into an AsyncListener.onError notification from which we would have to leave unhandled. We still have to confirm if this is worth the trouble, i.e. will Servlet containers do anything differently and close the connection.

Note also in this particular use case, i.e. sending a file, setting the Content-Length ahead of time is feasible and that should tip off the client that not all content was received. Returning a Resource will ensure that.

In general however, based on your comments, I think we need to double check the current behavior in case of an error when the response is committed and see if there is anything else we can do. I am going to re-open.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 6, 2017

Ivan Pavlukhin commented

I experimented a bit with failing async servlet request after it is already committed. And I achieved expected result with calling asyncContext.dispatch() in onError callback and throwing an exception upon receiving redispatched request in service method. Jetty in that case closes the connection. It worth noticing that I changed code only in ServletHttpHandlerAdapter.

Regarding reactor-netty. I spent some time debugging same issue with reactor-netty server. And I have doubts that ReactorNetty.OutboundThen.then() causes a problem. I see that ChannelSendOperator does not propagate errors to it's subscriber. As I understood (it was very damn hard for me to understand anything) in order to propagate an error ChannelSendOperator.WriteBarrier should call WriteCompletionBarrier.onError explicitly. I can see that onError is called for WriteBarrier but not for WriteCompletionBarrier. Sorry If I got everything wrong.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 6, 2017

Rossen Stoyanchev commented

Did you see this ticket reactor/reactor-netty#231?

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 7, 2017

Ivan Pavlukhin commented

Yes I did. I will describe how I understand the story.

Publisher produced by controller method is passed to following method:

public final Mono<Void> writeWith(Publisher<? extends DataBuffer> body) {
  return new ChannelSendOperator<>(body,
      writePublisher -> doCommit(() -> writeWithInternal(writePublisher)));
}

You say that publisher, returned from writeWithInternal method does not propagate errors downstream.

@Override
protected Mono<Void> writeWithInternal(Publisher<? extends DataBuffer> publisher) {
  Publisher<ByteBuf> body = toByteBufs(publisher);
  return this.response.send(body).then();
}

But what happens inside ChannelSendOperator? When someone subscribes on it (in our case it is a downstream that is supposed to handle error) it performs this.source.subscribe(new WriteBarrier(actual)) where source is request body publisher. So, WriteBarrier dispatches original events. WriteBarrier is publisher itself it publishes something after applying writeFunction to it. And it's subscriber is WriteCompletionBarrier. Subscriptions takes place in WriteBarrier.onNext or WriteBarrier.onComplete. As mentioned before WriteBarrier dispatches original events, and when request body publisher produces error (notice that error happened not during write which is responsibility of writeWithInternal) it is passed directly to writeSubscriber which is ChannelOperationsHandler.PublisherSender and has no knowledge about writeCompletionBarrier and completionSubscriber which can propagate an error downstream.

In my mind it seems that calling writeCompletionBarrier.onError in WriteBarrier.onError could do the trick (if it does not break anything else).

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 7, 2017

Ivan Pavlukhin commented

But perhaps I am not right and ChannelOperationsHandler.PublisherSender.onError (from ractor-netty) is more appropriate place to propagate an error.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 7, 2017

Rossen Stoyanchev commented

One way or another we need to pass the error signal to the writeSubscriber (i.e. server). That will propagate the error to the completionSubscriber (effectively the remaining request processing chain). We don't want to shortcircuit this process and also the Reactive Streams spec forbids making any more calls on a Subscriber once it's complete (rule 1.7).

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 8, 2017

Ivan Pavlukhin commented

Sounds reasonable. During debugging I got an impression that ChannelOperationsHandler.PublisherSender is responsible for network handling but other components could be responsible for controller events handling. But it is just an impression, never mind.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 11, 2017

Rossen Stoyanchev commented

Scheduling tentatively for 5.0.3 (as I had intended to) while we investigate the options.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 12, 2017

Rossen Stoyanchev commented

I have committed a fix and verified it works as expected for Tomcat, Jetty, and Undertow. For Reactor Netty I need to wait until issue #231 is resolved to confirm the behavior.

I will resolve the ticket for the time being since we now do make sure that any late error (after the response is committed) is propagated to the server. That's as much as we can do from our end. However I will test again when #231 is fixed, which should be soon, and comment here.

Note also that I created an additional issue #237 since the WebClient currently does not realize the connection was closed before the data is fully read.

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 13, 2017

Ivan Pavlukhin commented

Thanks!

@spring-issuemaster

This comment has been minimized.

Copy link
Collaborator Author

commented Dec 13, 2017

Rossen Stoyanchev commented

Oops forgot this, but once #213 is fixed there is also issue #101 (scheduled for 0.8).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.