Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better resolve race between AsyncListener.onTimeout and AsyncContext.dispatch #6227

Closed
gregw opened this issue Apr 30, 2021 · 7 comments · Fixed by #6228 or #6231
Closed

Better resolve race between AsyncListener.onTimeout and AsyncContext.dispatch #6227

gregw opened this issue Apr 30, 2021 · 7 comments · Fixed by #6228 or #6231
Assignees

Comments

@gregw
Copy link
Contributor

@gregw gregw commented Apr 30, 2021

There is an intrinsic race between an application calling AsyncContext.dispatch and the container calling AsyncListener.onTimeout.
Currently when an async timeout expires, the HttpChannelState is atomically changed from ASYNC to EXPIRE, so that any subsequent call by the application to AsyncContext.dispatch will receive an ISE.

However, once the container starts calling any AsyncListener.onTimeout listeners, it switches the state to EXPIRING as it is allowed for those listeners to call AsyncContext.dispatch as part of their valid handling. However, at this point the current implementation is unable to distinguish between a valid call to dispatch from a listener vs a late call to dispatch from the normal application logic. So the dispatch is allowed, which can result in either the container throwing an ISE when is doesn't see EXPIRING state just before calling onTimeout, or the handling within an onTimeout listener may get an ISE as the request has already been dispatched.

This results at least in a noisy logged exception. At worst, the application dispatch may be used even if onTimeout is called. Better handling would be that once EXPIRING, only the same thread is allowed to call dispatch (or complete for that matter) so that only a listener will be able to call them and any late application calls are rejected with an ISE.

Whilst this race has been seen in the wild, it will be very difficult to reproduce in a unit test as there is no suitable extension point.

@gregw gregw added this to To do in Jetty 9.4.41 via automation Apr 30, 2021
@gregw gregw self-assigned this Apr 30, 2021
gregw added a commit that referenced this issue Apr 30, 2021
Only allow the thread calling onTimeout to call dispatch and complete once timeout has expired.

Signed-off-by: Greg Wilkins <gregw@webtide.com>
@joakime
Copy link
Member

@joakime joakime commented Apr 30, 2021

I was able to reproduce this with a simple server and loadgenerator client setup.
Every run triggers the fault at least 4 times, and as many as 19 times on some runs.

Logs (partial DEBUG, full DEBUG doesn't trigger it for me) against 9.4.40 release -
server-java-11.0.10-20210430T163849.log.gz

The QoSFilter$QoSAsyncListener.onTimeout starts the issue with this stacktrace ...

16:38:59.906 [server-146] WARN  o.e.j.d.load.server.CustomQoSFilter - onTimeout() ERROR
java.lang.IllegalStateException: s=HANDLING rs=DISPATCH os=OPEN is=IDLE awp=false se=true i=false al=1
	at org.eclipse.jetty.server.HttpChannelState.complete(HttpChannelState.java:703)
	at org.eclipse.jetty.server.AsyncContextState.complete(AsyncContextState.java:72)
	at org.eclipse.jetty.demo.load.server.CustomQoSFilter$QoSAsyncListener.onTimeout(CustomQoSFilter.java:343)
	at org.eclipse.jetty.server.HttpChannelState$2.run(HttpChannelState.java:662)
	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1520)
	at org.eclipse.jetty.server.HttpChannelState.runInContext(HttpChannelState.java:1206)
	at org.eclipse.jetty.server.HttpChannelState.onTimeout(HttpChannelState.java:679)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:401)
	at org.eclipse.jetty.server.HttpChannel.run(HttpChannel.java:340)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
	at java.base/java.lang.Thread.run(Thread.java:834)

From there, the HttpChannelState.onTimeout creates more WARN messages.

I'll try your branch next.

gregw added a commit that referenced this issue Apr 30, 2021
squelch exception

Signed-off-by: Greg Wilkins <gregw@webtide.com>
@onintwo2
Copy link

@onintwo2 onintwo2 commented May 3, 2021

So is this patch ready?

@joakime
Copy link
Member

@joakime joakime commented May 3, 2021

This has exposed a deficiency in the Servlet API/Spec.

A new thread on the servlet-dev mailing list has been started about it ...
https://www.eclipse.org/lists/servlet-dev/msg00370.html

gregw added a commit that referenced this issue May 4, 2021
Fix #6227 Async timeout dispatch race
Only allow the thread calling onTimeout to call dispatch and complete once timeout has expired.
Signed-off-by: Greg Wilkins <gregw@webtide.com>
gregw added a commit that referenced this issue May 4, 2021
Fix #6227 Async timeout dispatch race
Only allow the thread calling onTimeout to call dispatch and complete once timeout has expired.
Signed-off-by: Greg Wilkins <gregw@webtide.com>
@sbordet sbordet linked a pull request that will close this issue May 5, 2021
@sbordet sbordet linked a pull request that will close this issue May 5, 2021
@sbordet
Copy link
Contributor

@sbordet sbordet commented May 5, 2021

@onintwo2 the patch for 9.4.x has been merged. If you can use 9.4.41-SNAPSHOT or build it yourself, you will be able to test it.

@onintwo2
Copy link

@onintwo2 onintwo2 commented May 5, 2021

@sbordet
Copy link
Contributor

@sbordet sbordet commented May 5, 2021

@gregw gregw closed this in #6231 May 5, 2021
Jetty 9.4.41 automation moved this from To do to Done May 5, 2021
gregw added a commit that referenced this issue May 5, 2021
Fix #6227 Async timeout dispatch race
Only allow the thread calling onTimeout to call dispatch and complete once timeout has expired.
Signed-off-by: Greg Wilkins <gregw@webtide.com>
@salberch
Copy link

@salberch salberch commented May 6, 2021

Hi, We are using maven, can you publish the fix un a mvn repository?
Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Linked pull requests

Successfully merging a pull request may close this issue.

5 participants