Skip to content

Conversation

@eyalkoren
Copy link
Contributor

@eyalkoren eyalkoren commented Dec 16, 2018

General

Investigation started due to #371
A race condition introduced due to implementation failure of some fine-grained definitions of the HttpServletRequest.startAsync API in some application servers requires a shift in how we instrument this. The full explanation is below, but the short version is- AsyncContext.isAsyncStarted is not fully reliable when invoked at the end of service/doFilter in which HttpServletRequest.startAsync was used, due to race condition between both threads handling the request.
Basic rationale behind the suggested solution: if HttpServletRequest.startAsync is used, then the transaction data should not be updated by the service/doFilter instrumentation route, but only through the AsyncListener we register. This should hold for all AsyncContext APIs and regardless of multiple dispatches. Furthermore, we assume that if an AsyncListener was registered- it must be invoked when response gets committed, so we rely on it to clear states.

Detailed explanation

The javadoc for AsyncContext.isAsyncStarted says that it will return false after AsyncContext.dispatch or AsyncContext.complete are invoked. Looking further into the dispatch and complete javadoc shows that this is a bit more complicated, as it says:

If this method is called before the container-initiated dispatch that called startAsync has returned to the container, the dispatch operation will be delayed until after the container-initiated dispatch has returned to the container.

This means that querying AsyncContext.isAsyncStarted should be safe at the end of service/doFilter, as the container should wait for those to return before actually invoking dispatch/complete. When testing multiple WildFly versions, I saw that while this is indeed the case with dispatch, it is not so in the case of complete, at least in WildFly versions
8 through 11. Specifically, the AsyncContext offers a start API for conveniently invoke logic on a separate thread, where complete is likely to be called, making it a common use case.

@eyalkoren eyalkoren requested a review from felixbarny December 16, 2018 15:53
@codecov-io
Copy link

codecov-io commented Dec 16, 2018

Codecov Report

Merging #385 into master will decrease coverage by 1%.
The diff coverage is 11.76%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #385      +/-   ##
============================================
- Coverage     72.98%   71.97%   -1.01%     
+ Complexity     1248     1203      -45     
============================================
  Files           132      132              
  Lines          4738     4564     -174     
  Branches        515      468      -47     
============================================
- Hits           3458     3285     -173     
+ Misses         1068     1066       -2     
- Partials        212      213       +1
Impacted Files Coverage Δ Complexity Δ
...lastic/apm/agent/servlet/AsyncInstrumentation.java 75% <0%> (+9.78%) 7 <0> (ø) ⬇️
...tic/apm/agent/servlet/helper/ApmAsyncListener.java 0% <0%> (ø) 0 <0> (ø) ⬇️
...co/elastic/apm/agent/servlet/ServletApiAdvice.java 7.57% <0%> (-1.25%) 2 <0> (ø)
...ent/servlet/helper/StartAsyncAdviceHelperImpl.java 0% <0%> (ø) 0 <0> (ø) ⬇️
...ic/apm/agent/servlet/ServletTransactionHelper.java 71.96% <100%> (+0.43%) 42 <1> (+1) ⬆️
.../co/elastic/apm/agent/impl/payload/SystemInfo.java 55.55% <0%> (-23.02%) 6% <0%> (-18%)
.../apm/agent/report/serialize/DslJsonSerializer.java 81.76% <0%> (-4.84%) 109% <0%> (-28%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f2a89b3...576345b. Read the comment docs.

@felixbarny
Copy link
Member

//cc @adriancole

@codefromthecrypt
Copy link
Contributor

codefromthecrypt commented Dec 18, 2018 via email

Copy link
Member

@felixbarny felixbarny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still the potential race condition for the completed check in AsyncListener. Other than that LGTM.

// (see class-level Javadoc)
private void endTransaction(AsyncEvent event) {
// To ensure transaction is ended only by a single event
if(EVENT_COUNTER_UPDATER.getAndIncrement(this) > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if(EVENT_COUNTER_UPDATER.getAndIncrement(this) > 0) {
if (EVENT_COUNTER_UPDATER.getAndIncrement(this) > 0) {

@eyalkoren eyalkoren merged commit bbad0d7 into elastic:master Dec 18, 2018
@eyalkoren eyalkoren deleted the async-servlet-instr-change branch December 18, 2018 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants