Brad Rust (Migrated from SEC-2005) said:
Spring Security should persist the SecurityContext immediately if any of the following methods are invoked since they commit the response and can hand control back to the client immediately. This is similar to SEC-398.
h1. Original description
During stress/loadtesting of our web application we are experiencing some severe-error issues regarding the freshness of a spring-security security context.
The referenced forum post contains a bit more detail but here is a summary.
A single loadtest client (during high tomcat-7/server load) can execute a "login" establishing a security context and a subsequent API call requiring a security context will fail. The loadtest client (grinder/jython single threaded (per client)) uses a CommonsHttpInvokerRequestExecutor sending and receiving hessian content over a "typical" http request/response.
The login request is handled by our login service bean which establishes a spring security context. That security context is persisted in spring-security scope in a "finally" block of SecurityContextPersistenceFilter.doFilter.
However, the underlying components (take your pick here... tomcat-7 Connector thread, java input/output streams, etc...) have sent the http response content-length number of bytes across the wire and therefore completed the HTTP request and response pair.
The completion of the HTTP request is realized by the loadtest client (and any client I suppose) and moves on to the next HTTP request expecting the proper cookie handling and security context to be applied to the next request.
This next HTTP request is accepted by the appserver and processed by spring and further spring-security before the finally block on the previous request has executed. Therefore the spring security context is not available in this subsequent request.
If I place a client-side delay (100ms or so), that is typically enough time to ensure that the complete spring stack has executed and the spring security context is valid.
We also have this semantically similar issue during "logout". Prior to logout, we make a call to our server that requires a security context (the last API call of our loadtest workflow). This call appears "complete" to our clients because the http reqeust/response is complete. So we continue our loadtest workflow sending a "logout" command. This clears the spring security context before the finally block of the previous call has executed therefore the completion of the previous request throws an exception because the session is invalid.
Once again, if we place a 100ms delay prior to the logout, no race condition occurs and things work as expected.
Rob Winch said:
The bug states this is within Spring Security 3.1.1 (that is the version selected), but the forums state it is Spring Security 3.0.6. Can you clarify which it is? If you have not tried, can you please try to reproduce with Spring Security 3.1.1?
I have some ideas on what might be causing this and will take a look in the up coming days to asses if this will be in our 3.1.2 release (depends on how much luck I have figuring out what is happening). Any more details (i.e. a sample project) would be greatly appreciated and increase the chances of this getting into 3.1.2.
My suspicion was correct that there may be a similar issue to SEC-398. The problem is that closing or flushing the response commits the response which means control can be immediately returned to the client. Specifically I have identified that if any of the following is invoked then there is chance the response is returned to the client prior to the SecurityContextRepository being updated. This is because all of the following methods commit the response.
I have been able to confirm the issue exists when using Tomcat 7 when either of the following are invoked:
Do you think this might be your issue? Are you invoking flush or close? Either way we will need a fix for the issue I discovered, but I would like to ensure this resolves your issue.
FYI I have temporarily pushed a solution out to my clone. If you would like to try it to see if it fixes things for you here are the links:
Brad Rust said:
I was seeing this race condition with 3.0.x and 3.1.x code.
Thank you for providing a method of testing this code. I downloaded the 3.1.x branch and compiled it (although I had to add commons-codec:commons-codec:1.5 to buildSrc/build.gradle to compile without a dependency issue not resolving commons-codec:1.3).
I know it's not deterministic but I consistently receive failures with the old jars (spring-security-3.1.1 and I have yet to see a failure with new 3.1.2-CI jars that I have compiled.
I am not invoking a flush or close; in my code at least. The http request/response cycle is being handled by the hessian servlet.
I added a comment to Jira. Didn't realize that the more optimal workflow would be to select the "provide feedback for developer".
Anyway, your fixes appear to work for me. I really appreciate your help and engineering.
Thanks for providing feedback and not a problem on the workflow (it is more for me to manage rather than making you worry about it).
h2. Verify .close() is invoked?
I was able to confirm that the HessianServlet will invoke getOutputStream().close() under some circumstances. You will notice that it is wrapped by other implementations of OutputStream depending on various factors the original OutputStream may or may not get closed.
I pushed out an update that contains a printStackTrace in the close methods. Do you mind updating the code using git pull to see if this is the case? You will also need to rebuild it using ./gradlew build (you should not need to clean it). I'd like to make sure this fix makes sense for your issue. If you see something similar to either of the following lines you will know the close method is invoked (note you will also see a stacktrace).
h2. Build problem
I was able to build with no problems even after clearing out my Gradle cache. Are you certain you are using the gradlew?
A note to others and myself. I am able to consistently reproduce this issue if the getOutputStream().close() is invoked by using debug points to control the order of execution. Adding a debug point just before the close() I see that the response is not sent back to the client. I step over the close() and the response is sent back to the client. At that point I make another request and see that the SecurityContext has not been set (since the SecurityContextPersistenceFilter has not been invoked). I then continue the original thread and see that the SecurityContext is then saved.
I'm going to go ahead and mark this resolved since you are fairly convinced it is fixed. If you get a chance it would be great if you could try to verify that .close() is being invoked using the instructions in my previous comment.