Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve WebFlux performance for header management [SPR-17250] #21783

Closed
spring-issuemaster opened this issue Sep 6, 2018 · 1 comment
Closed

Improve WebFlux performance for header management [SPR-17250] #21783

spring-issuemaster opened this issue Sep 6, 2018 · 1 comment
Assignees
Milestone

Comments

@spring-issuemaster
Copy link
Collaborator

@spring-issuemaster spring-issuemaster commented Sep 6, 2018

Brian Clozel opened SPR-17250 and commented

In a recent reactor netty issue, small apps are being used to benchmark different frameworks. While working on that, Violeta Georgieva found a few interesting points we could take a look at, as opportunities for performance improvements and micro-benchmark cases.

Hotspots

The benchmark has underlined a few hostpots in the Spring Framework codebase:

  • org.springframework.util.LinkedCaseInsensitiveMap.convertKey(String) (own time, 5th overall)
  • org.springframework.util.MimeType.<init>(String, String, Map) (own time, 10th overall)

LinkedCaseInsensitiveMap

The convertKey method is used evenly by get and put operations on the map itself. This is calling String.toLowerCase many times, which involves a complex logic (it depends on the Locale) and creates new String instances. In general, copying headers from the native request and looking them up in our data structure is not efficient.

Request Content-Type parsing

HttpHeaders.getContentType is the main offender for get operations, called in several places:

  • org.springframework.web.server.adapter.DefaultServerWebExchange.initFormData(ServerHttpRequest, ServerCodecConfigurer)
  • org.springframework.web.reactive.function.BodyExtractors.contentType(HttpMessage)
  • org.springframework.web.server.adapter.DefaultServerWebExchange.initMultipartData(ServerHttpRequest, ServerCodecConfigurer)
  • org.springframework.http.codec.DecoderHttpMessageReader.getContentType(HttpMessage)
  • and more

In general, we're spending a lot of time on org.springframework.util.MimeType.<init>(String, String, Map) and org.springframework.http.MediaType.parseMediaType(String) when parsing the content type of the request.

Possible solutions

We could use a more optimized data structure behind HttpHeaders. A data structure that uses equalsIgnoreCase or even case insensitive hashcode should be better. Many server implementations don't even use Map implementations, given that requests often contain just a few headers and the cost of setting up a Map outweighs the lookup performance gains.

Benchmarks show that even a new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER) shows better performance.

Other implementations tend to wrap the String instances to store their case insensitive hash internally and use that for comparisons.

For Spring WebFlux, we could even wrap the native HTTP request headers and never copy those values in other data structures. This requires dedicated implementations for each server (Tomcat, Jetty, Reactor Netty and Undertow), but since most of those are using pooled resources for headers, we might improve a lot the GC pressure applied for each HTTP exchange handling.

Microbenchmark results

Here are microbenchmark results on HttpHeaders variants, no other Spring infrastructure piece involved.
The "GET requests" case is reading a few headers from the request; the "POST requests" case is reading multiple times the "Content-Type" and "Content-Length" headers, as Spring does.

The baseline is just performing those read operations from an existing map. the other variants are doing what Spring is supposed to do, treating that map as the native request headers. In the regular HttpHeaders case, we're copying headers into our instance and performing the lookup operations. Other variants are trying the same with other map implementations.

This shows that the copying/allocations are using resources and changing the underlying implementation only slightly improves performance. This means we should instead look into leveraging the native headers directly and avoid copying in the first place.

Benchmark                                                Mode  Cnt         Score        Error  Units

1. GET requests
MyBenchmark.baselineGetRequest                          thrpt    5   890408.644 ±  25882.489  ops/s
1. current HttpHeaders implementation
MyBenchmark.httpHeadersGetRequest                       thrpt    5   632354.108 ±   9054.069  ops/s
1. current implementation, but with a shortcut for linked map lookups
MyBenchmark.fixedCaseInsensitiveHeadersGetRequest       thrpt    5   671559.477 ±  22871.242  ops/s
1. same thing, but backed by a case insensitive TreeMap
MyBenchmark.treeMapHttpHeadersGetRequest                thrpt    5   734720.620 ±  17370.468  ops/s
1. same thing, but backed by a dedicated map implementation
MyBenchmark.headersMapHttpHeadersGetRequest             thrpt    5   695399.245 ±  24594.272  ops/s

1. POST requests
1. baseline benchmark
MyBenchmark.baselinePostRequest                         thrpt    5  8672650.691 ± 135878.532  ops/s
1. current HttpHeaders implementation
MyBenchmark.httpHeadersPostRequest                      thrpt    5   994670.287 ±  19372.729  ops/s
1. current implementation, but with a shortcut for linked map lookups
MyBenchmark.fixedCaseInsensitiveHttpHeadersPostRequest  thrpt    5  1058808.241 ± 131617.683  ops/s
1. same thing, but backed by a case insensitive TreeMap
MyBenchmark.treeMapHttpHeadersPostRequest               thrpt    5  1525237.824 ±  27530.672  ops/s
1. current implementation, but caching the content type once resolved
MyBenchmark.httpHeadersContentTypeCachePostRequest      thrpt    5  1584504.796 ±  65863.946  ops/s
1. caching the content type + backed by a dedicated map implementation
MyBenchmark.optimizedHttpHeadersPostRequest             thrpt    5  2310686.814 ±  93787.128  ops/s

Affects: 5.1 RC2

Reference URL: reactor/reactor-netty#392

Issue Links:

  • #21929 CORS detection for reactive stack on tomcat/netty is broken
  • #22164 HttpHeaders.EMPTY is not immutable
  • #21926 Remove transfer-encoding check in EncoderHttpMessageWriter and related workaround in ReactorServerHttpResponse
  • #22120 Consistent handling of null header values in HttpHeaders

Referenced from: commits 10d5de7, 58b3af9, f12c28e, ce7278a

0 votes, 7 watchers

@spring-issuemaster
Copy link
Collaborator Author

@spring-issuemaster spring-issuemaster commented Oct 11, 2018

Brian Clozel commented

After the analysis done in the issue description, we've decided to optimize on both fronts:

  • cache the "Content-Type" resolution on requests, since that information is asked for several times - this is for both Spring MVC and Spring WebFlux
  • avoid copying headers from/to requests and responses and leverage directly native HTTP headers  data structures - this is Spring WebFlux only

The first optimization directly shows improvements for POST/PUT requests where the server reads the request "Content-Type" in several places: request mapping, decoding, etc. This puts less pressure on MimeType parsing, which is still a hot spot after this change.

The second optimization brings the most benefit for Spring WebFlux applications. All LinkedCaseInsensitiveMap methods (and underlying String.toLowerCase calls) are typically fast (<0.1ms on average), the huge number of calls and memory allocations represent a non-negligible performance impact.

When looking at hot methods with a high overhead profiler (sorted by their "own time", adding the total amount of time spent in them overall), we get the following result: amongst the top 3 hottest methods (all String.toLowerCase related), just one remain and is now at rank 5 (MimeType has still a part in that).

We'll work on further optimizations in separate issues.

@spring-issuemaster spring-issuemaster added this to the 5.1.1 milestone Jan 11, 2019
bclozel added a commit that referenced this issue Jan 17, 2020
As of gh-21783, Spring WebFlux uses a `TomcatHeadersAdapter`
implementation to directly address the native headers used by the
server.

In the case of Tomcat, "Content-Length" and "Content-Type" headers are
processed separately and should not be added to the native headers map.

This commit improves the `HandlerAdapter` implementation for Tomcat and
removes those headers, if previously set in the map. The adapter
already has a section that handles the Tomcat-specific calls for such
headers.

Fixes gh-24361
bclozel added a commit that referenced this issue Jan 17, 2020
As of gh-21783, Spring WebFlux uses a `TomcatHeadersAdapter`
implementation to directly address the native headers used by the
server.

In the case of Tomcat, "Content-Length" and "Content-Type" headers are
processed separately and should not be added to the native headers map.

This commit improves the `HandlerAdapter` implementation for Tomcat and
removes those headers, if previously set in the map. The adapter
already has a section that handles the Tomcat-specific calls for such
headers.

Fixes gh-24387
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.