Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent comma splitting without regex overhead (e.g. in MediaType/MimeType) [SPR-14635] #19202

Closed
spring-projects-issues opened this issue Aug 28, 2016 · 5 comments
Assignees
Milestone

Comments

@spring-projects-issues
Copy link
Collaborator

@spring-projects-issues spring-projects-issues commented Aug 28, 2016

Christoph Dreis opened SPR-14635 and commented

Hey,

I was doing some benchmarks of MediaType.parseMediaTypes() in order to improve it a bit. While doing so, I noticed that the splitting of media types recompiles the regex pattern for each call. I therefore extracted the regex to a static field to reduce the compilation overhead on each call.

Benchmark Mode Cnt Score Error Units
MyBenchmark.parseMediaTypesBefore thrpt 30 945618,154 ± 29701,378 ops/s
MyBenchmark.parseMediaTypesNew thrpt 30 1530048,625 ± 21685,694 ops/s

As you can see there is ~60% performance increase.

Cheers,
Christoph


Affects: 4.3.2

Reference URL: #1147

Issue Links:

  • #18350 Regression: HttpEntityMethodProcessor does not allow other Http methods than defined in the HttpMethod Enum
  • #19075 HeaderContentNegotiationStrategy does not support multiple Accept headers
  • #19326 Reduce String allocations in TransactionAspectSupport.methodIdentification()

Referenced from: commits 03609c1, d8f7347

@spring-projects-issues
Copy link
Collaborator Author

@spring-projects-issues spring-projects-issues commented Aug 29, 2016

Juergen Hoeller commented

I'll repurpose this issue for a broader context: We have several places in the codebase where we're rather naively using String.split just for separating tokens by comma. Since we have optimized methods for this in our own StringUtils, let's rather use those and avoid java.util.regex handling completely.

@spring-projects-issues
Copy link
Collaborator Author

@spring-projects-issues spring-projects-issues commented Aug 29, 2016

Christoph Dreis commented

I don't mind improving more places. ;-)

@spring-projects-issues
Copy link
Collaborator Author

@spring-projects-issues spring-projects-issues commented Aug 29, 2016

Juergen Hoeller commented

I'm currently reviewing all affected places. We actually use three different ways of comma splitting. You gave me a welcome opportunity to revisit our historically grown mix there :-) The hardest part is to identify whether there were special requirements back then or just arbitrary choices.

String.split has an optimization for simply splitting but only for single-char arguments. Our HTTP header splitting does not fall into that category, unfortunately, so will definitely be better off with StringUtils.tokenizeToStringArray or StringUtils.commaDelimitedListToStringArray.

@spring-projects-issues
Copy link
Collaborator Author

@spring-projects-issues spring-projects-issues commented Aug 29, 2016

Christoph Dreis commented

Let me know if I can support you with some further benchmarks. Unfortunately, I could only provide them this evening at the earliest though.

@spring-projects-issues
Copy link
Collaborator Author

@spring-projects-issues spring-projects-issues commented Aug 31, 2016

Christoph Dreis commented

Just for completeness reasons:

List size: 1

Benchmark Mode Cnt Score Error Units
MyBenchmark.parseMediaTypesOld thrpt 30 2106726,652 ± 47965,856 ops/s
MyBenchmark.parseMediaTypesPull thrpt 30 3615490,821 ± 118357,846 ops/s
MyBenchmark.parseMediaTypesFinal thrpt 30 3970264,331 ± 283558,658 ops/s

List size: 2

Benchmark Mode Cnt Score Error Units
MyBenchmark.parseMediaTypesOld thrpt 30 959425,797 ± 6371,881 ops/s
MyBenchmark.parseMediaTypesPull thrpt 30 1681431,764 ± 52599,552 ops/s
MyBenchmark.parseMediaTypesFinal thrpt 30 1986794,253 ± 12728,893 ops/s

List size: 3

Benchmark Mode Cnt Score Error Units
MyBenchmark.parseMediaTypesOld thrpt 30 639583,402 ± 3243,210 ops/s
MyBenchmark.parseMediaTypesPull thrpt 30 1110211,528 ± 46608,522 ops/s
MyBenchmark.parseMediaTypesFinal thrpt 30 1228908,545 ± 92851,346 ops/s
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants