Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize RoutingContext by removing summary generation and othe… #2295

Merged
merged 2 commits into from Dec 4, 2019

Conversation

anuraaga
Copy link
Collaborator

@anuraaga anuraaga commented Dec 4, 2019

…r small tweaks.

  • Don't generate a summary array for use in hashing and do the computation directly to reduce allocations
  • Make accept headers extraction non-lazy, it was being eagerly evaluated during summary extraction in the old code anyways. It isn't really possible to use routing cache with lazy attributes anyways
  • Don't use forEach due to extra Stream allocation and lambda allocation during extraction (current forEach holds a reference to the builder)
  • Don't allocate list builder when getAll headers and the header isn't present

After

Benchmark                             Mode  Cnt         Score        Error  Units
RoutersBenchmark.exactMatch          thrpt    5  12921841.803 ▒ 173790.211  ops/s
RoutersBenchmark.exactMatch_wrapped  thrpt    5  12047216.990 ▒ 351620.117  ops/s

Before (wrapped is slower due to double allocation of summary)

Benchmark                             Mode  Cnt        Score         Error  Units
RoutersBenchmark.exactMatch          thrpt    5  7817340.954 ▒  926608.528  ops/s
RoutersBenchmark.exactMatch_wrapped  thrpt    5  6175815.558 ▒ 276154.231  ops/s

Copy link
Collaborator

@trustin trustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Niiiice 👍

@trustin trustin added this to the 0.97.0 milestone Dec 4, 2019
@anuraaga
Copy link
Collaborator Author

anuraaga commented Dec 4, 2019

For reference, I tend to look hard at routing code when trying to squeeze out performance since it's a lot of useful functionality, but not needed in RPC-only frameworks like gRPC upstream. So to be able to compete in performance with RPC frameworks, we need to make sure it's as fast as possible - this means having the router cache and making sure construction of the routing context is as lightweight as possible. Hopefully we find more ways to reduce this extra processing even further :)

@codecov
Copy link

codecov bot commented Dec 4, 2019

Codecov Report

Merging #2295 into master will decrease coverage by 0.1%.
The diff coverage is 63.46%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2295      +/-   ##
============================================
- Coverage     73.82%   73.71%   -0.11%     
+ Complexity     9934     9919      -15     
============================================
  Files           871      871              
  Lines         38024    38041      +17     
  Branches       4663     4671       +8     
============================================
- Hits          28070    28041      -29     
- Misses         7572     7621      +49     
+ Partials       2382     2379       -3
Impacted Files Coverage Δ Complexity Δ
...va/com/linecorp/armeria/server/RoutingContext.java 100% <ø> (ø) 1 <0> (ø) ⬇️
...a/com/linecorp/armeria/common/HttpHeadersBase.java 72.16% <100%> (+0.1%) 153 <0> (+1) ⬆️
...linecorp/armeria/server/DefaultRoutingContext.java 77.1% <60%> (-18.28%) 28 <7> (ø)
...linecorp/armeria/server/RoutingContextWrapper.java 58.82% <66.66%> (+3.82%) 8 <2> (-1) ⬇️
...n/java/com/linecorp/armeria/server/HttpServer.java 40% <0%> (-20%) 2% <0%> (-1%)
...om/linecorp/armeria/client/HttpSessionHandler.java 61.01% <0%> (-11.02%) 29% <0%> (-4%)
...inecorp/armeria/client/AbstractEventLoopState.java 92.3% <0%> (-7.7%) 6% <0%> (-1%)
...corp/armeria/client/DefaultEventLoopScheduler.java 79.27% <0%> (-7.21%) 30% <0%> (-3%)
...com/linecorp/armeria/client/OneEventLoopState.java 66.66% <0%> (-4.77%) 6% <0%> (-1%)
...com/linecorp/armeria/server/saml/SamlEndpoint.java 62.5% <0%> (-2.5%) 10% <0%> (-1%)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9e80207...41d6b55. Read the comment docs.

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@anuraaga
Copy link
Collaborator Author

anuraaga commented Dec 4, 2019

@trustin your other PR inspired me to use do/while here :D

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 👍

@trustin trustin changed the title Optimize RoutingContext by removing summary generation and other smal… Optimize RoutingContext by removing summary generation and othe… Dec 4, 2019
@trustin trustin merged commit 53ab31d into line:master Dec 4, 2019
fmguerreiro pushed a commit to fmguerreiro/armeria that referenced this pull request Sep 19, 2020
…e#2295)

…r small tweaks.

- Don't generate a summary array for use in hashing and do the computation directly to reduce allocations
- Make accept headers extraction non-lazy, it was being eagerly evaluated during summary extraction in the old code anyways. It isn't really possible to use routing cache with lazy attributes anyways
- Don't use `forEach` due to extra `Stream` allocation and lambda allocation during extraction (current `forEach` holds a reference to the builder)
- Don't allocate list builder when `getAll` headers and the header isn't present

After
```
Benchmark                             Mode  Cnt         Score        Error  Units
RoutersBenchmark.exactMatch          thrpt    5  12921841.803 ▒ 173790.211  ops/s
RoutersBenchmark.exactMatch_wrapped  thrpt    5  12047216.990 ▒ 351620.117  ops/s
```

Before (wrapped is slower due to double allocation of summary)
```
Benchmark                             Mode  Cnt        Score         Error  Units
RoutersBenchmark.exactMatch          thrpt    5  7817340.954 ▒  926608.528  ops/s
RoutersBenchmark.exactMatch_wrapped  thrpt    5  6175815.558 ▒ 276154.231  ops/s
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants