AppendableCharSequence#append(char) uses IOOBE for expansion #4807

Closed
rkapsi opened this issue Feb 1, 2016 · 8 comments
@rkapsi
Member

rkapsi commented Feb 1, 2016

Hello,

I did some profiling in our production environment and noticed that AppendableCharSequence#append(char) catches IndexOutOfBoundsException to expand its internal char[] (in our case called via HTTP header parser, see linked screenshot).

I don't know if it's on purpose (there used to be a pattern around that), but throwing and catching exceptions is in my experience more expensive than something as simple as:

if (pos >= chars.length) {
  expand();
}
chars[pos++] = c;

Additionally, it skews the profiler's output and can distract the developer from real exceptions.
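For illustration, the two expansion strategies might look like this (a minimal, self-contained sketch with a hypothetical CharBuf class, not Netty's actual AppendableCharSequence):

```java
import java.util.Arrays;

// Hypothetical buffer illustrating both expansion strategies.
final class CharBuf {
    private char[] chars = new char[4];
    private int pos;

    // Exception-driven: rely on the JVM's implicit bounds check and
    // expand only when the store actually fails.
    void appendCatching(char c) {
        try {
            chars[pos++] = c;
        } catch (IndexOutOfBoundsException e) {
            pos--; // the index was incremented before the store failed
            chars = Arrays.copyOf(chars, chars.length << 1);
            chars[pos++] = c;
        }
    }

    // Explicit check: one extra comparison per append, no throwing.
    void appendChecked(char c) {
        if (pos == chars.length) {
            chars = Arrays.copyOf(chars, chars.length << 1);
        }
        chars[pos++] = c;
    }

    @Override
    public String toString() {
        return new String(chars, 0, pos);
    }
}
```

The explicit check trades a predictable branch per call for never paying the cost of Throwable#fillInStackTrace().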

Netty 4.1.0-CR2-SNAPSHOT

http://pasteboard.co/1eVXi7nC.png

@Scottmitch
Member

This change was introduced by #4039. Please see the profiling results included in this PR, and also let me know if you are not able to reproduce these results, or if you think the benchmark itself is flawed.

AppendableCharSequence is a hot spot during HTTP profiling, and a few different things were tried (including rewriting portions of HttpObjectDecoder to avoid copying, and also using an AppendableByteSequence), but at the time none provided any substantial gains.

@Scottmitch Scottmitch self-assigned this Feb 2, 2016
@Scottmitch
Member

IIRC I did run this change in isolation to see if it had any impact, but looking back at this PR there were a few other changes introduced too. Let me re-run profiles with this change isolated and see what the impacts are.

@Scottmitch
Member

Now I remember ... AppendableCharSequenceBenchmark does isolate the two different approaches. As requested in #4807 (comment) let me know if you think there is something wrong with this benchmark.

@rkapsi
Member Author

rkapsi commented Feb 3, 2016

Hey Scott,

I'm having trouble getting that benchmark running (I seem to have a newer version of JMH; it doesn't like the absence of a @State annotation, and if I do add one it runs in an infinite loop).

But the first thing that comes to my mind is the potential impact of the stack depth. JMH's call stack is probably shallow. Something along the lines of this (below) may change things:

@Param({ "0", "1", "2", "4", "8", "16" })
private int stackDepth;

@Benchmark
public void appendCatchExceptionAfter() {
    // Recurse on a copy of the @Param value so repeated JMH invocations
    // all run at the configured depth (decrementing the field directly
    // would leave it at 0 after the first invocation).
    appendAtDepth(stackDepth);
}

private void appendAtDepth(int depth) {
    if (depth > 0) {
        appendAtDepth(depth - 1);
        return;
    }

    // ...
}

The IOOBEs are in our particular case most likely caused by Cookie headers. HttpObjectDecoder's initial size for the AppendableCharSequence is 128 and our Cookies are in the realm of 400-500 chars. That yields an almost guaranteed 2 exceptions per (first) request (per connection). That, combined with a rush of traffic and all the objects that need to be allocated in Throwable#fillInStackTrace(), can't be good.
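The arithmetic can be sanity-checked with a tiny helper (a sketch assuming doubling growth and one exception per expansion; ExpansionMath and expansions are hypothetical names, not Netty API):

```java
// Count how many times a buffer that doubles on overflow must expand
// (i.e., how many IOOBEs get thrown) to fit neededChars characters.
final class ExpansionMath {
    static int expansions(int initialSize, int neededChars) {
        int count = 0;
        for (int cap = initialSize; cap < neededChars; cap <<= 1) {
            count++;
        }
        return count;
    }
}
```

Starting from 128, a ~450-char header forces two expansions (128 → 256 → 512); starting from 256, only one.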

@Scottmitch
Member

@rkapsi - You can run the benchmark with Maven:
mvn clean test -DskipTests=false -Dtest=AppendableCharSequenceBenchmark
Let me do some analysis on stack depth. It is also possible that, if you only use the connection for a single request, the exception being thrown will be more costly than the duplicate index checking...

@rkapsi
Member Author

rkapsi commented Feb 5, 2016

@Scottmitch - I've run a few scenarios with that benchmark. Comparing the occasional exception being thrown vs. the constant double index checking, it's very difficult for the latter to make up the difference. I can fabricate scenarios where the two start showing similar op/s, but it feels very contrived. It's also hard to quantify the memory cost of the StackTraceElement objects that need to be created and thrown away.

I think the key takeaway for me is that it'd be great if AppendableCharSequence's initial size were configurable. It's currently hard-coded to 128 (HttpObjectDecoder, line 184). From our use case I can already tell upfront that setting it to 256 would cut the number of exceptions in half, with basically no additional memory cost, as most requests will have one header in the 400-500 character range.

@Scottmitch
Member

it'd be great if AppendableCharSequence's initial size was configurable

@rkapsi +1. Is it fair to treat this issue as a feature request to enhance HttpObjectDecoder? I'll put together a PR.

@rkapsi
Member Author

rkapsi commented Feb 5, 2016

@Scottmitch sounds good.

@Scottmitch Scottmitch modified the milestones: 4.1.0.CR3, 4.0.35.Final Feb 5, 2016
Scottmitch added a commit to Scottmitch/netty that referenced this issue Feb 6, 2016
Motivation:
The initial buffer size used to decode HTTP objects is currently fixed at 128. This may be too small for some use cases and create a high amount of overhead associated with resizing/copying. The user should be able to configure the initial size as they please.

Modifications:
- Make HttpObjectDecoder's AppendableCharSequence initial size configurable

Result:
Users can more finely tune initial buffer size for increased performance or to save memory.
Fixes netty#4807
Scottmitch added a commit that referenced this issue Feb 8, 2016
pulllock pushed a commit to pulllock/netty that referenced this issue Oct 19, 2023