Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP2: Always eagerly fire frames to the child channels but manage ba… #9390

Closed
wants to merge 153 commits into from

Conversation

normanmaurer
Copy link
Member

…ckpressure via window update frames by read()

Motivation:

At the moment each Http2StreamChannel instance has a complicated state machine to handle auto read configurations. We can simplify this by just always fire frames to the child channel pipelines as soon as we receive them but only give back bytes to the flowcontroller when read() is called. This will ensure that backpressure still is correctly managed via flow control (by window update frames) but simplify things a lot and may release memory more quickly.

Fore more details see #9389 (comment).

Special thanks to @bryce-anderson for the idea.

Modifications:

  • Remove all the complex buffering logic from AbstractHttp2StreamChannel
  • Move logic to give back bytes to the flow-controller (and so produce window update frames) to beginRead(...)
  • Update tests

Result:

Less complex implementation while still maintain ability to flow control channels.

@normanmaurer
Copy link
Member Author

This is based on the idea that was brought up by @bryce-anderson and I mainly opened it to start discussions around the idea. I actually like it a lot, but will do some more testing / benchmarking tomorrow to verify it in more depth.

@bryce-anderson
Copy link
Contributor

I'll pull the patch and take look later this afternoon, but this looks even better than I'd imagined.

@ejona86
Copy link
Member

ejona86 commented Jul 18, 2019

If AUTO_READ is off, I'm trying to see how this plays out. Normally code would wait until onReadComplete to decide if it wanted to read more, and if not would call read() when it wanted more. Now onReadComplete will be called multiple times, even in the absence of read()s. That seems like asking for trouble. (And avoiding onReadComplete but still having channelRead also seems to be trouble.)

The previous code appears to be been broken in the same way, as it would deliver these extra onReadComplete()s, even if it didn't deliver any channelReads. But that could be fixed.

Even without the larger changes, it looks like we could still move updateLocalWindowIfNeeded() to doBeginRead to reduce the memory used.

I do agree this code is simpler, but it's not clear to me we can get away with it.

Copy link
Contributor

@bryce-anderson bryce-anderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're acking the bytes in the case of auto-read = true.

@@ -272,9 +240,10 @@ void closeOutbound() {

void streamClosed() {
unsafe.readEOS();

// Attempt to drain any queued data from the queue and deliver it to the application before closing this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is stale.

@@ -543,17 +509,11 @@ void fireChildRead(Http2Frame frame) {
} else {
unsafe.notifyReadComplete(allocHandle, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we're essentially eager-reading, why don't we always trigger read complete via addChannelToReadCompletePendingQueue();? This should avoid about as many readComplete calls as we can.
As an aside, It seems to me that we shouldn't care much what allocHandle.continueReading() says at this point, or honestly at all with this strategy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bryce-anderson good point... let me do this.

@normanmaurer
Copy link
Member Author

@bryce-anderson auto read just calls read() automatically so beginRead() is called as well. Which means it does not matter if its auto read or not it should work the same way. I am missing anything ?

@normanmaurer
Copy link
Member Author

@ejona86 I don't share your concerns honestly. read() is just a hint that we want to read more data, how much is dispatched is a different matter. There is no 1:1 mapping between read() and channelRead(...), as even one read() can produce multiple channelRead(...) as of today (like when the ByteToMesssageDecoder is used. What I am missing ?

@normanmaurer
Copy link
Member Author

FYI I did some benchmarks after this PR was applied and the speed is the same.

@ejona86
Copy link
Member

ejona86 commented Jul 19, 2019

There is no 1:1 mapping between read() and channelRead(...), as even one read() can produce multiple channelRead(...) as of today

I'm not bothered by that. I'm bothered that the channel will see channelReadComplete multiple times per read().

@normanmaurer
Copy link
Member Author

@ejona86 with the latest adjustments it will only see it once... Can you check again ?

@normanmaurer
Copy link
Member Author

@netty-bot test this please

@ejona86
Copy link
Member

ejona86 commented Jul 19, 2019

@ejona86 with the latest adjustments it will only see it once... Can you check again ?

Looks like it still happens multiple times. Yes, there will be channelReads between each channelReadComplete, but channelReadComplete will not be 1:1 with read().

Look at it this way: channelReadComplete() can be called even if the read() was never called.

normanmaurer and others added 4 commits July 21, 2019 20:34
Motivation:

There were new openjdk releases

Modifications:

Update releases to latest

Result:

Use latest openjdk versions on CI
Motivation:

We can easily reuse the Http2FrameStreamEvent instances and so reduce GC pressure as there may be multiple events per streams over the life-time.

Modifications:

Reuse instances

Result:

Less allocations
…oRead is false (#9389)

Motivation:

When using the HTTP/2 multiplex implementation we need to ensure we correctly drain the buffered inbound data even if the RecvByteBufallocator.Handle tells us to stop reading in between.

Modifications:

Correctly loop through the buffered inbound data until the user does stop to request from it.

Result:

Fixes #9387.

Co-authored-by: Bryce Anderson <banderson@twitter.com>
… reclaimSpace(...) call (#9394)

Motivation:

We did miss to call reclaimSpace(...) in one case which can lead to the situation of having the Recycler to not correctly reclaim space and so just create new objects when not needed.

Modifications:

Correctly call reclaimSpace(...)

Result:

Recycler correctly reclaims space in all situations.
@normanmaurer
Copy link
Member Author

normanmaurer commented Jul 22, 2019

@ejona86 that is also true for many other handlers... There is no mapping between read() and channelReadComplete imho.... the important thing is that if there is a channelRead there also must be a channelReadComplete.

normanmaurer and others added 4 commits July 23, 2019 21:05
… and the correct handler is used (#9396)

Motivation:

3062993 introduced some code change to move the responsibility of creating the stream for the upgrade to Http2FrameCodec. Unfortunaly this lead to the situation of having newStream().setStreamAndProperty(...) be called twice. Because of this we only ever saw the channelActive(...) on Http2StreamChannel but no other events as the mapping was replaced on the second newStream().setStreamAndProperty(...) call.

Beside this we also did not use the correct handler for the upgrade stream in some cases

Modifications:

- Just remove the Http2FrameCodec.onHttpClientUpgrade() method and so let the base class handle all of it. The stream is created correctly as part of the ConnectionListener implementation of Http2FrameCodec already.
- Consolidate logic of creating stream channels
- Adjust unit test to capture the bug

Result:

Fixes #9395
Motivation:

Docs should have no typos

Modifications:

Fix a few typos

Result:

More correct docs.
normanmaurer added a commit that referenced this pull request Jul 26, 2019
Motivation:

We should better update the flow-controller on Channel.read() to reduce overhead and memory overhead.

See #9390 (comment)

Modifications:

Move updateLocalWindowIfNeeded() to doBeginRead()

Result:

Reduce memory overhead
normanmaurer and others added 26 commits October 9, 2019 17:10
Motivation:

We should use adaptjdk 13 and not oracle openjdk 13 when building with Java 13

Modifications:

Use adopt@1.13.0-0

Result:

More consistent java vendor usage
Motivation:

We should have correct docs without typos

Modification:

Fix typos and spelling

Result:

More correct docs
Motivation:

We should not commit vscode specific files, so at it to gitignore

Modifications:

Add files to .gitignore

Result:

Correctly ignore ide related files
Motivation:

031c2e2 introduced some change to reduce the risk of have the `ReferenceCountedOpenSslContext` be destroyed while the `ReferenceCountedSslEngine` is still in us. Unfortunaly it missed to adjust a few tests which make assumptions about the refCnt of the context.

Modifications:

Adjust tests to take new semenatics into acount.

Result:

No more tests failures
Motivation:
In the current implementation of Base64 decoder an invalid
character `\u00BD` treated as `=`.
Also character `\u007F` leads to ArrayIndexOutOfBoundsException.

Modification:
Explicitly checks that all input bytes are ASCII characters
(greater than zero). Fix `decodabet` tables.

Result:
Correctly validation input bytes in Base64 decoder.
### Motivation:

I've now found two libraries that use Netty to be vulnerable to [CWE-113: Improper Neutralization of CRLF Sequences in HTTP Headers ('HTTP Response Splitting')](https://cwe.mitre.org/data/definitions/113.html) due to using `new DefaultHttpHeaders(false)`.

Some part of me hopes that this warning will help dissuade library authors from disabling this important security check.

### Modification:

Add documentation to `DefaultHttpHeaders(boolean)` to warn about the implications of `false`.

### Result:

This improves the documentation on `DefaultHttpHeaders`.
Motivation:

We should use the latest commons-compress release to fix CVE-2019-12402 (even it is only a test dependency)

Modifications:

Update commons-compress to 1.19

Result:

Fix security alert
Motivation:

Currently when the SslHandler coalesces outbound bytes it always
allocates a direct byte buffer. This does not make sense if the JDK
engine is being used as the bytes will have to be copied back to heap
bytes for the engine to operate on them.

Modifications:

Inspect engine type when coalescing outbound bytes and allocate heap
buffer if heap bytes are preferred by the engine.

Result:

Improved performance for JDK engine. Better performance in environments
without direct buffer pooling.
Motivation

The recently-introduced event loop scheduling hooks can be exploited by
the epoll transport to avoid waking the event loop when scheduling
future tasks if there is a timer already set to wake up sooner.

There is also a "default" timeout which will wake the event
loop after 1 second if there are no pending future tasks. The
performance impact of these wakeups themselves is likely negligible but
there's significant overhead in having to re-arm the timer every time
the event loop goes to sleep (see #7816). It's not 100% clear why this
timeout was there originally but we're sure it's no longer needed.

Modification

Combine the existing volatile wakenUp and non-volatile prevDeadlineNanos
fields into a single AtomicLong that stores the next scheduled wakeup
time while the event loop is in epoll_wait, and is -1 while it is awake.

Use this as a guard to debounce wakeups from both immediate scheduled
tasks and future scheduled tasks, the latter using the new
before/afterScheduledTaskSubmitted overrides and based on whether the
new deadline occurs prior to an already-scheduled timer.

A similar optimization was already added to NioEventLoop, but it still
uses two separate volatiles. We should consider similar streamlining of
that in a future update.

Result

Fewer event loop wakeups when scheduling future tasks, greatly reduced
overhead when no future tasks are scheduled.
Fix javadoc issues

Motivation:

Remove useless doc for 5.0

Modifications:

Remove description for 5.0

Result:

Useless doc will be removed.
#9653)

Motivation:

There is not need to use a CAS as everything is synchronized anyway. We can simplify the code a bit by not using it.

Modifications:

- Just remove the CAS operation
- Change from int to boolean

Result:

Code cleanup
…always guard correctly (#9655)

Motivation:

We can use the `@SuppressJava6Requirement` annotation to be more precise about when we use Java6+ APIs. This helps us to ensure we always protect these places.

Modifications:

Make use of `@SuppressJava6Requirement` explicit

Result:

Fixes #2509.
Motivation

A memory leak related to DNS resolution was reported in #9634,
specifically linked to the TCP retry fallback functionality that was
introduced relatively recently. Upon inspection it's apparent that there
are some error paths where the original UDP response might not be fully
released, and more significantly the TCP response actually leaks every
time on the fallback success path.

It turns out that a bug in the unit test meant that the intended TCP
fallback path was not actually exercised, so it did not expose the main
leak in question.

Modifications

- Fix DnsNameResolverTest#testTruncated0 dummy server fallback logic to
first read transaction id of retried query and use it in replayed
response
- Adjust semantic of internal DnsQueryContext#finish method to always
take refcount ownership of passed in envelope
- Reorder some logic in DnsResponseHandler fallback handling to verify
the context of the response is expected, and ensure that the query
response are either released or propagated in all cases. This also
reduces a number of redundant retain/release pairings

Result

Fixes #9634
…9662)

Motivation:

At the moment we do a ByteBuf.readBytes(...) on removal of the ByteToMessageDecoder if there are any bytes left and forward the returned ByteBuf to the next handler in the pipeline. This is not really needed as we can just forward the cumulation buffer directly and so eliminate the extra memory copy

Modifications:

Just forward the cumulation buffer directly on removal of the ByteToMessageDecoder

Result:

Less memory copies
…nnection that received a GOAWAY. (#9674)


Motivation:

In #8692, `Http2FrameCodec` was
updated to keep track of all "being initialized" streams, allocating
memory before initialization begins, and releasing memory after
initialization completes successfully.

In some instances where stream initialization fails (e.g. because this
connection has received a GOAWAY frame), this memory is never released.

Modifications:

This change updates the `Http2FrameCodec` to use a separate promise
for monitoring the success of sending HTTP2 headers. When sending of
headers fails, we now make sure to release memory allocated for stream
initialization.

Result:

After this change, failures in writing HTTP2 Headers (e.g. because this
connection has received a GOAWAY frame) will no longer leak memory.
Motivation

Currently when future tasks are scheduled via schedule(Runnable, ...)
methods, the supplied Runnable is wrapped in a newly allocated Callable
adapter prior to being wrapped in a ScheduledFutureTask.

This can be avoided which saves an object allocation per scheduled task.

Modifications

Change the Callable task field of ScheduledFutureTask to be of type
Object so that it can hold/run Runnables directly in addition to
Callables.

An "adapter" is still used in the case a Runnable is scheduled with an
explicit constant non-null completion value, assumed to be rare.

Result

Less garbage
Motivation:

bbc34d0 introduced correct handling of "in process" setup of streams but there is some room for improvements. Often the writeHeaders(...) is completed directly which means there is not need to create the extra listener object.

Modifications:

- Only create the listener if we really need too.

Result:

Less GC
Motivation:

Data flowing in from the decoder flows out in sequence,Whether decoder removed or not.

Modification:

fire data in out and clear out when hander removed
before call method handlerRemoved(ctx)

Result:

Fixes #9668 .
Motivation:

We should just ignore (and so skip) invalid entries in /etc/resolver.conf.

Modifications:

- Skip invalid entries
- Add unit test

Result:

Fix #9684
#9664)


Motivation:

We may fail to update the flow-controller and in this case need to notify the stream channel and close it.

Modifications:

Attach a future to the write of the update frame and in case of a failure propagate it to the channel and close it

Result:

Fixes #9663
Motivation:

Sometimes it is useful to be able to set attributes on a SslContext.

Modifications:

Add new method that will return a AttributeMap that is tied to a SslContext instance

Result:

Fixes #6542.
Motivation:

We have a public utility `OpenSsl.isAlpnSupported()` that helps users to
check if ALPN is available for `SslProvider.OPENSSL`. However, we do not
provide a similar utility for `SslProvider.JDK`. Therefore, users who
configured ALPN with `SslProvider.JDK` will get a runtime exception at
the time when a new connection will be created.

Modifications:

- Add public `SslProvider.isAlpnSupported(SslProvider)` utility method
that returns `true` if the `SslProvider` supports ALPN;
- Deprecate `OpenSsl.isAlpnSupported()`;

Result:

Users can verify if their environment supports ALPN with
`SslProvider` upfront (at bootstrap), instead of failing with
runtime exception when a new connection will be created.
…alue (#9688)

Motivation:

HttpPostRequestDecoder.splitHeaderContentType() throws a StringIndexOutOfBoundsException when it parses a Content-Type header that starts with a semicolon ;. We should skip the execution for incorrect multipart form data.


Modification:

Avoid invocation of HttpPostRequestDecoder#splitHeaderContentType(...) for incorrect multipart form data content-type.

Result:

Fixes #8554
### Motivation:

IdleStateEvent is very convenient and frequently used type of events. However both in runtime (logs) and in debug you need some manual steps to see their actual content. Default implementation generates worthless trash like this:

    io.netty.handler.timeout.IdleStateEvent@27f674d

There are examples already, where event has convenient and useful toString implementation:

* io.netty.handler.proxy.ProxyConnectionEvent
* io.netty.handler.ssl.SslCompletionEvent

### Modification:

* Implement 'IdleStateEvent.toString' method.
* Unit test.

### Result:

More useful String representation of IdleStateEvent
…ckpressure via window update frames by read()

Motivation:

At the moment each Http2StreamChannel instance has a complicated state machine to handle auto read configurations. We can simplify this by just always fire frames to the child channel pipelines as soon as we receive them but only give back bytes to the flowcontroller when read() is called. This will ensure that backpressure still is correctly managed via flow control (by window update frames) but simplify things a lot and may release memory more quickly.

Fore more details see #9389 (comment).

Special thanks to @bryce-anderson for the idea.

Modifications:

- Remove all the complex buffering logic from AbstractHttp2StreamChannel
- Move logic to give back bytes to the flow-controller (and so produce window update frames) to beginRead(...)
- Update tests

Result:

Less complex implementation while still maintain ability to flow control channels.
@normanmaurer
Copy link
Member Author

let us close this one for now.

@normanmaurer normanmaurer deleted the read_window_update branch October 25, 2019 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet