Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP/2 FlowController allocated write and flush #4242

Closed
blucas opened this issue Sep 20, 2015 · 11 comments
Closed

HTTP/2 FlowController allocated write and flush #4242

blucas opened this issue Sep 20, 2015 · 11 comments
Assignees
Labels
Milestone

Comments

@blucas
Copy link
Contributor

blucas commented Sep 20, 2015

Netty Version: lastest master (94bf412)

/cc @Scottmitch @nmittler

I have a serious issue (it is actually a blocker) that is preventing me from releasing the next version of our product (which will support HTTP/2 traffic). I would really appreciate it if this could be resolved quickly. I am more than willing to help out with this (submit a PR) as long as someone can help point me in the right direction to fixing this issue.

Issue Details
When sending a large response to a client, the flow controller fails to send the entire response. This is very similar to #4052, but is not solved by the fix for that issue.

The problem stems from the fact that the flow controller and state only write up to N allocated bytes here. If there are still pending bytes to write, the flow controller should trigger another write, after flushing. Unfortunately, at this time, all it will do is flush the N allocated bytes, something must be done so that it will trigger another write (and subsequent flush). I don't know how it should accomplish that. I am aware there is a pending pull request for 'stream writability' (#4227), but I don't believe it goes far enough to fix this issue on its own.

Example
Given:
HttpServer which receives a request, generates a FullHttpResponse, and calls writeAndFlush. The pipeline is configured to convert the FullHttpResponse to HTTP/2 Frames.

Response / Channel / Stream Info:
Response size: 100,000 bytes
Channel WriteBuffer High Watermark: 65,535 bytes
Stream allocated bytes: 43,000 bytes

Given the scenario above:

  1. Http2ConnectionHandler receives a call to flush the converted FullHttpResponse and tells the flow controller to write the pending bytes.
  2. the flow controller calculates the number of bytes to write (allocate) for the stream.
  3. the flow controller then triggers a write N allocated bytes for each stream.
  4. DefaultState will write 43,000 bytes down the pipeline.
  5. Http2ConnectionHandler will then trigger a flush to flush the 43,000 bytes to the socket.

So the problem is that only the first 43,000 bytes were sent to the socket. The other 57,000 bytes still need to be sent.

The reason we did not get this problem with #4052 was because the Stream allocate bytes was set to a value much closer to the Channel WriteBuffer High Watermark. That triggered a channelWritabilityChanged event, which ended up writing/flushing more and more of the response to the socket. Below is an example of what I mean:

Response / Channel / Stream Info:
Response size: 100,000 bytes
Channel WriteBuffer High Watermark: 65,535 bytes
Stream allocated bytes: 65,500 bytes

Given the scenario above:

  1. Http2ConnectionHandler receives a call to flush the converted FullHttpResponse and tells the flow controller to write the pending bytes.
  2. the flow controller calculates the number of bytes to write (allocate) for the stream.
  3. the flow controller then triggers a write N allocated bytes for each stream.
  4. DefaultState will write 65,500 bytes down the pipeline.
  5. Http2ConnectionHandler will then trigger a flush to flush the 65,500 bytes to the socket.
  6. SslHandler encrypts the 65,500 bytes, adding a further N number of bytes thus triggering a channelWritabilityChanged event.
  7. ChannelWritabilityChanged event triggers further writes/flushes to the socket.

A side note to all of this... I spotted something that might be a bug, or maybe it is just incorrect documentation. The code here states that all the frame data was written, but the if is checking frame.size() != 0. Should it actually be frame.size() == 0 ??? Or maybe the documentation needs clearing up?

Reproducer
@Scottmitch - The reproducer I gave you for #4052 can be used to produce the scenario I've stated above. In order to reproduce the issue, just request the file using Firefox (version I used was v40.0.3) instead of Chrome. The HTTP/2 codec sets the stream's allocated bytes much lower for Firefox than it does from Chrome.

@nmittler
Copy link
Member

@blucas Thanks for the report!

Regarding the writeBytes method, I believe the comment is wrong. I suspect it should be re-worded to something like "The frame has data but none of it can be written". @Scottmitch, does that sound right to you?

@nmittler
Copy link
Member

@Scottmitch So it sounds like the problem may be that we're using an estimate of the channel writable bytes, and not actually writing until the channel is not writable. With the current code, if our write doesn't actually cause an unwritable event, then data will get stuck since we'll never get the writable event to cause us to flush. Does that sound right to you?

@Scottmitch
Copy link
Member

Regarding the writeBytes method, I believe the comment is wrong. I suspect it should be re-worded to something like "The frame has data but none of it can be written".

+1. The same comment appears 2 times in that method, and is wrong both times.

@nmittler
Copy link
Member

@Scottmitch I think we just need to loop while the channel is writable. WDYT?

@Scottmitch
Copy link
Member

@nmittler - Looking now. The intention is to have the allocation algorithm be allowed to use the fewest amount of bytes as necessary to transition isWritable() to false...then Http2ConnectionHandler should be notified when writibiliity changes back to true and we should finish writing (or continue in this cycle until nothing left to write)... So the "looping" should be done by getting channel writability change notifications.

@Scottmitch
Copy link
Member

@nmittler - I agree with looping until the channel is no longer writable. The issue with my above assumption was that there may be ChannelHandlers that manipulate the size of the data (SslHandler, compression, etc...) so that the number of bytes written from the flow controller is not necessarily the number of bytes written to the channel, and thus the writabilty change is not guaranteed to come unless we loop.

@Scottmitch Scottmitch self-assigned this Sep 21, 2015
@Scottmitch Scottmitch added this to the 4.1.0.Beta7 milestone Sep 21, 2015
@Scottmitch
Copy link
Member

@blucas - I have verified with #4227 that FireFox is able to download a 2Mb file. Before this PR I was also observing the issue you reported where the entire file would not be downloaded. Can you verify with this PR?

@nmittler
Copy link
Member

@Scottmitch would you mind separating this change out from #4227? I think this is a separate issue and would make reviewing a bit easier. If it's too much of a pain then don't worry about it.

@blucas
Copy link
Contributor Author

blucas commented Sep 22, 2015

@Scottmitch - I checked out your branch for #4227 and can confirm that it does not suffer from this issue. I assume it is because of the do...while loop you have added to writePendingBytes.

I agree with @nmittler, if you could separate that change from the bulk of the work done in #4227 that would be great, as I assume it will take some time before #4227 is merged.

@Scottmitch
Copy link
Member

@blucas @nmittler - Yes I will pull out the relevant pieces from PR #4227 to fix this issue.

@blucas - Thanks for verifying...I'll ping you when the new PR is ready.

@nmittler
Copy link
Member

@Scottmitch SGTM ... thanks!

Scottmitch added a commit to Scottmitch/netty that referenced this issue Sep 23, 2015
Motivation:
DefaultHttp2RemoteFlowController attempts to write as many bytes as possible to transition the channel to not writable, and then relies on notification of channelWritabilityChange to continue writing. However the amount of bytes written by DefaultHttp2RemoteFlowController may not be the same number of bytes that is actually written to the channel due to other ChannelHandlers (SslHandler, compression, etc...) in the pipeline. This means there is a potential for the DefaultHttp2RemoteFlowController to be waiting for a channel writaiblity change event that will never come, and thus not write all queued data.

Modifications:
- DefaultHttp2RemoteFlowController should write pending bytes until there are no more, or until the channel is not writable.

Result:
DefaultHttp2RemoteFlowController will write all pending data.
Fixes netty#4242
Scottmitch added a commit that referenced this issue Sep 23, 2015
Motivation:
DefaultHttp2RemoteFlowController attempts to write as many bytes as possible to transition the channel to not writable, and then relies on notification of channelWritabilityChange to continue writing. However the amount of bytes written by DefaultHttp2RemoteFlowController may not be the same number of bytes that is actually written to the channel due to other ChannelHandlers (SslHandler, compression, etc...) in the pipeline. This means there is a potential for the DefaultHttp2RemoteFlowController to be waiting for a channel writaiblity change event that will never come, and thus not write all queued data.

Modifications:
- DefaultHttp2RemoteFlowController should write pending bytes until there are no more, or until the channel is not writable.

Result:
DefaultHttp2RemoteFlowController will write all pending data.
Fixes #4242
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants