OPTIM/MINOR: h2_settings_initial_window_size default 64k #1732

Closed

Conversation

@gstrauss (Contributor) commented Jun 5, 2022

OPTIM/MINOR: h2_settings_initial_window_size default 64k

change from RFC 7540 default 65535 (64k-1)

avoid some degenerative WINDOW_UPDATE behaviors in the wild
nghttp2/nghttp2#1722

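A minimal sketch of the kind of change being proposed (the variable name follows the PR title; the exact patch may differ), showing why 64k rather than 64k-1 matters for 16kB DATA frames:

```c
/* Sketch only, not the actual patch: haproxy advertises this value as
 * SETTINGS_INITIAL_WINDOW_SIZE on new HTTP/2 connections. */
#include <assert.h>

static int h2_settings_initial_window_size = 65536; /* was 65535, the RFC 7540 default */

int main(void)
{
    /* 64k lets a peer send exactly four full 16kB DATA frames before it
     * has to wait for a WINDOW_UPDATE; with 64k-1 the fourth frame is cut
     * to 16383 bytes and a 1-byte remainder is left over. */
    assert(h2_settings_initial_window_size == 4 * 16384);
    return 0;
}
```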
@lukastribus (Member) commented:

This is an automated message.

The haproxy project does not review or merge pull requests on GitHub (please see CONTRIBUTING).

Your patch or patchseries will be forwarded to the haproxy mailing list where it can be reviewed (you will be CC'ed).

This PR will be closed automatically.

@lukastribus closed this Jun 5, 2022
@wtarreau (Member) commented Jun 7, 2022

Hello Glenn,

Thanks for your report; I understand the problem, and it's very interesting. I would say it's not even an optimization, but rather an approach that tries to help the whole ecosystem at a very low cost by preventing known degenerative cases from happening (or helping them recover).

However, I don't fully agree with the solution; it might just push the issue a bit sideways. For example, a client might already have worked around this issue by using a 65537-byte buffer and would now face issues again. Thus I'd first prefer to solve it at the core, then possibly merge this patch later as an optimization, not for the problem itself but rather to make sure clients can send 4 full 16kB frames, and to improve bandwidth and CPU efficiency.

I'm thinking about how we could improve the situation on our side by sending window updates less often. Right now we send the stream WU after reading a frame because of the likelihood that the next frame is not for the same stream, and because we would lose track of which stream needs to be updated. We can change this to send only a single frame for a number of subsequent frames from the same stream; that will solve the problem for the case you're facing and will be more friendly to the client. Given the nature of the problem, I don't imagine we'd meet this issue with interleaved 1-byte frames from multiple streams over the same connection. If that were the case we'd then need to send fewer updates, waiting for a sub-divider to be crossed (e.g. check when 1/16th of the max frame size is crossed). But here I really don't think we'd need to go that far, and I think that sending grouped updates will be sufficient.

I'm reopening just to keep the conversation going, but I'll study how to do this. I remember that there's a subtlety somewhere with the stream change; I think it's related to the stream switch, where the current to-be-acked data is reset. I'll check :-)
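A rough sketch of the WINDOW_UPDATE grouping idea described above (names and structure are hypothetical, not haproxy's actual mux code): consumed bytes are accumulated while consecutive frames belong to the same stream, and a single stream-level WINDOW_UPDATE is flushed only when the stream changes.

```c
#include <stdint.h>
#include <stdio.h>

/* hypothetical helper: a real mux would emit a WINDOW_UPDATE frame here */
static void send_window_update(int32_t stream_id, uint32_t increment)
{
    printf("WINDOW_UPDATE stream=%d increment=%u\n", (int)stream_id, (unsigned)increment);
}

struct wu_state {
    int32_t stream_id;   /* stream whose consumed bytes are being accumulated */
    uint32_t pending;    /* bytes consumed but not yet acknowledged           */
};

/* account for `len` payload bytes consumed on stream `sid`; flush the
 * accumulated increment only when a frame for another stream shows up */
static void account_data(struct wu_state *wu, int32_t sid, uint32_t len)
{
    if (wu->pending && wu->stream_id != sid) {
        send_window_update(wu->stream_id, wu->pending);
        wu->pending = 0;
    }
    wu->stream_id = sid;
    wu->pending += len;
}

int main(void)
{
    struct wu_state wu = { 0, 0 };
    /* three consecutive DATA chunks on stream 1, then one on stream 3:
     * only two stream-level WINDOW_UPDATEs result instead of four */
    account_data(&wu, 1, 16384);
    account_data(&wu, 1, 16384);
    account_data(&wu, 1, 16384);
    account_data(&wu, 3, 1024);
    if (wu.pending)              /* flush whatever is left at the end */
        send_window_update(wu.stream_id, wu.pending);
    return 0;
}
```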

@wtarreau reopened this Jun 7, 2022
@TimWolla added the "type: feature" (feature request / wishlist) label Jun 7, 2022
@lukastribus (Member) commented:

This is an automated message.

The haproxy project does not review or merge pull requests on GitHub (please see CONTRIBUTING).

Your patch or patchseries will be forwarded to the haproxy mailing list where it can be reviewed (you will be CC'ed).

This PR will be closed automatically.

@lukastribus closed this Jun 7, 2022
@lukastribus (Member) commented:

I'm afraid the bot will keep closing all PRs that are in the "open" state. An issue needs to be filed for this instead.

@gstrauss (Contributor, Author) commented Jun 7, 2022

FYI: I sent a message to the mailing list, as recommended in CONTRIBUTING. Hopefully I'll be cc'd in responses.

@TimWolla (Member) commented Jun 7, 2022

Mailing list thread for reference: https://www.mail-archive.com/haproxy@formilux.org/msg42418.html

@wtarreau (Member) commented Jun 8, 2022

Thanks guys. I thought the bot had closed it again because I replied, but it's because I reopened it. I didn't find a trivial way to turn a PR into an issue.

@wtarreau (Member) commented Jun 9, 2022

Thanks Glenn, I've now merged your patch (I amended the commit message and also updated the doc), as well as the one to coalesce WU frames on output. We're dividing their count by two on average, which is nice. It was totally impossible for me to reproduce the issue with haproxy, regardless of frame sizes, buffer sizes, or window sizes. It looks like the fact that we're using similar-sized buffers for the connection and the streams prevents the degenerative behavior by marking different boundaries in haproxy and curl. But sometimes I could still see up to two consecutive one-byte frames, so this definitely had to be addressed.

@gstrauss (Contributor, Author) commented Jun 9, 2022

> But sometimes I could still see up to two consecutive one-byte frames, so this definitely had to be addressed.

Thank you for picking this up.

FYI: given sufficient latency between client and server, you should be able to reproduce the recursive degenerative behavior (before the recent commits that change the initial window size and coalesce WINDOW_UPDATE frames).

@wtarreau (Member) commented Jun 9, 2022

I really tried, but as indicated above I suspect that the fact that we end up reading partial frames and sending partial WUs avoids this behavior in this particular case. For example, for a 16384-byte frame we'll read something like 16336 bytes first, then 48, followed by 16327 and 57, etc. Thus WUs are sent not for the exact frame sizes but for what is processed at once, and this varies depending on frame boundaries, which makes the issue far less probable. I increased tune.bufsize to avoid this splitting, and the frame sizes were as you predicted, but no degenerative behavior could be produced either.

Anyway, rest assured that I understand the problem, and I consider that we're lucky not to see it right now because theoretically everything is there for it to happen, given adequate latency, framing, etc. So now that's plugged and we should be safe against it.
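A back-of-the-envelope reconstruction of the splitting described above (my own model, not haproxy code; the 39-byte starting offset is a guess chosen so the output matches the figures quoted in the comment): with a fixed 16384-byte read buffer and 9-byte HTTP/2 frame headers, the point where each DATA frame gets cut across two reads drifts by 9 bytes per frame, so the resulting WU sizes never settle into a fixed pattern.

```c
#include <stdio.h>

int main(void)
{
    const int bufsize = 16384;  /* tune.bufsize: size of each read               */
    const int hdr     = 9;      /* HTTP/2 frame header size                      */
    const int payload = 16384;  /* DATA frame payload size                       */
    int carried = 39;           /* leftover payload already sitting in the buffer;
                                   hypothetical value picked to reproduce the
                                   16336/48 and 16327/57 figures above           */

    for (int frame = 1; frame <= 3; frame++) {
        int first  = bufsize - carried - hdr;  /* part of the frame read now     */
        int second = payload - first;          /* remainder seen on the next read */
        printf("frame %d: processed as %d bytes, then %d bytes\n",
               frame, first, second);
        carried = second;                      /* the split point drifts by hdr   */
    }
    return 0;
}
```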
