Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http2: avoid degenerative upload behavior #8965

Closed
wants to merge 1 commit into from

Conversation

gstrauss
Copy link
Contributor

@gstrauss gstrauss commented Jun 6, 2022

http2: avoid degenerative upload behavior

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss gstrauss@gluelogic.com

Description

curl degenerative behavior possible when HTTP/2 window size exhausted

curl using libnghttp2 to upload a file over HTTP/2 may in some scenarios end up sending h2 DATA frames with 1-byte each for most of the DATA frames, leading to very high CPU utilization and extremely slow uploads:

A fast client machine will quickly exhaust the h2 send window (default SETTINGS_INITIAL_WINDOW_SIZE 65535 (64k-1)) by sending large DATA frames (default SETTINGS_MAX_FRAME_SIZE 16384 (16k)) in a sequence of size: (16384, 16384, 16384, 16383).

As a server receives h2 DATA frames, a server will typically send h2 WINDOW_UPDATE with exactly the number of bytes received in h2 DATA frame in order to replenish the client send window size.

A fast client machine will quickly send the next block of data immediately upon receiving h2 WINDOW_UPDATE frame from server, once again exhausing the h2 send window.

Now, if the client uploading a large file always fills the window size, then h2 WINDOW_UPDATE frames from the server will also typically be large.

Unfortunately, if the fast client chooses not to fill the window size, then a smaller DATA frame is injected into the sequence, e.g. if the client uses a 64k buffer, the fast client may send DATA frames in a sequence of sizes: (16384, 16384, 16384, 16383, 1) before reusing the 64k buffer to read the next 64k block to be sent.

This can -- and, in practice, does (!) -- lead to severe degenerative behavior, e.g. one person reported a 200 MB upload which took 1-2 mins with HTTP/1.1 now took about 30 mins with HTTP/2. https://redmine.lighttpd.net/issues/3089

The fast client continues injecting an additional 1-byte DATA frame every 64k eventually leading to almost every frame being a 1-byte DATA frame!

  --> (16384, 16384, 16384, 16383)
  --> (1, 16383, 16384, 16384, 16383, 1)
  --> (1, 1, 16382, 16384, 16384, 16383, 1)
  --> (1, 1, 1, 16381, 16384, 16384, 16383, 1)
  --> (1, 1, 1, 1, 16380, 16384, 16384, 16383, 1)
  --> (1, 1, 1, 1, 1, 16379, 16384, 16384, 16383, 1)
  ...
Consider the initial send from the fast client to fill the 65535 window size:
  --> (16384, 16384, 16384, 16383)
Now, when the client receives the first WINDOW_UPDATE from the server
              <-- (16384)
the fast client sends 1 byte to finish the client 64k block, and then
continues with the next 64k block.  Since the window size was 16384:
  --> (1, 16383)
The server continues sending WINDOW_UPDATE for previously sent frames
              <-- (16384)
and the client immediately sends more data
  --> (16384)
continuing
              <-- (16384)
  --> (16384)
              <-- (16383)
  --> (16383)
              <-- (1)
  --> (1)
The client still has 1 byte left to send in its 64k block and so injects
another 1-byte DATA frame upon receiving the next WINDOW_UPDATE from server
before the client starts its next 64k block, this time sending only 16382.
              <-- (16383)
  --> (1, 16382)

I believe there is a design flaw in the defaults chosen in RFC 7540 that SETTINGS_INITIAL_WINDOW_SIZE 65535 (64k-1) is not a multiple of SETTINGS_MAX_FRAME_SIZE 16384 (16k). It should have been a clean multiple.

I think that client and servers would do well to unconditionally send SETTINGS_INITIAL_WINDOW_SIZE 65536 in the HTTP/2 connection preface, and to pre-emptively increase session window size by sending WINDOW_UPDATE with +1 for id 0.

Design choices aside, I have traced this issue reported here to curl's behavior in lib/transfer.c:readwrite_upload(). curl completely empties the buffer before refilling it (if(0 == k->upload_present)) rather than having a low watermark after which the bytes at the end of the buffer are shifted to the beginning, and more data is appended to the buffer. If the low watermark were, say, 64 bytes -- or even in this case 1-byte -- then curl on a fast client machine would be able to continually fill the available h2 send window instead of the current behavior in which curl injects a 1-byte DATA frame at the end of sending each 64k buffer.

Modifying lib/transfer.c:readwrite_upload() to support appending is a larger bit of work, but I might work on that if you might accept such a patch.

In the meantime, this PR implements a simple mitigation to drastically reduce the impact of the degenerative scenario described above.

gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 6, 2022
avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
@bagder bagder added the HTTP/2 label Jun 6, 2022
@bagder
Copy link
Member

bagder commented Jun 6, 2022

Modifying lib/transfer.c:readwrite_upload() to support appending is a larger bit of work

This certainly sounds like something that is worth exploring further, so yes this could be a valuable fix. I think it could also benefit other parts of curl. HTTP/3 and SSH based transfers come to mind.

@bagder bagder requested a review from tatsuhiro-t June 6, 2022 10:06
lib/http2.c Outdated
@@ -1166,6 +1166,9 @@ static ssize_t data_source_read_callback(nghttp2_session *session,

nread = CURLMIN(stream->upload_len, length);
if(nread > 0) {
if(length < 2048 &&
(stream->upload_len >= 2048 || stream->upload_left >= 2048))
return NGHTTP2_ERR_DEFERRED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This obviously does not work if server allows less than 2048 of window. But in practice, it is quite rare.

In some particular scenario, curl goes into busy loop. For example, initial window size is 65535 + 2048, and if server never send window update, curl goes into busy loop. Server that never send window update is kinda broken though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the commit from this PR.

Is there a way to return NGHTTP2_ERR_DEFERRED or something else (PAUSE?) to avoid spinning and instead wait for an external event to re-wake/retry the stream, i.e. a WINDOW_UPDATE?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NGHTTP2_ERR_DEFERRED actually removes the stream from the tx queue. But curl calls nghttp2_session_resume_stream when data is available, which brings back the stream into the queue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand you correctly, you are suggesting that the places in curl lib/http2.c which call nghttp2_session_resume_data() could, if desired, check for window sizes above a certain threshold (low watermark) instead of the current behavior which checks for window sizes > 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am not suggesting that. I just answered you question about the interaction between NGHTTP2_ERR_DEFERRED and nghttp2_session_resume_stream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In nghttp2/nghttp2#1722, you said this was an application's problem, not the nghttp2 library. Is there any doc I could read about the recommended way to use libnghttp2 and have the application manage the HTTP/2 window sizes? lighttpd uses an HTTP/2 implementation that I wrote; lighttpd does not use libnghttp2. However, curl does use libnghttp2 and as I am not an expert on libnghttp2, and since libnghttp2 is not going to help applications avoid degenerative behaviors like the issue reported here, I am asking for guidance/examples on how an application such as curl should interact with libnghttp2 to manage window sizes.

gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 6, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
@gstrauss
Copy link
Contributor Author

gstrauss commented Jun 6, 2022

Modifying lib/transfer.c:readwrite_upload() to support appending is a larger bit of work

This certainly sounds like something that is worth exploring further, so yes this could be a valuable fix. I think it could also benefit other parts of curl. HTTP/3 and SSH based transfers come to mind.

It turned out to be less invasive than I initially expected. I have appended a commit to this PR to refill small buffers, eliding small sends to first empty the buffer before refilling.

notes:

  • What should be the low watermark? The commit uses < 1024.
  • If Curl_fillreadbuffer() fails, the error is returned.
    If we wanted to first send previously read bytes, then we would
    need to store the bad result in a new member in data, and
    would have to later check that for error when 0 == k->upload_present.
    Is sending all previously read bytes a requirement? (I do not think so.)
  • This patch could be modified to allow appending to upload buffer
    when data->req.upload_chunky if Curl_fillreadbuffer() were
    modified to insert a fixed-sized chunked header (space padded as needed)
  • I did not carefully check whether or not I needed to exclude buffer appending
    when data->state.trailers_state == TRAILERS_INITIALIZED
    (see Curl_fillreadbuffer())

gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 6, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 6, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 6, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 6, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 6, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
@gstrauss
Copy link
Contributor Author

gstrauss commented Jun 6, 2022

Looks like test363 through proxy tunnel is failing with a segfault. I'll try to repro later.

@gstrauss
Copy link
Contributor Author

gstrauss commented Jun 6, 2022

Looks like test363 through proxy tunnel is failing with a segfault. I'll try to repro later.

Separate from the new code which appends to the existing upload buffer, the segfault is a 0x0 NULL pointer dereference where http->postsize is set but http->postdata is 0x0 NULL. Are there some other conditions that must be checked to ensure we are ready to send the request body before attempting to append to the upload buffer?

Some guidance, please.

gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 7, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
@gstrauss
Copy link
Contributor Author

gstrauss commented Jun 7, 2022

@bagder the code to append to upload buffer appears stable and performant, avoiding the degenerative upload behavior reported at top.

I am less confident in the HTTP logic I have added which chooses when not to append to the upload buffer. I would prefer if you could instead suggest logic that would indicate when it is safe to append to the upload buffer -- when request body is available and when we are sending body content -- and we err on the side of not appending for all other cases. Thank you.

Copy link
Member

@bagder bagder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks promising!

lib/transfer.c Outdated
ssize_t offset = 0;

if(0 != k->upload_present && k->upload_present < 1024 && !k->upload_done) {
DEBUGASSERT(data->set.upload_buffer_size >= 1024);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer a define instead of the hard-coded number 1024 here. Then you can add a comment where the threshold is defined explaining the purpose of this value.

The buffer size is changeable in the API. Should the threshold perhaps rather be a share of that buffer size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Where would you like the #define located? In the func? Above the func? In a header?

The purpose is to avoid less efficient tiny writes. The value is set to limit the size and overhead of shifting data at the end of the buffer to the beginning of the buffer. (The overhead of copying is low compared to inefficient writes to the network, but gets slightly more expensive as the buffer size increases and more data is copied.)

If you prefer a share of the buffer, then I propose 1/32 of the buffer size (upload_buffer_size >> 5). The reason I had chosen 1024 is that the HTTP/2 default max frame size is 16k. However, you rightly point out that this code is used by many different protocols, not just HTTP/2.

Just about any reasonable value should be better than the current behavior, so please choose something and let me know your preference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified to (data->set.upload_buffer_size >> 5) in the updated patch. Will change to whatever your preference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly think, that a plain and hard-coded 1024 does the job better. I only figured it's purpose out, after seeing this conversation and good code should explain itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emilengler The block immediately following and a few lines below where offset is assigned states: /* only read more data if there's no upload data already present in the upload buffer, or if appending to upload buffer */. I agree that good code should explain itself, often without comments. At the same time, hard-coded magic numbers such as 1024 often do not add anything to clarity, which is why @bagder requested a better-named #define. I therefore disagree with your comment "1024 does the job better.". It does not do the job better. I am awaiting feedback from @bagder if he prefers #define foo 1024 or #define foo(data) ((data)->set.upload_buffer_size >> 5), and what 'foo' should be named and where the #define should be located in a .c or .h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a #define curl_upload_refill_watermark(data)

lib/transfer.c Outdated
offset = 0;
#ifdef USE_HYPER
offset = 0;
#else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is actually never called for HTTP when Hyper is used. The HTTP route is different then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. will remove that along with next update with review feedback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

lib/transfer.c Outdated
data->state.httpreq == HTTPREQ_POST_MIME) {
if(!k->p.http->postdata)
offset = 0;
}
Copy link
Member

@bagder bagder Jun 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on why these conditions should use a zero offset?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offset = 0 indicates "do not append": preserve existing behavior and falling through to send whatever remains in the buffer until the buffer is empty, and then read more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

k->p.http->postdata is NULL in the crash in test363. I do not understand why that is the case when k->p.http->sending == HTTPSEND_BODY

The other conditions are to avoid situations where it might not be safe to append. For example, the current chunk code writes a variable amount for the chunked header, and it adjusts k->upload_fromhere. Were we to modify the chunked code to write a fixed length header, then appending might be possible. I was not sure about the safety of appending with the trailer state, and it seemed prudent to me to restrict appending for HTTP to when sending k->p.http->sending == HTTPSEND_BODY. I am happy to be corrected on any of this.

Copy link
Contributor Author

@gstrauss gstrauss Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I figured out that the reason test363 was crashing was that the POST body had already been queued, and k->upload_done would be set after the existing k->upload_present was sent. I have adjusted the conditions and forced-pushed an update to this PR.

If I adjust the code writing chunked header to use a fixed size, I could make this patch work when k->upload_chunky is set. Would you like me to add a patch to do that? Using a fixed size might send up to 8 excess bytes in the chunked header for each chunk sent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For your review, I pushed an additional commit which uses a fixed-size chunked header, so that appending to the buffer is safe for HTTP/1.1 chunked, too. I am not sure whether or not the additional bytes in the chunked header are a worthwhile tradeoff, and would lean toward not including this additional commit, but I'll leave that up to you.

gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 8, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 9, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
@gstrauss
Copy link
Contributor Author

gstrauss commented Jun 9, 2022

Now that I tracked down why test363 had been failing, I believe I have modified the condition checks at the beginning which set offset to match the conditions toward the end of the routine which might set k->upload_done, I have removed the other HTTP-specific conditions which might set offset = 0.

Also, while I added a commit to enable safe appending to the buffer with HTTP/1.1 chunked, I am leaning against including it due to the additional cost to size of every single chunked header. I added the commit only for your evaluation and review without further back-and-forth.

@gstrauss
Copy link
Contributor Author

gstrauss commented Jun 9, 2022

I had to remove the additional commit which tried to handle k->upload_chunky since it was hanging for me at test60.

@bagder
Copy link
Member

bagder commented Jun 9, 2022

build error

transfer.c: In function 'readwrite_upload':
transfer.c:924:26: error: comparison of integer expressions of different signedness: 'ssize_t' {aka 'int'} and 'unsigned int' [-Werror=sign-compare]
  924 |        k->upload_present < (data->set.upload_buffer_size >> 5) &&
      |                          ^
cc1.exe: all warnings being treated as errors

gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 9, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 9, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
gstrauss added a commit to gstrauss/curl that referenced this pull request Jun 9, 2022
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
append to upload buffer when only small amount remains in buffer
rather than performing a separate tiny send to empty buffer

avoid degenerative upload behavior which might cause curl to send
mostly 1-byte DATA frames after exhausing the h2 send window size

related discussion: nghttp2/nghttp2#1722

Signed-off-by: Glenn Strauss <gstrauss@gluelogic.com>
Closes curl#8965
@gstrauss
Copy link
Contributor Author

Note: this PR enhances upload performance for callers of readwrite_upload() (which calls Curl_fillreadbuffer()), but does not affect code which directly calls Curl_fillreadbuffer().

There may be some benefit to modifying smb.c:smb_send_and_recv() and http_proxy.c:CONNECT() with similar changes as is done in readwrite_upload(). (It does not look like there is much benefit to modifying file.c:file_upload(), and it does not look like this applies to c-hyper.c:uploadstreamed().)

@bagder
Copy link
Member

bagder commented Jun 20, 2022

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

4 participants