Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

h2: Integrate h2_rxbuf_storage (6.0) #4004

Merged
merged 33 commits into from
Oct 18, 2023
Merged

Conversation

dridi
Copy link
Member

@dridi dridi commented Oct 18, 2023

This is a port of #3661 preceded by cherry-picks of varnishtest changes it relies on (plus a cherry pick of a VTC stabilization made afterwards).

This is in preparation for a port of #3998 to the 6.0 branch.

bsdphk and others added 30 commits October 18, 2023 07:46
When running really massive runs, "-j180 -n10000" kind of things,
the "rm -rf" of the tmpdir becomes the limiting factor.

The new -C option sends that int nice(1)'ed child process.
The name implies that this is not for production usage.
    Avoid VSB_printf for static strings

    Done with the following semantic patch for Coccinelle:

        @@
        expression vsb, fmt;
        @@

        - VSB_printf(vsb, fmt);
        + VSB_cat(vsb, fmt);

    This patch is available in the Varnish source tree.
This change increases the initial size and reduces the low watermark.

RFC7540 says this:

> Flow-controlled frames from the sender and WINDOW_UPDATE frames from
> the receiver are completely asynchronous with respect to each other.
> This property allows a receiver to aggressively update the window
> size kept by the sender to prevent streams from stalling.

The default parameters are very much on the low-latency aggressive
updates end of the spectrum, which increases asynchronicity at the
expense of determinism in test cases.

The tweaks made by varnishtest allows basic tests to send a few request
bodies before being bothered by window update race conditions. Test
cases that cover h2 flow control or anything else related to window
updates may reset parameters or pick other specific values. This frees
us from a bunch of barriers where the purpose of mitigating this race
was rarely even documented.

This successfully passed the following test locally:

    git grep -Fl +http2 -- '*.vtc' |
    xargs bin/varnishtest/varnishtest -i -n100 -j32

We can hope that h2 test cases will be overall more stable from now on.

Refs varnishcache#3442
This is an API for getting an arbitrary buffer through the
stevedores. The stevedore in question may then deploy LRU nuking or other
measures to control resource usage.
Conflicts:
	bin/varnishd/storage/storage_simple.c
In the same fashion as include/tbl/params.h for legibility.
And use that for logging purposes when a successfully opened h2 session
ends. RX_JUNK is still the default session close reason when existing
reasons aren't accurate enough.

Fixes varnishcache#3393
The H/2 session thread does have a VSL buffer already set up, but the
'wrk->vsl' pointer was not set. This caused issue for e.g. LRU_NukeOne()
as it wants to log. Set the buffer for the duration that the worker is
dedicated as an H/2 session thread.
This implements stream data handling using a buffer between the H/2
session thread and each stream thread. This is needed to avoid head of
line blocking on the session socket when a data frame is received for a
stream thread that is not yet ready to receive it.

The buffer used will have to be as large as the send window the peer
expects at the time the stream is opened. This will typically be 65535
unless the h2_initial_window_size parameter has been changed.

Stream window updates will then be issued only once data is removed from
the buffer by the request body being consumed from the request handling
thread, limited in size to what space is then available in the buffer.

Conflicts:
	bin/varnishd/http2/cache_http2_proto.c
H2 streams waiting for request body data will timeout after timeout_idle
seconds if no new data on the stream is being received. This will ensure
that individual H2 streams can be reaped if there is no data received from
the peer.
We have a strict min at the protocol default here. This is because we
don't have the 'use settings only after peer ack' in place yet. If the
value is lower than the protocol default, the very first stream could get
a flow control error.
This makes it easier to not have to know exactly when and how many window
updates to expect in a test case.
With the new request body data handling, Varnish changes behaviour
significantly wrt to stream window updates sent to the client. Window
updates will only be sent once the data is consumed by the client through
the request body VFP handling. Test cases that rely on receiving a window
update to sync the H/2 stream needs to be adopted.
This is the test case that fails if these changes aren't in tree. Note the
commented out rxwinup commands that are necessary for the proper fail mode
when run without the varnishtest window update changes.
According to the spec the padding is an 8-bit field, and fields should be
treated as unsigned unless otherwise specified, which it is not for any of
the padding related places. Allow varnishtest to generate padding up to
255 bytes long.
This was found lacking in our H2 implementation. Previously we would have
included any padding bytes in the request body. Possibly it would have
caused errors if there also was a C-L present, or more likely just corrupt
request bodies.

If the client sends nothing but padding bytes and ends up consuming the
entire stream window with no actual request bytes buffered, the request
thread side of things would not send any stream window updates. Handle
this corner case by sending a window update from the session thread.
mbgrydeland and others added 3 commits October 18, 2023 09:48
This parameter allows the user to choose which storage backend / stevedore
that the H/2 receive buffers are allocated from. By default it uses
Transient.

Conflicts:
	bin/varnishd/mgt/mgt.h
	bin/varnishd/mgt/mgt_param.h
	bin/varnishd/mgt/mgt_param_tbl.c
	include/tbl/params.h
We don't need a delay, we need to sync operations.
Copy link
Contributor

@mbgrydeland mbgrydeland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on a cursory look

@dridi dridi merged commit 58e7f3e into varnishcache:6.0 Oct 18, 2023
9 checks passed
@dridi dridi deleted the h2_rxbuf_6.0 branch October 18, 2023 09:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants