Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WS_Reset() and overflown workspaces #3194

Closed
nigoroll opened this issue Jan 20, 2020 · 2 comments · Fixed by #3202
Closed

WS_Reset() and overflown workspaces #3194

nigoroll opened this issue Jan 20, 2020 · 2 comments · Fixed by #3202

Comments

@nigoroll
Copy link
Member

noticed here: b12af68

We use workspace overflows to signal to bail out for example after a failing VRT_SetHdr(). This is a guarantee that if some serious issue occurred during processing, we rather send an error downstream than an incomplete response or the result of incomplete processing.
We use the WS_Snapshot() ... WS_Reset() pattern as some kind of second order workspace allocation where the called code itself uses WS_Reserve().
This combination is problematic because WS_Reset() calls ws_ClearOverflow(ws), potentially clearing the overflow bit from a previous relevant failure.

I think we should simply forbid WS_Snapshot() on overflown workspaces, mandating prior calls of WS_Overflowed() for the above mentioned pattern.

Feedback welcome

@nigoroll
Copy link
Member Author

this patch to c00071.vtc demos the issue. Notice that the vtc already failed in the wrong place (while pushing processors rather than in the regsub()), which remained unnoticed.

diff --git a/bin/varnishtest/tests/c00071.vtc b/bin/varnishtest/tests/c00071.vtc
index 7e3c156ee..d4c17de1b 100644
--- a/bin/varnishtest/tests/c00071.vtc
+++ b/bin/varnishtest/tests/c00071.vtc
@@ -13,6 +13,7 @@ server s1 {
 
 varnish v1 -vcl+backend {
        import vtc;
+       import std;
        sub vcl_deliver {
                vtc.workspace_alloc(client, -192);
 
@@ -21,7 +22,8 @@ varnish v1 -vcl+backend {
                        vtc.workspace_alloc(client, -10);
                }
                else if (req.url ~ "/baz") {
-                       set resp.http.x-foo = regsub(req.url, "baz", "baaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaz");
+                       set resp.http.x-foo = regsub(req.url, "baz", "baaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaz");
+                       std.log("dummy");
                }
                set resp.http.x-of = vtc.workspace_overflowed(client);
        }

result:

**** c2   http[ 0] |HTTP/1.1
**** c2   http[ 1] |200
**** c2   http[ 2] |OK
**** c2   http[ 3] |Content-Length: 0
**** c2   http[ 4] |Date: Mon, 20 Jan 2020 13:41:33 GMT
**** c2   http[ 5] |X-Varnish: 1006
**** c2   http[ 6] |Age: 0
**** c2   http[ 7] |Via: 1.1 varnish (Varnish/6.3)
**** c2   http[ 8] |x-of: false
**** c2   http[ 9] |Accept-Ranges: bytes
**** c2   http[10] |Connection: keep-alive
**** c2   bodylen = 0
**   c2   === expect resp.status == 500
---- c2   EXPECT resp.status (200) == "500" failed

@nigoroll
Copy link
Member Author

FTR, I just noticed that we already had a similar issue in #2645

nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 20, 2020
Since 8baf4a6 we lost information about
an overflowed workspace by calling std.log(), which resulted from use of
the workspace for construction of a contiguous string from the argument
constituents.

Since then, we have changed the interface to STRANDS, but this issue
remained.

We now solve the case for real by pushing the string concatenation down
to VSL: New versions of VSL and VSLb (coded by example of VSLv() and
VSLbt()) take a strands argument and create a log record without
additional copy overhead.

These solve varnishcache#3194 for std.log(), make logging more efficient and, in
particular, allow use of std.log() in low workspace conditions (because
they do not require any).

Also improve test coverage for std.log()

Ref varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 20, 2020
We add checks for cases where a WS_Reset() could reset the overflowed
workspace marker to ensure that the workspace is not already overflowed
before using it.

See also varnishcache#3196 for std.log()

This concludes the fix for varnishcache#3194 together with the above (except for one
h2 case, maybe)
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 20, 2020
Since 8baf4a6 we lost information about
an overflowed workspace by calling std.log(), which resulted from use of
the workspace for construction of a contiguous string from the argument
constituents.

Since then, we have changed the interface to STRANDS, but this issue
remained.

We now solve the case for real by pushing the string concatenation down
to VSL: New versions of VSL and VSLb (coded by example of VSLv() and
VSLbt()) take a strands argument and create a log record without
additional copy overhead.

These solve varnishcache#3194 for std.log(), make logging more efficient and, in
particular, allow use of std.log() in low workspace conditions (because
they do not require any).

Also improve test coverage for std.log()

Ref varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 20, 2020
It now bails out due to a workspace overflow which it acually did
trigger all the time, but that remained unnoticed due to varnishcache#3194

The test to multiply the cookie header 128 times (8x in the first
regex, 16x in the second) was not realistic even for "modern times",
so I do not think we should have workspaces sized by such an example.
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 27, 2020
Since 8baf4a6 we lost information about
an overflowed workspace by calling std.log(), which resulted from use of
the workspace for construction of a contiguous string from the argument
constituents.

Since then, we have changed the interface to STRANDS, but this issue
remained.

We now solve the case for real by pushing the string concatenation down
to VSL: New versions of VSL and VSLb (coded by example of VSLv() and
VSLbt()) take a strands argument and create a log record without
additional copy overhead.

These solve varnishcache#3194 for std.log(), make logging more efficient and, in
particular, allow use of std.log() in low workspace conditions (because
they do not require any).

Also improve test coverage for std.log()

Ref varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 27, 2020
It now bails out due to a workspace overflow which it acually did
trigger all the time, but that remained unnoticed due to varnishcache#3194

The test to multiply the cookie header 128 times (8x in the first
regex, 16x in the second) was not realistic even for "modern times",
so I do not think we should have workspaces sized by such an example.
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 27, 2020
We add checks for cases where a WS_Reset() could reset the overflowed
workspace marker to ensure that the workspace is not already overflowed
before using it.

See also varnishcache#3196 for std.log()

This concludes the fix for varnishcache#3194 together with the above (except for one
h2 case, maybe)
nigoroll added a commit that referenced this issue Jan 27, 2020
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 27, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage patter, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker gets preserved or
even restored. In this case, the workspace does not actually get reset.

Fixes varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 27, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage pattern, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker gets preserved or
even restored. In this case, the workspace does not actually get reset.

Fixes varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 27, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage pattern, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker is still present.
In this case, the workspace does not actually get reset.

Fixes varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 27, 2020
We add checks for cases where a WS_Reset() could reset the overflowed
workspace marker to ensure that the workspace is not already overflowed
before using it.

See also varnishcache#3196 for std.log()

This concludes the fix for varnishcache#3194 together with the above (except for one
h2 case, maybe)
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 30, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage pattern, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker is still present.
In this case, the workspace does not actually get reset.

Fixes varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Feb 3, 2020
We add checks for cases where a WS_Reset() could reset the overflowed
workspace marker to ensure that the workspace is not already overflowed
before using it.

See also varnishcache#3196 for std.log()

This concludes the fix for varnishcache#3194 together with the above (except for one
h2 case, maybe)
@nigoroll nigoroll removed the a=has_PR label Feb 6, 2020
@nigoroll nigoroll self-assigned this Feb 13, 2020
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Feb 27, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage pattern, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker is still present.
In this case, the workspace does not actually get reset.

Fixes varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Feb 27, 2020
This is an improvement over the second last commit (e.g.
ed970de) based on feedback to varnishcache#3202:

We now avoid any other unintended clears of the overflow bit by
splitting two functions:

* WS_Rollback() is now what WS_Reset() used to be: It clears overflows
  and accepts the zero cookie for a reset-to-start

  It is only intended for use within varnishd and is thus declared
  in cache_varnishd.h

* WS_Reset() does not touch the overflow bit any longer, ensuring that
  a once-overflowed workspace stays overflowed

With this in place, the magic snap_overflowed cookie would not be needed
any more, but it still serves two good purposes:

- better debugging and

- a safety measure against passing a cookie from an already overflowed
  workspace to WS_Rollback()

Fixes varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Mar 2, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage pattern, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

We now avoid any other unintended clears of the overflow bit by
splitting two functions:

* WS_Rollback() is now what WS_Reset() used to be: It clears overflows
  and accepts the zero cookie for a reset-to-start

  It is only intended for use within varnishd and is thus declared
  in cache_varnishd.h

* WS_Reset() does not touch the overflow bit any longer, ensuring that
  a once-overflowed workspace stays overflowed

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker is still present.
This serves two purposes:

- better debugging and

- a safety measure against passing a cookie from an already overflowed
  workspace to WS_Rollback()

Fixes varnishcache#3194
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Mar 2, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage pattern, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

We now avoid any other unintended clears of the overflow bit by
splitting two functions:

* WS_Rollback() is now what WS_Reset() used to be: It clears overflows
  and accepts the zero cookie for a reset-to-start

  It is only intended for use within varnishd and is thus declared
  in cache_varnishd.h

* WS_Reset() does not touch the overflow bit any longer, ensuring that
  a once-overflowed workspace stays overflowed

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker is still present.
This serves two purposes:

- better debugging and

- a safety measure against passing a cookie from an already overflowed
  workspace to WS_Rollback()

Fixes varnishcache#3194
dridi pushed a commit that referenced this issue Mar 3, 2020
We use workspace overflows to signal to bail out for example after a
failing `VRT_SetHdr()`. This is a guarantee that if some serious issue
occurred during processing, we rather send an error downstream than an
incomplete response or the result of incomplete processing.

We use the `WS_Snapshot() ...  WS_Reset()` pattern as some kind of
second order workspace allocation where the called code itself uses
`WS_Reserve()`.

With this usage pattern, `WS_Reset()` called `ws_ClearOverflow(ws)`,
potentially clearing the overflow bit from a previous relevant
failure.

We now avoid any other unintended clears of the overflow bit by
splitting two functions:

* WS_Rollback() is now what WS_Reset() used to be: It clears overflows
  and accepts the zero cookie for a reset-to-start

  It is only intended for use within varnishd and is thus declared
  in cache_varnishd.h

* WS_Reset() does not touch the overflow bit any longer, ensuring that
  a once-overflowed workspace stays overflowed

`WS_Snapshot()` now returns a magic value which gets recognized by
`WS_Reset()` to ensure that the overflowed marker is still present.
This serves two purposes:

- better debugging and

- a safety measure against passing a cookie from an already overflowed
  workspace to WS_Rollback()

Fixes #3194
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants