varnishd stalls with two consecutive Range requests using HTTP persistent connections #2035

Closed
ema opened this Issue Aug 8, 2016 · 1 comment

ema commented Aug 8, 2016

While testing Varnish's handling of Range requests, we've discovered that varnishd seems to block when serving two Range requests over an HTTP persistent connection while simultaneously saving a large object to file storage.

Expected Behavior

Varnish should serve Range requests for a small number of bytes fairly quickly, even while the object fetched from the backend is still being written to disk.

Current Behavior

Instead of quickly serving two successive Range requests for a fairly small number of bytes, varnishd seems to block and serves the second request only after a long time.

Avoiding HTTP persistent connections makes the problem unreproducible.

Steps to Reproduce (for bugs)

Start Varnish with stock VCL and a 10G file storage backend. Also make sure a web server is running on port 80 and serving a large file under /test.img. We've used a 5G file in our test environment.

sudo varnishd -a :3128 -b localhost:80 -F -n backend -s main1=file,/var/tmp/varnish.main1,10G
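Any sufficiently large file served as /test.img will do. As one possible way to create a roughly 5 GB test object, the sketch below makes a sparse file; the /var/www/html path is an assumption (Apache's default document root on Debian) and should be adjusted to your setup.

#!/usr/bin/env python
# Sketch: create a large sparse file for the web server to serve as /test.img.
# The path is an assumption; adjust it to your web server's document root.

SIZE = 5000 * 1024 * 1024  # 5242880000 bytes, the object size seen in the responses below

with open("/var/www/html/test.img", "wb") as f:
    f.seek(SIZE - 1)   # seek to the last byte to create a sparse file
    f.write(b"\0")     # write a single byte so the file has the full size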

Once varnishd is ready to serve requests, run the following script:

#!/usr/bin/env python
# Python 2 reproduction script: send two Range requests to varnishd.
# With REPRODUCE = True both requests are sent over the same persistent
# connection; with REPRODUCE = False a fresh connection is opened for
# the second request.

import socket

REPRODUCE = True

dest_name = "127.0.0.1"
port = 3128

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((dest_name, port))

# First Range request (5 bytes of the object).
s.send("""GET /test.img HTTP/1.1
Range: bytes=18-22

""")
data = s.recv(1024)
print data

if not REPRODUCE:
    # Tear down the connection and open a new one, so the second
    # request does not reuse the persistent connection.
    s.shutdown(1)
    s.close()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((dest_name, port))

# Second Range request (11 bytes of the object).
s.send("""GET /test.img HTTP/1.1
Range: bytes=50-60

""")

data = s.recv(1024)
print data
s.shutdown(1)
s.close()

The first response will come back quickly, and then varnishd will stall:

HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:26:11 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 2
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 18-22/5242880000
Content-Length: 5
Connection: keep-alive

[5 bytes of binary payload]

After a while, the second response will arrive. Note the value of Age.

HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:26:11 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 4 3
Age: 18
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 50-60/5242880000
Content-Length: 11
Connection: keep-alive

"JÚQ b�

Now stop varnishd, start it again, set REPRODUCE = False in the Python script, and run it again. Varnish should not stall, and both responses should come back immediately. Again, note the value of Age.

HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:28:44 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 2
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 18-22/5242880000
Content-Length: 5
Connection: keep-alive

[5 bytes of binary payload]
HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:28:44 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 32770 3
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 50-60/5242880000
Content-Length: 11
Connection: keep-alive

"JÚQ b�

Context

The issue was discovered while testing Range request handling with two Varnish instances, the first one using -s malloc and the second one using file storage. We observed that modifying the VCL of the first instance (the one being hit by clients) to avoid HTTP persistent connections to its backend makes the problem unreproducible:

sub vcl_backend_fetch {
    set bereq.http.Connection = "close";
}

Note, however, that the issue can be reproduced with a single varnishd as described above.

Your Environment

  • Version used: 4.1.3
  • Operating System: Debian
@mbgrydeland mbgrydeland self-assigned this Nov 21, 2016
@mbgrydeland mbgrydeland added a commit to mbgrydeland/varnish-cache that referenced this issue Nov 24, 2016

Finish range delivery processing when the range is delivered.

When doing range delivery, make the delivery processing finish early
when all of the bytes of the requested range has been delivered. This
to avoid waiting around for a slow fetch to finish before handling the
next request on the connection.

Fixes: #2035
fb6c54f

@mbgrydeland mbgrydeland added a commit that referenced this issue Nov 28, 2016 (ce079ac, same message)
@mbgrydeland mbgrydeland added a commit to mbgrydeland/varnish-cache that referenced this issue Nov 29, 2016 (12b7182, same message)
@mbgrydeland mbgrydeland added a commit that closed this issue Dec 5, 2016 (333e6da, same message)
hermunn (Contributor) commented Dec 8, 2016

Backport review: Already in 4.1.4 (ce079ac)
