New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

varnishd stalls with two consecutive Range requests using HTTP persistent connections #2035

Closed
ema opened this Issue Aug 8, 2016 · 1 comment

Comments

Projects
None yet
5 participants
@ema

ema commented Aug 8, 2016

While testing Varnish handling of Range requests we've discovered that varnishd seems to block when serving two Range requests over a HTTP persistent connection and saving a large object to file storage at the same time.

Expected Behavior

Varnish should serve Range requests for a small number of bytes fairly quickly, even if at the same time the object received from the backend is being written to disk.

Current Behavior

Instead of quickly serving two successive Range requests for a fairly small number of bytes, varnishd seems to block and serves the second request only after a long time.

Avoiding HTTP persistent connections makes the problem unreproducible.

Steps to Reproduce (for bugs)

Start Varnish with stock VCL and a 10G file storage backend. Also make sure a web server is running on port 80 and serving a large file under /test.img. We've used a 5G file in our test environment.

sudo varnishd -a :3128 -b localhost:80 -F -n backend -s main1=file,/var/tmp/varnish.main1,10G

Once varnishd is ready to serve requests, run the following script:

#!/usr/bin/env python

import socket

REPRODUCE = True

dest_name = "127.0.0.1"
port = 3128

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((dest_name, port))

s.send("""GET /test.img HTTP/1.1
Range: bytes=18-22

""")
data = (s.recv(1024))
print data

if not REPRODUCE:
    s.shutdown(1)
    s.close()
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((dest_name, port))

s.send("""GET /test.img HTTP/1.1
Range: bytes=50-60

""")

data = (s.recv(1024))
print data
s.shutdown(1)
s.close()

The first response will come back quickly and then varnish will stall:

HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:26:11 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 2
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 18-22/5242880000
Content-Length: 5
Connection: keep-alive

{`̇

After a while, the second response will arrive. Note the value of Age.

HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:26:11 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 4 3
Age: 18
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 50-60/5242880000
Content-Length: 11
Connection: keep-alive

"JÚQ b�

Now stop varnishd, start it again, set REPRODUCE = False in the python script and run it again. Varnish should not stall, and the two responses should come back immediately. Again, notice the value of Age.

HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:28:44 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 2
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 18-22/5242880000
Content-Length: 5
Connection: keep-alive

{`̇
HTTP/1.1 206 Partial Content
Date: Fri, 05 Aug 2016 17:28:44 GMT
Server: Apache/2.4.23 (Debian)
Last-Modified: Thu, 04 Aug 2016 18:38:44 GMT
ETag: "138800000-5394342740f66"
X-Varnish: 32770 3
Age: 0
Via: 1.1 varnish-v4
Accept-Ranges: bytes
Content-Range: bytes 50-60/5242880000
Content-Length: 11
Connection: keep-alive

"JÚQ b�

Context

The issue was discovered while testing Range requests handling with two varnishes, the first one with -s malloc and the second one using file as the storage backend. We observed that modifying the VCL of the first varnish instance being hit to avoid using HTTP persistent connections makes the problem unreproducbile:

sub vcl_backend_fetch {
    set bereq.http.Connection = "close";
}

Please note however that the issue can be reproduced with one single varnishd as described above.

Your Environment

  • Version used: 4.1.3
  • Operating System: Debian

@mbgrydeland mbgrydeland self-assigned this Nov 21, 2016

mbgrydeland added a commit to mbgrydeland/varnish-cache that referenced this issue Nov 24, 2016

Finish range delivery processing when the range is delivered.
When doing range delivery, make the delivery processing finish early
when all of the bytes of the requested range has been delivered. This
to avoid waiting around for a slow fetch to finish before handling the
next request on the connection.

Fixes: varnishcache#2035

mbgrydeland added a commit that referenced this issue Nov 28, 2016

Finish range delivery processing when the range is delivered.
When doing range delivery, make the delivery processing finish early
when all of the bytes of the requested range has been delivered. This
to avoid waiting around for a slow fetch to finish before handling the
next request on the connection.

Fixes: #2035

mbgrydeland added a commit to mbgrydeland/varnish-cache that referenced this issue Nov 29, 2016

Finish range delivery processing when the range is delivered.
When doing range delivery, make the delivery processing finish early
when all of the bytes of the requested range has been delivered. This
to avoid waiting around for a slow fetch to finish before handling the
next request on the connection.

Fixes: varnishcache#2035
@hermunn

This comment has been minimized.

Show comment
Hide comment
@hermunn

hermunn Dec 8, 2016

Contributor

Backport review: Already in 4.1.4 (ce079ac)

Contributor

hermunn commented Dec 8, 2016

Backport review: Already in 4.1.4 (ce079ac)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment