Super slow 206 range requests #359

@piskvorky

After some debugging of slow 206 responses from Flask's send_file, I narrowed the issue down to an interface mismatch between Waitress and werkzeug.

Deep down, werkzeug expects wsgi.file_wrapper to implement seekable() (and seek() and tell()) in order to serve 206 responses efficiently:

https://github.com/pallets/werkzeug/blob/347291802fcf89bc89660cc9dc62eb1303337bc2/src/werkzeug/wsgi.py#L576-L577

But Waitress provides no such methods in its ReadOnlyFileBasedBuffer, even when the underlying file-like (a regular open file on disk) has them:

class ReadOnlyFileBasedBuffer(FileBasedBuffer):
    # used as wsgi.file_wrapper

    def __init__(self, file, block_size=32768):
        self.file = file
        self.block_size = block_size  # for __iter__

    def prepare(self, size=None):
        if _is_seekable(self.file):
            start_pos = self.file.tell()
            self.file.seek(0, 2)
            end_pos = self.file.tell()
            self.file.seek(start_pos)
            fsize = end_pos - start_pos
            if size is None:
                self.remain = fsize
            else:
                self.remain = min(fsize, size)
        return self.remain

    def get(self, numbytes=-1, skip=False):
        # never read more than self.remain (it can be user-specified)
        if numbytes == -1 or numbytes > self.remain:
            numbytes = self.remain
        file = self.file
        if not skip:
            read_pos = file.tell()
        res = file.read(numbytes)
        if skip:
            self.remain -= len(res)
        else:
            file.seek(read_pos)
        return res

    def __iter__(self):  # called by task if self.filelike has no seek/tell
        return self

    def next(self):
        val = self.file.read(self.block_size)
        if not val:
            raise StopIteration
        return val

    __next__ = next  # py3

    def append(self, s):
        raise NotImplementedError
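
For illustration, here is a minimal sketch of what a file wrapper exposing the three methods could look like. The class name SeekableFileWrapper is hypothetical and this is not Waitress's actual code or a proposed patch; it only assumes that delegating seekable(), seek() and tell() to the underlying file object is enough for a caller to jump straight to an offset.

```python
import io


class SeekableFileWrapper:
    """Hypothetical wsgi.file_wrapper-style object (not Waitress's class)."""

    def __init__(self, file, block_size=32768):
        self.file = file
        self.block_size = block_size

    def seekable(self):
        # Delegate when the underlying object can answer; otherwise report False.
        probe = getattr(self.file, "seekable", None)
        return bool(probe()) if callable(probe) else False

    def seek(self, offset, whence=io.SEEK_SET):
        return self.file.seek(offset, whence)

    def tell(self):
        return self.file.tell()

    def __iter__(self):
        return self

    def __next__(self):
        # Yield fixed-size blocks starting from the current file position.
        chunk = self.file.read(self.block_size)
        if not chunk:
            raise StopIteration
        return chunk


# Usage: jump to offset 6, then iterate over the rest in 4-byte blocks.
wrapper = SeekableFileWrapper(io.BytesIO(b"0123456789"), block_size=4)
wrapper.seek(6)
tail = b"".join(wrapper)
```

With an object like this, a consumer that checks seekable() first can position the file at the range start before iterating, rather than iterating from byte 0.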

As a result, werkzeug uses a code path without seek(), where it keeps reading blocks from the beginning of the file until it reaches the range start offset… yuck:

https://github.com/pallets/werkzeug/blob/347291802fcf89bc89660cc9dc62eb1303337bc2/src/werkzeug/wsgi.py#L595-L604
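
To make the cost difference concrete, here is a rough, self-contained model of the two strategies. It is a simplification and not werkzeug's code: CountingFile, read_range_by_skipping and read_range_by_seeking are made-up names, and a 1 MiB in-memory buffer stands in for the multi-gigabyte file.

```python
import io


class CountingFile(io.BytesIO):
    """In-memory file that records how many bytes were read from it."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def read_range_by_skipping(f, start, length, block_size=8192):
    """Reach `start` by reading and discarding blocks (the slow path)."""
    remaining = start
    while remaining > 0:
        chunk = f.read(min(block_size, remaining))
        if not chunk:
            break  # file is shorter than `start`
        remaining -= len(chunk)
    return f.read(length)


def read_range_by_seeking(f, start, length):
    """Jump straight to `start` (the fast path)."""
    f.seek(start)
    return f.read(length)


data = bytes(range(256)) * 4096  # 1 MiB of sample data
start, length = len(data) - 2048, 2048  # the last 2 KB

slow = CountingFile(data)
slow_result = read_range_by_skipping(slow, start, length)

fast = CountingFile(data)
fast_result = read_range_by_seeking(fast, start, length)

print(slow.bytes_read, fast.bytes_read)
```

Both functions return the same 2 KB, but in this model the skipping version reads the whole buffer to get there, while the seeking version reads only the requested range.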

Motivation: We often request 206 ranges from the end of huge files, such as ZIP archives where the "master ZIP record" lives at the end of the file. But because of the problem above, the app reads the entire multi-gigabyte ZIP file just to return the last 2 KB.

I'm not sure if this is an issue with werkzeug or waitress, but the performance is so bad that this is a show-stopper for us. Thanks.
