Description
After some debugging of slow 206 responses from Flask's send_file, I narrowed the issue down to an interface mismatch between Waitress and werkzeug.
Deep down, werkzeug expects wsgi.file_wrapper to implement seekable() (and seek() and tell()) in order to serve 206 responses efficiently.
But Waitress provides no such methods in its ReadOnlyFileBasedBuffer, even when the underlying file-like (a regular open file on disk) has them:
waitress/src/waitress/buffers.py
Lines 141 to 187 in f41e598
```python
class ReadOnlyFileBasedBuffer(FileBasedBuffer):
    # used as wsgi.file_wrapper

    def __init__(self, file, block_size=32768):
        self.file = file
        self.block_size = block_size  # for __iter__

    def prepare(self, size=None):
        if _is_seekable(self.file):
            start_pos = self.file.tell()
            self.file.seek(0, 2)
            end_pos = self.file.tell()
            self.file.seek(start_pos)
            fsize = end_pos - start_pos
            if size is None:
                self.remain = fsize
            else:
                self.remain = min(fsize, size)
        return self.remain

    def get(self, numbytes=-1, skip=False):
        # never read more than self.remain (it can be user-specified)
        if numbytes == -1 or numbytes > self.remain:
            numbytes = self.remain
        file = self.file
        if not skip:
            read_pos = file.tell()
        res = file.read(numbytes)
        if skip:
            self.remain -= len(res)
        else:
            file.seek(read_pos)
        return res

    def __iter__(self):  # called by task if self.filelike has no seek/tell
        return self

    def next(self):
        val = self.file.read(self.block_size)
        if not val:
            raise StopIteration
        return val

    __next__ = next  # py3

    def append(self, s):
        raise NotImplementedError
```
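To make the mismatch concrete, here is a minimal sketch of the two interfaces. `ReadOnlyWrapper`, `SeekableWrapper`, and `can_seek` are hypothetical names used only for illustration; they are not Waitress or werkzeug code, and the duck-typed check is only an assumption about the kind of test a consumer might perform.

```python
import io


class ReadOnlyWrapper:
    """Hypothetical stand-in for a file wrapper that only supports iteration."""

    def __init__(self, file, block_size=32768):
        self.file = file
        self.block_size = block_size

    def __iter__(self):
        return self

    def __next__(self):
        val = self.file.read(self.block_size)
        if not val:
            raise StopIteration
        return val


class SeekableWrapper(ReadOnlyWrapper):
    """Same wrapper, but delegating seekable()/seek()/tell() to the file."""

    def seekable(self):
        return hasattr(self.file, "seekable") and self.file.seekable()

    def seek(self, *args):
        return self.file.seek(*args)

    def tell(self):
        return self.file.tell()


def can_seek(wrapper):
    """Duck-typed check (assumed shape): is seekable() present and true?"""
    return callable(getattr(wrapper, "seekable", None)) and wrapper.seekable()


plain = ReadOnlyWrapper(io.BytesIO(b"x" * 10))
seekable = SeekableWrapper(io.BytesIO(b"x" * 10))
```

With these definitions, `can_seek(plain)` is false even though the underlying `BytesIO` is seekable, which mirrors the situation described here: the capability exists on the file but is hidden by the wrapper.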
As a result, werkzeug falls back to a code path without seek(): it keeps reading blocks from the beginning of the file until it reaches the start offset of the range. Yuck.
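The cost difference between the two strategies can be sketched roughly as follows. This is a simplified illustration, not werkzeug's actual implementation; `read_range_sequential` and `read_range_seek` are hypothetical helpers, and an in-memory 1 MiB buffer stands in for a large file on disk.

```python
import io


def read_range_sequential(fileobj, start, length, block_size=32768):
    """Reach `start` by reading and discarding blocks, without seek()."""
    bytes_read = 0
    remaining_skip = start
    while remaining_skip > 0:
        chunk = fileobj.read(min(block_size, remaining_skip))
        if not chunk:  # file is shorter than `start`
            break
        bytes_read += len(chunk)
        remaining_skip -= len(chunk)
    data = fileobj.read(length)
    bytes_read += len(data)
    return data, bytes_read


def read_range_seek(fileobj, start, length):
    """Jump straight to `start` and read only the requested bytes."""
    fileobj.seek(start)
    data = fileobj.read(length)
    return data, len(data)


payload = bytes(range(256)) * 4096  # 1 MiB of sample data
tail_start = len(payload) - 2048    # request the last 2 KB

seq_data, seq_cost = read_range_sequential(io.BytesIO(payload), tail_start, 2048)
seek_data, seek_cost = read_range_seek(io.BytesIO(payload), tail_start, 2048)
```

Both calls return the same 2 KB, but the sequential variant reads every byte of the buffer to get there, while the seek variant reads only the requested range. The gap grows linearly with the range start offset.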
Motivation: We often request 206 ranges from the end of huge files, such as ZIP archives where the "master ZIP record" lives at the end of the file. But because of the problem above, the app reads the entire multi-gigabyte ZIP file just to return the last 2 KB.
I'm not sure if this is an issue with werkzeug or waitress, but the performance is so bad that this is a show-stopper for us. Thanks.