
Memory/Resource leak #357

Closed
fschulze opened this issue Dec 9, 2021 · 14 comments · Fixed by #358
Comments

@fschulze

fschulze commented Dec 9, 2021

With devpi-server I can now confirm a memory leak somehow related to waitress. I observed it on Debian and FreeBSD, but not on macOS. The last test I did was replacing waitress with cheroot, and with that the leak is completely gone. I'm currently working on creating a minimal example for reproduction, but wanted to give a heads-up in case someone else noticed something and started investigating. It is certainly possible it is related to how we use waitress in devpi-server, but I don't think we do anything special. I don't want to switch to cheroot, because it lacks some features and seems to be less performant.

@fschulze
Author

fschulze commented Dec 9, 2021

While trying to reproduce this, I realized it is not a leak, but it seems like waitress isn't cleaning up right after serving a request. If I reduce the number of threads, I can clearly see an upper limit on memory usage. If I follow requests with large response bodies with requests that have small bodies, the memory usage goes down. So depending on the usage pattern, the memory usage can get very high if there were a bunch of large responses. Is there a way to fix this from my side, or does this need to be fixed in waitress?

@slav0nic

Pay attention to devpi_server.readonly.DictViewReadonly: as far as I can see, it leaks on every new request like /root/pypi/+simple/<pkgname>/ (e.g. tothemoon), on both the waitress and cheroot servers. It looks like cheroot.wsgi is more memory efficient in this case (and it seems to work faster, but I didn't dive deeper), but that doesn't really get rid of the problem at your app level.

@fschulze
Author

@slav0nic using devpi-server directly for this leads down many dead ends, of which DictViewReadonly is unfortunately one. There is an internal cache, but it has a size limit and eventually flattens out. I'm very sorry to have wasted your time there. The key insight is that, according to sys._debugmallocstats(), all memory used by Python is accounted for, including the cache above etc., but the virtual and resident memory of the process is much higher than what sys._debugmallocstats() reports.

I tried to create a minimal example, but so far I can only partially replicate the issue. I'm still missing some key interaction between what devpi-server does and what waitress does.
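For reference, a rough sketch of that kind of comparison, assuming Linux and /proc/self/status; the helper name is made up and the exact checks used in devpi-server aren't shown in this thread:

import sys

def dump_memory_views():
    # CPython-internal view: pymalloc arenas/pools/blocks (printed to stderr).
    sys._debugmallocstats()
    # Kernel view: virtual size and resident set size of the whole process.
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                print(line.strip())

if __name__ == "__main__":
    dump_memory_views()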

@digitalresistor
Member

Waitress buffers the output; the data is stored in overflowable buffers that may write to disk once they reach a certain high-water mark. Those buffers are then cycled out as they are drained.

The OverflowableBuffer is here: https://github.com/Pylons/waitress/blob/master/src/waitress/buffers.py#L190

A task (WSGI thread) will call write_soon on a channel:

def write_soon(self, data):

write_soon checks that we are not above outbuf_high_watermark; if we are, we block the WSGI task from adding more data to the output buffers.

To avoid growing the output buffer without bound (this was an issue because multiple tasks may re-use the same output buffer when sending data back to the same connection), we check whether the output buffer itself has reached a high-water mark, and if so we create a new one:

if self.current_outbuf_count >= self.adj.outbuf_high_watermark:

Then we just append the data to the existing buffer:

self.outbufs[-1].append(data)

Once the data is appended to the buffer at the back of the list, it goes through _flush_some, which does the actual sending to the TCP/IP socket once that socket is marked writable.

def _flush_some(self):

That happens here.

We will send data in chunks:

while outbuflen > 0:
in a tight loop, until send fails because the socket would block (and thus returns 0 for the number of bytes sent). When a buffer is empty and it is not the last one, it is popped from the list:

if len(self.outbufs) > 1:

If it is the last one, it sticks around, but the task thread will eventually recycle it if too much data has been written to it. This avoids the issue we had whereby buffers were growing unbounded.

We call .close() on the buffer when we drop it.

Append is fairly simple: https://github.com/Pylons/waitress/blob/master/src/waitress/buffers.py#L244

There's one issue I do see. I am not sure how easily I can test whether fixing it resolves this, so let me open a PR real quick; if you could give that a whirl, I would appreciate it.
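To make the flow above easier to follow, here is a condensed sketch of it; this is not waitress's actual channel code, just an illustration with made-up names and a made-up watermark value:

class ChannelSketch:
    def __init__(self, outbuf_high_watermark=16 * 1024 * 1024):
        self.outbuf_high_watermark = outbuf_high_watermark
        self.outbufs = [bytearray()]  # stand-in for OverflowableBuffer
        self.current_outbuf_count = 0

    def write_soon(self, data):
        # Rotate to a fresh buffer so a single outbuf does not grow unbounded
        # when several tasks re-use the same connection.
        if self.current_outbuf_count >= self.outbuf_high_watermark:
            self.outbufs.append(bytearray())
            self.current_outbuf_count = 0
        self.outbufs[-1] += data
        self.current_outbuf_count += len(data)
        return len(data)

    def _flush_some(self, send):
        # `send` stands in for socket.send(); assume it returns 0 when the
        # socket would block.
        while self.outbufs:
            buf = self.outbufs[0]
            while buf:
                sent = send(bytes(buf))
                if sent == 0:
                    return False  # socket not writable right now
                del buf[:sent]
            if len(self.outbufs) > 1:
                self.outbufs.pop(0)  # drop drained buffers...
            else:
                return True  # ...but keep the last one around for re-use
        return True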

@digitalresistor
Member

@fschulze would you mind giving this change a test run: #358

It's the only issue I can see: when we overflow from BytesIO to a file, we may end up keeping the BytesIO around without calling .close() on it. Eventually I would expect Python to clean that memory up, because the object goes out of scope...
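For context, a minimal sketch of the overflow step being discussed; this is a simplification with made-up names and thresholds, not the actual waitress.buffers code:

import tempfile
from io import BytesIO

class OverflowSketch:
    def __init__(self, overflow=1024 * 1024):
        self.overflow = overflow
        self.buf = BytesIO()
        self.overflowed = False

    def append(self, data):
        self.buf.write(data)
        if not self.overflowed and self.buf.tell() > self.overflow:
            self._to_tempfile()

    def _to_tempfile(self):
        # Spill the in-memory buffer to a temporary file once it gets large.
        spooled = tempfile.TemporaryFile()
        spooled.write(self.buf.getvalue())
        old = self.buf
        self.buf = spooled
        self.overflowed = True
        # The explicit close() is the point of #358: without it, the large
        # BytesIO lingers until the garbage collector reclaims it.
        old.close()

    def close(self):
        self.buf.close()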

@digitalresistor
Member

> Pay attention to devpi_server.readonly.DictViewReadonly: as far as I can see, it leaks on every new request like /root/pypi/+simple/<pkgname>/ (e.g. tothemoon), on both the waitress and cheroot servers. It looks like cheroot.wsgi is more memory efficient in this case (and it seems to work faster, but I didn't dive deeper), but that doesn't really get rid of the problem at your app level.

cheroot does not buffer outgoing responses, and each connection is directly tied to a WSGI thread. This means that if a client is slow to read, you end up holding up that thread for as long as it takes them to read the response. (my cheroot experience is a couple of years out of date now, so it may not be accurate anymore)

Waitress attempts to avoid that by buffering up to a certain amount of the response, so that the WSGI thread can start working on another request sooner.

@fschulze
Author

@bertjwregeer that seems to be it. Even with the default 50 threads, the memory doesn't grow above 7GB VIRT and 5GB RES in my initial testing. I'll let it run for a few days and report back.

@fschulze
Author

@bertjwregeer memory usage has been stable for the past 20 days

@digitalresistor
Member

@fschulze I have released a new beta version: https://pypi.org/project/waitress/2.1.0b0/

I would absolutely love your help in testing/validating the various changes before I ship them!

@fschulze
Author

fschulze commented Feb 7, 2022

I have updated m.devpi.net and another instance and will let you know if anything comes up or is stable after a few days.

@fschulze
Author

@bertjwregeer so far nothing seems broken for me and memory usage is stable (though slightly higher, but that is likely unrelated).

@digitalresistor
Member

@fschulze thank you so much for testing! I'll likely release it sometime this weekend!

@fschulze
Author

It seems like the leak isn't gone completely. It is much much better than before, but very slowly it still leaks resident memory that is not accounted for by arena allocations of Python. I will investigate further.

This shouldn't hold up a release though.

@digitalresistor
Member

@fschulze I look forward to your investigation; in the meantime I have released 2.1.0: https://pypi.org/project/waitress/2.1.0/

kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Mar 7, 2022
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Mar 8, 2022
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Mar 8, 2022
kraj pushed a commit to YoeDistro/meta-openembedded that referenced this issue Mar 9, 2022
halstead pushed a commit to openembedded/meta-openembedded that referenced this issue Mar 9, 2022
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Jan 28, 2024
2.1.2
-----

Bugfix
~~~~~~

- When expose_tracebacks is enabled, waitress would fail to properly encode
  unicode, thereby causing another error during error handling. See
  Pylons/waitress#378

- Header length checking had a calculation that was done incorrectly when the
  data was received across multiple socket reads. This calculation has been
  corrected, and Waitress will no longer send back a 413 Request Entity Too
  Large. See Pylons/waitress#376

Security Bugfix
~~~~~~~~~~~~~~~

- In 2.1.0 a new feature was introduced that allowed the WSGI thread to start
  sending data to the socket. However, this introduced a race condition whereby
  a socket may be closed in the sending thread while the main thread is about
  to call select(), thereby causing the entire application to be taken down.
  Waitress will no longer close the socket in the WSGI thread, instead waking
  up the main thread to clean up. See Pylons/waitress#377

2.1.1
-----

Security Bugfix
~~~~~~~~~~~~~~~

- Waitress now validates that chunked encoding extensions are valid and don't
  contain characters that are not allowed. They are still skipped/not
  processed, but if they contain invalid data we no longer continue; instead
  we return a 400 Bad Request. This stops potential HTTP desync/HTTP request
  smuggling. Thanks to Zhang Zeyu for reporting this issue. See
  GHSA-4f7p-27jc-3c36

- Waitress now validates that the chunk length is only valid hex digits when
  parsing chunked encoding, and values such as ``0x01`` and ``+01`` are no
  longer supported. This stops potential HTTP desync/HTTP request smuggling.
  Thanks to Zhang Zeyu for reporting this issue. See
  GHSA-4f7p-27jc-3c36

- Waitress now validates that the Content-Length sent by a remote contains only
  digits in accordance with RFC7230 and will return a 400 Bad Request when the
  Content-Length header contains invalid data, such as ``+10``, which would
  previously get parsed as ``10`` and accepted. This stops potential HTTP
  desync/HTTP request smuggling. Thanks to Zhang Zeyu for reporting this issue.
  See GHSA-4f7p-27jc-3c36

2.1.0
-----

Python Version Support
~~~~~~~~~~~~~~~~~~~~~~

- Python 3.6 is no longer supported by Waitress

- Python 3.10 is fully supported by Waitress

Bugfix
~~~~~~

- ``wsgi.file_wrapper`` now sets the ``seekable``, ``seek``, and ``tell``
  attributes from the underlying file if the underlying file is seekable. This
  allows WSGI middleware to implement things like range requests, for example.

  See Pylons/waitress#359 and
  Pylons/waitress#363

- In Python 3 ``OSError`` is no longer subscriptable; this caused failures on
  Windows when attempting to loop to find a socket that would work for use in
  the trigger.

  See Pylons/waitress#361

- Fixed an issue whereby ``BytesIO`` objects were not properly closed, and
  thereby would not get cleaned up until garbage collection got around to it.

  This led to potential for random memory spikes/memory issues, see
  Pylons/waitress#358 and
  Pylons/waitress#357 .

  With thanks to Florian Schulze for testing/validating this fix!

Features
~~~~~~~~

- When the WSGI app starts sending data to the output buffer, we now attempt to
  send data directly to the socket. This avoids needing to wake up the main
  thread to start sending data, allowing faster transmission of the first byte.
  See Pylons/waitress#364

  With thanks to Michael Merickel for being a great rubber ducky!

- Add REQUEST_URI to the WSGI environment.

  REQUEST_URI is similar to ``request_uri`` in nginx. It is a string that
  contains the request path before separating the query string and
  decoding ``%``-escaped characters.
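As an illustration of the REQUEST_URI addition above, here is a hedged sketch of what a WSGI app would see; the request line and the exact environ values are assumptions based on the changelog wording, not taken from the waitress source:

# Hypothetical request:  GET /root/pypi/+simple/foo%20bar/?refresh=1 HTTP/1.1
#
#   environ["REQUEST_URI"]  -> "/root/pypi/+simple/foo%20bar/?refresh=1"  (raw, undecoded, with query string)
#   environ["PATH_INFO"]    -> "/root/pypi/+simple/foo bar/"              (decoded, no query string)
#   environ["QUERY_STRING"] -> "refresh=1"
def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ.get("REQUEST_URI", "").encode("latin-1")]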
daregit pushed a commit to daregit/yocto-combined that referenced this issue May 22, 2024
daregit pushed a commit to daregit/yocto-combined that referenced this issue May 22, 2024