Memory leak while running TCP/UDPServer with socketserver.ThreadingMixIn #81374
UDP/TCP servers that use the socketserver.ThreadingMixIn class (including the ThreadingTCPServer and ThreadingUDPServer classes) seem to leak memory while the server is running. https://docs.python.org/3/library/socketserver.html#socketserver.ThreadingMixIn The code I wrote to check this is the following.
(I based it on https://docs.python.org/3/library/socketserver.html#asynchronous-mixins) Then I checked memory usage with a profiling tool.
$ mprof run python mycode.py
$ mprof plot
I attached the resulting plot. I also ran the check for a longer time and found that memory usage increased endlessly. My environment is Hardware: MacBook Pro (15-inch, 2018). I guess this is caused by thread objects not being released even after a thread has finished processing its request, so thread objects keep accumulating until server_close() is called. |
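The reporter's script is not included above. As a stand-in, here is a minimal sketch of a reproduction, assuming a trivial echo handler. The names `EchoHandler` and `run_requests` are hypothetical, and `_threads` is a private CPython attribute whose type and presence vary between versions, so the count it reports is only indicative.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: echo one message back and return."""

    def handle(self):
        data = self.request.recv(1024)
        self.request.sendall(data)


def run_requests(count):
    """Serve `count` short requests and report what the server still tracks."""
    # Port 0 asks the OS for any free port.
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler) as server:
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        replies = 0
        for _ in range(count):
            with socket.create_connection(server.server_address) as conn:
                conn.sendall(b"ping")
                if conn.recv(1024) == b"ping":
                    replies += 1
        # _threads is a private implementation detail (an assumption here);
        # it may be missing or may not support len() on some versions.
        tracked = getattr(server, "_threads", None)
        try:
            tracked_count = len(tracked)
        except TypeError:
            tracked_count = None
        server.shutdown()
    return replies, tracked_count


if __name__ == "__main__":
    print(run_requests(100))
```

On an affected version one would expect the tracked count to grow with the number of requests, as described in the report; on a version where finished threads are pruned it should stay small.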
I got the same problem when using the ThreadingTCPServer. I think adding |
Looking at the code, this would be caused by bpo-31233. I expect 3.7+ is affected. 3.6 has similar code, but the leaking looks to be disabled by default. 2.7 doesn't collect a "_threads" list at all. Looks like Victor was aware of the leak when he changed the code: <https://bugs.python.org/issue31233#msg304619>, but maybe he pushed the code and then forgot about the problem. A possible problem with Norihiro's solution is modifying the "_threads" list from multiple threads without any synchronization. (Not sure if that is a problem, or is it guaranteed to be okay due to the GIL etc?) Also, since the thread is removing itself from the list, it will still run a short while after the removal, so there is a window when the "server_close" method will not wait for that thread. It might also defeat the "dangling thread" accounting that I believe was Victor's motivation for his change. Wei's proposal is to check for cleaning up when a new request is handled. That relies on a new request coming in to free up memory. Perhaps we could use a similar strategy to the Forking mixin, which I believe cleans up expired children periodically, without relying on a new request. |
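To make the "clean up when a new request is handled" idea concrete, here is a minimal sketch of a thread list that drops finished threads each time a new one is appended. The class name `ReapingThreadList` and its `join_all` method are hypothetical and are not taken from any of the patches discussed here. The lock is an assumption added to address the synchronization concern raised above.

```python
import threading


class ReapingThreadList(list):
    """Hypothetical thread list that forgets finished threads on append."""

    def __init__(self):
        super().__init__()
        self._lock = threading.Lock()

    def append(self, thread):
        with self._lock:
            # Keep only threads that are still running, then track the new one.
            self[:] = [t for t in self if t.is_alive()]
            super().append(thread)

    def join_all(self):
        # Take a snapshot under the lock, then join outside it so that
        # handler threads are never blocked on the lock while we wait.
        with self._lock:
            pending, self[:] = list(self), []
        for t in pending:
            t.join()


if __name__ == "__main__":
    threads = ReapingThreadList()
    for _ in range(1000):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
        threads.append(t)
    print(len(threads))
    threads.join_all()
```

As noted in the comment above, this approach only frees memory when another request arrives, so an idle server would keep its last batch of finished threads until `join_all()` is called.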
PR 13893 with an additional lock sounds like a reasonable solution. The code should be skipped if the thread is a daemon thread. |
Martin Panter: In addition to PR 13893 change, what do you think of also using a weakref? It might make the code even more reliable if something goes wrong. |
I marked bpo-37389 as a duplicate of this issue: """ After putting a basic ThreadingUDPServer under load (500 messages per second) I noticed that after a night it was consuming a lot of RAM, given it does nothing with the data. On inspection, I noticed the _thread count inside the server was growing forever even though the sub-threads are done. Set up a basic ThreadingUDPServer with a handler that does nothing and check the request_queue_size; it seems to grow without limit. msg346411 - (view) Author: Daniel W Forsyth (danf@dataforge.on.ca) Date: 2019-06-24 14:59 The only way I could figure out to control it was to do this in a thread;
    # Iterate over a copy so that removing entries does not skip elements.
    for thread in list(server._threads):  # type: Thread
        if not thread.is_alive():
            server._threads.remove(thread)
Shouldn't the server process do this when the thread is done? """ |
This issue was also reported in the Prometheus client, where the workaround was to use daemon threads. |
Another workaround might be to set the new "block_on_close" flag (bpo-33540) to False on the server subclass or instance. Victor: Replying to <https://bugs.python.org/issue37193#msg345817> "What do I think of also using a weakref?", I assume you mean maintaining "_threads" as a WeakSet rather than a list object. That seems a nice way to solve the problem, but it seems redundant to me if other code such as Maru's proposal was also added to clean up the list. |
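The two workarounds mentioned so far (daemon threads and the "block_on_close" flag) can be sketched as class attributes on a server subclass. The subclass name `NonTrackingTCPServer` is hypothetical. `daemon_threads` and `block_on_close` are documented `ThreadingMixIn` attributes, the latter available from Python 3.7. Setting either one is enough according to the comments above; both are shown here only for illustration.

```python
import socketserver


class NonTrackingTCPServer(socketserver.ThreadingTCPServer):
    """Hypothetical subclass applying the workarounds from this thread."""

    # Handler threads become daemon threads, so the interpreter does not
    # wait for them at exit.
    daemon_threads = True

    # server_close() no longer waits for handler threads to finish.
    block_on_close = False


if __name__ == "__main__":
    # Port 0 asks the OS for any free port; BaseRequestHandler does nothing.
    with NonTrackingTCPServer(("127.0.0.1", 0), socketserver.BaseRequestHandler) as server:
        print(server.daemon_threads, server.block_on_close)
```

The trade-off is that `server_close()` will not wait for in-flight handlers, so any cleanup those handlers perform may be cut short at shutdown.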
FTR I have been trialling a patched Python 3.7 based on Maru's changes (revision 6ac217c) + review suggestions, and it has reduced the size of the leak (unpatched, it hit 1 GB over a couple of days; patched, only a 60 MB increase over three days). The remaining leak could be explained by bpo-37788. |
I note this is marked as a 3.7 regression and still open. Since the cutoff for the final 3.7 bugfix mode release is in a few days, I'm assuming this means that 3.7 users will have to live with this regression. If you feel that is a problem, speak up now. |
Thanks for the notice Ned. I've revived the PR and addressed all the comments from Victor. Any chance this can get into Python 3.7? |
Perhaps but there's a lot that needs to be done yet. Like any bugfix, it needs to be reviewed, merged to master, and get some buildbot exposure first before it is backported anywhere. |
Commit c415590 has introduced reference leaks. Example buildbot failure: https://buildbot.python.org/all/#/builders/562/builds/79/steps/5/logs/stdio As there is a release of 3.10 alpha 2 tomorrow, it would be great if this could be fixed by tomorrow. |
The change fixing a leak in socketserver introduces a leak in socketserver :-)
$ ./python -m test test_socketserver -u all -m test.test_socketserver.SocketServerTest.test_ThreadingTCPServer -R 3:3
0:00:00 load avg: 0.95 Run tests sequentially
0:00:00 load avg: 0.95 [1/1] test_socketserver
beginning 6 repetitions
123456
......
test_socketserver leaked [3, 3, 3] references, sum=9
test_socketserver leaked [3, 3, 3] memory blocks, sum=9
test_socketserver failed
== Tests result: FAILURE ==
1 test failed:
Total duration: 497 ms |
I rejected the backport to 3.8 and 3.9 since the change causes a regression on master. |
I recommend a rollback. I’ll try to get to it later today. |
I filed bpo-42263 to capture the underlying cause of the memory leak that led to the buildbot failures and the rollback. |
Thank you for fixing the regression Jason R. Coombs ;-) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.