bpo-37193: remove thread objects which finished process its request #13893

Merged on Nov 1, 2020 (18 commits)


maru-n
Contributor

@maru-n maru-n commented Jun 7, 2019

In the ThreadingMixIn class, the server keeps its request threads in a list (self._threads) so that it can join them when the server is closed.
This list also retains thread objects that have already finished processing their requests, so it grows without bound.
As a result, memory usage keeps increasing while the server runs, until server_close() is called, even when every thread finishes its request promptly and the machine has enough resources to handle all requests.
This PR removes thread objects from the list once they have finished, to avoid that growth.
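
The growth described above can be reproduced with a toy model. This is only a sketch: `TrackingServer` and its methods are hypothetical names that imitate the bookkeeping described here, not the socketserver code itself.

```python
import threading


class TrackingServer:
    """Toy stand-in for a server that records every request thread."""

    def __init__(self):
        self._threads = []

    def handle(self, request):
        # Stand-in for real request processing; returns immediately.
        pass

    def process_request(self, request):
        thread = threading.Thread(target=self.handle, args=(request,))
        # The thread is recorded so it can be joined at close time,
        # but nothing removes it once it has finished.
        self._threads.append(thread)
        thread.start()

    def server_close(self):
        # Only here is the list finally emptied.
        for thread in self._threads:
            thread.join()
        self._threads = []


server = TrackingServer()
for request in range(1000):
    server.process_request(request)

# Wait for the workers without clearing the list.
for thread in server._threads:
    thread.join()

finished = [t for t in server._threads if not t.is_alive()]
print(len(server._threads), len(finished))
```

Every entry in the list refers to a thread that is no longer alive, yet all of them stay referenced until `server_close()` runs.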

https://bugs.python.org/issue37193

@mangrisano
Contributor

Hi, and thank you for the pull request. Just one question about the change: did you test it?
I'm not a core dev, so feel free not to follow my hint :)

@maru-n
Contributor Author

maru-n commented Jun 8, 2019

Hello, thank you for your kind comment.

Just one question about the change: did you test it?

I only checked that this PR fixes the problem, using a memory monitoring tool and the sample code from the issue (https://bugs.python.org/issue37193). I didn't write test code because I couldn't decide whether it was necessary. (Of course, the existing test, test_socketserver, still passes.)
If test code is required, I'm going to add it.

@mangrisano
Contributor

If test code is required, I'm going to add it.

Perhaps it's better to wait for a core dev to review it. :)

@maru-n maru-n force-pushed the fix-issue-37193 branch 2 times, most recently from b5e8368 to 41c2122 Compare June 21, 2019 16:10
@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@maru-n
Contributor Author

maru-n commented Jun 25, 2019

Hello @vstinner
Thank you again for your helpful review.
I have made the requested changes; please review again.

By the way, the code around _threads_lock (and the _threads list) now seems redundant and unclear to me, especially the checks for whether it is None followed by its initialization.
I would prefer to set both up in one place, for example in the initializer:

def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self._threads_lock = threading.Lock()
    self._threads = []

How do you feel about this?

@bedevere-bot

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

@jaraco jaraco added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 13, 2020
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @jaraco for commit d99817e 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Jun 13, 2020
@jaraco jaraco requested a review from vstinner June 13, 2020 14:01
@jaraco
Member

jaraco commented Jun 13, 2020

In 7d1f367, I've added a test against the ancestor commit to this PR, capturing the reported failure and in dadeb0c merged that commit into the fix, demonstrating its effectiveness.

jplitza added a commit to PLUTEX/systemd-journal-remote-gelf that referenced this pull request Jun 17, 2020
This has a memory leak in the current Python version
(python/cpython#13893)
@dmitry-prokopchenkov

@maru-n, @jaraco Thanks for working on this! I think we've run into the issue that this PR would fix. What is the status of this PR? Why hasn't it been merged yet?

@jaraco
Member

jaraco commented Nov 1, 2020

I've reviewed it and believe I've addressed Victor's concerns. Given that there's been no comment after several pings, I recommend proceeding.

@jaraco jaraco merged commit c415590 into python:master Nov 1, 2020
@miss-islington
Contributor

Thanks @maru-n for the PR, and @jaraco for merging it 🌮🎉.. I'm working now to backport this PR to: 3.7, 3.8, 3.9.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 1, 2020
…ythonGH-13893)

* bpo-37193: remove the thread which finished process request from threads list

* rename variable t to thread.

* don't remove thread from list if it is daemon.

* use lock to protect self._threads.

* use finally block in case of exception from shutdown_request().

* check "not thread.daemon" before lock to avoid holding the lock if it's unnecessary.

* fix the place of _threads_lock.

* separate code to remove a current thread into a function.

* check ValueError when removing thread.

* fix wrong code which all instance shared same lock.

* Extract thread management into a _Threads class to encapsulate atomic operations and separate concerns.

* Replace multiple references of 'block_on_close' with one, avoiding the possibility that 'block_on_close' could change during the course of processing requests. Now, there's exactly one _threads object with behavior fixed for the duration.

* Add docstrings to private classes.

* Add test to ensure that a ThreadingTCPServer can be closed without serving any requests.

* Use _NoThreads as the default value. Fixes AttributeError when server is closed without serving any requests.

* Add blurb

* Add test capturing failure.

Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
(cherry picked from commit c415590)

Co-authored-by: MARUYAMA Norihiro <norihiro.maruyama@gmail.com>
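
The bookkeeping listed in these commit notes can be sketched roughly as follows. This is an illustration under assumptions, not the code that was merged: the class names `TrackedThreads` and `NoTracking`, the method `remove_current`, and the helper `serve` are hypothetical stand-ins for the private `_Threads` and `_NoThreads` helpers the notes mention.

```python
import threading


class TrackedThreads:
    """Hypothetical lock-protected list of non-daemon threads."""

    def __init__(self):
        # One lock per instance, so separate servers never share it.
        self._lock = threading.Lock()
        self._threads = []

    def append(self, thread):
        # Daemon threads are never joined, so they are not tracked.
        if thread.daemon:
            return
        with self._lock:
            self._threads.append(thread)

    def remove_current(self):
        """Drop the calling thread once its request is finished."""
        thread = threading.current_thread()
        with self._lock:
            try:
                self._threads.remove(thread)
            except ValueError:
                # Not tracked (daemon thread, or already taken by join()).
                pass

    def join(self):
        # Swap the list out under the lock, then join outside it so
        # workers calling remove_current() cannot deadlock against us.
        with self._lock:
            threads, self._threads = self._threads, []
        for thread in threads:
            thread.join()

    def __len__(self):
        with self._lock:
            return len(self._threads)


class NoTracking:
    """Hypothetical do-nothing variant for servers that never join."""

    def append(self, thread):
        pass

    def remove_current(self):
        pass

    def join(self):
        pass

    def __len__(self):
        return 0


def serve(tracker, requests):
    """Run one worker thread per request, registering each with tracker."""
    def worker(request):
        try:
            pass  # stand-in for handling the request
        finally:
            # Runs even if handling raises, so the entry never lingers.
            tracker.remove_current()

    for request in requests:
        thread = threading.Thread(target=worker, args=(request,))
        tracker.append(thread)
        thread.start()
```

Picking one of the two objects once, when the server is set up, keeps the behavior fixed for the server's lifetime, and the do-nothing variant means closing a server that never served a request has something safe to call.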
@bedevere-bot

GH-23087 is a backport of this pull request to the 3.9 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.9 only security fixes label Nov 1, 2020
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 1, 2020
@bedevere-bot

GH-23088 is a backport of this pull request to the 3.8 branch.

@bedevere-bot

GH-23089 is a backport of this pull request to the 3.7 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 1, 2020
@bedevere-bot

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot aarch64 RHEL7 3.x has failed when building commit c415590.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/all/#builders/539/builds/270) and take a look at the build logs.
  4. Check if the failure is related to this commit (c415590) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/all/#builders/539/builds/270

Summary of the results of the build (if available):

== Tests result: ENV CHANGED ==

409 tests OK.

10 slowest tests:

  • test_concurrent_futures: 4 min 17 sec
  • test_unparse: 4 min 13 sec
  • test_capi: 3 min 23 sec
  • test_tokenize: 3 min 6 sec
  • test_peg_generator: 2 min 58 sec
  • test_lib2to3: 2 min 35 sec
  • test_asyncio: 2 min 28 sec
  • test_multiprocessing_spawn: 2 min 17 sec
  • test_unicodedata: 1 min 30 sec
  • test_multiprocessing_forkserver: 1 min 29 sec

1 test altered the execution environment:
test_asyncio

14 tests skipped:
test_devpoll test_gdb test_ioctl test_kqueue test_msilib
test_ossaudiodev test_startfile test_tix test_tk test_ttk_guionly
test_winconsoleio test_winreg test_winsound test_zipfile64

Total duration: 7 min 51 sec

Click to see traceback logs
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/sslproto.py", line 321, in __del__
    self.close()
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/sslproto.py", line 316, in close
    self._ssl_protocol._start_shutdown()
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/sslproto.py", line 590, in _start_shutdown
    self._abort()
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/sslproto.py", line 731, in _abort
    self._transport.abort()
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/selector_events.py", line 680, in abort
    self._force_close(None)
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/selector_events.py", line 731, in _force_close
    self._loop.call_soon(self._call_connection_lost, exc)
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/base_events.py", line 746, in call_soon
    self._check_closed()
  File "/home/buildbot/buildarea/3.x.cstratak-RHEL7-aarch64/build/Lib/asyncio/base_events.py", line 510, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

@pablogsal
Member

This commit has introduced reference leaks:


Ran 202 tests in 21.654s
OK (skipped=1)
......
test_logging leaked [20, 20, 20] references, sum=60
test_logging leaked [20, 20, 20] memory blocks, sum=60
2 tests failed again:
test_logging test_socketserver
== Tests result: FAILURE then FAILURE ==

Example buildbot failure:

https://buildbot.python.org/all/#/builders/562/builds/79/steps/5/logs/stdio

jaraco added a commit that referenced this pull request Nov 2, 2020
jaraco added a commit that referenced this pull request Nov 3, 2020
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
RubenKelevra pushed a commit to RubenKelevra/VfN-NRW-mesh-announce that referenced this pull request Jan 10, 2023
The ThreadingMixIn class (used by
ThreadingUDPServer/ThreadingTCPServer/...) server stopped using
daemon_threads by default with Python 3.7. But the releasing of the threads
is only done by Threading*Server when the server closes. So in the
meantime, all threads are gathered in the server object and thus we "leak"
memory over the lifetime of the server.

An attacker can therefore cause an OOM based DOS by just requesting some
resources again and again:

  import socket

  while True:
      sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      sock.sendto(bytes("GET statistic", "utf-8"), ("127.0.0.1", 1001))
      sock.close()

To work around this, the server can be forced back to using daemon_threads.

An actual fix for this problem has to be integrated upstream, like the
patches already started in PR python/cpython#13893
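
The workaround mentioned here might look like the sketch below. `daemon_threads` is a documented `ThreadingMixIn` attribute; the `StatisticHandler`, `StatisticServer`, and `make_server` names and the reply payload are assumptions made for illustration, and the default address simply follows the client snippet above.

```python
import socketserver


class StatisticHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, sock = self.request
        sock.sendto(b"ok", self.client_address)


class StatisticServer(socketserver.ThreadingUDPServer):
    # Daemon threads are not collected for joining at server_close(),
    # so finished request threads do not pile up on the server object.
    daemon_threads = True


def make_server(address=("127.0.0.1", 1001)):
    """Build the server; call serve_forever() on the result to run it."""
    return StatisticServer(address, StatisticHandler)
```

The trade-off is that daemon threads are stopped abruptly when the interpreter exits, so a request still in flight at shutdown may not complete.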