Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock caused by local import in different thread. #2925

Closed
guojh opened this issue Dec 11, 2015 · 16 comments
Closed

Deadlock caused by local import in different thread. #2925

guojh opened this issue Dec 11, 2015 · 16 comments

Comments

@guojh
Copy link

guojh commented Dec 11, 2015

requests version 2.8.1, installed from pip.

Code to reproduce:

#!/bin/bash
cat > test_module.py <<EOF
import threading
import requests
def make_request():
    print('before requests.get')
    requests.get('https://github.com/kennethreitz/requests')    # <-- program will get stuck here
    print('after requests.get')
thread = threading.Thread(target=make_request)
thread.start()
thread.join()
EOF
python -c 'import test_module'

traceback from stuck point (I modified requests code to print stack):

  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 783, in __bootstrap
    self.__bootstrap_inner()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "test_module.py", line 7, in make_request
    requests.get('https://github.com/kennethreitz/requests')
  File "/private/tmp/requests/v/lib/python2.7/site-packages/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/private/tmp/requests/v/lib/python2.7/site-packages/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/private/tmp/requests/v/lib/python2.7/site-packages/requests/sessions.py", line 454, in request
    prep = self.prepare_request(req)
  File "/private/tmp/requests/v/lib/python2.7/site-packages/requests/sessions.py", line 375, in prepare_request
    auth = get_netrc_auth(request.url)
  File "/private/tmp/requests/v/lib/python2.7/site-packages/requests/utils.py", line 74, in get_netrc_auth
    traceback.print_stack(); from netrc import netrc, NetrcParseError

This should be caused by deadlocking on import lock (imp.acquire_lock, imp.release_lock, imp.lock_held).

Possible duplication of:
https://github.com/kennethreitz/requests/issues/2649

@Lukasa
Copy link
Member

Lukasa commented Dec 11, 2015

Why are you acquiring the import lock?

@sigmavirus24
Copy link
Contributor

You've provided none of the information requested in our contributing documentation.

Sent from my Android device with K-9 Mail. Please excuse my brevity.

@guojh
Copy link
Author

guojh commented Dec 11, 2015

I edited original post, and added some information.

@sigmavirus24
Copy link
Contributor

Why are you acquiring the import lock?

@guojh are you acquiring the lock manually in other code or are you positing that the behaviour you're seeing is caused by the import locks?

@guojh
Copy link
Author

guojh commented Dec 11, 2015

@sigmavirus24 No, no manually acquired lock. The bash script I pasted is all.
It easy to position the stuck by adding print statements around that line:
https://github.com/kennethreitz/requests/blob/2d91365cba9ddca25ddd28f5c0ecd9497b8a2e24/requests/utils.py#L95

It seems that import lock is automatically acquired by CPython interpreter during import.

Call stack of CPython 2.7 import should look like this (not verified):

builtin___import__
https://github.com/python/cpython/blob/61ee2ece38e609a444575dda0302907fa7a2752f/Python/bltinmodule.c#L36

PyImport_ImportModuleLevel
https://github.com/python/cpython/blob/61ee2ece38e609a444575dda0302907fa7a2752f/Python/import.c#L2287

_PyImport_AcquireLock
https://github.com/python/cpython/blob/61ee2ece38e609a444575dda0302907fa7a2752f/Python/import.c#L292
(reenterable lock)

@guojh
Copy link
Author

guojh commented Dec 11, 2015

Proof of CPython (2.7) import lock:

#!/bin/bash
cat > test_module_2.py <<EOF
import threading
def do_import():
    print('before import')
    import json                # <-- program will get stuck here
    print('after import')
thread = threading.Thread(target=do_import)
thread.start()
thread.join()
EOF
python -c 'import test_module_2'

@Lukasa
Copy link
Member

Lukasa commented Dec 11, 2015

Oh, I understand. The problem is that we fire an import not at import time, which can apparently cause a deadlock. It seems to me that this would be a bug in CPython, wouldn't it? If you can deadlock the interpreter, that sounds like a bug in the CPython import logic.

@guojh
Copy link
Author

guojh commented Dec 11, 2015

@Lukasa It seems to have been resolved in CPython 3.3. (per-module import lock instead of global import lock)
https://github.com/python/cpython/blob/d5adb7f65d30afd00921e6c22e9e2b8c323c058d/Doc/whatsnew/3.3.rst#a-finer-grained-import-lock

@guojh
Copy link
Author

guojh commented Dec 11, 2015

Proof of CPython (2.7) implicit import:

#!/bin/bash
cat > test_module_3.py <<EOF
import threading
def do_encode():
    print('before import')
    u''.encode('does-not-exist') # <-- program will get stuck here due to implicit import
    print('after import')
thread = threading.Thread(target=do_encode)
thread.start()
thread.join()
EOF
python2.7 -c 'import test_module_3'

I think it's not easy to solve this on CPython 2.7.
Requests needs to prevent all explicit and implicit local import.

@Lukasa
Copy link
Member

Lukasa commented Dec 11, 2015

Well, I mean, no: we just need to not do any late imports ourselves. In this case, it just means restructuring the get_netrc_auth function so that the import is done at global scope.

@sigmavirus24
Copy link
Contributor

get_netrc_auth probably isn't the only such function though and there may be other places where a delayed import is necessary.

That said, @Lukasa, @guojh's point is that using str.encode and str.decode will cause this too because they also use delayed imports apparently (the last test module actually also works with a valid encoding that does exist too).

That said, the other work around is to defer the work in the module such that it doesn't happen at import time.

@julianfnunez
Copy link

Is this bug still around? I think I just ran into this with Python 2.7.12 and requests 2.9.1

@sigmavirus24
Copy link
Contributor

@julianfnunez I don't believe anyone has written something to fix delayed imports in Requests (which I'm not certain need sweeping changes) but the other problem is that there are other ways to trigger this outside of requests as well. Requests uses stdlib functionality that uses delayed imports and such as well. What makes you think this is affecting you?

@julianfnunez
Copy link

julianfnunez commented Apr 5, 2017

I'm trying to do something very similar to the code snippet posted in the first comment. requests.get is locking the thread.

PS: I realize this is not necessarily a problem with requests. I was just wondering if you guys did a workaround since this bug was in "Planned".

@rtobar
Copy link

rtobar commented Jan 9, 2018

I just found this issue by chance on internet, and thought that it would be good to clarify what the problem really is.

In python 2.7 "an import should not have the side effect of spawning a new thread and then waiting for that thread in any way" (source). That is why the examples in the previous comments fail, they execute import blah, which spawns and waits on a thread, which in turn imports a module -- hence leading to a deadlock. In other words, CPython is not to blame, but the code itself. The solution in all cases is either to run the code using python blah.py or removing the call to thread.join.

@sethmlarson
Copy link
Member

In an effort to clean up the issue tracker to only have issues that are still relevant to the project we've done a quick pass and decided this issue may no longer be relevant for a variety of potential reasons, including:

  • Applies to a much older version, unclear whether the issue still applies.
  • Change requires a backwards incompatible release and it's unclear if the benefits are worth the migration effort from the community.
  • There isn't a clear demand from the community on the change landing in Requests.

If you think the issue should remain open, please comment so below or open a new issue and link back to the original issue. Again, thank you for opening the issue and for the discussion, it's much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants