BUG: Calling import numpy at the same time in two different threads can lead to a race-condition #21223

Open
jhgoebbert opened this issue Mar 20, 2022 · 8 comments
@jhgoebbert

Describe the issue:

Calling import numpy at the same time in two different threads can lead to a race-condition
This happens for example with Xpra when loading the encoder nvjpeg:

2022-03-20 12:54:59,298  cannot load enc_nvjpeg (nvjpeg encoder)
Traceback (most recent call last):
  File "<pythondir>/lib/python3.9/site-packages/xpra/codecs/loader.py", line 52, in codec_import_check
    ic =  __import__(class_module, {}, {}, classnames)
  File "xpra/codecs/nvjpeg/encoder.pyx", line 8, in init xpra.codecs.nvjpeg.encoder
  File "<pythondir>/lib/python3.9/site-packages/numpy/__init__.py", line 150, in <module>
    from . import core
  File "<pythondir>/lib/python3.9/site-packages/numpy/core/__init__.py", line 51, in <module>
    del os.environ[envkey]
  File "<pythondir>/lib/python3.9/os.py", line 695, in __delitem__
    raise KeyError(key) from None
KeyError: 'OPENBLAS_MAIN_FREE'

The problem seems to come from numpy directly.

Here the environment variable OPENBLAS_MAIN_FREE is set:
https://github.com/numpy/numpy/blob/maintenance/1.21.x/numpy/core/__init__.py#L18
and shortly after that it is deleted:
https://github.com/numpy/numpy/blob/maintenance/1.21.x/numpy/core/__init__.py#L51
But this deletion fails ...

To me this looks like a threading issue in numpy. A lock would need to be set here.

Reproduce the code example:

Here the environment variable OPENBLAS_MAIN_FREE is set:
https://github.com/numpy/numpy/blob/maintenance/1.21.x/numpy/core/__init__.py#L18
and shortly after that it is deleted:
https://github.com/numpy/numpy/blob/maintenance/1.21.x/numpy/core/__init__.py#L51

If two threads run this import code at the same time, we get a race condition.
The deletion fails ...
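
To illustrate the interleaving described above, here is a minimal sketch. It is not NumPy's code: `set_then_delete` is a hypothetical stand-in for the set/delete sequence, and the two `threading.Barrier` objects are an assumption added only to force both threads through the same step at the same time, so the failure shows up on every run instead of occasionally.

```python
import os
import threading

ENVKEY = "OPENBLAS_MAIN_FREE"
os.environ.pop(ENVKEY, None)      # start from an environment without the key

checked = threading.Barrier(2)    # both threads have tested for the key
was_set = threading.Barrier(2)    # both threads have set the key
errors = []                       # keys whose deletion raised KeyError


def set_then_delete():
    # Hypothetical stand-in for "set the variable, do work, delete it".
    added = ENVKEY not in os.environ
    checked.wait()                # force: both threads saw the key as missing
    if added:
        os.environ[ENVKEY] = "1"
    was_set.wait()                # force: both threads believe they own the key
    if added:
        try:
            del os.environ[ENVKEY]
        except KeyError as exc:   # the second deletion finds nothing to delete
            errors.append(exc.args[0])


threads = [threading.Thread(target=set_then_delete) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(errors)  # ['OPENBLAS_MAIN_FREE']
```

Exactly one of the two deletions fails, with the same `KeyError` as in the traceback. Whether the real import can actually run in two threads at once is a separate question that this sketch does not answer.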

Error message:

File "<pythondir>/lib/python3.9/site-packages/numpy/core/__init__.py", line 51, in <module>
    del os.environ[envkey]
  File "<pythondir>/lib/python3.9/os.py", line 695, in __delitem__
    raise KeyError(key) from None
KeyError: 'OPENBLAS_MAIN_FREE'


NumPy/Python version information:

numpy 1.21.3
Python 3.9.6
@seberg
Member

seberg commented Mar 20, 2022

@jhgoebbert I would expect Python to ensure that importing is thread-safe. It seems error-prone to allow importing the same module twice: even for simple Python modules, running the side effects of an import more than once seems potentially wrong.

I am really curious how the multiple threads are created in this example? Are there sub-interpreters or so involved?

@jhgoebbert
Author

@totaam

totaam commented Mar 21, 2022

I am really curious how the multiple threads are created in this example? Are there sub-interpreters or so involved?

Simply using threading.Thread

If __import__ is what is causing this problem, would switching to importlib.import_module make any difference?

@seberg
Member

seberg commented Mar 21, 2022

@totaam I don't think __import__ is special; I believe the import statement always uses __import__ internally as well. I just tried this:

imported.py:

print("running import of 'imported'!")

import time
time.sleep(1)
print("finished import of 'imported'!")

and test.py (starting second thread later in this version, but it doesn't matter):

import threading
import time

def func():
    print("running thread!")
    __import__("imported")

t1 = threading.Thread(target=func)
t1.start()
time.sleep(0.2)

t2 = threading.Thread(target=func)
t2.start()

t1.join()
t2.join()

And as expected running python3 test.py prints out:

running thread!
running import of 'imported'!
running thread!
finished import of 'imported'!

I assume there aren't any other complexities, e.g. PyPy, involved? And I don't really understand how anything could be messing with environment variables.

So, my current hypothesis (and I have briefly checked the Python code) is that Python does not do manual locking, but effectively locks because the import goes into C and thus holds the GIL. Somewhere during the import of NumPy, however, NumPy probably releases the GIL briefly, and that could allow the next thread to enter the import machinery.

Would you be up to opening a bug with Python? NumPy may be doing some worse than typical stuff here, but right now it seems to me that Python should be protecting us.
That is, even a pure Python module that does some array calculations or IO during import may have its GIL released and then get run multiple times, which could be wrong.

If Python says that this is expected, I am wondering if you would have to do the locking somehow, since you are managing the threads.
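
As a rough sketch of what such caller-side locking could look like (an assumption, not something either project does): `locked_import` and `_import_lock` are hypothetical names, and `json` merely stands in for the module being loaded. Every thread goes through one re-entrant lock, so only one thread at a time runs any import started through this helper.

```python
import importlib
import threading

# Hypothetical application-wide lock. An RLock lets a module imported via
# locked_import() call locked_import() itself without deadlocking.
_import_lock = threading.RLock()


def locked_import(module_name):
    """Import module_name while holding the application-wide lock."""
    with _import_lock:
        return importlib.import_module(module_name)


results = []


def worker():
    # "json" is only a stand-in for the module each thread needs.
    results.append(locked_import("json"))


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This only serializes imports that go through the helper. A plain `import` statement elsewhere in the application would bypass the lock.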

All of that doesn't mean we can't consider adding a band-aid fix to help in NumPy.
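
One possible shape for such a band-aid, purely as a hedged sketch: tolerate the variable already being gone when cleaning up. `delete_env_if_present` is a hypothetical helper, not an existing NumPy function, and this only hides the symptom rather than explaining why the cleanup ran twice.

```python
import os


def delete_env_if_present(envkey):
    """Remove envkey from the environment, ignoring a key that is already gone."""
    try:
        del os.environ[envkey]
    except KeyError:
        pass  # another thread (or earlier cleanup) already removed it


os.environ["OPENBLAS_MAIN_FREE"] = "1"
delete_env_if_present("OPENBLAS_MAIN_FREE")
delete_env_if_present("OPENBLAS_MAIN_FREE")  # second call is a harmless no-op
```

The explicit try/except is used instead of `os.environ.pop(envkey, None)` because `pop` looks the key up and then deletes it in two separate steps, so another thread could still remove the key in between.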

@jhgoebbert
Author

CPython does not allow opening issues on GitHub, so I have started a new topic at
https://discuss.python.org/t/no-protection-import-numpy-in-two-different-threads-can-lead-to-race-condition/14504

@seberg
Member

seberg commented Mar 21, 2022

Oh, they are finalizing the process of moving from https://bugs.python.org to github, I guess it might be a bit confusing right now :).

@totaam

totaam commented Mar 21, 2022

I assume there aren't any other complexities, e.g. PyPy, involved?

Not that I can think of, no. Definitely no PyPy involved.
We do sanitize the environment, but this happens before loading the codecs and does not touch OPENBLAS_MAIN_FREE.

@jhgoebbert
Author

Oh, they are finalizing the process of moving from https://bugs.python.org to github, I guess it might be a bit confusing right now :).

Oh, I moved it to the issue tracker: https://bugs.python.org/issue47082
