bpo-30891: Fix importlib _find_and_load() race condition #2646

Merged 1 commit on Jul 10, 2017

Conversation

vstinner
Member

  • Rewrite importlib _get_module_lock(): it is now responsible for holding
    the imp lock directly.
  • _find_and_load() now holds the module lock while checking whether name
    is in sys.modules, to prevent a race condition.

message = ('import of {} halted; '
           'None in sys.modules'.format(name))
raise ModuleNotFoundError(message, name=name)

Member Author

My patch changes how the global import lock is handled in _find_and_load(). Before my change, it was held for the whole function (to keep things simple). With my change, it is acquired and released twice when we take the _lock_unlock_module() path. IMHO this isn't an issue: I prefer finer-grained locking.
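
As a rough illustration of the finer-grained approach described here, the sketch below holds a global lock only long enough to fetch or create a per-name lock, then does the real work under the per-name lock. It is a simplified model and not importlib's implementation: _global_lock, _name_locks, name_locked() and load_once() are hypothetical names, and the deadlock detection that real module locks need is left out.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-ins for the global import lock and the per-module locks.
_global_lock = threading.Lock()
_name_locks = {}
_cache = {}


def _get_name_lock(name):
    """Return the lock for `name`, creating it under the global lock."""
    with _global_lock:
        lock = _name_locks.get(name)
        if lock is None:
            lock = _name_locks[name] = threading.RLock()
        return lock


@contextmanager
def name_locked(name):
    """Hold only the per-name lock while the body runs."""
    lock = _get_name_lock(name)  # global lock is taken and released in here
    with lock:
        yield


def load_once(name, loader):
    """Check the cache and load under the same per-name lock."""
    with name_locked(name):
        if name not in _cache:
            _cache[name] = loader(name)
        return _cache[name]
```

Because the check and the load happen under the same per-name lock, concurrent callers asking for the same name run the loader once, while callers asking for different names do not block each other.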

Contributor

Even with your improvements to the lock handling, this still looks a bit race-prone to me, since we have the classic "query before use" pattern of:

if name not in sys.modules:
    ...
module = sys.modules[name]

That is, just because the module was there when we checked if name not in sys.modules doesn't mean it's still going to be there when we run module = sys.modules[name].

Previously, holding _imp.acquire_lock() for the whole function would at least protect this from other _find_and_load() calls, but as far as I can see it's never been protected from other threads doing del sys.modules[name] without holding the relevant module lock.
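
The window described above can be reproduced deterministically with a small sketch. A plain dict stands in for sys.modules, and the between callback is a hypothetical hook representing whatever another thread might do between the check and the lookup; none of these names come from importlib.

```python
def check_then_use(modules, name, between=lambda: None):
    """Query-before-use: two separate dict operations."""
    if name not in modules:
        return None
    between()  # another thread could run at this point
    return modules[name]  # raises KeyError if the entry vanished


def single_lookup(modules, name, between=lambda: None):
    """One dict operation, so there is no gap in which to lose the entry."""
    between()
    try:
        return modules[name]
    except KeyError:
        return None


def demo():
    racy = {"spam": "module"}
    try:
        check_then_use(racy, "spam", between=lambda: racy.pop("spam"))
    except KeyError:
        racy_result = "KeyError"
    else:
        racy_result = "ok"

    safe = {"spam": "module"}
    safe_result = single_lookup(safe, "spam", between=lambda: safe.pop("spam"))
    return racy_result, safe_result
```

In this sketch, demo() shows the two-step version failing with KeyError when the entry is removed in the gap, while the single-lookup version just reports the entry as missing.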

Member Author

Oh, you are right: I proposed PR #2665.

@vstinner
Member Author

I confirm with my Windows VM that "./python -m test -R 3:100 -m test_concurrency test_import" doesn't fail anymore with this change.

Member

@serhiy-storchaka serhiy-storchaka left a comment

LGTM except one outdated sentence.

 def _lock_unlock_module(name):
-    """Release the global import lock, and acquires then release the
-    module lock for a given module name.
+    """Acquires then release the module lock for a given module name.
     This is used to ensure a module is completely initialized, in the
     event it is being imported by another thread.

     Should only be called with the import lock taken."""

Member

No longer true.

Member Author

Oh sorry, I misunderstood your comment. It should now be fixed.

with _ModuleLockManager(name):
    try:
        module = sys.modules[name]
    except KeyError:

Member

KeyError is raised in the common case (_find_and_load is called from the accelerated C code when name is not in sys.modules). Catching exceptions in Python code is slow. If you want to avoid KeyError in the following code, it would be better to use sys.modules.get(name, sentinel).
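
A minimal sketch of the sentinel lookup suggested above. A dedicated sentinel is used instead of None because None is itself a meaningful value in sys.modules (the "import of {} halted" case quoted earlier in this thread). _MISSING and lookup_cached() are names invented for this example, not importlib's.

```python
import sys

_MISSING = object()  # hypothetical sentinel; never a real sys.modules value


def lookup_cached(name):
    """Classify a sys.modules entry with one dict lookup and no exception."""
    module = sys.modules.get(name, _MISSING)
    if module is _MISSING:
        return "missing", None   # caller would go on to find and load it
    if module is None:
        return "halted", None    # None in sys.modules blocks the import
    return "found", module
```

The common "not imported yet" path here costs a single dict lookup plus an identity check, with no exception raised or caught.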

Member Author

Ok ok, I reverted this unrelated change.

@vstinner vstinner merged commit 4f9a446 into python:master Jul 10, 2017
@vstinner vstinner deleted the importlib_module_lock branch July 10, 2017 20:52
vstinner added a commit that referenced this pull request Jul 10, 2017
* Rewrite importlib _get_module_lock(): it is now responsible to hold
  the imp lock directly.
* _find_and_load() now holds the module lock to check if name is in
  sys.modules to prevent a race condition
(cherry picked from commit 4f9a446)