-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A finer grained import lock #53506
Comments
This is an implementation of the idea suggested in: The patch creates a dictionary of reentrant locks keyed by module full name. Trying to import a module or package will first get the lock for that module (or, if necessary, create it) and then acquire it. This is done for any module type. The global import lock is still there, but only used for two things:
Semantics of the public C API are unchanged, because it is not clear whether they should be or not (concerns of usefulness vs. compatibility). For example, PyImport_ImportModuleNoBlock() still uses the global import lock but this could be relaxed in a later patch. |
So I say we don't worry about loaders being thread-safe. If __import__ handles the locking for a specific module then it will hold the lock on behalf of the loader. Now if someone decides to call load_module on their own, that's there business, but they should be aware of what could happen if they do that without acquiring the lock themselves. Otherwise we just make sure to provide a context manager that takes the name of the module and people can use that when they make their call to loader.load_module. |
Yes but what happens if two different modules are imported from two |
On Wed, Jul 14, 2010 at 12:34, Antoine Pitrou <report@bugs.python.org>wrote:
That's why I said we should supply a context decorator (or function) which |
Ok, so what are you saying is that we can break compatibility for non |
What I'm saying is that loaders are quite possibly not thread-safe already, so we don't need to do any special for them. If you look at PEP-302 you will notice not a single mention of loaders needing to care about the import lock because there is no mention of the import lock! So changing the locking mechanism most likely won't break loaders because they are not using the current import lock anyway and so already have their own issues. As long as __import__ does the proper locking on behalf of loaders and we provide a way for people to use the lock if they want to then I am not worried about the impact on loaders. For example, this will change the logic in importlib where the current import lock is grabbed, but otherwise won't change a thing in terms of the code for the various loaders it implements. |
Are you sure they aren't using it implicitly? |
How is this going to deal with cyclical imports where different threads could import at the same time different modules within that cycle? I need to look through the proposed patch and work out exactly what it does, but am concerned about whether this approach would cause the classic deadlock problem if not done right? FWIW, this concept of a lock per module is what I used in the mod_python module importer when it was rewritten. I would have to go look back over that code and see how the way the concept is being implemented differs, but there was one remaining potential race condition in the mod_python code which could in rare instances cause a problem. I never did get around to fixing it. Anyway, what I did learn was that this approach isn't necessarily as simple as it may seem so it will need some really good analysis on whatever solution is developed to ensure subtle problems don't come up. |
That's my point; loaders are using the lock implicitly so that's why we don't need to worry about the global import lock just for path hooks. It seems like you are suggesting using the global import lock purely for compatibility, and what I am saying is that loaders don't care so there is no compatibility to care about. I say only use the global import lock for serializing creation. |
You're right, I hadn't thought about that. Additional machinery will be |
What is your take on the threadimp2.patch in bpo-9251? |
I'll have a look when I can (hopefully EuroPython). |
New prototype with per-module import locks and deadlock avoidance. Probably lacks a test and some cleanup. |
IIUC, the deadlock avoidance code just checks that acquiring a For example, let's say we have a thread exactly here in in
acquire_import_lock():
PyThread_acquire_lock(lock->lock, 1);
/* thread inside PyEval_RestoreThread(), waiting for the GIL */
PyEval_RestoreThread(saved);
lock->waiters--;
}
assert(lock->level == 0);
lock->tstate = tstate; It owns the lock, but hasn't yet updated the lock's owner Also, I think that locks will use POSIX semaphores on systems that |
That's true. Do you think temptatively acquiring the lock (without
Isn't this limit only about named semaphores? Or does it apply to |
I think it should work. Something along those lines: while True:
if lock.acquire(0):
lock.tstate = tstate
return True
else:
if detect_circularity():
return False
global_lock.release()
saved = save_tstate()
yield()
restore_tstate(saved)
global_lock.acquire() However, I find this whole mechanism somewhat complicated, so the
I'm no FreeBSD expert, but AFAICT, POSIX SEM_NSEMS_MAX limit doesn't Also, a quick search returned those links: |
Yes, that's the point. Today you have to be careful when mixing imports
As they say, "Kritirii choice of the number 30 in the XXI century is File objects also have a per-object lock, so I guess opening 30 files (our FreeBSD 8.2 buildbot looks pretty much stable, was the number of |
I believe this new patch should be much more robust than the previous one. |
Ok, here is a draft patch for the new importlib.
|
New patch gets rid of the reliance on _thread.RLock (uses non-recursive locks instead), and should solve the synchronization issue. Other issues remain. |
Updated patch fixes the performance issue and disposes of module locks when they aren't used anymore. |
Updated patch also makes PyImport_ImportModuleNoBlock a simple alias of PyImport_ImportModule. |
Updated patch also adds unit tests for the module locks and the deadlock avoidance algorithm. |
Updated patch with a couple new tests. |
I still wonder whether Graham Dumpleton's observation has merits. Suppose we have these modules # a.py
time.sleep(10)
import b
# b.py
time.sleep(10)
import a
# main.py
def x():
import a
def y():
import b Now, if x and y are executed in separate threads - won't it deadlock? |
Well, the patch has a deadlock avoidance mechanism, and it includes unit |
Can you please elaborate in the patch what the deadlock avoidance does? AFAICT, the comment explains that it is able to detect deadlocks, but nowhere says what it does when it has detected a deadlock. Also, please submit patches against default's head, or stop using git-style diffs, to enable Rietveld review. |
Updated patch against tip, and with a comment of what deadlock avoidance does (in _ModuleLock.acquire's docstring). |
The patch parser of Rietveld actually choked on the git binary diff. It now skips over these chunks. |
Updated patch against tip. I also changed the internal API of module locks a bit (acquire() raises _DeadlockError instead of returning False, and deadlock detection is not optional anymore). |
I had forgotten to tackle threadless builds, this patch fixes it. |
Does anyone else want to review this patch? |
I don't feel the need to, but I can in a few days if you want me to (just let me know if you do). |
New changeset edb9ce3a6c2e by Antoine Pitrou in branch 'default': |
I have now pushed the patch. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: