Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tkinter single-threaded deadlock #58598

Open
jcbollinger mannequin opened this issue Mar 22, 2012 · 11 comments
Open

Tkinter single-threaded deadlock #58598

jcbollinger mannequin opened this issue Mar 22, 2012 · 11 comments
Labels
3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir topic-tkinter type-bug An unexpected behavior, bug, or error

Comments

@jcbollinger
Copy link
Mannequin

jcbollinger mannequin commented Mar 22, 2012

BPO 14390
Nosy @ned-deily, @bitdancer, @asvetlov
Files
  • deadlocktest-0.2.tar.gz: Tarball of a standalone test exhibiting the faulty behavior
  • reentrant-tkinter.patch: Patch for source, tests, and docs
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2012-03-22.20:48:30.561>
    labels = ['3.8', 'type-bug', 'expert-tkinter', '3.9', '3.10']
    title = 'Tkinter single-threaded deadlock'
    updated_at = <Date 2020-11-16.00:04:57.129>
    user = 'https://bugs.python.org/jcbollinger'

    bugs.python.org fields:

    activity = <Date 2020-11-16.00:04:57.129>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tkinter']
    creation = <Date 2012-03-22.20:48:30.561>
    creator = 'jcbollinger'
    dependencies = []
    files = ['25032', '25866']
    hgrepos = []
    issue_num = 14390
    keywords = ['patch']
    message_count = 11.0
    messages = ['156619', '156768', '156861', '156924', '156927', '156942', '156943', '162536', '162539', '162542', '162547']
    nosy_count = 5.0
    nosy_names = ['ned.deily', 'jnoller', 'r.david.murray', 'asvetlov', 'jcbollinger']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue14390'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @jcbollinger
    Copy link
    Mannequin Author

    jcbollinger mannequin commented Mar 22, 2012

    This is the same as bpo-452973, created as a new issue pursuant to the instruction given when 452973 was closed as "out of date".

    In a nutshell, in a program using combining Tkinter with Tcl callbacks written in C, it is possible for even a single-threaded program to deadlock.

    The case I ran into had these particulars:

    The main program is in Python, but it relies on a custom extension written in C. Through that extension, C callbacks are registered for various Tcl GUI events, and most of these invoke Python functions via Python's C API. Many of those Python functions invoke Tkinter methods. For example, many of the callbacks are bound to menu item activations, and these typically [try to] contruct a Tkinter dialog the first time they are called. What happens in practice is that the program starts fine, but the GUI freezes as soon as any menu item is activated that has one of the affected callbacks bound to it.

    Gdb and I are confident that the problem is as described in bpo-452973: the program's single thread acquires TKinter's internal Tcl lock when the mouse event processing begins, and does not release it before control re-enters Python (there is no public API by which it can be made to do so). When the Python function invokes Tkinter methods, tkinter attempts to acquire the lock again, at which point it deadlocks because it holds the lock already.

    I encountered this issue on CentOS 6 (thus Python 2.6.6), but it appears that the problem is still present in the Python 3 trunk. I have flagged this issue only for version 2.6, however, because I cannot currently confirm that it affects later versions (see below regarding testing).

    I developed a patch against 2.6.6. It fixes the problem by allowing the Tcl lock to be acquired multiple times by any one thread (and requiring it to be released the same number of times before another thread can acquire it). That is perhaps technically inferior to creating public functions around _tkinter.c's ENTER_PYTHON and LEAVE_PYTHON macros, but it doesn't touch the public API. Even if new public functions were provided, the reentrant locking might still be a good fallback. The patch applies cleanly to the trunk, so probably also to every version between that and 2.6.6.

    I would be happy to contribute the patch, but I am a bit at a loss as to how to write an automated test for it because (1) such a test must depend on an extension module, and (2) test failure means causing a deadlock.

    Any advice as to whether such a patch would be considered, or as to how best to test it would be welcome.

    @jcbollinger jcbollinger mannequin added topic-tkinter type-bug An unexpected behavior, bug, or error labels Mar 22, 2012
    @asvetlov
    Copy link
    Contributor

    Can you make a test code to introduce you issue?

    I understand — it's not easy to extract failing code from your big project but please make simple example with python code and trivial C Extension for presentation of your problem.

    Let's start from manual test can be reproduced by everyone. Also, please, make it for 'default' branch of python repo. It's possible to include bugfix for 2.7 but upcoming release the most important.

    Thank you.

    @jcbollinger
    Copy link
    Mannequin Author

    jcbollinger mannequin commented Mar 26, 2012

    I was already working on a standalone test, and now I have it ready. Using it I can demonstrate the issue against both the cpython trunk and against my local v2.6.6 binary distribution, therefore I have added v3.3 as an affected version. It is reasonable to suppose that all versions in between are affected as well, but I have not tested versions 2.7, 3.1, or 3.2.

    I attach a complete package with source and Autotools build scripts. A bit of overkill, I guess, but pretty easy to use. As is typical with the Autotools, the build system is far larger than the actual project sources (those are only 162 lines of C and 57 lines of Python, both reasonably well commented).

    The test should be run against a Python configured with --enable-shared --with-threads (I also used --with-pydebug), and that can be an uninstalled working copy. To build and perform the test:

    1. Unpack the tarball
      tar xzf deadlocktest-0.2.tar.gz
    2. Change to the test source directory
      cd deadlocktest-0.2
    3. Configure the test for building
      ./configure [--with-python-build=/path/to/working/copy]
    4. Build the test
      make
    5. Run the test
      make check

    The test builds and runs (and fails) against both Python 2.6 and the current trunk (3.3). It passes when run against my patched versions of 2.6 and 3.3.

    @jcbollinger
    Copy link
    Mannequin Author

    jcbollinger mannequin commented Mar 27, 2012

    For what it's worth, I can convert my standalone test into a PyUnit testcase easily enough (or so it appears). I'm having trouble, however, figuring out how to get the extension it depends on built and accessible to the test, yet not installed with the normal modules.

    @bitdancer
    Copy link
    Member

    I believe there is an example in the packaging unit tests of building an extension module in a test (but I'm not 100% sure).

    Also, the fact that the test will deadlock if it fails is not a deal breaker: the test should pass normally, and if it fails the buildbots will eventually time out and we'll see the error.

    @jcbollinger
    Copy link
    Mannequin Author

    jcbollinger mannequin commented Mar 27, 2012

    I looked at the packaging tests (thanks), but I didn't find anything useful to me. There were a couple whose names looked promising, but they turned out to be stubs. As far as I can tell, none of those tests actually invoke the system's C compiler, even indirectly. They are numerous, however, so I could have overlooked something.

    It occurs to me that because the extension only needs to provide one function, I could just add that to _tkinter. That would ease testing without adding anything to the *public* API, but it seems a bit smelly to me because the point is that a user extension can trigger the bug. Also, the added function would be accessible to programs that choose to ignore privacy convention.

    Also, I am assuming that tests only need to be runnable by developers and build automatons -- i.e. someone who can and did build Python from source. If they need also to be runnable by end users then a compiled version of any extension the tests depend upon needs to be included in binary distributions.

    @ned-deily
    Copy link
    Member

    For examples of tests that build extension modules, see Lib/packaging/tests/test_command_build_ext.py or the Distutils equivalent, Lib/distutils/tests/test_build_ext.py. These tests are also runnable from installed versions of Python, assuming the user has the necessary build tools (compiler, etc) installed.

    @jcbollinger
    Copy link
    Mannequin Author

    jcbollinger mannequin commented Jun 8, 2012

    I attach a patch fixing the issue and providing a test and docs.

    The fix is substantially as I described earlier: a thread that holds the Tcl lock is permitted to acquire it logically any number of times, but physically attempts to acquire it only if it doesn't already hold it. A thread-local counter ensures that the lock is logically released the same number of times it has been acquired before it is physically released.

    The external API is unchanged, and even source changes are minimized to the greatest extent possible.

    If this fix ultimately is accepted then I hope it can also be back-ported to 2.7.

    @bitdancer
    Copy link
    Member

    Thanks for working on this.

    This is not my area of expertise, but what you describe sounds like an RLock, and there is a C implementation of RLock in Python3. Could you just use that for Python3?

    Also, very minor comments on the patch format (I'm not in a position to review the patch itself): we prefer not to add additional copyright notices (some files have older ones). My understanding is you have the copyright by virtue of having published the patch here, and your contributor agreement on file allows us to incorporate it into the codebase, and nothing more is needed.

    I don't believe we generally include bug fixes in What's New, unless they are significant enough behavior changes that they don't get put into the older versions. It's Raymond's call, though.

    @jcbollinger
    Copy link
    Mannequin Author

    jcbollinger mannequin commented Jun 8, 2012

    Yes, I have basically made tkinter's Tcl lock into an Rlock. With respect to Python3's Rlock implementation, though, are you talking about what I see in Modules/_threadmodule.c? Even if it would be acceptable to make the tkinter module depend on the thread module (not clear), I don't think I can easily use that because it looks like all the relevant functions are static, in typical extension module fashion. In other words, it provides only a Python API, not a C API. Moreover, the current implementation can easily be backported to Python 2, but that would not be true of an implementation based on the thread module's Rlock. If you would nevertheless prefer that the thread module's Rlock be used then I would appreciate technical suggestions for how to overcome the lack of a C API.

    I am content to comply with the PSF copyright marking policy. Is it documented somewhere? My understanding is that my copyright does not depend in any way on marking the work -- at least in the US -- but there are other reasons to prefer to mark. Anyway, show me the policy or else just confirm that it is to not mark in cases such as this, and I will remove it.

    Tkinter threading and re-entrancy issues have been somewhat of a sore spot for a very long time, so I think this change is worth calling out. Nevertheless, if Raymund disagrees then so be it.

    Thanks

    @bitdancer
    Copy link
    Member

    That's why I phrased it as a question, I don't know enough about the C stuff. Someone else nosy on this bug will probably have a more informed opinion.

    I don't think the copyright marking policy is currently written down. It ought to be, but I have a sinking feeling making that happen isn't going to be easy, because it involves lawyerly stuff.

    @iritkatriel iritkatriel added 3.8 only security fixes 3.9 only security fixes 3.10 only security fixes labels Nov 16, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 23, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 only security fixes 3.9 only security fixes 3.10 only security fixes stdlib Python modules in the Lib dir topic-tkinter type-bug An unexpected behavior, bug, or error
    Projects
    Status: No status
    Development

    No branches or pull requests

    4 participants