gh-145568: Fix thread safety in Modules/_elementtree.c for free-threaded build#145569
Closed
Sebcio03 wants to merge 2 commits intopython:mainfrom
Closed
gh-145568: Fix thread safety in Modules/_elementtree.c for free-threaded build#145569Sebcio03 wants to merge 2 commits intopython:mainfrom
Sebcio03 wants to merge 2 commits intopython:mainfrom
Conversation
…ed build Add Py_BEGIN_CRITICAL_SECTION guards and _lock_held split-function patterns throughout _elementtree.c to make the module's Py_MOD_GIL_NOT_USED declaration honest: * Include pycore_critical_section.h (REQ-1) * Split clear_extra → clear_extra_lock_held + locking wrapper. Split element_resize → element_resize_lock_held + locking wrapper. element_gc_clear / element_dealloc call the lock-free variants directly (GC runs on already-unreachable objects). element_add_subelement and element_insert_impl wrap their full body in a critical section so the resize + slot-write is atomic. Element.clear() holds the lock across clear_extra + text/tail reset. (REQ-2) * Split element_get_attrib, element_get_text, element_get_tail into _lock_held variants (borrowed ref, caller holds lock) plus locking wrappers (strong ref, for callers without a lock). Fix the three call-sites that received a newly strong reference (findtext, itertext). (REQ-3) * Wrap all four Element property getters and setters in per-object critical sections. Getters call the _lock_held helpers directly inside the section to avoid double-locking. (REQ-4) * Wrap __copy__ body in Py_BEGIN_CRITICAL_SECTION(self) so all reads from self are atomic. element_resize_lock_held is called on the freshly-created (unshared) destination element. (REQ-5) * In __deepcopy__, snapshot all fields of self under lock, then release the lock before calling deepcopy() recursively to avoid holding the lock during arbitrary Python calls. Wrap the PyDict_Next fast-path in deepcopy() with a critical section on the dict object. (REQ-6) * Split treebuilder_handle_end → treebuilder_handle_end_lock_held + locking wrapper so the PyList_GET_ITEM(self->stack, self->index) access and the self->index decrement are protected. (REQ-7) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace placeholder gh-issue-NNNNN with the real issue number 145568. Expand the NEWS text to cover all 7 areas fixed by the parent commit (struct-field races, child-list mutations, property getters/setters, __copy__, __deepcopy__, TreeBuilder end-tag handling, and the borrowed-reference fix in the expat entity handler). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
The following commit authors need to sign the Contributor License Agreement: |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #145568 (and contributes to #116738).
Modules/_elementtree.cdeclaresPy_MOD_GIL_NOT_USEDbut had multiple data races in the free-threaded (no-GIL) build. This PR applies the_lock_heldsplit-function pattern andPy_BEGIN_CRITICAL_SECTIONguards throughout the module.Changes
REQ-1 — Add
#include "pycore_critical_section.h".REQ-2 —
clear_extra/element_resizeracesSplit each into a
_lock_heldvariant (for callers already inside a critical section) plus a locking wrapper.element_add_subelement,element_insert_impl, andElement.clear()now hold the per-object lock across the resize-and-write sequence.REQ-3 — Borrowed references from
element_get_text/element_get_tail/element_get_attribSplit into
_lock_held(borrowed ref, caller holds lock) + locking wrapper (strong ref viaPy_XINCREFinside the section). Three callers infindtextanditertextupdated to consume the strong reference correctly.The original
PyDict_GetItemWithError→PyDict_GetItemReffix inexpat_default_handleris also part of this PR.REQ-4 — Property getters/setters
All four
getsetdescriptors (tag,text,tail,attrib) wrapped inPy_BEGIN_CRITICAL_SECTION(op). Getters call_lock_heldhelpers to avoid re-acquiring the lock.REQ-5 —
__copy__Entire body wrapped in
Py_BEGIN_CRITICAL_SECTION(self). Useselement_resize_lock_heldon the freshly-created (unshared) destination element.REQ-6 —
__deepcopy__All fields of
selfsnapshotted under lock; lock released before callingdeepcopy()recursively (to avoid holding a per-object lock during arbitrary Python calls). ThePyDict_Nextfast-path for theattribdict is wrapped inPy_BEGIN_CRITICAL_SECTION(object).REQ-7 —
treebuilder_handle_endSplit into
treebuilder_handle_end_lock_held+ locking wrapper so thePyList_GET_ITEM(self->stack, self->index)access andself->index--decrement are atomic.Tests
Lib/test/test_free_threading/test_xml_etree.py— 10 concurrent stress tests, one per requirement plus regression tests for the original borrowed-ref race. All pass on a--disable-gil --with-pydebugbuild and underPYTHONMALLOC=debug.cc @swtaarrs (original #116738 author)