gh-145568: Fix thread safety in Modules/_elementtree.c for free-threaded build #145569

Closed
Sebcio03 wants to merge 2 commits into python:main from Sebcio03:fix-elementtree-borrowed-ref-race

Sebcio03 commented Mar 5, 2026

Summary

Fixes #145568 (and contributes to #116738).

Modules/_elementtree.c declares Py_MOD_GIL_NOT_USED but had multiple data races in the free-threaded (no-GIL) build. This PR applies the _lock_held split-function pattern and Py_BEGIN_CRITICAL_SECTION guards throughout the module.
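The split-function pattern described above can be modeled in plain Python. This is a minimal sketch, not code from the PR: it assumes a `threading.Lock` standing in for the per-object critical section, and `Node`, `append`, and `extend_pair` are hypothetical names chosen for illustration.

```python
import threading


class Node:
    """Toy stand-in for an Element (hypothetical, not from _elementtree.c)."""

    def __init__(self):
        # Assumption: a plain lock models the per-object critical section.
        self._lock = threading.Lock()
        self.children = []

    def _append_lock_held(self, child):
        # Caller must already hold self._lock.
        self.children.append(child)

    def append(self, child):
        # Locking wrapper for callers that do not hold the lock.
        with self._lock:
            self._append_lock_held(child)

    def extend_pair(self, a, b):
        # A compound operation takes the lock once and calls the
        # _lock_held helper, so both writes land as one step.
        with self._lock:
            self._append_lock_held(a)
            self._append_lock_held(b)


def worker(node, n):
    for i in range(n):
        node.extend_pair(i, -i)


node = Node()
threads = [threading.Thread(target=worker, args=(node, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because `threading.Lock` is not re-entrant, calling `append` from inside `extend_pair` would deadlock in this model, which is the reason the compound operation calls the `_lock_held` helper instead.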

Changes

REQ-1 — Add #include "pycore_critical_section.h".

REQ-2 — clear_extra / element_resize races
Split each into a _lock_held variant (for callers already inside a critical section) plus a locking wrapper. element_add_subelement, element_insert_impl, and Element.clear() now hold the per-object lock across the resize-and-write sequence.

REQ-3 — Borrowed references from element_get_text / element_get_tail / element_get_attrib
Each is split into a _lock_held variant (borrowed ref, caller holds lock) plus a locking wrapper (strong ref via Py_XINCREF inside the section). The three callers in findtext and itertext were updated to consume the strong reference correctly.
The original PyDict_GetItemWithError → PyDict_GetItemRef fix in expat_default_handler is also part of this PR.

REQ-4 — Property getters/setters
All four getset descriptors (tag, text, tail, attrib) wrapped in Py_BEGIN_CRITICAL_SECTION(op). Getters call _lock_held helpers to avoid re-acquiring the lock.

REQ-5 — __copy__
Entire body wrapped in Py_BEGIN_CRITICAL_SECTION(self). Uses element_resize_lock_held on the freshly-created (unshared) destination element.

REQ-6 — __deepcopy__
All fields of self snapshotted under lock; lock released before calling deepcopy() recursively (to avoid holding a per-object lock during arbitrary Python calls). The PyDict_Next fast-path for the attrib dict is wrapped in Py_BEGIN_CRITICAL_SECTION(object).
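The snapshot-then-release ordering can be illustrated with a small Python model. This is a sketch under stated assumptions rather than the module's implementation: `Node` is a hypothetical class, and a `threading.Lock` stands in for the per-object critical section.

```python
import copy
import threading


class Node:
    """Toy element with a per-object lock (hypothetical)."""

    def __init__(self, tag, text=None):
        self._lock = threading.Lock()
        self.tag = tag
        self.text = text
        self.children = []

    def __deepcopy__(self, memo):
        # Step 1: snapshot every field while holding the lock.
        with self._lock:
            tag, text = self.tag, self.text
            children = list(self.children)
        # Step 2: the lock is released here, so the recursive copies
        # below may run arbitrary code without holding it.
        clone = Node(copy.deepcopy(tag, memo), copy.deepcopy(text, memo))
        memo[id(self)] = clone
        clone.children = [copy.deepcopy(child, memo) for child in children]
        return clone


root = Node("root", "hello")
root.children.append(Node("child"))
clone = copy.deepcopy(root)
```

The copy reflects the state at the moment of the snapshot; changes made to the source after the lock is released are not seen by the clone.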

REQ-7 — treebuilder_handle_end
Split into treebuilder_handle_end_lock_held + locking wrapper so the PyList_GET_ITEM(self->stack, self->index) access and self->index-- decrement are atomic.
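Why the read and the decrement need to happen under one lock can be seen in a toy Python model. This is a sketch only: `Builder`, `start`, and `end` are hypothetical names, and a `threading.Lock` is assumed in place of the critical section.

```python
import threading


class Builder:
    """Toy model of a tree builder's open-element stack (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.stack = []
        self.index = 0  # number of open elements

    def start(self, tag):
        with self._lock:
            self.stack.append(tag)
            self.index += 1

    def _end_lock_held(self):
        # Caller holds self._lock: reading the top slot and decrementing
        # the index cannot be interleaved with another thread.
        if self.index == 0:
            raise IndexError("no open element")
        self.index -= 1
        return self.stack.pop()

    def end(self):
        # Locking wrapper for callers that do not hold the lock.
        with self._lock:
            return self._end_lock_held()


builder = Builder()
for tag in range(1000):
    builder.start(tag)

results = [[] for _ in range(4)]  # one private list per thread


def drain(out, n):
    for _ in range(n):
        out.append(builder.end())


threads = [threading.Thread(target=drain, args=(results[k], 250)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each tag is handed to exactly one thread, and the index and the stack stay in step.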

Tests

Lib/test/test_free_threading/test_xml_etree.py — 10 concurrent stress tests, one per requirement plus regression tests for the original borrowed-ref race. All pass on a --disable-gil --with-pydebug build and under PYTHONMALLOC=debug.

./python -m test test_free_threading.test_xml_etree -v
# 10/10 tests passed
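The contents of the test file are not shown here. As a rough illustration of what a concurrent stress test against the public xml.etree.ElementTree API might look like, here is a minimal sketch; the thread counts, iteration counts, and function names are assumptions, not taken from the PR.

```python
import threading
import xml.etree.ElementTree as ET

N_WRITERS = 2
N_READERS = 2
N_ITERS = 200

root = ET.Element("root")
root.text = "start"
barrier = threading.Barrier(N_WRITERS + N_READERS)
errors = []  # exceptions raised inside worker threads


def writer(tid):
    try:
        barrier.wait()  # start all threads at the same moment
        for i in range(N_ITERS):
            child = ET.SubElement(root, "item")
            child.text = f"{tid}-{i}"
            root.text = f"writer {tid}"
    except Exception as exc:
        errors.append(exc)


def reader():
    try:
        barrier.wait()
        for _ in range(N_ITERS):
            # Exercise the text accessors while writers mutate the tree.
            root.findtext("item")
            "".join(root.itertext())
    except Exception as exc:
        errors.append(exc)


threads = [threading.Thread(target=writer, args=(t,)) for t in range(N_WRITERS)]
threads += [threading.Thread(target=reader) for _ in range(N_READERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A test of this shape mainly detects crashes and lost updates; on a build with the GIL enabled it is expected to pass regardless of the locking changes.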

cc @swtaarrs (original #116738 author)

Sebcio03 and others added 2 commits March 5, 2026 20:51
…ed build

Add Py_BEGIN_CRITICAL_SECTION guards and _lock_held split-function
patterns throughout _elementtree.c to make the module's
Py_MOD_GIL_NOT_USED declaration honest:

* Include pycore_critical_section.h (REQ-1)

* Split clear_extra → clear_extra_lock_held + locking wrapper.
  Split element_resize → element_resize_lock_held + locking wrapper.
  element_gc_clear / element_dealloc call the _lock_held variants
  directly without locking (GC runs on already-unreachable objects).
  element_add_subelement and element_insert_impl wrap their full
  body in a critical section so the resize + slot-write is atomic.
  Element.clear() holds the lock across clear_extra + text/tail reset.
  (REQ-2)

* Split element_get_attrib, element_get_text, element_get_tail into
  _lock_held variants (borrowed ref, caller holds lock) plus locking
  wrappers (strong ref, for callers without a lock).  Fix the three
  call-sites that now receive a new strong reference (findtext,
  itertext).  (REQ-3)

* Wrap all four Element property getters and setters in per-object
  critical sections.  Getters call the _lock_held helpers directly
  inside the section to avoid double-locking.  (REQ-4)

* Wrap __copy__ body in Py_BEGIN_CRITICAL_SECTION(self) so all reads
  from self are atomic.  element_resize_lock_held is called on the
  freshly-created (unshared) destination element.  (REQ-5)

* In __deepcopy__, snapshot all fields of self under lock, then
  release the lock before calling deepcopy() recursively to avoid
  holding the lock during arbitrary Python calls.  Wrap the
  PyDict_Next fast-path in deepcopy() with a critical section on the
  dict object.  (REQ-6)

* Split treebuilder_handle_end → treebuilder_handle_end_lock_held +
  locking wrapper so the PyList_GET_ITEM(self->stack, self->index)
  access and the self->index decrement are protected.  (REQ-7)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace placeholder gh-issue-NNNNN with the real issue number 145568.
Expand the NEWS text to cover all 7 areas fixed by the parent commit
(struct-field races, child-list mutations, property getters/setters,
__copy__, __deepcopy__, TreeBuilder end-tag handling, and the
borrowed-reference fix in the expat entity handler).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
python-cla-bot bot commented Mar 5, 2026

The following commit authors need to sign the Contributor License Agreement:

CLA not signed
