fix: snapshot data in Collector.flush_data to avoid threading race #2165
Merged
The non-packed-arcs branch passes self.data's sets straight into add_arcs, but the old `cast(dict[str, list[TArc]], self.data)` lied about the runtime type to fit a too-narrow annotation. add_arcs accepts `Mapping[str, Collection[TArc]]`, which covers both the packed-arcs lists and the unpacked sets. Widen arc_data to `dict[str, Collection[TArc]]` and correct the cast to `set[TArc]`.
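A minimal sketch of why the wider annotation fits both branches. `count_arcs` is a hypothetical stand-in for a consumer such as add_arcs, not coverage.py's actual code, and `TArc` is assumed here to be a pair of line numbers:

```python
from collections.abc import Collection, Mapping

# Assumption for this sketch: an arc is a (from_line, to_line) pair.
TArc = tuple[int, int]


def count_arcs(arc_data: Mapping[str, Collection[TArc]]) -> int:
    """Hypothetical consumer: only needs len() and iteration per file."""
    return sum(len(arcs) for arcs in arc_data.values())


# Packed-arcs branch: per-file lists.
packed: dict[str, list[TArc]] = {"a.py": [(1, 2), (2, 3)]}
# Unpacked branch: per-file sets, which is what the data holds at runtime.
unpacked: dict[str, set[TArc]] = {"a.py": {(1, 2), (2, 3), (3, -1)}}

packed_count = count_arcs(packed)
unpacked_count = count_arcs(unpacked)
```

Both lists and sets satisfy `Collection`, so neither call needs a cast that misstates the runtime type.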
Collector.data is a single dict shared across all per-thread tracers,
and tracers update its inner sets without taking a lock (one acquire
per traced line would be far too expensive). When a switch_context
boundary fires on one thread -- which is routine with
`dynamic_context = test_function` -- flush_data runs on that
thread while tracers in other threads keep adding lines to the same
sets. mapped_file_dict only guarded the outer dict iteration, so
add_arcs / add_lines later iterated the live sets and could fail with:
    RuntimeError: Set changed size during iteration
...raised from nums_to_numbits in the lines path, or from the arc tuple
comprehension in add_arcs. `list(packeds)` and
`list(packed_data.items())` in the packed-arcs branch were equally
exposed: both go through the Python iterator protocol, which detects
concurrent mutation and raises.
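The failure mode can be reproduced deterministically without threads. In this sketch the mutation happens inside the loop body, as a stand-in for a tracer on another thread adding a line mid-iteration. The function name is hypothetical:

```python
def iterate_while_mutating(live: set[int]) -> str:
    """Iterate a set while growing it; return the error text, if any."""
    try:
        for n in live:
            # Stand-in for another thread's tracer recording a new line.
            live.add(n + 1000)
    except RuntimeError as exc:
        return str(exc)
    return "no error"


message = iterate_while_mutating({1, 2, 3})
print(message)
```

The set iterator notices that the size changed between steps and raises. The real race produces the same check with the mutation coming from a different thread.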
Snapshot both the outer dict and each per-file set with `.copy()`,
which is atomic in CPython -- the GIL is held for the duration of the
C-level copy, so no other thread can mutate the source mid-copy.
Free-threaded builds perform each copy inside a per-container critical
section, so the same guarantee applies. add_arcs / add_lines now iterate stable
copies even while other threads continue recording trace data.
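A rough sketch of the snapshot approach, under the assumptions stated in the description above. `snapshot`, `shared`, and the thread functions are hypothetical names for illustration, not coverage.py's actual code:

```python
import threading


def snapshot(data: dict[str, set[int]]) -> dict[str, set[int]]:
    """Copy the outer dict first, then each inner set (hypothetical helper)."""
    outer = data.copy()
    return {filename: lines.copy() for filename, lines in outer.items()}


def run_demo(total: int = 50_000) -> tuple[list[BaseException], int, int]:
    shared: dict[str, set[int]] = {"a.py": set(), "b.py": set()}
    done = threading.Event()
    errors: list[BaseException] = []
    taken = [0]  # number of snapshots iterated without error

    def tracer() -> None:
        # Stand-in for a per-thread tracer: adds lines without a lock.
        try:
            for n in range(total):
                shared["a.py"].add(n)
                shared["b.py"].add(n)
        finally:
            done.set()

    def flusher() -> None:
        # Stand-in for flush_data: iterates only the private copies.
        try:
            while not done.is_set():
                for lines in snapshot(shared).values():
                    sum(lines)
                taken[0] += 1
        except RuntimeError as exc:
            errors.append(exc)

    threads = [threading.Thread(target=flusher), threading.Thread(target=tracer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors, taken[0], len(shared["a.py"])


errors, snapshots_taken, recorded = run_demo()
```

The flusher never iterates a live set from Python code, so the size check that raises during iteration never sees a concurrent change. Each snapshot costs a copy per file, which is the trade-off raised in the review below.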
a093ba4 to 4c51be1
Member
I can see that a lock for each line would be a lot, though I'm not sure it would be too expensive. Copying the data is also expensive, though it happens infrequently. I'm more concerned about the GIL reasoning: will it hold up in free-threaded builds?
Contributor
Author
In free-threaded builds, each object has its own mutex, which is held while modifying the object. These are suspended if execution pops back into Python (e.g. using
nedbat added a commit that referenced this pull request May 10, 2026
Member
This is now released as part of coverage 7.14.0.
Contributor
Author
Thanks for the release!