Fix race condition inside recursive strategies #2783
Merged
Fixes #2717.
In the end, I chose thread-local storage to solve the issue, as proposed in the linked issue. Here are my thoughts on why I implemented it this way and how it changes the status quo.
Single thread
At the moment, a recursive strategy may fail on this assertion if it calls itself in `do_draw` here:

The strategy may look like this:
I am not sure how meaningful such a strategy is, but it is not explicitly prohibited, as far as I can tell. From the one above, I'd expect lists like `[True, [False, True, [True]], False]`, or just a boolean. However, a simpler one seems to produce the same data:

I am not sure whether there are cases that can only be expressed by using `deferred` inside `recursive`. If there are legitimate use cases, then we might want to rethink capping. Let me know what you think - it might be worth a separate issue, but I'll share my current thoughts on this here. (Please correct me if I miss something in my reasoning.)
As far as I can see, the cap is needed to keep draws from this strategy from generating more than a certain maximum number of leaves. However, assuming a single thread (more on the multi-threaded behavior in the next section) and such a self-referential strategy, I am not sure capping is needed in its current form - we could apply it once on the first `capped` usage and make all subsequent calls no-ops (e.g., just `yield` without modifying `marker`). Then the `marker` is still set only once, on the very first `RecursiveStrategy.do_draw` call, and it decreases monotonically. Therefore, the max size is properly maintained, and there will be no oversized subtrees because at some point `LimitReached` will occur.

Anyway, the current behavior for such a case leads to `AssertionError`. With a non-reentrant lock, as I initially mentioned in the linked issue, such a strategy will cause a deadlock. A reentrant one will keep the current behavior for the single-threaded case. Thread-local storage will also keep it.

Multiple threads
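For reference, the difference between the two lock types when the same thread re-enters can be checked with the standard library alone. This is a minimal sketch, not code from this PR; `can_reacquire` is a hypothetical helper, and it uses a non-blocking acquire so that the plain `Lock` case reports failure instead of actually hanging.

```python
import threading


def can_reacquire(lock):
    """Return True if the thread already holding `lock` can take it again."""
    with lock:
        # blocking=False: report failure instead of waiting forever,
        # which is what a blocking acquire on a plain Lock would do here.
        got_it = lock.acquire(blocking=False)
        if got_it:
            lock.release()
        return got_it


print(can_reacquire(threading.Lock()))   # False: a blocking call would deadlock
print(can_reacquire(threading.RLock()))  # True: the owning thread may re-enter
```

A self-referential strategy re-enters `capped` on the same thread, which is the situation the first call models.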
This use case is about multiple threads running tests that share the same strategy. `RLock` solves the race condition more conservatively by preventing other threads from calling `capped`, but such a lengthy synchronization point reduces the potential gain from running tests concurrently. I don't have precise numbers on the impact. Still, with thread-local storage, the synchronization point is much smaller (there is a reentrant lock on `data.__getattribute__`), and it is in line with other cases (like the mentioned `DynamicVariable` class).

Let me know what you think :)
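To illustrate the thread-local approach in isolation, here is a minimal sketch of per-thread cap state. It is not the code in this PR: `ThreadLocalLimiter`, `_CapState`, `count_draws` and `run_demo` are hypothetical names, and the class only mimics the `marker` / `capped` / `LimitReached` behavior described above, assuming the cap state is the only shared mutable data.

```python
import threading
from contextlib import contextmanager


class LimitReached(Exception):
    """Raised when the current thread's draw budget is exhausted."""


class _CapState(threading.local):
    # For threading.local subclasses, __init__ runs once per thread on first
    # access, so every thread starts with its own fresh defaults.
    def __init__(self):
        self.marker = 0
        self.currently_capped = False


class ThreadLocalLimiter:
    """Hypothetical stand-in for a limited strategy shared between threads."""

    def __init__(self):
        self._state = _CapState()

    def draw(self):
        state = self._state
        assert state.currently_capped
        if state.marker <= 0:
            raise LimitReached
        state.marker -= 1

    @contextmanager
    def capped(self, max_leaves):
        state = self._state
        # Re-entrant use within one thread still fails, as it did before.
        assert not state.currently_capped
        state.currently_capped = True
        state.marker = max_leaves
        try:
            yield
        finally:
            state.currently_capped = False


def count_draws(limiter, max_leaves, barrier, results):
    with limiter.capped(max_leaves):
        barrier.wait()  # make sure both threads are inside capped() at once
        drawn = 0
        try:
            while True:
                limiter.draw()
                drawn += 1
        except LimitReached:
            pass
    results[max_leaves] = drawn


def run_demo():
    limiter = ThreadLocalLimiter()
    barrier = threading.Barrier(2)
    results = {}
    threads = [
        threading.Thread(target=count_draws, args=(limiter, n, barrier, results))
        for n in (3, 5)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In this sketch each thread sees only its own `marker`, so two threads can be inside `capped` on the same object at once without tripping the assertion or consuming each other's budget: `run_demo()` returns a count of 3 for the thread capped at 3 and 5 for the one capped at 5.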
cc @Zac-HD @Zalathar