Fix race condition inside recursive strategies #2783

Stranger6667 · 2021-01-26T13:53:17Z

In the end, I chose thread-local storage for solving the issue as proposed in the linked issue. Here are my thoughts on why I implemented it this way & comparing how it changes the status quo.

Single thread

At the moment, a recursive strategy may fail on this assertion if it calls itself in do_draw here:

                with self.limited_base.capped(self.max_leaves):
                    return data.draw(self.strategy)

The strategy may look like this:

from hypothesis import strategies as st

SELF_REF = st.recursive(
    st.deferred(lambda: SELF_REF | st.booleans()),
    lambda s: st.lists(s, min_size=1)
)

I am not sure how meaningful such a strategy is, but it is not explicitly prohibited, as far as I can tell. From the one above, I'd expect lists like this - [True, [False, True, [True]], False], or just a boolean. However, a simpler one seems to be producing the same data:

SELF_REF = st.recursive(
    st.booleans(),
    lambda s: st.lists(s, min_size=1)
)

I am not sure if there are cases that can only be expressed by using deferred inside recursive. If there are legitimate use cases, then we might want to rethink capping. Let me know what you think - it might be worth a separate issue, but I'll share my current thoughts on this here.

(Please, correct me if I miss something in my reasoning)
As far as I see, the cap is needed to prevent the drawing from this strategy & generating a certain maximum amount of leaves. However, assuming a single thread (more on the multi-threaded behavior in the next section) and such a self-referential strategy, I am not sure if capping is needed as it is - we can just apply it once on the first capped usage and make all subsequent calls no-op (e.g., just yield without modifying marked). Then we still have the marker set only once on the very first RecursiveStrategy.do_draw call, and it will be monotonically decreasing. Therefore, we'll have the max size properly maintained, and there will be no oversized subtrees because at some point, LimitReached will occur.

Anyway, the current behavior for such a case leads to AssertionError. With a not-reentrant lock, as I initially mentioned in the linked issue, such a strategy will cause a deadlock. A reentrant one will keep the current behavior for a single-threaded case. Thread-local storage will also keep it.

Multiple threads

This use case is about multiple threads that run tests that share the same strategy. RLock solves the race condition more conservatively by preventing other threads from calling capped, but having such a lengthy synchronization point decreases the potential gain of running tests concurrently. I don't have precise numbers on the impact. Still, with thread-local storage, the synchronization point is much smaller (there is a reentrant lock on data.__getattribute__), and it goes in line with other cases (like the mentioned DynamicVariable class).

Let me know what do you think :)

cc @Zac-HD @Zalathar

Zac-HD

Comments about naming things, markup, and threads below (😱), but this is looking pretty good to me.

hypothesis-python/RELEASE.rst

hypothesis-python/tests/cover/test_recursive.py

hypothesis-python/src/hypothesis/strategies/_internal/recursive.py

hypothesis-python/tests/cover/test_recursive.py

Zac-HD · 2021-01-27T02:00:50Z

from hypothesis import given, strategies as st

SELF_REF = st.recursive(
    st.deferred(lambda: st.booleans() | SELF_REF),
    lambda s: st.lists(s, min_size=1),
)


@given(SELF_REF)
def test(x):
    pass

Oh boy, this is great (terrible, yes, but also great). I think we should probably have a new issue to deal with this problem...

Zac-HD

Looks good!

I suggest that we hold off on merging until the Rust/Ruby deploy from #2775 is fixed though; just for simplicity.

Stranger6667 · 2021-01-27T11:17:18Z

Thanks!

I suggest that we hold off on merging until the Rust/Ruby deploy from #2775 is fixed though; just for simplicity.

Sure!

Stranger6667 force-pushed the dd/recursive-tls branch from 5d1a3f2 to 74a374f Compare January 26, 2021 13:58

Zac-HD reviewed Jan 27, 2021

View reviewed changes

Stranger6667 force-pushed the dd/recursive-tls branch from 8d0d3ce to e0025af Compare January 27, 2021 09:51

Zac-HD approved these changes Jan 27, 2021

View reviewed changes

Fix race condition inside recursive strategies

e9511cf

Stranger6667 force-pushed the dd/recursive-tls branch from e0025af to e9511cf Compare January 27, 2021 16:18

Zac-HD merged commit 218dd5d into master Jan 27, 2021

Zac-HD deleted the dd/recursive-tls branch January 27, 2021 21:59

Stranger6667 mentioned this pull request Jan 31, 2021

Improve handling of self-referential strategies #2794

Closed

handrews mentioned this pull request Feb 17, 2023

Test issue with hypothesis>=6.0.4 and python>=3.10 marksparkza/jschon#45

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix race condition inside recursive strategies #2783

Fix race condition inside recursive strategies #2783

Stranger6667 commented Jan 26, 2021

Zac-HD left a comment

Zac-HD commented Jan 27, 2021

Zac-HD left a comment

Stranger6667 commented Jan 27, 2021

Fix race condition inside recursive strategies #2783

Fix race condition inside recursive strategies #2783

Conversation

Stranger6667 commented Jan 26, 2021

Zac-HD left a comment

Choose a reason for hiding this comment

Zac-HD commented Jan 27, 2021

Zac-HD left a comment

Choose a reason for hiding this comment

Stranger6667 commented Jan 27, 2021