Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition inside recursive strategies #2783

Merged
merged 1 commit into from
Jan 27, 2021
Merged

Conversation

Stranger6667
Copy link
Collaborator

Fixes #2717.

In the end, I chose thread-local storage for solving the issue as proposed in the linked issue. Here are my thoughts on why I implemented it this way & comparing how it changes the status quo.

Single thread

At the moment, a recursive strategy may fail on this assertion if it calls itself in do_draw here:

                with self.limited_base.capped(self.max_leaves):
                    return data.draw(self.strategy)

The strategy may look like this:

from hypothesis import strategies as st

SELF_REF = st.recursive(
    st.deferred(lambda: SELF_REF | st.booleans()),
    lambda s: st.lists(s, min_size=1)
)

I am not sure how meaningful such a strategy is, but it is not explicitly prohibited, as far as I can tell. From the one above, I'd expect lists like this - [True, [False, True, [True]], False], or just a boolean. However, a simpler one seems to be producing the same data:

SELF_REF = st.recursive(
    st.booleans(),
    lambda s: st.lists(s, min_size=1)
)

I am not sure if there are cases that can only be expressed by using deferred inside recursive. If there are legitimate use cases, then we might want to rethink capping. Let me know what you think - it might be worth a separate issue, but I'll share my current thoughts on this here.

(Please, correct me if I miss something in my reasoning)
As far as I see, the cap is needed to prevent the drawing from this strategy & generating a certain maximum amount of leaves. However, assuming a single thread (more on the multi-threaded behavior in the next section) and such a self-referential strategy, I am not sure if capping is needed as it is - we can just apply it once on the first capped usage and make all subsequent calls no-op (e.g., just yield without modifying marked). Then we still have the marker set only once on the very first RecursiveStrategy.do_draw call, and it will be monotonically decreasing. Therefore, we'll have the max size properly maintained, and there will be no oversized subtrees because at some point, LimitReached will occur.

Anyway, the current behavior for such a case leads to AssertionError. With a not-reentrant lock, as I initially mentioned in the linked issue, such a strategy will cause a deadlock. A reentrant one will keep the current behavior for a single-threaded case. Thread-local storage will also keep it.

Multiple threads

This use case is about multiple threads that run tests that share the same strategy. RLock solves the race condition more conservatively by preventing other threads from calling capped, but having such a lengthy synchronization point decreases the potential gain of running tests concurrently. I don't have precise numbers on the impact. Still, with thread-local storage, the synchronization point is much smaller (there is a reentrant lock on data.__getattribute__), and it goes in line with other cases (like the mentioned DynamicVariable class).

Let me know what do you think :)

cc @Zac-HD @Zalathar

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments about naming things, markup, and threads below (😱), but this is looking pretty good to me.

hypothesis-python/RELEASE.rst Outdated Show resolved Hide resolved
hypothesis-python/tests/cover/test_recursive.py Outdated Show resolved Hide resolved
hypothesis-python/tests/cover/test_recursive.py Outdated Show resolved Hide resolved
@Zac-HD
Copy link
Member

Zac-HD commented Jan 27, 2021

from hypothesis import given, strategies as st

SELF_REF = st.recursive(
    st.deferred(lambda: st.booleans() | SELF_REF),
    lambda s: st.lists(s, min_size=1),
)


@given(SELF_REF)
def test(x):
    pass

Oh boy, this is great (terrible, yes, but also great). I think we should probably have a new issue to deal with this problem...

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

I suggest that we hold off on merging until the Rust/Ruby deploy from #2775 is fixed though; just for simplicity.

@Stranger6667
Copy link
Collaborator Author

Thanks!

I suggest that we hold off on merging until the Rust/Ruby deploy from #2775 is fixed though; just for simplicity.

Sure!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Possible race condition in RecursiveStrategy
2 participants